Housekeeping Part 1 – PSA Data, Cube FACT tables and DSO changelogs
Some housekeeping tasks can be done quite easily by the BW team, either manually or as part of the process chains. This section covers those tasks. These tasks can have a direct impact on the retention of staging data and the ability to back out or otherwise manipulate individual dataloads, so you need to be careful to strike the correct balance between the retention of temporary staging data and the size of your database. Your own technical experience and understanding of the individual system and the business in which you operate will help you to make informed decisions about how long to retain staging data before either trimming it or compressing it.
PSA Tables
Some of the biggest tables I have ever seen in a live BW system belonged to the PSA for the 0MATERIAL_ATTR datasource. This table was being loaded from ERP every day using full loads, so there was no need to keep a permanent record of the data in BW for each day in the past. Fifty thousand records a day doesn’t sound like much, but over 5 years or so, it adds up, and when I saw the table it was huge.
Now imagine 20 or 30 PSA tables like that for various datasources and you can see how you can get to a situation where the database is full of unnecessary data. It is using unnecessarily large amounts of storage to hold all that useless data, and the large tables mean it takes longer to add data in & get data out of those tables, thereby increasing your dataload times.
These tables should be trimmed as part of your regular housekeeping
FACT tables
Infocubes have 2 FACT tables. When you first load data into a cube, the data goes into the F-FACT table, and it still holds the request ID which loaded the data. This means that you can back individual dataload requests out of cubes quite simply if they are still in the F-FACT table, but the data takes up more space because of the request ID.
In order to save space and speed up query execution times, you should ‘compress’ your cubes once enough time has elapsed so that you know you’re not going to need to back individual data requests out. (10-30 days is common.) This moves the data from the F-FACT table into the E-FACT table and removes the request ID in the process.
DSO changelog tables
Are you running delta dataloads into one or more DSOs in your system? If so, then you are populating the DSO changelog table each time you load a delta request. This table keeps track of the changes in the data that come about each time a delta request is loaded into a DSO. Once the data has been activated in the DSO, it is unlikely that you will need the changelog data again, so you should trim your DSO changelog tables or they will become very large. It’s normal to trim data older than 10-30 days from each DSO changelog as part of your regular housekeeping tasks.
Find out more about BW Housekeeping. Register for our Knowledge Transfer Session: BW Housekeeping on 28th July.
Guest author: Mark Wheaton