AS400 / Reuse Deleted Records

rhunt01

Well Known Member
We recently went live with JDE9. One of the things that happens during the table conversion is that many of the tables are dropped, recreated, and the data is copied back in. Since many of the tables have been recreated, I'm finding that F0902, F0911, F0901, F03B11, etc. are all now set to reuse deleted records instead of the alternative, where new records go to the end of the table in arrival sequence and deleted records are simply "marked" as deleted.
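For what it's worth, I've been checking the attribute with DSPFD and I know I could flip it back with CHGPF. The library/file below is just an example, not something I've actually run yet:

DSPFD FILE(PRODDTA/F0911) TYPE(*ATR)      /* shows the "Reuse deleted records" attribute */
CHGPF FILE(PRODDTA/F0911) REUSEDLT(*NO)   /* back to arrival-sequence inserts at the end of the member */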

After a decade of XE I have all the maintenance jobs prebuilt to run automated RGZPFM (reorganize physical file member) maintenance each weekend for files that pass a deleted-percent threshold.
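Roughly, the weekend job boils down to something like this (a simplified sketch; the real CL reads the outfile and compares deleted vs. total record counts, and the library, outfile name, and threshold here are just examples):

DSPFD  FILE(PRODDTA/*ALL) TYPE(*MBR) OUTPUT(*OUTFILE) OUTFILE(QTEMP/MBRSTATS)
/* read QTEMP/MBRSTATS and, for each member over (say) 20% deleted records: */
RGZPFM FILE(PRODDTA/F0911)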

In fact, I kind of liked this approach. I felt like the latest data always stayed close together in the table/file, minimal index seeking was needed to find table space to insert into, and I kept indexes dense with RGZPFM as needed.

In the MS SQL world I can speak to exactly how table structure, index approaches, file definitions (like reuse or not) translate to physical query implementations, data density on disk, disk hardware use, etc.

Anytime I try to have this discussion with an AS400 lifer or with IBM, I typically get quoted something about single-level storage and how trying to manage AS400 DASD to minimize disk latency, head travel, etc. is ineffective and can hurt AS400 performance.

So, looking for any AS400 uber nerds that have an opinion on whether or not I should set this back to the way XE had it on the AS400.

Thanks

Ryan
 
Ryan,

My own personal opinion is that you really want reuse deleted records.

In the old days (pre-S38/400) you had to worry about where files existed on disk and locality of reference. On the 400/System i, if the system notices a 'hot spot' it will rearrange the data ON DISK to spread it over multiple heads. It has a resolution of 4K.

Unless you have a very small active data set, the multiple heads solve most of your issues. If you have indexing set up properly, most queries will never have to go to the actual record, as queries that can be satisfied via information in the index never actually touch the physical file and its data.

The more available disk space you have, the better the 400 is at spreading data appropriately. The automatic reuse-deleted function minimizes disk space use, and you don't need the downtime required to release the free space with RGZPFM. Of course, if you use the allow-cancel function you won't need to go down for a reorg, but it will take significantly longer.
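For example, on releases that support it, something along these lines lets readers stay on while the reorg runs (the file name is just an example; check which ALWCANCEL and LOCK values your release offers):

RGZPFM FILE(PRODDTA/F03B11) ALWCANCEL(*YES) LOCK(*EXCLRD)   /* cancelable reorg, readers allowed */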

Just one man's opinion.

Tom
 
Ryan,

They're right. Nowadays you can't manage/control where data is stored. Just relax and let the system handle it.
 
My take is that there are definite advantages and disadvantages to the whole reuse-deleted-records question on the iSeries.

On one hand you don't need to worry about cleaning up deleted records, because the 400 will reuse the space. On the other hand there is overhead related to it, and you can definitely run into performance issues if you are not aware of it. One case in point: when you clean out files with DEL *ALL you basically set every record to deleted/reusable, and this is slow on a reload. A CLRPFM is a better way to go, since then there is no overhead from managing this space in the file.
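In other words (the file name is just an example):

CLRPFM FILE(PRODDTA/F03B11)   /* empties the member outright, no deleted records left behind to manage */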

I'm not sure how you had your Xe environment configured but I always remember the files set to reuse deleted records and I would not recommend you change it.
 
Traveler,

Just an FYI: since V5R2 (I think), if you do 'Delete from schema/table' with no where clause, a CLRPFM will be issued if possible; if not, THEN it will delete each row and retain the space.

Also, I have run a few benchmarks loading 10M records into a file with 10M deleted records vs. an empty file, and there was <1% difference in CPU time. I'm unable to completely isolate it, but the times are close.

Tom
 
Tom, a few questions for you...

1) On your test with 10M deleted vs an empty file, were both files logically empty (select * FROM table produces 0 rows)? I ask because the overhead I expected to be there was for:
a. DB2/400 to have to read through its storage allocation maps to find blocks of storage available for reuse.
b. DB2/400 to perform increased random I/O to write to portions of the file that may be split all over the place on the disks.

I ask about whether your file was logically empty because both of the items above might have happened in a sequential I/O pattern (much like finding and writing to the end of a table) as opposed to a random one. Please note that I say all of this while knowing very little about the algorithms the AS400 uses to manage storage (I'm more familiar with other systems, so my thoughts might be tainted).

2) In the MS SQL world (oh, how I hate to begin statements like that in an AS400 thread) we manage indexes for fragmentation, both logical and physical, in an attempt to make sure as little is read from disk as possible AND that it happens sequentially rather than randomly. One of the things I like about the RGZPFM command is that it always rebuilds all of my indexes, thereby making me feel all warm and fuzzy inside that my indexes are absolutely as dense as possible.

That being said, do you ever drop and rebuild indexes, or run RGZPFM, to maintain them?
 
rhunt,

1) Yes both were logically empty.
a. I believe a list is kept of available blocks, so it doesn't really do much of a read.
b. Since we don't control which blocks get allocated for a record, I'm not sure this matters. Anyway, the system will move 'hot' spots if you use STRDSKRGZ.
2) On the 400, indexes are maintained live and are 'self balancing': if one branch gets too long, the index is rebalanced by the OS (on the 400 the OS is the DB). I almost never drop an index because of the self balancing. I will do a RGZPFM if I want a clustered index (in MS SQL terms); see the example below. Otherwise I seldom do a reorg, though I will if I get a very large number of deleted records that are more or less 'permanent'.
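For the clustered-index case, that's the KEYFILE parameter on RGZPFM. The logical file and member names here are made up; point it at whichever keyed access path you want the data physically sequenced by:

RGZPFM FILE(PRODDTA/F0911) KEYFILE(PRODDTA/F0911L1 F0911L1)   /* reorder the data in that key sequence */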

Tom
 