Purging and Archiving

Bob_Duben

I know this has been discussed on and off here in the LIST and I have been following for over two years but now it is time.

We have been "collecting" data for over 4 years and our PRDDAT is over 350 GB.

So let's take this very methodically, and use this thread as a clearly defined roadmap to help others develop a plan for storage and archive as well as accessing this historical data.

Let's use the old term OneWorld, even though it is now called EnterpriseOne, just to simplify the discussion. Another note: we use an AS/400 for our Enterprise server, so the questions may be skewed toward its additional idiosyncrasies.

The first question that needs to be asked even before we get into what should be purged or even how much of it, would be the following:

1. How do you handle the relocation and access of the purged data?
a. Do you set up a new separate OneWorld ARCHIVE environment that users must log in to in order to access this historical data?
b. Do you set up some sort of link from the live data to this archived data so it is always available from within the Production environment?
c. Do you just save it outside of OneWorld and use something like MSAccess to query the data?

2. What storage considerations were you faced with?
a. Leave it on your enterprise server?
b. Use a less expensive storage and access location?
c. Back it off to tape and store it?

3. How can the functionality be maintained with this archived data?
a. How do you identify the OneWorld table relationships so that the supporting tables are archived as well and the links are not broken?
b. If the OneWorld version changes, how do I keep this data accessible?

4. Other options to be considered?
a. Should I wait for PeopleSoft to develop purging and archiving tools that actually work?
b. How about I just hire a third party like ARCTOOLS to be done with it and just go to the beach?

My thoughts:

1a. This seems to be the most logical and perhaps the cleanest setup. It will shrink the size of the live production environment and speed up our whole OneWorld environment. However, it opens some new questions like, do we start a new archive environment every year or even every couple of years or do we just keep appending our purges to one archived environment? Then, do we just treat this new environment like any other with updates and such?
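
To make the "keep appending our purges to one archive" option concrete, here is a minimal sketch of a copy, verify, then purge step. It uses Python's built-in sqlite3 purely as a stand-in; it is not how OneWorld or the AS/400 does it. The `ledger` table, its columns, the `archive` schema name, and the `archive_and_purge` helper are all made up for illustration.

```python
import sqlite3

def archive_and_purge(conn, table, date_col, cutoff):
    """Copy rows older than `cutoff` into archive.<table>, verify, then purge them from live."""
    # Table/column names are trusted here (supplied by our own code, never by users).
    # Create an empty archive table with the same columns the first time through.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS archive.{table} AS SELECT * FROM main.{table} WHERE 0"
    )
    where = f"WHERE {date_col} < ?"
    to_move = conn.execute(
        f"SELECT COUNT(*) FROM main.{table} {where}", (cutoff,)
    ).fetchone()[0]
    before = conn.execute(f"SELECT COUNT(*) FROM archive.{table}").fetchone()[0]
    try:
        # Append to the existing archive, then confirm every row arrived before deleting.
        conn.execute(
            f"INSERT INTO archive.{table} SELECT * FROM main.{table} {where}", (cutoff,)
        )
        after = conn.execute(f"SELECT COUNT(*) FROM archive.{table}").fetchone()[0]
        if after - before != to_move:
            raise RuntimeError("archive row count mismatch; nothing purged")
        conn.execute(f"DELETE FROM main.{table} {where}", (cutoff,))
        conn.commit()
    except Exception:
        conn.rollback()  # undo both the copy and the delete
        raise
    return to_move

# Hypothetical demo data: two old rows, two current rows.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS archive")
conn.execute("CREATE TABLE ledger (doc_no INTEGER, gl_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?)",
    [(1, "2001-03-15", 100.0), (2, "2002-11-01", 250.0),
     (3, "2004-02-10", 75.0), (4, "2005-06-30", 120.0)],
)
conn.commit()

moved = archive_and_purge(conn, "ledger", "gl_date", "2003-01-01")
live_left = conn.execute("SELECT COUNT(*) FROM main.ledger").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM archive.ledger").fetchone()[0]
print(moved, live_left, archived)
```

Running the same function again with a later cutoff appends to the same archive table, which is the single-archive option; a new archive per year would just mean attaching a different database each time.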

1b. If this is possible, it seems to be the most seamless to the users, but it raises the question of whether any benefit would be realized by a live table linked to an archived table. Will system performance really improve because it is reporting from smaller tables, or will the queries still have to run across both the live and archived tables, gaining nothing in speed?
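
One way to picture the linked-table idea is a view that unions the live and archived tables. The sketch below again uses sqlite3 as a stand-in, with hypothetical `ledger_live`, `ledger_archive`, and `ledger_all` names; whether OneWorld can be pointed at such a view is a separate question I am not assuming an answer to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ledger_live    (doc_no INTEGER, gl_date TEXT, amount REAL);
CREATE TABLE ledger_archive (doc_no INTEGER, gl_date TEXT, amount REAL);

INSERT INTO ledger_live    VALUES (3, '2004-02-10', 75.0), (4, '2005-06-30', 120.0);
INSERT INTO ledger_archive VALUES (1, '2001-03-15', 100.0), (2, '2002-11-01', 250.0);

-- The "link": one view that presents both tables as a single history.
CREATE VIEW ledger_all AS
    SELECT doc_no, gl_date, amount, 'live' AS source FROM ledger_live
    UNION ALL
    SELECT doc_no, gl_date, amount, 'archive' AS source FROM ledger_archive;
""")

# Day-to-day work reads only the small live table...
live_rows = conn.execute("SELECT COUNT(*) FROM ledger_live").fetchone()[0]
# ...while a historical inquiry goes through the view and reads both tables.
all_rows = conn.execute("SELECT COUNT(*) FROM ledger_all").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM ledger_all").fetchone()[0]
print(live_rows, all_rows, total)
```

In this sketch, anything that queries the view still touches both tables, so any speed gain would come only from the programs that keep reading the live table directly.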

1c. Since the archived data is used mostly for queries why not dump it outside of OneWorld to cheaper disk storage and just let the MSAccess gurus query it all day?

2.a. This is a huge question for us since AS/400 DASD (hard drives) are very expensive, yet this is the simplest approach. I know we could save space by not creating a whole new environment and just pointing an existing environment to this new data.

2.b. I would love to be able to split the archived data off of the AS/400 and use our much cheaper network storage since it changes very little and won’t require the daily backups. Then we get back to the separate archived environment vs linked tables.

2.c. I know we can always back older stuff off to tape but then you have to deal with how and where to restore the data for access. You can’t just restore it into the live production environment so perhaps you would have to either use the PY environment or create an archive retrieval environment for this specific purpose.

3.a. Identifying what data to archive is not nearly as difficult as discovering what supporting (relational) tables are also needed to maintain the functionality. I am holding my breath that PeopleSoft will enlighten us in the near future.
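
Once the relationships are known, working out which tables have to move together is the easy part. A rough sketch, assuming a hand-maintained map of table relationships; the table names below are invented and are not real OneWorld tables, and building the real map is exactly the hard part described above.

```python
from collections import deque

# Hypothetical map: each table lists the supporting tables that must travel with it.
RELATED = {
    "order_header": ["order_detail", "order_text"],
    "order_detail": ["shipment_line"],
    "order_text": [],
    "shipment_line": [],
    "customer_master": [],  # not reachable from order_header, so it stays put
}

def archive_set(start, related):
    """Return every table reachable from `start`, i.e. the set to archive together."""
    seen = {start}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for dep in related.get(table, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)

tables = archive_set("order_header", RELATED)
print(tables)
```

The walk follows links of links, so a table two or three steps removed from the one being purged is still picked up.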

3.b. The answer here will depend on whether a new separate environment for the archive has been created or it is somehow linked to the archived tables. Separate would be treated like any other environment and updated whereas linked tables are still essentially part of the live environment.

4.a. Can’t wait, out of time and disk space.
4.b. The cost can certainly be justified at the price of AS/400 drives. Who wouldn’t rather go to the beach?
 
Bob: I like 4b best myself :) When I'm done, I'll meet you at the beach and you can buy the first round!

Your 1a (set up an archive 'environment') provides the most 'bang for the buck' - the ability to look up old stuff when needed.

1b is possible, but needs to be done selectively. If you did it everywhere, you'd probably be in worse shape than before due to the overhead of gluing two sets of files together. 1c is the cheapest, if you can sell it to the users.

If you leave it on your AS400 you will occupy disk space, but it's just sitting there, invisible to the live environment. Unless the overall disk utilization is driven too high it won't really bother anyone and you don't even have to back it up very often.

Re 3 - in reality, not many shops will go for the linked tables approach. It's more work for an incremental benefit. That leaves the hard part: figuring out which related tables need to stay together. Even after you figure that out, there are other considerations to ponder... It's difficult, to say the least.

Re 4a - I'm ready if you are.
Re 4b - ditto.
 
Hi guys!

We need some updates on this topic, as we are planning to implement an archiving and purging solution in phases, meaning we will first have to make use of the existing tools from JDE and later on work on software that does JDE archiving. Any comments or new information would be much appreciated. Thanks.
 
1a. This seems to be the most logical and perhaps the cleanest setup. It will shrink the size of the live production environment and speed up our whole OneWorld environment. However, it opens some new questions like, do we start a new archive environment every year or even every couple of years or do we just keep appending our purges to one archived environment? Then, do we just treat this new environment like any other with updates and such?

Of the shops that have implemented this approach, how do you handle updates and new versions with the archived environment? What is your methodology to handle multiple purges (i.e. every couple of years with multiple archive environments?)

Thanks!
 
I would go the 4b route, but I'd head to a golf course first, then the beach later.

The folks at ARCTOOLS have figured out all of the relations between the tables, so that in itself may be worth the price of admission. And if I recall correctly, the software isn't all that expensive anyway -- somewhere around $5-10k. Certainly much less than the cost of any of the other alternatives (except the alternative of doing nothing, which I note you didn't list).
 