How to refresh an environment with only a subset of production data?

Jfreyre

Active Member
Years ago I ran into several doubts about the process of refreshing a QA or PY environment in JDE in order to get a proper environment for the SDLC. Back in those days I used to run versions of the good old R98403 to copy the central objects, control tables, and business data, but later, since I wasn't comfortable with the time that required, I took a different approach and started loading the data by backing up and restoring the libraries or databases involved to their equivalents in the target environment on iSeries or SQL Server. With Oracle, on the other hand, I started loading the data through exports and imports, later with Data Pump, or with inserts from selects, trying to avoid the redo produced by that method.

Anyway, even though those approaches work pretty well when I'm loading all the data into those environments, I'm still storing on my disks an amount of information those environments probably don't need for the SDLC, for example all the historical information stored in the F0911. So, is there a way to refresh those environments through scripts with only a subset of the data stored in my production environment? And is there any way to change the amounts in the financial tables, or any sensitive information, in my development environments? At the same time, I wonder: how are you coping with this issue?
 
José,

I believe most people are still doing the full copy; unless there is some third-party tool that can be used to keep the data in sync across the system, it will be very difficult to do.

I know of sites with large volumes of data that go the route of data mirroring, where they mirror their PROD data to a QA system so that the data is almost real-time.

This is an interesting situation and I would like to know if others have found other ways around this.
 
Oracle's Data Pump import utility has a filtering capability (keyword: QUERY) that works quite well on tables such as the F0911, etc.

Be aware, though, that this is an Enterprise Edition feature.
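For example, a minimal parameter file along these lines exports only recent fiscal years from the F0911 (the DATA_PUMP_DIR directory, dump-file names, PRODDTA schema, and GLFY cutoff below are illustrative assumptions, not from this thread):

  # f0911_subset.par -- export a fiscal-year subset of the F0911
  DIRECTORY=DATA_PUMP_DIR
  DUMPFILE=f0911_subset.dmp
  LOGFILE=f0911_subset.log
  TABLES=PRODDTA.F0911
  # GLFY is the F0911 fiscal-year column; keep only FY23 onward
  QUERY=PRODDTA.F0911:"WHERE GLFY >= 23"

You would then run expdp system PARFILE=f0911_subset.par and load the dump into the target schema with impdp and REMAP_SCHEMA.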
 
We analyzed our data and business needs for the refresh process and found the biggest bang for the buck (that we could live with) was to clear the following tables:

F47121, F47122, F47123, F55001Z1, F4311Z1, F0911Z1
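In script form that can be as simple as a handful of truncates run against the refreshed copy (CRPDTA below is a stand-in for whatever data schema your target environment uses):

  -- Clear the EDI and batch-input (Z-file) work tables after the refresh
  TRUNCATE TABLE CRPDTA.F47121;
  TRUNCATE TABLE CRPDTA.F47122;
  TRUNCATE TABLE CRPDTA.F47123;
  TRUNCATE TABLE CRPDTA.F55001Z1;
  TRUNCATE TABLE CRPDTA.F4311Z1;
  TRUNCATE TABLE CRPDTA.F0911Z1;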
 
Don't forget to clear the F42565 - this is a work table for invoices, and if invoicing isn't printing, this table should have 0 rows! Talk to your business analyst, but I've often seen hundreds of thousands of old rows in this table...
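A quick sanity check before clearing it (CRPDTA again being a placeholder for your target data schema):

  -- Should return 0 when no invoice print run is in flight
  SELECT COUNT(*) FROM CRPDTA.F42565;
  TRUNCATE TABLE CRPDTA.F42565;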
 
Thanks for your answers. I will give it a try and post whatever I find while I try to emulate Oracle Data Masking.
 
I've found various ways to speed up this process and filter what you want.

This is all for an Oracle environment.

If you're using Data Pump or imp/exp, I found it best to set up your environment the way you want it first. If you have separate tablespaces for indexes and tables, make sure all the objects are where you want them to begin with in the target environment. Then:

- Truncate all the target tables.
- Do an export of just the constraints & indexes of the target tables, and put that dump file aside for use afterward.
- Drop all constraints & indexes on the target tables; the import of all the data from your production tables will then proceed much faster.
- After the import, do another import using the dump file with the target constraints & indexes to rebuild all those structures.

A rough Data Pump version of those steps is sketched below.
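This sketch assumes a prior full-data export of the production schema (proddta_full.dmp) and the usual DATA_PUMP_DIR directory object; the PRODDTA/CRPDTA schema names are illustrative:

  # 1. Save the target's index & constraint definitions (metadata only, no rows)
  expdp system DIRECTORY=DATA_PUMP_DIR DUMPFILE=crpdta_ddl.dmp SCHEMAS=CRPDTA CONTENT=METADATA_ONLY INCLUDE=INDEX,CONSTRAINT

  # 2. After dropping the indexes & constraints, load the production rows (data only)
  impdp system DIRECTORY=DATA_PUMP_DIR DUMPFILE=proddta_full.dmp REMAP_SCHEMA=PRODDTA:CRPDTA CONTENT=DATA_ONLY TABLE_EXISTS_ACTION=TRUNCATE

  # 3. Rebuild the indexes & constraints from the saved metadata
  impdp system DIRECTORY=DATA_PUMP_DIR DUMPFILE=crpdta_ddl.dmp SCHEMAS=CRPDTA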

There are other settings you can change to speed this process up as well.
- Turn off tablespace- and table-level logging (never recommended by Oracle for production instances).
- Take the database out of ARCHIVELOG mode while the import is running.
- Alter the indexes to NOLOGGING and rebuild them ONLINE (see the SQL sketch below).
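In SQL terms that is roughly the following; the object names are illustrative, and the ARCHIVELOG change should only ever be done on a non-production instance:

  -- Stop redo generation for a table during the load
  ALTER TABLE CRPDTA.F0911 NOLOGGING;

  -- Rebuild an index without blocking DML and without generating redo
  ALTER INDEX CRPDTA.F0911_1 REBUILD ONLINE NOLOGGING;

  -- From SQL*Plus as SYSDBA: take the database out of archivelog mode
  SHUTDOWN IMMEDIATE
  STARTUP MOUNT
  ALTER DATABASE NOARCHIVELOG;
  ALTER DATABASE OPEN;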

For our upgrade I actually created a stored procedure that works in conjunction with a database link to refresh all of our data. That process is faster than exp/imp.
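The core of such a procedure can be a one-table-at-a-time direct-path copy over the link; this is only a sketch of that idea (the prod_link database link, the schema names, and the APPEND hint are my assumptions, not Ben's actual code):

  -- Truncate a target table, then direct-path copy it from production.
  -- Works best with the target table set NOLOGGING (see above).
  CREATE OR REPLACE PROCEDURE refresh_table (p_table IN VARCHAR2) AS
  BEGIN
    EXECUTE IMMEDIATE 'TRUNCATE TABLE CRPDTA.' || p_table;
    EXECUTE IMMEDIATE 'INSERT /*+ APPEND */ INTO CRPDTA.' || p_table ||
                      ' SELECT * FROM PRODDTA.' || p_table || '@prod_link';
    COMMIT;
  END refresh_table;
  /

Called in a loop over the tables you want (e.g. EXEC refresh_table('F0101')), with the index rebuilds done afterward.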
 
Thanks for your answer, Ben. That's pretty much what I'm doing when I want to refresh an environment, but the question that eludes me is how to filter the data, since I'm not quite sure what criteria I should follow and in which tables I should filter.
 
Hey Jose,

Yes, sorry, I lost sight of the question.

As far as limiting the amount of data, you really need to know your tables & data to determine how to go about that. I have done subsets of data based on a range of order numbers using the QUERY option in export/import. You could also do that with an insert ... select * from ...; as in the sketch below.
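For instance, pulling only a range of sales orders might look like this (the F4211 table, its SDDOCO order-number column, the range, and the prod_link database link are all illustrative; substitute your own tables and criteria):

  -- Copy only the sales order detail rows in a given order-number range
  INSERT /*+ APPEND */ INTO CRPDTA.F4211
  SELECT * FROM PRODDTA.F4211@prod_link
  WHERE SDDOCO BETWEEN 100000 AND 200000;
  COMMIT;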

If you were using partitioned tables, you could look at moving only certain partitions, as sketched below.
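Data Pump can address a single partition directly, along these lines (this assumes the F0911 had been range-partitioned by fiscal year, which is not a stock JDE layout; the partition name FY23 is illustrative):

  expdp system DIRECTORY=DATA_PUMP_DIR DUMPFILE=f0911_fy23.dmp TABLES=PRODDTA.F0911:FY23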

just some thoughts..

hth
Ben
 