Recursive DW Version Problems

tpayne

I started a new contract and the company is having problems with some of its recursive versions, where the main version ends up getting deleted and/or its contents changed.

Ironically, most of the historical postings on DW recursion on JDEList are from several threads of mine from 4+ years ago, long enough for me to forget what caused the problem and what I did to fix it.

The recursion process works by copying the Form/Version being executed to +Form/NewVersion, where the last 4 digits of the version are replaced by a sequence that begins at '0001', or if the original version had 4 digits as the last 4 characters, starting with the next in sequence from that.
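The numbering rule described above can be modelled outside RPG. This is a minimal Python sketch, not the P98310 code: the function names are hypothetical, and it assumes the version ID is at least 4 characters long and that numbering simply stops at 9999 (the thread does not say what P98310 does in either case).

```python
def recursive_form(form: str) -> str:
    """Prefix the form ID with '+' unless it already has one."""
    return form if form.startswith("+") else "+" + form


def next_recursive_version(version: str) -> str:
    """Return the first candidate recursive version ID for `version`.

    Assumption: `version` has at least 4 characters.
    """
    if len(version) < 4:
        raise ValueError("sketch assumes a version ID of at least 4 characters")
    stem, tail = version[:-4], version[-4:]
    # A numeric suffix continues from the next number; anything else starts at 0001.
    seq = int(tail) + 1 if (tail.isascii() and tail.isdigit()) else 1
    if seq > 9999:
        # Assumption: the real program's behaviour at 9999 is not described.
        raise ValueError("sequence exhausted")
    return f"{stem}{seq:04d}"
```

For example, a base version ending in non-numeric characters gets a suffix of 0001, while one ending in 0001 moves on to 0002.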

This is done in program P98310, when either a menu option is called, or from the DW Version List (P98302) when option '1' is taken to Run a version. It runs interactively.

The program checks that no records exist for +Form/NewVersion on F98301 (DW Header File), looping and adding 0001 to the version number if they do.

If all is ok, it proceeds to copy the old version to the new in F98301, then F9831, F98311, F98312 and F98303.

The problem we are getting is...
P98310 31500 F98311 attempts to write duplicate record (C G S D F)
where for some reason records exist on F98311.

The program does not check for records already existing in the other files before copying, only in F98301.

The problem is, I can't figure out how records are getting left behind in F98311, or what will happen if 2 users try to create the same version at the same time.

In theory, one session should write to F98301, and the second should find that data exists, therefore incrementing the version number.

The Global Recursive cleanup (J/P98305G) is run weekly, and this deletes all recursive records from F98301/303/311/312 using SQL, so it can't be really old data being left behind.

The user is getting an error when this happens, and they have been ignoring it, resulting in the copy of the version failing, and the version being executed being the main one. I was able to prove this using Debug on P98310 and forcing this error to be generated.

If the user presses Enter after receiving the error, the version list then displays the original Form/Version, and the data gets updated in that, which is not desirable.

I also do not understand how the original version is sometimes disappearing, since the CL programs only delete this if the Form begins with "+".

My plan is to check for the existence of the new version on all 5 files before performing the copy, which ought to stop this from happening. It's a fairly simple modification to P98310.
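To illustrate why checking only F98301 is not enough, here is a small Python model of the search for a free version number. It is a sketch under stated assumptions, not JDE code: each file is represented as a set of (form, version) keys, and the names `find_free_version`, `bump` and `check_all` are hypothetical.

```python
FILES = ("F98301", "F9831", "F98311", "F98312", "F98303")


def bump(version: str) -> str:
    """Add 1 to the 4-digit numeric suffix of the version ID."""
    stem, tail = version[:-4], version[-4:]
    seq = int(tail) + 1
    if seq > 9999:
        raise ValueError("sequence exhausted")  # assumption: behaviour at 9999 is unknown
    return f"{stem}{seq:04d}"


def find_free_version(tables, form, version, check_all=True):
    """Return the first version with no records in the files that are checked.

    check_all=False models the original behaviour (header file only);
    check_all=True models the planned fix (all five files).
    """
    names = FILES if check_all else ("F98301",)
    while any((form, version) in tables[name] for name in names):
        version = bump(version)
    return version


# Orphaned rows in F98311 with no matching header row in F98301.
tables = {name: set() for name in FILES}
tables["F98311"].add(("+MYFORM", "XJDE0001"))

header_only = find_free_version(tables, "+MYFORM", "XJDE0001", check_all=False)
all_files = find_free_version(tables, "+MYFORM", "XJDE0001", check_all=True)
```

With the header-only check the model picks XJDE0001 and the copy would hit the orphaned F98311 rows; with all files checked it skips to XJDE0002. Note that the check and the copy are still separate steps, so this model says nothing about two sessions choosing the same number at the same moment.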

Any thoughts on this from the other DW experts out there?
 
I have hopefully fixed the problem by adding the following code to P98310. I am still very puzzled to know why records have been left behind on F98311 in the past though, as well as how the original version is getting deleted.

     C           *IN99     ANDEQ'1'
     C                     GOTO NEWNUM
     C                     END
     C*
     C           DEKY01    READEI98301                 9981
     C           *IN99     CABEQ'1'       CPYERR
     C*                    -----          ------
     C                     END
TP01 C*
TP01 C* CHECK NEW VERSION DOES NOT EXIST ON ALL FILES BEFORE COPYING
TP01 C*
TP01 C           $$KY01    CHAINI9831                81
TP01 C   81      $$KY01    CHAINI98311               81
TP01 C   81      $$KY01    CHAINI98312               81
TP01 C   81      $$KY01    CHAINI98303               81
TP01 C           *IN81     IFEQ *OFF
TP01 C                     GOTO NEWNUM
TP01 C                     ENDIF
     C*
     C* Copy all version records.
 
Tony,

We also added the following code to P98302 for when the user exits without submitting the version:

     C                     SETON                     LR
     C*
     C* Cancel process requested
     C*
     C           @@AID     IFEQ #FEOJ
     C                     MOVE 'E'       $RETRN
     C                     END
JPS01 * THIS FIX IS ALREADY IN P98300 IN A73 CUME 11
JPS01 * IF F3 OR F12 FROM A SUBMIT, DELETE RECURSIVE VERSION
JPS01C                     MOVELPSPID     RECURS  1
JPS01C                     MOVELPSPID     NEWPID 10
JPS01C           RECURS    IFNE '+'
JPS01C           '+'       CAT  PSPID:0   NEWPID
JPS01C                     END
JPS01 *
JPS01C           $RETRN    IFEQ 'E'
JPS01C           $RETRN    OREQ 'R'
JPS01 *
JPS01C                     UNLCKF98301                 99
JPS01C                     UNLCKF9831                  99
JPS01 *
JPS01C                     CALL 'P98305'
JPS01C                     PARM           NEWPID
JPS01C                     PARM           PSVERS
JPS01C                     END
     C*
     C* END MAINLINE PROGRAM
     C* --------------------

Hope it helps too. Additionally, if you have a specific program that continually shows up with a problem, you should check at the bottom of the CLP Jxxxxxx to see if the code to clean up the recursive version is there. Some people may have added programs to the recursive UDC table when the program is not correctly coded to clean up after it runs.
 
I would suspect that the cleanup code at the bottom of the CL that George mentioned is more likely than not to be the issue. Just because a DW is put into the recursive version table doesn't mean it was written to spec.
Just my two cents!!
 
Thanks George / Nichelle,

It looks like my fix ought to work, and definitely versions are getting left behind when a user abandons a submit.

In this case the CL is coded correctly to remove the recursive versions, but yes, been there before where it's not too :)


I found out late yesterday too that there are some job descriptions that are incorrect (maybe lower case or invalid characters), which cause the job to abort as soon as it is started.

However, none of these really explains how there is data in F98311 but not in F98301. I am trying to track this down...

Really appreciate the help and feedback.
Tony
 
I would place a call to Oracle Support. This is not the way it works in
our old version of A7.3 Cume 14 or new version A9.1. There has to be a
fix from Oracle for this. When I do a submit and bail out without
submitting the job, the recursive version is removed from the files, in
both our old A7.3 and the new A9.1.

On Menu G81 option # 15 can be run when no one is on the system to clean
up left over recursive versions in the F983* files.

Just an FYI.

Jim
 
I would encourage you to report this to JDE support. This might be a known issue with a SAR available to fix it. In my opinion the last thing you should look to do is change JDE code. Once you do, then you "own" it and have to support it.
John Dickey
 
Jim/John,

Agreed, this shouldn't happen.

Unfortunately the client no longer has support with Oracle, but it does look like the versions are removed when F3 is pressed, which is good.

It's just when the recursive files get out of step. I wondered if maybe someone had used SQL to clean them out of F98301 but not the other files. That would cause the abort, since if a version does not exist in F98301, the version copy assumes it does not exist in the other files either. The global delete would fix that of course.
 
I don't know if this is the only reason for what is happening, but it's the major cause.

J98305G, the Global Recursive Version Delete is now running nightly and later than it used to, therefore some locations are already running jobs and the delete aborts due to record locks on F98301.

Because of this, a number of records for recursive versions are deleted from F98301, but not from F9831, F98311, F98312 etc.

The program that creates Recursive Versions, P98310, checks to see if there is a record for a version in F98301. If not, it tries to create that version, but does not check for the presence of records in the other files. When it finds that data exists, KA-BOOM! This is causing the submission of recursive jobs to abort.
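One way to confirm that the files are out of step is to look for recursive keys that exist in the detail files but have no header record. The Python sketch below models that check; it is illustrative only, with each file represented as a set of (form, version) keys and `find_orphans` a hypothetical name. On the real system the same idea would be a query against the F983* files.

```python
def find_orphans(tables, header="F98301"):
    """Map each detail file to its recursive keys that have no header record.

    `tables` maps a file name to a set of (form, version) tuples.
    Only forms starting with '+' (recursive versions) are considered.
    """
    orphans = {}
    for name, keys in tables.items():
        if name == header:
            continue
        stray = {key for key in keys if key[0].startswith("+")} - tables[header]
        if stray:
            orphans[name] = stray
    return orphans


# Model of an interrupted global delete: the header row is gone,
# but rows for the same key remain in two of the other files.
tables = {
    "F98301": {("+MYFORM", "XJDE0002")},
    "F9831": {("+MYFORM", "XJDE0001"), ("+MYFORM", "XJDE0002")},
    "F98311": {("+MYFORM", "XJDE0001")},
    "F98312": set(),
    "F98303": set(),
}
orphans = find_orphans(tables)
```

In this model XJDE0001 is reported for F9831 and F98311, which is exactly the state that makes the version copy abort with a duplicate record.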

I am waiting for my changes to P98310 to go into production so that it checks all the files, but now at least I know the cause of the problems, which is a big relief.

I knew of course how J98305G works, but didn't realise that it had been scheduled to run while other JDE jobs were active.
 
Yes, the J98305G can also affect jobs that are on hold until released at a specific time, via either the Submitted Job parameter or the One Time Release via Sleeper. The version is all set up to be run, and the J98305G will delete those recursive versions as well. Then when the job attempts to start, there are no records to support the process.

We found it best to run the J98305G only once a week, at what we consider the quietest time on the system.
 
I didn't realize / forgot that there is another program, P98305, which deletes a single version. This is used to delete the recursive versions at the end of a job.

I am in the process of writing a CL program that will select all Recursive Versions that have an Execution Date of *TODAY-2. This will then call the deletion program, eliminating the need to use P98305G except on odd occasions.
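The selection rule for that cleanup can be sketched in Python as follows. This is only a model of the idea, not the CL program: the record layout (`form`, `version`, `executed`) and the function name are hypothetical, and it assumes "2 days old" means an execution date on or before today minus 2 days.

```python
from datetime import date, timedelta


def versions_to_delete(headers, today, min_age_days=2):
    """Select recursive versions whose execution date is old enough to delete.

    `headers` is a list of dicts with keys 'form', 'version' and 'executed'
    (a date, or None if the version has never been run).
    """
    cutoff = today - timedelta(days=min_age_days)
    return [
        (h["form"], h["version"])
        for h in headers
        if h["form"].startswith("+")        # recursive versions only
        and h["executed"] is not None       # never-run versions are left alone
        and h["executed"] <= cutoff         # assumption: at least 2 days old
    ]


today = date(2024, 1, 10)
headers = [
    {"form": "+MYFORM", "version": "XJDE0001", "executed": date(2024, 1, 8)},
    {"form": "+MYFORM", "version": "XJDE0002", "executed": date(2024, 1, 10)},
    {"form": "MYFORM", "version": "XJDE0001", "executed": date(2023, 1, 1)},
]
selected = versions_to_delete(headers, today)
```

Each selected key would then be passed to the single-version delete, so versions created today for jobs that are still held or running are never touched.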

I am tweaking P98310 to make sure that the Date Executed is set to *TODAY in the "R" record on F98301, and also setting the User Id and resetting the Security Field. If a base version is secured, the recursive version is also secured, so if it got left behind and support needed to delete it using the version maintenance program, they couldn't touch it.

All of these changes should make the DW processing a lot smoother in the future. The larger the company, the more problems you have with things like versions getting left behind, and also finding time to run jobs like the Global Delete. My new version can be run at any time, since it will only delete versions that are 2 days old.
 