V5R2 - Inconsistent Query Performance - Especially on Purchase Receiver

mkmckinn

Active Member
All,
Newbie request here all. Can anyone help us improve some post V5R2 upgrade inconsistent performance (e.g. some UBEs, receivers, taking an order of magnitude longer to process than previously).
Last weekend we upgraded from V4R5 to V5R2. What we observe is inconsistent performance (variable run times) on UBEs and occasionally on some interactive applications.

System Description:
One World Xe, Distribution and Financials (No HR) Update 0 with selected SARs, SP 20, on an iSeries 830 (4200 CPW with 9GB RAM running V5R2) with 5 Terminal Servers (NT SP6 and Win2000 SP3) supporting ~200 simultaneous users.

Observations:
a) The UBE's which have suffered the most are causing significant memory paging and faulting (DB and nonDB) when they run
b) Running these UBE's under different conditions three times can improve their performance significantly in the short term (for the next few minutes), but not for the long term (beyond an hour)
c) Many UBEs run faster than before
d) The CPU isn't bound (averaging <50%)
e) The DASD though high isn't bound (<80% peak utilization)
f) There are some indications that under certain conditions the disk I/O can become bound
g) A background system performance monitoring application seems to have grown in disk I/O intensiveness
h) The network isn't bound
i) We run our operational documents on their single threaded queues
j) Our system (V4R5) was tuned pretty closely to the spec outlined in the JDE Redbook
k) Our subsystem configuration is system autotuned
l) By end of day today, our SQL packages had all stopped growing in size, except those for the control tables (UDC tables and the like).

Our Current Ideas:
1) Problem: too many jobs in a single subsystem - solution: build a new subsystem and shunt jobs to it.
2) Problem: new dispatcher and SQE subsystem not performing admirably - solution: turn off dispatcher and SQE (possible?)
3) Problem: SQL packages created for selected jobs are faulty because system autotune adjusting configuration too rapidly post upgrade - solution: manually tune subsystems, blow away SQL packages, let production run to let SQL packages settle, then turn back on autotune.
4) Problem: we missed something in the object conversion process of the upgrade - solution: find the non updated object and nail it.

Closing Request:
Anyone with thoughts on our listed potential solutions, and/or possible alternatives would be most welcome. Even other V5R2 war stories (preferably with happy endings) might prove cathartic. :) Seriously, any thoughts or advice would be most welcome.

Cheers,
--Malcolm.
 

jstooge

VIP Member
Malcolm,

First and foremost, delete those SQL packages (not the ones beginning with Q). This is recommended SOP for any OS/400 or OW upgrade utilizing the iSeries platform.

Next, you need to try to target the UBE's or processes that are experiencing the performance issues, or determine that it is system wide.

I'm going to assume that you've installed the required IBM and JDE/PSFT PTF's. They are presented in two different places and I know a new OW APAR came out just yesterday.

Next, make sure you have the proper fixes available for iSeries Access. In many cases, performance issues are caused by a .DLL that can be replaced, but this would depend on which "flavor" of iSeries access you are running. (in many cases it's the CWBODBC.DLL that causes the issues).

Finally, and especially if you find that it's not system wide, but only affecting certain UBE's or processes, rebuild the indexes on the tables the UBE or process is hitting. If that is of no help, run the UBE or process with logging turned on and run the logs through Performance Workbench, available on the KG. Also would probably be a good idea to turn on the Query Analyzer on the iSeries. These two tools should allow you to see if perhaps a new index would help.

Of course, I'll also assume you've already had a open call with JDE/PSFT and IBM regarding this.

Good luck, and regards,

Jim
 

mkmckinn

Active Member
Jim,
First, thanks for the super quick reply.

On to your notes:
a) Delete SQL packages - I've heard this and have done this historically, but don't understand why this "makes sense" for OW installations. That said, we did it last night. Of note is that the night we didn't do it, we found performance improved the next day.
b) latest IBM PTFs (ones from yesterday - downloadable installed, CD only ones in transit). JDE SP20 installed to be sure. Application level PTFs being tested.
c) Downloaded what we thought were the latest CWDBODBC.DLLs but have not checked in last 3 days.
d) Reinstalled MDAC after upgrading client access as well (thank you JDE list).
e) Tried the index route on night two, but only at the DB level, not the Edwards level - out of interest, will this have a sigificant bearing on the result? Saw some improvement but not a tremendous amount The rebuilds were only done as far as the objects were effected. Good idea - we'll plan some time for rebuilds tonight.
f) we're in conversations with the "experts" yes, but always welcome suggestions from the front line. :)

Cheers,
--Malcolm.
 

jstooge

VIP Member
Malcolm,

I do know of situations where the OW table definition had indexes that were not defined within DB2, so additional indexes would be created when regenerated from within, rather than at the DB2 level. That may or may not be an issue.

Realistically, during OW upgrades, table definitions will be altered, new tables added, and therefore it's a good idea to delete the SQL packages. On the other side, since you went from V4RX to V5R2, there were many OS and DB changes that I'd think that it would be a good idea as well.

I think you've covered all your bases right now. Perhaps while you wait on the "experts", you could run the logs through the Performance Workbench. This will identify which SQL statements are having issues or running too long.

What MDAC level are you using? Not the latest I hope.

I believe you've done everything I'd suggest without seeing logs and other specific issues.

Sorry I can't offer anything better...

Jim
 

mkmckinn

Active Member
Jim,
Gotcha on the changes to SQL packages if you make frequent changes to the underlying structure. Makes sense.
As for MDAC, I know we've avoided 2.7, but I can't remember if its 2.5 or 2.6 (the latter I believe).
Thanks again for the quick response.
BTW, we noticed the SQL packages grow causing pretty significant disk I/O but nothing insurmountable. Then this svc director job kicks in and whomp, the paging and faults jump. Don't know if that provides much more from a lead perspective.
Finally, the index rebuild is good idea... we'll figure that one into our plan.
 

jean_driscoll

VIP Member
The SQL packages hold the access path for a particular query. If you delete the package, then the As/400 goes through the query optimizer to recreate the package, finding the best route possible.

For specific ube issues, I work directly with IBM to determine the issue. You can first run the Database monitor in order to determine if you've created all of the required indices. But after that, IBM can work with the iDoctor and determine exactly what is happening in the execution of the UBE.
 

Tom_Davidson

VIP Member
Malcom,

If you can verify that service director is active (actually running/paging) when performance degrades I would consider running my QDZASOINIT jobs in a separate pool, I am normally runing 2000+ of these at any one time and give them a 7G pool. If you read some of the redbooks IBM recommends significantly more thant he 3.5M I give each job, but it seems to work for me.

We actually use 3 separate pools, 1 for the JDE Kernels, one for the UBE's and one for the QZDASOINIT jobs. I run about 73000 UBEs per week and we use the 'W' environmet with BSFN's mapped back where it makes sense.

By doing this we have significantly reduced the amount of 'thrashing' that occurs in our memory pools.

Hope this helps.

Tom
 

mkmckinn

Active Member
All,
Thanks for the responses before. It turns out that the V5R2 query dispatcher / CQE / SQE treat the baseline receiver query differently. That was our root cause. Similarly, the large, non-downloadable (sent by CD) PFTs are important as well for query performance. Finally, table rebuilds and storage reclaiming helped as did the holding of the SQL packages.

Cheers,
--Malcolm.
 
Top