Full package compile speed problem...


I'm having a problem that I can't seem to figure out. I have three servers, all running AIX 7.1. Server A has the data and objects for PD, Server B has the data and objects for PY and DV, and Server C is just an application server for PD. When I build a full PY package on B, the compilation of the business functions takes about 40 minutes. When I build a full PY package on A or C, it takes more than 12 hours. The compiler version and the BSFN Build section of the jde.ini files are the same on all the servers. A should be a faster server than B, and while it has more processes running, it never looks heavily loaded, even while the package is building. Does anyone have any idea what's causing this?
What machine do you actually run the build on? The Deployment server? A workstation? If a workstation, is it the same workstation for all environments?
I've done it from various places... the two that I just ran to compare were on two separate workstations that are identical except for path code (Windows 7 VMs set up on our ESX servers that are only being used for this).
Compare each SvrPkgBuild.log to determine which phase of the build is slower than expected. Here is an excerpt:

Sat Nov 29 13:08:21 - Server initialized now sending message back to the build machine.
Sat Nov 29 13:08:21 - Message sent.
Sat Nov 29 13:16:27 - Transferring File(s) to this enterprise server.
Sat Nov 29 13:20:19 - Build started.
Sat Nov 29 13:20:19 - Sending message back to build machine.
Sat Nov 29 14:57:49 - Transfer the SvrPkgBuild.log to the Deployment server.
Sat Nov 29 14:57:49 - Build is finished.

In this example this server spent roughly:
* 8 minutes waiting for the transfer to start
* 4 minutes receiving files
* 97 minutes compiling business functions
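
To run the same comparison on each server's log, the gaps between consecutive timestamps can be computed with a short script. This is only a sketch: the `phase_gaps` helper is hypothetical, the log text is the excerpt above, and because the log lines carry no year, the year used for parsing is an arbitrary assumption that only matters for computing differences.

```python
from datetime import datetime

# Excerpt from the SvrPkgBuild.log quoted above.
LOG = """\
Sat Nov 29 13:08:21 - Server initialized now sending message back to the build machine.
Sat Nov 29 13:08:21 - Message sent.
Sat Nov 29 13:16:27 - Transferring File(s) to this enterprise server.
Sat Nov 29 13:20:19 - Build started.
Sat Nov 29 13:20:19 - Sending message back to build machine.
Sat Nov 29 14:57:49 - Transfer the SvrPkgBuild.log to the Deployment server.
Sat Nov 29 14:57:49 - Build is finished.
"""

ASSUMED_YEAR = 2014  # assumption: the log has no year; any fixed year works for gaps


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    stamp, _, message = line.partition(" - ")
    when = datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %a %b %d %H:%M:%S")
    return when, message.strip()


def phase_gaps(log_text):
    """Return (message that started the gap, seconds until the next line),
    skipping lines that were logged in the same second as the next one."""
    events = [parse_line(l) for l in log_text.splitlines() if " - " in l]
    gaps = []
    for (start, message), (end, _) in zip(events, events[1:]):
        seconds = int((end - start).total_seconds())
        if seconds > 0:
            gaps.append((message, seconds))
    return gaps


gaps = phase_gaps(LOG)
for message, seconds in gaps:
    print(f"{seconds / 60:6.1f} min after: {message}")
```

Running it against the log from the fast server and the slow server side by side shows which phase accounts for the difference.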

Compile speed is largely dependent on the number and speed of processors available to the system/LPAR. Setting the jde.ini entry SimultaneousBuilds higher than the number of available cores (or to zero) can increase build times, because each build process competes for resources and the number of context switches increases.

All the time seems to be spent compiling. The machines are LPARs on the same physical box... the PY one has one physical processor worth of CPU allocated, which it sees as 16 CPUs, while the PD one has 4 physical processors worth that it sees as 24 CPUs. The results seem to be proportionally the same whether I have 5 or 20 simultaneous builds.
The symptoms described sound like processor saturation. The compile process incurs very little I/O wait, so it's difficult for the OS to take advantage of the additional virtual CPUs. You mentioned server A never looks heavily loaded when the build is in progress. Is this logical processor load or physical processor load for the LPAR?

Also, I once encountered an iSeries configured such that the production LPAR was in capped mode while the test LPAR was uncapped. The test environment often performed better, even though it was allocated fewer resources, because it could take advantage of unused capacity on the physical machine.

Hope this helps.
They're both uncapped... and the actual physical processor usage on A seemed to be below the allocated amount while the compile was going on, but I'm going to do it again and watch more carefully.
I just went back and checked my nmon graphs from when I was building the package and it was below 4 CPUs (which is the number of physical CPUs allocated) almost the entire time.


Would you mind supplying the output of "lparstat -i" from the affected LPAR, as well as "ulimit -a" for the user that owns the E1 processes on the same LPAR? The /etc/security/limits entries may also be helpful. You can email it to me if you would like.

That would be enough to at least get started.

From what you've described, you may have an odd value set for logical CPUs in the poorly performing LPAR, but nmon output is definitely helpful, especially if you have every LPAR graphed. This should include the VIOS LPARs if you have them.

Which model server are you running, and do you have shared processor pools defined (or none, meaning all LPARs have access to all CPUs)?
It's best not to have more logical CPU defined than you have physical entitlement to in the box. What is your SMT value? 2, 4, 8?
This is a 740 (8205-E6B), and it has 16 physical processors. The partition with the problem has 6 virtual CPUs and SMT 4. It has an entitlement of 4 but is uncapped, so it should be able to access up to 6 physical processors, since there are more than that in the box, right? I have nmon data every day for every partition except the VIO servers. The partition where I can build the package faster is on this same physical box, with an entitlement of 1, 4 virtual processors, and SMT 4. Last night I changed the ulimits for the jde user on the slow box to match what's on the fast one, and that's made a difference, but it's still over four times slower. Here's the lparstat info for the slow partition:

Type : Shared-SMT-4
Mode : Uncapped
Entitled Capacity : 4.00
Partition Group-ID : 32773
Shared Pool ID : 0
Online Virtual CPUs : 6
Maximum Virtual CPUs : 8
Minimum Virtual CPUs : 2
Online Memory : 57344 MB
Maximum Memory : 81920 MB
Minimum Memory : 16384 MB
Variable Capacity Weight : 128
Minimum Capacity : 0.20
Maximum Capacity : 5.00
Capacity Increment : 0.01
Maximum Physical CPUs in system : 16
Active Physical CPUs in system : 16
Active CPUs in Pool : 16
Shared Physical CPUs in system : 16
Maximum Capacity of Pool : 1600
Entitled Capacity of Pool : 1380
Unallocated Capacity : 0.00
Physical CPU Percentage : 66.67%
Unallocated Weight : 0
Memory Mode : Dedicated
Total I/O Memory Entitlement : -
Variable Memory Capacity Weight : -
Memory Pool ID : -
Physical Memory in the Pool : -
Hypervisor Page Size : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement : -
Memory Group ID of LPAR : -
Desired Virtual CPUs : 6
Desired Memory : 57344 MB
Desired Variable Capacity Weight : 128
Desired Capacity : 4.00
Target Memory Expansion Factor : -
Target Memory Expansion Size : -
Power Saving Mode : Disabled
Sub Processor Mode : -

And the /etc/security/limits values for the user (which are now the same on both):

fsize = -1
core = 2097151
cpu = -1
data = -1
rss = -1
stack_hard = -1
stack = -1
nofiles = 128000
Yes, you're correct. This answers my question; sorry it wasn't clear to me on Friday evening. As you stated, 6 virtual CPUs can access up to 6 physical processors even with 4 allocated, as long as the partition is uncapped, which it is.
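
For anyone checking the same thing on their own partition, here is a rough sketch of the arithmetic using the values from the lparstat output above. The `cpu_summary` helper is hypothetical, and it assumes the SMT thread count can be read from the end of the "Type" field as it appears here ("Shared-SMT-4"). Treating "Physical CPU Percentage" as entitlement divided by online virtual CPUs is an assumption that happens to match the 66.67% reported.

```python
# Relevant lines copied from the "lparstat -i" output posted above.
LPARSTAT = """\
Type : Shared-SMT-4
Mode : Uncapped
Entitled Capacity : 4.00
Online Virtual CPUs : 6
"""


def parse_lparstat(text):
    """Turn 'Key : value' lines into a dict, ignoring anything else."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def cpu_summary(fields):
    """Derive the CPU figures discussed in the thread."""
    smt_threads = int(fields["Type"].rsplit("-", 1)[-1])  # assumes "...-SMT-<n>"
    virtual_cpus = int(fields["Online Virtual CPUs"])
    entitled = float(fields["Entitled Capacity"])
    uncapped = fields["Mode"] == "Uncapped"
    return {
        # What the OS shows as CPUs: one per SMT thread per virtual CPU.
        "logical_cpus": virtual_cpus * smt_threads,
        # An uncapped partition can use at most one physical core per virtual CPU;
        # a capped one is held to its entitlement.
        "max_physical": float(virtual_cpus) if uncapped else entitled,
        # Assumption: entitlement as a share of the online virtual CPUs.
        "entitlement_pct": round(100 * entitled / virtual_cpus, 2),
    }


summary = cpu_summary(parse_lparstat(LPARSTAT))
print(summary)
```

The 24 logical CPUs match what the PD partition reports, and the ceiling of 6 physical processors is the figure confirmed in the reply above.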
Problem solved... I needed to put the license information for xlC in /var/ifor/nodelock. Now the whole compile takes 19 minutes.