Re: RE: RE: Scripting Tool
There's not really any good Guinness in Denver, you know; I'm always ready to be proven wrong on this matter (hee hee).
One of my last roles at WWAT was to counter exactly one of the arguments you made about internal scalability testing for the benchmarking group. I made a decision to eradicate any possibility of bad benchmarking at WWAT by ensuring that all tests and benchmarks both ran the real OneWorld code and were honest representations of a true enterprise environment.
Hence the introduction of Macro Scheduler testing that sat above the GUI code and hammered OneWorld as if it were a real user, hitting the same Dr Watsons and memory issues a real user would. And because this was happening in a lab environment, Development was forced to help out as much as possible. Believe me, everyone knew of the instability issues of the B733 base. B7331 was the first Macro Scheduler run, and in the lab we discovered hundreds of issues before the release was distributed.
You are correct, however. The 18 standard scripts that JD Edwards used were extremely "simple" and, most importantly, extremely disjointed. They bore no resemblance to a real customer, and after 200,000 sales orders had been entered, nothing further was done with them in the application. The 18 scripts did, however, touch the majority of the GUI, so they were a great test of the toolset and foundation.
However, Richard, after you left (and just before I left), I decided to create a high-watermark benchmark that would once and for all prove JD Edwards' scalability, and that would actually use the application as if it were running at a customer site.
The idea struck me after the successful running of the Fortune 1 benchmark. In that benchmark, we pulled in 8 RS6000 application servers and ran UBEs on them to process a large number of configurator sales orders against a huge 12-processor RS6000 database server. We pulled in a large number of application specialists to set up the applications themselves, generated huge amounts of randomized data, and ran that data through the RS6000s using OneWorld UBEs.
When I started the reports on this benchmark, I realized that what we were actually doing was running huge numbers of kernels in parallel against specific sets of data, with each kernel pumping business function calls. Now, when OneWorld is running in a true physical three-tier configuration, the same thing happens with interactive users. A user generates business function calls from their client to the application server, which in turn processes the data autonomously. The client itself does very little of the business end of the work; the kernels on the application servers are processing the real data.
Hence I worked out that if we can scale up the number of kernels on the application server against the database server, the number of client connections that generate the app server calls is really not as important. One just needs to know how many users a single Terminal Server or Web server can support, then work out how many TRANSACTIONS a database/application server configuration can support. By moving toward a transactional model, one can truly understand how to architect OneWorld for a very large customer.
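To make that transactional sizing idea concrete, here is a minimal sketch (the capacity constants are purely illustrative assumptions of mine, not measured JD Edwards figures): the front-end tier is sized by user count, while the application tier is sized by transaction throughput.

import math

# Illustrative sizing sketch of the transactional model described above.
# The capacity constants are assumptions for the example only; replace
# them with figures measured on your own front ends and kernels.
USERS_PER_FRONT_END = 50            # interactive users one Terminal/Web server hosts
LINES_PER_HOUR_PER_KERNEL = 1000    # business-function throughput of a single kernel
KERNELS_PER_APP_CPU = 1             # kernels one application-server processor sustains

def size_configuration(total_users, order_lines_per_hour):
    """Turn a user count and a transaction target into rough box counts."""
    front_ends = math.ceil(total_users / USERS_PER_FRONT_END)              # scales with users
    kernels = math.ceil(order_lines_per_hour / LINES_PER_HOUR_PER_KERNEL)  # scales with transactions
    app_cpus = math.ceil(kernels / KERNELS_PER_APP_CPU)
    return front_ends, kernels, app_cpus

# Example: 2,000 interactive users pushing 200,000 order lines per hour.
print(size_configuration(2000, 200000))   # -> (40, 200, 200)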
It was important that this kind of benchmark not only be useful for prospective customers to gauge OneWorld's scalability, but also be independent of any specific industry vertical. Too many times we had heard "but that's not my business", even when that objection was beside the point. Tony and I therefore decided that what we benchmarked would not only use some VERY advanced JD Edwards functionality, but also that the fictitious company we created would not be representative of any customer JD Edwards could possibly sell to. In fact, we made the business model so ludicrous that it was OBVIOUSLY not any customer's "business", and instead they would be forced to look at the detail of the benchmark rather than the high-level view.
The benchmark was codenamed "Moonshot" - and the fictitious company was named "Apollo Inc". The benchmark was centered around Distribution - and we created a model that had so many transactions that it would be extremely difficult for any hardware provider to meet the goal.
The business model was the following...
1. Apollo Inc has 2 warehouses - each with 30,000 locations (10 aisles, 6 bins/aisle, 500 locations/bin)
2. Apollo sells ~5,000 different items (moondisks) to ~50,000 distributors across the US
3. Apollo has numerous web servers taking the orders and converting them into EDI format
4. In 8 hours, 200,000 orders with 84 lines apiece (16 million order lines) need to be processed from Order to Cash. All orders come from distributors (hence the large number of lines); the arithmetic this implies is sketched just after this list
5. No two order lines are the same (completely randomized on object #, customer # and Qty)
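As a quick sanity check, the figures in that list multiply out as follows (a throwaway sketch; the inputs are exactly the numbers quoted above):

# Warehouse locations per the list above
aisles, bins_per_aisle, locations_per_bin = 10, 6, 500
print(aisles * bins_per_aisle * locations_per_bin)   # 30,000 locations per warehouse

# Order volume, and the throughput it implies over the 8-hour window
orders, lines_per_order, window_hours = 200000, 84, 8
total_lines = orders * lines_per_order               # 16,800,000 lines (the round "16 million" above)
print(total_lines / window_hours)                    # 2,100,000 order lines per hour sustained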
The business process flow used the following modules:
1. Sales Order Entry
2. Advanced Warehouse Management (picking)
3. Advanced Transportation (shipping)
4. Invoice
5. Sales update to GL
The Sales Order Entry process was set up either to use Advanced Pricing (discounts based on quantities) or NOT to, depending on the user. Both ON and OFF were tested.
Advanced Warehouse Management would pick from random warehouse locations using either FIXED or RANDOM logic, depending on which of the two warehouses was used.
Advanced Transportation assigned a carrier based on location; weight was calculated and shipment costs were assigned as a backorder.
Invoices were then generated to reflect new freight costs and eventually updated against AR.
One of the most important, fundamental differences between Moonshot and the other benchmarking practices is that a mistake in any part of the process would ripple through the entire procedure. We also purposely introduced a small number of invalid orders to prove that the procedure was 100% accurate.
The goal of the project was to create a benchmark environment that could eventually be scaled (based on known technology) linearly to 2,000,000 order lines per hour. The proof of concept scaled to no more than 40,000 order lines per hour (Advanced Pricing off) and 25,000 order lines per hour (Advanced Pricing on) with 4 processors, and to 100,000 order lines per hour with 16 processors; with later benchmarks (performed at HP) this scaled to 485,000 lines per hour last November (http://biz.yahoo.com/prnews/001121/co_jd_edwa.html) with 48 processors. Our extrapolation was that 300 processors would be required for 2 million transactions per hour (still accurate) and that a 24-processor RS6000 would handle the database load.
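For anyone who wants to follow the scaling arithmetic, here is the per-processor throughput those figures work out to (my own back-of-the-envelope reading, not part of the published results):

# Per-processor throughput at each measured point (figures quoted above)
measurements = [(4, 40000), (16, 100000), (48, 485000)]   # (processors, order lines/hour)
for cpus, lines in measurements:
    print(cpus, "CPUs:", round(lines / cpus), "lines/hour per processor")

# A naive linear projection from the 48-CPU point would suggest roughly
# 2,000,000 / (485,000 / 48) ~ 198 processors; the 300-processor estimate
# presumably allows headroom for less-than-linear scaling at that size.
print(2000000 / (485000 / 48))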
These results are pretty much exactly how customers see the performance of OneWorld. It is interesting to note (and you'd be proud, Richard) that since B7332 it is extremely difficult to see the limitations on Next Numbering anymore. Fascinating, since we used to hit those limits at 50 users once upon a time!
Another important aspect of this benchmark is that not only are the transactions real (and thousands of pages of PDFs are produced), but the entire documentation of Moonshot was designed to be open so that any competitor could follow suit and try to beat the benchmarks. Why? Because the initial data is EDI-based, and therefore ERP-independent! All another vendor has to do is enter the financial model of the company by performing a CRP (all pretty simple!).
Lastly, from what I understand, a platform partner is seriously considering a project to reach the 2 million mark. This would blow any other benchmark clear out of the water, as if that were necessary. 500,000 order lines per hour is pretty scalable; I think that Amazon MAY do 2 million order lines a WEEK over 24-hour peak periods!
Well, that's my little 2c. I agree with you wholeheartedly, Richard, that ERP vendors should try to benchmark based on real customers, and I tried to achieve that with the high-watermark benchmark "Moonshot". I hope that the legacy of these benchmarks continues, but my fear is that the industry is changing, and with Microsoft's entry into the market, benchmarks will become less and less real and more and more marketable.
It's all getting interesting, as one partner would put it.
Jon Steel
ERP Sourcing
http://www.erpsourcing.com
[email protected]