E9.2 How To Read A Web Page for Processing

Euroboy · Jan 11, 2023

Hi All

I have a requirement where the data I need to process is provided on a web page controlled by a third party (one of our suppliers).
I can construct the URL so that it lands on the right page, which displays all the data I need (the content of the web page is basically in XML format).
The problem is: how can I "read" the contents of the web page?

I guess the starting point would be to use Orchestrator to land on the web page ... but what then?

Thanks for any help that may come my way.

We are on E1 9.2.6.3.

BOster · Jan 11, 2023

If you want to scrape data off a web page, your best bet is to do that piece outside the JDE toolset, in a language such as Java or Python, and expose it as a web service that Orchestrator (or a BSSV/BSFN, etc.) can consume. The downside, obviously, is that you need the infrastructure to deploy it to. If all you have are the JDE-based servers, I am sure there are Orchestrator gurus here who could detail how to do this with Orchestrator/Groovy. Within the JDE toolset, I would probably try BSSVs and Java, or even a C BSFN. To me it would just be easier outside the confines of the JDE toolset: Python, for example, has libraries that help you scrape data off a web page.
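As a rough illustration of the "do it outside JDE" route, here is a minimal Python sketch using only the standard library. It assumes the supplier page returns plain XML; the element names (`catalog`, `item`, `sku`, `price`) and the sample payload are made up for the example, since the real layout is not shown in this thread. The JSON it produces is the kind of output a small web service could hand back to an Orchestrator REST connector.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict, List, Union


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw response body; the XML parser handles the encoding."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_records(xml_data: Union[str, bytes],
                  record_tag: str = "item") -> List[Dict[str, str]]:
    """Turn every <record_tag> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_data)
    records = []
    for element in root.iter(record_tag):
        records.append({child.tag: (child.text or "").strip()
                        for child in element})
    return records


def records_as_json(xml_data: Union[str, bytes],
                    record_tag: str = "item") -> str:
    """Serialize the parsed records as a JSON array."""
    return json.dumps(parse_records(xml_data, record_tag))


# Hypothetical payload: the real supplier XML will use different tags.
SAMPLE_XML = """<catalog>
  <item><sku>A100</sku><price>12.50</price></item>
  <item><sku>B200</sku><price>7.00</price></item>
</catalog>"""

if __name__ == "__main__":
    # With a real URL this would be: records_as_json(fetch_bytes(url))
    print(records_as_json(SAMPLE_XML))
```

Keeping the download (`fetch_bytes`) separate from the parsing means the parsing can be tested against a saved copy of the page without hitting the supplier's site.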

You may also want to make sure that scraping data off a web page doesn't violate your third-party supplier's TOS before you do this.

Euroboy · Jan 11, 2023

Thanks Brian.

We need to do this using the standard JDE toolset, and my preferred route would be Orchestrator rather than BSSV.

If there are any Orchestrator gurus out there, would you show the rest of us what to do?

Thanks, again

DaveWagoner · Jan 11, 2023

You certainly can just use a connection/connector out to an HTTP website and get the entire HTML/CSS back. Parsing the response body then becomes a really "fun" exercise.
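To give a feel for what that parsing involves when the body comes back as HTML rather than clean XML, here is a small sketch in Python (standard library only). It is not Orchestrator code; inside Orchestrator the equivalent logic would presumably live in a Groovy custom request, which is not shown here. The sample page and the assumption that the data sits in a table are hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell (styles, scripts, layout) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html_text):
    """Return the table rows found in html_text as lists of strings."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical page: markup and CSS wrapped around the data we care about.
SAMPLE_HTML = """<html><head><style>td { color: red; }</style></head>
<body><table>
<tr><th>SKU</th><th>Price</th></tr>
<tr><td>A100</td><td><b>12.50</b></td></tr>
</table></body></html>"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

Even in this tiny case the parser has to skip the stylesheet and see through nested tags such as `<b>`, which is why a page that serves real XML is much easier to work with than one that has to be scraped.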
