E9.2 How to remove CDATA[] XML convention

ccornagg

Hello,

I'd like your advice on whether there is an easy way, using Orchestrator, to remove the "CDATA[] convention" from an XML file.

Here is an example of what I'm looking for, using an XML tag that holds the company name "Carlo & Co".

The input XML file contains:

<MLNM><![CDATA[Carlo & Co]]></MLNM>

Output XML file should contain:

<MLNM>Carlo &amp; Co</MLNM>

As you can see, my example contains the special character &, which has to be converted into &amp; (see also UDC 70|RC).

Kind regards,

Carlo
E1 9.2.7.3
 
"Easy" is relative! Some people (me included, in years past) consider any orchestration with scripting "difficult". But with ChatGPT handling 99% of the scripting lift, this becomes an easy exercise.

Open ChatGPT. (link to this conversation)

Prompt:

Code:
I need a jython script using the following template to convert XML input from "<MLNM><![CDATA[Carlo & Co]]></MLNM>" to "<MLNM>Carlo &amp; Co</MLNM>":
import com.oracle.e1.common.OrchestrationAttributes
#import json


def main(orchAttr, inputMap):
    returnMap = {}
    # Add logic here
    
    #stringVal = inputMap["inputStringVal"]
    #numVal = float(inputMap["inputNumVal"])
    
    #orchAttr.writeWarn("Jython Warn ")   
    #orchAttr.writeDebug("Jython Debug ")


    #array =[{"Column1":"test1", "Column2":"val1"},{"Column1":"test2", "Column2":"val2"}]     
    #returnMap["Output Array"] = json.dumps(array)
 
    #returnMap["Output String"] = stringVal
    #returnMap["Output Attrib"] = "AB " + str(orchAttr.getAddressNumber())
    #returnMap["Addition"] = numVal + 1.1
 
    return returnMap

ChatGPT response:

Here's a Jython script using the provided template to convert the XML input from "<MLNM><![CDATA[Carlo & Co]]></MLNM>" to "<MLNM>Carlo &amp; Co</MLNM>":
Code:
import com.oracle.e1.common.OrchestrationAttributes
#import json
import re

def main(orchAttr, inputMap):
    returnMap = {}
    
    # Extract the input string
    xml_input = inputMap["XMLInput"]
    
    # Replace the CDATA section and escape the ampersand
    pattern = re.compile(r'<!\[CDATA\[(.*?)\]\]>')
    modified_string = pattern.sub(lambda m: m.group(1).replace("&", "&amp;"), xml_input)
    
    # Replace the original tag content with the modified string
    modified_xml = re.sub(r'<MLNM><!\[CDATA\[.*?\]\]></MLNM>', f'<MLNM>{modified_string}</MLNM>', xml_input)
    
    # Output the modified XML
    returnMap["ModifiedXML"] = modified_xml
    
    return returnMap

How the script works:

  1. The script takes an input XML string (xml_input), which is provided through the inputMap.
  2. It uses regular expressions to find and replace the CDATA section and converts the & symbol to &amp;.
  3. The modified XML string is returned as the output in the returnMap.
You can adjust the inputMap["XMLInput"] key based on how you're passing the XML data into the script. If you need further customization or adjustments, let me know!

The script errored.

Prompt 2:

Exception SyntaxError: no viable alternative at input ''<MLNM>{modified_string}</MLNM>'' in <script> at line number 16 at column number 64: Caused by: File "<script>", line 16

Response 2:

The error you're encountering is due to the syntax of the string replacement within the script. Since Jython may have issues with the use of f-strings or some formatting, we'll modify the code to use standard string concatenation. Here's the revised script:
Code:
import com.oracle.e1.common.OrchestrationAttributes
#import json
import re

def main(orchAttr, inputMap):
    returnMap = {}
    
    # Extract the input string
    xml_input = inputMap["XMLInput"]
    
    # Replace the CDATA section and escape the ampersand
    pattern = re.compile(r'<!\[CDATA\[(.*?)\]\]>')
    modified_string = pattern.sub(lambda m: m.group(1).replace("&", "&amp;"), xml_input)
    
    # Replace the original tag content with the modified string
    modified_xml = xml_input.replace(xml_input, "<MLNM>" + modified_string + "</MLNM>")
    
    # Output the modified XML
    returnMap["ModifiedXML"] = modified_xml
    
    return returnMap

Explanation of Changes:

  • Instead of using f-string formatting, the script now uses + to concatenate strings.
  • The replace method directly substitutes the original input string with the modified version.
This should resolve the syntax error. Let me know if you encounter any further issues!
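
For reference: f-strings need Python 3.6 or later, while Jython follows Python 2.7 syntax, which is why line 16 failed to parse. Below is a small sketch of three spellings that are valid 2.7 syntax. The helper names are made up for illustration and are not part of the Orchestrator template.

```python
# Three ways to build the tag without f-strings; all are valid Python 2.7 syntax.
# wrap_concat, wrap_percent and wrap_format are hypothetical helper names.

def wrap_concat(text):
    # Plain concatenation, as used in Response 2
    return "<MLNM>" + text + "</MLNM>"

def wrap_percent(text):
    # %-style formatting
    return "<MLNM>%s</MLNM>" % text

def wrap_format(text):
    # str.format with an explicit positional index
    return "<MLNM>{0}</MLNM>".format(text)

sample = "Carlo &amp; Co"
results = [wrap_concat(sample), wrap_percent(sample), wrap_format(sample)]
```

All three produce the same string, so the choice is mostly a matter of taste.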

The script as provided returned "<MLNM><MLNM>Carlo &amp; Co</MLNM></MLNM>".

I went in to see what was happening and fixed the script manually.
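
What was happening, as far as I can tell: the regex substitution only rewrites the CDATA section, so its result already includes the surrounding <MLNM> tags. Wrapping that result in another pair of tags doubles them. A stand-alone sketch that reproduces this with the sample value from the question:

```python
import re

xml_input = "<MLNM><![CDATA[Carlo & Co]]></MLNM>"
pattern = re.compile(r'<!\[CDATA\[(.*?)\]\]>')

# sub() rewrites only the CDATA section; the surrounding tags are kept.
modified_string = pattern.sub(lambda m: m.group(1).replace("&", "&amp;"), xml_input)
# modified_string -> <MLNM>Carlo &amp; Co</MLNM>

# Response 2 then wrapped that result in a second pair of tags.
doubled = xml_input.replace(xml_input, "<MLNM>" + modified_string + "</MLNM>")
# doubled -> <MLNM><MLNM>Carlo &amp; Co</MLNM></MLNM>
```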

Fixed script:
Code:
import com.oracle.e1.common.OrchestrationAttributes
#import json
import re

def main(orchAttr, inputMap):
    returnMap = {}
    
    # Extract the input string
    xml_input = inputMap["XMLInput"]
    
    # Strip each CDATA wrapper and escape the ampersands inside it
    pattern = re.compile(r'<!\[CDATA\[(.*?)\]\]>')
    modified_string = pattern.sub(lambda m: m.group(1).replace("&", "&amp;"), xml_input)
    
    # pattern.sub() already returns the whole document, so no further replace is needed
    modified_xml = modified_string
    
    # Output the modified XML
    returnMap["ModifiedXML"] = modified_xml
    
    return returnMap
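
The fixed script only escapes &, and the regex stops at line breaks. If your CDATA values can contain < or >, or span several lines, a slightly more defensive variant might look like the sketch below. It assumes the standard xml.sax.saxutils module is available in the Jython runtime that Orchestrator uses, which I have not verified. strip_cdata is a made-up helper name, and the com.oracle.e1 import from the template is left out so the snippet also runs outside Orchestrator.

```python
import re
from xml.sax.saxutils import escape

# DOTALL lets a CDATA section span several lines.
CDATA_PATTERN = re.compile(r'<!\[CDATA\[(.*?)\]\]>', re.DOTALL)

def strip_cdata(xml_text):
    """Replace each CDATA section with its escaped text content."""
    # escape() converts &, < and > to &amp;, &lt; and &gt;
    return CDATA_PATTERN.sub(lambda m: escape(m.group(1)), xml_text)

def main(orchAttr, inputMap):
    returnMap = {}
    # Same input and output names as the fixed script above
    returnMap["ModifiedXML"] = strip_cdata(inputMap["XMLInput"])
    return returnMap
```

Because the pattern is not tied to MLNM, this handles every CDATA section in the document, not just the one tag from the example.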

Thank the creator​

1724686951748.png

Screenshot of the working script component:

Note that the input (XMLInput) has to be added to the script component manually.
1724686902722.png
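
Since the input name has to match the key the script reads, a small guard can make a missing or misspelled input easier to spot. This is only a sketch: read_input is a hypothetical helper, and it assumes inputMap behaves like a Python dict (supports .get() and iteration over its keys), which I have only checked with a plain dict.

```python
def read_input(inputMap, name):
    """Return the named input, or fail with the names that were actually passed."""
    value = inputMap.get(name)
    if value is None:
        # sorted(inputMap) iterates over the keys of a dict
        raise ValueError("Input '%s' not found; inputs received: %s"
                         % (name, sorted(inputMap)))
    return value
```

Inside main you would then write xml_input = read_input(inputMap, "XMLInput") instead of indexing inputMap directly.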
 