
Media Object Text Conversion (RTF to HTML) fails with Invalid Character Error

Larry_Jones

Legendary Poster
We are upgrading to 9.2 from 9.1 and in the process we are trying to lose the ActiveX dependencies - so we need to convert the text.

The issue is that we just ran a test conversion of ~ 100,000 text entries in F00165 using P98MOHTM.

Most (83,000) entries converted successfully.
~ 1,300 had the error "Invalid characters found in RTF stream"
The remaining entries appear to have been lost during the journey - the converter made no mention of them :(

Anyone encounter similar issues with this? Resolutions found?

Any successes?
 

Larry_Jones

Legendary Poster
OK - I'll update our solution.

First, we believe the cause of the error is entries pasted from emails or Word documents that, while valid RTF, were not handled by whatever narrow method Oracle used for converting RTF text.
We came up with a "fix" of our own for the 12,000+ entries (out of approximately 750,000) that were not converted. The less-than-perfect fix was to:
1. Run the full conversion against F00165.
2. Create a SQL procedure that accepts large amounts of text (nvarchar(max)) and writes it to F00165.GDTXFT. It also updates GDGTFUTS1 to 'NON-RTF' (valid values in this field are blank, CONVERTED, or NON-RTF).
3. Write a Crystal Report that:
a. Verifies the count of entries not converted, then
b. Converts the RTF to regular text (carriage returns and spacing are preserved, but not special formatting such as bullets, colors, bold, etc.).
c. Generates SQL (calls to execute the procedure created in Step 2) for each text entry needing conversion.
4. Copy / paste the generated SQL into an SSMS (SQL Server Management Studio) query window and execute it.
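
For anyone without Crystal Reports, steps 3b and 3c could be approximated in a script. The Python below is only a rough sketch of that idea, not what we actually ran: the procedure name dbo.usp_WriteMediaText and its parameters (@ObjectName, @MediaKey, @Seq, @Text) are made up for illustration, the list of skipped RTF destinations is incomplete, and \u Unicode escapes are not handled. Hex escapes are assumed to be Windows-1252.

```python
import re

# RTF destinations whose contents are not visible text (partial list, an assumption).
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"   # control word, optional numeric argument, optional delimiter space
    r"|\\'([0-9a-fA-F]{2})"   # hex-escaped byte such as \'e9
    r"|\\([^a-z])"            # control symbol such as \\ \{ \} \*
    r"|([{}])"                # group open / close
    r"|[\r\n]+"               # raw line breaks in the RTF source are ignored
    r"|(.)"                   # any other character is plain text
)


def rtf_to_plain_text(rtf: str) -> str:
    """Strip formatting from RTF, keeping text, line breaks, and tabs."""
    out = []
    stack = []        # skip state saved at each '{'
    skipping = False  # True while inside a non-text destination
    for m in TOKEN.finditer(rtf):
        word, _arg, hexbyte, symbol, brace, char = m.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif word:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif not skipping:
                if word in ("par", "line"):
                    out.append("\n")
                elif word == "tab":
                    out.append("\t")
        elif symbol:
            if symbol == "*":
                skipping = True  # \* marks an optional destination
            elif not skipping and symbol in "\\{}":
                out.append(symbol)
        elif hexbyte and not skipping:
            out.append(bytes([int(hexbyte, 16)]).decode("cp1252", errors="replace"))
        elif char and not skipping:
            out.append(char)
    return "".join(out)


def sql_literal(text: str) -> str:
    """Quote a value as a T-SQL Unicode string literal (single quotes doubled)."""
    return "N'" + text.replace("'", "''") + "'"


def build_exec(object_name: str, media_key: str, seq: int, text: str) -> str:
    """Build one EXEC call to the hypothetical Step 2 procedure."""
    return (
        "EXEC dbo.usp_WriteMediaText "
        f"@ObjectName = {sql_literal(object_name)}, "
        f"@MediaKey = {sql_literal(media_key)}, "
        f"@Seq = {int(seq)}, "
        f"@Text = {sql_literal(text)};"
    )


if __name__ == "__main__":
    # Hypothetical sample entry; real rows would come from the unconverted F00165 records.
    sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0\fs20 Line one\par Line two with \b bold\b0  text\par}"
    print(build_exec("GT0101", "1001", 1, rtf_to_plain_text(sample)))
```

The printed statements can be pasted into SSMS as in step 4. Doubling single quotes matters here, since text pasted from emails often contains apostrophes.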

Following execution, all RTF entries that Oracle's conversion failed on are converted.

FYI, we found that OL.F00165 out of the box has 27 RTF text entries that evidently failed Oracle's own conversion for Object Librarian ... :)
 

BOster

Legendary Poster
Thanks for posting back the solution. At some point I think we will need to go through the same conversion and will undoubtedly run into the same issue.
 