


Just to update the group...

IBM's come back and said it's a documented limit of the XML parser..
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzasc/xmlparselimit.htm


- If the parsing is done in a single-byte character CCSID, the maximum
number of characters that the parser can handle is 2147483408.
- If the parsing is done in UCS-2, the maximum number of UCS-2
characters that the parser can handle is 1073741704.

If the CCSID of the file is 1208, the default is to process the file as
UCS-2, which means that the limit is 1,073,741,704.
IBM suggested I add option "ccsid=job" to get the larger limit of
2,147,483,408.

That has enabled me to process the current 1.5GB document.
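
For anyone wanting to sanity-check a document against those two limits, here's a rough sketch of the arithmetic in Python (not RPG, it's only there for the numbers). The limits are the ones quoted from the IBM page above. The document size is an assumption: I'm treating "1.5GB" as roughly 1.5 billion characters, i.e. about one character per byte, and the names (fits, DOC_CHARS, etc.) are made up for illustration.

```python
# Limits quoted from the IBM documentation (in characters, not bytes).
SINGLE_BYTE_LIMIT = 2_147_483_408  # parsing in a single-byte CCSID
UCS2_LIMIT = 1_073_741_704         # parsing in UCS-2

# Assumption: a ~1.5GB document holding about one character per byte.
DOC_CHARS = 1_500_000_000


def fits(char_count: int, limit: int) -> bool:
    """Return True if a document of char_count characters is within limit."""
    return char_count <= limit


if __name__ == "__main__":
    print("fits UCS-2 limit:      ", fits(DOC_CHARS, UCS2_LIMIT))
    print("fits single-byte limit:", fits(DOC_CHARS, SINGLE_BYTE_LIMIT))
    # UCS-2 uses 2 bytes per character, so both limits work out to the
    # same number of bytes.
    print("same byte size:", UCS2_LIMIT * 2 == SINGLE_BYTE_LIMIT)
```

Under those assumptions the document is over the UCS-2 limit but under the single-byte one, which lines up with the error showing up around the 1GB mark and with ccsid=job getting past it.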

There's an existing RFE asking for the parser to handle larger XML docs..
http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=126521

I've added a comment to my case and to the RFE above..

I don't understand why that limit applies when using XML-INTO with the
%HANDLER() option, especially in conjunction with the path= option, where
the document is processed as repeated subsets of data from just one
section of the document.

Obviously, the parser isn't trying to read the document in its entirety,
otherwise I wouldn't have been able to successfully process any of it.
Seems like it must be reading/parsing in chunks, but instead of clearing
after each chunk, it's saving it in memory.

Again, this doesn't make sense when using XML-INTO with the %HANDLER()
option in conjunction with the path= option.


Charles




On Thu, Mar 26, 2020 at 10:33 AM Charles Wilt <charles.wilt@xxxxxxxxx>
wrote:

All,

Are there any limits or known issues regarding the size of an XML doc that
can be processed using the handler version of XML-INTO?

I've got some code...
XML-INTO %HANDLER(#PartMasterLine : allOk)
%XML(%TRIM(WpIfsFile) : %trim($OPTIONS));

The #PartMasterLine is a DIM(20) data structure...

I process about 600,000 elements (75-80% or so) successfully...

Then get a

RNX0353 - The XML document does not match the RPG variable; reason code 5.

Cause . . . . . : While parsing an XML document, the parser found that the XML document does not correspond to RPG variable "PARM" and the options do not allow for this. The reason code is 5. The exact subfield for which the error was detected is "PARM(1).returnindicator". The options are "doc=file
path=ShowPartsMaster/ShowPartsMasterDataArea/PartMaster/PartMasterLine
case=any ns=remove datasubf=Data allowextra=yes allowmissing=yes".


5. The XML document contains extra XML attributes or elements that do not match subfields.



Now, I've added some code and even changed the DIM(20) to a DIM(1) so that I can better understand where the problem is..


Looking at the XML, starting from the last successfully processed PartMasterLine, I don't see any problems...


And if I remove about half the data (from ~1.5GB to ~700MB) from the beginning, the remaining data processes without issue...


I've tried "minifying" the XML and I get further into the doc, but still get the error.


Also have tried new versions of the doc, same problem.


The one constant: the error gets thrown at approximately the 1GB mark, whatever the document.


I've got a case open with IBM, but I thought I'd throw this out here to see if anybody else has seen a similar issue.


We're running 7.2 and are pretty up to date on PTFs.


Thanks!

Charles







This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
