Just to update the group...
IBM's come back and said it's a documented limit of the XML parser..
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_72/rzasc/xmlparselimit.htm
- If the parsing is done in a single-byte character CCSID, the maximum
number of characters that the parser can handle is 2147483408.
- If the parsing is done in UCS-2, the maximum number of UCS-2
characters that the parser can handle is 1073741704.
If the CCSID of the file is 1208, the default is to process the file as
UCS-2, which means that the limit is 1,073,741,704.
IBM suggested I add option "ccsid=job" to get the larger limit of
2,147,483,408.
That has enabled me to process the current 1.5GB document.
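For anyone wanting to try the same workaround, here's a rough sketch of how the option fits into a handler-style XML-INTO. It's not my production code: the data structure is cut down to the one subfield named in the error message, and the handler name (partMasterLineHandler), the template name (partMasterLine_t), the IFS path and the field sizes are all made up for illustration. The option string is the one from the RNX0353 message with ccsid=job added.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

// Hypothetical layout: only returnIndicator is known from the RNX0353
// message; the real PartMasterLine structure has more subfields.
dcl-ds partMasterLine_t qualified template;
  returnIndicator varchar(10);
end-ds;

dcl-s WpIfsFile varchar(256) inz('/tmp/partmaster.xml'); // hypothetical path
dcl-s options   varchar(512);
dcl-s allOk     ind inz(*on);

// Options as reported in the error message, plus ccsid=job so the parser
// works in the job CCSID instead of UCS-2.
options = 'doc=file ccsid=job '
        + 'path=ShowPartsMaster/ShowPartsMasterDataArea/PartMaster/PartMasterLine '
        + 'case=any ns=remove datasubf=Data allowextra=yes allowmissing=yes';

xml-into %handler(partMasterLineHandler : allOk)
         %xml(%trim(WpIfsFile) : options);

*inlr = *on;
return;

// Called by the parser with up to 20 PartMasterLine elements at a time.
dcl-proc partMasterLineHandler;
  dcl-pi *n int(10);
    commArea ind;
    lines    likeds(partMasterLine_t) dim(20) const;
    numLines uns(10) value;
  end-pi;
  dcl-s i uns(10);

  for i = 1 to numLines;
    // process lines(i) here
  endfor;

  return 0;   // zero tells the parser to keep going
end-proc;
```

With ccsid=job the document is parsed as single-byte characters, so the 1.5GB file falls under the 2,147,483,408 limit rather than the 1,073,741,704 UCS-2 limit. One thing worth checking before relying on it: if the file contains characters that don't exist in the job CCSID, I'd expect them not to survive the conversion, so verify that against your own data.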
There's an existing RFE asking for the parser to handle larger XML docs..
http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=126521
I've added a comment to my case and to the RFE above..
I don't understand why that limit applies when using XML-INTO with the
%HANDLER() option, especially in conjunction with the path= option, where
the data is processed in repeated subsets from just one section of the
document.
Obviously, the parser isn't trying to read the document in its entirety,
otherwise I wouldn't have been able to successfully process any of it.
Seems like it must be reading/parsing in chunks, but instead of clearing
after each chunk, it's saving it in memory.
Again, this doesn't make sense when using XML-INTO with the %HANDLER()
option in conjunction with the path= option.
Charles
On Thu, Mar 26, 2020 at 10:33 AM Charles Wilt <charles.wilt@xxxxxxxxx>
wrote:
All,
Is there any kind of limits or known issues to the size of an XML doc that
can be processed using the handler version of XML-INTO?
I've got some code...
XML-INTO %HANDLER(#PartMasterLine : allOk)
%XML(%TRIM(WpIfsFile) : %trim($OPTIONS));
The #PartMasterLine is a DIM(20) data structure...
I process about 600,000 elements (75-80% or so) successfully...
Then get a
RNX0353 - The XML document does not match the RPG variable; reason code 5.
Cause . . . . . : While parsing an XML document, the parser found that
the XML document does not correspond to RPG variable "PARM" and the options
do not allow for this. The reason code is 5. The exact subfield for which
the error was detected is "PARM(1).returnindicator". The options are
"doc=file
path=ShowPartsMaster/ShowPartsMasterDataArea/PartMaster/PartMasterLine
case=any ns=remove datasubf=Data allowextra=yes allowmissing=yes".
5. The XML document contains extra XML attributes or elements that do
not match subfields.
Now, I've added some code and even changed the DIM(20) to a DIM(1) so
that I can better understand where the problem is..
Looking at the XML, starting from the last successfully processed
PartMasterLine, I don't see any problems...
And if I remove about half the data (from ~1.5GB to ~700MB) from the
beginning, the remaining data processes without issue...
I've tried "minifying" the XML and I get further into the doc, but still
get the error.
Also have tried new versions of the doc, same problem.
The one constant, the error gets thrown at approximately the 1GB mark
into whatever document.
I've got a case open with IBM, but I thought I'd throw this out here
to see if anybody else has seen a similar issue.
We're running 7.2 and are pretty up to date on PTFs.
Thanks!
Charles
--
This is the RPG programming on IBM i (RPG400-L) mailing list