My preference is to stick with XML-SAX. Once I learned the details and saw how much faster it is at processing XML documents, I have yet to write another XML-INTO process. XML-SAX lets you handle large documents, and/or those that are not formatted to fit neatly into a DS, with better results (IMO). My work has been to take an XML document, parse it, and then load a database record with the values. With XML-SAX I don't need a very large variable to hold all the incoming data, the way I would for data parsed by XML-INTO.
I use this program to analyze an XML document before I write an XML-SAX process to handle it. There are some things that trip most people up. One is special characters not being encoded before you process the document; the main culprit is the '&' character. Another is UCS encoding. The RPG reference was a big help in learning the XML-SAX way of parsing, and I copied parts of its sample code into this program. When you get data with a predefined reference like &lt; (<), &gt; (>) or &amp; (&), the parser splits the data across multiple calls to the handler. So you have to hold on to partial values as they arrive, without knowing in advance whether the data contains one of these predefined references.
// Data structure used as a parameter between
// the XML-SAX operation and the handling
// procedure.
// - "attrName" is set by the procedure doing the
// XML-SAX operation and used by the handling procedure
// - "attrValue" is set by the handling procedure
// and used by the procedure doing the XML-SAX
// operation
// - "haveAttr" is used internally by the handling
// procedure
/COPY QCPYSRC,HSPECLE
Ctl-Opt Debug(*XMLSAX);
Dcl-F Qsysprt Printer(132) Usage(*Output);
Dcl-F Print Printer(132) Usage(*Output);
Dcl-Ds Qsysprtds Len(132);
End-Ds;
Dcl-Ds Printds Len(132);
End-Ds;
Dcl-Ds Xmlevnts Qualified Inz;
Eventid Int(10) Dim(25);
Event Char(30) Dim(25);
Eventid# Int(10) Dim(25);
End-Ds;
Dcl-S Element# Int(10);
Dcl-S Idx Int(10);
Dcl-S Idx2 Int(10);
/COPY QCPYSRC,Dspgminf4
D Xmlrc 10I 0 Overlay(Pgminf:368)
Dcl-Ds Info Inz;
Attrname Varchar(20);
Haveattr Ind;
Attrvalue Varchar(20);
End-Ds;
// Prototype for procedure "myHandler" defining
// the communication-area parameter as being
// like data structure "info"
Dcl-Pr Myhandler Int(10);
Commarea Likeds(Info);
Event Int(10) Value;
String Pointer Value;
Stringlen Int(20) Value;
Exceptionid Int(10) Value;
End-Pr;
Dcl-S Xmldoc Varchar(265);
// Prototype for Tstxmlsax1
Dcl-Pr Tstxmlsax1;
Inpdoc_ Like(Inpdoc);
End-Pr;
// *ENTRY Interface for Main Procedure
Dcl-Pi Tstxmlsax1;
Inpdoc Char(64);
End-Pi;
Xmldoc = %TRIM(Inpdoc);
// Start SAX processing. The procedure "myHandler"
// will be called for every SAX event; the first
// parameter will be the data structure "info".
Xml-Sax(E) %HANDLER(Myhandler : Info)
%XML(Xmldoc :'doc=file ccsid=ucs2');
// The XML-SAX operation is complete. The
// communication area can be checked to get the
// value of the attribute.
If Not %ERROR() And Attrvalue <> '';
Dsply (Attrname + '=' + Attrvalue);
Endif;
For Idx2 = 1 To Idx By 1;
Qsysprtds = 'XML Event: ' +Xmlevnts.Event(Idx2)
+' = ' +%EDITC(Xmlevnts.Eventid(Idx2):'3')
+' Count:' +%EDITC(Xmlevnts.Eventid#(Idx2) :'3')
;
Write Qsysprt Qsysprtds;
Endfor;
*INLR = *ON;
Return;
// The SAX handling procedure "myHandler"
Dcl-Proc Myhandler;
Dcl-Pi *N Int(10);
Comm Likeds(Info);
Event Int(10) Value;
String Pointer Value;
Stringlen Int(20) Value;
Exceptionid Int(10) Value;
End-Pi;
Dcl-S Value Char(65535) Based(String);
Dcl-S Ucs2Value Ucs2(16383) Based(String);
Dcl-S Rc Int(10) Inz(0);
Dcl-S Parentevt Like(Event) Static;
Element# += 1;
Idx2 = %LOOKUP(Event :Xmlevnts.Eventid);
If Idx2 > 0;
Xmlevnts.Eventid#(Idx2) += 1;
Else;
Idx += 1;
Xmlevnts.Eventid(Idx) = Event;
Xmlevnts.Eventid#(Idx) += 1;
Idx2 = Idx;
Endif;
Select;
When Event = *XML_Start_Document;
Xmlevnts.Event(Idx2) = '*XML_START_DOCUMENT';
When Event = *XML_Version_Info;
Xmlevnts.Event(Idx2) = '*XML_VERSION_INFO';
When Event = *XML_Encoding_Decl;
Xmlevnts.Event(Idx2) = '*XML_ENCODING_DECL';
When Event = *XML_Standalone_Decl;
Xmlevnts.Event(Idx2) = '*XML_STANDALONE_DECL';
When Event = *XML_Doctype_Decl;
Xmlevnts.Event(Idx2) = '*XML_DOCTYPE_DECL';
When Event = *XML_Start_Element;
Xmlevnts.Event(Idx2) = '*XML_START_ELEMENT';
Parentevt = Event;
When Event = *XML_Chars;
Xmlevnts.Event(Idx2) = '*XML_CHARS(ELEMENT) ['+%CHAR(Parentevt)+']';
When Event = *XML_Predef_Ref;
Xmlevnts.Event(Idx2) = '*XML_PREDEF_REF';
When Event = *XML_Ucs2_Ref;
Xmlevnts.Event(Idx2) = '*XML_UCS2_REF';
When Event = *XML_Unknown_Ref;
Xmlevnts.Event(Idx2) = '*XML_UNKNOWN_REF';
When Event = *XML_End_Element;
Xmlevnts.Event(Idx2) = '*XML_END_ELEMENT';
Parentevt = Event;
When Event = *XML_Attr_Name;
Xmlevnts.Event(Idx2) = '*XML_ATTR_NAME';
When Event = *XML_Attr_Chars;
Xmlevnts.Event(Idx2) = '*XML_ATTR_CHARS';
When Event = *XML_Attr_Predef_Ref;
Xmlevnts.Event(Idx2) = '*XML_ATTR_PREDEF_REF';
When Event = *XML_Attr_Ucs2_Ref;
Xmlevnts.Event(Idx2) = '*XML_ATTR_UCS2_REF';
When Event = *XML_Unknown_Attr_Ref;
Xmlevnts.Event(Idx2) = '*XML_UNKNOWN_ATTR_REF';
When Event = *XML_End_Attr;
Xmlevnts.Event(Idx2) = '*XML_END_ATTR';
When Event = *XML_Pi_Target;
Xmlevnts.Event(Idx2) = '*XML_PI_TARGET';
When Event = *XML_Pi_Data;
Xmlevnts.Event(Idx2) = '*XML_PI_DATA';
When Event = *XML_Start_Cdata;
Xmlevnts.Event(Idx2) = '*XML_START_CDATA';
Parentevt = Event;
When Event = *XML_End_Cdata;
Xmlevnts.Event(Idx2) = '*XML_END_CDATA';
Parentevt = Event;
When Event = *XML_Comment;
Xmlevnts.Event(Idx2) = '*XML_COMMENT';
When Event = *XML_Exception;
Xmlevnts.Event(Idx2) = '*XML_EXCEPTION';
Printds = 'Parsed to position:' +%CHAR(Stringlen) +' ExceptID:'
+%CHAR(Exceptionid);
Write Print Printds;
When Event = *XML_End_Document;
Xmlevnts.Event(Idx2) = '*XML_END_DOCUMENT';
Endsl;
Printds = Xmlevnts.Event(Idx2)
+' = ' +%EDITC(Xmlevnts.Eventid(Idx2):'3')
;
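If Stringlen > 0;
// Stringlen is in bytes and UCS-2 uses 2 bytes per character, so take
// the substring first and convert only the data the parser passed.
Printds = %TRIMR(Printds)+ ' :: '
+%CHAR(%SUBST(Ucs2Value :1 :%DIV(Stringlen :2)))
;
Endif;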
Write Print Printds;
Return Rc;
End-Proc;
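The point made earlier about predefined references splitting data across handler calls can be sketched as a small handler that accumulates an element's text until the element ends. This is only an illustration, not part of the program above: the names accum_t, accum, collectText and doc are made up, and it assumes the document is parsed in the job CCSID (single-byte data) rather than UCS-2.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical communication area: holds the text of the element
// currently being parsed.
dcl-ds accum_t qualified template;
  text varchar(52);
end-ds;

dcl-ds accum likeds(accum_t) inz;
dcl-s doc varchar(100) inz('<name>Smith &amp; Sons</name>');

// Parse from the variable and ask for data in the job CCSID.
xml-sax %handler(collectText : accum) %xml(doc : 'doc=string ccsid=job');

*inlr = *on;
return;

dcl-proc collectText;
  dcl-pi *n int(10);
    comm        likeds(accum_t);
    event       int(10) value;
    string      pointer value;
    stringLen   int(20) value;
    exceptionId int(10) value;
  end-pi;
  dcl-s chars char(65535) based(string);

  select;
    when event = *XML_START_ELEMENT;
      // A new element starts: throw away anything held so far.
      comm.text = '';
    when event = *XML_CHARS or event = *XML_PREDEF_REF;
      // The text may arrive in pieces; a predefined reference such
      // as &amp; arrives as its own event carrying the one character.
      comm.text += %subst(chars : 1 : stringLen);
    when event = *XML_END_ELEMENT;
      // Only now is the value known to be complete.
      dsply comm.text;
  endsl;

  return 0;
end-proc;
```

For the sample document the text should reach the handler in three pieces ('Smith ', '&' and ' Sons'), which is why nothing is used until the end-element event. A real handler would also need to decide what to do with *XML_UCS2_REF and with nested elements.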
Thanks, Matt
-----Original Message-----
From: RPG400-L [mailto:rpg400-l-bounces@xxxxxxxxxxxx] On Behalf Of Jon Paris
Sent: Wednesday, May 04, 2016 9:15 AM
To: Rpg400 Rpg400-L
Subject: Re: XML best process
RPG’s limits have gone up a lot but not to the G level!
XML-INTO’s %Handler option is designed for this purpose and may be usable here. Without seeing all the details of the XML it is hard to tell.
If that won’t work for some reason, the only other thing (beyond XML-SAX) that I can think of would be to use %Scan to find the begin/end of a particular element and then pass that “lump” to XML-INTO. But if that works you should have been able to get %Handler to work anyway come to think of it.
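A minimal sketch of the %Scan idea, assuming a repeating element that is never nested inside itself. The names doc, item, OPEN_TAG and CLOSE_TAG and the sample XML are made up for illustration, and a real document would normally come from a file rather than a literal.

```rpgle
**free
ctl-opt dftactgrp(*no);

dcl-c OPEN_TAG  '<item>';
dcl-c CLOSE_TAG '</item>';

// Hypothetical sample document.
dcl-s doc varchar(200)
  inz('<order><item><sku>A1</sku><qty>2</qty></item>' +
      '<item><sku>B2</sku><qty>5</qty></item></order>');

// Target for one "lump"; the DS name matches the element name.
dcl-ds item qualified;
  sku char(10);
  qty int(10);
end-ds;

dcl-s startPos int(10);
dcl-s endPos   int(10);

startPos = %scan(OPEN_TAG : doc);
dow startPos > 0;
  endPos = %scan(CLOSE_TAG : doc : startPos);
  if endPos = 0;
    leave;                     // no closing tag: stop
  endif;

  // Hand just this one element to XML-INTO.
  xml-into item %xml(%subst(doc : startPos
                          : endPos + %len(CLOSE_TAG) - startPos)
                   : 'case=any');
  dsply (%trimr(item.sku) + ' x ' + %char(item.qty));

  // Look for the next occurrence after this one.
  startPos = %scan(OPEN_TAG : doc : endPos);
enddo;

*inlr = *on;
return;
```

Only one occurrence of the element is held in the data structure at a time, so the size of the structure no longer depends on how many entries the document contains. The scan is purely textual, so it would be fooled by tags with attributes, by the same tag appearing in comments or CDATA, or by nesting of the same element.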
Holler if you need %Handler examples.
Jon Paris
www.partner400.com
www.SystemiDeveloper.com
On May 4, 2016, at 10:52 AM, John R. Smith, Jr. <smith5646midrange@xxxxxxxxx> wrote:
I have a very deeply nested XML with each level having a variable
number of entries.
It is my understanding that in order to use XML-INTO, I create a data
structure for each level and have it dim-ed within the previous level
data structure. So, assuming only 4 levels of variable entries (and
there are more in some areas), if I assume a possible 50 entries per
level and 200 bytes for the bottom level, that means I have 6.25M
occurrences of the bottom level data structure (50 * 50 * 50 * 50) for
a total of 1.25G of memory. OUCH!!!
If I use SAX, I think I can alloc the memory for the data structure as
I need it thus reducing it from 1.25G but then I have to keep track of
where I am in the XML. This seems to be a lot more difficult.
Am I missing something in these two options or is there another way to
process my file that I haven't found yet?
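Keeping track of the position in the document under SAX can be sketched with a single "current path" string kept in the communication area. This is an assumption-laden illustration only: the names state_t, state, trackPath and doc and the sample XML are made up, the document is parsed in the job CCSID, and the nesting is assumed shallow enough that the path never overflows its variable.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical communication area: the path of the element
// currently open, e.g. /order/item/sku
dcl-ds state_t qualified template;
  path varchar(256);
end-ds;

dcl-ds state likeds(state_t) inz;
dcl-s doc varchar(200)
  inz('<order><item><sku>A1</sku></item>' +
      '<item><sku>B2</sku></item></order>');

xml-sax %handler(trackPath : state) %xml(doc : 'doc=string ccsid=job');

*inlr = *on;
return;

dcl-proc trackPath;
  dcl-pi *n int(10);
    comm        likeds(state_t);
    event       int(10) value;
    string      pointer value;
    stringLen   int(20) value;
    exceptionId int(10) value;
  end-pi;
  dcl-s chars char(65535) based(string);
  dcl-s msg   varchar(52);

  select;
    when event = *XML_START_ELEMENT;
      // Going one level down: append the element name.
      comm.path += '/' + %subst(chars : 1 : stringLen);
    when event = *XML_CHARS;
      // Act only on the data of the element we care about.
      if comm.path = '/order/item/sku';
        msg = %subst(chars : 1 : stringLen);
        dsply msg;
      endif;
    when event = *XML_END_ELEMENT;
      // Going one level up: drop '/' plus the element name.
      %len(comm.path) = %len(comm.path) - (stringLen + 1);
  endsl;

  return 0;
end-proc;
```

With this approach the storage needed is the path plus whatever record is being built at the moment, instead of nested arrays sized for the worst case at every level.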
--
This is the RPG programming on the IBM i (AS/400 and iSeries)
(RPG400-L) mailing list To post a message email: RPG400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/rpg400-l
or email: RPG400-L-request@xxxxxxxxxxxx Before posting, please take a
moment to review the archives at http://archive.midrange.com/rpg400-l.
Please contact support@xxxxxxxxxxxx for any subscription related questions.