Thanks, Blake. It's good to know there are options to handle larger file
sizes.
I will consider all of these possibilities.
Paul
-----Original Message-----
From: RPG400-L <rpg400-l-bounces@xxxxxxxxxxxxxxxxxx> On Behalf Of Blake
Butterworth
Sent: Saturday, February 23, 2019 9:44 AM
To: rpg400-l@xxxxxxxxxxxxxxxxxx
Subject: Re: Large XML file processing via XMLTABLE
My company exchanges XML files with other tolling agencies for national
interoperability. We retrieve files from an interop hub, unzip them in the
IFS and process them using RPG and XML-INTO, which I've found to perform
really well. The business rules require that we ack each file within an
hour, which is normally not a problem, but I struggled initially with one of
the files, a weekly bulk electronic transponder file, because it is north of
4GB unzipped. The RPG XML operations have an approx. 2GB file size
limitation, so I ended up using Java and experimented with different
approaches to be able to process and ack the file in under an hour.
Currently, we retrieve, unzip and parse these large XML files and load
~33-34 million records into two DB2 tables in around 10 mins. My RPG
transponder file processing program employs some Java classes, which
implement the Java StAX parser functionality to parse the XML. To load the
data into DB2, I found the best performance using the Java Toolbox type 4
JDBC driver to insert records in 500-1000 record batches. I experimented
with various batch sizes, but 500-1000 seems to perform best. The batch
insert approach really speeds things up. In RPG, a similar approach can be
accomplished with data structure arrays. I found the type 4 driver performs
faster than the type 2 driver, and the JDBC batch insert approach faster
than the Toolbox RLA classes. I also tested parsing the large XML files
using Expat because of the lack of a file size limitation, but found Java to
be a lot faster in my experience.
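For anyone who wants to try the StAX plus JDBC batch approach described above, here is a minimal sketch, not the actual production code. The element name Tag, the attributes id and status, the table TAGS and its columns, and the class names TagLoader, RowSink and JdbcBatchSink are all assumptions made up for illustration; the real transponder file layout and table definitions are not shown in this message.

```java
import java.io.Reader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class TagLoader {

    /** Receives one parsed record at a time (hypothetical two-field record). */
    interface RowSink {
        void accept(String id, String status) throws Exception;
    }

    /** Streams the document with StAX so the whole file is never held in memory. */
    static long parseTags(Reader xml, RowSink sink) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTDs expected
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        long rows = 0;
        try {
            while (reader.hasNext()) {
                // Assumed layout: <Tag id="..." status="..."/>
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Tag".equals(reader.getLocalName())) {
                    sink.accept(reader.getAttributeValue(null, "id"),
                                reader.getAttributeValue(null, "status"));
                    rows++;
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    /** Buffers rows in a JDBC batch and sends them every batchSize rows. */
    static final class JdbcBatchSink implements RowSink, AutoCloseable {
        private final PreparedStatement ps;
        private final int batchSize;
        private int pending = 0;

        JdbcBatchSink(Connection conn, int batchSize) throws SQLException {
            // Hypothetical table and column names.
            this.ps = conn.prepareStatement(
                    "INSERT INTO TAGS (TAG_ID, STATUS) VALUES (?, ?)");
            this.batchSize = batchSize;
        }

        @Override
        public void accept(String id, String status) throws SQLException {
            ps.setString(1, id);
            ps.setString(2, status);
            ps.addBatch();
            if (++pending == batchSize) {
                flush();
            }
        }

        /** Sends any buffered rows to the database in one round trip. */
        void flush() throws SQLException {
            if (pending > 0) {
                ps.executeBatch();
                pending = 0;
            }
        }

        @Override
        public void close() throws SQLException {
            try {
                flush(); // send the final partial batch
            } finally {
                ps.close();
            }
        }
    }
}
```

To use it, open a Connection with the driver of your choice, wrap it in a JdbcBatchSink with a batch size somewhere in the 500-1000 range inside a try-with-resources block, and pass that sink to parseTags along with a Reader over the unzipped file. Keeping the parser and the sink separate also makes it easy to time the parse on its own or to try different batch sizes.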
I would be glad to answer any questions, Paul, if you are interested in
potentially utilizing any of these approaches.
Regards,
Blake Butterworth
Application Development Manager
Kansas Turnpike Authority