Hi Jim,

On 8/27/2012 2:49 PM, James Franz wrote:
> Looking for suggestions to speed up parsing xml files with millions of records.
> Using Scott's Expat RPGLE and works great, but taking couple hours in batch per file.

First of all, I don't deserve any credit for Expat. I had no part in writing it at all. All I wrote was a few simple prototypes to call it from RPG.

It's really hard to address a performance problem without first determining where your program is spending its time. Logic dictates that the time would be spent interpreting/parsing the XML, but it could also be spent reading from disk, or the job might be memory strapped and slowing down significantly because of that. Or, of course, it could be in your back-end code (the code that runs in your handler routines).
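As a rough first cut at that question, the read time and the parse time can be accumulated separately with plain wall-clock timers. The following is only a minimal sketch, assuming a POSIX-style C environment with clock_gettime(); the names phase_times, time_phases() and parse_chunk() are hypothetical and not part of Expat. The parse_chunk() placeholder marks where the real parser call (and therefore the handler routines) would run.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Accumulated wall-clock seconds per phase (hypothetical helper type). */
typedef struct {
    double read_s;    /* time spent waiting on the file read */
    double parse_s;   /* time spent in the parse step and its handlers */
    long long bytes;  /* total bytes read */
} phase_times;

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Placeholder (assumption): a real program would call the XML parser
 * here, which in turn runs the handler routines. */
static void parse_chunk(const char *buf, size_t len)
{
    (void)buf;
    (void)len;
}

/* Read the file chunk by chunk, timing the read and parse steps
 * separately. Returns 0 on success, -1 on open or read failure. */
int time_phases(const char *path, phase_times *out)
{
    char buf[65536];
    size_t n;
    double t0, t1, t2;
    int err;
    FILE *fp = fopen(path, "rb");

    out->read_s = 0.0;
    out->parse_s = 0.0;
    out->bytes = 0;
    if (fp == NULL)
        return -1;

    for (;;) {
        t0 = now_seconds();
        n = fread(buf, 1, sizeof buf, fp);
        t1 = now_seconds();
        out->read_s += t1 - t0;
        if (n == 0)
            break;
        parse_chunk(buf, n);
        t2 = now_seconds();
        out->parse_s += t2 - t1;
        out->bytes += (long long)n;
    }
    err = ferror(fp);
    fclose(fp);
    return err ? -1 : 0;
}
```

If read_s dominates, the I/O side is the place to look; if parse_s dominates, the next step would be to time the handler routines separately from the parser itself.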

Just some general suggestions:

1) Try recompiling Expat with OPTIMIZE(40) and DBGVIEW(*NONE). This can make a significant difference in C code, and you're unlikely to need to debug Expat anyway.

2) Make sure the code that reads the XML data (you didn't say where you're getting it from... should I assume a stream file?) is reading optimally. If it's a stream file, make sure the buffer size is a multiple of the disk block size (available from the statvfs() API). About 20-30 times the disk block size would be a good starting value.

3) Try using the PEX APIs to insert milestone checkpoints that you can use to narrow down where the performance issues are occurring.
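Suggestion 2 can be sketched in C roughly as follows. This is an illustration under assumptions, not code from the Expat package: pick_buffer_size() and read_in_blocks() are hypothetical names, the 4096-byte fallback is an arbitrary choice used only if statvfs() fails, and the multiple (for example 24) should be at least 1. The comment in the read loop marks where each chunk would be passed to Expat's XML_Parse().

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Assumption: used only when statvfs() fails or reports 0. */
#define FALLBACK_BLOCK_SIZE 4096u

/* Return a read-buffer size that is `multiple` times the block size
 * of the file system holding `path`. */
size_t pick_buffer_size(const char *path, unsigned multiple)
{
    struct statvfs vfs;
    size_t block = FALLBACK_BLOCK_SIZE;

    if (statvfs(path, &vfs) == 0 && vfs.f_bsize > 0)
        block = (size_t)vfs.f_bsize;
    return block * multiple;
}

/* Read the whole file in chunks of that size.
 * Returns the number of bytes read, or -1 on error. */
long long read_in_blocks(const char *path, unsigned multiple)
{
    size_t bufsize = pick_buffer_size(path, multiple);
    char *buf = malloc(bufsize);
    long long total = 0;
    ssize_t n;
    int fd;

    if (buf == NULL)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }
    while ((n = read(fd, buf, bufsize)) > 0) {
        /* A real program would hand the chunk to the parser here,
         * e.g. XML_Parse(parser, buf, (int)n, 0). */
        total += n;
    }
    close(fd);
    free(buf);
    return n < 0 ? -1 : total;
}
```

Calling read_in_blocks(path, 24) with different multiples and timing each run is one way to check whether the buffer size matters for a given file.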

-SK
