× The internal search function is temporarily non-functional. The current search engine is no longer viable and we are researching alternatives.
As a stop gap measure, we are using Google's custom search engine service.
If you know of an easy to use, open source, search engine ... please contact support@midrange.com.



John, I'm with you on the difficulties in the XLSX spec - yeah, everything is really spread around.

That's why I chickened out at first - I saw thesharedstrings.xml file and found out that every string in the workbook is really listed there, with indexes to them being the contents of each cell node.

So I went to the 2003 XML Spreadsheet format - it is limited - no drawings, no images, other thing you can't do - but it was ONE XML file, with sections for each sheet, etc. But there were difficulties - size for one, and that apps on iPads wouldn't touch it - and iPads, etc., are being required more and more.

Then I went back to digging into the OOXML spec - and found that things are not so bad as I first thought. And that, in fact, the template approach IS a very viable way to do this.

1. The majority of the files in the package (XLSX is a zipped file containing all the files in the workbook) could be left unchanged.
2. Sheets were each in their own XML and were the main thing that needed to be written to.
3. I could use a thing called an inline string instead of shared strings.
- except for being read in the Numbers app in iPads, so I found another answer =("a string") is a trivial formula that Numbers DOES render properly!
4. A few other files in the package had lists of sheets - these need writing to if you have a variable number of sheets or a varying order in the sheets by name.
5. Some values need to be modified AFTER writing out all the rows - things like the <dimension> node, which needs to know the lower-right cell - the sed QShell function does nicely there.
- I've thought of other ways to handle this kind of thing - putting values into an XML ahead of the sheet data that are known only AFTER writing the sheet data
- An example is to break things up further, then use QShell utilities to concatenate them into the final result
- But sed does a really nice job
6. The jar command can be used in QShell to create the final XLSX file from the constituents, just suppress the manifest file and use the XLSX extension.

Things like SEQUEL can write to XLSX files but do not use Excel templates - I've put in a request!

Scott's POI support does write to XLSX but isn't the fastest kid on the block when building larger workbooks

Things like borders and other formatting and all isn't easily, if at all, doable with these tools - but the template approach - or the hybrid that Henrik has in powerEXT - can do it all very well.

Hope this sparks some thoughts in others - I know this is an opportunity for us in terms of enhancing the use of IBM i in our shop.

Cheers
Vern

On 8/22/2014 9:47 AM, John Yeung wrote:
On Fri, Aug 22, 2014 at 10:02 AM, Bradley Stone <bvstone@xxxxxxxxx> wrote:
On Fri, Aug 22, 2014 at 12:16 AM, John Yeung <gallium.arsenide@xxxxxxxxx>
wrote:

But his current concern is specifically XML (Excel .xlsx files)

I must have missed that. Seems rather proprietary, but still can be done
fairly simply with the template applications mentioned.
The current Microsoft XML-based formats (Office 2007+) are ostensibly open:

http://en.wikipedia.org/wiki/Office_Open_XML

It could be argued that Microsoft threw its weight around a bit
(deliberately or not; justifiably or not), but in the end, it's a
published standard, so in that sense, far less proprietary than "old
Microsoft" at least.

The thing with .xlsx is that, sure, it's XML-based, but as anyone who
has actually worked with it in earnest quickly finds out, the XML is
not the hard part. It's that the information is scattered all around
in various pieces. This is, in part, due to a certain amount of
holdover from the structural design of the older (and even more arcane
and opaque) .xls binary format. Some of the decisions Microsoft made
way back when were probably due to legitimate efficiency/space issues
(sharing strings and formulas where possible, for example), and cynics
could easily argue some decisions were made to ensure the format was
more proprietary and harder to reverse engineer.

In any case, what we're left with is not programmer-friendly, and the
characterization of "still can be done fairly simply with [various
template applications]" isn't one I subscribe to. I mean, the XML
part of it is quite simple, yes. At least as simple as any other XML
thing you're going to do. But .xlsx, well, more power to Vern, but I
would personally be looking harder for a pre-built package that
specifically handles Excel. And I know it has to be fast, so even
though there are excellent Excel packages for dynamic languages like
Python and Ruby, they don't fit the bill. Maybe for the writing part,
at least, Scott Klement might be persuaded to share what he's
developed so far with pure RPG. Short of that, my personal strategy
for this problem (i.e. essentially writing my own fast Excel library
if I can't find one) would be to try to translate some existing
dynamic-language Excel package (which almost by definition would be
open source) to RPG or C++.

John Y.


As an Amazon Associate we earn from qualifying purchases.

This thread ...

Follow-Ups:
Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.