The recommended CCSID for UTF-8 is 1208; see http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzaha%2Ffileenc.htm
UTF-8 is identical to ASCII in the low-order 128 character positions (0-127), where the high-order bit is 0. Above that range it uses two, three, or four bytes per character (the original design allowed sequences of up to six bytes, but the current standard stops at four). The 1D you see is a translation into EBCDIC of the original using the translation tables for whatever CCSID you specify, though it could be the byte after an escape byte that is not being caught. To change DSPFIL to another CCSID, press F15, type 3 in the option field, and enter 1208 in the CCSID field. Doing that will probably produce no visible difference. You'll get a message that only values common between 1208 and 37 (or whatever CCSID your job uses) can be edited reliably.
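On verifying the characters before parsing: one option is to pre-scan the file in chunks, which also sidesteps the size limit because nothing is loaded whole. The following is a minimal Python sketch, not a tested production tool; it assumes the file can be reached from a machine or environment with Python, and the name first_problem and the 1 MB chunk size are made up for illustration. It reports the byte offset of the first invalid UTF-8 sequence or the first control byte that XML 1.0 does not allow (everything below hex 20 except tab, LF, and CR). If the file really does contain a byte with the value hex 1D, that alone would be enough for an XML parser to reject it. The sketch does not check for U+FFFE or U+FFFF, which XML 1.0 also excludes.

```python
import codecs
import re

# C0 control bytes that XML 1.0 forbids (tab, LF and CR are allowed).
# In UTF-8, byte values below 0x80 only ever occur as single-byte
# characters, so scanning the raw bytes for them is safe.
FORBIDDEN_CONTROL = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def first_problem(path, chunk_size=1024 * 1024):
    """Return (byte_offset, reason) for the first problem, or None if clean."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    offset = 0  # number of bytes read before the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            at_eof = not chunk
            # Bytes of an incomplete sequence carried over from the last chunk.
            carried = len(decoder.getstate()[0])
            found = []
            match = FORBIDDEN_CONTROL.search(chunk)
            if match:
                found.append((offset + match.start(),
                              "control character not allowed in XML 1.0"))
            try:
                decoder.decode(chunk, final=at_eof)
            except UnicodeDecodeError as err:
                # err.start is relative to the carried bytes plus this chunk.
                found.append((offset - carried + err.start,
                              "invalid UTF-8 sequence"))
            if found:
                return min(found)  # earliest offset wins
            if at_eof:
                return None
            offset += len(chunk)
```

Calling first_problem("/path/to/file.xml") returns None for a clean file, or a tuple such as (offset, "invalid UTF-8 sequence") that tells you where to look with a hex viewer.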
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of J Franz
Sent: Friday, October 12, 2012 10:36 AM
To: Midrange Systems Technical Discussion
Subject: Parse error ... not well formed token
Using Expat parser, and UTF8 file - it failed at line 2,097,268 (so I thought ccsid 1252 should be good?) What looks like a blank space is a hex 1D (WRKLNK opt 5-DSPFIL)
Any way to keep the parser from throwing up (the ignore opt on CPF9897 ended the pgm)? Or a method of verifying all characters are valid before the parsing?
The 2nd issue is that I have several files "too big" for DSPFIL to open - any other options? Files are in the 2 - 3 gig range and it is not an option for the exporting system to break them down. I could write a pgm to split them, but would prefer not to.
v6r1