Re: Qshell jar zipped file has strange characters -- MIDRANGE-L

Tim:

Consider that the original PKZIP specification started out on PC-DOS and MS-DOS in the 1980s -- life was simpler then -- everything on the PC was ASCII, as far as file names were concerned ... and even the contents of files were expected to be ASCII (codes 0-127) or else assumed to be a "binary" file (e.g. .com, .exe, etc.).

"CCSID" is an IBM-only concept, part of IBM's "Character Data Representation Architecture" (CDRA). "Coded Character Set ID" (CCSID) does not exist on most non-IBM platforms. See

http://en.wikipedia.org/wiki/CCSID

By the time Java emerged, in the mid-90s, Java adopted the ZIP file format as the file format for JAR files ... and Java (at least initially) ran mainly in an ASCII world, though by then Unicode was also emerging.

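I believe the current "standards" for ZIP files and JAR files both specify "UTF-8" for encoding the "directory" information (file names): JAR uses UTF-8, and ZIP marks UTF-8 names with a flag bit on each entry (entries without that flag fall back to the old code page 437 default).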
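To make the UTF-8 point concrete, here is a minimal sketch using Python's standard zipfile module. Python is used for illustration only (it is not what Qshell's jar does internally), and the helper name write_and_inspect is hypothetical. It writes one entry to an in-memory ZIP and reports whether the "file name is UTF-8" flag (general purpose bit 11) was set for it:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: "file name is UTF-8"


def write_and_inspect(name):
    """Write one entry to an in-memory ZIP; return (name read back, UTF-8 flag set?)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"hello")
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_FLAG)


# A pure-ASCII name needs no flag; a non-ASCII name is stored as UTF-8 and flagged.
ascii_result = write_and_inspect("readme.txt")
utf8_result = write_and_inspect("caf\u00e9.txt")
```

A reader that honors the flag gets the name back intact. A reader that ignores it, or an archive written without it, is where the strange characters tend to come from.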

Neither ZIP nor JAR has any notion of "CCSID", as that is an IBM-only concept, and is not even implemented consistently on all IBM platforms. So, to imagine that UNZIP or JAR should somehow "know" (without being told) what CCSID to use just does not make much sense to me.
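As a rough illustration of why the tool cannot guess, the sketch below encodes a file name in an EBCDIC code page and then decodes the same bytes as code page 437. It assumes that Python's cp037 codec is a fair stand-in for CCSID 37; the sample file name is made up.

```python
# Hypothetical file name, used only for illustration.
name = "R\u00e9sum\u00e9.txt"

# Assumption: Python's "cp037" codec corresponds to EBCDIC CCSID 37.
ebcdic_bytes = name.encode("cp037")

# A tool that is told nothing and assumes code page 437 sees different characters.
as_cp437 = ebcdic_bytes.decode("cp437")

# Decoding with the code page that was actually used recovers the name.
round_trip = ebcdic_bytes.decode("cp037")

print(repr(as_cp437))
print(repr(round_trip))
```

The bytes are identical in both cases; only the code page assumed by the reader differs, and nothing in the bytes themselves says which one is right.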

The IFS (other than QSYS.LIB) internally stores path names using Unicode, CCSID 1200 (UTF-16). See this post by Bruce Vining for details:

http://archive.midrange.com/mi400/200411/msg00001.html

AFAIK, OS/400 and IBM i have the best, most deeply integrated support for CCSIDs of any of IBM's operating system offerings.

Since ZIP and JAR are tools from the "open systems" world (Unix, Linux, PC-DOS, MS-DOS, Windows, etc.), I would not expect these tools to understand or know how to deal with CCSIDs.

HTH,

Mark S. Waterbury

> On 3/2/2015 7:05 PM, Tim Brown wrote:

The problem stems from an incompatibility between archivers. The original zip
spec, on which jar was based, had no CCSID attribute for file names and
instead specified that code page 437 be used (which doesn't work
internationally). Since then various changes and workarounds have been put in
place, but they're not always supported or interoperable. Windows compressed
folders doesn't support any of them and expects the character set to be that
of the local system. Your test was jar to jar, which I would expect to work,
as would interop with recent versions of 7-Zip and WinZip.
