Bruce Jin wrote:

>When I use jdbc on PC to query files on as400, the alpha fields are all
>messed up if the file has a ccsid=65535. Files with ccsid=37 are displayed
>ok. Why?

Because Java does, in fact, know what CCSID 65535 is. It is "binary" and therefore in an unknown character set. There are no good choices if the 65535 is what it says it is and you don't, yourself, know what the data is supposed to be. The whole point of 65535 is that the system "knows it does not know", so it doesn't translate it. I have seen it proposed in my own performance area for code that comes out of deep trace interfaces in OS/400 and where, in fact, some kinds of data might change CCSID, record by record, depending on what is being traced.

This is the one downside case of the JDBC interface. One of the things that makes Java great is that it almost always knows how to translate data to a String and it knows that a String is supposed to be Unicode. JDBC especially so. Here, JDBC doesn't know, by definition. What is it to do?

JDBC makes the best of a bad bargain by simply expanding each individual byte into the Unicode character with the same value and presenting the result as a String, hoping it is somehow IBM code page 819 (ISO 8859-1) or 1252 (Microsoft Latin 1), neither of which is officially supported in OS/400's data base, but which can actually be snuck into an OS/400 DB using 65535.

If you don't happen to know all that, or care about all that, then this "expand one byte to two and stuff it into the Java character in the String" behavior seems odd, especially if you do know the underlying data is "some" EBCDIC code page. But it is no different from what happens if you read an EBCDIC encoded IFS file using interfaces that avoid translation, though the latter often takes more work to achieve.

The real problem here is that a lot of 65535 data is really some kind of EBCDIC code page. That seems to be your situation, I'm guessing.
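
To make the byte expansion concrete, here is a minimal Java sketch of what it looks like and how it can be undone once you have decided what the data really is. It is an illustration under stated assumptions, not the driver's actual code: the class and method names are made up for this example, it assumes the untagged data is really CCSID 37 (Java charset name "IBM037"), and it assumes your Java runtime ships that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from JDBC or the toolbox.
public class Ccsid65535Demo {
    // Assumption: the CCSID 65535 field really holds CCSID 37 (EBCDIC) text.
    public static final Charset EBCDIC_37 = Charset.forName("IBM037");

    // Mimics the behavior described above: each byte becomes one char
    // with the same numeric value (that is, an ISO 8859-1 decode).
    public static String expandBytes(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Reverses the expansion to recover the original bytes, then decodes
    // them with the code page you have decided the data is really in.
    public static String redecode(String expanded, Charset actual) {
        byte[] raw = expanded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        // "HELLO" as it would be stored in an EBCDIC (CCSID 37) field.
        byte[] raw = "HELLO".getBytes(EBCDIC_37);

        String garbled = expandBytes(raw);            // what the application sees
        String fixed = redecode(garbled, EBCDIC_37);  // after choosing CCSID 37

        System.out.println(fixed);                    // prints HELLO
    }
}
```

The round trip through ISO 8859-1 loses nothing because that charset maps every byte value to exactly one character and back. With a real query it is probably simpler to ask for the column as a byte array in the first place (ResultSet.getBytes) and decode that, rather than repairing a String afterwards.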

A lot of C or RPG code can trudge on by pretending that 65535 data is IBM CCSID 500 or IBM CCSID 37 or even the job CCSID, even if the data is some oddball mixture of EBCDIC code pages. But it is supposed to be up to you, the programmer, to somehow figure out which situation it is and how to translate it. That's even true in RPG if you actually care about the precise CCSID in use.

As mentioned, some people use 65535 as a way to sneak US ASCII into an OS/400 data base, something Java environments are a little more likely to bring with them than RPG. Those people do what you do and, with such "snuck in" data, see pretty much the expected results, because 65535 happens to hold ASCII instead of EBCDIC that time. Meanwhile, properly tagged data (CCSID 37 or anything else) gets properly translated to Unicode by JDBC, which makes you and them happy.

I have dealt with this in Java, where 65535 really meant "unknown or mixed EBCDIC code pages", and it is a royal pain. What it boils down to is that you have to decide for yourself how the data should be translated and perform that translation yourself. I suggest you read the data as byte arrays and investigate (I think it is) the AS400Text object in the toolbox to help with the translation.

Good luck.

Larry W. Loen - Senior Java and AS/400 Performance Analyst
Dept HP4, Rochester MN

+---
| This is the JAVA/400 Mailing List!
| To submit a new message, send your mail to JAVA400-L@midrange.com.
| To subscribe to this list send email to JAVA400-L-SUB@midrange.com.
| To unsubscribe from this list send email to JAVA400-L-UNSUB@midrange.com.
| Questions should be directed to the list owner: joe@zappie.net
+---
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.