Hi Ashish,

If you have enough control over your XML generation, Thorbjørn's suggestion makes sense and should pretty much work anywhere.

AFAIK, CCSID 5026 uses at most double-byte characters, so the triple bytes you see are, I believe, artifacts of UTF-8 encoding: UTF-8 needs three bytes for any code point above U+07FF, which covers essentially all Japanese characters. So, you could change the encoding to UTF-16. But that's only part of the story. Other parts are the encoding you use to save the file and the tool you use to read it.
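To make the byte counts concrete, here is a small sketch that encodes a few sample characters both ways. The class and method names are just mine for illustration, and the sample characters are arbitrary picks on either side of the U+07FF boundary.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ByteCount {

    // Number of bytes the text occupies when encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies as UTF-16 (big-endian, no BOM).
    public static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Latin A, e-acute, the last 2-byte code point, the first 3-byte
        // code point, and a common kanji (U+65E5).
        String[] samples = { "A", "\u00E9", "\u07FF", "\u0800", "\u65E5" };
        for (String s : samples) {
            System.out.printf("U+%04X  UTF-8: %d byte(s)  UTF-16: %d byte(s)%n",
                    (int) s.charAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

The kanji comes out as three bytes in UTF-8 and two in UTF-16, so three-byte sequences in the output are expected and not a sign of corruption by themselves.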

I ran into an issue the other day that gave me fits and renewed my appreciation of what Java does for you. It was pretty simple: a straightforward HTML error page for Apache that included French. I got the famous boxes and question marks, even though I specified UTF-8 as the encoding. The base problem was that Windows WordPad defaulted to the system encoding (1252, I think). I tried saving as Unicode, but WordPad writes a BOM and the browsers didn't like it. I could have found a tool that would save it properly, but people down the road might not have it, so I owned up to my red face and changed the encoding to ISO 8859-1, which worked for the French characters. With Java in between, I never would have seen the issue at all.
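The mismatch is easy to reproduce in Java. This is only a sketch: it uses ISO 8859-1 as a stand-in for the single-byte encoding the editor actually saved with, and the class name and sample word are made up.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Encode text as ISO 8859-1, then decode the same bytes as if they
    // were UTF-8. This mimics a file saved in a single-byte encoding but
    // declared (and read) as UTF-8.
    public static String saveLatin1ReadUtf8(String text) {
        byte[] saved = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(saved, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String french = "\u00E9t\u00E9"; // the French word for "summer"
        String garbled = saveLatin1ReadUtf8(french);
        // The accented letters are not valid UTF-8 on their own, so the
        // decoder substitutes U+FFFD, which browsers show as boxes or
        // question marks.
        System.out.println(garbled.equals(french));      // false
        System.out.println(garbled.indexOf('\uFFFD') >= 0); // true
    }
}
```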

So, I believe the moral is: if you're using an encoding other than the default on your box, be sure the tools you use are capable of saving and reading that encoding. HTH,
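In Java terms, that means naming the charset explicitly on both the writing and the reading side instead of relying on the platform default. A minimal sketch follows; the class name, the element name and the sample text are assumptions for illustration, and the text is written without XML escaping only because the sample contains no markup characters.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class XmlEncodingRoundTrip {

    // Write a tiny XML document whose declaration names the same charset
    // that is actually used to encode the bytes on disk.
    public static void write(Path file, String text, Charset cs) throws IOException {
        try (Writer w = Files.newBufferedWriter(file, cs)) {
            w.write("<?xml version=\"1.0\" encoding=\"" + cs.name() + "\"?>\n");
            w.write("<item>" + text + "</item>\n"); // sample text needs no escaping
        }
    }

    // Read the whole file back, decoding with the charset the caller names.
    public static String read(Path file, Charset cs) throws IOException {
        return new String(Files.readAllBytes(file), cs);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".xml");
        String japanese = "\u65E5\u672C\u8A9E"; // three kanji
        write(file, japanese, StandardCharsets.UTF_8);
        // Same charset on both sides: the text survives.
        System.out.println(read(file, StandardCharsets.UTF_8).contains(japanese));
        // Different charset on the reading side: the text does not survive.
        System.out.println(read(file, StandardCharsets.ISO_8859_1).contains(japanese));
        Files.delete(file);
    }
}
```

Java's UTF-8 writer does not emit a BOM, so the file starts directly with the XML declaration.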


Joe Sam

Joe Sam Shirah - http://www.conceptgo.com
conceptGO - Consulting/Development/Outsourcing
Java Filter Forum: http://www.ibm.com/developerworks/java/
Just the JDBC FAQs: http://www.jguru.com/faq/JDBC
Going International? http://www.jguru.com/faq/I18N
Que Java400? http://www.jguru.com/faq/Java400

----- Original Message -----
From: "Ashish Kulkarni" <kulkarni_ash1312@xxxxxxxxx>
To: <java400-l@xxxxxxxxxxxx>
Sent: Wednesday, July 30, 2008 10:05 AM
Subject: XML file and Japanese characters


Hi,
Has anyone worked on creating an XML file from a database that holds Japanese data with 3-byte characters?
The AS400 file is created with CCSID 5026. I need to get data from this file and create an XML file, which will be sent to another program.
Currently the issue is that when I create the XML file with UTF-8, these Japanese characters become unreadable.
So how do I convert these characters to readable UTF-8? Or do I have to create the XML file with some other encoding?
Any ideas? Has anyone worked on a project where you need to get data from a non-English database into an XML file?

Ashish



--
This is the Java Programming on and around the iSeries / AS400 (JAVA400-L) mailing list
To post a message email: JAVA400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/java400-l
or email: JAVA400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/java400-l.


