So in this case, if I change the XML to UTF-16 or ISO 8859-1,
will it work?
From: Joe Sam Shirah <joe_sam@xxxxxxxxxxxxx>
Subject: Re: XML file and Japanese characters
To: "Java Programming on and around the iSeries / AS400"
<java400-l@xxxxxxxxxxxx>
Date: Wednesday, July 30, 2008, 12:26 PM
Hi Ashish,
If you have enough control over your XML generation, Thorbjørn's
suggestion makes sense and should pretty much work anywhere.
AFAIK, CCSID 5026 stores its Japanese characters as double byte (only),
so the triple bytes you see are, I believe, artifacts of UTF-8 encoding.
UTF-8 needs three bytes for code points above hex 07FF. So, you could
change the encoding to UTF-16. But that's only part of the story. Other
parts are the encoding you use to save the file and the tool you use to
read it.
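
The byte counts mentioned above can be checked with a small sketch like the one below. It is only an illustration: the class and method names are made up for this example, and it assumes Java 7 or later for StandardCharsets, which is newer than this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named only for this illustration.
class EncodedWidths {
    /** Number of bytes the text occupies when encoded as UTF-8. */
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes the text occupies as UTF-16 (big-endian, no BOM). */
    public static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // One character each: ASCII, Latin-1, the last two-byte code point,
        // the first three-byte code point, and a Japanese kanji (U+65E5).
        String[] samples = { "A", "\u00E9", "\u07FF", "\u0800", "\u65E5" };
        for (String s : samples) {
            System.out.printf("U+%04X: UTF-8 = %d byte(s), UTF-16 = %d byte(s)%n",
                    (int) s.charAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

For the kanji U+65E5 this reports three bytes in UTF-8 and two in UTF-16, so three-byte sequences in a UTF-8 file are expected for Japanese text and do not by themselves indicate corruption.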
I ran into an issue the other day that gave me fits and renewed my
appreciation of what Java does for you. It was pretty simple: a
straightforward HTML error page for Apache that included French. I got
the famous boxes and question marks, even though I specified UTF-8 as
the encoding. The base problem was that Windows WordPad defaulted to the
system encoding (1252, I think). I tried saving as Unicode, but WordPad
writes a BOM and the browsers didn't like it. I could have found a tool
that would save it properly, but people down the road might not have it,
so I owned up to my red face and changed the encoding to ISO 8859-1,
which worked for the French characters. With Java in between, I never
would have seen the issue at all.
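
As a rough sketch of what "Java in between" can look like, the example below writes a small XML document with an explicitly named charset and reads it back, once with the matching charset and once with the wrong one. The class name, method names, and temp file are hypothetical, and the code assumes Java 8 or later, so it is not what was available when this message was written.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, named only for this illustration.
class ExplicitCharsetDemo {
    /** Writes the text using the given charset, never the platform default. */
    public static void write(Path file, String text, Charset charset) {
        try {
            Files.write(file, text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole file, decoding its bytes with the given charset. */
    public static String read(Path file, Charset charset) {
        try {
            return new String(Files.readAllBytes(file), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("charset-demo", ".xml");
        // The declared encoding matches the charset used to write the bytes.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<msg>fran\u00E7ais \u65E5\u672C\u8A9E</msg>\n";
        write(file, xml, StandardCharsets.UTF_8);

        String matching = read(file, StandardCharsets.UTF_8);
        String mismatched = read(file, StandardCharsets.ISO_8859_1);
        System.out.println("Read as UTF-8, text intact:      " + xml.equals(matching));
        System.out.println("Read as ISO 8859-1, text intact: " + xml.equals(mismatched));
        Files.delete(file);
    }
}
```

The first comparison is true and the second is false: the bytes on disk are the same, but a reader that assumes a different encoding turns the French and Japanese characters into garbage, which is the boxes-and-question-marks effect described above.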
So, I believe the moral is: if you're using an encoding other than the
default on your box, be sure the tools you use are capable of saving and
reading that encoding. HTH,
Joe Sam
Joe Sam Shirah - http://www.conceptgo.com
conceptGO - Consulting/Development/Outsourcing
Java Filter Forum:
http://www.ibm.com/developerworks/java/
Just the JDBC FAQs: http://www.jguru.com/faq/JDBC
Going International? http://www.jguru.com/faq/I18N
Que Java400? http://www.jguru.com/faq/Java400
----- Original Message -----
From: "Ashish Kulkarni" <kulkarni_ash1312@xxxxxxxxx>
To: <java400-l@xxxxxxxxxxxx>
Sent: Wednesday, July 30, 2008 10:05 AM
Subject: XML file and Japanese characters
> Hi
> Has anyone worked on creating an XML file from a Japanese database
> that has 3-byte characters?
> The AS400 file is created with CCSID 5026. I need to get data from this
> file and create an XML file, which will be sent to another program.
> Currently the issue is that when I create the XML file with UTF-8,
> these Japanese characters become something unreadable.
> So how do I convert these characters to readable UTF-8? Or do I have to
> create the XML file with some other encoding?
> Any ideas? Has anyone worked on a project where you need to get data
> from a non-English database into an XML file?
>
> Ashish
>
>
>
> --
> This is the Java Programming on and around the iSeries / AS400
> (JAVA400-L) mailing list
> To post a message email: JAVA400-L@xxxxxxxxxxxx
> To subscribe, unsubscribe, or change list options,
> visit: http://lists.midrange.com/mailman/listinfo/java400-l
> or email: JAVA400-L-request@xxxxxxxxxxxx
> Before posting, please take a moment to review the archives
> at http://archive.midrange.com/java400-l.
>
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.