Hi Ashish,

> So in this case, if I change the XML to UTF-16 or ISO 8859-1,
> will it work?

ISO 8859-1 will only work using Thorbjørn's suggestion; that is,
everything must be in that encoding, which is why he suggested the &#xyz;
formatting (notice that's *formatting*, not encoding). That's because IFS
files default to ASCII (on CCSID 37 machines, AFAIK).
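
For illustration, the &#xyz; formatting is what XML calls a numeric character reference: each character outside the chosen range is written as its code point in decimal. Below is a minimal sketch, assuming the XML text is assembled as a Java String before it is written out. The class and method names are hypothetical, and control characters below 0x20 are not handled.

```java
final class XmlAsciiEscaper {
    // Replaces XML-special and non-ASCII characters with references, so the
    // result contains only 7-bit ASCII and survives any ASCII-compatible encoding.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);       // handles surrogate pairs
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:
                    if (cp < 0x80) {
                        out.appendCodePoint(cp);                   // plain ASCII
                    } else {
                        out.append("&#").append(cp).append(';');   // decimal reference
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+65E5 U+672C are the two kanji for "Japan"
        System.out.println(escape("\u65E5\u672C <Tokyo>"));
        // prints: &#26085;&#26412; &lt;Tokyo&gt;
    }
}
```

Because the output is pure ASCII, the encoding named in the XML declaration matters much less for the escaped characters.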

For other encodings, you have to explicitly set the encoding when
writing/reading the file. Just changing the encoding parameter in the XML
text doesn't do anything. You should be able to write UTF-8 as easily (or
with as much difficulty) as anything else. I am assuming that you gen the
XML, then write it to disk, then your web service reads the XML file.
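
As a rough sketch of what "explicitly set the encoding" can look like in Java, the example below names the charset on both the write and the read instead of relying on the platform default. The class name, file name, and element name are made up for illustration, and any IFS-specific CCSID tagging of the file is outside this sketch.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8XmlFile {
    // Writes the declaration and body as UTF-8 bytes, regardless of file.encoding.
    static void write(Path file, String xmlBody) throws IOException {
        try (Writer out = new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8)) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write(xmlBody);
        }
    }

    // Reads the bytes back and decodes them with the same charset.
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".xml");   // hypothetical file
        write(file, "<name>\u65E5\u672C</name>\n");
        System.out.println(read(file));
        Files.delete(file);
    }
}
```

The point of the sketch is that the declaration and the bytes on disk agree because both come from the same explicit charset choice.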

The really bad news is that apparently Java has a problem with Unicode
BOMs as well; see this link for an explanation and proposed solution:

Java's UTF-8 and Unicode writing is broken - Use this fix
http://tripoverit.blogspot.com/2007/04/javas-utf-8-and-unicode-writing-is.html

I haven't tried it myself, but I'd be really interested in the results.
As you may have noticed, I don't have a lot of spare time at the moment.
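
One BOM behavior that can be shown without the linked article: the JDK's UTF-8 decoder does not strip a leading byte order mark, so it arrives in the decoded text as the character U+FEFF. The following is only a sketch of one way to drop it after decoding, with hypothetical names; it is not the fix proposed at the link above.

```java
import java.nio.charset.StandardCharsets;

final class BomStripper {
    // Decodes UTF-8 bytes and removes a leading U+FEFF if the writer emitted a BOM.
    static String decodeUtf8(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 form of the BOM, followed by "<a/>"
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(decodeUtf8(withBom));   // prints: <a/>
    }
}
```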


Joe Sam

Joe Sam Shirah - http://www.conceptgo.com
conceptGO - Consulting/Development/Outsourcing
Java Filter Forum: http://www.ibm.com/developerworks/java/
Just the JDBC FAQs: http://www.jguru.com/faq/JDBC
Going International? http://www.jguru.com/faq/I18N
Que Java400? http://www.jguru.com/faq/Java400

----- Original Message -----
From: "Ashish Kulkarni" <kulkarni_ash1312@xxxxxxxxx>
To: "Java Programming on and around the iSeries / AS400"
<java400-l@xxxxxxxxxxxx>
Sent: Wednesday, July 30, 2008 3:07 PM
Subject: Re: XML file and Japanese characters


Hi,
I am not worried about how they are displayed.
My requirement is to create an XML file and then call a web service using this
XML file.
Then it will be up to the web service to interpret the data.
So in this case, if I change the XML to UTF-16 or ISO 8859-1,
will it work?

A$HI$H


--- On Wed, 7/30/08, Joe Sam Shirah <joe_sam@xxxxxxxxxxxxx> wrote:

From: Joe Sam Shirah <joe_sam@xxxxxxxxxxxxx>
Subject: Re: XML file and Japanese characters
To: "Java Programming on and around the iSeries / AS400"
<java400-l@xxxxxxxxxxxx>
Date: Wednesday, July 30, 2008, 12:26 PM
Hi Ashish,

If you have enough control over your XML generation, Thorbjørn's
suggestion makes sense and should pretty much work anywhere.

AFAIK, CCSID is a double byte (only) set, so the triple bytes you see
are, I believe, artifacts of UTF-8 encoding. That occurs at hex values
above 07FF. So, you could change the encoding to UTF-16. But that's only
part of the story. Other parts are the encoding you use to save the file
and the tool you use to read it.
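
The 07FF boundary mentioned above can be checked with a few lines of Java. This is a small sketch with hypothetical names that counts the bytes each encoding produces for a single code point; UTF-16BE is used so that no BOM is added to the count.

```java
import java.nio.charset.StandardCharsets;

final class EncodedWidth {
    // Number of bytes UTF-8 needs for one code point.
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes UTF-16 (big-endian, no BOM) needs for one code point.
    static int utf16Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        int[] samples = {0x0041, 0x00E9, 0x07FF, 0x0800, 0x65E5};
        for (int cp : samples) {
            System.out.printf("U+%04X  UTF-8: %d  UTF-16: %d%n",
                    cp, utf8Length(cp), utf16Length(cp));
        }
    }
}
```

Code points up to 07FF take at most two bytes in UTF-8, while 0800 and the kanji U+65E5 take three; in UTF-16 all of these samples take two.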

I ran into an issue the other day that gave me fits and renewed my
appreciation of what Java does for you. It was pretty simple: a
straightforward HTML error page for Apache that included French. I got the
famous boxes and question marks, even though I specified encoding in UTF-8.
The base problem was that Windows WordPad defaulted to system encoding (1252
I think.) I tried saving as Unicode, but WordPad uses BOM and the browsers
didn't like it. I could have found a tool that would save it properly, but
people down the road might not have it, so I owned up to my red face and
changed the encoding to ISO 8859-1, which worked for the French characters.
With Java in between, I never would have seen the issue at all.

So, I believe the moral is: if you're using other than default encoding
on your box, be sure the tools you use are capable of saving and reading the
encodings. HTH,



Joe Sam


----- Original Message -----
From: "Ashish Kulkarni" <kulkarni_ash1312@xxxxxxxxx>
To: <java400-l@xxxxxxxxxxxx>
Sent: Wednesday, July 30, 2008 10:05 AM
Subject: XML file and Japanese characters


> Hi
> Has anyone worked with creating an XML file from a Japanese database
> which has 3 byte characters?
> The AS400 file is created with CCSID 5026. I need to get data from this
> file and create an XML file, which will be sent to another program.
> Currently the issue is that when I create the XML file with UTF-8, these
> Japanese characters become something unreadable.
> So how do I convert these characters to readable UTF-8? Or do I have to
> create the XML file with some other encoding?
> Any ideas? Has anyone worked on a project where you need to get data
> from a non-English database into an XML file?
>
> Ashish

--
This is the Java Programming on and around the iSeries /
AS400 (JAVA400-L) mailing list
To post a message email: JAVA400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/java400-l
or email: JAVA400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/java400-l.




