On 21-Feb-2012 02:39, Åke Olsson wrote:
We have the following:
A physical file (customer file) which has CCSID = 37
A user (in Poland) who has CCSID = 870 and a Client Access host
code page also set to 870.
What we see is that a specific character in a name field is stored
as X'69' in the database, but when the user program retrieves it, it is
translated to X'72'.
The same happens the other way: when the user writes an X'72' character,
it is stored in the database as X'69'.
In many ways this makes sense; the char the user sees and keys is a
capital "E" with some accents underneath.
What is slightly frustrating is that I cannot find a table that
describes the rules for the translation. Is there one anywhere?
I would need something along the lines of:
CCSID-37 hex code translated to CCSID-870 hex code
A definitive table defining the rules is not so easy to find. There
are nuances for the types of conversion, and each nuance would
require a separate table. Any one code page can be viewed online at the
IBM [software] globalization pages:
IBM Software -> Globalization -> Coded character sets and related
resources -> Coded character set identifiers
http://www.ibm.com/software/globalization/ccsid/ccsid_registered.html
Even so, I sometimes create two tables, each with a one-byte character
column tagged with one of the two CCSIDs, then copy every character as a
row from one to the other and review the outcome; i.e. effectively
generate a report which is a table showing the effect of the
translate-with-CCSID. If interested, I could script that. I have also
exported the '.txt' code page tables [see below] to do other reporting,
mostly to find which elements have no match in the other code page.
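For anyone wanting to try the same idea off the system, here is a minimal sketch in Python. It is an analogy, not the IBM i method described above: Python converts through Unicode rather than through the system's CCSID conversion tables, so it will not reproduce substitutions such as X'69' to X'72'. As far as I know the standard library also does not ship a cp870 codec, so the sketch assumes cp037 and cp500 as the pair. The function names are made up for illustration.

```python
def build_conversion_table(source="cp037", target="cp500"):
    """Map each source byte to its target byte, or None if there is no match."""
    table = {}
    for byte in range(256):
        try:
            # Decode the single byte to a character, then re-encode it.
            char = bytes([byte]).decode(source)
            table[byte] = char.encode(target)[0]
        except UnicodeError:
            # The character has no equivalent in the target code page.
            table[byte] = None
    return table


def print_differences(table):
    """Report only the bytes that change, or that have no match."""
    for src, dst in table.items():
        if dst is None:
            print(f"X'{src:02X}' -> (no match)")
        elif src != dst:
            print(f"X'{src:02X}' -> X'{dst:02X}'")


if __name__ == "__main__":
    print_differences(build_conversion_table())
```

Entries mapped to None are the "no match in the other" elements; everything else is the hex-to-hex table asked for, limited to the code pages Python knows about.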
Links to a text version and a .pdf of each code page follow the two
code points described:
Code Page 0037, 0x69 is the letter LN200000 N Tilde Capital
Code Page 0870, 0x72 is the letter LE440000 E Ogonek Capital
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00037.txt
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00037.pdf
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00870.txt
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00870.pdf
It seems that the above code point pairing is used between those code
pages to allow /round trip/ of the data, even though the code point does
not represent the same letter after just one translation. That
is possible because cp870 does not include LN200000, nor does cp37
include LE440000. However, that would not be helpful to anyone using
cp00037 or any other code page in which the Ñ is represented; i.e. for
them, the name which should have the upper case E with the "accents
underneath" would instead appear with the upper case Ñ.
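The round-trip effect can be sketched with nothing more than the one pairing reported in this thread. This is a toy model, not the actual conversion table: the dictionaries hold a single entry each, and the function names are hypothetical.

```python
# The single pairing reported in this thread; a real table covers all 256 bytes.
CP37_TO_CP870 = {0x69: 0x72}
CP870_TO_CP37 = {dst: src for src, dst in CP37_TO_CP870.items()}

# Character each code page assigns to its own byte, per the listings above.
CP37_GLYPHS = {0x69: "\u00d1"}   # LN200000, N Tilde Capital
CP870_GLYPHS = {0x72: "\u0118"}  # LE440000, E Ogonek Capital


def write_as_polish_user(keyed_byte):
    """Job CCSID 870 -> file CCSID 37 (what gets stored)."""
    return CP870_TO_CP37[keyed_byte]


def read_as_polish_user(stored_byte):
    """File CCSID 37 -> job CCSID 870 (what the program sees)."""
    return CP37_TO_CP870[stored_byte]


stored = write_as_polish_user(0x72)
print(f"stored as X'{stored:02X}'")
print(f"CCSID 870 user reads X'{read_as_polish_user(stored):02X}':",
      CP870_GLYPHS[read_as_polish_user(stored)])
print(f"CCSID 37 user reads X'{stored:02X}':", CP37_GLYPHS[stored])
```

The byte survives the trip for the CCSID 870 user, but a CCSID 37 user reading the same stored byte without conversion gets a different letter.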
Often the /International EBCDIC/ CCSID 500 would be chosen for the
database file to limit such problems, but I presume even that would not
help, because that LE440000 character is not in CP00500 either.
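A quick way to check that presumption away from the system is to ask a codec whether it can encode the character at all. The sketch below assumes Python's cp037 and cp500 codecs are a fair stand-in for the IBM code pages of the same number; `can_encode` is a hypothetical helper.

```python
E_OGONEK = "\u0118"  # LE440000, E Ogonek Capital
N_TILDE = "\u00d1"   # LN200000, N Tilde Capital


def can_encode(char, codec):
    """Return True if the codec has a code point for the character."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for codec in ("cp037", "cp500"):
    print(codec,
          "E Ogonek:", can_encode(E_OGONEK, codec),
          "N Tilde:", can_encode(N_TILDE, codec))
```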
Regards, Chuck