On 21-Feb-2012 02:39, Åke Olsson wrote:
We have the following:

A Physical file (customer file) which has CCSID = 37

A user (in Poland) which has CCSID = 870 and a client-access host
code page also set as 870.

What we see is that a specific character in a name field is stored
as X'69' in the database, but when the user program retrieves it,
it is translated to X'72'.

The same thing happens the other way: when the user writes an X'72'
character, it is stored in the database as X'69'.

In many ways this makes sense; the char the user sees and keys is a
capital "E" with some accents underneath.

What is slightly frustrating is that I cannot find a table that
describes the rules for the translation. Is there one anywhere?
I would need something along the lines:

CCSID-37 hex code translated to CCSID-870 hex code


Finding one definitive table that defines the rules is not so easy. There are several types of conversion, each with its own nuances, and each would require a separate table. Any one code page, however, can be viewed online at the IBM [software] globalization pages:
IBM Software -> Globalization -> Coded character sets and related resources -> Coded character set identifiers
http://www.ibm.com/software/globalization/ccsid/ccsid_registered.html

Even so, I will sometimes create two tables, each with a single one-byte character column, one per CCSID. I then copy every character as a row from one to the other and review the outcome; in effect, that generates a report showing what the translate-with-CCSID does to each code point. If interested, I could script that. I have also exported the '.txt' code page tables [see below] for other reporting, mostly to find which elements have no match in the other code page.
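The report described above can be approximated off the system as well. The following is only a minimal sketch in Python, not the database method itself: it maps each byte of one code page to the byte holding the same character in another, going through Unicode. The function name build_table is made up for illustration. Python's standard codecs do not include code page 870, so the example compares cp037 with cp500 instead. A strict character-for-character mapping like this also will not reproduce any substitution pairings the system conversion uses for characters that have no match.

```python
def build_table(src, dst):
    """Map each byte of code page `src` to the byte of `dst` that holds
    the same character. Bytes with no counterpart are listed separately."""
    table = {}
    unmatched = []
    for b in range(256):
        try:
            ch = bytes([b]).decode(src)
            out = ch.encode(dst)
        except (UnicodeDecodeError, UnicodeEncodeError):
            # Undefined in src, or the character does not exist in dst.
            unmatched.append(b)
            continue
        if len(out) == 1:
            table[b] = out[0]
        else:
            unmatched.append(b)
    return table, unmatched


table, unmatched = build_table("cp037", "cp500")

# Report only the code points that move between the two code pages.
for src_byte in sorted(table):
    if table[src_byte] != src_byte:
        print(f"X'{src_byte:02X}' -> X'{table[src_byte]:02X}'")
print("no match:", [f"X'{b:02X}'" for b in unmatched])
```

Swapping in other codec names gives the same kind of report for any pair of single-byte code pages that Python knows about.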

Links to a text version and a .pdf of each code page follow the two code points in question:

Code Page 0037, 0x69 is the letter LN200000 N Tilde Capital
Code Page 0870, 0x72 is the letter LE440000 E Ogonek Capital
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00037.txt
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00037.pdf
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00870.txt
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00870.pdf

It seems that the above code point pairing is used between those two code pages to allow a /round trip/ of the data, even though after just one translation the code point no longer represents the same letter. That is possible because cp870 does not include LN200000, nor does cp37 include LE440000. However, that would not be helpful to anyone using cp00037 or any other code page in which the Ñ is represented; i.e. where the name should have the upper case E with the "accents underneath", they would instead see the upper case Ñ.
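To illustrate the round-trip idea only, here is a toy sketch in Python. It is not IBM's actual algorithm, and the two dictionaries are tiny hand-made excerpts rather than the full code pages. Characters with the same graphic character ID are mapped to each other, and whatever is left over on each side is paired up so that every byte still has a unique partner. Pairing the leftovers in sorted order is purely an assumption for this example; the real tables fix their own pairings.

```python
def round_trip_table(cp_a, cp_b):
    """Build a one-to-one byte mapping from cp_a to cp_b.
    cp_a and cp_b map a byte value to a graphic character ID."""
    ids_b = {gcgid: byte for byte, gcgid in cp_b.items()}

    # First pass: the same character ID exists in both code pages.
    forward = {byte: ids_b[gcgid]
               for byte, gcgid in cp_a.items() if gcgid in ids_b}

    # Second pass: pair the leftovers so the mapping stays reversible.
    # (Sorted order is an arbitrary choice made for this sketch.)
    left_a = sorted(b for b in cp_a if b not in forward)
    used_b = set(forward.values())
    left_b = sorted(b for b in cp_b if b not in used_b)
    forward.update(zip(left_a, left_b))
    return forward


# Hand-made excerpts: plain capital E plus the two code points discussed.
cp37_excerpt = {0xC5: "LE020000", 0x69: "LN200000"}
cp870_excerpt = {0xC5: "LE020000", 0x72: "LE440000"}

forward = round_trip_table(cp37_excerpt, cp870_excerpt)
backward = {dst: src for src, dst in forward.items()}

for src, dst in sorted(forward.items()):
    print(f"X'{src:02X}' -> X'{dst:02X}' -> X'{backward[dst]:02X}'")
```

Each byte comes back unchanged after the two translations, even though X'69' and X'72' are different letters in their respective code pages.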

Often the /International EBCDIC/ CCSID 500 would be chosen for the database file to limit such problems, but I presume even that would not help, because that LE440000 character is not in CP00500 either.
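That presumption can be checked roughly outside the system. The sketch below assumes that Python's cp037 and cp500 codecs match the IBM code pages of the same number, and the helper name can_encode is made up for illustration. It asks each codec whether it can represent the E Ogonek Capital (U+0118) and the N Tilde Capital (U+00D1).

```python
def can_encode(ch, codec):
    """Return True if the character exists in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


E_OGONEK = "\u0118"  # LE440000, E Ogonek Capital
N_TILDE = "\u00d1"   # LN200000, N Tilde Capital

for codec in ("cp037", "cp500"):
    print(codec,
          "E Ogonek:", can_encode(E_OGONEK, codec),
          "N Tilde:", can_encode(N_TILDE, codec))
```

Both codecs should report that the E Ogonek is missing and the N Tilde is present, so a file in either CCSID has nowhere to store the E Ogonek as itself.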

Regards, Chuck


This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.