"Jason M. Felice" <jasonf@shell.nacs.net> writes: > Apparently, version 3.5 of GNU recode (I don't know about prior versions) > installs as a library and contains functions for translating from anything to > anything. Since we have about 30 IBM maps on one side of the equation, and > about 10 ISO maps on the other side of the equation (not to mention possible > support for newer xterms which can support UTF8 - and I heard something about > Linux console and UTF8), this means that we'd either need to produce 600 > translation map tables (one for each direction), or 80 (one to Unicode for > each map and one from Unicode for each map), or just use librecode. Guess > which one I'm voting for? I can't find any reference to IBM840. Poland is supposed to use CCSID 870, according to the AS/400 National Language Reference. tn5250 doesn't support iso-8859-2 anyway. The list of IBM charsets is at: http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/QB3AWC01/G.2 Recode 3.5 knows about 42 character sets with space in 0x40, which are probably EBCDIC, although the 0.15.6 transmaps.h only contains 39 EBCDIC character sets. I've got a newer version of the transmaps script which I think fixes this. Only 26 of recode's EBCDIC character sets correspond to IBM numeric CCSIDs, actually. This seems to be the same as glibc's data, too. (grep -l 'x40.*SPACE' /usr/share/i18n/charmaps/IBM* | wc -l) Perhaps there's a smaller number of mappings that we should consider, anyway. Looking at the character sets, "-m 870" might as well use ISO-8859-2 instead of -1 anyway - the special characters won't make sense on a Latin-1 terminal either way. (In fact, IBM's description of CCSID 870 is "Latin-2 Multilingual".) I think this would go by the "Character Set" column in the IBM manual. Everything with character set 697 should work with ISO-8859-1. For Greek, CCSID 423 with character set 218 would only work properly with ISO-8859-7 (or Windows codepage 1253). 
That would make a total of 52 tables, if we support all the numeric
EBCDIC CCSIDs in each direction, and pick an appropriate ISO-8859-x
character set for each. That's an enormous 13K of memory taken up.

> The drawbacks are that most distributions don't have GNU recode installed by
> default (I'm assuming, RH6 and RH6.1 don't, neither does RH5.2).

Debian comes with recode, at least. With the size of the library, I
wouldn't recommend bundling it:

$ ls -l /usr/lib/librecode.so.0.0.0
-rwxr-xr-x 1 root root 493848 Jun 18 1999 /usr/lib/librecode.so.0.0.0*

> Okay the usual suspects who participated in the iconv() discussion should
> participate here, especially if you know of any shortcomings of the GNU recode
> library.

Somewhat related to this, I've noticed that the current transmaps.h
generated by recode 3.4 doesn't map the control characters (like
horizontal tab), although recode 3.5 seems to have fixed this. Does
this affect us?

> If there is no clean solution, I'll just provide the extra translation map
> "pl-iso-8859-2" for the interim.

Would anyone be affected if it just replaced the existing IBM870
table?

-- 
Carey Evans http://home.clear.net.nz/pages/c.evans/

This message was composed from the finest electrons used by many of
the world's greatest writers.

+---
| This is the LINUX5250 Mailing List!
| To submit a new message, send your mail to LINUX5250@midrange.com.
| To subscribe to this list send email to LINUX5250-SUB@midrange.com.
| To unsubscribe from this list send email to LINUX5250-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].