  • Subject: Re: Character maps - GNU recode.
  • From: Carey Evans <c.evans@xxxxxxxxxxxx>
  • Date: 11 Feb 2000 00:23:21 +1300
  • User-Agent: Gnus/5.0803 (Gnus v5.8.3) XEmacs/21.1 (Bryce Canyon)

"Jason M. Felice" <jasonf@shell.nacs.net> writes:

> Apparently, version 3.5 of GNU recode (I don't know about prior versions)
> installs as a library and contains functions for translating from anything to
> anything.  Since we have about 30 IBM maps on one side of the equation, and 
> about 10 ISO maps on the other side of the equation (not to mention possible
> support for newer xterms which can support UTF8 - and I heard something about
> Linux console and UTF8), this means that we'd either need to produce 600
> translation map tables (one for each direction), or 80 (one to Unicode for
> each map and one from Unicode for each map), or just use librecode.  Guess
> which one I'm voting for?

I can't find any reference to IBM840.  Poland is supposed to use
CCSID 870, according to the AS/400 National Language Reference.
tn5250 doesn't support iso-8859-2 anyway.

The list of IBM charsets is at:

  http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/QB3AWC01/G.2

Recode 3.5 knows about 42 character sets with space in 0x40, which are
probably EBCDIC, although the 0.15.6 transmaps.h only contains 39
EBCDIC character sets.  I've got a newer version of the transmaps
script which I think fixes this.

Only 26 of recode's EBCDIC character sets correspond to IBM numeric
CCSIDs, actually.  This seems to be the same as glibc's data,
too. (grep -l 'x40.*SPACE' /usr/share/i18n/charmaps/IBM* | wc -l)
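
The "space in 0x40" heuristic can be tried without recode or glibc installed. This is only a rough sketch using the codecs that ship with Python; the candidate list is an arbitrary sample chosen for illustration, not the set recode or glibc knows about.

```python
# Arbitrary sample of codec names bundled with Python (an assumption for
# illustration; recode and glibc ship many more character sets than this).
CANDIDATES = ["cp037", "cp500", "cp875", "cp1026", "latin_1", "iso8859_2", "cp437"]


def looks_like_ebcdic(codec_name):
    """Return True if byte 0x40 decodes to a space in the given codec."""
    try:
        return bytes([0x40]).decode(codec_name) == " "
    except UnicodeDecodeError:
        return False


# Keep only the codecs that pass the heuristic.
ebcdic_like = [name for name in CANDIDATES if looks_like_ebcdic(name)]
print(ebcdic_like)
```

In the ASCII-based sets, 0x40 is "@", so they drop out of the list.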

Perhaps there's a smaller number of mappings that we should consider,
anyway.  Looking at the character sets, "-m 870" might as well use
ISO-8859-2 instead of ISO-8859-1, since the special characters won't
make sense on a Latin-1 terminal either way.  (In fact, IBM's
description of CCSID 870 is "Latin-2 Multilingual".)

I think this would go by the "Character Set" column in the IBM manual.
Everything with character set 697 should work with ISO-8859-1.  For
Greek, CCSID 423 with character set 218 would only work properly with
ISO-8859-7 (or Windows codepage 1253).
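
To illustrate why the target character set matters, here is a minimal sketch that builds a 256-byte table by going through Unicode, which is roughly what a transmaps-style table would contain. It uses Python's bundled codecs, not tn5250 or recode code. The helper name build_table and the "?" fallback are my own assumptions, and cp875 stands in for a Greek EBCDIC set because Python has no codec for CCSID 423.

```python
def build_table(src, dst, fallback=0x3F):
    """Build a 256-byte table mapping each byte of `src` to `dst` via Unicode.

    Bytes with no single-byte equivalent in `dst` become `fallback`
    (0x3F, '?' in ASCII). Returns the table and the number of fallbacks.
    """
    table = bytearray(256)
    misses = 0
    for b in range(256):
        try:
            out = bytes([b]).decode(src).encode(dst)
        except UnicodeError:
            out = b""
        if len(out) == 1:
            table[b] = out[0]
        else:
            table[b] = fallback
            misses += 1
    return bytes(table), misses


# cp037 is an EBCDIC Latin-1 set, so ISO-8859-1 is a natural target.
latin1_table, latin1_misses = build_table("cp037", "latin_1")
print(b"\xC8\xC5\xD3\xD3\xD6".translate(latin1_table))  # EBCDIC "HELLO"

# A Greek EBCDIC set loses far more characters on Latin-1 than on ISO-8859-7.
_, greek_on_latin1 = build_table("cp875", "latin_1")
_, greek_on_greek = build_table("cp875", "iso8859_7")
print(greek_on_latin1, greek_on_greek)
```

The two fallback counts on the last line give a rough measure of how well a target set fits a given EBCDIC set.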

That would make a total of 52 tables, if we support all the numeric
EBCDIC CCSIDs in each direction, and pick an appropriate ISO-8859-x
character set for each.  That's an enormous 13K of memory taken up.
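
For reference, the table counts in this thread work out as follows. This is just the arithmetic, assuming each table is a flat 256-byte array (one output byte per input byte value); the variable names are mine.

```python
# Figures quoted earlier in the thread.
ibm_maps = 30
iso_maps = 10

# One table per (IBM, ISO) pair, in each direction.
pairwise_tables = ibm_maps * iso_maps * 2

# One table to Unicode and one from Unicode for every map.
via_unicode_tables = (ibm_maps + iso_maps) * 2

# The reduced proposal: numeric EBCDIC CCSIDs only, each direction,
# with one chosen ISO-8859-x partner per CCSID.
numeric_ccsids = 26
reduced_tables = numeric_ccsids * 2

# Assumption: each table is 256 single-byte entries.
TABLE_BYTES = 256
reduced_bytes = reduced_tables * TABLE_BYTES

print(pairwise_tables, via_unicode_tables, reduced_tables, reduced_bytes)
```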

> The drawbacks are that most distributions don't have GNU recode installed by
> default (I'm assuming, RH6 and RH6.1 don't, neither does RH5.2).

Debian comes with recode, at least.  With the size of the library, I
wouldn't recommend bundling it:

$ ls -l /usr/lib/librecode.so.0.0.0
-rwxr-xr-x    1 root     root       493848 Jun 18  1999 /usr/lib/librecode.so.0.0.0*

> Okay, the usual suspects who participated in the iconv() discussion should
> participate here, especially if you know of any shortcomings of the GNU recode
> library.

Somewhat related to this, I've noticed that the current transmaps.h
generated by recode 3.4 doesn't map the control characters (like
horizontal tab), although recode 3.5 seems to have fixed this.  Does
this affect us?
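
One way to check a given table for this is sketched below. It does not inspect the real transmaps.h. Instead it builds a reference table from Python's cp037 codec, then zeroes the 0x00-0x3F range as a hypothetical stand-in for a table generated without control mappings. The helper names and the choice of controls are assumptions for illustration.

```python
def to_latin1_table(codec_name, fallback=0x3F):
    """Build a 256-byte EBCDIC-to-Latin-1 table from a Python codec."""
    table = bytearray([fallback]) * 256
    for b in range(256):
        try:
            out = bytes([b]).decode(codec_name).encode("latin_1")
        except UnicodeError:
            continue  # leave the fallback in place
        if len(out) == 1:
            table[b] = out[0]
    return bytes(table)


# EBCDIC control positions (per cp037) and the ASCII control each should become.
EXPECTED_CONTROLS = {0x05: 0x09, 0x0D: 0x0D, 0x25: 0x0A}  # HT, CR, LF


def missing_controls(table):
    """List the EBCDIC control bytes that the table does not map as expected."""
    return [src for src, dst in EXPECTED_CONTROLS.items() if table[src] != dst]


full = to_latin1_table("cp037")

# Hypothetical stand-in for a table without control mappings: 0x00-0x3F zeroed.
no_controls = bytes(0 if b < 0x40 else full[b] for b in range(256))

print(missing_controls(full), missing_controls(no_controls))
```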

> If there is no clean solution, I'll just provide the extra translation map
> "pl-iso-8859-2" for the interim.

Would anyone be affected if it just replaced the existing IBM870 table?

-- 
         Carey Evans  http://home.clear.net.nz/pages/c.evans/

         This message was composed from the finest electrons
         used by many of the world's greatest writers.
+---
| This is the LINUX5250 Mailing List!
| To submit a new message, send your mail to LINUX5250@midrange.com.
| To subscribe to this list send email to LINUX5250-SUB@midrange.com.
| To unsubscribe from this list send email to LINUX5250-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---
