  • Subject: Re: Character maps - GNU recode.
  • From: "Jason M. Felice" <jasonf@xxxxxxxxxxxxxx>
  • Date: Thu, 10 Feb 2000 09:49:09 -0500

On Fri, Feb 11, 2000 at 12:23:21AM +1300, Carey Evans wrote:
> "Jason M. Felice" <jasonf@shell.nacs.net> writes:
> 
> > Apparently, version 3.5 of GNU recode (I don't know about prior versions)
> > installs as a library and contains functions for translating from anything
> > to anything.  Since we have about 30 IBM maps on one side of the equation, and
> > about 10 ISO maps on the other side of the equation (not to mention possible
> > support for newer xterms which can support UTF8 - and I heard something
> > about Linux console and UTF8), this means that we'd either need to produce 600
> > translation map tables (one for each direction), or 80 (one to Unicode for
> > each map and one from Unicode for each map), or just use librecode.  Guess
> > which one I'm voting for?
> 
> I can't find any reference to IBM840.  Poland is supposed to use
> CCSID 870, according to the AS/400 National Language Reference.
> tn5250 doesn't support iso-8859-2 anyway.

It was 870, not 840.  Apologies.  The reason tn5250 doesn't support 
iso-8859-2 is just because that's not what comes out the other end of the
translation maps, so far as I can tell.  The Polish code page submission
was basically an 'alternate' encoding of IBM-870 which translated to
iso-8859-2 instead of iso-8859-1.
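
To make the "one map to Unicode and one map from Unicode" option from the
quoted paragraph concrete, here is a rough Python sketch (not tn5250 code).
It composes two codecs through Unicode to build a single 256-byte table and
redoes the table-count arithmetic.  The codec names cp500 and iso-8859-1 are
stand-ins picked for illustration, build_map is a hypothetical helper, and
the 30 and 10 are the approximate counts mentioned above.

```python
# Sketch only: build one EBCDIC -> ISO translation table by pivoting through
# Unicode. "cp500" and "iso-8859-1" are illustrative stand-ins, not the
# maps tn5250 actually ships.

def build_map(src: str, dst: str, fallback: int = ord("?")) -> bytes:
    """Return a 256-byte table mapping each src byte to a dst byte."""
    table = bytearray(256)
    for byte in range(256):
        try:
            char = bytes([byte]).decode(src)   # source charset -> Unicode
            out = char.encode(dst)             # Unicode -> target charset
        except UnicodeError:
            out = b""                          # no equivalent on one side
        table[byte] = out[0] if len(out) == 1 else fallback
    return bytes(table)

# Table counts from the discussion (approximate figures).
ibm_maps, iso_maps = 30, 10
direct_tables = ibm_maps * iso_maps * 2      # every pair, both directions
pivot_tables = (ibm_maps + iso_maps) * 2     # to and from Unicode per map

if __name__ == "__main__":
    ebcdic_to_latin1 = build_map("cp500", "iso-8859-1")
    # EBCDIC bytes for "Hello", translated with the generated table.
    print(bytes([0xC8, 0x85, 0x93, 0x93, 0x96]).translate(ebcdic_to_latin1))
    print(direct_tables, pivot_tables)
```

Swapping the second argument for "iso-8859-2" would give the kind of
alternate table the Polish submission amounts to, assuming a matching
source codec is available.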

> 
> The list of IBM charsets is at:
> 
>   http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/QB3AWC01/G.2
> 
> Recode 3.5 knows about 42 character sets with space in 0x40, which are
> probably EBCDIC, although the 0.15.6 transmaps.h only contains 39
> EBCDIC character sets.  I've got a newer version of the transmaps
> script which I think fixes this.
> 
> Only 26 of recode's EBCDIC character sets correspond to IBM numeric
> CCSIDs, actually.  This seems to be the same as glibc's data,
> too. (grep -l 'x40.*SPACE' /usr/share/i18n/charmaps/IBM* | wc -l)
> 
> Perhaps there's a smaller number of mappings that we should consider,
> anyway.  Looking at the character sets, "-m 870" might as well use
> ISO-8859-2 instead of -1 anyway - the special characters won't make
> sense on a Latin-1 terminal either way.  (In fact, IBM's description
> of CCSID 870 is "Latin-2 Multilingual".)

Hmm, this might make a decent interim solution, but it makes handling some
of the larger character sets difficult or impossible.  Some important
ones are Chinese, Japanese, and Klingon. (Does IBM have a Klingon codepage?
Linux does.)
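
A quick way to see the limitation, sketched in Python rather than in the
emulator's own code: a 256-entry byte-to-byte map can emit only one byte per
character, so anything that needs more than one byte on the output side
cannot go through it.  fits_single_byte is a hypothetical helper and the
sample strings are just illustrations.

```python
# Sketch only: a 256-entry byte-to-byte map can only emit one byte per
# character, so check whether every character of a string fits that model.

def fits_single_byte(text: str, charset: str) -> bool:
    """True if each character encodes to exactly one byte in charset."""
    for char in text:
        try:
            if len(char.encode(charset)) != 1:
                return False
        except UnicodeEncodeError:
            return False                     # character not in charset at all
    return True

if __name__ == "__main__":
    print(fits_single_byte("Hello", "iso-8859-1"))
    print(fits_single_byte("日本語", "iso-8859-1"))
    print(fits_single_byte("日本語", "utf-8"))
```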

> 
> I think this would go by the "Character Set" column in the IBM manual.
> Everything with character set 697 should work with ISO-8859-1.  For
> Greek, CCSID 423 with character set 218 would only work properly with
> ISO-8859-7 (or Windows codepage 1253).
> 
> That would make a total of 52 tables, if we support all the numeric
> EBCDIC CCSIDs in each direction, and pick an appropriate ISO-8859-x
> character set for each.  That's an enormous 13K of memory taken up.

Hmm, I think this burns the DBCS/Unicode bridge.
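
For reference, the 13K figure quoted above works out if each table is a flat
256-byte array, one per CCSID per direction.  The 256-byte layout is my
assumption; the 26 and 52 come from the quoted text.  A DBCS set would not
fit this layout at all.

```python
# Sketch only: memory taken by single-byte translation tables.
ccsids = 26             # numeric EBCDIC CCSIDs, per the quoted text
directions = 2          # host -> terminal and terminal -> host
bytes_per_table = 256   # assumption: one output byte per input byte

tables = ccsids * directions
total_bytes = tables * bytes_per_table

if __name__ == "__main__":
    print(tables, total_bytes, total_bytes / 1024)
```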

> 
> > The drawbacks are that most distributions don't have GNU recode installed by
> > default (I'm assuming, RH6 and RH6.1 don't, neither does RH5.2).
> 
> Debian comes with recode, at least.  With the size of the library, I
> wouldn't recommend bundling it:
> 
> $ ls -l /usr/lib/librecode.so.0.0.0
> -rwxr-xr-x    1 root     root       493848 Jun 18  1999 /usr/lib/librecode.so.0.0.0*

I agree, but we would need to do *something* to make it easier (and
clearer) to install the emulator.

> 
> > Okay, the usual suspects who participated in the iconv() discussion should
> > participate here, especially if you know of any shortcomings of the GNU
> > recode library.
> 
> Somewhat related to this, I've noticed that the current transmaps.h
> generated by recode 3.4 doesn't map the control characters (like
> horizontal tab), although recode 3.5 seems to have fixed this.  Does
> this affect us?

No, we handle all the EBCDIC control characters, and we shouldn't be displaying
any ASCII characters which aren't printable.

> 
> > If there is no clean solution, I'll just provide the extra translation map
> > "pl-iso-8859-2" for the interim.
> 
> Would anyone be affected if it just replaced the existing IBM870 table?
> 

I'm curious.

-Jay 'Eraserhead' Felice
+---
| This is the LINUX5250 Mailing List!
| To submit a new message, send your mail to LINUX5250@midrange.com.
| To subscribe to this list send email to LINUX5250-SUB@midrange.com.
| To unsubscribe from this list send email to LINUX5250-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---


This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
