RE: Vertical bar translation -- MIDRANGE-L

Tom, I don't know whether this'll make sense, but it's straight from the Printer Device Programming manual, which has a lot on fonts, etc., in Appendix D.

Character sets are used with code pages to determine how each character will appear in the printed output. Code pages consist of hexadecimal identifiers (code points) assigned to character identifiers.

As I understand this, a character set is all the possible characters in a font, which IBM identifies as a type family (Sonoran Serif, e.g.), a typeface (bold compressed), and a type size (10 pt) - Sonoran Serif 10 pt bold.

A code page is a subset of the characters in the whole set, mapped to the 256 possible hex values of a single-byte character (double-byte sets not considered for now). These hex values (code points) are the actual values stored in the file, and they never change.

Try going into qsh, then enter
cat > codepage.txt
This is the currency symbol for the U.S. - $
F3

Then run wrklnk, find codepage.txt, put a 5 on it, and it ought to look like the above. If you take F10, you can see the code points (the hex values). Take F15, take option 3, change the code page to 285, press ENTER, then F12 - you should see the British currency symbol of old - the pound. Take F10 again, and you will see that the hex value has not changed. Both the pound and the dollar are in the character set - the code page determines which one is displayed for code point 5B.
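
For anyone without a 5250 session handy, here is a rough sketch of the same experiment in Python. As far as I know, Python's standard library has no codec for code page 285, so this substitutes two EBCDIC code pages it does ship (cp037 and cp500) and looks at code point 4F rather than 5B. That swap is my assumption, not something from the manual. The point is the same: the stored byte never changes, only the code page used to display it.

```python
# One stored byte (code point), shown through two EBCDIC code pages.
# cp037 and cp500 stand in for the 37/285 pair used in the walkthrough.
CODE_POINT = bytes([0x4F])

def show(code_point: bytes, code_page: str) -> str:
    """Return the character a given code page assigns to the byte."""
    return code_point.decode(code_page)

as_037 = show(CODE_POINT, "cp037")   # US/Canada EBCDIC
as_500 = show(CODE_POINT, "cp500")   # International EBCDIC

print(f"hex {CODE_POINT.hex().upper()} under cp037: {as_037!r}")
print(f"hex {CODE_POINT.hex().upper()} under cp500: {as_500!r}")

# The dollar sign from the walkthrough sits at 5B in cp037.
dollar = show(bytes([0x5B]), "cp037")
print(f"hex 5B under cp037: {dollar!r}")
```

Under cp037 the byte 4F comes out as a vertical bar, while under cp500 the same byte comes out as an exclamation mark.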

HTH a little

Vern

At 09:28 PM 10/21/2003 -0400, you wrote:

Rob:

No good answer, just a bunch of comments. I run into similar cases once every year or two. Each time, I go through the laborious process of cross-referencing CCSIDs and/or code pages and/or character sets, and probably EBCDIC, ASCII and decimal values as well. I've never been sure what all the parts actually mean. But I've created my mental maps with guidelines that seem to work much of the time in letting me track issues down.

I'm not writing any of this claiming any of it's correct. More hoping that someone who _knows_ something will jump in and make corrections or add refinements. Some day I'd really like to learn this and I suspect many feel the same.

First, what is a "character set"? I tend to think of them as applying mostly to the human/machine interface. The character set determines which character to display or print. This is very closely related to a font in that a font will specify how the exact shape of a character will be drawn. I'm not sure if a "character set" is ever actually used unless data is ready to be presented to a human. The character set, then, determines whether a particular bit pattern in your code page will be rendered as a vertical bar or as an upper-case, superscripted Q.

Next, why "code pages"? Since code pages seem to be tied to language groups, I've thought that code page translation has to do with machine logic. In particular, collating sequences or some such. Some bit patterns in some languages are assigned to special forms of letters -- a lower-case "o" with an umlaut for example. (Off the top of my head; I have no idea how that fits into any collating sequence in any language.) But the bit pattern for a simple "o" compared to an umlauted "o" should be sortable in a meaningful sense, and the sequence might differ from language to language.

Therefore, translation from one code page/character set to another must be defined so that collating continues to make sense as data travel between unlike systems. When code page/character sets go with data, the collating can be handled in predictable ways as long as there's universal agreement. Some languages might have umlaut-"o" all the time and need them sequenced right alongside regular "o"; others might have no significant use for them. Efficiency alone might be enough reason to micro-code differently for them.
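
One concrete way byte values and ordering interact, offered as a hedged sketch: sorting the same strings by their raw bytes gives a different order in EBCDIC than in ASCII. This only illustrates binary ordering of code points, not the language-aware sort sequences speculated about above. The sample words and the choice of cp037 are my own assumptions.

```python
# Hypothetical sample data: a lower-case word, an upper-case word, digits.
words = ["apple", "Zebra", "42"]

def sort_by_bytes(items, code_page):
    """Sort strings by their encoded byte values in the given code page."""
    return sorted(items, key=lambda s: s.encode(code_page))

ascii_order = sort_by_bytes(words, "ascii")
ebcdic_order = sort_by_bytes(words, "cp037")

print("ASCII byte order: ", ascii_order)
print("EBCDIC byte order:", ebcdic_order)
```

In ASCII, digits sort before upper case and upper case before lower case. In EBCDIC (cp037 here) it runs the other way around: lower case, then upper case, then digits.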

As for CCSIDs, these seem to combine a code page and a character set in a single identifier. Code pages and character sets seem to be losing out world-wide to CCSIDs.

Okay, so you have a keyboard setting and/or display device setting that tells your computer something about the _meanings_ of the bit patterns that get sent when you strike a key (code page/character set). You also have a file with CCSID that says something about what translation should happen when data goes into/out of the file. At the other end, when another human looks at the data on maybe a printout, the bit patterns were sent to the printer which had settings about what valid characters it could render (code page/character set).

You mention a file as the intermediary that had CCSID 65535, i.e., no translation, no interpretation. I suspect that unless your input device had the exact settings as the output device, some characters will be misrepresented. This can be true even if you type on your keyboard, view the result on your display and then print the file on your personal printer plugged right into your PC. Since you often have no way to predict anything on the output side, you'll have to perform explicit conversion. CCSID 65535 will probably have to go. Specify an actual CCSID.
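
A minimal sketch of what "explicit conversion" amounts to, and of why the receiving side still matters. Everything specific here is an assumption for illustration: the sample text, cp037 as the source code page, cp1252 as a typical Windows code page and cp437 as a typical US DOS console code page. The byte A6 is just an example of one value that two PC code pages render differently, not a diagnosis of the actual file.

```python
# Bytes as they might sit in an EBCDIC file (cp037 assumed for illustration).
ebcdic_data = "A|B".encode("cp037")

# Explicit conversion: decode with the source code page, encode for the target.
text = ebcdic_data.decode("cp037")
pc_data = text.encode("cp1252")

print(ebcdic_data.hex())  # code points on the EBCDIC side
print(pc_data.hex())      # code points after conversion

# The receiving side still has to pick the right code page to display with.
one_byte = bytes([0xA6])
in_windows = one_byte.decode("cp1252")  # typical Windows GUI code page
in_dos = one_byte.decode("cp437")       # typical US DOS console code page
print(repr(in_windows), repr(in_dos))
```

Printing the hex of the data, as done above, is also a quick way to see which code points actually arrived on the PC before guessing at what a viewer is doing with them.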

Now, hopefully, someone else will add some truly useful bits to this. I've exhausted my conjecture. Maybe we can even get a decent FAQ put together that describes all these in real-world terms.

Tom Liotta

midrange-l-request@xxxxxxxxxxxx wrote:
>   5. Vertical bar translation (rob@xxxxxxxxx)
>
>I have a file in a library.  The CCSID of this file, (from DSPFD) is
>65535.
>I use the following command on this file:
>CPYTOSTMF  +
>                          FROMMBR('/qsys.lib/ediftpdta.lib/gxsftpun.f+
>                          ile/gxsftpun.mbr') +
>                          TOSTMF('/ediftpdta/edioutifs') +
>                          STMFOPT(*REPLACE) STMFCODPAG(*STMF) +
>                          ENDLINFMT(*CRLF)
>Then I use FTP in the standard ASCII method to move this file to another
>platform.
>
>Problem:  Hex CA is a vertical bar.  On my keyboard it's the cap's mode of
>the backslash just above the Enter key.
>If I ftp this down to my PC and use either Notepad or Wordpad it looks
>great - a vertical bar.
>However if I use TYPE thefilename from DOS it looks like a superscripted
>underlined capital Q.
>I don't know how to use these utilities to display the ascii code of the
>data.
--
Tom Liotta
The PowerTech Group, Inc.
19426 68th Avenue South
Kent, WA 98032
Phone  253-872-7788 x313
Fax    253-872-7904
http://www.powertech.com
_______________________________________________
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.