MIDRANGE dot COM Mailing List Archive




Re: Conversion from UTF-8 to EBCDIC with iconv




On 30-Jan-2014 07:04 -0800, Jan Grove Vejlstrup wrote:
I want to use iconv to convert from UTF-8 (CCSID 1208) to EBCDIC
(CCSID 500). What happens if I have a character that exists in
UTF-8 but not in EBCDIC?

I suppose that depends somewhat on what is requested on the invocation. For example, specifying a /best-fit/ "conversion alternative" should provide for "nonidentical conversions performed based on the substitution alternative" specified. I think that quoted text meant to say "conversion alternative" rather than "substitution alternative"; i.e. although the two are specified separately [in the "fromcode string"], the docs seem to refer to them interchangeably, while apparently intending to refer to just one of the effects.

But as Bruce suggested, when a substitution is required, the "substitution character" for the target CCSID should be expected; the "substitution alternative" specification enables feedback on the number of substitution characters that were placed in the outbuf. Anyhow, from the docs:

<http://pic.dhe.ibm.com/infocenter/iseries/v7r1m0/topic/apis/iconv.htm>
iconv()--Code Conversion API
"...
During conversion, iconv() may encounter valid characters in the input buffer that do not exist in the target CCSID. This is known as a character mismatch. In this case, iconv() performs the conversion based on the conversion alternative specified on the iconv_open() function.
...
_Parameters_

cd

INPUT

The conversion descriptor returned by the iconv_open() or QtqIconvOpen() function that represents the following:
• CCSIDs to convert from and to
• The conversion alternative to use for character mismatches
• The substitution alternative
..."
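
To make the "character mismatch" and substitution-count idea concrete, here is a toy model in C. It is not the IBM i implementation and it does not call iconv(); the names toy_to_ebcdic and toy_convert are hypothetical, the table covers only a handful of invariant EBCDIC code points, and the input is assumed to be already-decoded Unicode code points rather than UTF-8 bytes. The only point is the behaviour described above: a character with no mapping becomes the substitution character of the target (X'3F' in EBCDIC), and the number of such substitutions can be reported back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EBCDIC_SUB 0x3F  /* SUB control character in EBCDIC code pages */

/* Hypothetical toy table: returns the EBCDIC byte for a Unicode code point,
 * or -1 when this (deliberately tiny) table has no mapping for it. */
static int toy_to_ebcdic(uint32_t cp)
{
    if (cp == 0x20)               return 0x40;                    /* space */
    if (cp >= 0x41 && cp <= 0x49) return 0xC1 + (int)(cp - 0x41); /* A..I  */
    if (cp >= 0x4A && cp <= 0x52) return 0xD1 + (int)(cp - 0x4A); /* J..R  */
    if (cp >= 0x53 && cp <= 0x5A) return 0xE2 + (int)(cp - 0x53); /* S..Z  */
    if (cp >= 0x30 && cp <= 0x39) return 0xF0 + (int)(cp - 0x30); /* 0..9  */
    return -1;                                                    /* mismatch */
}

/* Converts n code points into n EBCDIC bytes. Every mismatch is replaced
 * by EBCDIC_SUB; the return value is the number of substitutions made. */
static size_t toy_convert(const uint32_t *in, size_t n, unsigned char *out)
{
    size_t substitutions = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int b = toy_to_ebcdic(in[i]);
        if (b < 0) {
            b = EBCDIC_SUB;
            substitutions++;
        }
        out[i] = (unsigned char)b;
    }
    return substitutions;
}
```

Feeding it the three code points U+0043 (C), U+20AC (euro sign) and U+0031 (1) gives the bytes X'C3', X'3F', X'F1' and a substitution count of 1, since the toy table has no entry for the euro sign.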


<http://pic.dhe.ibm.com/infocenter/iseries/v7r1m0/topic/apis/iconvopn.htm>
iconv_open()--Code Conversion Allocation API
"...
_Conversion alternative_. The conversion alternative that is selected to convert graphic character data. This value is only used on the fromcode parameter. The following values can be used:

000 The IBM®-defined default conversion method and the associated conversion tables. Most of the default tables follow the round-trip conversion criterion. For the default tables that do not follow the round-trip conversion criterion, see the i5/OS globalization topic collection.

057 The enforced subset match (substitution) criterion. For the CCSID conversion pairs that support this criterion, see i5/OS globalization.

102 The best-fit conversion criterion for character mismatch.
..."
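
As a rough sketch of how those values might be placed in the code strings, the C helpers below build the 32-byte fromcode and tocode descriptors. The field layout is an assumption taken from my recollection of the iconv_open() documentation and should be checked against it: "IBMCCSID", a 5-digit CCSID, then (fromcode only) the 3-digit conversion alternative, one digit for the substitution alternative, three further one-digit options left as '0', and the rest binary zeros. The helper names build_fromcode and build_tocode, the CONV_* constants, and the assumption that a substitution alternative of 1 requests the substitution count are mine, not the API's.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { CODE_LEN = 32 };   /* assumed descriptor length for both strings */

/* Conversion alternatives as listed in the quoted iconv_open() text. */
#define CONV_DEFAULT   0    /* 000: IBM-defined default conversion     */
#define CONV_SUBSET    57   /* 057: enforced subset match              */
#define CONV_BEST_FIT  102  /* 102: best-fit for character mismatch    */

/* Hypothetical helper: fills code[0..31] with the assumed fromcode layout.
 * subst_alt is assumed to be 0 (no count returned) or 1 (count returned). */
static void build_fromcode(char *code, int ccsid, int conv_alt, int subst_alt)
{
    assert(code != NULL);
    assert(ccsid >= 0 && ccsid <= 65535);
    assert(conv_alt >= 0 && conv_alt <= 999);
    assert(subst_alt == 0 || subst_alt == 1);

    memset(code, 0, CODE_LEN);              /* reserved bytes stay zero */
    snprintf(code, CODE_LEN, "IBMCCSID%05d%03d%d000",
             ccsid, conv_alt, subst_alt);   /* 20 characters plus NUL  */
}

/* Hypothetical helper: fills code[0..31] with the assumed tocode layout,
 * which carries only the CCSID followed by zeros. */
static void build_tocode(char *code, int ccsid)
{
    assert(code != NULL);
    assert(ccsid >= 0 && ccsid <= 65535);

    memset(code, 0, CODE_LEN);
    snprintf(code, CODE_LEN, "IBMCCSID%05d", ccsid);
}
```

For the conversion in the original question, build_fromcode(fromcode, 1208, CONV_BEST_FIT, 1) yields "IBMCCSID012081021000" followed by zeros, and build_tocode(tocode, 500) yields "IBMCCSID00500" followed by zeros; the two buffers would then be passed as iconv_open(tocode, fromcode).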






