On 17-Oct-2016 11:33 -0500, Tom L. Deskevich wrote:
I have to say I do not know the double byte character set.
DBCS is little different than SBCS, when the topic is EBCDIC CCSIDs;
they are all quite nearly language-specific. Just take a quick peek at a
list of [Coded Character Set IDentifiers]
(http://www.ibm.com/software/globalization/ccsid/ccsid_registered.html)
for what each is /named/, to get an idea.
But that CCSID (65535), which we use for file creation, is our
standard for any language that uses symbols and such for the
language.
For an /installed/ file, I recall there being a special process [as part
of database restore] whereby the file is implicitly tagged with the
CCSID of the primary language of the system. Is the noted file actually
one built for a packaged program product, but not yet installed as such?
If that is the actual final form of the file, with the column
intentionally tagged as *HEX, then I see little reason to operate
differently for DBCS than SBCS, other than for additional storage; i.e.
if languages other than the Latin-like ones are represented with
CCSID(65535), then consistency would have the Latin-like languages
handled the same way.
I am creating the extract files, so I have some flexibility.
I think the application probably is best reviewed for the topics of
/globalization/; e.g. some topics in the KC:
(https://www.ibm.com/support/knowledgecenter/search/globalization?scope=ssw_ibm_i_73)
so I should use CCSID 1209 for UTF8?
CCSID 1208 is for UTF8. For /global/ data, i.e. not specific to a
language, a very encompassing CCSID is required for tagging the data in
a column, or the data must be tagged with a CCSID in an alternate
manner; e.g. as a stored value in another column, to identify the
encoding of the data in the otherwise BINARY/undescribed column of the
same row.
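As a rough illustration of that second approach [a CCSID stored in a companion column], here is a minimal Python sketch. The CCSID_TO_CODEC table and the decode_tagged helper are hypothetical names of my own, the mapping covers only a few single-byte CCSIDs plus UTF8, and the rows are made-up sample data; nothing here is an IBM i API.

```python
# Hypothetical mapping from a few CCSIDs to Python codec names.
# Only a small subset is shown; a real application would need many more.
CCSID_TO_CODEC = {
    37: "cp037",     # EBCDIC, US English
    500: "cp500",    # EBCDIC, International Latin-1
    1208: "utf-8",   # UTF8
}


def decode_tagged(ccsid: int, raw: bytes) -> str:
    """Decode undescribed column bytes using the CCSID stored beside them."""
    codec = CCSID_TO_CODEC.get(ccsid)
    if codec is None:
        # 65535 (*HEX) lands here too: there is nothing to translate from.
        raise ValueError(f"no codec known for CCSID {ccsid}")
    return raw.decode(codec)


# Made-up rows: (CCSID column, binary data column)
rows = [
    (37, "Hello".encode("cp037")),
    (1208, "Grüße".encode("utf-8")),
]

decoded = [decode_tagged(ccsid, raw) for ccsid, raw in rows]
print(decoded)
```

The point of the design is just that the tag travels with each row, so a reader never has to guess the encoding of the binary column.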
I am not saying CCSID(*HEX) cannot be used in any particular
application, just that any utilities that reference the column data,
and that depend on properly tagged data, are not going to function in a
generally-desirable manner. That is similar to how generic utilities
[like FTP] used with program-described files are not always going to do
nice things to the non-text [binary] data when using text-mode transfer
between encoding schemes [or even between code pages within the same
encoding scheme]. Similarly, with a binary/image-mode transfer, any
references to the transferred text-data in another encoding scheme are
going to show gibberish, even though the binary portion of the data
transports without any corruption.
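The text-mode versus binary-mode trade-off can be mimicked in a few lines of Python. This is only a sketch under assumptions of mine: a made-up record holding seven EBCDIC characters followed by a three-byte packed-decimal field, with cp037 standing in for the EBCDIC side and Latin-1 for the ASCII side. It does not call FTP at all; it only imitates what each transfer mode does to the bytes.

```python
# A made-up program-described record: 7 bytes of text + packed decimal +12345
text = "INVOICE"
packed = bytes([0x12, 0x34, 0x5C])
record = text.encode("cp037") + packed

# "Text mode": every byte is translated EBCDIC -> Latin-1, packed field included
text_mode = record.decode("cp037").encode("latin-1")

# "Binary mode": the bytes are transported untouched
binary_mode = record

# Text survives a text-mode transfer, but the packed field does not
text_ok_in_text_mode = text_mode[:7].decode("latin-1") == text
packed_ok_in_text_mode = text_mode[7:] == packed

# The packed field survives a binary transfer, but the text reads as gibberish
packed_ok_in_binary_mode = binary_mode[7:] == packed
text_ok_in_binary_mode = binary_mode[:7].decode("latin-1") == text

print(text_ok_in_text_mode, packed_ok_in_text_mode)
print(packed_ok_in_binary_mode, text_ok_in_binary_mode)
```

Neither mode gets both halves of the record right, which is why the utility needs to know, per field, what is text and in which CCSID.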
The CPYTOIMPF ignores all CCSID codes except 65535, so I guess I need
another utility to do this?
The Copy To Import File (CPYTOIMPF) essentially ignores only CCSID
65535 [aka *HEX]; i.e. the feature skips translation only when the
column CCSID attribute [or data type] suggests not to translate. IOW, as
an effectively /generic/ utility, the feature must know the column CCSID
to effect desirable results; CCSID(*HEX) is a non-CCSID, serving as an
indication that *no CCSID translation should occur*.
The client access method just doubles up the record length, to
produce the double byte information the way I read it.
Use of the "Client Access data transfer" with the "forced translation"
option has to /assume/, somehow, what the CCSID is of the EBCDIC data
that claims there should be no CCSID translation. If the data is
Japanese character data but the user requesting the transfer is
USEnglish with the default CCSID of 37, the /force translate/
option will convert from 37->ASCII. But the actual data presumably was
stored as 5026, despite being tagged as *HEX, so rather than just
garbage, the probable effect on transfer of the data is garbage plus
conversion errors.
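To show the effect of a wrongly assumed CCSID, here is a small Python sketch. Python's standard codecs include no Japanese EBCDIC code page, so as a stand-in I assume German EBCDIC data (cp273) that a forced translation wrongly treats as US English EBCDIC (cp037); the sample word is made up. With real DBCS data the outcome would be worse than this, because the double-byte sequences have no valid single-byte interpretation.

```python
original = "Größe"

# What is actually stored in the *HEX column: German EBCDIC bytes
stored = original.encode("cp273")

# Forced translation guesses the requester's code page instead
assumed = stored.decode("cp037")

# Decoding with the CCSID the data was really written in
correct = stored.decode("cp273")

print(repr(assumed))
print(repr(correct))
```

The unaccented letters come through either way, because the two code pages agree on them; only the characters on which the code pages differ turn to garbage, which makes this kind of corruption easy to miss in testing.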