Hi

It is possible that the CSV file is in UTF-8 -- the hex 31 you describe is what the number 1 would be in UTF-8 (the same byte as in plain ASCII).

It is possible that there is a BOM (byte order mark) at the beginning of the file - for UTF-8 that would be these 3 bytes - 0xEF,0xBB,0xBF. But it is optional and might not be there.
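To make the BOM check concrete, here is a minimal sketch in Python, run on a copy of the file off the IBM i. The function name and the sample bytes are made up for illustration. It drops the 3 BOM bytes if they are there, then decodes strictly, so anything that is not valid UTF-8 raises an error instead of passing through quietly.

```python
import codecs


def decode_utf8_file_bytes(raw: bytes) -> str:
    """Strip an optional UTF-8 BOM, then decode strictly as UTF-8."""
    # codecs.BOM_UTF8 is the 3 bytes 0xEF, 0xBB, 0xBF
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    # Strict decoding raises UnicodeDecodeError if the bytes are not valid UTF-8
    return raw.decode("utf-8")


# Hypothetical sample: BOM, the digit 1, a comma, two Chinese characters, CRLF
sample = b"\xef\xbb\xbf1,\xe4\xb8\xad\xe6\x96\x87\r\n"
text = decode_utf8_file_bytes(sample)
print(repr(text))
```

If the strict decode fails on the real file, that is a good hint the data is in some other encoding and the UTF-8 guess needs another look.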

With UTF-8 the byte strings for individual characters can be 1, 2, 3, or 4 bytes long. Take a look at the file in WRKLNK, if you can identify the position where there is a Chinese character, press F10 to see the actual bytes. There might be 1 blank in the character display but 3 or 4 bytes in the hex display on the right.
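As a quick illustration of those byte lengths, this small Python sketch prints the UTF-8 bytes for a few sample characters. The helper name and the characters chosen are only examples; the hex it prints is what you would look for on the right side of the hex display.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a character as spaced, uppercase hex."""
    return ch.encode("utf-8").hex(" ").upper()


# A digit, an accented letter, a Chinese character, and an emoji
for ch in ["1", "\u00e9", "\u4e2d", "\U0001F600"]:
    encoded = ch.encode("utf-8")
    print(repr(ch), len(encoded), "byte(s):", utf8_hex(ch))
```

The digit comes out as the single byte 31, while the Chinese character takes 3 bytes and the emoji takes 4.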

If it IS UTF-8, you might try marking the file with CCSID 1208 and do a text transfer, not a binary transfer.

Or mark the field in your PF as CCSID 1208, do the binary transfer, then see what RPG does with it. In RPG, do an EVAL from the UTF-8 field to a regular CCSID 37 EBCDIC field.
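To see ahead of time which values have no home in CCSID 37, a rough check can be done off-platform. This Python sketch assumes Python's "cp037" codec is a close enough stand-in for CCSID 37; it says nothing about what RPG itself substitutes during the EVAL. The function name is made up for illustration.

```python
def fits_ccsid_37(value: str) -> bool:
    """Return True if every character has a code point in Python's cp037 codec."""
    try:
        value.encode("cp037")
        return True
    except UnicodeEncodeError:
        # At least one character is not representable in this EBCDIC code page
        return False


print(fits_ccsid_37("ABC 123"))         # plain Latin text
print(fits_ccsid_37("\u4e2d\u6587"))    # two Chinese characters
print("1".encode("cp037").hex())        # the digit 1 as an EBCDIC byte
```

Note that this flags anything outside the code page, emojis included, not just Chinese characters.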

Good luck - we are working with this stuff, too, here - another developer is officially on the task, but I've done work with importing UTF-8 that includes emojis and special characters that are not in 37.

Vern

On 5/13/2020 9:36 AM, smith5646midrange@xxxxxxxxx wrote:
I have a client that is receiving a .csv file for import to the IBM i (I
sure wish Outlook would quit capitalizing that letter for me!!!!) that
contains Chinese characters, literally. When the file lands in the IFS,
based on the hex bytes it looks like it is ASCII DBCS. I say that because
the number 1 is hex 31 but I know Chinese characters require DBCS.


This file is then imported from the IFS to a DB2 file in the same character
set (binary transfer). From there, it is "converted" to a normal DB2 file
with CCSID 37 data (and the Chinese characters, which apparently are no
longer DBCS characters, so they are just weird single-byte characters).


We need to be able to blank the entire "cell" if it contains any Chinese
characters. To add complexity, not every cell in the column has Chinese
characters and we want to keep the cells with the non-Chinese values. This
step can be done anywhere that it is convenient, whether it be on the IFS or
either version of the DB2 files.
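The blanking step described above could be sketched like this in Python, working on the CSV text after it has been decoded. This is only an illustration under assumptions: "Chinese character" is taken to mean the common CJK ideograph blocks listed in the pattern, the data is assumed to be a plain comma-separated file, and the function name is made up.

```python
import csv
import io
import re

# Assumed definition of "Chinese character": CJK Extension A, the main CJK
# Unified Ideographs block, CJK Compatibility Ideographs, and Extension B
CJK = re.compile("[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\U00020000-\U0002a6df]")


def blank_cjk_cells(csv_text: str) -> str:
    """Blank every cell that contains at least one CJK ideograph."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        # Keep cells without CJK characters, replace the others with ""
        writer.writerow(["" if CJK.search(cell) else cell for cell in row])
    return out.getvalue()


# Hypothetical sample: one all-Chinese cell and one mixed cell
sample = "1,\u4e2d\u6587,abc\n2,def,x\u4e2dy\n"
print(blank_cjk_cells(sample))
```

Cells with no Chinese characters pass through unchanged, and a cell is blanked whole even when only one character in it matches.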


My client does not approve of Java code on the IBM i but I've been given the
authority to use Java to do this "if I have to". I have no idea why they
threw that statement out there unless someone has already researched and
found a Java solution but they can't code in Java. However, since I have
done very little with Java called from RPGLE and my Java skill set is from
the 1990s, I would rather have an RPGLE solution if that is possible.


So, what solutions have I not stumbled over yet?


Also, some of this terminology might be wrong such as the "ASCII DBCS".
Please feel free to correct any terminology so I know for later when I am
trying to explain this.






This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].