Vernon,

A 3-byte UTF-8 character may or may not be supported in EBCDIC, depending
on the EBCDIC CCSID. ...


// Replacing Microsoft smart quotes, dashes and ellipsis
myField = replaceUTF8unit(myField:'E28098':'''':819);  // U+2018 left single quote
myField = replaceUTF8unit(myField:'E28099':'''':819);  // U+2019 right single quote
myField = replaceUTF8unit(myField:'E2809C':'"':819);   // U+201C left double quote
myField = replaceUTF8unit(myField:'E2809D':'"':819);   // U+201D right double quote
myField = replaceUTF8unit(myField:'E28093':'-':819);   // U+2013 en dash
myField = replaceUTF8unit(myField:'E28094':'--':819);  // U+2014 em dash
myField = replaceUTF8unit(myField:'E280A6':'...':819); // U+2026 horizontal ellipsis

// Replacing EUR sign with text
myField = replaceUTF8unit(myField:'E282AC':'EUR':819); // U+20AC euro sign
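
The replaceUTF8unit helper itself is not shown in this thread. As a rough
illustration of the same idea, here is a minimal sketch in Python rather
than RPG. The function name replace_utf8_units and the lookup table are
hypothetical, and the CCSID argument (819) is left out: the sketch only
swaps the UTF-8 byte sequences listed above for plain ASCII text.

```python
# Hypothetical sketch, not the RPG helper from the post:
# map UTF-8 byte sequences (as hex strings) to ASCII replacements.
REPLACEMENTS = {
    "E28098": "'",    # U+2018 left single quote
    "E28099": "'",    # U+2019 right single quote
    "E2809C": '"',    # U+201C left double quote
    "E2809D": '"',    # U+201D right double quote
    "E28093": "-",    # U+2013 en dash
    "E28094": "--",   # U+2014 em dash
    "E280A6": "...",  # U+2026 horizontal ellipsis
    "E282AC": "EUR",  # U+20AC euro sign
}


def replace_utf8_units(data: bytes) -> bytes:
    """Replace each listed UTF-8 byte sequence with its ASCII text."""
    for hex_seq, ascii_text in REPLACEMENTS.items():
        data = data.replace(bytes.fromhex(hex_seq), ascii_text.encode("ascii"))
    return data


sample = "\u201cPrice\u201d \u2013 5 \u20ac\u2026".encode("utf-8")
print(replace_utf8_units(sample))
```

Working on the raw bytes is safe here because a complete UTF-8 sequence
can never match in the middle of another character's bytes.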

On Wed, Jul 12, 2017 at 8:14 PM, Vernon Hamberg <vhamberg@xxxxxxxxxxxxxxx>
wrote:

Henrik

Exactly - in the case I just found, the UTF-8 byte string was 3 bytes -
the HORIZONTAL ELLIPSIS. It preceded the emoji, which was at byte # 381 -
but XML-SAX said the error was at position 379 - that led me to think that
the count was of characters, the emoji is 1 character - as you describe
here - so that was correct if by character, not if by bytes.
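
The difference between the two counts can be reproduced with a small
Python sketch. The input is made up for illustration: 377 ASCII
characters, then the horizontal ellipsis, then an emoji (U+1F600 is an
assumption, the thread does not say which emoji it was). With a single
3-byte character in front of it, the emoji sits 2 positions further along
when counted in bytes than when counted in characters.

```python
# Constructed example, not the actual XML from the thread.
ELLIPSIS = "\u2026"    # 3 bytes in UTF-8
EMOJI = "\U0001F600"   # assumed emoji, 4 bytes in UTF-8

text = "a" * 377 + ELLIPSIS + EMOJI
data = text.encode("utf-8")

# 1-based position of the emoji, counted in characters and in bytes.
char_pos = text.index(EMOJI) + 1
byte_pos = data.index(EMOJI.encode("utf-8")) + 1

print("character position:", char_pos)
print("byte position:", byte_pos)
```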

The SQL approach is showing much promise!

Thanks
Vern


On 7/12/2017 12:52 PM, Henrik Rützou wrote:

What exactly goes wrong here is that UTF-8 is a Unicode encoding that uses
1 to 4 single bytes per character, UTF-16 is a Unicode encoding that uses
1 or 2 double-byte units per character, and UTF-32 is a fixed 4-byte
encoding.

In other words, you need up to 4 bytes to cover the full range of Unicode.

EBCDIC and ASCII are single-byte encodings, so there is no way to cover
the whole range of Unicode in 256 characters.

UCS-2 is a fixed double-byte encoding, so there is no way to cover the
whole range of Unicode in its 65,536 characters either.

The emojis probably have values beyond the 64K limit of UCS-2, and that is
why it fails: each of them needs two 2-byte units (a UTF-16 surrogate
pair) = 4 bytes, and that is not supported by ICONV.
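
The sizes described above can be checked with a short Python sketch. The
function name encoded_sizes and the sample characters are chosen only for
illustration (the emoji U+1F600 is an assumption); the sketch reports how
many bytes or 16-bit units each encoding needs for one character.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the storage needed for one character in each encoding."""
    return {
        "code_point": ord(ch),
        "utf8_bytes": len(ch.encode("utf-8")),
        # The -be variants are used so that no byte order mark is counted.
        "utf16_units": len(ch.encode("utf-16-be")) // 2,
        "utf32_bytes": len(ch.encode("utf-32-be")),
    }


for ch in ("A", "\u20ac", "\u2026", "\U0001F600"):
    sizes = encoded_sizes(ch)
    print(f"U+{sizes['code_point']:04X}", sizes)
```

Any character whose code point is above U+FFFF reports two UTF-16 units,
which is the surrogate pair that a plain UCS-2 field cannot hold.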

So in any case the Unicode in the XML has to be cleaned up so that it fits
either EBCDIC or UCS-2. It is as simple as that.
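
One possible shape for such a cleanup, sketched in Python under the
assumption that it is acceptable to substitute a placeholder for anything
UCS-2 cannot hold. The function name strip_non_bmp and the "?" placeholder
are hypothetical, not something proposed in the thread.

```python
def strip_non_bmp(text: str, replacement: str = "?") -> str:
    """Replace every character above U+FFFF so the result fits UCS-2."""
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


dirty = "ok\u2026\U0001F600!"
clean = strip_non_bmp(dirty)

# After cleanup every character fits in exactly one 16-bit unit.
assert len(clean.encode("utf-16-be")) == 2 * len(clean)
print(clean.encode("unicode_escape").decode("ascii"))
```

Characters such as the ellipsis survive this step because they are inside
the 64K range; whether they also fit the target EBCDIC CCSID is a separate
question.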



On Wed, Jul 12, 2017 at 7:01 PM, Mike Jones <mike.jones.sysdev@xxxxxxxxx>
wrote:

Hi Vernon,

Three cheers for just considering using the SQL XML functionality
(XMLTABLE function), even if you don't end up using it.

I helped someone parse some XML on one of these forums once and had it
working using XMLTABLE in about an hour. Days later, the person was still
tinkering around trying to get the more laborious ways of parsing to work.

Best wishes for a speedy solution...

Mike

On Tue, Jul 11, 2017 at 2:50 PM, Vernon Hamberg <
vhamberg@xxxxxxxxxxxxxxx>
wrote:

I'm aware of - and would like to use - the SQL XML functionality, which
does NOT seem to have these problems - I've tried a few bits. Problem is,
it'd be a rewrite of this program.

Cheers
Vern


--
This is the RPG programming on the IBM i (AS/400 and iSeries) (RPG400-L)
mailing list
To post a message email: RPG400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/rpg400-l
or email: RPG400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/rpg400-l.

Please contact support@xxxxxxxxxxxx for any subscription related
questions.

Help support midrange.com by shopping at amazon.com with our affiliate
link: http://amzn.to/2dEadiD










This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
