I'm undoubtedly missing something, but I still don't understand your
concern.
As you indicated, UTF-16 can require 1 or 2 16-bit code units,
representing 2 or 4 bytes of storage. UTF-8, in the same manner, can
require 1, 2, 3, or 4 8-bit code units, representing 1, 2, 3, or 4
bytes of storage. Both encodings can be used to represent the full range of
characters in the currently defined Unicode standard. Prior to 2003 UTF-8
could, in theory, go beyond 4 8-bit code units, but that capability was
never used and was formally removed from the standard in 2003.
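
To make the code unit counts concrete, here is a minimal sketch in Python
(not RPG). The helper name code_units is made up for illustration; it only
counts how many 8-bit and 16-bit units each encoding needs for the same text.

```python
def code_units(text: str) -> dict:
    """Count code points and code units for text in UTF-8 and UTF-16."""
    return {
        "code_points": len(text),
        # UTF-8 code units are 8 bits wide, so one unit per encoded byte.
        "utf8_units": len(text.encode("utf-8")),
        # UTF-16 code units are 16 bits wide, so two encoded bytes per unit.
        # "utf-16-be" is used so that no byte order mark is added.
        "utf16_units": len(text.encode("utf-16-be")) // 2,
    }


if __name__ == "__main__":
    # U+0041, U+00E9, U+20AC are in the BMP; U+1F600 needs a surrogate pair.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", code_units(ch))
```

For U+1F600 this reports 4 UTF-8 units and 2 UTF-16 units, i.e. 4 bytes of
storage in either encoding.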
The default for RPG is UCS-2 (CCSID 13488) which is, unfortunately, based
on an earlier form of Unicode where a fixed-width 16-bit code unit was
used (no surrogate pair support for a second 16-bit code unit). On the
RPG H-spec you can however specify CCSID(*UCS2 : 1200), which tells RPG that
UTF-16 is to be used as the default Unicode CCSID. It's unfortunate that
the RPG keyword is UCS2 when it's really more than UCS-2, but that's what
happens with keywords sometimes when the world changes. Barbara -- am I
missing something about the RPG implementation of CCSID 1200 support?
UTF-8 is typically the preferred encoding for web applications, but that's
largely due to its ASCII transparency and not due to support for a larger
character set (when compared to UTF-16).
On Tue, Jan 14, 2014 at 7:37 AM, Henrik Rützou <hr@xxxxxxxxxxxx> wrote:
Bruce
UTF-16 is a 2 OR 4 byte Unicode encoding. Unicode has a total of 1,114,112
code points.
In other words you sometimes need 2 DBCS characters to create a single
Unicode code point, so 10 Unicode characters may take up 15 DBCS characters.
Does RPGLE support that - the answer is NO.
On Tue, Jan 14, 2014 at 12:12 AM, Bruce Vining <bvining@xxxxxxxxxxxxxxx>
wrote:
I do not understand this second note at all. CCSID 1200 gives you the full
UTF-16 range and is what I generally use.
A SBCS job environment is going to limit you to 192 discrete EBCDIC
characters if you ask for the Unicode data to be converted to the job
CCSID. But it's your code (someplace) that's asking for that conversion
so don't ask.
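
The effect of asking for a conversion to a single-byte job CCSID can be
sketched in Python. This is only an illustration under assumptions: Python's
cp037 codec stands in for job CCSID 37, the function name is made up, and
Python substitutes "?" where IBM i would use its own substitution character.

```python
def to_job_ccsid_and_back(text: str, codec: str = "cp037") -> str:
    """Round-trip text through a single-byte EBCDIC codec.

    Characters that the codec cannot represent are replaced on the way in,
    so they cannot be recovered on the way back.
    """
    ebcdic = text.encode(codec, errors="replace")
    return ebcdic.decode(codec)


def utf16_round_trip(text: str) -> str:
    """Round-trip text through UTF-16, which loses nothing."""
    return text.encode("utf-16-be").decode("utf-16-be")


if __name__ == "__main__":
    sample = "R\u00fctzou \u03a9 \u20ac"     # u-umlaut, Greek Omega, euro sign
    print(to_job_ccsid_and_back(sample))     # Omega and euro come back as "?"
    print(utf16_round_trip(sample) == sample)
```

The data is only damaged if the conversion to the job CCSID is requested;
kept in UTF-16 it survives unchanged.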
If transforms are needed there are APIs such as iconv() and
QlgTransformUCSData (with support for UTF-8, UTF-16, and UTF-32). Either of
these APIs (plus others) could be easily wrapped within a user function and
hidden from the application developer in terms of their implementation.
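
As a rough sketch of what such a wrapper could look like, here is the idea
in Python rather than RPG. The function name transform is hypothetical, and
Python's built-in codecs stand in for iconv() or QlgTransformUCSData; the
point is only that the caller never sees how the conversion is done.

```python
def transform(data: bytes, from_enc: str, to_enc: str) -> bytes:
    """Convert raw bytes from one Unicode encoding form to another.

    The conversion is computed from the code points, not looked up in a
    translation table, so the full Unicode range is covered.
    """
    return data.decode(from_enc).encode(to_enc)


if __name__ == "__main__":
    # U+1D11E (musical G clef) lies outside the BMP.
    utf8 = "\U0001D11E".encode("utf-8")
    utf16 = transform(utf8, "utf-8", "utf-16-be")
    utf32 = transform(utf16, "utf-16-be", "utf-32-be")
    print(utf8.hex())    # f09d849e  (4 UTF-8 code units)
    print(utf16.hex())   # d834dd1e  (surrogate pair, 2 UTF-16 code units)
    print(utf32.hex())   # 0001d11e  (1 UTF-32 code unit)
```

The big-endian codec names are used so that no byte order mark is added to
the output.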
On Mon, Jan 13, 2014 at 8:38 AM, Henrik Rützou <hr@xxxxxxxxxxxx> wrote:
Joep,
CCSID 1200 or 13488 doesn't basically give you full Unicode support in
RPGLE unless your base or result is UTF-8 and you use binary iconv to
convert between the formats.
Iconv will do correct conversion of large characters (3-4 bytes UTF-8)
into 2*2 bytes UTF-16 (CCSID 1200) since it is a "calculated" conversion
that isn't based on a translation table.
In other words you can calculate the hex conversion of the full Unicode
span between UTF-8, UTF-16 and UTF-32.
The problem is that these string conversions aren't natively supported by
RPGLE as a field type; you have to use raw storage manipulation with iconv
to achieve it.
Basically UTF-8 is a one byte string that shares x'00'-x'7F' with ASCII,
but it would be nice just to be able to move incoming or outgoing UTF-8
directly to/from a field type without iconv conversions.
UTF-8 can be converted to SBCS EBCDIC in two ways: with a "normal" CCSID
1208>37 conversion that will only support the 256 characters in the SBCS
EBCDIC CCSID, or on byte level.
At the moment I'm working on a replacement of powerEXT Core, a CGIDEV2
SBCS hybrid, where a new middleware will have full Unicode, SBCS and DBCS
support.
My problem is that neither SBCS nor DBCS "original" has that support in
DB2 fields - unless I have overlooked something.
On Mon, Jan 13, 2014 at 2:52 PM, <j.beckeringh@xxxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:
Henrik,
What exactly are you looking for? Do you want to use Unicode in RPG or do
you specifically want to use UTF-8 encoding in RPG? Using Unicode is
simple enough through UCS-2 encoding (datatype C; CCSID 1200 or 13488 as
Bruce mentioned; implicit conversion by assignment or explicit conversion
by %ucs2 and %char).
Joep Beckeringh
Henrik Rützou <hr@xxxxxxxxxxxx>
Re: DB2 UTF-8 fields used in RPGLE
Unless I have overlooked something, the RPGLE UTF-8 field support is more
or less useless since in reality it only supports characters in the job's
SBCS EBCDIC CCSID :-(
It would be far better if the DB just passed the data bytes "as is" so
they could be passed to either the job's SBCS EBCDIC field or to a DBCS
field by using a %BIF.
Why on earth didn't IBM just copy the DBCS support to UTF-8 support?
Maybe Barbara Morris can answer that question?
--
This is the RPG programming on the IBM i (AS/400 and iSeries) (RPG400-L)
mailing list
To post a message email: RPG400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/rpg400-l
or email: RPG400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/rpg400-l.
--
Regards,
Henrik Rützou
http://powerEXT.com <http://powerext.com/>
--
Regards,
Bruce
www.brucevining.com
www.powercl.com
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].