From the InfoCentre under SUBSTR:

"1 The SUBSTR function accepts mixed data strings. However, because SUBSTR operates on a strict byte-count basis, the result will not necessarily be a properly formed mixed data string."

What you see is expected behaviour.

LEFT operates on characters. You specify only how many you want, and LEFT works out where each one starts and ends, so it handles multi-byte characters implicitly.

SUBSTR operates on bytes. You specify the starting position and the length in bytes, so if the length stops in the middle of a multi-byte character you will get garbage returned.

Although u-umlaut appears to be a single character, in UTF-8 it is represented as two bytes.
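
To make the character-versus-byte difference concrete, here is a small sketch in Python rather than DB2 SQL. Slicing a string stands in for LEFT and slicing the encoded bytes stands in for SUBSTR; that mapping is an assumption made for illustration, not a description of how DB2 implements either function. The Latin-1 decode at the end only shows how the leftover byte would look if it were displayed as CCSID 819; what a real job displays depends on its CCSID.

```python
# Illustration only: Python slicing stands in for DB2's LEFT and SUBSTR.
name = "Jürgen"
utf8 = name.encode("utf-8")      # the value as it would be stored in UTF-8

# Character-based truncation: the first two characters.
by_chars = name[:2]

# Byte-based truncation: the first two bytes, which cuts the
# two-byte sequence for the u-umlaut in half.
by_bytes = utf8[:2]

print(by_chars)                  # Jü
print(by_bytes.hex().upper())    # 4AC3

# The truncated bytes are no longer a well-formed UTF-8 string.
try:
    by_bytes.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    valid = False
print(valid)                     # False

# Displayed as Latin-1 (CCSID 819), the orphaned lead byte shows up as
# an unrelated accented letter.
as_latin1 = by_bytes.decode("latin-1")
print(as_latin1)                 # JÃ
```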

CCSID 37 ü x'DC'
CCSID 819 ü x'FC'
CCSID 1208 ü x'C3BC'
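
The three hex values above can be checked outside the system with a short Python sketch. The codec names used to stand in for each CCSID (cp037, latin-1, utf-8) are an assumption of this example; Python knows nothing about CCSIDs as such.

```python
# Python codec names standing in for the CCSIDs (an assumption of this sketch).
codecs_by_ccsid = {37: "cp037", 819: "latin-1", 1208: "utf-8"}

# Encode the single character and record its bytes as upper-case hex.
hex_by_ccsid = {
    ccsid: "ü".encode(codec).hex().upper()
    for ccsid, codec in codecs_by_ccsid.items()
}

for ccsid, hex_value in hex_by_ccsid.items():
    print(f"CCSID {ccsid} ü x'{hex_value}'")
```

Only the UTF-8 encoding needs more than one byte, which is why a byte-count truncation is harmless in the single-byte CCSIDs but not in CCSID 1208.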

Clear?

Mind you, I don't believe the Ê shown in their SUBSTR result is correct.

On 21/03/2008, at 12:22 PM, Elvis Budimlic wrote:

Have you tested their example Rob?

I think they have the output wrong for SUBSTR, at least for languages that
do indeed support umlauts. Perhaps when converting the output to English
something bizarre happened to the person who documented this example.
To enter umlaut-type characters your job would need to be in an appropriate
CCSID (e.g. DEU, DE, 273 for Germany, as German supports umlauts).
If you then turn around and display them via that sample query, I'd fully
expect to get the same results from LEFT and SUBSTR.

So, I'd chalk it up to a documentation mistake.

Elvis

Celebrating 11-Years of SQL Performance Excellence on IBM i5/OS and OS/400
www.centerfieldtechnology.com


-----Original Message-----
Subject: LEFT vs SUBSTR

Not that I am big into UCS, but would someone explain this for me:

Assume that NAME is a VARCHAR(128) column, encoded in Unicode UTF-8, that
contains the value 'Jürgen'.
SELECT LEFT(NAME, 2), SUBSTR(NAME, 1, 2)
FROM T1
WHERE NAME = 'Jürgen'
Returns the value 'Jü' for LEFT and 'JÊ' for SUBSTR(NAME, 1, 2).

http://publib.boulder.ibm.com/infocenter/systems/scope/i5os/topic/db2/rbafzscaleft.htm

Rob Berendt

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list



Regards,
Simon Coulter.
--------------------------------------------------------------------
FlyByNight Software OS/400, i5/OS Technical Specialists

http://www.flybynight.com.au/
Phone: +61 2 6657 8251   Mobile: +61 0411 091 400           /"\
Fax:   +61 2 6657 8251                                      \ /
                                                             X
ASCII Ribbon campaign against HTML E-Mail                   / \
--------------------------------------------------------------------
