Henrik,

I believe those CCSIDs are configurable. Also, in some circumstances IBM i will try to pick the CCSID based on other factors (such as using an ASCII CCSID that supports the same set of characters as the EBCDIC CCSID you're using).

So I wouldn't say they'd always be 819 or 1252, but I think in most western countries (those using Latin-1 character set) it'll probably be 819 and 1252 unless someone has reconfigured them.

Chuck Pence will probably reply to this and tell you precisely how it works :-)

-SK


On 5/8/2015 2:06 PM, Henrik Rützou wrote:
Scott

in general we can say that files from FTP in the IFS will become CCSID 819
while files dragged and dropped from Windows will become CCSID 1252 in the
IFS - or am I wrong?


On Fri, May 8, 2015 at 8:59 PM, Scott Klement <midrange-l@xxxxxxxxxxxxxxxx>
wrote:

Jim,

In file transfer situations, I would never trust the CCSID file attribute
(unless you've already made sure that it's right, of course).

Unless you're transferring a save file from another IBM i
system/partition, the CCSID is not part of what gets transferred.  All
that's transferred is the data itself.  The system will usually just assign
a 'default' CCSID -- it has no way of knowing if it's the right one for
your data.  It expects you to change it accordingly if your data is
different.

If you are finding that a single character (such as a "smart quote" or
international symbol) is showing up as two bytes of data, resulting in
extra 'garbage' when translated to EBCDIC, this almost always means that
the data is UTF-8, but you're telling the system that it's ASCII (such as
819) and therefore it will translate the basic alphabet and numbers
correctly, but more 'special' characters will be mistranslated.

Really, considering that it's 2015, we should all be using Unicode (UTF-8
or UTF-16) for as much as possible.  ASCII and EBCDIC are really
cumbersome.  I know it's hard when you have so many applications that
are already in EBCDIC, but an all-Unicode environment is really what you
should be striving for in the long run, if you can't do it today.

Anyway -- how to "purify" the data -- there are certain commonplace
issues, such as replacing "smart quotes" with straight quotes that make
sense to do. I would definitely do this in Unicode (or ASCII if that's what
it is) before translating to EBCDIC.
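
A minimal Python sketch of that ordering (again, not part of the original message): replace the smart quotes while the text is still Unicode, then translate to EBCDIC. It assumes Python's "cp037" codec as a stand-in for CCSID 37; the function names and the mapping table are hypothetical and would need extending for whatever else your data contains.

```python
# Map the four common "smart quotes" to their straight equivalents.
SMART_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace smart quotes with straight quotes while still in Unicode."""
    return text.translate(SMART_QUOTES)


def to_ebcdic_37(text: str) -> bytes:
    """Straighten quotes first, then encode to code page 37.

    Raises UnicodeEncodeError if a character has no code page 37 equivalent.
    """
    return straighten_quotes(text).encode("cp037")


print(to_ebcdic_37("\u201cHi\u201d").hex())
```

Doing the replacement before the conversion matters because the smart quotes have no equivalent in code page 37, so afterwards there is nothing reliable left to search for.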

But aside from these common things, it's generally ugly and nasty to remove
"unwanted" characters.  There's no good way to do this, since there's
really no way the computer knows which characters are "allowed" and which
are not.  How does it know whether a half-moon character, for example, is
intentional or whether it's an error?  Same is true of accented characters
-- often times people (at least in the USA) will see these and say they are
"garbage" -- but, they are normal parts of human languages in most of the
world.  How can the computer know that they are "garbage"?  Obviously, it's
easy for us as human beings to look at the data and realize that a
particular character doesn't belong there -- but I'm sure you understand
that a computer can't see things that way.

So I guess if you want to "purify" your data, the BEST way to do that is
to find out where these unwanted characters are coming from, and have it
stop sending them.  If you really, truly, can't do that then the "hack"
would be to make a list of everything you DO want, and remove everything
else.  What is/isn't a wanted character will almost certainly vary from
application to application, so there isn't really any built-in way to do
this.  Just make a string of all the characters you want, and use RPG
operations like %CHECK to find the ones not in that character set and
remove them.  But, this really is a hack...
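
To make the "list of everything you DO want" idea concrete, here is a rough Python sketch rather than RPG (not part of the original message). The ALLOWED set and both function names are hypothetical; the copyright and cents signs are included only because they were mentioned earlier in the thread as characters worth keeping. first_disallowed plays roughly the role of %CHECK, except that it returns a 0-based index and -1 when nothing is found.

```python
import string

# Hypothetical whitelist: keyboard characters plus the copyright and cents signs.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " ")
ALLOWED |= {"\u00a9", "\u00a2"}


def first_disallowed(text: str, allowed=ALLOWED) -> int:
    """Return the 0-based index of the first character not in allowed, or -1."""
    for index, char in enumerate(text):
        if char not in allowed:
            return index
    return -1


def keep_allowed(text: str, allowed=ALLOWED, replacement: str = " ") -> str:
    """Replace every character that is not in allowed with the replacement."""
    return "".join(char if char in allowed else replacement for char in text)


sample = "Acme\u00a9 note\u00f0 5\u00a2"
print(first_disallowed(sample))
print(keep_allowed(sample))
```

Replacing with a blank instead of deleting keeps field lengths stable, but that is a design choice; passing replacement="" removes the characters outright.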



On 5/8/2015 1:33 PM, Jim Franz wrote:

without asking every entity, can one tell looking at the file attributes?

Jim

On Fri, May 8, 2015 at 2:28 PM, Henrik Rützou <hr@xxxxxxxxxxxx> wrote:

  Jim

even if the files you receive are in CCSID 819/1252, are you sure that they
aren't UTF-8 files?


On Fri, May 8, 2015 at 8:25 PM, Jim Franz <franz9000@xxxxxxxxx> wrote:

EBCDIC CCSID = 37
Most file imports are via ftp - ccsid 1252, occasionally burned dvd for
new customer startup of history.
Some trading partners are mainframe, some unix/Linux, some Win, all US
based entities, but we think some servers are overseas (we see time
differences).

When we write ascii text, usually 819

what hurts us most is screen input (web interface to SQL Server then to
Power i) where user cuts & pastes paragraphs of text from their source
systems (thousands of different customers).
Jim




On Fri, May 8, 2015 at 2:07 PM, Henrik Rützou <hr@xxxxxxxxxxxx> wrote:

  Jim

what is the EBCDIC CCSID on your machine and how do you receive files?

On Fri, May 8, 2015 at 8:00 PM, Jim Franz <franz9000@xxxxxxxxx> wrote:

We do a lot of import and export of data, plus have both PC client (local
and web) input as well as PC5250.
Had a recent thread involving cut and paste data (ebcdic x'3F') that caused
an issue.
We use CCSID 37 and ascii 819.

There are more EBCDIC characters than what we see on the US Keyboard. Some
we need, such as copyright symbol, cents sign, etc, but many

We are wanting to take steps to clean the data on input, whether from ascii
or ebcdic side. We have some input already cleansed, but only at screen
program level.

Couple questions:
1. Just replacing all below ebcdic x'40' leaves a lot of strange
characters like x'8C' (sort of a moon with a hat..). One thought is to
identify all the characters we need and replace the rest. No need to keep
line and page formatting stuff.
Is this a good idea?

2. Thinking that since there are a multitude of entry/update points, db
triggers are best? Am wondering about apps that write the data, and now
after write, the screen column data is different than column data in file
(trigger pgm cleaned the data) - hoping to avoid opening up all the apps.

3. How far do people with heavy edi take this? Am I leaving something out
with the keyboard characters plus a few more? These are names,
addresses, notes (which are sometimes pages of notes).

Jim Franz
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.




--
Regards,
Henrik Rützou

   http://powerEXT.com <http://powerext.com/>
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.