On Tue, Dec 19, 2017 at 6:04 PM, Jeremy Meyer <JMeyer@xxxxxxxxxxxxx> wrote:
> Ok I see what you are saying. That makes sense now, I can run a .decode('cp500') on the items in question and get what I am after.
Well, I'm a little surprised you went with cp500 instead of cp037
(since you had originally mentioned CCSID 37), but that's a relatively
minor point. If the `decode` is working for you, that is great.
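To see why the choice is usually a minor point, here is a small sketch
(the function name `differing_bytes` is just something made up for
illustration) that lists the byte values on which Python's built-in
`cp037` and `cp500` codecs disagree. Letters and digits should decode
identically either way; the differences are confined to a handful of
punctuation characters such as `!`, `[`, and `]`.

```python
def differing_bytes(enc_a="cp037", enc_b="cp500"):
    """Return the byte values that the two codecs decode differently."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        # errors="replace" is only a precaution against unmapped bytes.
        char_a = raw.decode(enc_a, errors="replace")
        char_b = raw.decode(enc_b, errors="replace")
        if char_a != char_b:
            diffs.append(b)
    return diffs

# Print each disagreeing byte with its cp037 and cp500 interpretation.
for b in differing_bytes():
    raw = bytes([b])
    print(hex(b), repr(raw.decode("cp037")), repr(raw.decode("cp500")))
```

If none of your data contains those punctuation characters, either
codec will appear to work, which may be why cp500 looked fine.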
Why I feel it's fairly easy to futz with encodings in Python is that
if you've just retrieved a bunch of values in a row (a.k.a. record),
you can use Python's language facilities to apply the decoding
repeatedly and selectively, without a huge amount of typing. For
example, if you've got a cursor called `c1`, then you might have
something like
    c1.execute('select * from mylib.myfile')
    for row in c1:
        values = [v.decode('cp037') if isinstance(v, bytes) else v for v in row]
        # Do something with `values` here.
The above builds a list called `values` for each row. That list
contains Python strings (conceptual Unicode characters; you're not
supposed to care about the encoding) wherever bytes were retrieved,
and leaves everything else (numeric fields, for instance) unchanged.
For folks who are new to Python and find the long expression on the
right-hand side of the `values` assignment confusing, there is nothing
wrong with breaking it down into simpler chunks:
    c1.execute('select * from mylib.myfile')
    for row in c1:
        values = []
        for v in row:
            if isinstance(v, bytes):
                values.append(v.decode('cp037'))
            else:
                values.append(v)
        # Do something with `values` here.
The main point is that you don't necessarily have to type `.decode()`
for every field you're decoding, which is handy if you have a dozen
or so fields to decode.
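If you end up doing this in several places, one option is to wrap the
logic in a small helper. This is only a sketch: `decode_row` and
`sample_row` are names invented here, and the sample tuple merely
stands in for whatever a real cursor would hand back.

```python
def decode_row(row, encoding="cp037"):
    """Return a list with each bytes value in `row` decoded; others untouched."""
    return [v.decode(encoding) if isinstance(v, bytes) else v for v in row]

# A made-up row: one EBCDIC character field, one number, one null.
sample_row = (b"\xc8\xc5\xd3\xd3\xd6", 42, None)
print(decode_row(sample_row))  # ['HELLO', 42, None]
```

Inside the cursor loop you would then write something like
`values = decode_row(row)`, passing a different `encoding` if cp037
turns out not to be the right one for your data.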
> Thank you very much!
You're very welcome, though I hope you haven't jumped the gun. Were
you saying that you actually *tried* the `decode` method and it
worked? Or simply that what I was saying made sense to you?
Also, while it's not that painful to work with encodings in Python,
the "database-level" suggestions presented in other responses are
definitely worth considering. Ultimately, if you *can* change the
CCSID at the data source, I think that is the most desirable.
John Y.