RE: CSV file problems -- MIDRANGE-L

Thanks Scott (and everyone else who responded)

Excellent, clear explanation that I will file for future reference (as I
know I will forgot next time it comes up).

Sheri

-----Original Message-----
From: Scott Klement [mailto:midrange-l@xxxxxxxxxxxxxxxx]
Sent: Thursday, May 20, 2010 3:42 PM
To: Midrange Systems Technical Discussion
Subject: Re: CSV file problems

Hi Sheri,

On 5/20/2010 2:07 PM, Rowe, Sheri wrote:

I tried creating the CSV with CCSID of 37 and 1208 and 1252 but it

still

gives me incorrect data for 2 items.
The 2 item numbers are 5E021 and 5E011.

This isn't related to CCSIDs, and it's not an IBM i-specific problem.

(Try opening up a blank Excel worksheet, and type 5E021 into a cell,
then set the formatting to "Number" and see what it shows you.)

Excel stores all of it's numbers in floating point format. In that
format, numbers are stored by exponent and mantissa. So when Excel
reads data from a CSV file, it allows numbers to be input in that format

as well.

5E+21 would mean the number 5 multiplied by 10**21 (10 to the 21st
power). Or in conventional notation: 5,000,000,000,000,000,000,000
(that's 21 zeros)

Of course, the + isn't required. The + symbol means positive, and like
most numbers, if you don't include the +, positive is assumed. So
5E021 means the same thing... 5 followed by 21 zeros.

Most programming languages will understand 5E021 as a number in this
fashion. (Though, I don't think RPG will, since it's one of the few
programming languages where floating point numbers aren't the norm.)

Anyway, what you've really discovered is one of the major failings of
the CSV format in general. It has no way to tell software what the data

type of a given field should be. Excel sees that it's a valid number,
so it treats it as a number. If it weren't a valid number, it'd treat
as a string. There's no communication from your program to Excel to
tell it whether something SHOULD be a number or not.

I don't exactly know what you mean by "import the file directly". If
you mean you're using the Excel Data Plugin provided with IBM i Access,
then it doesn't use CSV, and it knows how to tell Excel what the data
types are. Likewise, if you use ODBC, XML or create the file in Excel's

native XLS/XLSX (using a tool like HSSF/POI or JExcelAPI, for example),
you have an option to tell Excel what the data is.

You also have lots more control over formattting... how wide the columns

should be, what fonts to use, how to represent negative numbers, colors,

etc. None of that is possible with CSV.

However, I think there _is_ a kludge for your particular situation with
CSV files. I think if you insert a ' character in front of the number,
such as "'5E021", it'll force Excel to treat it as a character field
instead of numeric. That's a kludge, and won't work with any other
software that supports CSV (so in my mind violates the CSV format...
though technically there's no standard for CSV).

However, I may have my facts wrong, as I don't import stuff into Excel
using CSV format very often. (And now you know why!)