Thanks Mark,
I will keep your suggestion in mind.
At this time, I have instructed the user that embedded commas are not allowed in
the description.
Since they have control over that information, it is now up to them to
make sure that the data complies with the correct format.
Thanks,
Jeff Young
Sr. Programmer Analyst
IBM -e(logo) server Certified Systems Exper - iSeries Technical Solutions V5R2
IBM Certified Specialist- e(logo) server i5Series Technical Solutions
Designer V5R3
IBM Certified Specialist- e(logo)server i5Series Technical
Solutions Implementer V5R3
________________________________
From: Mark S. Waterbury <mark.s.waterbury@xxxxxxx>
To: RPG programming on the IBM i / System i <rpg400-l@xxxxxxxxxxxx>
Sent: Fri, July 16, 2010 2:26:20 PM
Subject: RE: Parsing a CSV file with embeded commas in text field
Hi, Jeff:
The standard for CSV files is to place " quotes " around any data that
contains special characters such as a comma "," ... as others have
already suggested.
Since you cannot modify the PC product that produces the (non-standard)
comma-delimited file, perhaps you could do something like this?
Write a special CSV Validation Program that "pre-processes": the
comma-delimited file, after it is produced by the PC application, but
before you attempt to import the data from that file on OS/400.
For example, you know the correct number of commas that should normally
be present in each row (or record), based on the number of fields (or
columns) expected in that data, so you could just count the total number
of commas that appear in each row (record), and if there are more than
the expected number, this would indicate that there may be extra commas
in some of the data fields. Read each row, count the number of commas,
and if greater than the expected number, the validation program would
REJECT those rows that do not "conform", so that you will end up with a
"sanitized" version of the CSV file that is "clean" and acceptable. And,
at the same time, this validation program would also produce a listing
(or text file) of any rejected rows, so you can send that information
back to the users of that PC application, asking them to go back into
the PC application and correct the data. (You could attach the listing
of rejected records to an e-mail to the users of the PC application.)
If you know for certain that commas can only appear as "data" in certain
columns (fields), you could try to apply special "fix up" logic to
recognize commas in that column, and insert quotes (" ") around the data
for that column, and then re-scan the entire line to see if the
"correct" number of commas now appears (outside of any quotes). But
this may be considered more "risky" than the approach outlined in the
paragraph above.
I hope this suggestion might help.
All the best,
Mark S. Waterbury
On 7/16/2010 10:00 AM, Jeff Young wrote:
I am using the GETTOK routine to parse records from a csv file where some of
the
character fields have embedded commas.
I seem to recall that this was addressed in the past, but I do not recall how
to
do it.
ex: ...,Apples, Red Delicious 88ct,nnnn,nnnn ....
All help will be appreciated.
Thanks,
Jeff Young
As an Amazon Associate we earn from qualifying purchases.