John,

>Let's say Barb/Hans/George allowed us to have an F-Spec for a stream file.
>And let's say that they even gave us an I/O opcode that would parse a comma
>delimited file automatically into an externally defined D/S.

To be useful, you'd want keywords to set the record separator and especially the field separator or field pattern, and they would need to accept regular expressions. You'd use either the field separator or the field pattern, but not both. A field separator may be a simple value like a tab or pipe, but commas generally need a field pattern rather than a separator. In other words, you define the pattern of characters which comprise a single field, rather than what separates the fields in the record.

For example, in AWK programs I often disable the field separator and enable a field pattern of /"([^"]|"")*"|[^,"\t|]*/ to parse my input text files. That pattern will properly parse fields separated by tabs, | (pipes), or unquoted commas, as well as quoted strings containing either commas or doubled-up quotes, as in:

   Acme,"Photo frame, 4"" x 6"", Gold",1.23

(this example has 3 fields in it). Likewise, I set the record separator to accept CR alone, LF alone, or CRLF, so I can handle text with about any form of line ending. But all these things would need to be configurable, because different stream files will have different requirements. The simplicity with which I can do that kind of parsing in AWK is what makes it my language of choice for text file manipulation. In fact, in many ways AWK's optional implicit input loop reminds me of RPG's primary file processing.

>This is the thing to keep in mind. I would code an Ext. D/S to match the
>layout of a comma delimited file that I "knew" very well. There would be
>NO mismatches.

That would work in some instances but certainly not be flexible enough. It is extremely common for text files to have "multiple formats" in them.
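(As an aside, for anyone who wants to play with that field pattern outside of AWK, here is a rough Python rendering of it. Python's re engine is close enough to gawk's for this purpose, though the empty-match filtering below is my own workaround for re.findall(), not something gawk's FPAT mechanism needs.)

```python
import re

# The AWK field pattern from above, in Python syntax (the capturing
# group becomes non-capturing so findall() returns whole-field matches).
FPAT = r'"(?:[^"]|"")*"|[^,"\t|]*'

line = 'Acme,"Photo frame, 4"" x 6"", Gold",1.23'

# The pattern matches fields rather than separators, so empty matches
# show up at the separator positions; drop them.  (Caveat: this simple
# filter also drops genuinely empty fields -- gawk handles that case
# more carefully.)
fields = [f for f in re.findall(FPAT, line) if f]
print(fields)   # ['Acme', '"Photo frame, 4"" x 6"", Gold"', '1.23']
```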
Say an ACH-type file with batch headers followed by one or more transactions and a summary record, the whole lot of which can repeat numerous times, possibly enclosed within other control headers or footers, etc. In other words, while it may be possible to establish a line (i.e., record) terminator for a given file, and while the same field pattern may work for an entire file, a single external DS only works for very simple files such as CPYFRMIMPF already handles.

Perhaps one could name the DS to be used on any given I/O operation -- like a program-described READ to a DS -- but that presumes you know what format will be read next. An alternative would be to read a "record" into a single varying-length character field. Based on the contents, you could then use a select group to pass the line of text plus one of various DS names to a routine which could parse the fields and populate the DS subfields. (Like the UNSTRING operation Peter mentions.) But at that point, the UNSTRING-like opcode really has nothing to do with the IFS; it just becomes another opcode which works with character fields and data structures of any origination.

Thus my contention is that the parsing shouldn't be done at the F-spec keyword level. Text file input should be done into a single varying-length field, and then parsed based on the contents of the record. Or we could use record identifying indicators in the input specs, and code different input layouts based on the contents of some column! Should make all the old-timers feel right at home! :)

And once all the read operation does is find the next line terminator, we are back to doing very little more than putting a wrapper on the POSIX APIs. You could just use the routines in "Who Knew You Could Do That with RPG IV?".

>WE are getting these files from NON-AS/400 applications !!
>And we are getting more and more of them each day.

Agreed.

>WE NEED THIS

Do we? This to me falls in the camp of just needing open/close statements and a ReadLine statement.
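To make that read-then-select idea concrete, here is a rough sketch -- in Python rather than RPG, and with invented 'BH'/'TX'/'BS' record codes and layouts that are purely illustrative, not a real ACH format:

```python
import re

# A toy multi-format file in the spirit of the ACH example.  The record
# codes ('BH'/'TX'/'BS') and layouts are invented for illustration.
text = "BH,0001,ACME CORP\r\nTX,0001,125.00\nTX,0001,17.50\rBS,0001,142.50"

parsed = []
# Accept CR alone, LF alone, or CRLF as the record separator.
for record in re.split(r'\r\n|\r|\n', text):
    fields = record.split(',')
    # The "select group": route each record on its contents, handing the
    # fields to a per-format handler (here, just labelling them).
    if fields[0] == 'BH':
        parsed.append(('batch-header', fields[1:]))
    elif fields[0] == 'TX':
        parsed.append(('transaction', fields[1:]))
    elif fields[0] == 'BS':
        parsed.append(('batch-summary', fields[1:]))

print([kind for kind, _ in parsed])
# ['batch-header', 'transaction', 'transaction', 'batch-summary']
```

Note that nothing here required the I/O layer to know anything about formats: it only had to find the next line terminator -- i.e., an open, a close, and a ReadLine.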
And those are simple enough to do with the existing APIs. Beyond that, we could use something like COBOL's UNSTRING or AWK's split(), but I see that as being more of a generic string opcode or BIF and not part of IFS support per se. And it could be done with a service program now, too, so I remain unconvinced the compiler team needs to spend their time on it.

More than IFS open/close/read support, I'd like to see BIFs for sophisticated regex operations -- say a %LIKE or %MATCH, or a %SCAN enhanced to accept regular expressions. A %SPLIT or whatever, which parses a varying-length string into an array of varying-length strings (one element per "field") or directly into DS subfields, would seem more useful to me than IFS operations.

Doug
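No %SPLIT BIF exists, of course, so the following is only a guess at its semantics -- sketched in Python for concreteness, reusing the field-pattern idea from earlier in this note:

```python
import re

def split_fields(s, fpat=r'"(?:[^"]|"")*"|[^,"\t|]*'):
    """Rough sketch of what a %SPLIT BIF might do: parse one varying-
    length string into a list of field strings, driven by a field
    pattern (regex) rather than a separator, AWK-FPAT style.  Empty
    matches at separator positions are dropped, which also discards
    genuinely empty fields -- a real BIF would need to do better."""
    return [f for f in re.findall(fpat, s) if f]

print(split_fields('red|green|blue'))   # ['red', 'green', 'blue']
print(split_fields('a,"b,c",d'))        # ['a', '"b,c"', 'd']
```

Populating DS subfields directly would then just be a matter of assigning the resulting elements, which is exactly the sort of thing a compiler-supplied BIF could do for you.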