John,

>Let's say Barb/Hans/George allowed us to have an F-Spec for a stream file.
>And let's say that they even gave us an I/O opcode that would parse a comma
>delimited file automatically into an externally defined D/S.

To be useful, you'd want keywords to set the record separator and especially the field separator or field pattern, and they would need to accept regular expressions. You'd use either the field separator or the field pattern, but not both. A field separator may be a simple value like a tab or pipe, but commas generally need a field pattern rather than a separator. In other words, you define the pattern of characters which comprise a single field, rather than what separates the fields in the record.

For example, in AWK programs I often disable the field separator and enable a field pattern of /"([^"]|"")*"|[^,"\t|]*/ to parse my input text files. That pattern will properly parse fields separated by tabs, | (pipes), or unquoted commas, as well as quoted strings containing either commas or doubled-up quotes, as in:

   Acme,"Photo frame, 4"" x 6"", Gold",1.23

(this example has 3 fields in it). Likewise, I set the record separator to accept CR alone, LF alone, or CRLF, so I can handle text with about any form of line ending. But all these things would need to be configurable, because different stream files will have different requirements. The simplicity with which I can do that kind of parsing in AWK is what makes it my language of choice for text file manipulation. In fact, in many ways AWK's optional implicit input loop reminds me of RPG's primary file processing.

>This is the thing to keep in mind. I would code an Ext. D/S to match the
>layout of a comma delimited file that I "knew" very well. There would be
>NO mismatches.

That would work in some instances but certainly not be flexible enough. It is extremely common for text files to have "multiple formats" in them.
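(As an aside, for anyone who wants to play with that field pattern outside of AWK, here is a rough Python rendering of it. Python's re engine is close enough to gawk's for this purpose, though the empty-match filtering below is my own workaround for re.findall(), not something gawk's FPAT mechanism needs.)

```python
import re

# The AWK field pattern from above, in Python syntax (the capturing
# group becomes non-capturing so findall() returns whole-field matches).
FPAT = r'"(?:[^"]|"")*"|[^,"\t|]*'

line = 'Acme,"Photo frame, 4"" x 6"", Gold",1.23'

# The pattern matches fields rather than separators, so empty matches
# show up at the separator positions; drop them.  (Caveat: this simple
# filter also drops genuinely empty fields -- gawk handles that case
# more carefully.)
fields = [f for f in re.findall(FPAT, line) if f]
print(fields)   # ['Acme', '"Photo frame, 4"" x 6"", Gold"', '1.23']
```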
Say an ACH-type file with batch headers followed by one or more transactions and a summary record, the whole lot of which can repeat numerous times, possibly enclosed within other control headers or footers, etc. In other words, while it may be possible to establish a line (i.e., record) terminator for a given file, and while the same field pattern may work for an entire file, a single external DS only works for very simple files such as CPYFRMIMPF already handles.

Perhaps one could name the DS to be used on any given I/O operation -- like a program-described READ to a DS -- but that presumes you know what format will be read next. An alternative would be to read a "record" into a single varying-length character field. Based on the contents, you could then use a select group to pass the line of text plus one of various DS names to a routine which could parse the fields and populate the DS subfields. (Like the UNSTRING operation Peter mentions.) But at that point, the UNSTRING-like opcode really has nothing to do with the IFS; it just becomes another opcode which works with character fields and data structures of any origination.

Thus my contention is that the parsing shouldn't be done at the F-spec keyword level. Text file input should be done into a single varying-length field, and then parsed based on the contents of the record. Or we could use record identifying indicators in the input specs, and code different input layouts based on the contents of some column! Should make all the old-timers feel right at home! :)

And once all the read operation does is find the next line terminator, we are back to doing very little more than putting a wrapper on the POSIX APIs. You could just use the routines in "Who Knew You Could Do That with RPG IV?".

>WE are getting these files from NON-AS/400 applications !!
>And we are getting more and more of them each day.

Agreed.

>WE NEED THIS

Do we? This to me falls in the camp of just needing open/close statements and a ReadLine statement.
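To make that read-then-select idea concrete, here is a rough sketch -- in Python rather than RPG, and with invented 'BH'/'TX'/'BS' record codes and layouts that are purely illustrative, not a real ACH format:

```python
import re

# A toy multi-format file in the spirit of the ACH example.  The record
# codes ('BH'/'TX'/'BS') and layouts are invented for illustration.
text = "BH,0001,ACME CORP\r\nTX,0001,125.00\nTX,0001,17.50\rBS,0001,142.50"

parsed = []
# Accept CR alone, LF alone, or CRLF as the record separator.
for record in re.split(r'\r\n|\r|\n', text):
    fields = record.split(',')
    # The "select group": route each record on its contents, handing the
    # fields to a per-format handler (here, just labelling them).
    if fields[0] == 'BH':
        parsed.append(('batch-header', fields[1:]))
    elif fields[0] == 'TX':
        parsed.append(('transaction', fields[1:]))
    elif fields[0] == 'BS':
        parsed.append(('batch-summary', fields[1:]))

print([kind for kind, _ in parsed])
# ['batch-header', 'transaction', 'transaction', 'batch-summary']
```

Note that nothing here required the I/O layer to know anything about formats: it only had to find the next line terminator -- i.e., an open, a close, and a ReadLine.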
And those are simple enough to do with the existing APIs. Beyond that, we could use something like COBOL's UNSTRING or AWK's split(), but I see that as being more of a generic string opcode or BIF and not part of IFS support per se. And it could be done with a service program now, too, so I remain unconvinced the compiler team needs to spend their time on it.

More than IFS open/close/read support, I'd like to see BIFs for sophisticated regex operations -- say a %LIKE or %MATCH, or a %SCAN enhanced to accept regular expressions. A %SPLIT or whatever, which parses a varying-length string into an array of varying-length strings (one element per "field") or directly into DS subfields, would seem more useful to me than IFS operations.

Doug
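No %SPLIT BIF exists, of course, so the following is only a guess at its semantics -- sketched in Python for concreteness, reusing the field-pattern idea from earlier in this note:

```python
import re

def split_fields(s, fpat=r'"(?:[^"]|"")*"|[^,"\t|]*'):
    """Rough sketch of what a %SPLIT BIF might do: parse one varying-
    length string into a list of field strings, driven by a field
    pattern (regex) rather than a separator, AWK-FPAT style.  Empty
    matches at separator positions are dropped, which also discards
    genuinely empty fields -- a real BIF would need to do better."""
    return [f for f in re.findall(fpat, s) if f]

print(split_fields('red|green|blue'))   # ['red', 'green', 'blue']
print(split_fields('a,"b,c",d'))        # ['a', '"b,c"', 'd']
```

Populating DS subfields directly would then just be a matter of assigning the resulting elements, which is exactly the sort of thing a compiler-supplied BIF could do for you.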