On 19-May-2015 19:08 -0500, Singh, Ashwani (Lake Mary) wrote:
On 19-May-2015 16:51 -0500, CRPence wrote:
We have to build one CIF record from these incoming records. This is
also an example only, and the actual elements can go up to 1600.
Is there an opportunity to ask for revisions to the way they export
the data? Could they not perform a more /standard/ pipe-delimited text
export of their data instead of the alternative they offer [whereby
one record is represented as many, each prefixed by control data]?
Conceivably, with some specific alterations, the opportunity could
exist to send the data directly into a database file [rather than a
stream file], and thus make the scenario more conducive to
manipulation via SQL SELECT, possibly easing the coding requirements
of a program to parse; even possibly presenting the data directly to
the database import feature by encapsulating the SELECT in a VIEW.
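For example, a minimal sketch, assuming a revised export could land
one fully-formed pipe-delimited record per row of a one-column staging
file; the names IMPORT_STG, LINEDTA, and CIF_PARSED are invented for
illustration:

  -- expose the first two pipe-delimited elements as columns; a real
  -- VIEW would first guard rows having too few pipes, for which the
  -- LOCATE returns zero, before handing the positions to SUBSTR
  CREATE VIEW CIF_PARSED (ELEM1, ELEM2) AS
    SELECT SUBSTR(LINEDTA, 1, LOCATE('|', LINEDTA) - 1)
         , SUBSTR(LINEDTA, LOCATE('|', LINEDTA) + 1
                , LOCATE('|', LINEDTA, LOCATE('|', LINEDTA) + 1)
                    - LOCATE('|', LINEDTA) - 1)
      FROM IMPORT_STG

With elements numbering up to 1600 those expressions get tedious, but
they are mechanical, and the CREATE VIEW source could itself be
generated.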
Options that I was thinking of were: Directly read data from the
IFS and reformat in RPGLE.
English does such a poor job of explaining thoughts about coding
something. Such thoughts are best expressed in pseudo-code, or at least
much more explicitly, step by step, because the above text leaves out
too much detail. At first I thought the meaning was, in RPGLE, to
effect:
create table target_PF ...
do forever
  read a STMF sub-record
  if primary record or eof then do
    place prior row-data into insert-row staging-area
    if staging-area is full, insert-multiple into target_PF
    clear the current row
  enddo
  if eof then leave
  parse new sub-record
  append new parsed data onto current row
  if missing pipes, append new NULL or DEFAULT to current row
enddo
if staging-area not empty, insert-multiple into target_PF
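Where insert-multiple appears above, the intent is one blocked
multi-row INSERT per filled staging-area rather than one INSERT per
row; a hypothetical illustration, with invented column names and data
shaped like the sample:

  -- one statement inserts the whole staged block of rows; in
  -- embedded SQL the rows would come from a host-variable array
  INSERT INTO TARGET_PF (ENTITY_ID, ELEM_01, ELEM_02)
    VALUES (1000001, '1XX-50-0XXX', 'XXXXXXX')
         , (1000002, '2XX-51-1XXX', 'YYYYYYY')
         , (1000003, '3XX-52-2XXX', 'ZZZZZZZ')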
But reading the later options, both mention the DDS file; I would
have thought Option1 was writing its output to the DDS-created file,
so why is that not mentioned here? Or is Option1 just reformatting the
stream data, both to add the missing pipes and to remake the multiple
record data for each entity as one contiguous pipe-delimited record,
with the intention then to use Copy From Import File (CPYFRMIMPF) to
do all the parsing and populating and parallelism?
Option2: Copy from IFS --> Read all 39 million records that I have
to read to come to the 720K records that we are expecting, but that
seems too inefficient with the current approach --> Generate the
required DDS file data.
I am sure it was 69mil before! :-(
By /Copy/ does that mean effectively [or actually] Copy From Stream
File (CPYFRMSTMF) to make a 39m-row database flat file that is then
regenerated into ~720K rows of described data... by unstated means?
Perhaps not, but hopefully my point can be inferred, that from the above
description, the reader has little idea how or even if the progress was
made from origin as STMF into the target DDS-created PF.
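If the unstated means were SQL, one conceivable sketch, assuming the
39m rows were first parsed into an entity key, a component number, and
the element data [STG, ENTITY_ID, COMP_NBR, and ELEMDTA are invented
names]:

  -- collapse the many sub-records per entity into one result row;
  -- the primary [component-1] drives, each other component joins at
  -- most once, and a missing component yields a NULL column
  SELECT p.ENTITY_ID, p.ELEMDTA, c2.ELEMDTA, c3.ELEMDTA
    FROM STG p
    LEFT JOIN STG c2
      ON c2.ENTITY_ID = p.ENTITY_ID AND c2.COMP_NBR = 2
    LEFT JOIN STG c3
      ON c3.ENTITY_ID = p.ENTITY_ID AND c3.COMP_NBR = 3
   WHERE p.COMP_NBR = 1

Extended to cover every component, that is the kind of query that
could regenerate the ~720K described rows; but again, that is only a
guess at the means.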
And BTW, why DDS? Why not SQL TABLE?
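The DDL is hardly more effort than the DDS; a sketch, with an invented
library, file, and column names standing in for the real ones:

  -- an SQL TABLE in place of the DDS-created PF; the real file
  -- would declare a column for each expected element, and the
  -- nullable columns let any missing elements become NULL
  CREATE TABLE MYLIB.TARGET_PF
    ( ENTITY_ID  DECIMAL(9, 0) NOT NULL
    , ELEM_01    VARCHAR(11)
    , ELEM_02    VARCHAR(64)
    )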
Option3: Copy from IFS ---> Output directly by reading the file into
the DDS file - COPY from IFS takes a long time.
Again, two references to the word _copy_, which often alludes to the
use of the various /Copy/ utilities. A program usually /reads/ the
record data and /writes/ [or inserts] the row-data. I am thinking
maybe this option is what I thought Option1 was supposed to be?
Again, English is quite vague for describing algorithmic processing;
English formed as pseudo-code can be much clearer.
But I guess all is moot [including my later inline comments], because
we now know for sure that the file record data originates as a STMF;
the fastest approach surely is to open and read the stream records,
generate the row-data, and write\insert the row-data using either
blocked writes or multi-row inserts into the described database file
[be that defined by DDL or DDS]. One note: the BLOCK(*YES) that was
mentioned is desirable, but even more desirable is ensuring the block
size is maximized; see the Override with Database File (OVRDBF)
command for the Sequential Only (SEQONLY) parameter and the number of
records [there was some enhancement in one of the newer releases to
avoid the user-calculation for the number of records].
* Are the components consistently numbered and always one through
sixteen?
No, they can vary.
That makes them less conducive to use by the database. As does the
varying length of the non-data prefix.
[...]
^CIF(1000001,2)="1XX-50-0XXX"
^CIF(1000001,3)="XXXXXXX|||XXXX"
[...]
* So component-2 has one element having the value of '1XX-50-0XXX'
as an apostrophe-delimited string, and similarly the component-3
has four elements with elem-1 having the value 'XXXXXXX', elem-2
having the value '', elem-3 having the value '', and elem-4 having
the value 'XXXX'?
No, that's a single record for us.
I realized those two /sub-records/, the records I referred to as
defining component-2 and component-3 of the apparent entity called
1000001, were both sub-components of what should be just one record
[but for whatever reason were instead exported as separate records; as
the so-called sub-records]. I was just interrogating\verifying my
understanding from what was visibly presented in the sample data.
Thus if component-2 had just that one visible element, but that
component is expected to have two elements [thus missing one element],
and if component-3 had those four elements and expected to have only
those four elements, then the record would look like
...|1XX-50-0XXX||XXXXXXX|||XXXX|... where the ellipses represent the
preceding or following data in the fully-formed pipe-delimited record.
* Each component-# has a known\expected number of elements; in the
OP, one [the primary] was expected to have five elements but the
original example showed three-of-five whereas the posted\attached
data file shows only one element for component-1. So is the
problem general, in that any component-# might have included fewer
than the expected number of elements?
Correct, that was just an example.
And thus in my interpretation above, whereby the second record with
the ^CIF(1000001,2) data provides one element while the proposed
scenario expects two, adding the extra pipes and appending to the
parsed results of ^CIF(1000001,1) is what should happen in the
progression toward a reformatted\fully-formed record?
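In code terms, each append might add a separator pipe, the sub-record
data, and then one pipe per missing element; a sketch against the
sample values, where the expected element counts [two and four] are
assumptions carried over from the earlier discussion:

  -- rebuild the comp-2 plus comp-3 fragment of the record, padding
  -- comp-2 [one element found, two expected] with one extra pipe
  SELECT '...'
         CONCAT '|' CONCAT '1XX-50-0XXX'    CONCAT REPEAT('|', 2 - 1)
         CONCAT '|' CONCAT 'XXXXXXX|||XXXX' CONCAT REPEAT('|', 4 - 4)
    FROM SYSIBM.SYSDUMMY1
  -- yields: ...|1XX-50-0XXX||XXXXXXX|||XXXX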