Why copy the information into a source file member first and scan that member
afterwards, instead of reading the necessary information directly from the IFS
file into a single variable or multiple variables?
A file reference variable can be used with SQL functions like any character
field, so scanning with SQL functions (for example LOCATE) and reading
information into variables (also with SQL functions, for example SUBSTRING)
is no problem.
Assume a *.csv file has to be processed. Each row ends with a CRLF, the
column values are separated by a semicolon, and character and date fields
are enclosed in double quotes. To keep it simple, each row consists of 5
columns (CustomerNo, ItemNo, ItemDescription, SalesDate and Amount).
This *.csv file could be read as follows:
D CsvFile S SQLType(CLOB_File)
D Data S 256A Varying
D DataLen S 10I 0
D StrPosData S 10I 0
D NextCRLF S 10I 0
D Index S 3U 0
D ColData S 256A Varying
D ColDataLen S 10I 0
D StrPosSep S 10I 0 inz(1)
D NextSep S 10I 0
D DSSales DS Qualified Inz
D CustNo 15A
D ItemNo 15A
D SalesDate D DatFmt(*ISO)
D Amount 11P 2
/Free
Monitor;
StrPosData = 1;
Clear CsvFile;
CsvFile_Name = '/home/Hauser/Sales1.csv';
CsvFile_NL = %Len(%Trim(CsvFile_Name));
CsvFile_FO = SQFRD; //Read only
DoU 1=0;
Clear NextCRLF;
Clear Data;
// 2.1. Locate the next CRLF (x'0D25') starting at the current position
Exec SQL Set :NextCRLF = Locate(x'0D25', :CsvFile, :StrPosData);
// 2.2. Retrieve Data between the current position and the next CRLF or the end of the data
If NextCRLF > *Zeros;
DataLen = NextCRLF - StrPosData;
Exec SQL Set :Data = Substr(:CsvFile, :StrPosData, :DataLen);
Else;
Exec SQL Set :Data = Substr(:CsvFile, :StrPosData);
EndIf;
StrPosData += DataLen + 2;
// 2.3. Split retrieved Data into columns (depending on the ';' separators)
StrPosSep = 1;
For Index = 1 to 5;
NextSep = %Scan(';': Data: StrPosSep);
If NextSep > *Zeros;
ColDataLen = NextSep - StrPosSep;
ColData = %Trim(%Subst(Data: StrPosSep: ColDataLen): '" ');
Else;
// Last column in the row: no further separator found
ColDataLen = %Len(Data) - StrPosSep + 1;
ColData = %Trim(%Subst(Data: StrPosSep): '" ');
EndIf;
StrPosSep += ColDataLen + 1;
Select;
When Index = 1;
DSSales.CustNo = ColData;
When Index = 2;
DSSales.ItemNo = ColData;
When Index = 3;
Iter; // ItemDescription is not stored in DSSales
When Index = 4;
Monitor;
DSSales.SalesDate = %Date(ColData: *ISO);
On-Error;
Clear DSSales.SalesDate;
EndMon;
When Index = 5;
Monitor;
DSSales.Amount = %Dec(ColData: 11: 2);
On-Error;
Clear DSSales.Amount;
EndMon;
EndSL;
EndFor;
//********* Do whatever you want with the data read into the data structure
If NextCRLF = *Zeros;
Leave;
EndIf;
EndDo;
On-Error;
//***** Handle Error
EndMon;
/End-Free
Wrapping the source code like this in a UDTF will allow accessing the *.csv
file with a simple SELECT statement. (Instead of using file reference
variables, the Unix-type APIs could be used for accessing the IFS files; a
rough sketch follows below.)
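A rough sketch of that alternative, using the standard IFS open()/read()/close() APIs; the buffer size, the open flags and the file name are only assumptions, and error handling is omitted:
D open            PR            10I 0 ExtProc('open')
D  path                           *   Value Options(*String)
D  oflag                        10I 0 Value
D  mode                         10U 0 Value Options(*NoPass)
D  codepage                     10U 0 Value Options(*NoPass)
D read            PR            10I 0 ExtProc('read')
D  fildes                       10I 0 Value
D  buffer                         *   Value
D  nbytes                       10U 0 Value
D close           PR            10I 0 ExtProc('close')
D  fildes                       10I 0 Value
D O_RDONLY        C                   1
D O_TEXTDATA      C                   16777216
D FileDesc        S             10I 0
D BytesRead       S             10I 0
D Buffer          S          32766A
 /Free
    // Open the stream file read only; O_TEXTDATA requests text conversion
    FileDesc = open('/home/Hauser/Sales1.csv': O_RDONLY + O_TEXTDATA);
    If FileDesc >= *Zeros;
       DoU BytesRead <= *Zeros;
          BytesRead = read(FileDesc: %Addr(Buffer): %Size(Buffer));
          // ... process %Subst(Buffer: 1: BytesRead) here (CRLF and ';'
          // parsing as shown above) ...
       EndDo;
       CallP close(FileDesc);   // CallP avoids confusion with the CLOSE opcode
    EndIf;
 /End-Free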
I have a generic UDTF that can read any *.csv file (up to 128 columns) and
split the content into (VARCHAR) columns.
This UDTF is very helpful when I have to select production data and copy it
into tables located on our development machine.
For example:
Insert into MyFile (FileColA, FileColB, FileColX)
Select Col1, Col2, Col3
From Table (ReadCsv('/home/Hauser/MyCsvFile.csv', ';')) x;
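For anyone who has not registered such a function yet, the following is only a rough sketch of how a generic CSV-reading UDTF might be created. The service program name, the number of result columns and the routine options are assumptions for illustration; the real function returns up to 128 VARCHAR columns:
CREATE FUNCTION ReadCsv (IfsFileName VARCHAR(256), ColSeparator VARCHAR(1))
       RETURNS TABLE (Col1 VARCHAR(256), Col2 VARCHAR(256), Col3 VARCHAR(256))
       EXTERNAL NAME 'MYLIB/READCSVR(ReadCsv)'   -- hypothetical service program
       LANGUAGE RPGLE
       PARAMETER STYLE DB2SQL
       NOT DETERMINISTIC
       READS SQL DATA
       NO EXTERNAL ACTION
       DISALLOW PARALLEL;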
... but either way, nobody is forced to use this technique. It is only a
different way to solve a specific problem.
Best regards
Birgitta Hauser
"Shoot for the moon, even if you miss, you'll land among the stars." (Les
Brown)
"If you think education is expensive, try ignorance." (Derek Bok)
"What is worse than training your staff and losing them? Not training them
and keeping them!"
-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx
[mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of CRPence
Sent: Thursday, 15 August 2013 18:23
To: midrange-l@xxxxxxxxxxxx
Subject: Re: Replacing carriage returns in a data file
On 15 Aug 2013 00:54, Peter Connell wrote:
> The SQL technique instead of CPYFRMIMPF sounds very interesting.
I am not sure how loading a stream file into a program via SQL or writing
a STMF using SQL could qualify as "instead of CPYFRMIMPF".
Rather, those techniques of moving data between a stream file and a LOB
column [as file reference] could replace using STMF blocked or record I/O;
in effect, defer to the SQL to effect a multi-block read of the STMF into
the LOB field. For the CPYFRMIMPF feature, the required parsing of the data
would remain for the user program to perform. And although a much closer
replacement for CPYFRMSTMF, the parsing of the Line-End [EOR] is still an
issue. Simply having all records as the data stored in a field is far from
having the stream-records prepared as effective database records which
can be inserted or queried. The CPYFRMxxx utilities likely are often still
going to be preferable to writing the code, if the chosen utility provides
what is necessary without any coding.
Short of having a UDTF [specific or generic; the latter is fantasy,
except within very explicitly defined limitations\boundaries] that effects
essentially what CPYFRMSTMF or CPYFRMIMPF could effect, copying the stream
data into a database file using those CL command invocations or similar
[e.g. RYO import] *prior* to accessing the data via the SQL might often
remain preferable. Because a UDTF is not update capable, the results would
have to be copied to another file anyhow, if a permanent copy of the
massaged data were required.
> Reminds me of the BULK IMPORT feature available on SQL Server.
The DB2 could similarly add to the available SQL statements to effect
import and export. Presumably providing those awaits their inclusion in
the standards; i.e. until then, the features are provided via the CL [copy
component] utilities CPYFRMIMPF and CPYTOIMPF, such that they are not in
conflict with whatever the standards eventually establish.
> I can understand that this may be possible, since observing that
> CPYFRMIMPF can throw SQL errors indicating there is some sort of
> resource contention because of previous use of the SQL environment
> prior to using CPYFRMIMPF in the same job. That implies that CPYFRMIMPF
> is using SQL under the covers.
The import\export copy utilities use SQL to interface to the database
file as input [FROMFILE] or output [TOFILE], using the SQL CLI.
Originally the use of the SQL CLI was the non-server mode, which was very
finicky; effectively only one utilization per job, and the first utilization
was the arbiter for the environment. That and other features implemented
using the SQL [e.g. BRMS] for some of their work were /corrected/ to utilize
the server-mode of the CLI, for which there should be fewer [¿no?] impacts
between them.
The CPYFRMIMPF likely still uses a block-read of the stream file data
using the C run-time function fread(); i.e. its use of the SQL is unlikely
to be loading the STMF as a LOB File Reference. And of course the
implementation of the LOB data loading by the SQL _could be_ effectively the
same implementation [i.e. fread()], and thus using the SQL to load that data
into a LOB column may be little more than added overhead, as compared with
loading the data into a non-column program buffer using the fopen() and
fread() directly.
The CPYFRMIMPF generates a [hopefully multi-row] SQL INSERT into the
ToFile. The CPYTOIMPF generates a SQL SELECT of data cast to character
string form FROM the FromFile.
> Do you have an example that demonstrates this?
Given the above, I think "this" might need to be clarified. The given
example shows LOB File Reference declaratives being used both to read the
file data into the implicit LOB value that is the argument of the REPLACE
scalar function, and to write the file via the SET statement, because the
host variable being SET is a LOB File Reference established with the
/overwrite/ option.
FWiW: You may want to read Birgitta's article; there is a section
entitled "External UDTF reading data from a comma-separated file" that shows
that the parsing of the data for either just the EOR or both the EOR and
effective field-data, is still required:
http://www.ibm.com/developerworks/ibmi/library/i-power-of-udtf/
Regards, Chuck
Birgitta Hauser on Thursday, 15 August 2013 7:36 p.m. wrote:
If embedded SQL is an option for replacing characters in a
stream file, you may try the following example:
D CLOBFileFrom S SQLType(CLOB_File)
D CLOBFileTo S SQLType(CLOB_File)
/Free
CLOBFileFrom_Name = '/home/Hauser/MyFile.txt';
CLOBFileFrom_NL = %Len(%Trim(CLOBFileFrom_Name));
CLOBFileFrom_FO = SQFRD; //Read Only
CLOBFileTo = CLOBFileFrom;
CLOBFileTo_FO = SQFOVR; //Replace if exists
Exec SQL
Set :CLOBFileTo = Replace(:CLOBFileFrom, x'0D25', '');
If SQLCODE < *Zeros;
//Handle SQL Errors
EndIf;
Return;
/End-Free
BTW file reference variables (--> SQLType(CLOB_File),
SQLType(DBCLOB_File), SQLType(BLOB_FILE) and all the
SQLType(XML_?LOB_FILE)s) can be managed with SQL functions like any
character variable. In this way the stream file could easily be written
directly into a database table with embedded SQL (without a CPYFRMxxx
command).
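For instance (a minimal sketch; the table name and column name are only assumptions), the complete content of a stream file could be inserted into a CLOB column like this:
D CLOBFile        S                   SQLType(CLOB_File)
 /Free
    CLOBFile_Name = '/home/Hauser/MyFile.txt';
    CLOBFile_NL   = %Len(%Trim(CLOBFile_Name));
    CLOBFile_FO   = SQFRD;              //Read only
    // MyTable and TextData are hypothetical; the database reads the stream
    // file referenced by the host variable and inserts its content directly
    Exec SQL Insert into MyTable (TextData) Values(:CLOBFile);
 /End-Free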