RE: recommendations for faster IFS and string handling sub-procs. -- RPG400-L

I believe the QDCXLATE is unnecessary if you use _ILE_C_fgets(), so that
might help as well--but I'm not certain, so check the return value in debug
mode.
My CSV converter in the RPG ToolKit runs much quicker and it uses fgets() so
perhaps that's all that's (on the surface) needed.
Here's the prototype for the regular fgets() and supporting procs:

      /IF       NOT DEFINED(STREAM_FILES)
      /DEFINE   STREAM_FILES
     
     D fOpen           PR              *   ExtProc('fopen')
     D  fileName                       *   Value Options(*STRING)
     D  fmode                          *   Value Options(*STRING)
     D O_READ_ONLY     C                   Const('r')
       
     D fClose          PR            10I 0 ExtProc('fclose')
     D  stream_FILE                    *   Value
      
     D ifsReadLine     PR              *   ExtProc('fgets')
     D  inBuffer                       *   Value Options(*STRING)
     D  inBufLen                     10I 0 Value
     D  stream_FILE                    *   Value
       
     D fEof            PR            10I 0 ExtProc('feof')
     D  stream_FILE                    *   Value
      
     D
    
      /ENDIF
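
For what it's worth, here is a rough sketch of how those prototypes might be
used in a read loop.  The path, the 1024-byte buffer, and the variable names
are illustrative assumptions only, and it presumes the prototypes above are
already in scope.  Keep in mind that fgets() returns *NULL at end of file (or
on error) and leaves the line terminator in the buffer, so the caller still
has to strip it.

```rpgle
     D fp              S               *
     D buffer          S           1024A
     D line            S           1024A   Varying

      /free
        // Hypothetical path, for illustration only
        fp = fOpen('/home/rick/input.csv': O_READ_ONLY);
        if fp = *NULL;
           // Open failed - nothing to read
           *inlr = *on;
           return;
        endif;

        // fgets() fills the buffer with one null-terminated line per call
        dow ifsReadLine(%addr(buffer): %size(buffer): fp) <> *NULL;
           // Copy everything up to the null terminator
           line = %str(%addr(buffer));
           // ... parse "line" here; it may still end with the terminator
        enddo;

        fClose(fp);
        *inlr = *on;
      /end-free
```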

Bob Cozzi
Cozzi Consulting
www.rpgiv.com


-----Original Message-----
From: rpg400-l-bounces@xxxxxxxxxxxx [mailto:rpg400-l-bounces@xxxxxxxxxxxx]
On Behalf Of Richard B Baird
Sent: Friday, June 13, 2003 1:54 PM
To: rpg400-l@xxxxxxxxxxxx
Subject: recommendations for faster IFS and string handling sub-procs.

Hey all,

Sorry for the really long post.

I've jumped head first into sub-procedures.

I had a job that needed to read a .CSV file from the IFS, load into a PF,
do some data conversion and then post the data to other files.

I was using CPYFRMIMPF to convert the csv, and it worked fine, but the file
is getting quite large - 2million records now and growing, and just the
CPYFRMIMPF was taking several hours.

So I created a service program to read from the IFS, and another to parse
the CSV fields.  Actually, I added these functions to existing service
programs (good idea or not?).

I've probably cut the time for the entire process in half, but I'd like to
do better (it still runs for a lot more than an hour), and I think that I
might get some savings from the way I pass the strings to my subprocedures,
or how I assemble the program.  I'll include my proc definition specs at
the bottom for you all to tear apart.

Here are my questions - keep in mind, I'm ILE-challenged compared to a lot
of you guys:

1.  What are the performance ramifications of using these string and IFS
functions in a service program rather than, say, creating modules and
binding them together into one program, or simply defining all my procs
in the driver program, eliminating the binding altogether?

2.  What is the best way of passing potentially large strings to and from a
sub-proc?  (See my examples for how I did it - mostly varying strings.)

3.  I've put all my IFS open/read/close stuff in a subproc for ease of use,
but I'm not saving as much in ease as I'm losing in performance.

4.  My CSV parsers do 1 variable at a time - this may be where I'm wasting a
lot of time - but I was hoping to reuse it later.  It receives the entire
record, a delimiter, and a 'field index' and returns the nth variable in
the record.  Can anyone improve upon this design without making it 'file
specific'?

5.  Any other tips you can give me would be appreciated.

Below are some of my subprocs - cut them to pieces please - show no mercy.

I won't bother including the IFS open and close routines - they run only
once per job.  The read and most of the string functions run as many as
1 to 10 times for each of 2 million records.  The numeric field CSV parser
includes code from Hans and Barbara posted to the list a while back.

Thanks a ton!!

Rick

ifs read service program procedure:

 * function: ReadStrmF - Reads a 'record' from the ifs -
 *        end-of-record marker assumed to be CR, LF, or CRLF.
 * parms:  peFD  - file id of previously opened stream file.
 *         peEOF - end of file marker (no data read at all)
 * returns: Varying string of record read

P ReadStrmF       B                   export
D ReadStrmF       PI         32765A   varying
D  peFD                         10I 0
D  peEOF                          n

D Char            S              1A
D Eof             S             10I 0
D Length          S              5P 0
D String          S          32765A
D Table           S             10A
D StringOut       S          32765A   varying

C                   eval      Eof = 1
C                   eval      peEof = *off

 * read one "line" from file (one char at a time until cr, lf or crlf)

C                   eval      Length = 0
C                   eval      String = *blanks

C                   dou       Char = x'0A' or Char = x'0a'
C                   eval      Eof = Read(peFD: %addr(Char): 1)

 * if nothing read,

C                   if        Eof < 1

 * but some data has been previously received, treat like end of record.

C                   if        Length > 0
C                   leave
C                   end

 * no data prev received - return nothing, and set EOF.

C                   clear                   StringOut
C                   eval      peEof = *on
C                   return    StringOut
C                   end

 * don't process cr or lf

C                   if        Char <> x'0A' and Char <> x'0D' and
C                             Char <> x'0a' and Char <> x'0d'
C                   eval      Length = Length + 1
C                   eval      %subst(String: Length: 1) = Char
C                   end

 * maximum length reached

C                   if        Length = 32765
C                   leave
C                   end

C                   enddo

 * Convert to EBCDIC

C                   if        Length > 0
C                   call      'QDCXLATE'
C                   parm                    Length
C                   parm                    String
C                   parm      'QEBCDIC'     Table
C                   end

 * Return that line

C                   eval      StringOut = %trimr(String)
C                   return    StringOut

P                 E
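
One thing that stands out in the routine above is that it calls read() once
per character.  As a rough sketch only (not tested against this job), the
same idea can be done with one read() per 8K block and a %scan for the line
feed.  The names ReadLineBuf, block, blockLen and blockPos are made up for
illustration, and the Read prototype is an assumption about how the existing
one is declared.  The sketch keeps its buffer in static storage, so it only
suits one open file at a time, it treats a read error the same as end of
file, and it returns the data untranslated, so the QDCXLATE step would still
have to be applied to the result.

```rpgle
     D Read            PR            10I 0 ExtProc('read')
     D  fd                           10I 0 Value
     D  buf                            *   Value
     D  nbytes                       10U 0 Value

     P ReadLineBuf     B                   Export
     D ReadLineBuf     PI         32765A   Varying
     D  peFD                         10I 0 Value
     D  peEOF                          N

     D block           S           8192A   Static
     D blockLen        S             10I 0 Static Inz(0)
     D blockPos        S             10I 0 Static Inz(1)
     D line            S          32765A   Varying
     D remain          S             10I 0
     D lf              S             10I 0 Inz(0)

      /free
        dou lf > 0;
           // Refill the block once the previous one is used up
           if blockPos > blockLen;
              blockLen = Read(peFD: %addr(block): %size(block));
              blockPos = 1;
              if blockLen < 1;
                 blockLen = 0;
                 leave;
              endif;
           endif;

           // Look for a line feed in the unread part of the block
           remain = blockLen - blockPos + 1;
           lf = %scan(x'0A': %subst(block: blockPos: remain));
           if lf = 0;
              // No line feed yet - keep the data and read another block
              line = line + %subst(block: blockPos: remain);
              blockPos = blockLen + 1;
           else;
              // Keep the data before the line feed and step past it
              line = line + %subst(block: blockPos: lf - 1);
              blockPos = blockPos + lf;
           endif;
        enddo;

        // Drop the CR of a CRLF pair
        if %len(line) > 0 and %subst(line: %len(line): 1) = x'0D';
           %len(line) = %len(line) - 1;
        endif;

        // End of file only when nothing at all was read
        peEOF = (lf = 0 and %len(line) = 0);
        return line;
      /end-free
     P                 E
```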

parse csv records procedures:

 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 * ParseDelimAlph - returns a 256 alpha string which
 *   corresponds to the x element in a delimited record.

 *     peString    varying sized delimited record
 *     peDelim     delimiter character
 *     peIndx      field index
 *     peError     error indicator

 *      returns Variable or blanks, error indicator
 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

P ParseDelimAlph  B                   EXPORT
D ParseDelimAlph  PI           256a   varying
D   peString                 32765a   varying
D   peDelim                      1a   const
D   peIndx                       5i 0 const
D   peError                       n

D str             S            256a   varying

C                   eval      str = ParseDelim(peString:
C                                       peDelim:peIndx:peError)
 * peError is set by ParseDelim; resetting it here would hide the error.
C                   return    str

P                 E

 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 * ParseDelimNumb - returns a 30P 9d field which
 *     corresponds to the 'x' element in a delimited record.

 *     peString    varying sized delimited record
 *     peDelim     delimiter character
 *     peIndx      field index
 *     peError     error indicator

 *      returns numeric Variable or 0
 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

P ParseDelimNumb  B                   EXPORT
D ParseDelimNumb  PI            30p 9
D   peString                 32765a   varying
D   peDelim                      1a   const
D   peIndx                       5i 0 const
D   peError                       n

D str             S            256a   varying

D negative        S               n   inz(*OFF)
D string          DS            30
D    decnum                     30s 9 inz(0)
D i               S             10i 0 inz(1)
D digits          S             10i 0 inz(0)
D decpos          S             10i 0 inz(0)
D dec             S             10i 0 inz(0)
D ch              S              1a
D chtemp          S             30a   varying

 /free

  str = ParseDelim(peString:peDelim:peIndx:peError);

  if peError;
     return 0;
  endif;

   // Skip leading blanks (if any)

  dow i <= %len(Str) and %subst(Str:i:1) = ' ';
     i = i + 1;
  enddo;

  // Is string blanks or null?  then return 0

  if i > %len(Str);
     return 0;
  endif;

  // Is first non-blank char a minus sign?

  if %subst(Str:i:1) = '-';
     negative = *ON;
     i = i + 1;
  endif;

  // Skip leading zeros (if any)

  dow i <= %len(Str) and %subst(Str:i:1) = '0';
     i = i + 1;
  enddo;

  // Is string all zeros and blanks? then return 0

  if i > %len(Str);
     return 0;
  endif;

  // Loop through digits of string to be converted

  dow i <= %len(Str);
     ch = %subst(Str:i:1);

     if ch = '.';
        // We've reached the decimal point - only
        // one allowed

        if decpos <> 0;
           // We've already read a decimal point
           leave;
        endif;

        // Indicate decimal position just after last
        // digit read.

        decpos = digits + 1;
     elseif ch >= '0' and ch <= '9';

        // We've read a digit - save it

        digits = digits + 1;
        chtemp = chtemp + ch;

        // Have we read enough digits?

        if digits = 30;
           leave;
        endif;

     else;
        // Anything other than a digit or decimal point
        // ends the number
        leave;
     endif;

     // Advance to the next character

     i = i + 1;
  enddo;

  // Adjust decimal positions

  if decpos = 0;
     // If no decimal point coded, assume one after all digits
     decpos = %len(chtemp) + 1;
  else;
     // drop excess decimal digits
     dec = %len(chtemp) - decpos + 1;

     if dec > 9;
        %len(chtemp) = %len(chtemp) - (dec - 9);
     endif;
  endif;

  // Scale number appropriately

  %subst(string: 23-decpos: %len(chtemp)) = chtemp;

  // Set sign of result

  if negative;
     decnum = - decnum;
  endif;

  // Return answer

  peError = *off;
  return decnum;

 /end-free

P                 E

 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 * ParseDelim - retrieves the (x) element in a delimited record,
 *            verifies data type, returns to caller value or error

 * parms:   peString    varying sized csv record
 *          peDelim - delimiter
 *          peIndx  - field number
 *          peError - error indicator (returned)

 *      returns Variable or error
 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 *+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

P ParseDelim      B
D ParseDelim      PI           256a   varying
D   peString                 32765a   varying
D   peDelim                      1a   const
D   peIndx                       5i 0 const
D   peError                       n

D str             S            256a     varying
D $beg            S             10i 0
D $len            S             10i 0
D $y              S             10i 0
D $x              S             10i 0

C                   eval      peError = *off

C                   if        peIndx < 1
C                   eval      peError = *on
C                   return    *blanks
C                   end

C                   eval      peError = *off
C                   eval      $beg = 0
C                   eval      $len = 0
C                   eval      str = *blanks
C                   eval      $y = 0

c                   if        peIndx = 1
c                   eval      $beg = 1
c                   end

C     1             do        peIndx        $x
C                   eval      $y   = %scan(peDelim:peString:$y+1)

C                   if        $x   = peIndx - 1
C                   eval      $beg = $y + 1
C                   end

C                   if        $x   = peIndx

C                   if        $y   = 0
C                   eval      $len = (%len(peString) + 1) - $beg
C                   else
C                   eval      $len = $y - $beg
C                   end

C                   end

C                   enddo

C                   if        $beg = 0
C                   eval      peError = *on
C                   end

C                   if        $len = 0 or $beg = 0
C                   return    *blanks
C                   end

C                   eval      str = %trim(%subst(peString:$beg:$le

c                   dow       %scan('"':str:1) > 0
C                   eval      str = %trim(%replace(
C                                          ' ':
C                                          str:
C                                          %scan('"':str:1)))
c                   enddo

C                   return    str

P                 E
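
On question 4, one way to stay generic without rescanning the record for
every field is to split it once into an array and let the caller pick fields
by index.  The following is a minimal sketch under stated assumptions, not a
tested replacement: SplitDelim, the 64-element limit, and the variable names
are all hypothetical.  Like the routines above, it does not handle
delimiters inside quoted fields, and it does not count an empty field after
a trailing delimiter.

```rpgle
     P SplitDelim      B                   Export
     D SplitDelim      PI            10I 0
     D   peString                 32765A   Varying Const
     D   peDelim                      1A   Const
     D   peFields                   256A   Varying Dim(64)

     D start           S             10I 0 Inz(1)
     D pos             S             10I 0
     D count           S             10I 0 Inz(0)

      /free
        // One pass over the record: each %scan resumes where the last ended
        dow count < %elem(peFields) and start <= %len(peString);
           pos = %scan(peDelim: peString: start);
           if pos = 0;
              // No more delimiters - the last field runs to end of record
              pos = %len(peString) + 1;
           endif;

           count = count + 1;
           peFields(count) = %subst(peString: start: pos - start);
           start = pos + 1;
        enddo;

        // Number of elements of peFields that were filled in
        return count;
      /end-free
     P                 E
```

The caller would call SplitDelim once per record and then read fields 1
through the returned count straight from the array, so the cost per record
no longer grows with the number of fields requested.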

