AW: Variable length fields question -- MIDRANGE-L

Hi guys,

IIRC:
1. If VARLEN (without length) is specified (or VARCHAR without ALLOCATE in
SQL) only the 16 Byte address of the overflow area is stored within the
record. The varying length data is completely stored within the overflow
area. When reading the data the overflow area must be found and searched and
finally the data returned, i.e. additional paging is required.
If VARLEN(length) or VARCHAR with ALLOCATE is used, the number of allocated
bytes is stored directly within the table/physical file. All data longer
than the allocated length are stored within the overflow area and only the
address of this overflow area can be found within the column.
2. For short varying length columns (30 Bytes ?) the varying length data is
always stored within the column independent whether an allocated length is
specified or not.
3. If ALLOCATE or VARLEN(Length) is specified either the address of the
overflow area or if the data is shorter than the allocated length, the data
is stored within the column.
4. If ALLOCATE or VARLEN(length) is specified the length should be set so
between 80 and 90% of all data can be stored directly within the column (and
only the exceptions are stored within the overflow area).
5. If the table also includes LOB (Large Object) columns. The VARCHAR and
LOB columns are all stored in the same auxiliary overflow storage area. This
may cause additional paging for locating and returning the VARCHAR data.
6. Varying length columns are, more efficient on the IBM i when there is a
need to provide for wildcard searching. That's because DB2 UDB for i will
stop searching the varying length data a wildcard string search encounters
blanks that indicate the end of available characters. With fixed-length
characters, DB2 UDB for iSeries must search to the very end of the string,
even if the string contains blanks.
7. Using varying length fields also speeds up concatenating data (compared
with fixed length data). The data length of the varying length data is
always stored within the 2 leading byte of a varying length field, a
"substring" may be performed and the data returned.
Also if the data is changed, only the new data is replaced and the length
revised, but the rest of the column is not padded with whatever character.
When concatenating fixed length fields, trailing blanks must be trimmed off,
i.e. the last not blank character must be found. The search starts from the
end of the character field.
If the data is changed, the rest of the column must be padded with blanks.

Mit freundlichen Grüßen / Best regards

Birgitta Hauser

"Shoot for the moon, even if you miss, you'll land among the stars." (Les
Brown)
"If you think education is expensive, try ignorance." (Derek Bok)
"What is worse than training your staff and losing them? Not training them
and keeping them!"

-----Ursprüngliche Nachricht-----
Von: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] Im Auftrag von
Voris, John
Gesendet: Tuesday, 29.7 2014 23:07
An: midrange-l@xxxxxxxxxxxx
Betreff: Re: Variable length fields question

I have been following the discussion with interest.
"The OS is smart enough to pad out the field" when processing, and RPG
buffers are sized at the maximum."

Here is our case: our Bill of Lading numbers are defined as 30A. But in
one file, CUPMIFH, the BOL was defined as 15A.
Most of our Bills of Lading are 7 or 8 char, and it was only once or twice
we found the 15A exceeded.
We have installed a fix in the RPG program where this surfaced as a problem,
so we have the problem resolved for now.

But when / if we were to address this mismatch . . . by redefining the
problem file, . . . would it make sense to define the BOL field as 30A
VARLEN(15) ?
Are there any advantages to making the field VARLEN ?

.+ One advantage we see is that SQL joins would be more reliable and
simpler to construct.
. - But we still have to recompile all RPG programs that are using this
problem file, correct ?

So if we are recompiling, we are more likely to go with the existing 30A and
not use VARLEN.

As far as saving DASD space, I believe it would be trivial, as the detail
file with the field BOL is only 2.2 GB ( 1.4 mill records ), and there are
other files with BOL scattered throughout the module in question. But the
space savings of 5A ( actually 5A - 2A for LEN ) is nominal.

So I assume that if we really wanted to gain any advantage, it would be to
convert the entire system to use the VARLEN(15).

And in actuality, I believe that if we defined the system using SQL Tables
and Views instead of DDS, then we would likely gain more in CPU savings than
anything we might achieve with database field sizes.

John Voris
Crown Americas LLC

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx To subscribe, unsubscribe,
or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx Before posting, please take a
moment to review the archives at http://archive.midrange.com/midrange-l.