Re: Reading RPG style record in Java -- JAVA400-L

Thanks again Chuck,

I think I need to reflect on this more deeply.

The "RPG side" works since we control 100% of the environment (JOBDs are
always in sync with the data definitions) but the Java part can be problematic.

Best regards

--
Marco Facchinetti

Mr S.r.l.

Tel. 035 962885
Cel. 393 9620498

Skype: facchinettimarco

2015-12-12 3:16 GMT+01:00 CRPence <crpbottle@xxxxxxxxx>:

On 04-Dec-2015 06:03 -0600, Marco Facchinetti wrote:

Hi Chuck, my comments in-line.

On 2015-12-02 20:24 GMT+01:00 CRPence wrote:

On 01-Dec-2015 03:55 -0600, Marco Facchinetti wrote:

As implied by the title, I have to read (and use) this file (and
cannot modify it) in a Java program:

A R AF2WK TEXT('Afpds: workfile')
A*
A AWIDDOC 5 0 TEXT('Id doc')
A AWIDPAG 10 0 TEXT('Id page')
A AWIDOPE 14 0 TEXT('Id')
A AWNMOPE 20 TEXT('Operation')
A AWDSOPE 40 TEXT('Ds name')

A AWFLD 256 VARLEN(30) CCSID(1144)
A AWFLDEX 32000 VARLEN(1) ALWNULL DFT('')
A CCSID(1144)
A*
A K AWIDDOC
A K AWIDPAG
A K AWIDOPE

I use in RPG programs AWFLD and AWFLDEX as DS:

dDs_w_StampaTesto...
d ds qualified
d h like(Ubase)
d v like(Ubase)
d punti 4s01
d font 8
d codepage 8
d Orientamento 3s00
d Lunghezza 3s00
d Colore 3s00

so the code is very easy:

<ed: text from a followup reply expands code snippet:>

dow not %eof();
read af2wk;
if %eof();
leave;
endif;

Select;

When AWDSOPE = 'Ds_w_StampaTesto';
Ds_w_StampaTesto = AWFLD;

When AWDSOPE = 'Ds_w_StampaBox';

Ds_w_StampaBox = AWFLD;
Ds_w_StampaBoxExtended = AWFLDEX;

...

Endsl;

enddo;

So my question is: how to do the same in Java without hardcoding
positions, names and types?
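
One way to approach this in Java is to keep each DS layout as data rather than as code, and pick the layout by the AWDSOPE value, much as the Select/When does. The following is only a minimal sketch under several assumptions: the class, enum and method names are hypothetical; the offsets are inferred from the DS as posted plus the substring positions used later in this thread; Ubase is assumed to be a 2-byte unsigned integer; and the bytes of AWFLD are assumed to reach Java untranslated, which the CCSID discussion in this thread shows is not guaranteed with the file as currently defined.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DsLayoutSketch {
    enum Kind { USHORT, ZONED, TEXT }

    /** One subfield: 0-based byte offset, byte length, decimal scale (ZONED only). */
    static final class Field {
        final String name;
        final Kind kind;
        final int offset, length, scale;

        Field(String name, Kind kind, int offset, int length, int scale) {
            this.name = name;
            this.kind = kind;
            this.offset = offset;
            this.length = length;
            this.scale = scale;
        }
    }

    // Assumption: text subfields are EBCDIC in the CCSID named by the DDS (1144).
    static final Charset EBCDIC = Charset.forName("IBM01144");

    // Layouts keyed by the AWDSOPE value; only one of the ~60 is sketched here.
    static final Map<String, List<Field>> LAYOUTS = new HashMap<>();
    static {
        LAYOUTS.put("Ds_w_StampaTesto", Arrays.asList(
            new Field("h", Kind.USHORT, 0, 2, 0), // assumes Ubase is 2-byte unsigned
            new Field("v", Kind.USHORT, 2, 2, 0),
            new Field("punti", Kind.ZONED, 4, 4, 1),
            new Field("font", Kind.TEXT, 8, 8, 0),
            new Field("codepage", Kind.TEXT, 16, 8, 0),
            new Field("Orientamento", Kind.ZONED, 24, 3, 0),
            new Field("Lunghezza", Kind.ZONED, 27, 3, 0),
            new Field("Colore", Kind.ZONED, 30, 3, 0)));
    }

    /** Zoned decimal: digit in each low nibble, sign in the last byte's high nibble. */
    static BigDecimal zoned(byte[] raw, int offset, int length, int scale) {
        StringBuilder digits = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int digit = raw[offset + i] & 0x0F;
            if (digit > 9) {
                throw new IllegalArgumentException("Invalid zoned digit at byte " + (offset + i));
            }
            digits.append(digit);
        }
        // 0xB and 0xD mean negative; this sketch treats every other zone as positive.
        int sign = (raw[offset + length - 1] & 0xF0) >>> 4;
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0B || sign == 0x0D) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    /** Decodes the untranslated bytes of AWFLD using the layout chosen by AWDSOPE. */
    static Map<String, Object> decode(String dsName, byte[] raw) {
        List<Field> layout = LAYOUTS.get(dsName.trim());
        if (layout == null) {
            throw new IllegalArgumentException("No layout registered for " + dsName);
        }
        Map<String, Object> values = new LinkedHashMap<>();
        for (Field f : layout) {
            switch (f.kind) {
                case USHORT:
                    values.put(f.name, ((raw[f.offset] & 0xFF) << 8) | (raw[f.offset + 1] & 0xFF));
                    break;
                case ZONED:
                    values.put(f.name, zoned(raw, f.offset, f.length, f.scale));
                    break;
                default: // TEXT
                    values.put(f.name, new String(raw, f.offset, f.length, EBCDIC).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[33];
        Arrays.fill(raw, (byte) 0xF0);      // EBCDIC zeros in the zoned subfields
        Arrays.fill(raw, 0, 4, (byte) 0);   // h and v set to 0
        Arrays.fill(raw, 8, 24, (byte) 0x40); // EBCDIC blanks in font and codepage
        System.out.println(decode("Ds_w_StampaTesto", raw));
    }
}
```

The layouts could equally be loaded from a properties file or from the file's field descriptions instead of a static block; the point of the sketch is only that positions, names and types live in one table rather than being scattered through the reading code.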

Perhaps easy, but also seemingly flawed, at least according to
the DDS. The CCSID-tagged character data from the file is being
utilized in the program as though the data had been read from the
file as binary\CCSID(*HEX); suggesting that the effects are
potentially undesirable, if ever the job runs with anything other
than the CCSID(*HEX) or CCSID(1144). The ill effects could be quite
covert and if\when noticed, the origin [being a user with a
different CCSID] probably not easily inferred. Binary data is data
that must not be translated\converted according to a character
encoding, but when the data from AWFLD and AWFLDEX is being read
from the file, character conversion can occur; those fields should
be identified as FOR BIT DATA, as Hex, BINARY, or a similar
specification to avoid an issue.

If I understand your warning correctly, it is about binary data (i.e.
INT or similar).

Yes. The warning applies to any /binary/ [i.e. non-text] data, for
which that data will be direct-mapped via variables named in the DS. That
direct-map occurs in the code shown, wherever a DS variable of the naming
Ds_w_xxx appears in an expression Ds_w_xxx=AWFLDxx; i.e. the effect will
be, that the DS variables will implicitly _redefine_ the underlying data
instead of being cast from one type to another.

That redefine\direct-map is quite different than having separately
_assigned_ each variable of the receiver DS from the distinct scalar values
from the data defined in a DS that defined actual database fields. In the
given scenario there are no scalar values, because the data remains
[effectively] un-described, awaiting the redefine established by the
definition of the target DS. But rather than being truly un-described, the
data was assigned a non-hex CCSID, and thus the data was subject to data
translation\conversions when initially obtained from the database file.

I will offer a program that slightly modifies the given scenario [to add
a couple Packed BCD; Zoned BCD has less potential for issues than Packed
and any sized\signed Integer], whereby first I place the binary data into
the database file [which mimics entering the data from RPG using a job with
CCSID(1144) or CCSID(*HEX)], and then reads the data from the file into the
noted DS of the format of the file, and then moves that data into the other
DS by assignment of one DS to the other [which is an effective binary
copy]. The program then performs data validation to ensure the numeric
data was not modified\corrupted from the originally input values; when run
in CCSID(*HEX) job, all is well; when run in CCSID(1144) job, all is well;
when run in CCSID(37) job, Oh my! We have a problem.

Strings with normal text should not be affected.

Agreed, provided the program is written with the default assumption and
understanding that the data has been converted into the job CCSID; an
exception is the long-standing issue whereby the standard names of objects,
though inherently /text/, are treated as having been
defined by code point [at least they are, for @, #, $].

I'll convert any INT or packed to Signed.

Not sure what is meant; generally, as to when\where such conversion
would be intended, nor what is meant by "Signed". Perhaps the implication
is that all numeric data will be ensured to always be stored Zoned decimal
and always as positive Zoned [i.e. effectively _unsigned_] for which all
positive numeric values are the EBCDIC character _text_ data being the
digits zero to nine [i.e. are the invariant code points 0xF0 to 0xF9]?

But, unless the conversion is done for the data being stored in the file
with the record format AF2WK, then a later attempt to /convert/ that data
is too late; i.e. the data read from that file will have already
experienced any required _text_ conversions, thus the CCSID conversions
already having been mis-applied to the _non-text_ data.

FWiW: Each mapping of the data from AWFLD, into individual fields,
could be defined in a VIEW; scalar User Defined Function(s) (UDF)
can be created to define the effective _direct map_ of the un-typed
data [i.e. bytes of data] into typed-scalar values. The SQL can not
easily effect that directly [without UDF(s)], because the SQL only
supports a well-defined set of scalar-to-scalar mappings; a
substring of bytes is effectively either typed as BINARY or as
CHAR, for which the ability to cast into another data type is not
simple, because the data is the internal-format vs the
character-format the SQL perceives the scalar value to be.
Similarly, the DDS LF also does not have any support for
untyped\direct-map capabilities, but also provides no alternative
like the SQL does with

The problem with a VIEW is the number, there are approx. 60 DS's
with different layout and fields.

Of course no matter what is used, the 60 different layouts must be
defined somewhere. The VIEW method is much more complicated than others,
but the data can be read directly and the formats directly available to the
RPG by EXTNAME for example.

Moreover, the overall scope of the file structure is a sequential (by
key) reading to preserve the right sequence of operations.

Ah yes. That is a problem\deficiency with VIEWs. However that could be
resolved by use of Open Query File (OPNQRYF) to effect the keyed Open Data
Path (ODP) using the Key File (KEYFILE) parameter. However that, like
using the VIEW directly if order were not of concern, is probably just
unnecessary complication.

expressions [that optionally are UDF invocations]. Note: a

creative use of a FIELDPROC could also implement the mapping of the
data, to character, but not if the AWFLDEX could ever approach the
32K, because the expansions of the data for examples like the
internal value of x'F3F2F1F5' as type 4S01 into the string '321.5'
might not fit.

For example, the following [untested] is an expression that can
effect the conversion of the character data for variable punti into
the expected 4S01 [aka NUMERIC(4, 1)] value:
zone(concat(substr(AWFLD, 5, 3),'.',substr(AWFLD, 8, 1)), 4, 1)

For example, the following [untested] is an expression that can
effect the conversion of the character data for variable
Orientamento into the expected 3S00 [aka NUMERIC(3, 0)] value:
zone( substr(AWFLD, 25, 3 ), 3, 0 )

Note: the prior two examples can be functional only for
the file-data that was positive zoned Binary Coded Decimal (BCD)
numeric data, and only with the preferred-positive sign 0xF; i.e.
any negative values, and even any zone portions of the BCD that are
not 0xF, would give errors, despite those values being valid while
stored in the database file.
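
For comparison on the Java side, a decoder working on the raw bytes can honor the sign zone rather than failing on it. The following is a minimal sketch, not a definitive implementation: the class and method names are hypothetical, it assumes the bytes were obtained without any CCSID translation, and it does not check the zone portions of the bytes before the last one.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class ZonedDecimalSketch {
    /**
     * Decodes zoned BCD: each byte holds one digit in its low nibble, and the
     * high nibble (zone) of the last byte carries the sign.
     */
    static BigDecimal decodeZoned(byte[] raw, int scale) {
        StringBuilder digits = new StringBuilder(raw.length);
        for (byte b : raw) {
            int digit = b & 0x0F;
            if (digit > 9) {
                throw new IllegalArgumentException(
                    String.format("Invalid digit in byte 0x%02X", b & 0xFF));
            }
            digits.append(digit);
        }
        int sign = (raw[raw.length - 1] & 0xF0) >>> 4;
        if (sign < 0x0A) {
            // Zones 0x0 to 0x9 are not valid signs (compare the MCH1202 cases in this thread).
            throw new IllegalArgumentException(
                String.format("Invalid sign zone 0x%X", sign));
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0B || sign == 0x0D) {
            unscaled = unscaled.negate(); // 0xB and 0xD mark negative values
        }
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        // x'F3F2F1F5' read as 4S01
        byte[] punti = {(byte) 0xF3, (byte) 0xF2, (byte) 0xF1, (byte) 0xF5};
        System.out.println(decodeZoned(punti, 1));
        // x'F0F9D0' read as 3S00, a negative value
        byte[] negative = {(byte) 0xF0, (byte) 0xF9, (byte) 0xD0};
        System.out.println(decodeZoned(negative, 0));
    }
}
```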

I'll store all the numbers splitting the sign, using %ABSOLUTE and
moving the sign to a separate field, which is supposed to be safe. Using them
in Java means converting the BINary fields (AWFLD and AWFLDEX) to
ASCII.

If the EBCDIC CCSID(1144) data is read into ASCII, then the corruption
of the inherently non-text [aka binary] data would be much worse than just
one EBCDIC CCSID to another.

The proper solution is either to use FOR BIT DATA *or* to convert all
data from the /binary/ internal-forms [any variant of Integer or Binary
Coded Decimal (Zoned and Packed)] into character strings; e.g. instead of
storing decimal negative ninety in a three-byte Zoned BCD as 0xF0F9D0,
store the value as a character string such as '-090', '-90', or '090-'.
And perhaps that is effectively what was alluded to; whereby
"splitting the sign" from the digits achieves the same. But of course that
means many of the currently defined sixty DS, those with any /binary/ data,
will then need updates to reflect the storage and retrieval
as character vs internal-form; and of course so will the programs writing the data
to those files.
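
If the numbers are stored as character strings in one of the forms just mentioned, the Java side only needs to parse text. A minimal sketch, with a hypothetical class and method name, that accepts a leading or a trailing sign:

```java
public class SignedTextNumberSketch {
    /** Parses forms such as "-090", "-90" or "090-" into an int. */
    static int parse(String text) {
        String t = text.trim();
        if (t.isEmpty()) {
            throw new NumberFormatException("Empty numeric field");
        }
        boolean negative = false;
        char first = t.charAt(0);
        char last = t.charAt(t.length() - 1);
        if (last == '-' || last == '+') {          // trailing sign
            negative = (last == '-');
            t = t.substring(0, t.length() - 1);
        } else if (first == '-' || first == '+') { // leading sign
            negative = (first == '-');
            t = t.substring(1);
        }
        if (!t.matches("[0-9]+")) {
            throw new NumberFormatException("Not a signed digit string: " + text);
        }
        int magnitude = Integer.parseInt(t);
        return negative ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"-090", "-90", "090-"}) {
            System.out.println(sample + " -> " + parse(sample));
        }
    }
}
```

Because the digits and the sign characters are ordinary text, such a value is not damaged by a conversion between CCSIDs or into Unicode the way the internal forms are.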

For example, the following is not available with the SQL, until
after someone has created a UBIN2_SMALL function to effect the
conversion of the ushort value of the character data for variable
v into a SMALLINT; noting also that the SQL has no unsigned
integer type, so if the positive data is too large, the effect
would be a negative value when effecting a direct-map from the
ushort into short\smallint:
UBIN2_SMALL(substr(AWFLD, 3, 2) )
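
The same signed-versus-unsigned effect can be seen in Java, which likewise has no unsigned 16-bit type. A small sketch with hypothetical names, assuming big-endian byte order: mapping the two bytes directly into a short turns large values negative, while widening into an int keeps them positive.

```java
public class UnsignedShortSketch {
    /** Direct map of two big-endian bytes into a signed 16-bit value. */
    static short asSigned(byte hi, byte lo) {
        return (short) (((hi & 0xFF) << 8) | (lo & 0xFF));
    }

    /** The same two bytes read as an unsigned value, held in a wider int. */
    static int asUnsigned(byte hi, byte lo) {
        return ((hi & 0xFF) << 8) | (lo & 0xFF);
    }

    public static void main(String[] args) {
        byte hi = (byte) 0xFF;
        byte lo = (byte) 0xFE;
        System.out.println("signed:   " + asSigned(hi, lo));
        System.out.println("unsigned: " + asUnsigned(hi, lo));
    }
}
```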

Thanks very much.

Some examples whereby the _text_ data tagged with a CCSID, i.e. other
than FOR BIT DATA [or similar], will be converted whenever a user
is running in a job CCSID other than *HEX or 1144; e.g. user runs the
program while CCSID(37) is established for the job:

int1 +84 increases to +208
The code point 0x54 = 0d0084 in CP1144 is SM140000 }
The code point 0xD0 = 0d0208 in CP0037 is SM140000 }

int1 +90 decreases to +81
The code point 0x5A = 0d0090 in CP1144 is LE110000 é
The code point 0x51 = 0d0081 in CP0037 is LE110000 é

zone -0 to +4 but invalid-dec=mch1202 per zone-portion=x5 vs xA->xF
The code point 0xD0 in CP1144 is LE130000 è
The code point 0x54 in CP0037 is LE130000 è

pack +4 increases to +5 [but non-preferred positive]
The code point x4F in CP1144 is SP020000 !
The code point x5A in CP0037 is SP020000 !

pack +5 to invalid-dec=mch1202 per sign-portion=x0 vs xA->xF
The code point x5F in CP1144 is SD150000 ^
The code point xB0 in CP0037 is SD150000 ^

Following did not effect substitution as I had expected:
pack +9 decreases to +3 but as substitution [¿error?] on I/O
The code point x9F in CP1144 is SC200000 €
The code point x3F in CP0037 is *unknown ␚

Following not included in example:
pack -7 to invalid-dec=mch1202 per digit=xB vs x0->x9 [sign bad too]
The code point x7B in CP1144 is SC020000 ₤
The code point xB1 in CP0037 is SC020000 ₤
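
The byte changes listed above can be reproduced off the IBM i by decoding a byte with one EBCDIC charset and re-encoding the resulting character with the other. This is only a sketch: the class and method names are hypothetical, and it assumes a JDK that ships the extended charsets, where CCSID 1144 and CCSID 37 are available under the names "IBM01144" and "IBM037".

```java
import java.nio.charset.Charset;

public class CcsidRoundTripSketch {
    /** Treats one byte as text in 'from', then stores that same text in 'to'. */
    static int convert(int byteValue, Charset from, Charset to) {
        String text = new String(new byte[] {(byte) byteValue}, from);
        return text.getBytes(to)[0] & 0xFF;
    }

    public static void main(String[] args) {
        Charset fileCcsid = Charset.forName("IBM01144"); // CCSID tagged on AWFLD
        Charset jobCcsid = Charset.forName("IBM037");    // CCSID of the reading job
        // Byte values taken from the examples in this thread.
        int[] samples = {0x54, 0x5A, 0xD0, 0x4F, 0x5F};
        for (int b : samples) {
            System.out.printf("0x%02X -> 0x%02X%n", b, convert(b, fileCcsid, jobCcsid));
        }
    }
}
```

A one-byte integer holding +84 (0x54) goes through exactly this path when it sits in a CCSID-tagged column, which is why it arrives as +208 (0xD0) in a CCSID(37) job.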

The output from the following attached program source when running job
CCSID(37), using most of those above examples of /binary/ data stored in
the CCSID-tagged column, is shown here; running the program with either
CCSID(*HEX) or CCSID(1144) established for the job yields a "DSPLY No data
corruption! Yeah :-)"

DSPLY h changed from +84 to 208
DSPLY v changed from +90 to 81
Decimal data error.
DSPLY Orientamento: -90 to invalid Zoned BCD data
DSPLY Lunghezza: +234 to 235
Decimal data error.
DSPLY Colore chg: +345 to invalid Packed BCD data
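
The two Packed results can be followed by hand with a small decoder. The sketch below uses hypothetical names and assumes, as the output suggests, that Lunghezza and Colore were the fields changed to Packed in the test program, so that +234 is stored as x'234F' and +345 as x'345F'. After the 1144-to-37 conversion of the last byte those become x'235A' (still valid, but now 235) and x'34B0' (no longer valid).

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class PackedDecimalSketch {
    /** Decodes packed BCD: two digits per byte, with the sign in the final low nibble. */
    static BigDecimal decodePacked(byte[] raw, int scale) {
        StringBuilder digits = new StringBuilder(raw.length * 2);
        for (int i = 0; i < raw.length; i++) {
            appendDigit(digits, (raw[i] & 0xF0) >>> 4);
            if (i < raw.length - 1) {
                appendDigit(digits, raw[i] & 0x0F);
            }
        }
        int sign = raw[raw.length - 1] & 0x0F;
        if (sign < 0x0A) {
            throw new IllegalArgumentException(
                String.format("Invalid sign nibble 0x%X", sign));
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0B || sign == 0x0D) {
            unscaled = unscaled.negate(); // 0xB and 0xD mark negative values
        }
        return new BigDecimal(unscaled, scale);
    }

    private static void appendDigit(StringBuilder digits, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException(
                String.format("Invalid digit nibble 0x%X", nibble));
        }
        digits.append(nibble);
    }

    public static void main(String[] args) {
        System.out.println(decodePacked(new byte[] {0x23, 0x4F}, 0)); // as written
        System.out.println(decodePacked(new byte[] {0x23, 0x5A}, 0)); // after conversion
        try {
            decodePacked(new byte[] {0x34, (byte) 0xB0}, 0);          // after conversion
        } catch (IllegalArgumentException e) {
            System.out.println("Decimal data error: " + e.getMessage());
        }
    }
}
```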

The compile for the attached source member AFPDS_TST [used to be that
attaching a file with ".txt" was accepted; if neither midrange nor gmane
keep the attachment, then I will post later on code.midrange, though
possibly only if someone is interested and asks... because seems the OP
already accepts my warning as valid and understands already the
consequences for how the file\program are currently coded]:
CRTSQLRPGI AFPDS_TST SRCFILE(where_located)
SRCMBR(AFPDS_TST) OBJTYPE(*PGM) DBGVIEW(*SOURCE)

--
Regards, Chuck
