Anyone know what the calorie count is for humble pie? I'm never going to
lose weight if I have to eat this much all the time!

I did some timing tests earlier today and Scott is 100% correct.

Working with a 16,381,440-byte stream file (i.e. just under 16 MB), I tried
4 options:

A. Reading the contents into a 16 MB user space. This needed only a single
call to "read".

B. Reading the contents into 16 MB of ALLOCed memory.

C. Using a 32 KB buffer (just a simple data structure in the program). This
needed about 500 reads.

D. Using a 20-byte buffer. This needed over 800,000 reads.
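
For anyone who wants to repeat the comparison, here is a minimal sketch in C of the same kind of pass over a stream file. It is not the RPG program that was timed above; it assumes the Unix-type open() and read() calls, and the names read_with_buffer and read_stats are made up for illustration. It counts how many reads a given buffer size needs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Totals for one pass over a file (hypothetical helper type). */
typedef struct {
    long reads;       /* read() calls that returned data */
    long long bytes;  /* total bytes returned */
} read_stats;

/* Read the whole file at `path` through a heap buffer of `bufsize` bytes.
   Returns 0 on success, -1 on any error. */
int read_with_buffer(const char *path, size_t bufsize, read_stats *out)
{
    assert(bufsize > 0 && out != NULL);

    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    out->reads = 0;
    out->bytes = 0;

    ssize_t n;
    while ((n = read(fd, buf, bufsize)) > 0) {
        out->reads++;
        out->bytes += n;
    }

    close(fd);
    free(buf);
    return n < 0 ? -1 : 0;
}
```

Calling it with bufsize 32768 and then 20 against the same file corresponds to options C and D. For a 16,381,440-byte file that works out to 500 reads and 819,072 reads respectively, provided every read fills the buffer.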

I ran 5 tests of each type and took the average elapsed time and CPU seconds
used.

The results were a surprise to me.

Test   Elapsed (s)   CPU (s)
A         43.4         3.0
B         20.4         2.6
C          4           2
D        210          89
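
Elapsed time and CPU time are separate measurements, which is why they can diverge so much. The following is only a sketch of how the two could be captured around one run in C, assuming POSIX clock_gettime() and the standard clock(); time_call, timing and wait_50ms are hypothetical names, and this is not how the figures above were collected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Elapsed (wall-clock) and CPU seconds spent in one call. */
typedef struct {
    double elapsed;
    double cpu;
} timing;

/* Run work(arg) once and report how long it took. */
timing time_call(void (*work)(void *), void *arg)
{
    assert(work != NULL);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    clock_t c0 = clock();

    work(arg);

    clock_t c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    timing t;
    t.elapsed = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    t.cpu = (double)(c1 - c0) / CLOCKS_PER_SEC;
    return t;
}

/* Example workload that waits without using the CPU, standing in for a
   job that is blocked on disk I/O or paging. */
void wait_50ms(void *unused)
{
    (void)unused;
    struct timespec d = {0, 50 * 1000 * 1000};
    nanosleep(&d, NULL);
}
```

A run that spends most of its time waiting, like wait_50ms here, shows a large elapsed figure next to a small CPU figure, which is the pattern in tests A and B.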

Several things come out of this.

a) I guess A and B show the relative overhead of a user space and
dynamically allocated memory. I'll be doing more ALLOCs in future.
b) I expected D to be bad and C to be much better, but I didn't expect C to
be much better than A and B. Why? Have I somehow chosen some magic buffer
size that corresponds to an internal IFS buffer size or something?

So I suppose the conclusion is: ditch the user space idea and either ALLOC
the memory or, if you believe the results above, use a simple 32K data
structure. (I'm still not sure I believe them, though I've run C in debug
and watched it all the way through to make sure it was doing all I asked.)

Can anyone explain why C is better than A and B? I wonder whether these
results would be repeated on an empty machine (mine wasn't - no such
luxury). Perhaps A and B are getting paged out while the I/O is in
progress, whereas C isn't because it's doing more CPU-intensive work
(multiple reads). The CPU usage is similar but the elapsed times are
significantly different. Any suggestions?

Pete
----- Original Message ----- 
From: "Scott Klement" <klemscot@xxxxxxxxxxxx>
To: "RPG programming on the AS400 / iSeries" <rpg400-l@xxxxxxxxxxxx>
Sent: Tuesday, June 24, 2003 8:52 AM
Subject: Re: scanning user space


>
> > I think I was one of the ones who suggested a user space. The reason I
> > did that was to minimize the number of IFS I-Os that had to be done. A
> > user space provides a bigger buffer that is available from an RPG data
> > structure. You could ALLOC the memory, but a user space is less of a
> > worry (maybe I shouldn't worry so much and have just worked with too
> > many PC C-programmers who lose sleep over memory leaks?).
>
> Why is it less of a worry?   Seems to me that if you forget to deallocate
> your ALLOC'ed memory, it'll get cleaned up when the activation group ends.
> If you forget to clean up your user space, it'll sit out there indefinitely.
>
> Aside from that, they seem like they'd be the same level of worry!
>
> > My impression/surmise is that doing a single or (if the IFS file is very
> > big and won't fit into a max-size user space entirely) a few I-Os into
> > the user space would be quicker than doing a large number of small
> > reads.
>
> Depends on how small.   Yes, reading 8k from the file at a time will be
> MUCH MUCH more efficient than reading 1 byte at a time.  But, you're
> talking about reading 200mb at a time, writing it to another disk object
> (the user space) and then reading that back again.
>
> Sure, access to the user space is faster than access to the stream file.
> But, in both cases, you have to read the entire stream file.   And then,
> with your solution, you still have to read the user space.   Will reading
> it in a 200mb block save you so much time, that when you add reading the
> user space to the equation, it will still be shorter?   That's really the
> big question.
>
> I agree that reading the file in larger chunks is faster, but my guess is
> that once the buffer gets larger than 2-4k, making it larger than that
> will not make an appreciable difference.
>
> Since his file has 2 million records, and a user space is limited to only
> 16 MB in size, it creates another problem.   I'm guessing that his average
> record size is larger than 8 bytes.  :)
>
> But, even if you could have a 200mb user space, it seems likely to me that
> the system would not load it into RAM, since the object would be too
> large.   (Though, I guess that would also depend on how much RAM you have,
> and how many other jobs are using it!)   So, the user space becomes
> another disk access, causing more slowdown.
>
> >
> > I'll try to find time to give my ideas on how to scan the user space
> > later. I won't have a chance to test this out this morning, but I'm
> > wondering whether the C-function sscanf might be useful to you. I think
> > prototyping this in RPG could be a challenge but I'm sure somebody on
> > this board is up to it! :-)
>
> I've prototyped sscanf() before.   You can't make a generic prototype
> for all uses of sscanf(), but you can make one for a specific use, like
> this, quite easily.
>
> But, I don't like sscanf()... especially when working with very large
> buffers, because of the potential for buffer overflows.   It amuses me
> that you're worried about ALLOC, but you recommend sscanf!!   sscanf and
> the other scanf functions are a MAJOR cause of security problems on the
> internet.
>
> Take this sample:
>
>         sscanf(userspace, "%[^\n]%*[\n]", buf);
>
> Since the contents of userspace are supplied by a 3rd party, how do you
> know how long the largest possible value of buf will be?   What if a
> record in the space exceeds that size?    sscanf() won't detect it, it'll
> just cheerfully keep reading the data from the space and writing it to
> whatever memory happens to follow buf.
>
> On the iSeries, this would just cause the program to crash, most likely.
> On the PC, however, you could use it to overwrite the program stack,
> inserting your own code to be run...     This is where many of the
> security flaws found in internet servers come from.
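
One way to avoid the overflow described above is to pull lines out of the buffer with explicit lengths instead of an unbounded %[^\n]. The sketch below is an illustration, not code from this thread; next_line is a hypothetical helper. A simpler variant is to keep sscanf() but put a field width in the format, such as %255[^\n] for a 256-byte buf, although that still needs the source to be NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the next line from src[0..srclen) into dst (capacity dstsize),
   always NUL-terminating dst. Returns the number of source bytes consumed
   (the line plus its '\n', if one was found). *truncated is set to 1 when
   the line was longer than dstsize - 1 and had to be cut, otherwise 0. */
size_t next_line(const char *src, size_t srclen,
                 char *dst, size_t dstsize, int *truncated)
{
    assert(src != NULL && dst != NULL && truncated != NULL && dstsize > 0);

    const char *nl = memchr(src, '\n', srclen);
    size_t linelen = nl ? (size_t)(nl - src) : srclen;
    size_t copylen = linelen < dstsize - 1 ? linelen : dstsize - 1;

    memcpy(dst, src, copylen);
    dst[copylen] = '\0';
    *truncated = (copylen < linelen);

    return nl ? linelen + 1 : linelen;
}
```

The caller advances through the space by the returned count until it reaches the end, and decides what to do with any record flagged as truncated. Nothing is ever written past dst, however long the record is.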


