RE: Socket Listener Delay Issue -- MIDRANGE-L

Here is some additional information:

We verified that there is no difference between the received packets of good
transmissions and those of delayed transmissions. The network is solid: no
TCP/IP retransmissions or lost packets were detected. The only retransmission
is when the remote device opens a new socket because it did not get its
expected response within 10 seconds.
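
For anyone who wants to pull the delay periods out of the logs rather than eyeball them, here is a rough Python sketch. It assumes the log can be reduced to (request time, response time) pairs, which is a made-up format and not what our logs actually look like. The function names and the 60 second grouping gap are hypothetical as well; only the 10 second timeout comes from the behaviour described above.

```python
from datetime import datetime, timedelta

# From the post: the remote device opens a new socket after 10 s without a response.
TIMEOUT = timedelta(seconds=10)


def delayed(pairs):
    """Return sorted request times whose response took TIMEOUT or longer.

    `pairs` is an iterable of (request_time, response_time) tuples;
    response_time may be None when no response was ever logged.
    """
    return sorted(
        req for req, resp in pairs
        if resp is None or resp - req >= TIMEOUT
    )


def group_episodes(times, max_gap=timedelta(seconds=60)):
    """Group sorted timestamps into episodes.

    A gap larger than `max_gap` (an arbitrary choice) starts a new episode.
    Returns (first_time, last_time, count) for each episode.
    """
    episodes = []
    for t in times:
        if episodes and t - episodes[-1][-1] <= max_gap:
            episodes[-1].append(t)
        else:
            episodes.append([t])
    return [(e[0], e[-1], len(e)) for e in episodes]
```

Counting the episodes per day would give a figure to compare before and after a configuration change.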

The system is a Power 570 with 8 processors (7 are activated) and 128 GB of
memory, separated into 3 LPARs.

The primary LPAR has 4.7 processors (Min 2, Max 6) and 68 GB (Min 1, Max 90)
of memory.

The delay periods happen roughly 28 times a day. They are hard to capture
while they are happening, since one usually lasts for 2 minutes and I can only
tell when it is happening by viewing the log files.

Immediately after one of the "delay" periods the WRKSYSSTS screen had:

%CPU: 92.5
Jobs in system: 54770
Elapsed time: 00:00:11 (the 11 seconds after a major delay event, not
during it)
Wait-Inel and Act-Inel: .0
Although the above was captured after the event, I have never seen any value
other than .0 for the "Inel" values on this system.

The affected subsystem is currently running in the *BASE pool, which had:
DB Faults: 23.3 Pages: 3925
Non-DB Faults: 184.4 Pages: 722.9

On one occasion, NETSTAT *CNN showed connections that had been established for
10 or more seconds but had no job associated with them yet.
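
One possible reading of that observation, offered as general TCP behaviour and not as anything specific to IBM i or to our listener: the TCP stack completes the handshake for a listening socket on its own, so a connection can be established before the application has accepted it and handed it to a job. The Python sketch below shows this on the loopback interface. The function name is hypothetical, and whether this is what NETSTAT was actually showing is an assumption.

```python
import socket


def connect_without_accept(backlog: int = 5, timeout: float = 2.0) -> bool:
    """Return True if a client can connect and send to a listener
    that never calls accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(backlog)          # connections now queue in the backlog
    host, port = listener.getsockname()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # The handshake is completed by the TCP stack; accept() is never called.
        client.connect((host, port))
        # The data is buffered by the stack until the application reads it.
        client.sendall(b"hello")
        return True
    except OSError:
        return False
    finally:
        client.close()
        listener.close()
```

If that is what is happening, the established-but-unowned connections would point at the listener job being slow to accept or dispatch, not at the network.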

We will be configuring the dedicated memory pool and making it active
Wednesday AM during the scheduled maintenance downtime.

I'll post the results when known.

Again, my thanks to all who offered suggestions and solutions.
Any and all suggestions are welcomed.

Rich

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx
[mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of DrFranken
Sent: Thursday, August 08, 2013 7:46 PM
To: Midrange Systems Technical Discussion
Subject: Re: Socket Listener Delay Issue

On the System/38 you had to shut down the subsystem to make changes.
Today we can add pools and change routing steps on the fly. I certainly
appreciate the desire to test things though!!

You mention CPU over 100%, but depending on the server that may be completely
acceptable. Is this server partitioned into multiple LPARs?
If it is, then with dynamic processor allocation and shared processors the
CPU could go as high as 1000%. In that case 100% is a very low number and
equates to only 10%, which isn't busy at all! So to understand whether 100% is
high or not we'll need to know the allocation of the CPU in the profile. If
the system is not partitioned then 100% is 100% and that means it IS very
busy at that time!
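
The arithmetic above can be written out as a small Python sketch. It assumes the reported figure and the ceiling are both percentages on the same scale, and that the ceiling has already been worked out from the partition profile; the function name is made up for illustration.

```python
def share_of_ceiling(reported_pct: float, ceiling_pct: float) -> float:
    """Express a reported CPU percentage as a share of the highest
    percentage the partition could reach (the ceiling)."""
    if ceiling_pct <= 0:
        raise ValueError("ceiling_pct must be positive")
    return 100.0 * reported_pct / ceiling_pct


# Partitioned case from the text: 100% reported against a 1000% ceiling.
partitioned = share_of_ceiling(100.0, 1000.0)    # 10.0

# Unpartitioned case: the ceiling is 100%, so 100% really is 100%.
unpartitioned = share_of_ceiling(100.0, 100.0)   # 100.0
```

The ceiling itself has to come from the CPU allocation in the profile, which is why that information is needed before the reported number can be judged.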

Potentially more important, though, are the faulting statistics as well as the
transition data.

FYI I started on a S/34 (and still own it, it's truly a workbench now) but I
have accepted the modernization of the platform! The last name change was
more than 5 years ago - time flies when you're having fun!
Those who insist on staying in the past "may be a 5250" (
http://www.frankeni.com/drsblog00fh.html )

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com