  • Subject: Re: Change Management Software - Now sleuthing
  • From: John Earl <johnearl@xxxxxxxxxxxxxxx>
  • Date: Fri, 25 Feb 2000 12:33:44 -0800
  • Organization: The PowerTech Group

Ron,

Ron Hawkins wrote:
> 
> Well, everytime I get a question about response, I do a work active job. 
>Guess what tops the list of cpu percentage? And if I have two programmers 
>promoting source at the same time, each gets around 20% of cpu. That's 40% for 
>two, 60% for three... You get the idea (we have over 30 programmers.) When I 
>have my programming manager call for support on the issue, he gets nowhere. 
>They finally helped some (after three weeks) on a wise desk resource problem 
>that was taking over 80% of cpu any time a user filtered their call list. I 
>would definitely call this a resource pig.<

Like some other respondents, I'm surprised to hear complaints about
SoftLanding's customer service.  I've always considered those folks to
be some of the most customer service oriented people in the business.

But what I really wanted to respond to is the description of 20% of
CPU being a performance problem.  My experience is that the CPU% used
field in the WRKACTJOB command is one of the _least_ accurate methods
of performance measurements.  First off, "CPU percentage used" is only
significant when your box is running at 90%+ and/or experiencing high
numbers of faults.  If you have one job chewing up 55% of CPU, and
your whole box is only running at 63% total percentage, it's not a
performance problem.  You've still got 37% of CPU available for anyone
else who wants it, and the fact that one job is using 55% means that
you're getting high utility of available resources from that job.
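
To put numbers on that, here is a rough Python sketch of the rule of
thumb. The function names and the 90% cutoff parameter are made up for
illustration (the cutoff comes from the "90%+" figure above), and the
sketch deliberately ignores fault rates, which you would also want to
look at on a real system.

```python
def cpu_headroom(total_pct):
    """CPU percentage still available to other jobs."""
    return 100.0 - total_pct


def worth_investigating(total_pct, busy_threshold=90.0):
    """Hypothetical rule of thumb: a single job's CPU% only starts to
    matter once the whole box is near saturation."""
    return total_pct >= busy_threshold


# The example from the text: one job at 55%, the whole box at 63%.
headroom = cpu_headroom(63.0)          # 37.0 percent still free
flagged = worth_investigating(63.0)    # False, the box is not saturated
```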

Another common mistake people make with WRKACTJOB and CPU % used is
that the window of time used for the averages is either too small or
too large to be statistically meaningful.  If a job has used 20% of the
CPU over the last 7 seconds, is that a problem or a statistical
anomaly?  A given job may have a burst of activity over 5 or 10
seconds and then cruise on low power for the next 10 minutes.  If you
happen to catch it at just the right time, you'll be fooled into
thinking that this job is a problem.  On the other hand, if your time
frame is too large (say several hours), the real resource hog can
easily be hidden among long latency times.  You may never find it
because it uses 100% for 3 minutes at a time, and then is idle for the
next 57 minutes.   WRKACTJOB would report this as 5% CPU utilization
over the last 4 hours, but you've still got a performance nightmare
for 3 minutes of every hour.
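
The arithmetic behind that 5% figure can be sketched in a few lines of
Python. This is only an illustration: the one-sample-per-minute layout
and the function name are assumptions, and it says nothing about how
WRKACTJOB computes its figures internally.

```python
def average_cpu(samples):
    """Mean CPU% over a window of equally spaced samples."""
    return sum(samples) / len(samples)


# Hypothetical job: one sample per minute, 100% for 3 minutes,
# then idle for the remaining 57 minutes of each hour.
one_hour = [100.0] * 3 + [0.0] * 57
four_hours = one_hour * 4

long_window = average_cpu(four_hours)     # 5.0: looks harmless
short_window = average_cpu(one_hour[:3])  # 100.0: caught mid-burst
peak = max(four_hours)                    # 100.0: the real problem
```

Same job, same data: the answer depends entirely on which window you
happen to average over.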

And finally, I can't count the number of times that someone has used
WRKACTJOB to mis-identify the culprit.  The typical scenario goes like
this:  Job 'A' goes wild and starts chewing up the CPU.  Job 'B' is
essentially locked out while 'A' hogs the machine.  The system
administrator signs on to terminal 'F' and does a WRKACTJOB... but
because 'A' has the system tied up, job 'F' waits several seconds
(or minutes) for the WRKACTJOB command to complete.  Here comes the
switcheroo.  Job 'A' completes, and because job 'B' is next in line,
it kicks in and starts working real hard (driving 'B's CPU% up).  Now
job 'F' gets some cycles and takes its WRKACTJOB snapshot.  But
because 'A' is already complete, and 'B' is working hard to catch up,
'B' gets falsely accused of being the culprit.

WRKACTJOB can be used to tell you if a job is abusing the system
(through CPU % + CPU seconds used + I/O + elapsed time + some
additional experience with system tuning), but if you're just looking
at CPU%, you're not getting the straight scoop.  In a case like this
there is no substitute for good performance measurement tools such as
the Performance Tools LPP (PT1), Best/1, or PEX.

So Ron, I'm not saying that you aren't seeing a resource hog, I'm just
ranting about the common practice of using CPU% alone to identify one.

jte



--
John Earl                                          
johnearl@powertechgroup.com
The PowerTech Group                        206-575-0711
PowerLock Network Security              www.400security.com
The 400 School                                www.400school.com
--
+---
| This is the Midrange System Mailing List!
| To submit a new message, send your mail to MIDRANGE-L@midrange.com.
| To subscribe to this list send email to MIDRANGE-L-SUB@midrange.com.
| To unsubscribe from this list send email to MIDRANGE-L-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---
