Andy --

I am going to respond to your questions in installments, because I am multi-tasking, and because the questions are complex.

Part 1
-------

Your understanding of the impact of extra disk activity is correct, right up until the queueing effects are factored in.

If each of the users is pressing their enter key at regular intervals, like twice per second, and all of their transactions are uniformly small (requiring only a few disk accesses), and there are enough disk arms to keep up, then an extra disk access or two probably has a small additional price.

But if the transactions are a little "lumpy", or someone runs an interactive query and all of the disk arms are busy right now gathering up a bunch of stuff, then that one extra disk access can take a long time.

Depending on the granularity of the metrics you are using, the cost of this "lumpiness" may be hard to visualize. Averages are tricky. How many families do you know that have one and a half children?

The example system I showed has a performance problem. Adding CPU will not help. Adding memory will not help. Adding disk arms _WILL_ help. It isn't easy to prove that. But I have convinced myself with system after system. If the cost of adding disk arms is too high, the other approach that will work is to somehow reduce the quantity of disk activity.

Maybe 10 milliseconds doesn't sound like much, but with queueing it might be 40 or 50 milliseconds for each access. And with 32,000 unnecessary disk faults occurring in a 5 minute interval, there is a chance that any one transaction might be delayed by seconds or even minutes. Not all transactions will be impacted equally.

50 milliseconds is 1/20th of a second, so 20 unnecessary disk accesses that arrive before you at the disk arm that has what you need might delay your transaction by one second. 60 of them might cause a 3 second delay. The resulting response time might be aggravating to the user.
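A rough sketch of the arithmetic above, for anyone who wants to plug in their own numbers. It assumes each disk arm behaves like a simple single-server (M/M/1) queue, which is only one possible model and is not taken from the performance reports discussed in this thread. The function names are hypothetical and exist only for this illustration.

```python
# Assumption: a disk arm is modeled as an M/M/1 queue. The message does not
# name a queueing model; this is an illustration, not a measurement.

def response_time_ms(service_ms: float, utilization: float) -> float:
    """Average time per disk access (waiting plus service) under M/M/1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be at least 0 and below 1")
    return service_ms / (1.0 - utilization)


def delay_seconds(accesses_ahead: int, ms_per_access: float) -> float:
    """Delay caused by accesses already queued ahead of yours on one arm."""
    return accesses_ahead * ms_per_access / 1000.0


def faults_per_second(faults: int, interval_minutes: float) -> float:
    """Convert a fault count over an interval into a per-second rate."""
    return faults / (interval_minutes * 60.0)


if __name__ == "__main__":
    # How a 10 ms access stretches as the arm gets busier.
    for busy in (0.0, 0.5, 0.75, 0.8):
        print(f"arm {busy:.0%} busy: {response_time_ms(10.0, busy):.0f} ms per access")

    # The delay examples from the message, at 50 ms per queued access.
    for ahead in (20, 60):
        print(f"{ahead} accesses ahead: {delay_seconds(ahead, 50.0):.0f} s delay")

    # 32,000 faults in a 5 minute interval, expressed per second.
    print(f"{faults_per_second(32000, 5.0):.0f} faults per second")
```

Under that assumed model, a 10 ms access takes about 40 ms when the arm is 75% busy and about 50 ms at 80% busy, and 32,000 faults in 5 minutes works out to roughly 107 per second.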
And the batch job that normally processes 10 million chunks per hour may slow down...

More later,

-- Charly

>From: "Andy Nolen-Parkhouse" <aparkhouse@attbi.com>
>Date: Fri, 12 Jul 2002 05:22:07 -0400
>
>Charly,
>
>No, I'm afraid I don't understand. I understand the impact of disk arms
>on overall performance. I understand the impact of paging/faulting on
>overall performance. I do not understand the impact of too few disk
>arms on faulting.
>
>If you have 17 disk arms servicing thousands of interactive users, I can
>appreciate that they could be overburdened. If this is the case, then
>your performance reports or WRKDSKSTS display should indicate a level of
>activity which would justify purchasing additional arms.
>
>So while I can see the effect that faulting would have on disk activity,
>I don't see the effect of disk activity on faulting. Other than
>tinkering with expert cache, adjusting your workload, or changing your
>activity levels, what can you do about faulting/paging other than
>increase memory?
>
>If your disk activity is within acceptable limits, then the extra disk
>accesses resulting from faulting/paging will increase the response time
>for some users by the duration of those accesses. If I interpret your
>status display correctly, this is less than one fault per interactive
>transaction. That one fault could add about 10 milliseconds to the
>overall response time of the transaction. This doesn't strike me as
>extreme.
>
>I don't have the answers, but I was responding to your paragraph below,
>which seemed to imply that a shortage of disk arms leads to faulting:
>
>"Most systems I have seen recently have lots and lots of memory and it
>is being mostly wasted. I can tell because they have an automatic tuner
>moving memory around like crazy - the faulting is still high - the
>bottleneck is usually the disk resources (don't get me started on that
>topic) - the CPU is not being fully utilized - and the solution to any
>performance problem is to buy more CPU or more memory."
>
>I do not see that adding more disk arms to the system you describe would
>significantly lessen the level of paging/faulting. Nor do I think that
>the term 'thrashing' is appropriate for a system with non-database
>faults of 109/second. Thrashing usually describes a system which is
>spending more processing power moving memory than performing work; this
>doesn't apply in your situation.
>
>Regards,
>Andy Nolen-Parkhouse
>

"Nothing would please me more than being able to hire ten programmers and deluge the hobby market with good software." - Bill Gates in 1976

"We are still waiting..." - Alan Cox in 2002

"Linux is only free if your time is worthless."

Charly Jones
253 265-6244
Gig Harbor Washington USA