This is a very important technique, especially today, when modern processors have multiple cores per chip package and multiple threads per core (hyperthreading, or "simultaneous multithreading", a.k.a. SMT in IBM-speak), and machines often have multiple CPU chip packages installed.

The world is rapidly discovering that writing multi-threaded applications is difficult at best. This approach offers a very natural way to introduce a degree of parallelism into many commercial batch applications. You can also "tune" it to the number of available processors or cores simply by adjusting the number of subsets (or "chunks") of data and the total number of jobs submitted to run in parallel. You may also want to run the jobs in a subsystem that has a dedicated shared memory pool with an activity level high enough to let all of them run concurrently.
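To make the "chunks" idea concrete, here is a minimal sketch in Python (chosen only for illustration; on the system itself this logic would more likely live in CL or RPG). It assumes the input file is split into contiguous relative record number (RRN) ranges, one range per submitted job. The function name `split_rrn_ranges` and its parameters are hypothetical, not anything from the posts in this thread.

```python
def split_rrn_ranges(total_records, n_chunks):
    """Split RRNs 1..total_records into up to n_chunks contiguous ranges.

    Returns a list of (first_rrn, last_rrn) tuples, both inclusive.
    Hypothetical helper: each tuple would be passed as parameters to
    one submitted batch job.
    """
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Spread any remainder over the first few chunks so sizes differ by at most 1.
    base, extra = divmod(total_records, n_chunks)
    ranges = []
    first = 1
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer records than chunks: stop creating empty ranges
        last = first + size - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # Example: 10 records split across 3 parallel jobs.
    for first, last in split_rrn_ranges(10, 3):
        print(f"job processes RRN {first} through {last}")
```

Changing `n_chunks` is the tuning knob: it sets both the number of data subsets and the number of jobs. Note that if the file contains many deleted records, equal RRN ranges will not necessarily hold equal numbers of live records, so the chunks may need adjusting.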

Of course, you must also develop some extra code to consolidate all of the results, if needed, but that is a relatively small price to pay for such potential performance gains.
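As a rough sketch of what that consolidation step might look like, the following Python example assumes each parallel job produces a partial summary keyed by some code (for example, an account or item number) and that the summaries are purely additive totals. The `consolidate` function and the sample keys are hypothetical; summaries that are not additive (averages, for instance) would need the underlying sums and counts carried separately.

```python
from collections import Counter


def consolidate(partial_summaries):
    """Merge per-job partial totals into one combined summary.

    partial_summaries: iterable of dicts mapping a key to a numeric total,
    one dict per parallel job (hypothetical layout).
    """
    combined = Counter()
    for partial in partial_summaries:
        # Counter.update adds values for matching keys instead of replacing them.
        combined.update(partial)
    return dict(combined)


if __name__ == "__main__":
    job_results = [
        {"A": 5, "B": 2},  # totals from job 1
        {"A": 1, "C": 7},  # totals from job 2
    ]
    print(consolidate(job_results))
```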

> On 6/4/2010 6:29 PM, Lennon_s_j@xxxxxxxxxxx wrote:
Yes, I've done that too with good results, but in my case I was reading
a significantly large transaction file and updating many summary files.
Splitting up the transaction file largely by RRN and processing
between 7 and 10 in parallel significantly reduced the elapsed time, but
did drive the CPU hard.

Sam

On 6/4/2010 5:54 PM, Peter Connell wrote:
Kurt,
While there are undoubtedly horses for courses, I have been given data mining tasks over the last year where 10 million or more records drive the process, each of which may itself generate scores of other reads. I've found that submitting up to 10 simultaneous jobs, each of which accepts parameters specifying which portion of the input file drives it, has yielded exceptional performance. This does drive CPU right up but permits huge volumes to be processed overnight.

Peter


This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
