Hi Sas,

> suppose we have one million records that should be read and it will
> take ex. 5 minutes, so if we want the job to be done in 1 (one)
> minute, we should run pgm1 (to read records 1-200), pgm2 (to read
> records 201-400) and so on until pgm5 (to read 801-1000), and all the
> programs should run at the same time.

I have experience with this, and I understand that your 5 minutes is only an example.

You need to be very careful about how you partition your data. You have to be certain that the final output does not depend on results from records in any other partition. For instance, if you are calculating sales by week and records 1-300 fall into week one, but you partition the data so that pgm1 reads only records 1-200, then pgm1 will not see all the records for week one. If pgm1 prints a report indicating, say, low sales, that report is wrong, because pgm1 never saw all the sales for week one.

The actual process of running 5 separate jobs is simple, but again you need to be concerned about the output. If you expect a report of total sales, you will have 5 subtotals that you need to add together. Normally this means that you will create a summary work file that another program will print or update from.

The actual runtime will probably not be 5 times better unless you have a multiprocessor machine and run the jobs in a pool large enough for all 5 jobs to run without thrashing. This depends heavily on the nature of the tasks you intend to run in parallel; you may be better off running the separate jobs in separate pools, or in one large one. The overhead of calculating the partitioning, dispatching, starting and running several jobs, plus the cost of the final summariser at the end, can't really be calculated in advance.

The general technique I have used is to run a crude data analyser over the input file. For instance, if I am interested in sales by week, I might SQL the file GROUP BY week to get a list of the weeks in the file. That list is then fed into a scheduler, which submits the parallel jobs with the appropriate parameter. Each of those jobs does its required processing and also writes to the summary file. Finally, after all of the parallel jobs are complete, the summary job runs and accumulates the individual summary records. There is no automatic iSeries mechanism that I know of to do any of these tasks; you will need to write all the code yourself. In outline:

    sbmjob monitor jobq(a)
      select week, count(*) from input group by week order by 1
      begin loop
        fetch week...
        sbmjob parallel parm(week) jobq(b)
      repeat loop until end of records

    job parallel
      call clpgm week
        ovrdbf...
        call rpgpgm week
          either setll/reade on key week, or do other processing that
            will limit this execution to the record range of interest
          update summary record
        call tellMonitorThisWeekIsDone week

    job monitor
      begin loop
        rcvmsg week
        update dbMonitor that week is done
        check all weeks for completion
          Complete?     sbmjob summary jobq(a); terminate
          Not complete? repeat loop

This is all from memory and you will definitely need to revise it for your situation. I use data queues to send messages between processes; the rough CL sketches below show one way the pieces might fit together.
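First, the scheduler. This is only a sketch, and every object name in it (MYLIB, the WEEKS work file, the PARALLEL program, the WEEK field) is made up for illustration. It assumes the WEEKS work file has already been built from the GROUP BY query, with one character WEEK value per record, and remember that DCLF needs that file to exist when the CL program is compiled.

    /* Scheduler sketch: submit one parallel job per week.           */
    /* MYLIB/WEEKS holds one record per week in a character field    */
    /* WEEK, built beforehand from the GROUP BY query.               */
                 PGM
                 DCLF       FILE(MYLIB/WEEKS)
     READNEXT:   RCVF
                 MONMSG     MSGID(CPF0864) EXEC(GOTO CMDLBL(DONE)) /* end of file */
                 SBMJOB     CMD(CALL PGM(MYLIB/PARALLEL) PARM(&WEEK)) +
                              JOB(PARALLEL) JOBQ(MYLIB/B)
                 GOTO       CMDLBL(READNEXT)
     DONE:       ENDPGM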
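Each parallel job could be driven by a CL program along these lines. Again, the names are hypothetical; the RPG program is assumed to do the setll/reade (or whatever limits it to its range) and update its own summary record, and the WEEKDONE data queue is assumed to exist already, e.g. CRTDTAQ DTAQ(MYLIB/WEEKDONE) MAXLEN(6). QSNDDTAQ is the IBM-supplied API for writing an entry to a data queue.

    /* One parallel slice: process a single week, then tell the monitor */
                 PGM        PARM(&WEEK)
                 DCL        VAR(&WEEK) TYPE(*CHAR) LEN(6)
                 DCL        VAR(&LEN)  TYPE(*DEC)  LEN(5 0) VALUE(6)
                 /* point the RPG program's file at the shared input */
                 OVRDBF     FILE(INPUT) TOFILE(MYLIB/INPUT) SHARE(*NO)
                 /* RPGPGM does setll/reade on the week key and       */
                 /* updates its own summary record                    */
                 CALL       PGM(MYLIB/RPGPGM) PARM(&WEEK)
                 /* tell the monitor this week is done */
                 CALL       PGM(QSNDDTAQ) PARM('WEEKDONE' 'MYLIB' &LEN &WEEK)
                 DLTOVR     FILE(INPUT)
                 ENDPGM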
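And the monitor's waiting phase could be a loop around QRCVDTAQ, the receive-from-data-queue API; a wait time of -1 means wait forever. In this sketch I simply count the replies against the number of weeks passed in as a parameter; in real life you would more likely record each week in the dbMonitor control file as in the outline above. Again, all names are illustrative.

    /* Monitor sketch: wait for each week to report in, then summarise */
                 PGM        PARM(&NBRWEEKS)
                 DCL        VAR(&NBRWEEKS) TYPE(*DEC)  LEN(15 5) /* 15,5 so a +
                              numeric literal on CALL maps cleanly */
                 DCL        VAR(&DONE)     TYPE(*DEC)  LEN(5 0) VALUE(0)
                 DCL        VAR(&LEN)      TYPE(*DEC)  LEN(5 0)
                 DCL        VAR(&WAIT)     TYPE(*DEC)  LEN(5 0) VALUE(-1)
                 DCL        VAR(&WEEK)     TYPE(*CHAR) LEN(6)
     WAITLOOP:   CALL       PGM(QRCVDTAQ) PARM('WEEKDONE' 'MYLIB' &LEN &WEEK &WAIT)
                 /* record in a control file that &WEEK is done if you */
                 /* want an audit trail (the dbMonitor step above)     */
                 CHGVAR     VAR(&DONE) VALUE(&DONE + 1)
                 IF         COND(&DONE *LT &NBRWEEKS) THEN(GOTO CMDLBL(WAITLOOP))
                 /* every week has reported back; run the summary job  */
                 SBMJOB     CMD(CALL PGM(MYLIB/SUMMARY)) JOB(SUMMARY) JOBQ(MYLIB/A)
                 ENDPGM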
Be cautious when thinking about this. There isn't an easy way that I know of to model the performance behaviour of a system like this in advance, so you will probably be forced to actually design and run the parallel model in order to compare it with the single-job process you already have in place. Perhaps there is a better way to approach your performance issues.

Run Performance Tools or PEX (depending on the version of OS/400 you're on) and collect actual statistics on the current process. With that data you will probably find the place that is taking most of your time, whether it is disk access or even a calculation loop. I strongly suggest you do this before thinking about a parallel run.

--buck
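PS: from memory, a simple PEX statistics collection looks roughly like the commands below. The session and definition names are made up, and you should check the exact parameters on your release; on older releases you would use the Performance Monitor and the Performance Tools reports instead.

    /* Collect PEX statistics over the current single-job run        */
    ADDPEXDFN  DFN(SALESDFN) TYPE(*STATS) JOB(*ALL)
    STRPEX     SSNID(SALESRUN) DFN(SALESDFN)
    /* ...run the existing process while the collection is active... */
    ENDPEX     SSNID(SALESRUN)
    PRTPEXRPT  MBR(SALESRUN) TYPE(*STATS)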