On 29-May-2014 10:13 -0500, Wilson Jonathan wrote:
I personally would process the files differently... ignore the
times, and instead "move" a file once processed, or as the first
stage of the process, (and/or change its time on the I via the I)
from a holding folder, to a processed folder or rename the file to a
"I've done this one" naming structure. (my fave way was to rename
files from ".txt" to ".txt.processing" and finally ".txt.done" which
meant they were self documenting which was a godsend if something

One obvious problem would be if multiple jobs might process the
file(s) and need some kind of "I'm doing this one, so you can't" kind
of "locks" for want of a better term, although the same problem holds
true if using date/time changes.

I had a process once that dealt with the concurrency issue by effectively the following algorithm:

1. rename filename to filename.interrupted
2. process file data [from renamed file]
3. rename filename.interrupted to filename.completed
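
For illustration only, the three steps might be sketched in Python roughly as follows. The original program was not written in Python; the names process_file and handle_data are hypothetical, and the sketch assumes that a failed rename (source missing or in use) means another job already has the file.

```python
import os

SUFFIX_INTERRUPTED = ".interrupted"
SUFFIX_COMPLETED = ".completed"


def process_file(path, handle_data):
    """Claim, process, and mark one file; return True if this call did the work."""
    if path.endswith(SUFFIX_INTERRUPTED):
        # Recovery case: the file already carries the claimed name,
        # so no rename is attempted here (an assumption of this sketch).
        claimed = path
    else:
        claimed = path + SUFFIX_INTERRUPTED
        try:
            # Step 1: the rename doubles as the claim on the file.
            os.rename(path, claimed)
        except OSError:
            # Missing or in use: presumably another job has it.
            return False
    # Step 2: process the data from the renamed file.
    with open(claimed, "rb") as f:
        handle_data(f.read())
    # Step 3: mark the file as done.
    base = claimed[: -len(SUFFIX_INTERRUPTED)]
    os.rename(claimed, base + SUFFIX_COMPLETED)
    return True
```

If the job dies during step 2, the file is simply left with the .interrupted name, which is the whole point of the naming.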

If any processes were ever /interrupted/, the file names made that quite clear. The files remained in the same directory as any new files requiring the processing, and the recovery targeted only those files where prior processing had been interrupted. IIRC a rename to the same name completed without error, so the same program could be invoked for both new files and the recovery without any specific handling for a duplicate name; the recovery was activated by a script that listed the interrupted files and then invoked that same program for each file in the list. An in-use condition on the first rename indicated, presumably, that another job was either creating or renaming the file.

I do recall someone commenting that the extension /inprogress/ was more to their liking, but I suggested that it was no more accurate than /interrupted/. Given how briefly the "process file data" step was actually in progress, compared to the much longer period that an unprocessed file would sit awaiting recovery, /interrupted/ was more often [and for much longer] the accurate descriptor.
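
The recovery script mentioned above, the one that lists the interrupted files and invokes the same program for each, could look something like this minimal Python sketch. The function name recover_interrupted and the program parameter are hypothetical stand-ins; the original was a script on the system in question, not Python.

```python
import glob
import os


def recover_interrupted(directory, program):
    """Invoke program on each *.interrupted file in directory; return the list."""
    # Escape the directory so only the trailing "*" acts as a wildcard.
    pattern = os.path.join(glob.escape(directory), "*.interrupted")
    interrupted = sorted(glob.glob(pattern))
    for path in interrupted:
        # Same program as for new files; it is handed the interrupted name.
        program(path)
    return interrupted
```

Files that are new or already completed never match the pattern, so the recovery touches only the leftovers.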
