> -----Original Message-----
> From: Joe Pluta [mailto:joepluta@xxxxxxxxxxxxxxxxx]
> Sent: Monday, July 05, 2004 7:44 PM
> To: 'Java Programming on and around the iSeries / AS400'
> Subject: RE: framework question
>
> Ah, this is indeed a good example, Charles. It's probably the ONLY reason for using commitment control - as a multiple-record lock. But let's take a closer look at your examples:
>
> For instance, you don't REALLY want to commit transactions over multiple machines, do you? That means that if one machine goes down for any reason, the other machine can't process. This seems to defeat the purpose.
>
> Or how about committing across storage pools that go offline? Same bad decision. These should be queued asynchronous updates, and thus cannot be committed.

Not quite, Joe. IF and only IF it makes sense for you to be able to update and continue on the local machine when the remote machine is down, then sure, you don't need to commit across the machines. With separate and distinct processes going on, batch updates of the remote systems make sense. But on the other hand, if you can't allow the local machine to be out of sync with the remotes, then CC (or two-phase CC, since we're talking local-remote) makes sense. Sure, a programmer could build some of this himself, but why waste time writing what's already provided by the DB? (There's a quick JDBC sketch of what I mean a little further down.)

> Okay, how about the record deletion issue? What if someone deletes the record I was going to update? Well, that means I blew an application lock somewhere. This is the crux of my argument, in fact. Any set of related records that need to be updated ought to have a single record that is at their root. Now this may not always be the case; let's leave that case, shall we?
>
> So, we agree that for this discussion, you can lock that single record. And now, there is no reasonable way someone can update records in the set unless they go outside the boundaries of my application. You say "but what about ODBC access"? And now you know why I hate ODBC, or any other sort of unmanaged access to the database. By enforcing proper application design (which to me involves wrapping database files in servers) then there's no need for commitment control to prevent destructive co-access.

I agree that in the right environments, proper application design mitigates much of the need for CC. Indeed, the right environments can be found historically in iSeries shops, which is why so few use CC. But nobody I know works in those environments anymore. I'd love to be able to control all aspects of accessing the data on the iSeries; I can try by providing stored procedures, etc., and working with the PC people. But if I'm too forceful about the right way to access data on the 400, then the data ends up moving to SQL Server.

Interestingly enough, I'm pretty sure CC has been available since V1R1 of OS/400, if not the System/38, and perhaps even the System/36 and prior. This is not a "new" idea. Some environments have always needed it.

Let's take a closer look at "proper application design". Historically, on the System/38 and AS/400, an application consisted of a set of programs plus the files used by that application, looked upon as a single entity. However, nowadays we know that set of files is special. The files, and more importantly the relationships between them, make them a relational database.
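Here's the kind of thing I mean by "already provided by the DB", in plain JDBC for the Java folks on the list. Everything in it (system name, library, file and column names, credentials) is invented for the example, so treat it as a sketch and nothing more:

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PostOrder {

    // One unit of work: the order header and the A/R balance either both
    // change or neither does.  The DB does the bookkeeping, not my code.
    public static void postOrder(String orderNum, BigDecimal amount)
            throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:as400://MYSYSTEM;libraries=MYLIB", "user", "password");
        try {
            conn.setAutoCommit(false);   // start the unit of work

            PreparedStatement hdr = conn.prepareStatement(
                    "UPDATE ORDHDR SET STATUS = 'P' WHERE ORDNUM = ?");
            hdr.setString(1, orderNum);
            hdr.executeUpdate();

            PreparedStatement ar = conn.prepareStatement(
                    "UPDATE ARMAST SET BALANCE = BALANCE + ? WHERE CUSTNO = "
                    + "(SELECT CUSTNO FROM ORDHDR WHERE ORDNUM = ?)");
            ar.setBigDecimal(1, amount);
            ar.setString(2, orderNum);
            ar.executeUpdate();

            conn.commit();               // both updates land together
        } catch (SQLException e) {
            conn.rollback();             // or neither of them does
            throw e;
        } finally {
            conn.close();
        }
    }
}

The point is just that the locking and the all-or-nothing behaviour come from the database; the application only decides where the unit of work starts and ends. The local-remote case layers two-phase commit on top of the same idea, and the DB handles that too.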
By taking advantage of the features provided by all modern RDBMSs, including the iSeries, we can just about guarantee the integrity of the RDB our application system is built on, even if we can't control all the applications that will eventually be used to access it. As an added benefit, by letting the RDBMS handle the integrity of the DB, my application code is simpler to write, once I understand how to take advantage of it.

> So that leaves those situations where you don't have a record to lock. Well, when you think about it, what exactly are those situations? Mass price updates? File reorgs? In most of the cases I can think of, the lock at this point should be on the file itself.
>
> So, once again I fail to find an absolute necessity for commitment control. Granted the multiple record lock looks at first glance like it requires some form of CC, but upon further review it seems that commitment control is really more of a false sense of security than anything.
>
> Especially given the cost. We've already discussed the large amounts of disk overhead associated with journaling and commitment control. There is one other HUGE issue.

What's large? I've got 12 35GB disks, of which 2 are used to hold journal receivers. That's less than 20%, and I've got journal receivers on there from 5 months ago, which is considerably longer than usual. I could easily use just 1 disk. For resiliency, I'll probably end up configuring them as a mirrored pair. The journal performance Redbook talks about a customer whose EOD batch job results in 16 million journal entries and uses 15GB of DASD. That's huge to me, but not a big chunk of the 1.9TB they had configured.

> It's synchronous. That is, as far as I know, only one transaction can be occurring on a file at a time. You have in effect single-threaded your entire system around the database, especially as you open up your commitment control to those multiple-file transactions you're pointing to as your example. If, for example, every transaction that hits a master file record has to lock it, that means that your most popular item in your shop can only handle one order at a time.

That is not correct. Commitment control doesn't prevent multiple processes, each with its own transaction, from updating the same files at the same time; it depends on the isolation level you are using. (There's a small two-job sketch of this further down.) Can multiple transactions affect the same record at the same time? Of course not. But you can't do that without CC either. It doesn't mean the most popular item in your shop can only handle one order at a time; it only means that only one process can update the on-hand of a given item at a time. It's a matter of application and process design. See below.

> That also begs the design question: when you update an order, do you lock the item records, too? Because when you write the item, you should be updating the allocation quantities on the inventory records. You're also updating A/R totals, which means customer records may need to be locked. Where exactly do you draw the line at what constitutes a "transaction"?

A very good question. The answer is: it depends. ;-) That's something that needs to be considered during the design of the system. It's not a trivial answer to come up with, either, which is why lazy designers don't bother with it. That's also why it can be difficult to retrofit an existing application.
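Here's that two-job sketch I mentioned, since it's easier to show than to describe. Both connections run under commitment control against the same file; the names, SQL, and connection details are all made up, so this is only meant to illustrate the locking behaviour, not to be run as-is:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TwoJobs {
    public static void main(String[] args) throws Exception {
        // Think of these as two interactive jobs, each with its own transaction.
        Connection jobA = DriverManager.getConnection(
                "jdbc:as400://MYSYSTEM;libraries=MYLIB", "user", "password");
        Connection jobB = DriverManager.getConnection(
                "jdbc:as400://MYSYSTEM;libraries=MYLIB", "user", "password");

        jobA.setAutoCommit(false);
        jobB.setAutoCommit(false);
        jobA.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        jobB.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

        Statement a = jobA.createStatement();
        Statement b = jobB.createStatement();

        // Both transactions hit the same file at the same time.  They only
        // lock the rows they actually touch, so neither one waits on the other.
        a.executeUpdate("UPDATE ITEMMAST SET ONHAND = ONHAND - 1 WHERE ITEMNO = 'A100'");
        b.executeUpdate("UPDATE ITEMMAST SET ONHAND = ONHAND - 5 WHERE ITEMNO = 'B200'");

        // If job B tried to update 'A100' here instead, it would sit on a
        // record lock until job A commits (which is exactly what plain record
        // locks would give you, too).

        jobA.commit();
        jobB.commit();
        jobA.close();
        jobB.close();
    }
}

As I understand it, tighter isolation levels lock more (the rows you read as well as the ones you update), but even then it's record locking, not one transaction per file.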
In your example above you say "when you write the item, you should be updating the allocation quantities on the inventory records." Should you really? Disregarding the CC transaction requirements, allocating inventory at order entry can lead to problems if the order is suspended or just never completed. Not hard to deal with, but again something that relates to the design.

Let's say our application instead allocates inventory when the order is completed. Now you've got two separate transactions. One starts when you begin to enter an order and ends when the order is completed. The other starts when the order is completed and ends after all the necessary files are updated.

Let's consider your most-popular-item example from above. Really, it's not the single most popular item that is the problem; it's the N most popular items. Specifically, what to do when multiple orders have sets of items whose intersection contains N items, with N >= 2. In other words, we're looking at a situation where a deadlock could occur if you are doing CompleteOrder() from multiple processes. How do we handle it?

1) Catch the deadlock error, do a rollback, maybe a random wait, and try again.

2) Don't use multiple processes. Have CompleteOrder() pass the order number to a server job that runs in the background to update the files. Single threaded? Yep, but each transaction would be very fast. Besides, it's not any different than wrapping the DB files in server jobs.

3) Add some intelligence to CompleteOrder(): process the order detail in item# sequence. You'd never get a deadlock then. Worst case, when all orders have the same set of items, you'd still get them all done as fast as you would in a single-process model. (There's a rough JDBC sketch of 1 and 3 combined below my sig.)

> Questions like this have some serious ramifications, especially if you don't design your system to be as fast as possible. In fact, in order to make sure you don't lock your database any longer than necessary, you'll probably have to create shadow records of the transactions so that you can apply them in the background. In fact, you may find yourself doing what SAP does, which is basically chugging all database updates into a batch queue.
>
> And once you've done that, you have only one program updating each file. And it does it in batch. And guess what?
>
> It no longer needs commitment control.

Unless the system abends from external or internal causes. Does it happen often? Nope. But it does happen, and it can be a real pain to recover from. If your data is critical, then your data is critical.

Charles
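P.S. Here's that sketch of options 1 and 3 together. The connection details, file and column names are invented, and the SQLSTATEs I'm checking for a deadlock or record-lock timeout (SQL0911/SQL0913) are from memory, so verify all of it before trusting it with anything real:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Random;

public class CompleteOrder {

    private static final Random RANDOM = new Random();

    public static void completeOrder(String orderNum) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:as400://MYSYSTEM;libraries=MYLIB", "user", "password");
        try {
            conn.setAutoCommit(false);
            for (int attempt = 1; ; attempt++) {
                try {
                    allocateInItemOrder(conn, orderNum);
                    conn.commit();
                    return;                              // done
                } catch (SQLException e) {
                    conn.rollback();                     // option 1: back it all out...
                    if (!isLockConflict(e) || attempt >= 5) {
                        throw e;
                    }
                    Thread.sleep(RANDOM.nextInt(500));   // ...wait a random bit, try again
                }
            }
        } finally {
            conn.close();
        }
    }

    // Option 3: touch the item records in item# sequence, so two orders that
    // share items always lock them in the same order and can never deadlock.
    private static void allocateInItemOrder(Connection conn, String orderNum)
            throws SQLException {
        PreparedStatement detail = conn.prepareStatement(
                "SELECT ITEMNO, QTYORD FROM ORDDTL WHERE ORDNUM = ? ORDER BY ITEMNO");
        detail.setString(1, orderNum);
        ResultSet rs = detail.executeQuery();
        PreparedStatement alloc = conn.prepareStatement(
                "UPDATE ITEMMAST SET ONHAND = ONHAND - ? WHERE ITEMNO = ?");
        while (rs.next()) {
            alloc.setInt(1, rs.getInt("QTYORD"));
            alloc.setString(2, rs.getString("ITEMNO"));
            alloc.executeUpdate();
        }
        PreparedStatement status = conn.prepareStatement(
                "UPDATE ORDHDR SET STATUS = 'C' WHERE ORDNUM = ?");
        status.setString(1, orderNum);
        status.executeUpdate();
    }

    // SQL0911 and SQL0913 surface as these SQLSTATEs, as far as I recall.
    private static boolean isLockConflict(SQLException e) {
        String state = e.getSQLState();
        return "40001".equals(state) || "57033".equals(state);
    }
}

The ORDER BY ITEMNO is what makes option 3 work: two orders that share items always lock them in the same sequence, so they can block each other briefly but never deadlock. The retry loop is just belt and suspenders for anything I didn't think of.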