Does it make it more efficient by making
memory contiguous, similar to how an HD defrag works?
An interesting analogy; I'd never thought of it that way, but it's not that
far off. There are several advantages to GC, and I'd recommend some
googling, but the ones that come to mind (from the .net GC, since it's the
one I'm familiar with) are:
Allocations are faster -- since the GC knows it's going to clean up the
heap(s) later anyway, it doesn't need the traditional list of free memory
areas from which to satisfy requests. Traditionally a heap allocator ends
up with holes in memory, because blocks are allocated and released at
different times; the allocator keeps track of these holes and uses them
to fulfill heap requests later. None of that is needed in a GC environment,
since everything will be compacted anyway, so an allocation is just the
advancement of a pointer.
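To make "advancement of a pointer" concrete, here's a tiny bump-allocator
sketch in C. This is only the idea, not the actual .net allocator: an
allocation is a bounds check plus a pointer increment, with no free list
to search.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define HEAP_SIZE (1 << 20)

static uint8_t heap[HEAP_SIZE];
static size_t  next_free = 0;            /* the "allocation pointer" */

/* Returns NULL when the space is exhausted; a real GC would
 * collect and compact at that point instead of failing. */
static void *bump_alloc(size_t size)
{
    size = (size + 7) & ~(size_t)7;      /* keep 8-byte alignment */
    if (next_free + size > HEAP_SIZE)
        return NULL;                     /* would trigger a GC    */
    void *p = &heap[next_free];
    next_free += size;                   /* that's the whole allocation path */
    return p;
}

int main(void)
{
    int *a = bump_alloc(sizeof *a);
    int *b = bump_alloc(sizeof *b);
    printf("a=%p b=%p (adjacent, no free-list search)\n", (void *)a, (void *)b);
    return 0;
}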
Page fault advantages -- All modern systems have some degree of virtual
memory. Under memory pressure (and who has enough memory that there's no
pressure?), "memory" will be paged out to disk and then faulted back in
when needed. This is evident in the page file on Windows and the db and
non-db faults on IBM i. Here's the problem. I (the program) am using two
pieces of memory, variable v1 at address 12345 and v2 at address 67890,
and I'm swapped out. Now I get a shot at the processor and I access v1:
boom, page fault, and disk IO. OK, now I've got v1 in memory, I run two
more lines of code, and then I access v2. Again, boom, page fault, and
more disk IO. Under a GC model those references (may) get compacted so
that v1 is at address 34567 and v2 is at address 34570. In all likelihood
the single page fault for v1 would have brought v2 into memory too.
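If it helps to see the address game play out, here's a toy C sketch of the
compaction idea (nothing here is the real .net collector, and the layout
numbers are made up): two live objects that started far apart in the heap
get copied next to each other, so one fault brings both in.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define OBJ   64      /* bytes per object          */
#define COUNT 1024    /* objects in the "old" heap */

int main(void)
{
    /* Old heap: v1 and v2 are separated by ~1000 dead objects,
     * i.e. they live on different pages. */
    char *old_heap = malloc((size_t)OBJ * COUNT);
    char *v1_old = old_heap;                       /* slot 0    */
    char *v2_old = old_heap + OBJ * (COUNT - 1);   /* slot 1023 */
    strcpy(v1_old, "v1 data");
    strcpy(v2_old, "v2 data");

    /* "Compaction": copy only the live objects into a new space. */
    char *new_heap = malloc((size_t)OBJ * 2);
    memcpy(new_heap,       v1_old, OBJ);
    memcpy(new_heap + OBJ, v2_old, OBJ);

    printf("before: v1=%p v2=%p (far apart)\n", (void *)v1_old, (void *)v2_old);
    printf("after : v1=%p v2=%p (side by side)\n",
           (void *)new_heap, (void *)(new_heap + OBJ));

    free(old_heap);
    free(new_heap);
    return 0;
}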
Processor cache advantages -- Just like the paging problem above,
processors (Intel/AMD/Power) have on-chip and on-die cache (the
so-called L1, L2 and L3 caches). Memory is fast, but cache is faster; if
you can get everything you need into cache, you'll run faster. Moving
my references to v1 and v2 near each other in memory means it's more
likely that they'll both be in cache, and therefore I'll run faster.
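A rough way to see the cache effect yourself is to chase the same linked
list twice: once with the nodes scattered around the heap, and once with
them packed together the way a compacting collector would leave them.
This is only a back-of-the-envelope C benchmark of my own, not anything
from the .net runtime, and the numbers will vary by machine:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* 64-byte nodes, i.e. roughly one cache line each on most machines. */
struct node { struct node *next; long pad[7]; };

#define N 1000000L

static long walk(struct node *p)
{
    long steps = 0;
    while (p) { p = p->next; steps++; }
    return steps;
}

int main(void)
{
    /* Scattered: allocate the nodes, then link them in shuffled
     * order so every ->next hop lands somewhere else in the heap. */
    struct node **nodes = malloc(sizeof *nodes * N);
    for (long i = 0; i < N; i++) nodes[i] = malloc(sizeof(struct node));
    for (long i = N - 1; i > 0; i--) {
        long j = rand() % (i + 1);
        struct node *t = nodes[i]; nodes[i] = nodes[j]; nodes[j] = t;
    }
    for (long i = 0; i < N - 1; i++) nodes[i]->next = nodes[i + 1];
    nodes[N - 1]->next = NULL;
    struct node *scattered = nodes[0];

    /* Packed: the same list, but all nodes in one contiguous block,
     * roughly what compaction gives you. */
    struct node *packed = malloc(sizeof(struct node) * N);
    for (long i = 0; i < N - 1; i++) packed[i].next = &packed[i + 1];
    packed[N - 1].next = NULL;

    clock_t t0 = clock(); long s1 = walk(scattered); clock_t t1 = clock();
    long s2 = walk(packed);                          clock_t t2 = clock();

    printf("scattered: %ld nodes in %.3fs\n", s1, (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("packed   : %ld nodes in %.3fs\n", s2, (double)(t2 - t1) / CLOCKS_PER_SEC);
    return 0;
}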
NUMA advantages -- NUMA (Non-Uniform Memory Access) basically means some
memory is "closer" (read: faster) to some CPUs in a machine than it is
to other CPUs in the same machine. This shows up more as you get to 16-,
32- and 64-way machines, but some laptops with dual-core AMD chips are
actually NUMA machines. This is basically a case that sits between the
processor-cache problem and the virtual-memory problem. The processor will
obviously use memory anywhere in the system, and any memory is faster
than disk. But if a process's memory sits in the memory that's closer to
its home CPU, then you get faster memory access, and again, that's
good. :)
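For the curious, on Linux you can play with NUMA placement directly
through libnuma. This is just a sketch under the assumption that libnuma
is installed and you build with -lnuma; in real life a managed runtime or
the OS makes these placement decisions for you.

#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        printf("not a NUMA machine (or libnuma can't tell)\n");
        return 0;
    }

    printf("highest NUMA node: %d\n", numa_max_node());

    /* Memory on node 0: fast for CPUs attached to node 0,
     * slower for CPUs sitting on other nodes. */
    size_t size = 1 << 20;
    void *on_node0 = numa_alloc_onnode(size, 0);
    void *local    = numa_alloc_local(size);   /* node of the calling CPU */

    printf("on node 0: %p, local to this CPU: %p\n", on_node0, local);

    if (on_node0) numa_free(on_node0, size);
    if (local)    numa_free(local, size);
    return 0;
}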
Now, are _any_ of these issues something that you, the application
developer, should worry about? NO! And will any of this directly impact
the speed at which your program generates an invoice? NO! Not directly.
There's nothing we as application developers do often enough to care
about this stuff (maybe with the exception of the page-fault issue).
However, our applications are layered on top of lower-level core
functions that are accessed over and over and over. It's these core
functions that gain the performance advantages, and since those core
functions are used by our application programs, we gain the benefit of
those improvements.
-Walden