Discussion:
Why is ReentrantLock faster than synchronized?
Ernst, Matthias
2005-02-24 15:53:45 UTC
Hi,

a recent article by Brian Goetz made me wonder about this again. Brian
demonstrates how ReentrantLock has a performance and scalability
advantage over synchronized.

Did anyone investigate why that would be the case? Is it due to the fact
that the VM has to lock-enable *any* java.lang.Object through things like
header displacement and lock inflation/deflation? Or does GC have an
advantage over manual management of lock records?

Other than that, I cannot think of any edge ReentrantLock could have over
synchronized: both are compiled inline by the HotSpot compiler, both can
(and probably do) use the same suspension/resumption mechanisms, the
same atomic instruction sequences, ...
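(For concreteness, the kind of atomic sequence I have in mind is a plain
compare-and-swap; a minimal sketch of my own, using
java.util.concurrent.atomic to stand in for the lock-word CAS both
schemes bottom out in:)

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        // Both synchronized and ReentrantLock acquire paths reduce to a
        // CAS on some lock word; AtomicInteger exposes the same primitive.
        AtomicInteger lockWord = new AtomicInteger(0); // 0 = free, 1 = held

        // Fast-path acquire: succeeds only if the word is still 0.
        boolean acquired = lockWord.compareAndSet(0, 1);
        System.out.println("acquired: " + acquired);

        // A second CAS from 0 fails, because the word is now 1.
        boolean reacquired = lockWord.compareAndSet(0, 1);
        System.out.println("reacquired: " + reacquired);

        lockWord.set(0); // release: store 0 back
    }
}
```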

Just wondering
Matthias
--
Meet us at CeBIT: Launching CoreMedia CMS 2005, Hall 3, D09.
Doug Lea
2005-02-24 16:45:31 UTC
Post by Ernst, Matthias
a recent article by Brian Goetz made me wonder about this again. Brian
demonstrates how ReentrantLock has a performance and scalability
advantage over synchronized.
Did anyone investigate why that would be the case? Is it due to the fact
that the VM has to lock-enable *any* java.lang.Object through things like
header displacement and lock inflation/deflation? Or does GC have an
advantage over manual management of lock records?
There's some friendly competition among those doing Java-level sync
vs VM-level sync. The underlying algorithms are increasingly
pretty similar. So you should expect relative performance differences
in typical server applications to fluctuate across releases. The main
goal of ReentrantLock is NOT to uniformly replace builtin sync, but to
offer greater flexibility and capabilities when you need them, and to
maintain good performance in those kinds of applications.
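(As a rough illustration of those extra capabilities -- my sketch, not
library internals; the 50ms timeout is an arbitrary choice -- here are
acquisition modes that builtin sync simply cannot express:)

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class FlexDemo {
    private static final ReentrantLock lock = new ReentrantLock();

    public static void main(String[] args) throws InterruptedException {
        // Timed acquisition: synchronized blocks indefinitely once it
        // decides to wait; tryLock lets the caller give up.
        if (lock.tryLock(50, TimeUnit.MILLISECONDS)) {
            try {
                System.out.println("held by: " + lock.isHeldByCurrentThread());
            } finally {
                lock.unlock(); // unlike synchronized, release is explicit
            }
        }
        // Non-blocking probe: back off and do something else on failure.
        if (lock.tryLock()) {
            try {
                System.out.println("hold count: " + lock.getHoldCount());
            } finally {
                lock.unlock();
            }
        }
    }
}
```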

The people doing VM-level sync also have to deal with issues that
we don't with ReentrantLock, like the need to use only a few bits of
object header space to avoid bloat in objects that are never
locked. This constrains their implementation choices, which is why
ReentrantLock can sometimes get away with a few cycles less overhead.

Also, VM-level support must deal with the fact that many
programs/classes do a lot of sync that is entirely useless because it
can never be contended. (We assume people hardly ever do this with
ReentrantLock so don't do anything special about it.) Techniques to
greatly cheapen this case are probably coming soon in hotspot and
other VMs. (This case is already pretty cheap on uniprocessors, but
not multiprocessors.)
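(A contrived sketch of such never-contended sync -- my example, not
taken from any particular benchmark: every Vector method is
synchronized, yet when the object is confined to a single thread the
lock can never be contended and the overhead buys nothing:)

```java
import java.util.ArrayList;
import java.util.Vector;

public class UncontendedDemo {
    public static void main(String[] args) {
        // Thread-confined Vector: each add() pays (uncontended) lock
        // overhead even though no other thread can ever see it.
        Vector<Integer> v = new Vector<>();
        for (int i = 0; i < 1000; i++) {
            v.add(i);
        }
        // The unsynchronized equivalent does the same work lock-free.
        ArrayList<Integer> a = new ArrayList<>(v);
        System.out.println(v.size() == a.size());
    }
}
```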

There are currently still a few things that can be done at JVM level
that we can't do at Java level. For example, adapting spinning to vary
with load averages. We're working on leveling the playing field here
though :-)

-Doug
Brian Goetz
2005-02-24 22:08:43 UTC
Post by Ernst, Matthias
Did anyone investigate why that would be the case? Is it due to the fact
that the VM has to lock-enable *any* java.lang.Object through things like
header displacement and lock inflation/deflation? Or does GC have an
advantage over manual management of lock records?
Other than that, I cannot think of any edge ReentrantLock could have over
synchronized: both are compiled inline by the HotSpot compiler, both can
(and probably do) use the same suspension/resumption mechanisms, the
same atomic instruction sequences, ...
Performance is a moving target. In the first JVM, performance for
everything sucked (locking, garbage collection, allocation, you name it)
because the first JVM was a proof-of-concept and performance wasn't the
goal. Once the VM concept was proven, engineering resources were then
allocated to improve performance, and there is no shortage of good ideas
for making things faster, so performance in these areas improved and is
improving with each JVM version.

So, one factor in why ReentrantLock is faster than built-in
synchronization is that the JSR 166 team spent some effort building a
better lock -- not because the JVM folks didn't have access to the same
papers on lock performance, but because they had other priorities of
where to spend their efforts. But they will get around to it and the
scalability gap will surely close in future JVM versions.

Interestingly, the algorithm used under the hood of ReentrantLock is
easier to implement in Java than in C, because of garbage collection --
a C version of the same algorithm would be a lot more work and would
require more bookkeeping in the algorithm. As a result, the approach
taken by ReentrantLock makes more garbage and uses less locking than the
obvious C analogue, and it turns out that, given the current relative
cost between memory management and memory synchronization, an algorithm
that makes more garbage and uses less coordination is more scalable.
This week. Might be different next week. Performance is a moving target.
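(To make the "more garbage, less locking" point concrete, here is a
minimal CLH-style queue lock -- my illustration of the general
technique, not the actual ReentrantLock source. Each acquire allocates
a fresh queue node and simply abandons its predecessor to the garbage
collector; a C version would need an explicit free list or per-node
recycling protocol.)

```java
import java.util.concurrent.atomic.AtomicReference;

public class ClhLock {
    private static final class Node {
        volatile boolean locked = true; // new holders start "locked"
    }

    private final AtomicReference<Node> tail;
    private final ThreadLocal<Node> myNode = new ThreadLocal<>();

    public ClhLock() {
        Node dummy = new Node();      // dummy unlocked node, so the
        dummy.locked = false;         // first acquirer's predecessor is free
        tail = new AtomicReference<>(dummy);
    }

    public void lock() {
        Node node = new Node();           // fresh garbage on every acquire
        myNode.set(node);
        Node pred = tail.getAndSet(node); // one atomic op to enqueue
        while (pred.locked) { }           // spin on the predecessor's flag
        // pred is now unreachable: the GC reclaims it, no bookkeeping here
    }

    public void unlock() {
        myNode.get().locked = false;      // hand off to any successor
    }

    public static void main(String[] args) throws InterruptedException {
        ClhLock lock = new ClhLock();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 10000; i++) {
                lock.lock();
                counter[0]++;
                lock.unlock();
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter[0]); // 20000 under mutual exclusion
    }
}
```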