I was thinking about this on my way home, and I think Marko's initial
proposition is correct -- but that we shouldn't concern ourselves with it.
I still think it comes down to the meaning of "subsequent," but I think
wall-clock is not a good definition. In a multi-core machine, it's
conceivable that two actions would happen close enough that it's impossible
to tell which one happened first (by wall clock time). Forcing a consensus
about which came first would be expensive, and it would probably be
prohibitive on a many-core machine -- let alone a multi-node JVM (I was
only partially joking about that before).
So, what happens if we have some loose definition of "subsequent," probably
with transitivity and perhaps some other basic properties? If we do that,
then Marko's initial proposition holds. For instance, the JVM would be
allowed to have a while(someVolatileBool) thread go on forever, having been
determined to be subsequent to all other actions.
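A minimal sketch of the loop in question (the class and method names here are hypothetical; only `someVolatileBool` comes from the discussion):

```java
class SpinWaiter {
    static volatile boolean someVolatileBool = true;

    // Spins until another thread clears the flag. Under a loose enough
    // reading of "subsequent", a JVM could in principle order every read
    // in this loop before the clearing write and spin forever -- in
    // practice we would file a bug against any JVM that did so.
    static void spinUntilCleared() {
        while (someVolatileBool) {
            // busy-wait
        }
    }
}
```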
But there are lots of things which the JVM is allowed to take its time on,
but shouldn't. I haven't read anything in the JLS that requires "int i = 1"
to complete before the heat death of the universe, and yet we would surely
file a bug if it were not nearly instantaneous.
I would argue that subsequent-ness is the same. There's nothing in the JLS
to prevent a JVM from delaying actions in some horrid way (w.r.t.
"subsequent" and thus visibility), but we would surely file a bug if it
were not extremely quick on a single-node machine, and appropriately quick
on a distributed JVM.
On Aug 17, 2012, at 9:24 PM, "Boehm, Hans" <***@hp.com> wrote:
My reading is that subsequent in chapter 17 of the spec refers to
synchronization order, except in one case in which it refers to a
hypothetical sequentially consistent memory model.
I agree that the originally described behavior is allowed. There's
another complication here that can explain this behavior (as can delayed
stores). AFAIK, the spec does not clearly require that wall clock time
is perfectly synchronized across threads, and implementations may, for good
reason, fail to ensure that, particularly in this case. We had this
discussion in connection with C++11, and that spec does require that a
specific clock is consistent with happens-before. But even that spec,
which is strict enough to raise some controversy, isn't sufficient here,
since there is no happens-before relation between the threads.
Hans
*From:* concurrency-interest-***@cs.oswego.edu [mailto:
concurrency-interest-***@cs.oswego.edu] *On Behalf Of *David Holmes
*Sent:* Friday, August 17, 2012 5:03 PM
*To:* Yuval Shavit; Vitaly Davidovich
*Cc:* concurrency-***@cs.oswego.edu
*Subject:* Re: [concurrency-interest] Relativity of guarantees provided by
volatile
Ah I knew there was something about "immediate" in there somewhere :) So if
instructions were immediate (i.e. they completed as soon as they started)
then wall clock observations would be consistent with actual executions.
But on real systems instructions are not immediate so you have to take the
completion of the instruction as the point at which to make a wall clock
observation.
Aside: as well as being atomic and immediate you also need to preclude two
instructions from executing simultaneously :)
David (now departing to enjoy the weekend :) )
-----Original Message-----
*From:* Yuval Shavit [mailto:***@akiban.com]
*Sent:* Saturday, 18 August 2012 9:49 AM
*To:* Vitaly Davidovich
*Cc:* ***@ieee.org; concurrency-***@cs.oswego.edu
*Subject:* Re: [concurrency-interest] Relativity of guarantees provided by
volatile
(Resending -- I'd accidentally sent this only to Nathan)
So, Nathan, you're essentially saying that "subsequent" is defined in terms
of clock time. If we take that as the definition, then I agree that it
works as we'd all expect. Furthermore, I agree that that's the intuitive
definition, as well as the de-facto one. But it doesn't seem to be the one
explicitly defined by the JLS. (And if it is the one implicitly defined by
the JLS, I think there's a serious bug waiting for any distributed-node JVM
whose nodes are traveling away from each other at relativistic speeds! ...
though I'm no physics expert.)
That said, 17.4.3 does imply that the reads will be viewable in a
wall-clock-sequential way, albeit informally:
Sequential consistency is a very strong guarantee that is made about
visibility and ordering in an execution of a program. Within a sequentially
consistent execution, there is a total order over all individual actions
(such as reads and writes) which is consistent with the order of the
program, and each individual action is atomic and is immediately visible to
every thread.
(emphasis on "is immediately visible")
On Fri, Aug 17, 2012 at 7:41 PM, Vitaly Davidovich <***@gmail.com>
wrote:
David and you may be right in the theoretical aspect. In practice, I can't
fathom how a JVM can do this type of analysis. That's an issue that I have
with JMM's wording of happens-before -- it doesn't translate to reality, it
seems like.
Sent from my phone
On Aug 17, 2012 7:30 PM, "Yuval Shavit" <***@akiban.com> wrote:
Sure, but it could decide that the execution order is [w1, w2, w3, r]. In
fact, as far as we know, the thread may have been scheduled such that that
was the clock-time ordering, too.
On Fri, Aug 17, 2012 at 7:21 PM, Vitaly Davidovich <***@gmail.com>
wrote:
Two volatile writes emit a store-store barrier in between, which to me
means they cannot be collapsed and must be made visible in that order (on
non-TSO this would require a h/w fence). In other words, I don't think the
compiler can remove the redundant stores as if this were a non-volatile
field, where it's a perfectly valid (and good) optimization.
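A sketch of the pattern under discussion (the class name is illustrative): consecutive volatile stores to the same field, which on this reading must become visible in program order rather than being collapsed.

```java
class VolatileWrites {
    static volatile int w = 0;

    static void writer() {
        w = 1; // volatile store
        w = 2; // store-store ordering between volatile writes: a reader
               // that observes 2 cannot subsequently observe 1
        w = 3;
    }
}
```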
Sent from my phone
On Aug 17, 2012 7:18 PM, "David Holmes" <***@aapt.net.au> wrote:
How does it violate the JMM? There is nothing to establish that any read
has to have occurred prior to w=3. An external observer may say "hey if
we'd actually written w=1 at this point then the read would see 1" but that
is irrelevant. The program cannot tell that the other writes did not occur.
David
-----Original Message-----
*From:* Vitaly Davidovich [mailto:***@gmail.com]
*Sent:* Saturday, 18 August 2012 9:12 AM
*To:* ***@ieee.org
*Cc:* Marko Topolnik; concurrency-***@cs.oswego.edu
*Subject:* Re: [concurrency-interest] Relativity of guarantees provided by
volatile
I don't think the writes to w can be reduced to just the last one as it
would violate the JMM. R may only see the last one due to interleaving
though. Not sure if that's what you meant.
Sent from my phone
On Aug 17, 2012 7:03 PM, "David Holmes" <***@aapt.net.au> wrote:
Hi Marko,
I think the "surprise" is only in the way you formulated this. Said another
way, a write takes a finite amount of time from when the instruction starts
to execute to when the store is actually available for a read to see.
(Similarly a read takes a finite amount of time.) So depending on those two
times a read and write that happen "around the same time" may appear to have
occurred in either order. But when you program with threads you never know
the relative interleavings (or should never assume) so it makes no
difference how the program perceives the order compared to how some external
observer might perceive it.
As for your optimization to "chunk" volatile writes, I don't see a problem
here if you are basically asking if given:
w = 1; // w is volatile
w = 2;
w = 3;
that this could be reduced to the last write alone? I see no reason why not.
Without some additional coordination between a reader thread and the writer
thread, reading w==3 is a legitimate outcome. If you are thinking about how
the hardware might chunk things then that is a different matter. We have to
use the hardware in a way that complies with the memory model - if the
hardware can't comply then you can't run Java on it.
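A sketch of the point above (names are illustrative): a reader polling `w` may legally observe any subsequence of the written values, including just the final one.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkedWrites {
    static volatile int w = 0;

    static void writer() {
        w = 1;
        w = 2;
        w = 3;
    }

    // Records the distinct consecutive values seen across a number of
    // polls. Any subsequence of 0, 1, 2, 3 is a legitimate observation
    // -- in particular, a reader that starts late enough sees only 3.
    static List<Integer> reader(int polls) {
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < polls; i++) {
            int v = w;
            if (seen.isEmpty() || seen.get(seen.size() - 1) != v) {
                seen.add(v);
            }
        }
        return seen;
    }
}
```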
David Holmes
------------
-----Original Message-----
From: Marko Topolnik
Sent: Saturday, 18 August 2012 7:24 AM
Subject: [concurrency-interest] Relativity of guarantees provided by
volatile
Consider the following synchronization order of a program
- thread R begins;
- thread R reads a volatile int sharedVar several times. Each
time it reads the value 0;
- thread R completes;
- thread W begins;
- thread W writes the sharedVar several times. Each time it
writes the value 1;
- thread W completes.
Now suppose the wall-clock times of these actions are:
- thread R reads 0 at t = {1, 4, 7, 10};
- thread W writes 1 at t = {0, 3, 6, 9}.
As far as the Java Memory Model is concerned, there is no
contradiction between the synchronization order and the
wall-clock times, as the JMM is wall-clock agnostic. However, I
have yet to meet a single Java professional who wouldn't at least
be very surprised to hear that the specification allows this.
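The scenario can be sketched as follows (class and method names are illustrative; the point is that the JMM permits a synchronization order placing all of R's reads before all of W's writes, whatever the wall clock says):

```java
class RelativityDemo {
    static volatile int sharedVar = 0;

    // Thread R: each read may legally return 0, even if by wall-clock
    // time thread W's writes of 1 have already "happened".
    static int threadR() {
        int last = -1;
        for (int i = 0; i < 4; i++) {
            last = sharedVar;
        }
        return last;
    }

    // Thread W: writes 1 four times.
    static void threadW() {
        for (int i = 0; i < 4; i++) {
            sharedVar = 1;
        }
    }
}
```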
I understand that the SMP architecture that dominates the world
of computing today practically never takes these liberties and
makes the volatile writes visible almost instantaneously. This
may change at any time, however, especially with the advent of
massively parallel architectures that seem to be the future. For
example, an optimization technique may choose to chunk many
volatile writes together and make them visible in a single bulk
operation. This can be safely done as long as there are no
intervening read-y actions (targets of the synchronizes-with
edges as defined by JLS/JSE7 17.4.4).
1. Is there a loophole in my reasoning?
2. If there is no loophole, is there anything to worry about,
given that practically 100% of developers out there consider as
guaranteed something that isn't?
-Marko
_______________________________________________
Concurrency-interest mailing list
http://cs.oswego.edu/mailman/listinfo/concurrency-interest