java – Volatile and the processor cache


Hello, here is my question. As I understand it, volatile variables must always be written to main memory, whereas an ordinary variable is usually kept in the processor's cache for faster access. In other words, a write to a volatile variable goes directly to RAM.

Say we have a processor with 4 cores. In effect these are 4 independent processors, each with its own caches: for example, private first- and second-level caches and a shared third-level cache. We start several threads, and let's say each one runs on a separate core with its own cache. We declare the variable volatile, and it is written directly to memory.

Now suppose we run the program on a single-core processor, where the OS simulates parallelism by switching between threads. The processor then has one cache, and all threads run on that one processor. So the question is: is an ordinary variable cached at the processor level or at the software level? That is, when several threads run on a single core, do they split the processor's cache into independent memory areas, imitating a multicore machine? Or do they all share the same cache, which would make volatile meaningless? Please explain how this actually works. Thank you.


What I meant is that, for speed, a write does not go to memory immediately. For example, x = 1 will not be written to memory right away, since that takes many processor cycles; if possible it goes to the processor cache first and is written to memory only later. Without volatile, if several threads modify the variable at the same time, the new value stays in the cache instead of being written directly to memory. If the threads run on several cores, this is clear: each core has its own cache, so if the latest value is still sitting in one core's cache, a read from any thread other than the one that wrote last returns a stale value. But on a single core there is only one cache and all threads write to it, so what is the point of volatile? Or am I misunderstanding something?


That is, if many threads have modified the variable, each one has its own cached copy, and a read will not return the expected value.

On a read, each processor sees its local copy of the "variable in memory" (from its cache), but on a write to that variable it notifies the other processors that they must update the corresponding cache line in their caches before their next read.

Such interactions are what usually provide the cache coherency property. Some processors have this property, while for others you must adapt the code to keep the data consistent. And volatile alone is not always enough for that (on ARM, for example).

P.S. Again, though, none of this has anything to do with volatile. Roughly speaking, it is a property of shared memory in the SMP architecture.
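To make the point about volatile concrete from the Java side: the language defines volatile in terms of visibility between threads, not in terms of caches, so it stays meaningful on a single core too (without it, the JIT is allowed to keep a field in a register and never re-read it). Below is a minimal sketch of that guarantee. The class and method names are made up for illustration, and the 100 ms and 5 s timings are arbitrary assumptions.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative.
class VolatileFlagDemo {
    // volatile guarantees that a write by one thread becomes visible to
    // later reads by other threads. Without it, the worker loop below is
    // allowed to never observe the change, regardless of the core count.
    private static volatile boolean running = true;

    // Returns true if the worker thread noticed the flag change and exited.
    static boolean stopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait; the flag is re-read on every iteration
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(100); // let the worker spin for a while
            running = false;                  // volatile write
            worker.join(5_000);               // wait up to 5 s for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsAfterFlagCleared());
    }
}
```

If the volatile modifier is removed, the program may still happen to terminate, but nothing in the language guarantees it any more.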

In this architecture there is no place for caches and the other boring things you keep writing about with such enviable persistence.

@Barmaley, I thought a bit more about "there is no place for caches and other boring things", and I still come to the conclusion that without this knowledge, even in Java, you cannot count on adequate performance from your code on SMP. Here's a real-life example with arrays.

But when I was looking for this material, I was not even thinking about peculiarities of how arrays are laid out, but about ordinary parallel processing of a plain array: we split the array into n parts and process each part in its own thread (with some amount of writing to the array, of course, for example "parallel data filling").
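The kind of "parallel data filling" meant here can be sketched roughly as follows. This is only an illustration under my own assumptions: the class name, the method name and the value written (i * 2) are all hypothetical, and plain threads are used instead of an executor to keep it short.

```java
// Hypothetical sketch of filling one array from several threads.
class ParallelFill {
    // Splits [0, length) into `parts` contiguous chunks and fills each
    // chunk in its own thread. No two threads write the same element.
    static int[] fill(int length, int parts) {
        int[] data = new int[length];
        Thread[] workers = new Thread[parts];
        int chunk = (length + parts - 1) / parts; // chunk size, rounded up
        for (int p = 0; p < parts; p++) {
            final int from = Math.min(length, p * chunk);
            final int to = Math.min(length, from + chunk);
            workers[p] = new Thread(() -> {
                for (int i = from; i < to; i++) {
                    data[i] = i * 2; // arbitrary illustrative value
                }
            });
            workers[p].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        int[] data = fill(1_000, 4);
        System.out.println(data[data.length - 1]);
    }
}
```

The result is correct without any volatile, because the threads never touch the same elements. The elements on either side of a chunk boundary, however, can sit in the same cache line while being written by different threads.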

If, for this seemingly simple task, you do not consider that a neighboring processor can invalidate (evict) your cache line and cause a cache miss, you will not get adequate performance (as far as I know, Java cannot yet read the programmer's mind and add padding at cache-line boundaries for arrays that he decides to parallelize this way).
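One way to add such padding by hand, sketched here for per-thread counters kept in a single long[]: each thread gets a slot that is a whole cache line away from its neighbors. The 64-byte line size (8 long values) is an assumption about the hardware, and the names are hypothetical. How much this helps in practice depends on the CPU and on what the JIT does with the loop, so treat it as an illustration of the layout rather than a benchmark.

```java
// Hypothetical sketch of manual padding against false sharing.
class PaddedCounters {
    // Assumption: 64-byte cache lines, i.e. 8 long values per line.
    static final int STRIDE = 8;

    // Each thread increments its own slot; slots are STRIDE elements apart,
    // so under the assumption above no two slots share a cache line.
    static long[] countInParallel(int threads, int incrementsPerThread) {
        long[] slots = new long[threads * STRIDE];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int index = t * STRIDE;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    slots[index]++; // only this thread writes this slot
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Copy the padded slots into a compact result array.
        long[] result = new long[threads];
        for (int t = 0; t < threads; t++) {
            result[t] = slots[t * STRIDE];
        }
        return result;
    }

    public static void main(String[] args) {
        long[] counts = countInParallel(4, 1_000_000);
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

With STRIDE set to 1 the result is the same, but the four counters become neighbors in memory and can end up competing for the same cache line.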
