C++ implementation of call_once


I want to understand how std::call_once works and, most importantly, whether it is lock-free. I have seen attempts to implement it using a mutex. If call_once can only be implemented with a mutex, what problems could arise with the code below?

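#include <atomic>
#include <functional>
#include <iostream>
#include <thread>
#include <vector>

using namespace std;
using my_once_flag = atomic<bool>;

// Attempt from the question: whichever thread flips the flag first calls foo.
void my_call_once(my_once_flag& flag, std::function<void()> foo) {
    bool expected = false;
    bool res = flag.compare_exchange_strong(expected, true,
                                            std::memory_order_release, std::memory_order_relaxed);
    if (res) foo();  // presumably: the losing threads return at once, without waiting for foo
}

my_once_flag flag{false};

void printOnce() {
    my_call_once(flag, [](){
       cout << "test" << endl;
    });
}

int main() {
    vector<thread> threads;
    for(int i = 0; i< 50; ++i){
        threads.emplace_back(printOnce);  // presumably one thread per iteration
    }
    for (auto& t : threads) t.join();
    return 0;
}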


The standard does not restrict how std::call_once is implemented, so an implementation may use locks or avoid them (though I don't know whether a lock-free implementation is possible).

As for your implementation: it is simply wrong. Suppose two threads enter the function at the same time and reach the flag.compare_exchange_strong line. One of them sets the flag, and the other leaves fully confident that the function has already been called. But the function call has not even started yet! Therefore, in a correct implementation, every thread that enters call_once while another thread is still executing the function must wait until that execution finishes.
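To make the waiting requirement concrete, here is a minimal sketch of a blocking variant, assuming a mutex plus an atomic "done" flag (double-checked locking). The names blocking_once_flag, blocking_call_once and count_threads_that_saw_effect are hypothetical and exist only for this illustration; this is not how any particular standard library implements std::call_once.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical flag type for this sketch; not how std::once_flag is laid out.
struct blocking_once_flag {
    std::atomic<bool> done{false};
    std::mutex m;
};

void blocking_call_once(blocking_once_flag& flag, const std::function<void()>& foo) {
    // Fast path: once foo has completed, callers only pay for an acquire load.
    if (flag.done.load(std::memory_order_acquire)) return;

    // Slow path: late arrivals block on the mutex until the running call finishes.
    std::lock_guard<std::mutex> lock(flag.m);
    if (!flag.done.load(std::memory_order_relaxed)) {
        foo();  // if foo throws, done stays false and a later caller retries
        flag.done.store(true, std::memory_order_release);
    }
}

// Demo helper (hypothetical): returns how many of n_threads threads observed
// the effect of foo already in place when blocking_call_once returned.
int count_threads_that_saw_effect(int n_threads) {
    blocking_once_flag flag;
    int calls = 0;                  // modified only inside foo
    std::atomic<int> saw_effect{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&] {
            blocking_call_once(flag, [&] { ++calls; });
            // Safe read: foo's completion happens-before every return.
            if (calls == 1) ++saw_effect;
        });
    }
    for (auto& t : threads) t.join();
    assert(calls == 1);             // foo ran exactly once
    return saw_effect.load();
}
```

With 50 threads, count_threads_that_saw_effect(50) should return 50: no thread leaves before the single call to foo has finished, which is exactly the property the compare_exchange_strong version lacks. The price is that the slow path blocks, so this sketch is not lock-free.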

Of course, if the function were pure, one could arrange speculative execution and adopt the result from the thread that finishes first. But the standard places no restrictions on the function passed to call_once. Therefore, in the general case, I do not see how it could be implemented in a non-blocking form.
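The speculative idea for pure functions could look roughly like the sketch below. It assumes the result is an int and that the sentinel kEmpty is a value the function never returns. speculative_once, kEmpty and all_threads_agree are hypothetical names introduced for this illustration. Note that the function may run in several threads, so this does not provide call_once semantics; it only guarantees that every caller ends up with the same published result without blocking.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Assumed sentinel: the pure function must never return this value.
constexpr int kEmpty = -1;

int speculative_once(std::atomic<int>& slot, const std::function<int()>& pure_fn) {
    int current = slot.load(std::memory_order_acquire);
    if (current != kEmpty) return current;   // a result is already published

    int mine = pure_fn();                    // may run in several threads at once
    int expected = kEmpty;
    if (slot.compare_exchange_strong(expected, mine,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire)) {
        return mine;                         // this thread published first
    }
    return expected;                         // adopt the winner's result
}

// Demo helper (hypothetical): checks that every thread ends up with the same value.
bool all_threads_agree(int n_threads) {
    std::atomic<int> slot{kEmpty};
    std::atomic<int> runs{0};                // instrumentation only, not part of the technique
    std::vector<int> results(n_threads, kEmpty);
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&, i] {
            results[i] = speculative_once(slot, [&] { ++runs; return 42; });
        });
    }
    for (auto& t : threads) t.join();
    assert(runs >= 1);                       // the function ran at least once, possibly more
    for (int r : results) {
        if (r != 42) return false;
    }
    return true;
}
```

No thread ever waits for another here, but the function can execute more than once, which is only acceptable when it has no side effects. That is why the approach does not carry over to the arbitrary callables that call_once must accept.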
