How to add a field to a large PostgreSQL table without locking?

Question:

There is a table like this:

create table email_stats (
    id bigserial primary key,
    mailing_list_id int not null,
    sended_date timestamp not null,
    emails_send bigint not null default 0,
    emails_clicks bigint not null default 0
);

Now we need to add a new field to it. The task seems simple:

alter table email_stats 
    add column emails_paid_clicks bigint not null default 0;

The only problem is that the table is already several tens of gigabytes in size, and this alter table blocks all writes to the table for a long time.

How can a field be added without system downtime?

PS: oddly enough, I could not find this common question asked here before.

Answer:

If you are a lucky user of PostgreSQL 11 or newer (though you should not run beta versions in production), all the tricky parts are now handled internally and it is enough to run the alter table directly:

set statement_timeout to '1s';
alter table email_stats 
    add column emails_paid_clicks bigint not null default 0;

The patch that went into PostgreSQL 11 means the table no longer has to be rewritten when adding a new column with a default value. The only remaining risk of a long table lock is when the alter table has to wait for an exclusive lock because some other transactions are holding locks on the table. That is why it makes sense to set a small statement_timeout and simply retry the alter table until it succeeds.
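If you want to convince yourself that the fast path was taken, PostgreSQL 11+ records the "missing" default for old rows in the system catalog instead of rewriting them. A purely illustrative way to check is to look at pg_attribute:

-- atthasmissing = true means pre-existing rows take their value
-- from attmissingval on read, i.e. the table was not rewritten
select attname, atthasmissing, attmissingval
  from pg_attribute
 where attrelid = 'email_stats'::regclass
   and attname  = 'emails_paid_clicks';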


If your PostgreSQL version is below 11, things are not so simple. In the query above,

alter table email_stats 
    add column emails_paid_clicks bigint not null default 0;

the default 0 is what causes the long table lock. PostgreSQL takes a lock on the table, rewrites every row at the data level, and only then releases the lock. If the default were NULL, PostgreSQL would only update the table metadata in the system catalog, which is very fast, but that is not what is required here.
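If you are not sure whether a given alter table will rewrite the table on your version, one rough check (best done on a test copy of the data) is to compare the table's relfilenode before and after the statement; a full rewrite assigns a new one:

-- Run before and after the alter table; if the number changed,
-- the table was physically rewritten.
select relfilenode from pg_class where relname = 'email_stats';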

So the desired change can still be made, but it will not look like a single statement; it takes several steps.

Let's start making changes

First we add the new column with NULL as the default – that is fast, it only changes the table metadata. Then we set the desired default value – it will apply to newly inserted rows from that point on.

begin;
set local statement_timeout = '1s';
alter table email_stats 
    add column emails_paid_clicks bigint;
alter table email_stats 
    alter column emails_paid_clicks set default 0;
commit;

It is worth explaining separately why I changed statement_timeout inside the transaction. This setting limits the maximum execution time of a query. The alter table still needs a lock on the table, even if only briefly, and there is a hidden pitfall here: what if the alter table cannot acquire the lock because of some other transaction in progress? Even a simple insert in another transaction will prevent it from taking the lock needed to change the structure. Meanwhile, the waiting alter table will itself block all subsequent write queries to this table. A short statement_timeout quickly kills the alter table and the service keeps running. The attempt to add the column can then be painlessly repeated a little later until it eventually succeeds.
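If the alter table keeps getting killed by the timeout, it can help to see which sessions are holding it up. A rough diagnostic (assuming PostgreSQL 9.6 or newer for pg_blocking_pids) might look like this:

-- Sessions currently waiting on locks and the backends blocking them
select pid,
       pg_blocking_pids(pid) as blocked_by,
       state,
       left(query, 60)       as query
  from pg_stat_activity
 where wait_event_type = 'Lock';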

There is another interesting pitfall: an explicit default null in an alter table can in some cases be treated as a NULL of a non-matching data type, and the database will then rewrite the table after all. To avoid this rare pitfall, it is better not to specify default null explicitly and to rely on what add column implies on its own: the default there is NULL anyway.

The field has been added, now the default goes into the data

This is a long process, and we need queries like

update email_stats 
    set emails_paid_clicks = default 
    where emails_paid_clicks is null 
        and /* next chunk of data */

to write the desired value into the existing table data. The data has to be updated in chunks (that is what the open condition in the query is for), with pauses between updates, while keeping an eye on the replication lag (if any) and on the autovacuum processes. There are several approaches to the update itself; the simpler one is to walk the table by the primary key or any other unique key. Take any programming or scripting language of your choice and do something like this:

$current_id = (select min(id) from email_stats)
$maxid = (select max(id) from email_stats)

while ($current_id < $maxid) {
    $batch_to = $current_id + 10000 // maximum batch size for a single update
    update email_stats 
        set emails_paid_clicks = default 
        where emails_paid_clicks is null 
            and id >= $current_id and id < $batch_to

    $current_id = $batch_to
    sleep(5) // pause between updates so we disturb the service less
}

While such a script is running, it is worth occasionally running vacuum email_stats so that the table does not bloat too much, especially if autovacuum is not configured aggressively enough and cannot keep up with the script.

The batch size and the length of the pause between updates should be chosen based on the load profile of your particular service. Small batches and long pauses will not bother anyone, but the script will simply take a very long time to finish.

This example query is not the most efficient: its execution time will fluctuate a lot because of gaps in the id values and because the rows will most likely sit in different pages, but it is simple and lets you easily cap the maximum size of a single update.
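If the id gaps turn out to be a real problem, one possible variant (a sketch, not part of the original recipe) is to pick the next batch with a subselect and a limit, so every iteration touches at most a fixed number of rows regardless of how sparse the ids are; the script then loops until the update reports zero affected rows:

-- Each call updates at most 10000 rows that still need the value;
-- repeat in the script until 0 rows are affected.
update email_stats
   set emails_paid_clicks = default
 where id in (
       select id
         from email_stats
        where emails_paid_clicks is null
        order by id
        limit 10000);

Note that as the script progresses this subselect has to skip over more and more already-updated rows; combining it with the id cursor from the script above, or creating a temporary partial index on the rows where emails_paid_clicks is null, avoids that slowdown.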

A pitfall at this stage: it is much easier to hit a deadlock while competing with the application for row updates, for example if the application wants to update several rows from the same batch but in a different order. In the connection running the update script you can set deadlock_timeout = 100ms; then, when a deadlock happens, it is usually our script that gets killed rather than a useful application transaction.

Set not null

Now that there should be no NULL values left in the added column, we can set not null.

alter table email_stats alter column emails_paid_clicks set not null;

Unfortunately, this query does take a lock that blocks writes, but it completes much faster than updating the entire table to overwrite the default value.

If you want to keep the write lock minimal, you will have to give up the native not null property and instead add a check constraint with the equivalent meaning. First, add the constraint marked not valid (again inside a transaction with a statement_timeout, as before):

begin;
set local statement_timeout = '1s';
alter table email_stats 
     add constraint emails_paid_clicks_not_null 
     check (emails_paid_clicks is not null) not valid;
commit;

Then, in another transaction without a statement_timeout:

alter table email_stats validate constraint emails_paid_clicks_not_null;

Validating the constraint does not block writes.
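You can confirm in the catalog that the constraint is now considered validated (an illustrative query, not part of the original recipe):

select conname, convalidated
  from pg_constraint
 where conrelid = 'email_stats'::regclass
   and conname  = 'emails_paid_clicks_not_null';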

For PostgreSQL 11 and newer this whole howto has become much shorter, but for the sake of completeness I will add a bit more – after all, I did not write that patch for nothing. Starting with PostgreSQL 12, alter table ... set not null can skip validating the data in the table if there are already enough constraints to conclude that the column cannot be NULL. Yes, I mean exactly the check constraint I just described how to create without a long lock. After the constraint has passed validate constraint, you can call set not null and drop the check that is no longer needed. This still requires a table lock, but now a short one.

begin;
set local statement_timeout = '1s';
alter table email_stats alter column emails_paid_clicks set not null;
alter table email_stats 
     drop constraint emails_paid_clicks_not_null;
commit;

Done

That's it, the field has been added.


By the way, it is worth considering whether NULL would do as the default. Such a column is not only much easier to add, it is also stored much more compactly: a NULL is just one bit in the row header's null bitmap, while the bigint value 0 already takes up 8 bytes.
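A rough way to see the difference is pg_column_size over a one-column row value; on a typical 64-bit build the NULL variant fits into the tuple header while the zero needs 8 extra bytes:

select pg_column_size(row(0::bigint))    as with_zero,
       pg_column_size(row(null::bigint)) as with_null;
-- typically 32 vs 24 bytes: the NULL only sets a bit in the null bitmap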
