sql – What is the problem with N + 1 queries?

Question:

Whenever we work with an ORM, it is common to run into the N+1 queries problem. It is a performance issue, and it is even called an antipattern.

But what is this problem really, why does it happen, what are its main causes and how, in theory, to solve them?

I've also heard that to solve it you just need to use eager loading. But to what extent is that beneficial, and does it actually solve the problem?

Answer:

The problem is not unique to ORMs. Many people think it is because it is so common there, but it is not inherent to them. Maybe those people underestimate our ability to get it wrong by hand :).

With an ORM it shows up more often because a naive implementation forces the problem to occur every time.

In fact, the issue is not just with the ORM itself, but with modeling objects that have related data. Either you use a database with a non-relational model, which has its own problems, or you adopt the relational model in the application.

Even so, you can still create problems when mixing the two models.

The problem is common when there is one object and N others related to it, hence the name N + 1, where the 1 is the "parent" of those N. It becomes clear when one query fetches the main data, say an invoice, and further queries then fetch the item lines that make up that invoice. Fetching data piece by piece from the database can get very expensive, especially with the poorly thought-out architectures that many people build (some out of necessity).

Usually at least 1 + 1 queries are needed, which is a failing in how relational databases communicate (not of the model itself, but of the way current implementations communicate, which I consider a mistake; instead of fixing it, people created another, worse model. That is the history of our field: one problem gets solved with another problem, and then yet another problem is brought in to solve that one). You can avoid this somewhat, but with low efficiency.

Incidentally, that is why some people like to put a NoSQL database in front of the relational one as a proxy. Again, the complexity of the solution goes up because the tool has problems that would be easy to solve, but nobody solves them.

But in the 1 + 1 case, if the amount of data is large, the extra query is not a big problem.

The problem with eager loading is that it can bring in information you will never use. But that depends a lot on the situation: there are cases where this rarely happens, others where it happens but barely matters, and many where the extra data generates so little overhead that even a single additional query would cost more, that is, a 1 + 2 can already be worse. On the other hand, imagine reading a single invoice and having it bring in all the lines of all invoices just to avoid N + 1: total waste.
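To make the "single invoice" case concrete, here is a minimal sketch in Python with an in-memory SQLite database. It reuses the `Nf` and `NfItem` tables from this answer; the `Descricao` column, the sample data and the `load_items_for` helper are assumptions made up for illustration, not something an ORM necessarily does this way. It contrasts loading every item with loading only the items of the invoices you actually need, still in one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nf (Numero INTEGER PRIMARY KEY);
    CREATE TABLE NfItem (Id INTEGER PRIMARY KEY, NfNumero INTEGER, Descricao TEXT);
""")

# Hypothetical sample data: 5 invoices with 3 items each.
for numero in range(1, 6):
    conn.execute("INSERT INTO Nf (Numero) VALUES (?)", (numero,))
    for i in range(3):
        conn.execute(
            "INSERT INTO NfItem (NfNumero, Descricao) VALUES (?, ?)",
            (numero, f"item {i}"),
        )


def load_items_for(conn, invoice_numbers):
    """Fetch the items of the given invoices only, using a single query."""
    if not invoice_numbers:
        return {}
    placeholders = ", ".join("?" for _ in invoice_numbers)
    rows = conn.execute(
        f"SELECT NfNumero, Descricao FROM NfItem "
        f"WHERE NfNumero IN ({placeholders}) ORDER BY Id",
        list(invoice_numbers),
    ).fetchall()
    # Group the flat rows by the invoice they belong to.
    items = {numero: [] for numero in invoice_numbers}
    for nf_numero, descricao in rows:
        items[nf_numero].append(descricao)
    return items


# Wasteful eager load: every item of every invoice, to show one invoice.
wasteful = conn.execute("SELECT * FROM NfItem").fetchall()

# Targeted load: still one query, but only for invoice 2.
targeted = load_items_for(conn, [2])
```

The same helper works for a page of invoices: pass the list of numbers on that page and you stay at 1 + 1 queries without pulling in the whole table.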

That is the problem with automated solutions, or with programmers who adopt solutions automatically without understanding what they are doing. The real solution is to understand what will happen in each case and decide what works best. Even by hand it is hard to handle every case well; it depends on the query. The ORM may have a mechanism that tries to "guess" the best strategy.

In many cases there is a lot of repeated information because of the conventional way of working with tabular data, generally using JOIN.
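The repetition a JOIN causes can be seen in a small sketch, again in Python with in-memory SQLite. The `Cliente` and `Descricao` columns and the sample rows are hypothetical, added only so there is parent data to repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nf (Numero INTEGER PRIMARY KEY, Cliente TEXT);
    CREATE TABLE NfItem (Id INTEGER PRIMARY KEY, NfNumero INTEGER, Descricao TEXT);
    INSERT INTO Nf VALUES (1, 'Customer A');
    INSERT INTO NfItem (NfNumero, Descricao) VALUES (1, 'pen'), (1, 'paper'), (1, 'ink');
""")

# One query brings parent and children together, but the parent
# columns (Numero, Cliente) come back once per item row.
rows = conn.execute("""
    SELECT Nf.Numero, Nf.Cliente, NfItem.Descricao
    FROM Nf
    JOIN NfItem ON NfItem.NfNumero = Nf.Numero
    ORDER BY NfItem.Id
""").fetchall()

for row in rows:
    print(row)
```

With three items, the invoice data travels three times. With wide parent rows and many children, that duplicated payload is the cost you trade against the extra query.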

Most of the time, bringing everything at once works out better than bringing things one by one.

For lack of a better solution, it would be something like:

SELECT * FROM Nf;
SELECT * FROM NfItem;

If you have 1000 invoices with an average of 10 items each, there will be 11,000 rows in all (1000 invoices plus 10,000 items), fetched with 2 queries, one large and one huge.

As opposed to the N+1 form:

SELECT * FROM Nf;
-- then, for each invoice returned above (Nf.Numero stands for that invoice's number):
SELECT * FROM NfItem WHERE NfNumero = Nf.Numero;
SELECT * FROM NfItem WHERE NfNumero = Nf.Numero;
SELECT * FROM NfItem WHERE NfNumero = Nf.Numero;
SELECT * FROM NfItem WHERE NfNumero = Nf.Numero;
.
.
.
-- as many queries as there are invoices

I put it on GitHub for future reference.

Here you also get 11,000 rows, but with 1000 small queries and 1 large one.

The code is pretty abstract, just for illustration.
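For something you can actually run, here is a rough sketch of both strategies in Python with in-memory SQLite, scaled down to 100 invoices with 10 items each. The `run` helper and the `counter` dictionary are hypothetical names used only to count how many queries each strategy sends; the tables are the `Nf` and `NfItem` from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nf (Numero INTEGER PRIMARY KEY);
    CREATE TABLE NfItem (Id INTEGER PRIMARY KEY, NfNumero INTEGER);
""")

# Scaled-down data: 100 invoices, 10 items each.
conn.executemany("INSERT INTO Nf (Numero) VALUES (?)", [(n,) for n in range(1, 101)])
conn.executemany(
    "INSERT INTO NfItem (NfNumero) VALUES (?)",
    [(n,) for n in range(1, 101) for _ in range(10)],
)

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one query and count it."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def two_queries():
    """Bring everything at once: 1 + 1 queries."""
    invoices = run("SELECT Numero FROM Nf")
    items = run("SELECT NfNumero, Id FROM NfItem")
    return len(invoices) + len(items)


def n_plus_one():
    """One query for the invoices, then one more per invoice."""
    invoices = run("SELECT Numero FROM Nf")
    total = len(invoices)
    for (numero,) in invoices:
        total += len(run("SELECT Id FROM NfItem WHERE NfNumero = ?", (numero,)))
    return total


counter["queries"] = 0
rows_bulk = two_queries()
queries_bulk = counter["queries"]

counter["queries"] = 0
rows_n1 = n_plus_one()
queries_n1 = counter["queries"]

print(rows_bulk, queries_bulk)
print(rows_n1, queries_n1)
```

Both strategies return the same number of rows; what changes is the number of round trips. With an in-memory database each round trip is cheap, so the gap looks small here. Against a database reached over a network, each of those extra queries pays the latency again.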

Try frying one french fry at a time, and then a whole batch at once. With the first method each fry is done quickly, but the total time gets tragic. The second method takes longer, but when it is done, everything is ready. It is only a problem if you find out that you sold just 3 fries after frying the whole package.
