Random sampling of mysql records

Question:

How can you randomly select records from a table using a single query?

PS: There is a table of articles, as well as a pivot table (articles on the topic), with a link to the first table. It is necessary to add N records to the pivot table, pointers to articles in a random way.

Is there any way to randomly select rows from a table in a query?

Answer:

If the size of the table of articles articles not large, you can sort the records in random order, for example, using the ORDER BY RAND() construct. You can insert records into the theme_articles table using the INSERT ... SELECT ...

INSERT INTO
  theme_articles
SELECT
  *
FROM
  articles
ORDER BY RAND()
LIMIT 2;

Unfortunately, the ORDER BY RAND() construct has a sad glory, as it most often means sorting data on the hard disk. You can reduce the likelihood of this event by reducing the size of the staging table. Even if EXPLAIN reports sorting in a file, with a small table size, there is a high probability that the sorting will still take place in RAM (especially if the value of the tmp_table_size directive is high enough – several tens of megabytes).

In any case, it makes sense to sort in random order not the entire table, but only the primary keys – this will reduce the size of the intermediate table being sorted.

INSERT INTO
  theme_articles
SELECT
  m.*
FROM
  (SELECT id FROM articles ORDER BY RAND() LIMIT 2) AS r
LEFT JOIN
  articles AS m
USING(id);

This option is already suitable for medium tables. If you have the opportunity to randomly select primary keys outside the DBMS, this is the most ideal option in the case of giant tables (reaching gigabyte sizes). As a rule, in such cases, the DBMS is used in conjunction with some NoSQL solution that is completely located in RAM. You can try to extract the IDs of random articles from it (which is obviously faster than MySQL tied to transactions and hard disk) and use the first query, using the IN construction to select articles

INSERT INTO
  theme_articles
SELECT
  *
FROM
  articles
WHERE
  id IN (252, 1024);

The queries above assume that the table structure is identical, the primary key is named id , and the number of random records to be inserted is 2.

Scroll to Top