Question:
There is a site on which a voting system (like / dislike) should be organized. If you write information to the database in the form номер_записи
– id_пользователя
, then everything seems to be fine. But. Let's take 100 records and 100 users. Each of them will vote for each record, as a result, the database will contain 10,000 records, and this is a problem.
After reading a little, I changed the table and it became of the form:
номер_записи
– id_всех_пользователей_которые_голосовали
.
Under the same conditions, instead of 10,000, we get only 100 records in the database.
Adding a new user id
goes like this:
- fetch the entire row from the database, where the record number is N
-
SELECT FROM WHERE users_id LIKE '%id_пользователя%'
– a query to check if thisid
already exists in a row - in case of successful verification, add a new
id
to the end of the line and enter a new line in the database.
To what extent is this the correct approach to storing this kind of information? What are the alternatives? Ready to hear and take note.
Answer:
Estimate how often what is required.
- amount (or two amounts: for and against) for a particular vote – super-often – every time you open the page with this vote.
- Did this user vote for this poll? – less often , when receiving a new voice.
- who voted up/down for this question? – even less often , if at all, this is open infa.
- how and where a given user voted – also, probably very rarely , if at all.
In total, we store the current amounts in some kind of fast cache, such as memcached/apc/redis.
When a new vote enters the database, you need to check whether you voted? and save the vote + update the cached amount. It's good to have a separate table just for votes:
id_опроса id_юзера голос_плюс_минус (1 бит)
The main index is composite – for both id – because. we will search both by the question-user (is it possible to take a vote – there can be only one entry with a pair qid,uid
) and by poll (who-how). And still an index only on the user (where voted).
Yes, there will be a record in the database and two indexes for every vote cast. At least 10 million is normal. Each entry is two 32-bit integers and 1 bit. When the project grows, you will begin to scale horizontally – say, poll id less than X will go to additional. mysql server.