Question:
Consider that we have models A, B, C and D, each one referring to the previous one; that is, A has many Bs, B has many Cs and C has many Ds:
no redundancy
A
| id | ... |
------------
| ... | ... |
B
| id | a_id | ... |
--------------------
| ... | ... | ... |
C
| id | b_id | ... |
--------------------
| ... | ... | ... |
D
| id | c_id | ... |
--------------------
| ... | ... | ... |
Would it be recommended to include more columns in C and D indicating which records of A and B they reference?
with redundancy
C
| id | a_id | b_id | ... |
---------------------------
| ... | ... | ... | ... |
D
| id | a_id | b_id | c_id | ... |
----------------------------------
| ... | ... | ... | ... | ... |
It's a redundancy, but I usually do this to make queries easier, since it avoids JOINs. I also believe that avoiding JOINs improves performance.
Is this considered good or bad practice? (If not in all cases, at least when the data is immutable) Is there any better solution for this?
Answer:
My personal opinion is that the benefit doesn't outweigh the ripple effect of integrity updates and validations across a large volume of data.
- Many DBMSs implement execution plans where dependency between structures is taken into account. By increasing the complexity of the relationships described by foreign keys, you may be hampering the evaluation of this plan;
- If you delete or modify an item from table A, all dependent records of B, C and D still need to be validated for consistency. The proposed model does not offer any benefit for data change operations (UPDATE/DELETE).
- If you change the A.ID type, you will need to make changes to all tables, not just the B.A_ID type.
- Several ORM tools rely on a correct representation of the interdependence between tables in the database.
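To make the consistency point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The lowercase table names, the ids and the in-memory database are assumptions made up for illustration; only the column layout follows the "with redundancy" version of C from the question.

```python
import sqlite3

# Hypothetical miniature of the redundant schema:
# C stores b_id plus a copy of the a_id already implied by B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY,
                    a_id INTEGER NOT NULL REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY,
                    a_id INTEGER NOT NULL REFERENCES a(id),  -- redundant copy of b.a_id
                    b_id INTEGER NOT NULL REFERENCES b(id));
    INSERT INTO a (id) VALUES (1), (2);
    INSERT INTO b (id, a_id) VALUES (10, 1);
    INSERT INTO c (id, a_id, b_id) VALUES (100, 1, 10);
""")

# Move B 10 from A 1 to A 2. Nothing updates the copy stored in C.
conn.execute("UPDATE b SET a_id = 2 WHERE id = 10")

# Find rows in C whose a_id now disagrees with their parent B.
stale = conn.execute("""
    SELECT c.id, c.a_id, b.a_id
    FROM c JOIN b ON b.id = c.b_id
    WHERE c.a_id <> b.a_id
""").fetchall()
print(stale)  # [(100, 1, 2)]
```

Every foreign key here still points at an existing row, so the database accepts the update, yet C 100 now claims to belong to A 1 while its parent B belongs to A 2. Keeping the copies in sync would have to be handled by the application or by triggers.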
If the only advantage of this implementation is the convenience of writing queries, I would go the extra mile to describe the interrelationship correctly.
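As a rough sketch of what that looks like with the "no redundancy" schema, the JOIN chain can be written once in a helper and reused. This again uses Python's sqlite3 module; the helper name d_ids_for_a, the lowercase table names and the sample ids are hypothetical and not part of the question.

```python
import sqlite3

# Hypothetical miniature of the normalized schema: each table
# references only its direct parent.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY,
                    a_id INTEGER NOT NULL REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY,
                    b_id INTEGER NOT NULL REFERENCES b(id));
    CREATE TABLE d (id INTEGER PRIMARY KEY,
                    c_id INTEGER NOT NULL REFERENCES c(id));
    INSERT INTO a (id) VALUES (1), (2);
    INSERT INTO b (id, a_id) VALUES (10, 1), (20, 2);
    INSERT INTO c (id, b_id) VALUES (100, 10), (200, 20);
    INSERT INTO d (id, c_id) VALUES (1000, 100), (1001, 100), (2000, 200);
""")

def d_ids_for_a(conn, a_id):
    """Return the ids of all D rows that descend from the given A row."""
    rows = conn.execute("""
        SELECT d.id
        FROM d
        JOIN c ON c.id = d.c_id
        JOIN b ON b.id = c.b_id
        WHERE b.a_id = ?
        ORDER BY d.id
    """, (a_id,)).fetchall()
    return [row[0] for row in rows]

print(d_ids_for_a(conn, 1))  # [1000, 1001]
```

The query is longer than a direct lookup on D.a_id, but the link between D and A is stored in exactly one place, so there is nothing to keep in sync. Whether the extra JOINs cost anything measurable depends on the DBMS, the indexes and the data volume, so that part is worth measuring rather than assuming.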
(Additionally, there is a software development principle called 'don't repeat yourself', which some DBAs also apply because they consider it good practice for data hygiene.)