Question:
-------------------
- TABLE -
-------------------
ID | GRUPO | OBJETO
---|-------|-------
1 | 1 | 1
2 | 1 | 2
| |
3 | 2 | 1
4 | 2 | 2
| |
5 | 3 | 1
6 | 3 | 2
7 | 3 | 3
| |
5 | 4 | 1
6 | 4 | 3
| |
7 | 5 | 1
I need to do a query to return all groups that contain exactly the same objects . I've already used subquery with IN
but objects 1 and 2 are also in group 3 .
For example, if I query using where GRUPO = 1
, I only need to return a total of two records (group 1 and group 2).
The objects in group 1 ( 1 and 2 ) are also in group 2. But group 3 contains an object in addition, group 4 contains two objects but one of them is different from group 1, and group 5 contains only one object. ; therefore, groups 3, 4 and 5 should not be considered.
There are some relationships but they don't make much difference in the case.
- Group 1 bought: orange and apple
- Group 2 bought: orange and apple
- Group 3 bought: orange, apple and banana
If I consult by GRUPO = 1
, I need to return ONLY and ONLY who else bought ONLY oranges and apples . I don't care which orange and apple groups they are in, so IN
doesn't work.
Answer:
I think the following solution will give you the result you want.
This query returns all groups that have exactly the same elements as group 1.
SELECT T.GRUPO
FROM TESTE T
LEFT JOIN
(
SELECT DISTINCT T1.OBJECTO,
(SELECT COUNT(DISTINCT T2.OBJECTO)
FROM TESTE T2
WHERE T2.GRUPO = T1.GRUPO) TOTAL
FROM TESTE T1
WHERE T1.GRUPO = 1 --filtras aqui o grupo
) SS
ON SS.OBJECTO = T.OBJECTO
GROUP BY T.GRUPO
HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)
The logic behind this query is as follows:
the subquery
SELECT DISTINCT T1.OBJECTO,
(SELECT COUNT(DISTINCT T2.OBJECTO)
FROM TESTE T2
WHERE T2.GRUPO = T1.GRUPO) TOTAL
FROM TESTE T1
WHERE T1.GRUPO = 1
aims to return all objects from group 1. This is the basis for the process. Initially, this instruction only returned the objects, it was necessary to change it to also return the total number of objects.
With this result we have everything we need to look for the remaining groups that have exactly the same elements. This is done through the following instructions:
HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)
The first means, "all elements of the group (COUNT(DISTINCT T.OBJECT)) must be in group 1 (COUNT(DISTINCT CASE WHEN SS.OBJECT IS NOT NULL THEN SS.OBJECT END)) "
Finally, the instruction COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL) guarantees that the group must have the same number of elements as group 1. This instruction is fundamental to exclude groups that, as in your example, only have Orange. Orange is in group 1 but is fanciing the apple.
MAX is necessary because in the HAVING instruction we can only use aggregate functions or constants.
If you want to get the objects for each of the groups (the objects will always be the same), just use the previous query to filter the results, for example like this
SELECT T.GRUPO,
T.OBJECTO
FROM TESTE T
INNER JOIN
(
SELECT T.GRUPO
FROM TESTE T
LEFT JOIN
(
SELECT DISTINCT T1.OBJECTO,
(SELECT COUNT(DISTINCT T2.OBJECTO)
FROM TESTE T2
WHERE T2.GRUPO = T1.GRUPO) TOTAL
FROM TESTE T1
WHERE T1.GRUPO = 1 --filtras aqui o grupo
) SS
ON SS.OBJECTO = T.OBJECTO
GROUP BY T.GRUPO
HAVING COUNT(DISTINCT T.OBJECTO) = COUNT(DISTINCT CASE WHEN SS.OBJECTO IS NOT NULL THEN SS.OBJECTO END)
AND COUNT(DISTINCT T.OBJECTO) = MAX(TOTAL)
) X
ON T.GRUPO = X.GRUPO
ORDER BY 1, 2
Here is also the SQLFiddle