Question:
I need to capture only the `where` clause of several queries to analyze the filters used. For example:
select "DIM_1"."col1",
"DIM_2"."col2" ,
"DIM_3"."col3" ,
"DIM_4"."col4" ,
"FAT_1"."col5"
from "FAT_1",
"DIM_1",
"DIM_2",
"DIM_3",
"DIM_4"
where "DIM_1"."col1" IS NOT NULL
AND "DIM_2"."col2" LIKE ('SUCCESS')
AND "DIM_3"."col3" BETWEEN 20161213 AND 20161222
AND "DIM_4"."col4" > 0
I stored the SQL in a list and then tried to use a regular expression to extract the `where` part, but without success. Here is what I tried:
`for line in sql:`
`    if re.search(r'[where]\W', line):`
`        where.append(line)`
Unfortunately, I couldn't extract only the `where` part. Can you tell me what mistake I made and how to fix it?
Answer:
I believe what you want is to extract the `where` clauses.
First, let's check what can come after the `where` clause.
Clauses that can follow `where`
These come from the PostgreSQL documentation; I list only the most common ones:
- GROUP BY
- HAVING
- ORDER BY
- LIMIT
- OFFSET
- Nothing, because the `where` clause can also be the last part of the query, with no subsequent clause.
REGEX
- pattern: `(?<=where)(.*?)((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*)`
- flags: `si`
Explanation
- `(?<=where)`: a lookbehind that ensures the match starts right after `where`.
- `(.*?)`: lazily captures everything that follows; this is the content of the `where` clause.
- `((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*)`: guarantees that the capture ends at one of these keywords or at the end of the input (`$`).
- Flag `s`: makes `.` (dot) also match `\n`.
- Flag `i`: case-insensitive, so the keywords match in uppercase or lowercase.
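As a rough sketch, this is how the pattern and flags above could be applied in Python, where `s` and `i` correspond to `re.S` and `re.I`. The helper name `extract_where` and the shortened `sql` list are only illustrative assumptions, not part of the original question. The query is joined into a single string first, since the pattern has to see the whole statement rather than one line at a time.

```python
import re

# Pattern from the answer; re.S lets "." match newlines, re.I ignores case.
WHERE_RE = re.compile(
    r"(?<=where)(.*?)((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*)",
    re.S | re.I,
)


def extract_where(query):
    """Return the text between WHERE and the next clause, or None if absent."""
    match = WHERE_RE.search(query)
    return match.group(1).strip() if match else None


# Hypothetical input: one query stored as a list of lines, as in the question.
sql = [
    'select "DIM_1"."col1"',
    'from "FAT_1", "DIM_1"',
    'where "DIM_1"."col1" IS NOT NULL',
    'AND "FAT_1"."col5" > 0',
    'order by "DIM_1"."col1"',
]

# Join the lines so the regex can span the whole statement.
where_clause = extract_where("\n".join(sql))
print(where_clause)
```

Two things likely went wrong in the original attempt: `[where]` is a character class that matches any single one of the letters `w`, `h`, `e`, `r`, not the word `where`, and testing line by line can only select whole lines. Note also that this pattern is a simplification: it would be fooled by `where` or one of the listed keywords appearing inside an identifier, a string literal, or a subquery, so a real SQL parser may be safer for complex queries.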