Question:
I have a file with HTML code. I need to find every tag that has a style="..." attribute
and remove that attribute.
I tried to write something, and this is what I came up with:
import re

pattern = 'style="(?P<text>[(\w;:-)]*)'
re_style = re.compile(pattern)
with open('index.html', 'r') as open_file:
    for line in open_file:
        if re_style.search(line):
            print(re_style.search(line).groups())
But style="(?P<text>[(\w;:-)]*)
only matches up to the first character outside the character class, such as a space. To get at the text inside style=""
I wanted to use groups, but then how can I remove the found group from the string?
Can someone help?
Answer:
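import re
import sys

def replaceLine(fileName, sourceText, replaceText):
    # Read the whole file, replace every occurrence, and write it back.
    with open(fileName, 'r') as file:
        text = file.read()
    with open(fileName, 'w') as file:
        file.write(text.replace(sourceText, replaceText))

# Group 1 captures everything between the quotes of style="...".
pattern = r'style\s*=\s*"([^"]*)"'
re_style = re.compile(pattern)

# Read all lines first so the file is not rewritten while it is still being read.
with open(sys.argv[1], 'r') as open_file:
    lines = open_file.readlines()

for line in lines:
    match = re_style.search(line)
    if match:
        # Replaces every occurrence of the captured style value in the file
        # with an empty string, which leaves style="" in place.
        replaceLine(sys.argv[1], match.group(1), '')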
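If the goal is to drop the whole attribute rather than only empty its value, re.sub can do that in one pass over the text, so there is no need to capture a group and cut it out by hand. This is only a minimal sketch under a few assumptions: the helper name strip_style_attributes is made up for illustration, attribute values are assumed to be double-quoted as in the question, and a regex is not an HTML parser, so a style="..." that appears inside a script or in plain text would be removed as well.

```python
import re

# Matches optional leading whitespace plus a double-quoted style attribute.
# The lookbehind (?<![\w-]) keeps names such as data-style="..." from matching.
STYLE_ATTR = re.compile(r'\s*(?<![\w-])style\s*=\s*"[^"]*"', re.IGNORECASE)


def strip_style_attributes(html):
    """Return html with every double-quoted style attribute removed.

    Hypothetical helper, not part of any library.
    """
    return STYLE_ATTR.sub('', html)


sample = '<p style="color: red; margin: 0">Hi</p><div class="a" style="">x</div>'
print(strip_style_attributes(sample))
# prints: <p>Hi</p><div class="a">x</div>
```

To apply it to a file, read the whole file into a string, pass it through the function, and write the result back once, instead of rewriting the file for every matching line.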