How to write this regular expression in Python 3.6

Question:

I need to make a regular expression to extract the links from this string:

links = ('href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li><li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>')

The string is much longer. I only put a part because the rest is repeated. Here's what I've tried:

campus1 = re.findall("href", links)
campus2 = re.findall("http", links)
campus3 = re.findall("href=http", links)
campus4 = re.findall("hre", links)
campus5 = re.findall("a", links)
campus6 = re.findall("<a> <\a>", links)

When I print the results, I get either isolated letters or the links together with the course names (later I'll also need an expression that extracts only those names). Does anyone have any ideas? For example, when I run campus1 = re.findall("href", links), the output is: 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href', 'href'… That is, it returns every "href" in the string. I would like to extract only the links, for example:

http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70

That is, every link like that one in this string.

Answer:

Do it like this:

import re
s = "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li><li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>"
print(re.findall(r'href=[\'"]?([^\'" >]+)', s))
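To make it clearer why that pattern works, here is the same expression written with re.VERBOSE so each part can carry a comment. The second pattern is only a sketch for the course names mentioned in the question: it assumes href is the last attribute of the tag and that the link text contains no nested tags, which holds for the sample but may not hold for the full page. The names LINK_RE, PAIR_RE, links and pairs are mine, not from the question.

```python
import re

# Sample in the same shape as the string from the question.
s = (
    "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/"
    "lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=70>"
    "ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO A</a></li>"
    "<li><a href=http://www.ufjf.br/cdara/sisu-2/sisu-2017-1a-edicao/"
    "lista-de-espera-sisu-3/?id_curso=01GV&id_grupo=71>"
    "ADMINISTRAÇÃO - GOVERNADOR VALADARES - DIURNO - SISU - GRUPO B</a></li>"
)

# Same pattern as in the answer, spelled out part by part.
# In verbose mode, whitespace inside a character class is still significant.
LINK_RE = re.compile(
    r"""
    href=          # the literal attribute name and the equals sign
    ['"]?          # an optional opening quote, single or double
    ([^'" >]+)     # group 1: the URL, up to a quote, a space or '>'
    """,
    re.VERBOSE,
)

# Hypothetical extension: also capture the text between '>' and the next '<'.
# Assumes href is the last attribute and the link text has no nested tags.
PAIR_RE = re.compile(
    r"""
    href=['"]?([^'" >]+)   # group 1: the URL, as above
    ['"]?\s*>              # optional closing quote, then the end of the tag
    ([^<]+)                # group 2: the link text, up to the next '<'
    """,
    re.VERBOSE,
)

links = LINK_RE.findall(s)   # list of URL strings
pairs = PAIR_RE.findall(s)   # list of (url, name) tuples

for url, name in pairs:
    print(name, "->", url)
```

Because findall returns only the captured groups, links is a list of URLs and pairs is a list of (url, name) tuples. For messier HTML, a real parser such as the standard library's html.parser is usually more robust than a regular expression.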

