Question:
I'm using the requests library and make a request:
req = requests.post(f'example.com/api', headers=headers).text
The response comes back as JSON, which looks like this:
{"description":"\u003Cp\u003E\u003Cb\u003E\u041a\u043e\u043c\u043f\u0430\u043d\u0438\u044f"}
Once the response is stored in a variable of type str, I need to convert it from this encoding into normal human-readable text and print it to the console. If, for example, I put the text into a variable by hand and do a replace:
somevar = '{"description":"\u003Cp\u003E\u003Cb\u003E\u041a\u043e\u043c\u043f\u0430\u043d\u0438\u044f"}'.replace('\\', '\\\\')
Outputs:
<p><b>Компания
But if I try to convert the same text when it arrives as the response to the request and is stored in the req variable (it is exactly the same) and print it:
print(req.replace('\\', '\\\\'))
Then this text is displayed in its raw form:
{"description":"\\u003Cp\\u003E\\u003Cb\\u003E\\u041a\\u043e\\u043c\\u043f\\u0430\\u043d\\u0438\\u044f"}
I've tried everything I could think of. I tried decoding it with codecs.decode(). I tried building a dictionary from it with json.loads() and extracting the text by the description key. I also found an online converter that turns JSON into plain text, and that worked, but I need to do this in my own code :) Please help, kind people. I rarely ask questions and usually just google everything, but this time that got me nowhere. I've spent half a day on this with nothing to show for it.
UPDATE: I probably should have done this from the start, sorry. Here is the full source code, to make the situation as clear as possible for the more experienced people who want to help. You can run it and see the full JSON response, along with everything else I'm working with:
import requests

if __name__ == '__main__':
    url = 'https://www.work.ua/ru/jobs/2429327/'
    # Headers copied from the browser's network tab
    headers = {
        'Authority': 'www.work.ua',
        'Method': 'POST',
        'Scheme': 'https',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'ru-UA,ru-RU;q=0.9,ru;q=0.8,en-US;q=0.7,en;q=0.6',
        'Content-Length': '0',
        'Dnt': '1',
        'Origin': 'https://www.work.ua',
        'Referer': url,
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest'
    }
    # .text returns the response body as a str, with the \uXXXX escapes still in place
    req = requests.post(f'{url}ajax/get-jobs-data/', headers=headers).text
    print(req)
I just need to convert the response into human-readable form somehow, but I still can't manage it.
Answer:
Everything you need has actually already been said in the comments. It all works if you replace the text attribute with the json() method, which parses the JSON body and decodes the \uXXXX escapes along the way. That is, the penultimate line should look like this:
req = requests.post(f'{url}ajax/get-jobs-data/', headers=headers).json()
Result (output truncated):
{'description': '<p><b>Компания ***, официальный партнер **** в\xa0Украине, проводит набор ...'}
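In case it helps to see why the manual test behaved differently from the real response: in a normal Python string literal, the interpreter itself turns \u041a into К when the source is compiled, whereas the text received from the server contains a literal backslash followed by u041a, and only a JSON parser decodes that. Below is a minimal offline sketch, assuming the body is the short sample from the question. The raw string stands in for the value of the text attribute (no request is made), and the variable names are made up for illustration. As far as I know, the json() method amounts to running a JSON parser over the body in much the same way.

```python
import json

# Raw string: backslashes stay literal, like the text received from the server.
raw_body = r'{"description":"\u003Cp\u003E\u003Cb\u003E\u041a\u043e\u043c\u043f\u0430\u043d\u0438\u044f"}'

# Normal literal: Python decodes the \u escapes when it compiles the source.
typed_by_hand = '{"description":"\u003Cp\u003E\u003Cb\u003E\u041a\u043e\u043c\u043f\u0430\u043d\u0438\u044f"}'

# They look identical in the source file but are different strings in memory.
strings_differ = raw_body != typed_by_hand

# The JSON parser decodes the \uXXXX escapes found in the raw body.
data = json.loads(raw_body)
description = data["description"]

print(strings_differ)
print(description)
```

This is also why calling replace on the backslashes only makes things worse: it doubles the literal backslashes in the received text instead of decoding anything.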
One thing to keep in mind: what you get in the response is raw HTML markup with a large number of escaped characters.
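If plain text is what you are after, one possible way to clean up that markup is with the standard library alone. This is only a sketch under assumptions: TextExtractor, html_to_text and the sample string are hypothetical names and data made up for illustration, and a dedicated HTML library may suit real pages better.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment):
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    # Join text nodes, swap non-breaking spaces for plain ones, collapse whitespace
    text = "".join(parser.parts).replace("\xa0", " ")
    return " ".join(text.split())


# Made-up fragment in the spirit of the description field
sample = "<p><b>Компания</b> в\xa0Украине &amp; партнёры</p>"
print(html_to_text(sample))
```

In the real script you would pass the value stored under the description key to html_to_text instead of the sample string.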