Question:
I'm having a problem reading an xlsx file with pandas in Python. The error occurs when des_pt = (f_pt.head()[pt][0]).encode('utf-8').strip()
runs with the pt variable as the column key. It looks like an encoding problem, because some of the characters are non-ASCII (UTF-8).
import pandas as pd
create_result = open('resultado.json', 'w')
i = 0
file_name_pt = pd.ExcelFile('pt.xlsx', encoding='utf-8')
file_name_en = pd.ExcelFile('en.xlsx')
f_pt = pd.read_excel(file_name_pt, sheet_name='Sheet1')
title_pt = f_pt.columns[1:]
f_en = pd.read_excel(file_name_en, sheet_name='Sheet1')
title_en = f_en.columns[1:]
create_result.write('{\n"resultados": [\n')
while i <= 25:
    for pt, en in zip(title_pt, title_en):
        print pt
        pt = pt.encode('utf-8').strip()
        en = en.encode('utf-8').strip()
        print pt
        des_pt = (f_pt.head()[pt][0]).encode('utf-8').strip()
        des_en = (f_en.head()[en][0]).encode('utf-8').strip()
        print des_pt
        create_result.write('{\n"id":%s,\n"nome":"%s",\n"name":"%s",\n"descricao":"%s",\n"description":"%s",\n"combinacoes":[]},\n'%(i, pt, en, '', des_en))
        i += 1
create_result.write(']\n}')
create_result.close()
print 'Done'
The error message:
Traceback (most recent call last):
  File "/Users/atila/Desktop/PyAutomate/firjan_result_generator/firjangenerator.py", line 23, in <module>
    des_pt = (f_pt.head()[pt][0]).encode('utf-8').strip()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/generic.py", line 2486, in _get_item_cache
    values = self._data.get(item)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3066, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'T\xc3\xa9cnico em Energias Renov\xc3\xa1veis'
Answer:
Would you be able to provide the format and some samples of the content (spreadsheets) you are trying to read?
From the description in your question alone, the only thing I can contribute is the following:
The error KeyError: 'T\xc3\xa9cnico em Energias Renov\xc3\xa1veis'
happens because the code looks up a key in a key-value structure (here, the DataFrame's columns) and cannot find it; in this case the key is the string 'T\xc3\xa9cnico em Energias Renov\xc3\xa1veis'.
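One possible cause, offered as an assumption since the spreadsheets are not available: the script encodes pt to UTF-8 before using it as the column key, while the column labels read from the file are still unencoded text, so the two no longer match. The sketch below illustrates that kind of mismatch with a plain dictionary. It is written in Python 3 syntax (in Python 2 the analogous distinction is between unicode and str), and both the columns dictionary and the lookup helper are hypothetical stand-ins, not part of pandas or of the original script.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the DataFrame's columns: the labels are text.
columns = {u"Técnico em Energias Renováveis": [u"descrição"]}

label = u"Técnico em Energias Renováveis"
encoded = label.encode("utf-8")  # byte string: b'T\xc3\xa9cnico em ...'

# The text label is found; the encoded byte string is a different key.
print(label in columns)
print(encoded in columns)


def lookup(table, key):
    """Return table[key], decoding a UTF-8 byte string back to text first.

    Hypothetical helper for illustration only.
    """
    if isinstance(key, bytes):
        key = key.decode("utf-8")
    return table[key]


print(lookup(columns, encoded))
```

If this is what is happening, indexing the DataFrame with the original, unencoded pt and encoding only the values written to the output file would probably avoid the KeyError.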
To help further, please share the content (at least the first few lines) of the files you are reading.