## Question:

I'm trying to analyze some shoe sales data, but I'm having trouble writing a function that finds the shoe size each customer bought most often in the previous year.

I have a table with this data:

```
Cód. Cliente  CPF          Nome                              Sexo       Tamanho
5879099       37513584800  LOJA                              MASCULINO  35
5879099       37513584800  LOJA                              MASCULINO  23
5879099       37513584800  LOJA                              MASCULINO  17
5879099       37513584800  LOJA                              MASCULINO  37
5879099       37513584800  LOJA                              MASCULINO  17
3353800       2613618809   DULIO JOSE DE SOUSA DAMICO        MASCULINO  35
3353800       2613618809   DULIO JOSE DE SOUSA DAMICO        MASCULINO  39
3112300       29953652805  ROSANA DA SILVA FAGUNDES          FEMININO   34
6116202       39285701884  ANA CAROLINA DE FARIAS FRANCISCO  FEMININO   31
```

The real table is much larger; these are just a few example rows.

What I need to know is which size (`Tamanho`) occurs most often for each customer's CPF.

In other words, which size did each customer buy the most?

I couldn't find a way to do this, so any pointers would be appreciated.

Thanks,

## Answer:

Yuri, you could use a *pivot table* (`pivot_table`) in pandas.

It would look something like this:

```
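import pandas as pd

# read_excel is only an example; load the data from your own source
df = pd.read_excel("YOUR FILE")

# Count the rows (purchases) for each CPF and size combination.
# "Cód. Cliente" is used only as the column to count; it is assumed to have no missing values.
table = pd.pivot_table(df, index=["CPF", "Tamanho"],
                       values="Cód. Cliente",
                       aggfunc="count", fill_value=0)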
```

I used *'read_excel'* just as an example; in your case, simply load the DataFrame with your own data.

The *'index'* parameter lists the category columns to group by; they become the row index of the pivot table, so here you get one row per CPF and size combination.

In *'aggfunc'* (the aggregation function) I'm using a count, so each row shows how many times that customer bought that size.
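The pivot table gives the count for every CPF and size pair, but the question asks for the single most frequent size per customer. One possible way to get there, sketched below, is to group by CPF and take the mode of `Tamanho`. This swaps the pivot table for `groupby` plus `Series.mode`; the small DataFrame is rebuilt by hand from the sample rows in the question, and the name `most_bought` is just illustrative.

```python
import pandas as pd

# Sample rows copied from the question (only the two columns needed here)
df = pd.DataFrame({
    "CPF": [37513584800] * 5 + [2613618809] * 2 + [29953652805, 39285701884],
    "Tamanho": [35, 23, 17, 37, 17, 35, 39, 34, 31],
})

# For each CPF, take the most frequent size.
# mode() can return several values when there is a tie; .iloc[0] keeps the first one.
most_bought = df.groupby("CPF")["Tamanho"].agg(lambda s: s.mode().iloc[0])

print(most_bought)
```

For CPF 37513584800 this gives 17, the only size that appears twice. Customers with a tie (such as 2613618809, with one purchase each of 35 and 39) get just one of the tied sizes, so you may want to handle that case differently.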

This link has useful material on *pivot tables* that may help you further.