Question:
I have a monthly sales table like the following:
create table ventas (
id int NOT NULL AUTO_INCREMENT,
year int,
month int,
monto numeric(15,2),
PRIMARY KEY (id)
);
insert into ventas (year, month, monto) values (2018, 1, 100);
insert into ventas (year, month, monto) values (2018, 1, 300);
insert into ventas (year, month, monto) values (2018, 3, 340);
insert into ventas (year, month, monto) values (2018, 5, 200);
insert into ventas (year, month, monto) values (2018, 5, 100);
insert into ventas (year, month, monto) values (2018, 7, 100);
insert into ventas (year, month, monto) values (2018, 8, 100);
insert into ventas (year, month, monto) values (2018, 9, 200);
insert into ventas (year, month, monto) values (2018,11, 350);
insert into ventas (year, month, monto) values (2018,12, 440);
I am trying to make a report of these sales per month, I tried it like this:
select year,
month,
sum(monto) as total
from ventas
group by year,
month
And I get something like this:
| year | month | total |
|------|-------|-------|
| 2018 | 1 | 400 |
| 2018 | 3 | 340 |
| 2018 | 5 | 300 |
| 2018 | 7 | 100 |
| 2018 | 8 | 100 |
| 2018 | 9 | 200 |
| 2018 | 11 | 350 |
| 2018 | 12 | 440 |
Which is correct, but as can be seen there are "gaps", that is, months without values, I would like to have a report, but with the 12 months and complete those that have not had sales with a 0, that is to say something like this
| year | month | total |
|------|-------|-------|
| 2018 | 1 | 400 |
| 2018 | 2 | 0 |
| 2018 | 3 | 340 |
| 2018 | 4 | 0 |
| 2018 | 5 | 300 |
| 2018 | 6 | 0 |
| 2018 | 7 | 100 |
| 2018 | 8 | 100 |
| 2018 | 9 | 200 |
| 2018 | 10 | 0 |
| 2018 | 11 | 350 |
| 2018 | 12 | 440 |
Important
It may not be sales, or it may be some other type of data, it is important to understand the conceptual problem, which is, what do we do when we are missing information in a table? When we do not have sensor readings for all hours, when there are accounting accounts that do not register movements in certain months, when we want to list the sales of all the branches, but there are branches that have not had sales, when we want to know how many people occupied a room, but there are rooms that have never been occupied, etc.
Answer:
Before trying anything weird, be clear that we are missing data, we don't have data for months 2, 4, 6 and 10. So there is no magic possible to fix this. Whatever solution we face, the first thing to do is "invent" these rows that we don't have. And then how do we "invent" the missing rows? . In my opinion, the best alternative is:
Manage a period table
You do not have it? It is a good time to create it, believe me these tables are tremendously useful. For example, a fairly flexible period table form would be:
create table Periodos (
PeriodoId int,
Tipo char(1),
Nombre varchar(100),
FechaDesde date,
FechaHasta date,
PRIMARY KEY (PeriodoId, Tipo)
);
insert into Periodos(PeriodoId, Tipo, Nombre, FechaDesde, FechaHasta)
SELECT 201801, 'M', 'Enero/2018', '20180101', '20180201' UNION
SELECT 201802, 'M', 'Febrero/2018', '20180201', '20180301' UNION
SELECT 201803, 'M', 'Marzo/2018', '20180301', '20180401' UNION
SELECT 201804, 'M', 'Abril/2018', '20180401', '20180501' UNION
SELECT 201805, 'M', 'Mayo/2018', '20180501', '20180601' UNION
SELECT 201806, 'M', 'Junio/2018', '20180601', '20180701' UNION
SELECT 201807, 'M', 'Julio/2018', '20180701', '20180801' UNION
SELECT 201808, 'M', 'Agosto/2018', '20180801', '20180901' UNION
SELECT 201809, 'M', 'Septiembre/2018', '20180901', '20181001' UNION
SELECT 201810, 'M', 'Octubre/2018', '20181001', '20181101' UNION
SELECT 201811, 'M', 'Noviembre/2018', '20181101', '20181201' UNION
SELECT 201812, 'M', 'Diciembre/2018', '20181201', '20190101';
PeriodId
: it will be ourid
of the period, the important thing is that it respects the order of the periods, so that by ordering by this column, we obtain the natural order of the periods-
Tipo
: A "luxury" that we can afford, to manage different types of periods, for exampleM
for monthly periods, it could beY
to carry annual periods,S
to carry weekly periods,F
for fiscal periods, etc. - The
Nombre
is simply the description that we are going to display -
FechaDesde
andFechaHasta
are useful when, for example, we have to obtain a period given a date, thatFechaHasta
has one more day, it is simply a help for theBETWEEN
to work perfectly.
Now that we have this table of periods, it is extremely easy to solve this problem. Actually, what we have to think about is that the conceptual problem is: (a) list all the periods (b) obtain the total sales of each one:
select p.PeriodoId,
p.Nombre,
ifnull(sum(v.monto),0) as total
from Periodos P
left join ventas v
on P.PeriodoId = (v.year*100) + v.month
where p.tipo = 'M'
group by p.PeriodoId,
p.Nombre;
And now if:
| PeriodoId | Nombre | total |
|-----------|-----------------|-------|
| 201801 | Enero/2018 | 400 |
| 201802 | Febrero/2018 | 0 |
| 201803 | Marzo/2018 | 340 |
| 201804 | Abril/2018 | 0 |
| 201805 | Mayo/2018 | 300 |
| 201806 | Junio/2018 | 0 |
| 201807 | Julio/2018 | 100 |
| 201808 | Agosto/2018 | 100 |
| 201809 | Septiembre/2018 | 200 |
| 201810 | Octubre/2018 | 0 |
| 201811 | Noviembre/2018 | 350 |
| 201812 | Diciembre/2018 | 440 |
- We need to do
ifnull(sum(v.monto),0)
to put a 0 in those periods where we have no sales. - In the
join
we doon P.PeriodoId = (v.year*100) + v.month
to generate from a year and month the id with which we have defined the period. It is a way, you could also handle a simpler table of periods only with years and months
Important : You must generate the period tables to cover the full reasonable universe of cases. For example, what is your first date registered in the system? What is the last? to those limits add a few more periods up and down, enough to not have future problems (or at least not be alive when the claims come).