recombination algorithm


Imagine the following scenario: a gas station is fined for tax evasion for issuing an invoice for tax coupons already issued – what happens is that each vehicle of the companies associated with the station supplied fuel and at the end of the month the station issued an invoice along with the respective invoice.

The operation, however, went wrong for almost 2 years. There was no evasion under any circumstances, however, the tax coupon numbers were not referenced to the invoices.

We are talking about an immense mass of data, where we basically need a match for the amount of liters per type of fuel x value – for example: an invoice of R$ 32,127.12 and 19,047.61 liters of diesel oil has to be " regrouped" with N tax coupons.

However, we have the following problems: fuel prices vary, as the invoice can be a combination of N pumps x N fiscal printers, that is, we are talking about a stratospheric mass of data.

However, knowing that we can limit the recombination "search" radius by date (last 30 days) – (which in sum data volume to trillions of combinations in this period), could we use some tree algorithm? Or some variant algorithm of the traveling salesman?


Looks like you're dealing with a knapsack problem , in English Knapsack problem . It's an NP-complete problem, so it doesn't even have a quick fix, and performance only gets worse with increasing items. He mentioned that it is a huge mass, so without much hope.

The advantage, in your case, is that the tax receipts have a date, and possibly a time. Assuming the account closes , what you can do is sort these coupons by date and time, and go backwards by adding up all the items until it matches or exceeds the closest date grade. He equaled, got lucky: removes the used coupons, marks the note as ok, and starts over. Failed the sum you "give up" of the last coupon still alive, and repeat the process. This solution assumes that the system that issues orders does a similar process, taking only open coupons and closing in packages.

But the hypothesis that the account closes is actually very fragile. An alternative from there is to close sums of notes per day, for the days you "are sure" in certain notes, and then keep pinching the "uncertain" coupons in the note of the previous or subsequent month, depending on whether there is space in this or that.

Solution really, only you getting from the station(s) that prepared this list of coupons and notes. Anything else, even the perfect knapsack solution of all coupons on all bills does not guarantee that this was the original relationship. It's just an approximation like any other on this indeterministic problem.

Scroll to Top