Question:
There is an array like
ar=[283462197, 191777391, 243143621, 451231707, 217268739, ] // и там их 1 310 341 штук
you need to select unique values, I do it through this function
var uniqueAr = function(ar) {
var existing = {},
result = [];
var length = ar.length;
for (i = length; i--;) {
if (!existing.hasOwnProperty(ar[i])) {
result.push(ar[i]);
existing[ar[i]] = true; //any value will do
}
}
return result;};
IMHO it works very fast for 80.774ms
As a result, I have 114262 elements, I do my business and it turns out that 73928 of them need to be deleted, and then the problems begin, I do this:
grdb.user_ids_clear//массив после уникализации
banIds// те что нужно удалить, само собой тоже уникальны
console.time("исключение банов");
var tar = [];
var exist = false;
var banIdslength = banIds.length;
for (let i = grdb.user_ids_clear.length; i--;) {
exist = false;
for (let ii = banIdslength; ii--;) {
if (banIds[ii] === grdb.user_ids_clear[i]) {
exist = true;
break;
}
}
if (!exist) tar.push(grdb.user_ids_clear[i]);
}
console.timeEnd("исключение банов");
And it took 893288.239мс
, which is simply unacceptable, I'll forgive you to explain how it turns out, why the uniqueization procedure of the same complexity is done 4 orders of magnitude faster with a 10 times larger array.
Answer:
In the first case, you are using a hash table, which is any large object. You have only one pass through the array, and the running time is proportional to the size of the array.
In the second case, you use a linear array search every time – and therefore the running time is proportional to the product of the lengths of the arrays.
You cannot use linear search in this problem. Instead, you need to convert the second array to a hash table in the same way as it was done when searching for unique records. And already then to do on this table search.
const bansSet = {};
for (let ii = banIdslength; ii--;) {
bansSet[banIds[ii]] = true;
}
for (let i = grdb.user_ids_clear.length; i--;) {
const exist = bansSet.hasOwnProperty(grdb.user_ids_clear[i]);
// ...
}
PS I still recommend using Set or Map instead of objects – they should work faster on modern browsers:
const bansSet = new Set(banIds);
for (let i = grdb.user_ids_clear.length; i--;) {
const exist = bansSet.has(grdb.user_ids_clear[i]);
// ...
}