Question:
Let me explain: I have a Java application that dumps data about all the processes running on my computer into a text file by running a command. A thread regenerates this file periodically, for example every 5 seconds. The generated file has about 130,000 lines, so looping through it to find a string does not seem very efficient in terms of speed.
I need to find a text string within this file for example: \Device\0000005x
and once found, go back a few lines in the file to find the name of the process that is using it. Some programmers have suggested using document databases (NoSQL), but I'm not sure they have the function I need.
The format in which the processes appear within the file is as follows:
(each process is delimited by a line of dashes "-"; I think this can be useful for finding the process name, which is on the line right after the delimiter):
--
explorer.exe pid: 4632 WATCUT\tofpo
4: Process
8: Mutant
C: Unknown type
10: Unknown type
14: Directory
18: Key
--
SynTPEnh.exe pid: 3692 WATCUT\tofpo
4: Event
8: WaitCompletionPacket
C: IoCompletion
10: TpWorkerFactory
14: IRTimer
18: WaitCompletionPacket
60: Key HKLM\SYSTEM\ControlSet001\Control\Nls\Sorting\Versions
64: File \Device\DeviceApi
68: IRTimer
This is just an example; as I said, the text file is huge and has more than 125,000 lines. Has anyone done something similar, or can anyone with knowledge of NoSQL databases shed some light?
Answer:
If you only need to search for a string from time to time, there is no problem with checking line by line whether each line contains what you want. I also work with a text file of about the size you mention, and going through it takes less time than the refresh interval you describe.
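One way to check this on your own machine is to time a single sequential pass over the file. The sketch below is only a rough measurement, not a proper benchmark; the class name ScanTimer and the method countMatches are made up for illustration, and it assumes Java 8 or later and a file that can be read as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: times one line-by-line pass over the dump file.
class ScanTimer {

    // Counts the lines that contain the wanted text in one sequential pass.
    static long countMatches(Path file, String wanted) throws IOException {
        long count = 0;
        // Files.newBufferedReader(Path) decodes the file as UTF-8 (assumption).
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(wanted)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Usage: java ScanTimer path-to-file searchword
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        long matches = countMatches(Paths.get(args[0]), args[1]);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(matches + " matching lines, scanned in " + millis + " ms");
    }
}
```

If the reported time stays well below the refresh interval, a plain scan is probably good enough.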
I would make a copy of the file first, so that it cannot be refreshed in the middle of the search. I would then search the copy from beginning to end, in case the string appears several times in the file, and keep that copy in case I need more information later.
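The copy step could look roughly like the sketch below. It assumes Java 7 or later (for java.nio.file); the class name DumpSnapshot, the method snapshot, and the ".snapshot" suffix are arbitrary choices made for illustration. Note that it does not protect against the dump being rewritten while the copy itself is in progress.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copy the dump before reading it, so the search
// runs against a file that the dumping thread will not overwrite.
class DumpSnapshot {

    // Copies the dump next to the original with a ".snapshot" suffix
    // (an arbitrary naming choice) and returns the path of the copy.
    static Path snapshot(Path dump) throws IOException {
        Path copy = dump.resolveSibling(dump.getFileName() + ".snapshot");
        // Overwrite the previous snapshot, if there is one.
        Files.copy(dump, copy, StandardCopyOption.REPLACE_EXISTING);
        return copy;
    }

    // Usage: java DumpSnapshot path-to-file
    public static void main(String[] args) throws IOException {
        Path copy = snapshot(Paths.get(args[0]));
        System.out.println("Search in " + copy);
    }
}
```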
Finally, from each match, I would go back to the closest preceding line that contains "pid" and take that whole line, or only the text up to the first space if you just want the name of the executable. If the string appears on several lines, you would get several results.
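To illustrate this last step, here is a minimal sketch that scans the whole file, remembers the most recent process line, and records the executable name for every match instead of stopping at the first one. The class and method names are hypothetical. It assumes Java 8 or later, a file readable as UTF-8, and that process lines are the only ones containing " pid: ". It cuts the name at " pid: " rather than at the first space so that names containing spaces are not truncated; that is a choice of this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: report every process whose block contains the text.
class HandleSearch {

    // Returns the executable names, in file order and without duplicates.
    static Set<String> findProcesses(Path dump, String wanted) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        String header = null; // most recent line that looks like "name pid: N user"
        try (BufferedReader reader = Files.newBufferedReader(dump)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(" pid: ")) {
                    header = line;
                }
                if (header != null && line.contains(wanted)) {
                    // Keep only the text before " pid: ": the executable name.
                    names.add(header.substring(0, header.indexOf(" pid: ")));
                }
            }
        }
        return names;
    }

    // Usage: java HandleSearch path-to-file searchword
    public static void main(String[] args) throws IOException {
        System.out.println(findProcesses(Paths.get(args[0]), args[1]));
    }
}
```

A LinkedHashSet is used so that a process holding several matching handles is listed only once, in the order it appears in the file.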
If that doesn't work for you, tell us and we will find another solution, but it should be enough.
I would not use a database unless you run searches very often, since keeping the database up to date will cost more performance than the few searches you need to do.
UPDATE:
This is a code example:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Test {
    public static void main(String[] args) {
        String descarga = args[0]; // path to the dump file
        String buscado = args[1];  // text to search for
        boolean encontrado = false;
        String strLineaPid = null;
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader buffer = new BufferedReader(new FileReader(descarga))) {
            // Read the file line by line
            String strLinea;
            while ((strLinea = buffer.readLine()) != null) {
                // Remember the most recent process line; match " pid: " rather than
                // a bare "pid" so that paths containing "pid" are not mistaken for it
                if (strLinea.contains(" pid: "))
                    strLineaPid = strLinea;
                // Stop at the first match
                if (strLinea.contains(buscado)) {
                    encontrado = true;
                    break;
                }
            }
        } catch (IOException e) {
            System.err.println("An error occurred: " + e.getMessage());
            e.printStackTrace();
        }
        if (encontrado)
            System.out.println("The process with " + buscado + " is " + strLineaPid);
        else
            System.out.println("No process with the word " + buscado);
    }
}
It takes the file path and the search word as command-line arguments. I ran it as "Test path-to-file searchword" and the output it gave me was:
The process with searchword is otherProcess.exe pid: 2 lele
The file used contained the following:
process.exe pid: 1 lala
another line
another line
--
otherProcess.exe pid: 2 lele
searched line
--
otherProcessMas.exe pid: 3 lili
another line
I hope this helps.