Python – Splitting a text file into separate files



I need to separate some lines from a file and write each one to another file, chosen according to the line's content. That is, given a list of 6 words, each line goes to the file that corresponds to the word it contains.

Those 6 words can grow to 8, 10, and so on, in which case 8, 10, etc. files have to be created.

At first I tried to create a matrix in which each row would hold the lines containing one of the words.

But I couldn't make it work: I could not append to a row without specifying both a row and a column index for the input.

For example, I want all lines containing orange to go to file orange.txt and all lines containing plum to go to file plum.txt.

The idea would be to write the code without a hard-coded "if plum", "if orange".

I tried putting the words in a list, but I couldn't find a way to write to the files without the if. For example:

frutas = ['laranja', 'ameixa']

with open('frutas.txt', 'r') as arq_fruta:
  for line in arq_fruta:
    coluna = line.split()
    for i in range (len(frutas)):
      if(coluna[1] == variaveis[0]):
        laranja.append(coluna[0] +' '+ coluna[3]+'\n')

The last line is the one I couldn't turn into a list lookup, for example something like:

fruta[i].append(coluna[0] +' '+ coluna[3]+'\n')  # just as an example, doesn't work

where fruta[0] would be the list of all lines containing only orange and fruta[1] all lines with plum.

I tried to create a matrix, but it didn't work: the matrix asks for a row and a column for each input, and I don't have that information, since I'm going to read the file and, supposedly, send each line straight to its output file.

And speaking of files, I also tried something more "direct", but it doesn't work either.

for i in range(1, len(frutas)):
   arq = open(frutas[i]+'.txt','w')

Is there a more "correct" way to do this? I only succeeded with the code that uses "if", which would need a lot of changes if I had to include another fruit, for example.


A more direct way is to create a list of open files, one per fruit. The code then writes each line straight to its destination file, without first splitting the lines into lists in memory, so it can handle input files of any size.

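frutas = ['laranja', 'ameixa']
# One output file per fruit, in the same order as frutas
arquivos = [open(fruta + '.txt', 'w') for fruta in frutas]

with open('frutas.txt', 'r') as arq:
    for linha in arq:
        for fruta, arquivo in zip(frutas, arquivos):
            if fruta in linha:
                # Copy the line to the file of the fruit it mentions
                arquivo.write(linha)

# Close the output files so everything is flushed to disk
for arquivo in arquivos:
    arquivo.close()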
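
One caveat with a plain list of open files is that they stay open if an exception interrupts the loop. A minimal sketch of the same idea using contextlib.ExitStack from the standard library, which closes every file when the with block ends. The function name split_by_word and its parameters are hypothetical, chosen only for this illustration:

```python
import contextlib
import os


def split_by_word(source_path, words, out_dir="."):
    """Copy each line of source_path to <word>.txt for every word it contains.

    split_by_word and its parameters are illustrative names, not part of
    the original answer.
    """
    with contextlib.ExitStack() as stack:
        # Open one output file per word; ExitStack closes them all on exit,
        # even if an exception is raised while reading.
        outputs = {
            word: stack.enter_context(
                open(os.path.join(out_dir, word + ".txt"), "w")
            )
            for word in words
        }
        source = stack.enter_context(open(source_path, "r"))
        for line in source:
            for word, output in outputs.items():
                # Substring test, as in the code above
                if word in line:
                    output.write(line)
```

With this shape, supporting a new fruit only means adding a word to the list, for example split_by_word('frutas.txt', ['laranja', 'ameixa', 'uva']).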

If you really want to separate the lines into variables in memory, one solution is to combine a dictionary with lists, which collections.defaultdict makes easy:

import collections

frutas = ['laranja', 'ameixa']
# Maps each fruit to the list of lines that mention it
por_fruta = collections.defaultdict(list)

with open('frutas.txt', 'r') as arq:
    for linha in arq:
        for fruta in frutas:
            if fruta in linha:
                por_fruta[fruta].append(linha)

So you have all the lists in the dictionary por_fruta. To save them to files later:

for fruta, linhas in por_fruta.items():
    with open(fruta + '.txt', 'w') as f:
        # Each stored line still ends with its newline, so writelines is enough
        f.writelines(linhas)
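
Note that the substring test "fruta in linha" also matches a fruit name that appears inside a longer word. The question's own attempt compares the second column and keeps the first and fourth columns, so here is a hedged sketch of the defaultdict approach under that assumed layout (whitespace-separated columns, fruit name in the second one). The helper name group_by_column and the sample lines are hypothetical:

```python
import collections


def group_by_column(lines, words, key_col=1, keep_cols=(0, 3)):
    """Group selected columns of each line by the word found in key_col.

    Assumes whitespace-separated columns; the defaults mirror the
    coluna[1], coluna[0] and coluna[3] indices used in the question.
    """
    wanted = set(words)
    groups = collections.defaultdict(list)
    for line in lines:
        cols = line.split()
        # Skip lines that are too short to have every column we need
        if len(cols) <= max((key_col, *keep_cols)):
            continue
        if cols[key_col] in wanted:
            groups[cols[key_col]].append(
                " ".join(cols[i] for i in keep_cols) + "\n"
            )
    return groups


# Made-up sample data, only to show the call
sample = ["1 laranja x doce\n", "2 ameixa y azeda\n", "3 laranja z madura\n"]
por_fruta = group_by_column(sample, ["laranja", "ameixa"])
print(dict(por_fruta))
```

The returned dictionary can be saved with the same loop over por_fruta.items() shown above.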