There is a list `lst` whose values are interspersed with delimiter elements, for example `["spam", "ham", None, "eggs", None, None, "bacon"]`. I want to get a list of lists by splitting `lst` on the delimiter `sep = None`, i.e., get `[["spam", "ham"], ["eggs"], ["bacon"]]`.
I looked through the standard library but didn't find anything similar. Searching PyPI is hard, and a quick look there turned up nothing either. A brazen attempt to exploit `str.split`, of course, failed.
Please suggest a more elegant solution than this eyesore of a monster. I don't want to feel like Frankenstein.
    from functools import reduce

    # Fugly.
    def split_on(sep, lst):
        """
        Given an iterable `lst`, split it into iterable of lists by `sep`.

        >>> list(split_on(0, [1, 2, 3, 0, 4, 5, 0, 0, 6]))
        [[1, 2, 3], [4, 5], [6]]
        """
        # `sep` may be a predicate or a plain value to compare against.
        s = sep if hasattr(sep, "__call__") else lambda x: x == sep
        return filter(lambda sublist: len(sublist) > 0,
                      reduce(lambda x, elem: x + [[]] if s(elem)
                             else x[:-1] + [x[-1] + [elem]],
                             lst, [[]]))
- Apparently, functional programming has left a serious imprint on you 🙂
Let me note right away that split semantics, applied to your example, imply the result `[["spam", "ham"], ["eggs"], [], ["bacon"]]`. This is because, in terms of delimiters, there is an empty list between the two consecutive `None` values.
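For comparison, here is a small illustration of the same convention in `str.split` when it is given an explicit separator. The sample string is made up for this example; it only mirrors the shape of the list in the question.

```python
# str.split with an explicit separator keeps the empty piece
# that sits between two adjacent separators.
parts = "spam,ham|eggs||bacon".split("|")
print(parts)  # ['spam,ham', 'eggs', '', 'bacon']
```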
So, there are several possible solutions. The most explicit option is something along the lines of:
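    def split_on(what, delimiter=None):
        # Start with one empty group so the first item has somewhere to go.
        splitted = [[]]
        for item in what:
            if item == delimiter:
                # A delimiter closes the current group and opens a new one.
                splitted.append([])
            else:
                splitted[-1].append(item)
        return splitted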
Clearly, this solution is correct only up to the function's contract for two edge cases: an empty list, `[]`, and a list consisting of a single delimiter, `[None]`. I defined the contract like this:

    [] -> [[]]
    [None] -> [[], []]

The first case is rather debatable. In the second, the result follows because there is an empty sequence on each side of the separator.
If you want different behavior, modifying the function should not be difficult.
    list1 = ["spam", "ham", None, "eggs", None, None, "bacon"]
    list2 = []
    list3 = [None]
    list4 = ["eggs"]

    print(split_on(list1))
    print(split_on(list2))
    print(split_on(list3))
    print(split_on(list4))

    # Result:
    # [['spam', 'ham'], ['eggs'], [], ['bacon']]
    # [[]]
    # [[], []]
    # [['eggs']]
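If empty sublists are unwanted, as in the expected output from the question, one possible modification is sketched below. The name `split_on_skip_empty` is made up for this illustration, and the contract it assumes (both `[]` and `[None]` give `[]`) is only one of several reasonable choices.

```python
def split_on_skip_empty(what, delimiter=None):
    """Split `what` on `delimiter`, dropping empty sublists (assumed contract)."""
    splitted = []
    current = []
    for item in what:
        if item == delimiter:
            # Only keep the group collected so far if it has items.
            if current:
                splitted.append(current)
            current = []
        else:
            current.append(item)
    # Flush the trailing group, again only if it is non-empty.
    if current:
        splitted.append(current)
    return splitted


print(split_on_skip_empty(["spam", "ham", None, "eggs", None, None, "bacon"]))
# [['spam', 'ham'], ['eggs'], ['bacon']]
print(split_on_skip_empty([]))      # []
print(split_on_skip_empty([None]))  # []
```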
As alternatives, you could write a generator similar to the function above using `yield`, and I think you could also come up with a solution that breaks the iterable into groups and then combines the results using `itertools`. That said, it seems to me that those solutions would be somewhat less obvious than the one proposed above.
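A rough sketch of both alternatives follows. The names `iter_split_on` and `split_on_groupby` are invented for this illustration. Note the difference in semantics: the generator keeps empty groups, like the explicit function above, while the `itertools.groupby` variant collapses runs of delimiters and therefore never produces an empty group.

```python
from itertools import groupby


def iter_split_on(what, delimiter=None):
    """Lazily yield sublists of `what` split on `delimiter`, keeping empty ones."""
    current = []
    for item in what:
        if item == delimiter:
            yield current
            current = []
        else:
            current.append(item)
    # The last group is yielded even when it is empty.
    yield current


def split_on_groupby(what, delimiter=None):
    """Split on `delimiter` via groupby; consecutive delimiters collapse."""
    return [list(group)
            for is_delim, group in groupby(what, lambda item: item == delimiter)
            if not is_delim]


data = ["spam", "ham", None, "eggs", None, None, "bacon"]
print(list(iter_split_on(data)))  # [['spam', 'ham'], ['eggs'], [], ['bacon']]
print(split_on_groupby(data))     # [['spam', 'ham'], ['eggs'], ['bacon']]
```

The `groupby` version happens to match the output requested in the question, at the price of hiding the grouping logic behind a key function.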