I looked a bit into the stdlib, but nothing there seemed to fit (oh, did I mention I'm still on 2.4?), so I turned to Google and found a nice recipe at ActiveState; unfortunately it discards the last list if it has fewer than n items.
Searching again, I had more luck with this article: it's a generator that takes a list and yields tuples of n elements each, optionally returning the last, shorter tuple as well. I slightly modified it to obtain:
def group_iter(iterator, n=2):
    """Given an iterator, it returns sub-lists made of n items
    (except the last, which can have len < n).
    Inspired by http://countergram.com/python-group-iterator-list-function"""
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n:  # tested as fast as a separate counter
            yield accumulator
            accumulator = []  # tested faster than accumulator[:] = []
                              # and as fast as re-using one list object
    if len(accumulator) != 0:
        yield accumulator
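A quick check of the behavior (restating the function for a self-contained demo, in modern Python; the original targets 2.4):

```python
def group_iter(iterator, n=2):
    """Yield sub-lists of n items each; the last may be shorter."""
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n:
            yield accumulator
            accumulator = []
    if accumulator:
        yield accumulator

# The trailing, shorter group is kept rather than discarded.
print(list(group_iter([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```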
How would you have done it?
13 comments:
What I usually do is this small line...
lists = [original_list[i:i+list_size] for i in xrange(0, len(original_list), list_size)]
It's a little scary the first time you see it, but it's easy: just get the indexes from 0 to the length, in steps of list_size, then create a sublist for each.
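Written out with example data (and range in place of the Python 2-only xrange), the comprehension behaves like this:

```python
original_list = list(range(7))  # example data, not from the post
list_size = 3

# One slice per start index: 0, 3, 6, ...
lists = [original_list[i:i + list_size]
         for i in range(0, len(original_list), list_size)]
print(lists)  # [[0, 1, 2], [3, 4, 5], [6]]
```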
My favorite solution is zip(*[iter(input)]*N), where input is your input list and N is how many elements you want per sub-list. It's not exactly the same as your solution, since it drops dangling items instead of giving you back a short sublist. Replacing zip() with map(None, ...) gives you a solution that None-pads the last sublist instead. However, this is "favorite" in a kind of Perl-golf way, not a use-it-in-real-software sort of way.
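A sketch of both variants (in Python 3 map(None, ...) no longer pads, so itertools.zip_longest stands in for it here):

```python
from itertools import zip_longest  # Python 3 name for izip_longest

data = list(range(10))
N = 3

# zip(*[iter(data)]*N): the same iterator is consumed N items at a time,
# so the dangling 10th element is silently dropped.
full_groups = list(zip(*[iter(data)] * N))
print(full_groups)  # [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

# zip_longest None-pads the final group instead of dropping it.
padded = list(zip_longest(*[iter(data)] * N))
print(padded[-1])  # (9, None, None)
```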
I'd adapt the grouper recipe. It's based on izip_longest. I would also upgrade to a recent version of Python -- there are lots of new itertools goodies.
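The grouper recipe from the itertools documentation is essentially this (shown with the Python 3 name zip_longest; 2.x spelled it izip_longest):

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n  # n references to one shared iterator
    return zip_longest(*args, fillvalue=fillvalue)

print(list(grouper('ABCDEFG', 3, 'x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```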
Not as simple or elegant as some of the solutions, and I don't know how efficient it is, but here's something I threw together (indentation restored; Blogger doesn't like <code> tags):

def splitarray(array, gsize):
    arraylen = len(array)
    for i in range(arraylen / gsize):
        yield array[i * gsize:(i * gsize) + gsize]
    if arraylen % gsize != 0:
        yield array[-(arraylen % gsize):]
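Under Python 3, where / is float division, the same idea needs // (a sketch of the approach, not the commenter's exact code):

```python
def splitarray(array, gsize):
    arraylen = len(array)
    # Yield the full-size groups first...
    for i in range(arraylen // gsize):
        yield array[i * gsize:(i * gsize) + gsize]
    # ...then whatever is left over, as one shorter group.
    if arraylen % gsize != 0:
        yield array[-(arraylen % gsize):]

print(list(splitarray([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```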
Actually, it looks like mine is the fastest of the three (mine, yours, and Jaime's). Of course, this is just a benchmark; it may be different in real-world usage:
http://pastebin.ca/2045170
Mine is t, yours is t2, Jaime's is t3.
Also note, I moved orig = range(23) into the setup part of timeit, and that improved the time to ~7.75.
>>> seq = [1,2,3,4,5,6,7,8,9,10]
>>> num = 3
>>> [seq[i::num] for i in range(num)]
[[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
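Worth noting: this comprehension interleaves the sequence into num strided sub-lists rather than cutting it into consecutive runs of num items, so it solves a slightly different problem:

```python
seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
num = 3  # example group count

# Stride-based: element i, i+num, i+2*num, ... in each sub-list.
strided = [seq[i::num] for i in range(num)]
print(strided)  # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]

# Contiguous chunks of size num, for comparison:
chunks = [seq[i:i + num] for i in range(0, len(seq), num)]
print(chunks)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```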
I wrote a longish blog post about this some time ago... http://www.garyrobinson.net/2008/04/splitting-a-pyt.html
A simple translation of Jaime's solution to a generator function yields times almost identical to Nobu's:
def split(sequence, size):
    for i in xrange(0, len(sequence), size):
        yield sequence[i:i+size]
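The generator version (with range in place of the Python 2 xrange) can be exercised lazily, one group at a time:

```python
def split(sequence, size):
    # Lazy variant of the slicing comprehension: one slice per iteration.
    for i in range(0, len(sequence), size):
        yield sequence[i:i + size]

gen = split('abcdefg', 3)
print(next(gen))  # 'abc'
print(list(gen))  # ['def', 'g']
```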
Which is why I still consider myself a novice. ;-)
Much more readable than mine. If I could tell what Jaime's was doing, I might've tried something like that....
Sticking with iterators...
import itertools
def group_iter(iterator, n=2):
    while True:
        li = list(itertools.islice(iterator, n))
        if len(li):
            yield li
        else:
            break
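One caveat with this approach: islice() calls iter() on its argument on every pass, so if a plain list is passed in, each pass re-reads the same first n items and the while loop never terminates; it only works when handed a true iterator. Forcing an iterator up front makes it safe for both (a sketch, not the commenter's exact code):

```python
import itertools

def group_iter(iterable, n=2):
    # iter() is the crucial fix: islice must consume one shared iterator,
    # otherwise a list argument would be re-sliced from the start forever.
    iterator = iter(iterable)
    while True:
        li = list(itertools.islice(iterator, n))
        if li:
            yield li
        else:
            break

print(list(group_iter([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```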
Following your tangent... I'm curious to know why you're on 2.4. As a small-time package developer, I thought I only had to care about 2.5 and above by now.
Nobu
Your code doesn't seem to work.
Did you actually test it?
@all: thanks for all your replies and alternative solutions (some not exactly what I need, but appreciated nonetheless)
@Craig: simply because the server where I need this snippet only has 2.4 (and upgrading is not an option)