Splitting a Sequence into Subsequences with Python

Tags: •  •  • 

I’m writing a load balancing algorithm for this cluster I’m working on, and have a bunch of things in a Python list sequence. So came the issue, how to divide the list into sublists for each node in the cluster to compute? I came up with the algorithm:


def split_seq_stride(l, n):
  """Splits a sequence l into a list of subsequences containing at least n
  elements each, not preserving order. The first few subsequences may contain
  n+1 elements, containing the last few elements. (Algorithm is good for load
  balancing)"""
  r=[]
  k=len(l)/n
  [r.append(l[i::k]) for i in range(k)]
  return r

which behaves like:

>>> l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print split_seq_stride(l, 3)
[[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]

Assuming each task is an equal amount of work, this algorithm is a good load balancing algorithm. it prevents any node from having to do any less significant work than the others.

Probably useful for someone: what about splitting up a sequence in-order, with the last subsequence containing fewer elements? This code segment does just that:


def split_seq(l, n):
  """Splits a sequence l into a list of subsequences containing at most n
  elements each, preserving order. The last subsequence may contain less
  than n elements."""
  r=[]
  [r.append(l[s:s+n]) for s in range(0, len(l), n)]
  return r

which behaves like:

>>> l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print split_seq(l, 3)
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

Thanks to yason and others on EFnet’s #python for pointers.


Reply

The content of this field is kept private and will not be shown publicly.
  • You may post code using <code>...</code> (generic) or <?php ... ?> (highlighted PHP) tags.
  • You can use Markdown syntax to format and style the text.
  • Images can be added to this post.
  • You may use [inline:xx] tags to display uploaded files or images inline.
More information about formatting options