Halton sequence in Python

Sometimes when we ask for random we don’t actually mean random by just random. Yes, pseudo-random.

Consider unitary distribution with ranges 0 and 1. Say you want to draw 5 samples. Selecting them at random would mean that we might end up with set of {0, 0.1, 0.02, 0.09, 0.01} or {0.11, 0.99, 0.09, 0.91, 0.01}. Yes, these values don’t seem very random, but that’s the thing about randomness, that it randomly can seem to not be random.

Depending on the purpose of our selection, these values might be just OK. After all, they came from that distribution. However, if our goal is to reconstruct the distribution, or extract information about with limited number of samples, it is often better to draw those samples in pseudo-random way. For example, in accordance to van der Corput sequences for 1D distributions or its generalized version Halton sequence.

The best practice for sampling N dimensional distribution is to use different prime numbers for each dimension. For example, when I need to sample a 5 dimensional unitary distribution, or search space, I will use bases of (5, 7, 11, 13, 17). This is to prevent periodic visits of the same position.

In case you are wondering what’s the difference between actual random and pseudo-random, here is a gist:

Both are good, but the actual random can produce many empty holes. What we like to have is a fair representation of all areas of our search space.

Thus, without further ado, here are some code snippets.

This is a definition of my prime generating generator:

def next_prime():
    def is_prime(num):
        "Checks if num is a prime value"
        for i in range(2,int(num**0.5)+1):
            if(num % i)==0: return False
        return True

    prime = 3
    while(1):
        if is_prime(prime):
            yield prime
        prime += 2

As for Halton sequence, as mentioned before it uses van der Corput sequence. Again, here is the definition:

def vdc(n, base=2):
    vdc, denom = 0, 1
    while n:
        denom *= base
        n, remainder = divmod(n, base)
        vdc += remainder/float(denom)
    return vdc

And finally, definition for the Halton sequence:

def halton_sequence(size, dim):
    seq = []
    primeGen = next_prime()
    next(primeGen)
    for d in range(dim):
        base = next(primeGen)
        seq.append([vdc(i, base) for i in range(size)])
    return seq

To use all of this simply call halton_sequence(size, dim). These variables refer to the number of size of sample poll and the dimension of your problem. So if one wants to sample 3 dimensional space with 10 samples each it would be called as below. (Notice: first dimension has prime value 5, then it’s 7, 11, and following prime values.)

>>> halton_sequence(10, 3)
[
[0, 0.2, 0.4, 0.6, 0.8, 0.04, 0.24000000000000002, 0.44, 0.64, 0.8400000000000001],
[0, 0.14285714285714285, 0.2857142857142857, 0.42857142857142855, 0.5714285714285714, 0.7142857142857143, 0.8571428571428571, 0.02040816326530612, 0.16326530612244897, 0.30612244897959184],
[0, 0.09090909090909091, 0.18181818181818182, 0.2727272727272727, 0.36363636363636365, 0.45454545454545453, 0.5454545454545454, 0.6363636363636364, 0.7272727272727273, 0.8181818181818182]
]
Advertisements

One thought on “Halton sequence in Python

  1. Pingback: Update: Particle Swarm Optimisation in Python | Less than personal.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s