Update: Particle Swarm Optimisation in Python

It came to my attention that my PSO for Python is actually quite popular. Only after few people contacted me I’ve noticed that the public version was not the same that I’ve been using. It wasn’t bad, but definitely not as good. Thus, obviously, I’ve updated the version and cleaned it a bit.

Update programme is available from my github or from Code subpage.

What’s the difference? There are few.
– Initial values, unless provided, are psuedo-random generated using Halton sequence. This prevents from artificial clustering.
– Perturbation in form of a Gaussian noise should mitigate false local minima by forcing particles to search surrounding area.
– Added max_repetition threshold, which states the maximum number of obtaining the same optimum value. Upon reaching threshold program finishes.
– General improvement in performance.
– Improved usage documentation within the file.
– Program is now compatible with Python3.

Feel free to request any features.

There is an idea of adding progressive save, which would quit, resume and modify parameters at any point of computation.

Common dtype for NumPy arrays

Recent challenge I had was to convert two given numpy arrays such it’ll be possible to have them in common type without losing information. One can follow this link for stackoverflow entry.

Initially I thought about comparing dtypes, because in Python 2 it is allowed to do something like:

>>> np.float32 < np.float64
>>> np.int32 < np.float32
>>> np.int64 > np.float16

… which kind of makes sense(?). Well, except that int64 vs float16 which looks suspicious. It turns out that these are type objects and Python is simply comparing their locations in memory[Citation needed] and that is obviously not reliable. Actually, in Python 3 such comparison is forbidden and it fails.

As the answer to my stackoverflow question suggests one could try to use dtype.kind and dtype.itemsize to create own ordering. This is not difficult, but it should contain all types, such as (unsigned) ints, floats, bools, complex…

Fortunately, for my purposes, there is NumPy’s method which does what I want, i.e. numpy.find_common_type. It determines common type following standard coercion rules. With a help of this function my common conversion looks like:

import numpy as np

def common_dtype(x, y):
    dtype = np.find_common_type([x.dtype, y.dtype], [])
    if x.dtype != dtype: x = x.astype(dtype)
    if y.dtype != dtype: y = y.astype(dtype)

    return x, y

What to expect from such function?
float32 for float16 and float32
float32 for int16 and float32
int64 for int32 and uint16
float64 for int32 and float16

Behaves as it should.

Halton sequence in Python

Sometimes when we ask for random we don’t actually mean random by just random. Yes, pseudo-random.

Consider unitary distribution with ranges 0 and 1. Say you want to draw 5 samples. Selecting them at random would mean that we might end up with set of {0, 0.1, 0.02, 0.09, 0.01} or {0.11, 0.99, 0.09, 0.91, 0.01}. Yes, these values don’t seem very random, but that’s the thing about randomness, that it randomly can seem to not be random.

Depending on the purpose of our selection, these values might be just OK. After all, they came from that distribution. However, if our goal is to reconstruct the distribution, or extract information about with limited number of samples, it is often better to draw those samples in pseudo-random way. For example, in accordance to van der Corput sequences for 1D distributions or its generalized version Halton sequence.

The best practice for sampling N dimensional distribution is to use different prime numbers for each dimension. For example, when I need to sample a 5 dimensional unitary distribution, or search space, I will use bases of (5, 7, 11, 13, 17). This is to prevent periodic visits of the same position.

In case you are wondering what’s the difference between actual random and pseudo-random, here is a gist:

Both are good, but the actual random can produce many empty holes. What we like to have is a fair representation of all areas of our search space.

Thus, without further ado, here are some code snippets.

This is a definition of my prime generating generator:

def next_prime():
    def is_prime(num):
        "Checks if num is a prime value"
        for i in range(2,int(num**0.5)+1):
            if(num % i)==0: return False
        return True

    prime = 3
        if is_prime(prime):
            yield prime
        prime += 2

As for Halton sequence, as mentioned before it uses van der Corput sequence. Again, here is the definition:

def vdc(n, base=2):
    vdc, denom = 0, 1
    while n:
        denom *= base
        n, remainder = divmod(n, base)
        vdc += remainder/float(denom)
    return vdc

And finally, definition for the Halton sequence:

def halton_sequence(size, dim):
    seq = []
    primeGen = next_prime()
    for d in range(dim):
        base = next(primeGen)
        seq.append([vdc(i, base) for i in range(size)])
    return seq

To use all of this simply call halton_sequence(size, dim). These variables refer to the number of size of sample poll and the dimension of your problem. So if one wants to sample 3 dimensional space with 10 samples each it would be called as below. (Notice: first dimension has prime value 5, then it’s 7, 11, and following prime values.)

>>> halton_sequence(10, 3)
[0, 0.2, 0.4, 0.6, 0.8, 0.04, 0.24000000000000002, 0.44, 0.64, 0.8400000000000001],
[0, 0.14285714285714285, 0.2857142857142857, 0.42857142857142855, 0.5714285714285714, 0.7142857142857143, 0.8571428571428571, 0.02040816326530612, 0.16326530612244897, 0.30612244897959184],
[0, 0.09090909090909091, 0.18181818181818182, 0.2727272727272727, 0.36363636363636365, 0.45454545454545453, 0.5454545454545454, 0.6363636363636364, 0.7272727272727273, 0.8181818181818182]

Matrix Multiplication with Python 3.5

Only recently I have started to use Python 3. It’s been out for good 8+ years and all these excuses about incompatibility with some packages were just lazy. Most packages I use are already ported and if I ever find that something is incompatible… well, I’ll think then. But for now let me pat myself on the back for this great leap, because:

In Python 3.5.3 (released today) there is an operator for matrix multiplication! Check out: PEP 465 — A dedicated infix operator for matrix multiplication. The choice of operator, @, is a bit unfortunate, because of the decorators and general association with reference/internet, but seeing how few possibilities are left it’s probably the best choice.

Yes, this is big news for me. The number of times I confused myself with my own matrix operations is just too damn high! I cannot agree more with the author of the PEP 465, so let my shamelessly copy&paste (paraphrased) his reasoning. Behold!

(…) encounter many mathematical formulas that look like:

S = ( H β r ) T ( H V H T ) − 1 ( H β r )

Here the various variables are all vectors or matrices (details for the curious: [5] ).

Now we need to write code to perform this calculation. In current numpy, matrix multiplication can be performed using either the function or method call syntax. Neither provides a particularly readable translation of the formula:

import numpy as np
from numpy.linalg import inv, solve

# Using dot function:
S = np.dot((np.dot(H, beta) - r).T,
           np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))

# Using dot method:
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

With the @ operator, the direct translation of the above formula becomes:

S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

Notice that there is now a transparent, 1-to-1 mapping between the symbols in the original formula and the code that implements it.

Of course, an experienced programmer will probably notice that this is not the best way to compute this expression. The repeated computation of H β r should perhaps be factored out; and, expressions of the form dot(inv(A), B) should almost always be replaced by the more numerically stable solve(A, B) . When using @ , performing these two refactorings gives us:

# Version 1 (as above)
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

# Version 2
trans_coef = H @ beta - r
S = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef

# Version 3
S = trans_coef.T @ solve(H @ V @ H.T, trans_coef)

Notice that when comparing between each pair of steps, it’s very easy to see exactly what was changed. If we apply the equivalent transformations to the code using the .dot method, then the changes are much harder to read out or verify for correctness:

# Version 1 (as above)
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

# Version 2
trans_coef = H.dot(beta) - r
S = trans_coef.T.dot(inv(H.dot(V).dot(H.T))).dot(trans_coef)

# Version 3
S = trans_coef.T.dot(solve(H.dot(V).dot(H.T)), trans_coef)

Readability counts! The statements using @ are shorter, contain more whitespace, can be directly and easily compared both to each other and to the textbook formula, and contain only meaningful parentheses. This last point is particularly important for readability: when using function-call syntax, the required parentheses on every operation create visual clutter that makes it very difficult to parse out the overall structure of the formula by eye, even for a relatively simple formula like this one. Eyes are terrible at parsing non-regular languages. I made and caught many errors while trying to write out the ‘dot’ formulas above. I know they still contain at least one error, maybe more. (Exercise: find it. Or them.) The @ examples, by contrast, are not only correct, they’re obviously correct at a glance.

Again: yes!

More links

A while ago I’ve started to taste a bit how it feels to work in industry and it feels quite nice. Maybe that’s the specificity of field projected onto the industry, or being tired of how academia works, but I’m enjoying extremely learning all the details about Computer Science, programming and newest technologies.

In addition to last post about Data Science, which still is my main daily ‘look for’, I’ve started to dive deep into computer science. Obviously there are plenty of good information sources and excellent tutorials. Aggregate that I exploiting right now are:


I’m planning to add some subpage with links for further reference. Any suggestions are welcomed!

Data Science podcasts

I’m an avid podcast listener. Whenever there’s something that only requires sight or not much focus I’ll try to do it with my headphones on. Great thing about podcasts is that they they are more up-to-date than audiobooks and have reasonably short lengths, so there’s always a fit.

Wanting to be more current with machine learning topics I’ve found few podcasts. These are my recommendations:

Machine Learning competition on Seizure Prediction

tl;dr: Read subject and click on link below.

Some of you, i.e. those lucky ones with connection to the outside World, are probably aware about Machine Learning community trying to aggressively change our lives for better. Regardless whether we like it or not, they’re doing it. Some do this for money, others for fame, and those wicked ones just for fun.

Kaggle is a webpage that hosts Machine Learning competitions. They provide data (usually donated by companies or public organizations) and set a goal. These included detecting and classifying specific whales species from satellite images, or driving a remote car based on an hour of recording, or identifying patients that will return based on historic records, or … It’s actually pretty big. Some prices can be as big as $500,000.

Reason behind this email is one of Kaggel’s recent competitions — Predicting seizures in long-term human intracranial EEG recordings. The challenge is to “The challenge is to distinguish between ten minute long data clips covering an hour prior to a seizure, and ten minute iEEG clips of interictal activity.” Yes, many people has tried this and it’s ongoing research in many labs. The difference here is that you can actually see people’s attempts and their codes. You can read their discussions and follow their logics. It looks like an amazing source of information! Moreover, good contestants are really good at machine learning and they often do their work properly, i.e. complete the challenge.

This all is really important to me. As my background is much in EEG analysis and machine learning. The timing is a bit unfortunate, but I might give it a go.