# Common dtype for NumPy arrays

Recent challenge I had was to convert two given numpy arrays such it’ll be possible to have them in common type without losing information. One can follow this link for stackoverflow entry.

Initially I thought about comparing dtypes, because in Python 2 it is allowed to do something like:

>>> np.float32 < np.float64
True
>>> np.int32 < np.float32
True
>>> np.int64 > np.float16
False


… which kind of makes sense(?). Well, except that int64 vs float16 which looks suspicious. It turns out that these are type objects and Python is simply comparing their locations in memory[Citation needed] and that is obviously not reliable. Actually, in Python 3 such comparison is forbidden and it fails.

As the answer to my stackoverflow question suggests one could try to use dtype.kind and dtype.itemsize to create own ordering. This is not difficult, but it should contain all types, such as (unsigned) ints, floats, bools, complex…

Fortunately, for my purposes, there is NumPy’s method which does what I want, i.e. numpy.find_common_type. It determines common type following standard coercion rules. With a help of this function my common conversion looks like:

import numpy as np

def common_dtype(x, y):
dtype = np.find_common_type([x.dtype, y.dtype], [])
if x.dtype != dtype: x = x.astype(dtype)
if y.dtype != dtype: y = y.astype(dtype)

return x, y


What to expect from such function?
float32 for float16 and float32
float32 for int16 and float32
int64 for int32 and uint16
float64 for int32 and float16

Behaves as it should.

# Data Buffer (Python)

For the past few years I’ve been teaching few modules for Master students, such as programming or signal processing. In most of them, one of the main and important tasks is to write a data buffer. Usual requirement is to be able to return last N samples or T seconds. However, conceptually most of students understands, implementing this seems to be quite difficult. But it is not. Unfortunately their first line of action, rather than think themselves, is to search for the solution on the Internet. There are many attempts, better or worse, but since they are looking for it, I could also provide with my own solution. Hopefully it helps others as well.

Below is Data Buffer class that saves size of last data, update via update method. First implementation uses strictly Python’s native lists, whereas the other uses NumPy arrays.

Python’s list:

class DataBuffer(object):
def __init__(self, size):
" Creates 2D list buffer of tuple 'size' size. "

self.size = size
self.buffer =  [ [0]*size[1] for i in range(size[0]) ]

def update(self, data):
" Updates buffer with input data."
self.buffer = [self.buffer[i][1:] + [data[i]] for i in range(self.size[0])]

def getData(self, d1=[0,-1], d2=[0,-1]):
" Returns buffer in given ranges (default: whole buffer)."

# Converts negative indices into absolute value
if d1[0]<0: d1[0] = self.size[0] + int(d1[0]) + 1
if d1[1]<0: d1[1] = self.size[0] + int(d1[1]) + 1

if d2[0]<0: d2[0] = self.size[1] + int(d2[0]) + 1
if d2[1]<0: d2[1] = self.size[1] + int(d2[1]) + 1

# Make sure that indices are in increasing order
d1 = [min(d1), max(d1)]
d2 = [min(d2), max(d2)]

_buf = [ self.buffer[i][d2[0]:d2[1]] for i in range(d1[0],d1[1])]

return _buf



Numpy’s arrays:

class DataBuffer(object):
def __init__(self, size):
"Creates buffer of 'size' size."

self.size = size
self.buffer = np.zeros( size )

def update(self, data):

self.buffer[:, :-1] = self.buffer[:, 1:]
self.buffer[:, -1] = data

def getData(self, d1=[0,-1], d2=[0,-1]):
"Prints on screen content of buffer."

d1 = np.array(d1)
d1[d1<;0] += self.size[0] + 1
d1.sort()

d2 = np.array(d2)
d2[d2<0] += self.size[1] + 1
d2.sort()

return self.buffer[d1[0]:d1[1], d2[0]:d2[1]]


Usage example:

if __name__ == "__main__"

import numpy as np

import random
r = random.randint

size = (5,10)
B = DataBuffer( size )

for i in range(20):
data = [r( 0, 10) for i in xrange(size[0])]
B.update( data)
print

print B.getData([0,2], [1,-2])
print B.getData([2,-1], [1,6])


# EMD on audio – wav and recurrance plots

Update:
Code and direct access to examples can be found on my GitHub reccurrence-plot.

There was an idea that has been bothering me for past few months, but due to time restrictions and many commitments I just couldn’t do it. Then came the conference quite far away and few transit hour at an airport. And now, few words about it.

Question: What do you get when you perform EMD on an audio?

Sound is nothing else than a vibration of the air in area where we happen to be. Human ear can register vibration of frequency roughly between 20 Hz and 20 kHz. In order to fulfil Nyquist sampling (in essence, the fastest frequency one can detect is up-down-up-down…, i.e. half the sampling rate) we need to record with at least 40 kHz sampling rate. Taking into account some additional technical difficulties and just in case sounds are typically recorded at sampling rate around 44 kHz. But there is also stereo, and… well, my point is, there is a lot of data in 1 second of sounds. For most common methods of signal analysing, say short-time Fourier transform or filters, processing is relatively fast. However, in case of EMD it is not that good. The longer the signal the more extrema in it leading to extending time of spline computation (e.g. cubic needs to know all extrema) and increase number of iterations. Needless to say, decomposing 3 min song gives very small ratio of (rewards significance) / (time spent), which is the reason I’ll show you result for few seconds sounds.

First sound (14240 samples) to decompose is disconnect sound. Below is figure where in red is the original signal and following rows are its IMFs (only four first and residue). All plots were normalised so that the biggest deflection is 1.

Signal (red) of disconnect sound and its first five IMFs.

The typical result, nothing new. However, interesting is how those IMFs listen. Their audio version is available here: first, second, third and fourth IMFs. How to interpret them? Allow me not to interpret them. Thought they are quite ‘noisy’, I think they sound rather… interesting. The nicer part of this are the recurrence plots.

Shortly, it is an image describe by a function
$I(t, \tau) = \left\{ \begin{array}{rcl} 0 & \mbox{for} & |x(t)-x(\tau)| < \epsilon \\ 1 & \mbox{for} & |x(t)-x(\tau)| \geq \epsilon \\ \end{array} \right. ,$
meaning that, if two values within signal are closer than $\epsilon$ then we draw a dot. I have modified this function slightly to increase the dynamics and instead of binary output, it gives few more values. Modified function is:
$I_N(t, \tau) = \left\{ \begin{array}{lcr} n & \mbox{for} & n \epsilon \leq |x(t)-x(\tau)| < (n+1) \epsilon \\ N & \mbox{for} & |x(t)-x(\tau)| \geq N \epsilon \\ \end{array} \right.,$
where $\mathbb{N} \ni n \in [0, N-1]$.
Here is a snippet on how to generate plots using Python:


import pylab as plt
import numpy as np

def rec_plot(s, eps=None, steps=None):
if eps==None: eps=0.01
if steps==None: steps=10
N = s.size
S = np.repeat(s[None,:], N, axis=0)
Z = np.floor(np.abs(S-S.T)/eps)
Z[Z>steps] = steps

return Z

s = np.random.random(1000)
plt.imshow(rec_plot(s))
plt.show()


Due to the size of an image and readability, here are only smaller plots (5000 points starting from 2000th point). Below are plots for second IMF, fifth IMF and residue from subtracting five fist IMFs. With slower frequencies one can notice quite mesmerising patterns. First three IMFs are abundant in high frequencies, which makes it fast varying image. The rest, however, gracefully changes it colours in nice patterns.

Recurrence plot for second IMF of disconnected sound.

Recurrence plot for fifth IMF of disconnected sound.

Recurrence plot for all IMFs above fifth of disconnected sound.

Same analysis was performed for chainsaw sound. Below are its EMD decomposition and few recurrence plots. Signal has 21381 samples, which makes it much longer to analyse. Again, for recurrence plots only time series of 3000 points were displayed. I must admit that I feel mesmerised by those patterns. Here are additionally wav files for first, second, third, and fourth IMFs.

Signal (red) of chainsaw sound and its first five IMFs.

Recurrence plot for second IMF of chainsaw sound.

Recurrence plot for fifth IMF of chainsaw sound.

Recurrence plot for all IMFs above fifth of chainsaw sound.

# Bayesian inference in Kuramoto model

For a while now, I’ve been involved in Kuramoto models and adjusting them to data. Recently I’ve stumbled upon using Bayesian inference on predicting parameters in time-evolving dynamics. Team from Physics Department at University of Lancaster has produced many papers, but to provide with some reference it might be worth to look at [1].

Technique refers to general phase $\phi$ dynamics in oscillator, i.e.
$\dot\phi_i = \omega_i + f_i (\phi_i) + g_i (\phi_i, \phi_j) + \xi_i$,
where $\omega_i, f_i(\phi_i)$ and $g_i(\phi_i, \phi_j)$ are intrinsic frequency, self coupling and coupling with other oscillators, respectively. Term $\xi_i$ refers to noise and although authors claim that it can be any type of noise calculations are performed for white Gaussian type. I don’t really want to go much into details, because on first sight it might look complicated. Just to point out few steps that are necessary to introduce my examples:

1. Rewrite model in form of $\dot\phi_i = \sum_{k=-K}^{K} C_{k}^{(i)} P_{i,k} (\Phi) + \xi_i(t)$, where $c$ is parameters vector, $P$ provides significant terms (for example via Fourier decomposition) for our model and $\Phi$ is a vector of all oscillators’ phases (at some moment).
2. Calculate diagonal values of Jacobian matrix, i.e. $\frac{\partial P_{i,k}}{\partial \phi_{i}}$.
3. Set a priori probabilities for parameters $C$ vector and its covariance $\Sigma$ (or concentration $\Xi = \Sigma^{-1}$).

Then the problem is reduced to finding maximum of minus log-likelihood function S. Authors also provide the exact formulas and algorithm, which find the extremum. Tutorial on applying their algorithm can be found in [2].

Based on MatLab code provided be the authors, and available on their webpage [3], I have written my own program in Python (code available in Code section or here ). As an experiment I have used simple Kuramoto model with sinusoidal coupling between phases
$\dot\phi_1 = 28 - 0.3 \sin( \phi_1 - \phi_2) - 0.1 \sin( \phi_1 - \phi_3 ) + 0.0 \sin( \phi_2 - \phi_3)$,
$\dot\phi_2 = 19 + 0.3 \sin( \phi_1 - \phi_2) + 0.0 \sin( \phi_1 - \phi_3 ) - 0.9 \sin( \phi_2 - \phi_3)$,
$\dot\phi_3 = 11 + 0.0 \sin( \phi_1 - \phi_2) + 0.1 \sin( \phi_1 - \phi_3 ) + 0.9 \sin( \phi_2 - \phi_3)$,
which means there are 12 parameters – three intrinsic frequencies ($\omega_i = \{28, 19, 11\}$) and 3×3 K matrix representing coupling between pairs of oscillators. To each oscillator small Gaussian noise was added (mean 0 and standard deviation 0.01). Signals were generated for $t \in [0, 40]$ with sampling step $dt = 0.001$. Furthermore, analysis was performed on 8 s segments with 2 s step. Figure 1 shows instantaneous frequencies ($\dot\phi_i$) for each oscillator. Figure 2 presents calculated parameters for each segment (thus time dependency). To be honest, I am quite surprised how good this technique works. I have performed more experiments, but to keep it short I am posting just one. There are some fluctuations in obtained values, but those changes are very small and most likely, in practical sense, negligible.

Fig. 1. Instantaneous frequencies of all oscillators in experiment.

Fig. 2. Extracted parameters for presented dynamical system. First column shows values on intrinsic frequencies and the rest respective to title coupling value. Horizontal black lines indicate what are expected values.

[1] A. Duggento, T. Stankovski, P. V. E. McClintock, and A. Stefanovska, “Dynamical Bayesian inference of time-evolving interactions: From a pair of coupled oscillators to networks of oscillators,” Phys. Rev. E – Stat. Nonlinear, Soft Matter Phys., vol. 86, no. 6, pp. 1–16, 2012.
[2] T. Stankovski, A. Duggento, P. V. E. McClintock, and A. Stefanovska, “A tutorial on time-evolving dynamical Bayesian inference,” Eur. Phys. J. Spec. Top., vol. 223, no. 13, pp. 2685–2703, 2014.
[3] Nonlinear Biomedical Physics, Lancaster University http://py-biomedical.lancaster.ac.uk/.

# Particle Swarm Optimisation in Python

[Updated version available. See newest post.]

tl;dr: For Python PSO code head to codes subpage.

One might expect, that currently there is everything on the internet and it’s only a matter of time to find something of an interest. At least that’s what I expected. Unfortunately, sometimes finding the right thing and adjusting it to your own problem can take much more time that doing it yourself. This is a rule about which I often forget. And it happened again. I was suggested to try Particle Swarm Optimisation (PSO) for my problem. Since it has been some time since the introduction of that method, and since Python is a quite popular language, I expected that finding code to just do that wouldn’t be a problem. True, it wasn’t hard for few entries, but either software used some obsolete packages, or it was difficult to install it, or the instructions weren’t clearly (to me) written. Thus, after about 2 days of trying to figure out everything and apply it to my problem, I simply gave up. That is, gave up searching and wrote my own code. In about 2 h.

What is Particle Swarm Optimisation (PSO)? Wikipedia gives quite comprehensive explanation, with a long list of references on this subject. However, in just few words: PSO is a metaheuristic, meaning that it does not require any a priori knowledge on gradient or on space. It is an optimisation method with many units (particles) looking independently for the best solution. Each particle updates its position (set of parameters) depending on its velocity vector, the best position it found during search and the best position discovered by any particle in swarm. Algorithm makes all particles go towards the best known solution, however, it does not guarantee that the solution will be globally optimal.

Algorithm depends on: initial positions $\vec{P}_0$, initial velocities $\vec{V}_0$, acceleration ratio $\omega$ and influence variable of the best global $\phi_g$ and the best local $\phi_l$ parameters. One also has to declare how many particles will be used and how many generations is the limit (or what is acceptable error of solution).

My program (shared in a Code section) can be executed in the following way:
 pso = PSO( minProblem, bounds) bestPos, bestFit = pso.optimize() 
where minProblem is a function to be minimised (for an array of N parameters), and bounds is 2D (2 x N values) array of minimum and maximum boundary values. The bestPos is the best set of parameters and bestFit is smallest obtained value for the problem.

A small example provided with the code fits a curve to noisy data. Input signal was
$S(t) = 4 t \cos( 4 \pi t^2) \exp(-3 t^2)$ + U(-0.025,0.025)\$, where $U(a,b)$ is a random noise from unitary [a, b] distribution. Fitted function is $F_{p}(t) = p_1 t \sin( p_2 \pi t^2 + p_3) \exp(- p_4 t^2)$, thus the expected values are P={4, 4, pi/2, 3}. See figure for fitted curve below. Obtained values are P = { 4.61487992 3.96829528 1.60951687 3.43047877} which are not far off. I am aware that my example is a bit to simple for such powerful search technique, but it is just a quick example.

Curve fitting with PSO optimisation method. Blue and red lines are for input data and reconstruction function respectively.

# NumPy’s array dimension

To determine dimension of an NumPy (np) array simply check length of its shape

>>> r = np.random.random((4,2,512,2,256))
>>> r.shape
(4, 2, 512, 2, 256)
>>> Dimension = len(r.shape)
>>> Dimension
5