Pataphysica D: Sound synthesis


Monday, March 30, 2015

Formulating a Feature Extractor Feedback System as an Ordinary Differential Equation

The basic idea of a Feature Extractor Feedback System (FEFS) is to have an audio signal generator whose output is analysed with some feature extractor, and to map this time-varying feature to control parameters of the signal generator in a feedback loop.

What would be the simplest possible FEFS that still is capable of a wide range of sounds? Any FEFS must have the three components: a generator, a feature extractor and a mapping from signal descriptors to synthesis parameters. As for the simplicity of a model, one way to assess it would be to formulate it as a dynamic system and count its dimension, i.e. the number of state variables.

Although FEFS were originally explored as discrete time systems, some variants can be designed using ordinary differential equations. The generators are simply some type of oscillator, but it may be less straightforward to implement the feature extractor in terms of ordinary differential equations. However, the feature extractor (also called signal descriptor) does not have to be very complicated.

One of the simplest possible signal descriptors is an envelope follower that measures the sound's amplitude as it changes over time. An envelope follower can be easily constructed using differential equations. The idea is simply to apply a lowpass filter (as described in a previous post) to the squared input signal.
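As a minimal sketch, the envelope follower can be written as a first-order lowpass filter acting on the squared signal, dA/dt = (x² − A)/τ, and integrated with a plain Euler step. The time constant tau, the step size dt, the test signal and the function name are assumptions chosen for illustration, not values taken from the post.

```python
import math

def envelope_follower(signal, dt, tau):
    """Euler integration of dA/dt = (x(t)**2 - A) / tau."""
    A = 0.0
    out = []
    for x in signal:
        A += dt * (x * x - A) / tau
        out.append(A)
    return out

# Test signal: a 100 Hz sine of amplitude 1, sampled at 10 kHz for one second.
sr = 10000
dt = 1.0 / sr
x = [math.sin(2 * math.pi * 100 * n * dt) for n in range(sr)]
A = envelope_follower(x, dt, tau=0.05)
```

For a sine of amplitude a, this follower settles near the mean square a²/2 with a small ripple at twice the signal frequency; a larger tau gives a smoother but slower estimate.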

For the signal generator, let us consider a sinusoidal oscillator with variable amplitude and frequency. Although a single oscillator could be used for a FEFS, here we will consider a system of N circularly connected oscillators.

The amplitude follower introduces slow changes to the oscillator's control parameters. Since the amplitude follower changes smoothly, a continuous mapping would make the synthesis parameters follow the same slow, smooth rhythm. In this system, we will instead use a discontinuous mapping from the measured amplitudes of each oscillator to their amplitudes and frequencies. To this end, the mapping will be based on the relative measured amplitudes of pairs of adjacent oscillators (remember, the oscillators are positioned on a circle).

Let g(A) be the mapping function. The full system is

with control parameters k₁, k₂, k₃, K and τ. The variables θ are the oscillators' phases, a are the amplitude control parameters, A is the output of the envelope follower, and x(t) is the output signal. Since x(t) is an N-dimensional vector, any mixture of the N signals can be used as output.

Let the mapping function be defined as

where U is Heaviside's step function and b_j is a set of coefficients. Whenever the amplitude of an oscillator grows past the amplitude of its neighboring oscillators, the value of g changes, but as long as the relative amplitudes stay within the same order relation, g remains constant. Thus, with a sufficiently slow amplitude envelope follower, g should remain constant for relatively long periods before switching to a new state. In the first equation, which governs the oscillators' phases, the g functions determine the frequencies together with a coupling between oscillators. This coupling term is the same as the one used in the Kuramoto model, but here it is usually restricted to two other oscillators. The amplitude a grows at a speed determined by g but is kept in check by the quadratic damping term.

Although this model has many parameters to tweak, some general observations can be made. The system is designed to facilitate a kind of instability, where the discontinuous function g may tip the system over into a new state even after it may appear to have settled on some steady state. Note that there is a finite number of possible values for the function g: since U(x) is either 0 or 1, the number of distinct states is at most 2^N for N oscillators. (The system's dimension is 3N; the x variable in the last equation is really just a notational convenience.)
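To make the description concrete, here is a rough simulation sketch of one system that fits the verbal description, with 3N state variables (phases theta, amplitudes a, envelopes A). It is not the author's exact set of equations: the precise form of g, the way k1, k2 and k3 enter, the nearest-neighbour coupling, the convention U(0) = 0 and all numerical values are assumptions made for illustration.

```python
import math

def heaviside(v):
    """Heaviside step U(v); the value at v == 0 is taken as 0 (a convention)."""
    return 1.0 if v > 0 else 0.0

def comparison_state(A):
    """Order relations between adjacent envelopes on the circle, as 0/1 bits."""
    n = len(A)
    return tuple(int(heaviside(A[i] - A[(i + 1) % n])) for i in range(n))

def g(A, b):
    """Assumed mapping: g_i = sum_j b[j] * U(A[i+j] - A[i+j+1]), indices mod N."""
    n = len(A)
    s = comparison_state(A)
    return [sum(b[j] * s[(i + j) % n] for j in range(len(b))) for i in range(n)]

def simulate(n_osc=4, steps=20000, dt=1e-4, k1=100.0, k2=150.0, k3=2.0,
             K=5.0, tau=0.05, b=(1.0, 0.5)):
    """Euler integration of the assumed system; returns visited states, a and A."""
    theta = [0.3 * i for i in range(n_osc)]      # phases
    a = [0.5 + 0.1 * i for i in range(n_osc)]    # amplitude controls
    A = [0.0] * n_osc                            # envelope follower outputs
    states = []
    for _ in range(steps):
        gi = g(A, b)
        x = [a[i] * math.sin(theta[i]) for i in range(n_osc)]  # output signals
        new_theta, new_a, new_A = [], [], []
        for i in range(n_osc):
            left, right = theta[i - 1], theta[(i + 1) % n_osc]
            # Kuramoto-style coupling restricted to the two neighbours.
            coupling = 0.5 * K * (math.sin(left - theta[i]) + math.sin(right - theta[i]))
            new_theta.append(theta[i] + dt * (2 * math.pi * (k1 + k2 * gi[i]) + coupling))
            # Growth set by g, kept in check by quadratic damping.
            new_a.append(a[i] + dt * (k3 * gi[i] - a[i] ** 2))
            # Envelope follower: lowpass filter on the squared signal.
            new_A.append(A[i] + dt * (x[i] ** 2 - A[i]) / tau)
        theta, a, A = new_theta, new_a, new_A
        states.append(comparison_state(A))
    return states, a, A
```

Calling `states, a, A = simulate()` gives the sequence of comparison states, whose switching pattern can then be inspected; any weighted mix of the x signals could serve as audio output.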

There may be periods of rapid alternation between two states of g. There may also be periodic patterns that cycle through more than two states. Over longer time spans the system is likely to go through a number of different patterns, including dwelling a long time in one state.

Let S be the total number of states visited by the system, given its parameter values and specific initial conditions. Then S/2^N is the relative number of visited states. It can be conjectured that the relative number of states visited should decrease as the system's dimension increases. Or does it just take much longer for the system to explore all of the available states as N grows?
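Given a recorded sequence of states, S and the ratio S/2^N are easy to compute, along with the dwell time in each state. The sketch below assumes each state is stored as a tuple of N bits; the function names and the toy trajectory are made up for illustration.

```python
from itertools import groupby

def relative_visited(states, n_osc):
    """Return (S, S / 2**N) for a sequence of N-bit state tuples."""
    visited = set(states)
    return len(visited), len(visited) / 2 ** n_osc

def dwell_lengths(states):
    """Lengths of consecutive runs spent in the same state."""
    return [len(list(run)) for _, run in groupby(states)]

# Hypothetical trajectory for N = 3 oscillators.
trajectory = [(0, 0, 1), (0, 0, 1), (0, 1, 1), (0, 0, 1),
              (1, 0, 0), (1, 0, 0), (1, 0, 0)]
S, ratio = relative_visited(trajectory, n_osc=3)
runs = dwell_lengths(trajectory)
```

Here S is 3 out of 8 possible states and the runs are 2, 1, 1 and 3 steps long. For a real run, S depends on how long the system is observed, which is exactly the ambiguity raised by the question above.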

The coupling term may induce synchronisation between the oscillators, but on the other hand it may also make the system's behaviour more complex. Without coupling, each oscillator would only be able to run at a discrete set of frequencies as determined by the mapping function. But with a non-zero coupling, the instantaneous frequencies will be pushed up or down depending on the phases of the linked oscillators. The coupling term is an example of the seemingly trivial fact that adding structural complexity to the model increases its behavioural complexity.

There are many papers on coupled systems of oscillators such as the Kuramoto model, but typically the oscillators interact through their phase variables. In the above model, the interaction is mediated through a function of the waveform, as well as directly between the phases through the coupling term. Therefore the choice of waveform should influence the dynamics, which indeed has been found to be the case.

With all the free choices of parameters, of the b coefficients, the waveform and the coupling topology, this model allows for a large set of concrete instantiations. It is not the simplest conceivable example of a FEFS, but still its full description fits in a few equations and coefficients, while it is capable of seemingly unpredictable behaviour over very long time spans.

Wednesday, January 7, 2015

Decimals of π in 10TET

The digits of π have been translated into music a number of times. Sometimes the digits are translated to the pitches of a diatonic scale, perhaps accompanied by chords. The random appearance of the sequence of digits is reflected in the aimless meandering of such melodies. But wouldn't it be more appropriate, in some sense, to represent π in base 12 and map the numbers to the chromatic scale? After all, there is nothing in π that indicates that it should be sung or played in a major or minor tonality. Of course the mapping of integers to the twelve chromatic pitches is just about as arbitrary as any other mapping; it is a decision one has to take. However, it is easier to use the usual base 10 representation and to map it to a 10-TET tuning with 10 chromatic pitch steps in one octave.
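The digit-to-pitch mapping can be sketched in a few lines: in 10-TET, step d lies at a frequency ratio of 2^(d/10) above the reference pitch. The reference frequency of 220 Hz and the choice to let digit 0 sound the reference pitch are assumptions, not details given in the post.

```python
# First 20 digits of pi: the leading 3 followed by 19 decimals.
PI_DIGITS = "31415926535897932384"

def tet10_frequency(digit, base_freq=220.0):
    """Frequency of scale step `digit` in 10-tone equal temperament."""
    return base_freq * 2.0 ** (digit / 10.0)

melody = [tet10_frequency(int(d)) for d in PI_DIGITS]
```

Each step is 120 cents wide, so the ten digits span a little less than one octave above the reference.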

Here is an etude that does precisely that, with two voices in tempo relation 1 : π. The sounds are synthesized with algorithms that also incorporate the number π. In the fast voice, the sounds are made with FM synthesis where two modulators with ratio 1 : π modulate a carrier. The slow voice is a waveshaped mixture of three partials in ratio 1 : π : π².
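The two sound types can be sketched roughly as follows. This is not the code used for the etude: the modulation index, the tanh waveshaper, the equal weighting of the partials and all function names are assumptions, and the FM voice is implemented as phase modulation, which is the usual way of realising FM synthesis digitally.

```python
import math

def fm_voice(carrier_hz, mod_hz, index, dur, sr=44100):
    """Carrier phase-modulated by two sine modulators at mod_hz and pi * mod_hz."""
    out = []
    for n in range(int(round(dur * sr))):
        t = n / sr
        mod = (math.sin(2 * math.pi * mod_hz * t)
               + math.sin(2 * math.pi * math.pi * mod_hz * t))
        out.append(math.sin(2 * math.pi * carrier_hz * t + index * mod))
    return out

def waveshaped_voice(f0, drive, dur, sr=44100):
    """Three partials at f0, pi * f0 and pi**2 * f0, mixed and passed through tanh."""
    ratios = (1.0, math.pi, math.pi ** 2)
    out = []
    for n in range(int(round(dur * sr))):
        t = n / sr
        mix = sum(math.sin(2 * math.pi * f0 * r * t) for r in ratios) / 3.0
        out.append(math.tanh(drive * mix))
    return out
```

Because π is irrational, neither the modulators nor the partials are harmonically related, so both voices have inharmonic spectra.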

Despite the random appearance of the digits of π, it is not even known whether π is a normal number or not. Let us recall the definition: a normal number in base b has an equal proportion of all the digits 0, 1, ..., b-1 occurring in it, and equal probability of any possible sequence of two digits, three digits and so on. ("Digit" is usually reserved for the base 10 number system, so you may prefer to call them "letters" or "symbols".) A number that is normal in every base is simply called normal.

Some specific normal numbers have been constructed, but even though it is known that almost all numbers are normal, the proof that a number is normal is often elusive. Rational numbers are not normal in any base since they all end in a periodic sequence, such as 22/7 = 3.142857142857..., where the block 142857 repeats forever. However, there are irrational, non-normal numbers, some of which are quite exotic in the way they are constructed.
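The contrast can be illustrated with two easily generated expansions: Champernowne's constant 0.123456789101112..., which is known to be normal in base 10, and the periodic decimals of 22/7. The sketch below only tabulates single-digit frequencies in a finite prefix, which can suggest but never prove normality; the function names are made up for the example.

```python
from collections import Counter

def champernowne_digits(n_digits):
    """First n_digits decimals of Champernowne's constant 0.123456789101112..."""
    digits = []
    k = 1
    while len(digits) < n_digits:
        digits.extend(str(k))
        k += 1
    return "".join(digits[:n_digits])

def decimal_expansion(p, q, n_digits):
    """First n_digits decimals of the fraction p/q, computed by long division."""
    r = p % q
    out = []
    for _ in range(n_digits):
        r *= 10
        out.append(str(r // q))
        r %= q
    return "".join(out)

def digit_frequencies(digits):
    """Relative frequency of each decimal digit in a string of digits."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in "0123456789"}

normal_freqs = digit_frequencies(champernowne_digits(100000))
rational_freqs = digit_frequencies(decimal_expansion(22, 7, 100000))
```

The digits 0, 3, 6 and 9 never occur in the decimals of 22/7, so their frequency stays at zero however long the prefix. The frequencies for Champernowne's constant approach 1/10 only slowly, since small numbers with few zeros dominate the early digits.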

Sunday, July 28, 2013

On smoothness under parameter changes

Is your synthesizer a mathematical function?

At least it can be considered in such terms. Each setting of all its parameters represents a point in parameter space. The output signal depends on the parameter settings. Assuming the parameters remain fixed over time, the generated audio signal may also be considered as a point in another space. In order to relate these output sequences to perceptually more relevant terms, signal descriptors (e.g. the fundamental frequency, amplitude, spectral centroid, flux) are applied to the output signal.

Now, in order to assess how smoothly the sound changes as one turns any of the knobs that controls some synthesis parameter, the first step is to relate the amount of change in the signal descriptors to the distance in parameter space. The distance in parameter space corresponds to the angle the knob is turned. Let us call this distance Δc. It is trickier to define suitable distance metrics in the space of audio signal sequences, but one option is to use a signal descriptor φ, which itself varies over time, and take its time average ⟨φ⟩. The difference Δφ between two such time averages as the synthesizer is run at two different points in parameter space may be taken as the distance metric.

A smooth function has derivatives of all orders. Therefore the smoothness of a synthesis parameter may be described in terms of a derivative of the function that maps points in parameter space to points in the space of signal descriptors. This derivative may be defined as the limit of Δφ/Δc as Δc approaches 0. It makes a significant difference whether a pitch control of an oscillator has been designed with a linear or exponential response. But abrupt changes, corresponding to a discontinuous derivative, will be even more conspicuous when they occur.
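In practice the limit can only be approximated with a small but finite Δc. Here is a minimal sketch using a toy synthesizer whose single knob c sets the amplitude of a sine, with frame-wise RMS as the descriptor φ. The toy synth, the frame size and the step dc are all assumptions chosen to keep the example short.

```python
import math

def synth(c, dur=0.5, sr=8000, freq=200.0):
    """Toy synthesizer: a sine whose amplitude is set directly by the knob c."""
    return [c * math.sin(2 * math.pi * freq * n / sr)
            for n in range(int(round(dur * sr)))]

def mean_rms(signal, frame=400):
    """Time average of a frame-wise RMS descriptor."""
    frames = [signal[i:i + frame]
              for i in range(0, len(signal) - frame + 1, frame)]
    rms = [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]
    return sum(rms) / len(rms)

def descriptor_derivative(c, dc=1e-3):
    """Central difference approximating the limit of delta-phi / delta-c."""
    return (mean_rms(synth(c + dc)) - mean_rms(synth(c - dc))) / (2 * dc)

slope = descriptor_derivative(0.5)
```

For this linear amplitude control the slope comes out close to 1/√2 at every positive c; a knob with an exponential response or a sudden jump would show up as a slope that varies strongly with c.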

Whereas the derivative is about the smoothness locally at each point in parameter space, another way to look at parameter smoothness is to measure the total variation of a signal descriptor as the synthesis parameter goes from one setting to another. As a compromise, the interval over which the total variation is measured may be made very small, so that a local variation over a short parameter interval is measured instead.
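On a sampled parameter grid, the total variation is just the sum of absolute differences between successive descriptor values. The sketch below assumes `descriptor` is any function that maps a parameter value to a time-averaged descriptor; the grid size and the two test functions are arbitrary illustrations.

```python
import math

def total_variation(descriptor, c_start, c_end, n_steps=1000):
    """Sum of |phi(c[k+1]) - phi(c[k])| over a uniform grid from c_start to c_end."""
    grid = [c_start + (c_end - c_start) * k / n_steps for k in range(n_steps + 1)]
    values = [descriptor(c) for c in grid]
    return sum(abs(v1 - v0) for v0, v1 in zip(values, values[1:]))

# A monotone response: total variation equals the net change.
tv_monotone = total_variation(lambda c: 2.0 * c, 0.0, 1.0)
# An oscillating response: the ups and downs add up instead of cancelling.
tv_wiggly = total_variation(math.sin, 0.0, 2.0 * math.pi)
```

The monotone case gives about 2 and the sine about 4, although the sine ends where it started. Shrinking the interval from c_start to c_end turns the same function into the local variation described above.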

Is this really useful for anything?

Short answer: Don't expect too much. But seriously, whether we like it or not, science progresses in part by taking vague concepts and making them crisper, by making them quantifiable. "Smoothness" under parameter changes is precisely such a vague concept that can be defined in ways that make it measurable. Such a smoothness diagnostic may be useful in the design of synthesis models and their parameter mappings, as well as perhaps for introducing and testing hypotheses about the perceptual discrimination of similar synthesized sounds.

The paper was presented as a poster at the joint SMAC/SMC conference.
