FM Synthesis of Instrument Sounds

In class you learned that for a signal of the form $x(t)=A \cos(\psi(t))$, the instantaneous frequency is the derivative of the phase, $\omega_i(t) = \frac{d}{dt}\psi(t)$. So, if $\psi(t)$ is constant, the frequency is zero. If $\psi(t)$ is linear, $x(t)$ is a sinusoid at some fixed frequency. If $\psi(t)$ is quadratic, $x(t)$ is a chirp signal whose frequency changes linearly with time.
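The relationship between phase and instantaneous frequency can be checked numerically. The following is a minimal Python sketch, not part of the original lab. The function names `quadratic_phase` and `instantaneous_frequency_hz`, and the values of `f0` and `mu`, are illustrative assumptions. It differentiates a quadratic phase with a central difference, so the printed frequency should grow linearly with time.

```python
import math

def quadratic_phase(t, f0=100.0, mu=50.0):
    """Chirp phase psi(t) = 2*pi*(f0*t + 0.5*mu*t^2), in radians.

    f0 (Hz) and mu (Hz per second) are illustrative values, not from the text.
    """
    return 2.0 * math.pi * (f0 * t + 0.5 * mu * t * t)

def instantaneous_frequency_hz(psi, t, dt=1e-6):
    """Approximate (1 / (2*pi)) * d(psi)/dt with a central difference."""
    return (psi(t + dt) - psi(t - dt)) / (2.0 * dt) / (2.0 * math.pi)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        f = instantaneous_frequency_hz(quadratic_phase, t)
        print(f"t = {t:.1f} s  ->  f_i = {f:.2f} Hz")
```

For this phase the derivative works out to $f_0 + \mu t$ in Hz, which is the linear sweep described above.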

FM synthesis uses a more interesting $\psi(t)$, one that is sinusoidal. FM, meaning “frequency modulation,” refers to the fact that the frequency of $x(t)$ changes according to the oscillations of $\psi(t)$. This is useful for synthesizing instrument sounds because the proper choice of the modulating frequencies will produce a fundamental frequency and several overtones, as many instruments do.

The general equation for an FM sound synthesizer is: $$x(t)=A(t) \cos[\omega_ct + I(t) \cos(\omega_mt + \phi_m) + \phi_c] \qquad(1)$$ In $(1)$, $A(t)$ is the signal's amplitude. It is a function of time so that the instrument sound can be made to fade out slowly or cut off quickly. Such a function is called an envelope. The frequency $\omega_c$ is called the “carrier” frequency. When you differentiate the phase $\psi(t)$ (the argument of the cosine) to find the instantaneous frequency $\omega_i(t)$, $\omega_c$ appears as a constant term in that expression. It is the frequency that would be produced without any frequency modulation. The parameter $\omega_m$ is called the “modulating” frequency. It sets the rate at which $\omega_i(t)$ oscillates. The parameters $\phi_m$ and $\phi_c$ are arbitrary phase constants, usually both set to $-\frac{\pi}{2}$ so that $x(0)=0$.
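Equation $(1)$ translates almost directly into code. The sketch below is one possible Python rendering, not the lab's own implementation. The names `fm_signal` and `render`, the sampling rate, and the envelope shapes passed in the example (an exponential decay for $A(t)$ and a constant for $I(t)$) are all assumptions chosen for illustration. Frequencies are given in Hz and converted to rad/s inside the function.

```python
import math

def fm_signal(t, fc_hz, fm_hz, amplitude, index,
              phi_c=-math.pi / 2, phi_m=-math.pi / 2):
    """Evaluate equation (1) at time t (seconds).

    amplitude and index are callables of t playing the roles of A(t) and I(t).
    The default phases of -pi/2 follow the convention stated in the text.
    """
    wc = 2.0 * math.pi * fc_hz   # carrier frequency in rad/s
    wm = 2.0 * math.pi * fm_hz   # modulating frequency in rad/s
    psi = wc * t + index(t) * math.cos(wm * t + phi_m) + phi_c
    return amplitude(t) * math.cos(psi)

def render(duration_s, fs_hz, **params):
    """Sample the FM signal at fs_hz samples per second."""
    n = int(round(duration_s * fs_hz))
    return [fm_signal(k / fs_hz, **params) for k in range(n)]

if __name__ == "__main__":
    # Hypothetical envelopes: A(t) decays exponentially, I(t) is constant.
    samples = render(0.5, 8000, fc_hz=900.0, fm_hz=300.0,
                     amplitude=lambda t: math.exp(-t / 0.2),
                     index=lambda t: 2.0)
    print(len(samples), "samples, first sample:", samples[0])
```

Passing $A(t)$ and $I(t)$ as functions keeps the synthesizer itself independent of any particular envelope, so different instrument sounds only require different arguments.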

The function $I(t)$ has a less obvious purpose than the other parameters of FM signals. It is technically called the “modulation index envelope.” To see what it does, examine the expression for the instantaneous frequency: $$\begin{align} \omega_i(t) & = \frac{d}{dt} \psi(t) \\ & = \frac{d}{dt} [ \omega_ct + I(t) \cos(\omega_mt + \phi_m) + \phi_c] \\ & = \omega_c - I(t) \omega_m \sin(\omega_mt + \phi_m) + \frac{dI}{dt} \cos(\omega_mt + \phi_m) \end{align}$$ If $I(t)$ is a constant $I$, the last term vanishes and $I\omega_m$ is the maximum amount by which the instantaneous frequency deviates from $\omega_c$. Beyond that, however, it is difficult to relate $I(t)$ to the sound made by $x(t)$ without some rather complicated analysis. Nonetheless, we would like to characterize $x(t)$ as the sum of several sinusoids instead of a single signal whose frequency changes. In this regard, the following comments are relevant: when $I(t)$ is small ($I \approx 1$), only the components close to the carrier frequency, at $\omega_c \pm k\omega_m$ for small integers $k$, have high amplitudes. When $I(t)$ is large ($I>4$), components farther from the carrier (larger $k$) also have high amplitudes. The net result is that $I(t)$ can be used to vary the overtone content of the instrument sound (overtones are the components above the fundamental; those at integer multiples of it are harmonics). When $I(t)$ is small, only a few overtones are produced. When $I(t)$ is large, many more can be produced. For more details see the paper by Chowning [1].
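The effect of the modulation index on the spectrum can be made concrete for the special case of a constant index $I$. A standard result, treated in Chowning [1] and not derived in this handout, is that the component at $\omega_c \pm k\omega_m$ has magnitude $|J_k(I)|$ relative to the overall amplitude, where $J_k$ is the Bessel function of the first kind. The Python sketch below is an illustration under that assumption. It evaluates $J_k$ from Bessel's integral with a midpoint rule so that only the standard library is needed. The function names and the 0.01 threshold are arbitrary choices.

```python
import math

def bessel_j(k, x, steps=2000):
    """J_k(x) for integer k, from Bessel's integral:

        J_k(x) = (1/pi) * integral from 0 to pi of cos(k*tau - x*sin(tau)) d tau

    evaluated with the midpoint rule.
    """
    h = math.pi / steps
    total = 0.0
    for n in range(steps):
        tau = (n + 0.5) * h
        total += math.cos(k * tau - x * math.sin(tau))
    return total * h / math.pi

def sideband_amplitudes(index, k_max=10):
    """Relative magnitudes |J_k(index)| for sideband orders k = 0..k_max."""
    return [abs(bessel_j(k, index)) for k in range(k_max + 1)]

def highest_significant_sideband(index, threshold=0.01, k_max=30):
    """Largest order k whose relative magnitude is at least threshold."""
    amps = sideband_amplitudes(index, k_max)
    significant = [k for k, a in enumerate(amps) if a >= threshold]
    return max(significant) if significant else 0

if __name__ == "__main__":
    for index in (1.0, 5.0):
        amps = sideband_amplitudes(index, k_max=9)
        print(f"I = {index}:", " ".join(f"{a:.3f}" for a in amps))
        print("  highest significant order:", highest_significant_sideband(index))
```

Comparing the two printed rows shows the energy spreading to higher sideband orders as the index grows, which is the behavior described qualitatively above. With a time-varying $I(t)$ this picture holds only approximately, when the envelope changes slowly compared with the modulation.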

Below are some examples of sounds that can be synthesized with the appropriate choice of $A(t)$, $I(t)$, $\omega_c$, and $\omega_m$. These sounds were originally synthesized by Robbie Griffin.

Instrument	Carrier Frequency (Hz)	Modulating Frequency (Hz)
Brass	900	300
Clarinet	900	600
Bell	110	210
Knocking Sound	80	55
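
The carrier and modulating frequencies in the table determine where the spectral components can fall. Using the standard FM result that components appear at $f_c \pm k f_m$ for integer $k$ (see Chowning [1]), with negative frequencies folding back to their absolute values, the short Python sketch below lists the component frequencies for each instrument. The function name `sideband_frequencies` and the cutoff `k_max=3` are illustrative assumptions. Which of these components are actually strong depends on $I(t)$, which the table does not specify.

```python
def sideband_frequencies(fc_hz, fm_hz, k_max=3):
    """Sorted distinct values of |fc + k*fm| for k = -k_max..k_max, in Hz."""
    freqs = {abs(fc_hz + k * fm_hz) for k in range(-k_max, k_max + 1)}
    return sorted(freqs)

# (carrier Hz, modulating Hz) pairs copied from the table above.
INSTRUMENTS = {
    "Brass": (900, 300),
    "Clarinet": (900, 600),
    "Bell": (110, 210),
    "Knocking Sound": (80, 55),
}

if __name__ == "__main__":
    for name, (fc, fm) in INSTRUMENTS.items():
        print(f"{name}: {sideband_frequencies(fc, fm)}")
```

For the brass and clarinet settings every listed component is a multiple of 300 Hz (all multiples for brass, only odd multiples for clarinet), so the result is a harmonic series. The bell and knocking settings give components that do not line up on a simple harmonic series.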

[1] John M. Chowning, “The Synthesis of Complex Audio Spectra by means of Frequency Modulation,” Journal of the Audio Engineering Society, vol. 21, no. 7, Sept. 1973, pp. 526–534.

Jeff Schodorf, Feb-20, 1996