FM Synthesis of Instrument Sounds

In class you learned that for signals of the form $x(t)=A\cos(\psi(t))$, the instantaneous frequency of the signal is the derivative of the phase $\psi(t)$. So, if $\psi(t)$ is constant, the frequency is zero. If $\psi(t)$ is linear, $x(t)$ is a sinusoid at some fixed frequency. If $\psi(t)$ is quadratic, $x(t)$ is a chirp signal whose frequency varies linearly with time.
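The three cases can be checked numerically. The sketch below is only an illustration: the function names, the example phase functions, and the step size `dt` are assumptions made here, not part of the lab. It estimates the derivative of $\psi(t)$ with a central difference and converts the result from rad/s to Hz.

```python
import math

def instantaneous_frequency(psi, t, dt=1e-6):
    """Central-difference estimate of d(psi)/dt at time t, in rad/s."""
    return (psi(t + dt) - psi(t - dt)) / (2.0 * dt)

def freq_hz(psi, t):
    """Instantaneous frequency in Hz (rad/s divided by 2*pi)."""
    return instantaneous_frequency(psi, t) / (2.0 * math.pi)

# Illustrative phase functions for the three cases (values chosen arbitrarily).
def constant_phase(t):
    return 0.5                                           # psi(t) constant

def linear_phase(t):
    return 2.0 * math.pi * 440.0 * t                     # fixed 440 Hz sinusoid

def quadratic_phase(t):
    return 2.0 * math.pi * (100.0 * t + 50.0 * t * t)    # chirp: 100 + 100*t Hz
```

With these definitions, `freq_hz(linear_phase, t)` stays near 440 Hz for any `t`, while `freq_hz(quadratic_phase, t)` grows linearly, reaching about 200 Hz at `t = 1.0`.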

FM synthesis uses a more interesting $\psi(t)$, one that is sinusoidal. FM, meaning “frequency modulation,” refers to the fact that the frequency of $x(t)$ changes according to the oscillations of $\psi(t)$. This is useful for synthesizing instrument sounds because the proper choice of the modulating frequencies will produce a fundamental frequency and several overtones, as many instruments do.

The general equation for an FM sound synthesizer is: $$x(t)=A(t) \cos[\omega_c t + I(t) \cos(\omega_m t + \phi_m) + \phi_c] \qquad (1)$$ In (1), $A(t)$ is the signal's amplitude. It is a function of time so that the instrument sound can be made to fade out slowly or cut off quickly. Such a function is called an envelope. The frequency $\omega_c$ is called the “carrier” frequency. Note that when you take the derivative of the phase $\psi(t)$ (the argument of the outer cosine) to find the instantaneous frequency $\omega_i(t)$, $\omega_c$ appears as a constant term in that expression. It is the frequency that would be produced without any frequency modulation. The parameter $\omega_m$ is called the “modulating” frequency. It sets the rate at which $\omega_i(t)$ oscillates. The parameters $\phi_m$ and $\phi_c$ are arbitrary phase constants, usually both set to $-\frac{\pi}{2}$ so that $x(0)=0$.
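Equation (1) translates almost directly into code. The following is a minimal sketch, not the lab's reference implementation: the function name `fm_synth`, the 8000 Hz sampling rate, and the constant default envelopes are all assumptions made for illustration. Frequencies are passed in Hz and converted to rad/s inside the function.

```python
import math

def fm_synth(fc, fm, duration, fs=8000,
             amp=lambda t: 1.0, index=lambda t: 1.0,
             phi_c=-math.pi / 2, phi_m=-math.pi / 2):
    """Sample x(t) = A(t) cos(w_c t + I(t) cos(w_m t + phi_m) + phi_c).

    fc, fm   : carrier and modulating frequencies in Hz
    duration : length of the note in seconds
    fs       : sampling rate in Hz (assumed value, not from the lab text)
    amp      : amplitude envelope A(t), a function of time in seconds
    index    : modulation index envelope I(t), a function of time in seconds
    """
    n_samples = int(round(duration * fs))
    samples = []
    for k in range(n_samples):
        t = k / fs
        phase = (2.0 * math.pi * fc * t
                 + index(t) * math.cos(2.0 * math.pi * fm * t + phi_m)
                 + phi_c)
        samples.append(amp(t) * math.cos(phase))
    return samples
```

For example, `fm_synth(110, 220, 0.5, amp=lambda t: math.exp(-4 * t))` would produce half a second of a decaying tone; the exponential envelope here is a hypothetical choice. With the default phases of $-\frac{\pi}{2}$, the first sample is zero up to rounding error.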

The function $I(t)$ has a less obvious purpose than the other parameters of FM signals. It is technically called the “modulation index envelope.” To see what it does, examine the expression for the instantaneous frequency: $$\begin{align} \omega_i(t) & = \frac{d}{dt} \psi(t) \\ & = \frac{d}{dt} [ \omega_c t + I(t) \cos(\omega_m t + \phi_m) + \phi_c] \\ & = \omega_c - I(t) \omega_m \sin(\omega_m t + \phi_m) + \frac{dI}{dt} \cos(\omega_m t + \phi_m) \end{align}$$ If $I(t)$ is constant, the last term vanishes and $I\omega_m$ gives the maximum amount by which the instantaneous frequency deviates from $\omega_c$. Beyond that, however, it is difficult to relate $I(t)$ to the sound made by $x(t)$ without some rather complicated analysis. Nonetheless, we would like to characterize $x(t)$ as the sum of several sinusoids instead of a single signal whose frequency changes. In this regard, the following comments are relevant: the sinusoidal components of $x(t)$ lie at the sideband frequencies $\omega_c \pm k\omega_m$ for integer $k$. When $I(t)$ is small ($I \approx 1$), only the carrier and the low-order sidebands (small $k$) have high amplitudes. When $I(t)$ is large ($I>4$), both low-order and high-order sidebands have high amplitudes. The net result is that $I(t)$ can be used to vary the overtone content of the instrument sound (overtones are harmonics). When $I(t)$ is small, mainly the components close to the carrier will be produced. When $I(t)$ is large, higher harmonic frequencies can also be produced. For more details, see the paper by Chowning.*
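The effect of the modulation index on the spectrum can be seen by measuring the amplitude of each sideband directly. The sketch below assumes a constant index, an 8000 Hz sampling rate, and 800 samples, so that every frequency of the form $f_c + n f_m$ used here completes a whole number of cycles in the analysis window; the function names and these values are illustrative assumptions. Each amplitude is obtained by correlating the signal with a complex exponential, which is a single-frequency DFT.

```python
import cmath
import math

def fm_tone(fc, fm, index, fs=8000, n_samples=800):
    """Constant-index FM tone with both phase constants set to -pi/2."""
    return [math.cos(2.0 * math.pi * fc * k / fs
                     + index * math.cos(2.0 * math.pi * fm * k / fs - math.pi / 2)
                     - math.pi / 2)
            for k in range(n_samples)]

def component_amplitude(x, f, fs=8000):
    """Amplitude of the sinusoidal component of x at frequency f (Hz)."""
    acc = sum(v * cmath.exp(-2j * math.pi * f * k / fs)
              for k, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)

def sideband_amplitudes(fc, fm, index, max_order=8):
    """Map sideband order n to the amplitude measured at fc + n*fm."""
    x = fm_tone(fc, fm, index)
    return {n: component_amplitude(x, fc + n * fm)
            for n in range(-max_order, max_order + 1)}

def count_strong(amplitudes, threshold=0.1):
    """Number of sidebands whose amplitude exceeds the threshold."""
    return sum(1 for a in amplitudes.values() if a > threshold)
```

Comparing `count_strong(sideband_amplitudes(2000, 100, 1.0))` with `count_strong(sideband_amplitudes(2000, 100, 4.0))` shows that the larger index spreads significant energy over more sidebands. The carrier of 2000 Hz and modulator of 100 Hz are hypothetical values chosen so that no sideband of interest folds around 0 Hz or the Nyquist frequency.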

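Below are some examples of sounds that can be synthesized with the appropriate choice of $A(t)$, $I(t)$, $\omega_c$, and $\omega_m$. The table gives the carrier and modulating frequencies in Hz; multiply by $2\pi$ to obtain $\omega_c$ and $\omega_m$ in radians per second. These sounds were originally synthesized by Robbie Griffin.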

Instrument	Carrier Frequency (Hz)	Modulating Frequency (Hz)
Brass	900	300
Clarinet	900	600
Bell	110	210
Knocking Sound	80	55
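
One way to get a feel for these settings is to list where the sidebands fall for each carrier/modulator pair. The sketch below is an illustration under stated assumptions: the names `PRESETS` and `sideband_frequencies` are made up here, the sideband positions $|f_c + n f_m|$ follow the standard FM analysis in Chowning's paper, and negative frequencies are folded to their positive counterparts. It ignores amplitudes, which depend on $I(t)$, and envelopes, which the table does not specify.

```python
import math

# Carrier and modulating frequencies in Hz, copied from the table above.
PRESETS = {
    "Brass": (900, 300),
    "Clarinet": (900, 600),
    "Bell": (110, 210),
    "Knocking Sound": (80, 55),
}

def sideband_frequencies(fc, fm, max_order=3):
    """Sorted distinct values of |fc + n*fm| for n = -max_order..max_order."""
    freqs = {abs(fc + n * fm) for n in range(-max_order, max_order + 1)}
    return sorted(freqs)

def common_spacing(fc, fm):
    """Greatest common divisor of fc and fm; every sideband is a multiple of it."""
    return math.gcd(fc, fm)
```

For the clarinet row, `sideband_frequencies(900, 600)` returns `[300, 900, 1500, 2100, 2700]`, the odd multiples of 300 Hz. For the bell row, `common_spacing(110, 210)` is only 10 Hz, so the components do not line up as low harmonics of a single audible fundamental.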

*Ref: John M. Chowning, “The Synthesis of Complex Audio Spectra by Means of Frequency Modulation,” Journal of the Audio Engineering Society, vol. 21, no. 7, Sept. 1973, pp. 526-534.

Jeff Schodorf
Tue Feb 20 17:10:04 EST 1996