The harmonic spectrum and the formants

As mentioned earlier, a musical tone consists of several harmonics, making it a so-called complex tone. The relative volumes of the harmonics that form the sound may vary, but the first harmonic (the fundamental) is usually the strongest. The relative volumes can be expressed using a harmonic spectrum, in which each harmonic is represented by, for example, a bar that displays the volume of that harmonic in proportion to the other harmonics of the sound.
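The idea of a bar-graph spectrum can be sketched in a few lines of Python. This is only an illustration: the amplitude values are invented for the example (they are not measurements of any instrument), and the function names `relative_levels` and `text_bar_graph` are hypothetical.

```python
# Hypothetical linear amplitudes for the first six harmonics of a tone.
# These values are illustrative assumptions, not measured data.
amplitudes = [1.0, 0.2, 0.5, 0.1, 0.3, 0.05]


def relative_levels(amps):
    """Scale the amplitudes so that the strongest harmonic equals 1.0."""
    peak = max(amps)
    return [a / peak for a in amps]


def text_bar_graph(amps, width=40):
    """Return one text bar per harmonic; the longest bar is the strongest harmonic."""
    lines = []
    for number, level in enumerate(relative_levels(amps), start=1):
        bar = "#" * round(level * width)
        lines.append(f"H{number}: {bar} {level:.2f}")
    return lines


for line in text_bar_graph(amplitudes):
    print(line)
```

Each printed row corresponds to one bar of the spectrum: H1 is the fundamental, H2 the second harmonic, and so on.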

The sound spectrum of a clarinet can be illustrated with the following figure:

The fundamental is clearly the strongest, while the odd harmonics are noticeably stronger than the even harmonics. The spectrum cannot, however, be considered constant. The relative volumes of the harmonics are in a state of continuous change, and the same instrument may produce a variety of different spectra depending on pitch, volume, the individual instrument, the musician playing the instrument, etc.

The figures below represent the sound spectrum of a transverse flute. Instead of a bar graph, the spectrum is displayed as a curve with a number of peaks, because the spectrum contains not only harmonics but other partial tones as well, especially at its top end.

The first figure is a one-lined C (C1) played on a transverse flute; the lowest peak is at approximately 260 Hz. The next peak is one octave higher, at the two-lined C (C2, approximately 520 Hz). These two peaks are noticeably higher than the other peaks.

The next figure is a one-lined G (G1) played on the flute. Four peaks can be observed, with the two middle peaks dominating, but the first harmonic is still perceived as the sound's true pitch.

The third figure shows the flute playing in the upper register. Only one clear peak can be observed in the spectrum, while the fourth figure again displays several peaks.

These figures tell us the following: a) the harmonic spectrum of an instrument may differ considerably depending on pitch, and b) the human ear often does not hear even strong individual harmonics as separate tones but as part of the timbre of that particular instrument.

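What is essential for the perception of pitch here are the successive integer ratios (1 : 2 : 3 : 4 and so on) between the frequencies of the harmonics. The second harmonic is not heard as the fundamental (the pitch), because if it were the fundamental, the next harmonic would have to have twice its frequency, that is, four times the frequency of the actual fundamental. Instead, the next harmonic lies at three times the frequency of the actual fundamental.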

The first harmonic can be sensed even if it is missing from the harmonic series. In this case, the timbre becomes thinner, but the pitch remains the same. The loudspeaker of a telephone, for example, cannot reproduce very low frequencies, but this does not prevent us from recognising the pitches sung by a deep-voiced male over the telephone.
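The role of the integer ratios can be illustrated with a deliberately simplified sketch. It assumes that the partials are exact whole numbers of hertz and treats the implied fundamental as their greatest common divisor. Real partials are never this exact, and the ear does not literally compute a divisor, so this is a toy model of the principle rather than a model of hearing. The function names and the example frequencies are hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental(partials_hz):
    """Return the largest frequency of which every partial is an integer multiple."""
    return reduce(gcd, partials_hz)


def harmonic_numbers(partials_hz):
    """Return the position of each partial in the series of the implied fundamental."""
    f0 = implied_fundamental(partials_hz)
    return [p // f0 for p in partials_hz]


# The 100 Hz fundamental itself is missing, as over a telephone line.
partials = [200, 300, 400, 500]
print(implied_fundamental(partials))  # 100
print(harmonic_numbers(partials))     # [2, 3, 4, 5]

# Only if the following partials were multiples of 200 Hz
# would 200 Hz itself act as the fundamental.
print(implied_fundamental([200, 400, 600]))  # 200
```

In the first case the partials only fit together as harmonics 2 to 5 of 100 Hz, which is why the missing fundamental still determines the pitch.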

The example below illustrates the fact that a sound spectrum is never stable but changes continuously. The higher harmonics of the harmonic series usually become stronger when the volume of the sound increases. In the following example, the second and the third harmonic occasionally become even stronger than the fundamental.

Many solid objects have a characteristic frequency at which they vibrate more readily than at other frequencies. This tendency to vibrate along at certain pitches, which occurs in many acoustic instruments, is called resonance. Instruments may have several resonant frequencies, and in practice these are not individual points along the scale but ranges of frequencies. In the case of the human voice (which is also an instrument), these frequency ranges are called formants.

Each formant corresponds to one or several resonant frequencies of the vocal tract. Sound waves that oscillate at a resonant frequency are amplified at some point of the vocal tract. The position and movement of the formants have been found to have a significant impact on the way we recognise speech and especially vowel sounds. In principle, the vocal tract has an infinite number of resonant frequencies, but only a few of them have any significant impact on the way we recognise, for example, the quality of vowel sounds. Formants are always independent of the fundamental frequency of the sound.

The figure above displays the formant ranges of vowel sounds. For the vowel /a/, for example, they are at approximately 600, 1200, 2600 and 3000 Hz.

If the fundamental is higher than a formant range, no harmonic falls within that range, and the tone of the vowel sound changes. This is the reason why it is difficult to identify the correct vowel colours in high soprano arias.
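The effect can be made concrete with a rough sketch that compares the harmonics of a low and a high sung note with the formant values given above for /a/. The formant frequencies come from the text; the two fundamentals (110 Hz for a low male voice, 1000 Hz for a high soprano note) and the function names are assumptions chosen for illustration.

```python
# Approximate formant frequencies for the vowel /a/, as given in the text.
FORMANTS_A_HZ = [600, 1200, 2600, 3000]


def nearest_harmonic(f0_hz, target_hz):
    """Return (harmonic number, frequency) of the harmonic of f0 closest to target."""
    n = max(1, round(target_hz / f0_hz))
    return n, n * f0_hz


def formant_coverage(f0_hz, formants_hz=FORMANTS_A_HZ):
    """For each formant, list the nearest harmonic and its distance in Hz."""
    rows = []
    for formant in formants_hz:
        n, freq = nearest_harmonic(f0_hz, formant)
        rows.append((formant, n, freq, abs(freq - formant)))
    return rows


# Assumed fundamentals: a low male voice and a high soprano note.
for f0 in (110, 1000):
    print(f"fundamental {f0} Hz")
    for formant, n, freq, distance in formant_coverage(f0):
        print(f"  formant {formant} Hz: harmonic {n} at {freq} Hz, {distance} Hz away")
```

With the low fundamental, the harmonics are spaced so densely that one of them lies within 50 Hz of every formant. With the high fundamental, the lowest harmonic is already 400 Hz above the first formant, so that formant can hardly shape the sound.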