The history of speech synthesis: the era of electrical solutions

Last time, we talked about mechanical devices for speech synthesis: Kempelen’s model of the vocal tract and Joseph Faber's “talking head”. Next up are the electrical synthesizers of the 20th century.


Photo Rock'n Roll Monkey / Unsplash

The first electrical installations


In 1850, German physicist and physiologist Hermann von Helmholtz introduced his resonator theory. He noticed that vowels have different resonant frequencies (formants). These formants arise as a sound wave travels from the vocal cords to the lips. Part of the wave leaves the speaker’s lips and reaches the listener, and part is reflected back toward the source. Helmholtz suggested that the human vocal tract can be represented as a sequence of resonators.

At the beginning of the 20th century, engineers began trying to implement such a model with electrical components. The first synthesizer of this type was developed by physicist John Stewart. His circuit (published in the journal Nature) included an electric buzzer that modeled the vocal cords and a pair of inductive-capacitive resonators. Together they emulated the physical processes that shape sound in the throat.

A synthesizer circuit designed by John Stewart

Stewart's device could produce sounds consisting of two formants: a few simple vowels, as well as diphthongs. But that was the limit of its capabilities.
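To make the idea of an inductive-capacitive resonator concrete, here is a minimal sketch of how such a circuit could be tuned to a formant. It uses the standard formula for an ideal LC circuit, f = 1 / (2π√(LC)). The formant targets and the coil value are assumptions chosen for illustration, not values from the original circuit.

```python
import math

def resonant_frequency_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal LC circuit: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

def capacitance_for(frequency_hz: float, inductance_h: float) -> float:
    """Capacitance that tunes an ideal LC circuit to the given frequency."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)

# Hypothetical targets: two formants for a rough "ah"-like vowel.
FORMANTS_HZ = (700.0, 1100.0)
# Assumed coil value, picked only to get readable numbers.
INDUCTANCE_H = 0.1

for formant in FORMANTS_HZ:
    c = capacitance_for(formant, INDUCTANCE_H)
    check = resonant_frequency_hz(INDUCTANCE_H, c)
    print(f"{formant:.0f} Hz needs {c * 1e6:.3f} uF (check: {check:.1f} Hz)")
```

With only two resonators there are only two frequencies to tune, which is consistent with the device being limited to two-formant sounds.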

The first electric synthesizer capable of reproducing speech appeared later, in the 1930s. It was developed by Homer Dudley of Bell Laboratories. At that time, the company was working on the vocoder, a tool for compressing speech and saving bandwidth on radio links in telephone networks. The idea was to transmit a set of key parameters instead of the caller’s voice. A special decoder on the receiving side reconstructed and reproduced the sound from these parameters. Dudley realized that with minor modifications, the vocoder could be turned into a full-fledged synthesizer. That is how the VODER (Voice Operating Demonstrator) came about.
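The "parameters instead of voice" idea can be sketched in a few lines. The example below is a digital stand-in, not Dudley's analog design: it uses the Goertzel algorithm to measure signal power near a few frequencies, so that each frame of samples is reduced to a handful of numbers. The sample rate, frame length, and band centers are assumptions. A real vocoder also has to convey pitch and voicing, which this sketch leaves out.

```python
import math

def goertzel_power(frame, frequency_hz, sample_rate):
    """Signal power near one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * frequency_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in frame:
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def analyze(signal, band_centers_hz, sample_rate, frame_len):
    """Reduce each full frame of samples to one power value per band."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        frames.append([goertzel_power(frame, f, sample_rate)
                       for f in band_centers_hz])
    return frames

SAMPLE_RATE = 8000            # assumed
FRAME_LEN = 160               # 20 ms per frame, assumed
BANDS_HZ = [500, 1000, 2000]  # hypothetical band centers

# Test input: a pure 1000 Hz tone lasting two frames.
tone = [math.sin(2.0 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(2 * FRAME_LEN)]
params = analyze(tone, BANDS_HZ, SAMPLE_RATE, FRAME_LEN)

# 160 samples per frame become 3 numbers per frame.
print(len(tone), "samples ->", sum(len(p) for p in params), "parameters")
```

The receiving side would then drive its own filters with these values, which is the role of the decoder described above.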

The device was presented to the general public at the New York World's Fair in 1939. The VODER design included two sound sources: a tube noise generator for unvoiced phonemes and an oscillator for voiced ones. Ten bandpass filters connected in parallel made up the resonance control unit. The operator controlled the system with a keyboard, a wrist bar and a foot pedal.
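The structure described above, one of two sources feeding ten parallel bandpass filters, can be approximated digitally. This is a rough sketch under stated assumptions: the band centers, the filter sharpness (Q), the sample rate, and the use of a pulse train for the voiced source are all illustrative choices, and the filters are standard second-order digital bandpass sections rather than a model of the original analog hardware.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed
# Ten assumed band centers in Hz; the text does not give the real values.
BAND_CENTERS_HZ = [225, 450, 700, 1000, 1400, 1900, 2500, 3200, 4200, 5500]

def bandpass(signal, center_hz, q=5.0, sample_rate=SAMPLE_RATE):
    """Second-order digital bandpass filter with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def buzz(pitch_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Voiced source: a pulse train at the given pitch (an approximation)."""
    period = round(sample_rate / pitch_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def hiss(n_samples, seed=0):
    """Unvoiced source: uniform white noise from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def voder(source, gains):
    """Sum the parallel filter outputs, each scaled by its 'key' gain."""
    mix = [0.0] * len(source)
    for center, gain in zip(BAND_CENTERS_HZ, gains):
        if gain == 0.0:
            continue  # key not pressed, band contributes nothing
        for i, y in enumerate(bandpass(source, center)):
            mix[i] += gain * y
    return mix

# Hypothetical key settings: a vowel-like sound and a hiss-like sound.
vowel = voder(buzz(110, 1600), [0, 0, 1.0, 0.8, 0, 0, 0.2, 0, 0, 0])
fricative = voder(hiss(1600), [0, 0, 0, 0, 0, 0, 0, 0.5, 1.0, 1.0])
```

In this sketch the gains list plays the role of the keyboard, the choice between buzz and hiss corresponds to switching sources, and the pitch argument corresponds to a pitch control. Changing all of these by hand in real time hints at why operating the real device was hard to learn.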

During the demonstrations, the apparatus spoke several languages, sang, and answered questions with varying intonation. But unlocking the system's potential took its operator years of training.


Shortly after the premiere of VODER, World War II began, and Bell Labs had to curtail further development of the synthesizer. However, Homer Dudley used the knowledge gained on the project to create encryption technology for telephone conversations.

Speech synthesizers on spectrograms


In 1946, the acoustic spectrograph was invented, and with it came the idea of using spectrograms to control speech synthesizers. One of the first to introduce such a device was L. Schott, an American engineer at Bell Labs. He used a linear light source and translucent spectrographic patterns with varying degrees of transparency. Photocells mounted opposite the lamp registered changes in the level of illumination and generated control signals for bandpass filters. Homer Dudley had used exactly the same filters in his VODER.

Photo 120years.net
Another development in this area was presented by a group of US scientists led by physicist Franklin S. Cooper. Their optical system, Pattern Playback, modulated the harmonics of a 120 Hz fundamental tone by reading images on a moving transparent tape. The visual information controlled the strength of each harmonic, turning the image into sound.

In a sense, the system resembled the Soviet optical synthesizers Nivotone and Variophone, which were used to write music for cartoons. However, Pattern Playback was designed from the start to generate human speech and could reproduce whole sentences.
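The principle behind Pattern Playback, a painted pattern switching harmonics of a 120 Hz tone on and off, maps naturally onto additive synthesis. Below is a minimal sketch of that idea. Only the 120 Hz fundamental comes from the text; the sample rate, the column duration, the number of harmonics, and the test pattern are assumptions made for illustration.

```python
import math

FUNDAMENTAL_HZ = 120.0  # from the text
SAMPLE_RATE = 16000     # assumed

def play_pattern(pattern, column_seconds=0.01, sample_rate=SAMPLE_RATE):
    """Turn a spectrogram-like pattern into audio samples.

    pattern is a list of columns (time steps). Each column holds one
    transparency value in [0, 1] per harmonic: index 0 is 120 Hz,
    index 1 is 240 Hz, and so on.
    """
    samples_per_column = int(round(column_seconds * sample_rate))
    out = []
    n = 0  # running sample index keeps the phase continuous across columns
    for column in pattern:
        for _ in range(samples_per_column):
            t = n / sample_rate
            out.append(sum(
                amp * math.sin(2.0 * math.pi * FUNDAMENTAL_HZ * (k + 1) * t)
                for k, amp in enumerate(column) if amp
            ))
            n += 1
    return out

# Hypothetical pattern: one bright stripe gliding from harmonic 4 to 8,
# loosely like a rising formant drawn on the tape.
N_HARMONICS = 10
pattern = []
for step in range(5):
    column = [0.0] * N_HARMONICS
    column[3 + step] = 1.0
    pattern.append(column)

audio = play_pattern(pattern)
print(len(audio), "samples generated")
```

Because every component is a multiple of 120 Hz, the output always has the same pitch. Only the spectral envelope changes, which is what the painted pattern controls.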


Devices like Pattern Playback and VODER laid the theoretical foundation for formant and articulatory synthesizers and became the prototypes of modern computer synthesis. We will talk about computer synthesis next time.


