Talking Heads
- By Tom Jeffries
- Published 08/25/2006
The ESS system is protected by a dozen or so patents, so the details remain secret, but basically it goes like this. They start by making a high-quality recording of the words they want to use, in a voice they feel is appropriate. (For example, for an educational program based on Kipling's The Jungle Book, they used an Indian student of Dr. Mozer's.) They then digitize the sound (convert it from analog, tape-type sound to the "1"s and "0"s that a computer can read) and, using a minicomputer, crunch the recording down to about one hundredth of its original size. This crunching is the heart of their system. It takes considerable effort to decide which information can be thrown away and which is essential to the sound. The original recording usually involves about 10,000 sound samples per second; the finished product uses between 90 and 625 bytes per second.
On the Commodore 64, they normally use a rate of 375 bytes per second or less, so it's possible to pack quite a lot of speech into a program. To play back the speech, ESS uses the machine's own sound device, the SID chip, but in quite an unusual way. All of the SID's registers are shut down except the volume control, which is varied up and down to recreate the original waveform. Since the volume control has only 16 possible settings, the resulting sound can never be as good as that of an ordinary tape deck, which is capable of continuous variation, but the technique does produce easily intelligible speech. ESS's technology can reproduce the accents and inflections of the original speaker quite accurately, as with the Indian speaker in Jungle Book, or can change them as needed so that the same vocabulary can produce both a human and a robot voice.
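The effect of having only 16 volume settings can be illustrated with a small Python sketch. It maps a waveform onto 16 integer levels and back, then measures the worst-case error. The linear mapping and the function names are assumptions made for illustration; this is not ESS's playback routine, and it does not touch real SID hardware.

```python
import math

VOLUME_LEVELS = 16  # the volume control has 16 settings, 0 through 15


def to_volume_steps(samples):
    """Map samples in [-1.0, 1.0] to integer volume settings 0..15.

    ASSUMPTION: a simple linear mapping, chosen for illustration only.
    """
    steps = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range input
        steps.append(round((s + 1.0) / 2.0 * (VOLUME_LEVELS - 1)))
    return steps


def from_volume_steps(steps):
    """Convert volume settings 0..15 back to amplitudes in [-1.0, 1.0]."""
    return [level / (VOLUME_LEVELS - 1) * 2.0 - 1.0 for level in steps]


if __name__ == "__main__":
    # One cycle of a sine wave stands in for a slice of recorded speech.
    wave = [math.sin(2 * math.pi * i / 100) for i in range(100)]
    steps = to_volume_steps(wave)
    rebuilt = from_volume_steps(steps)
    worst = max(abs(a - b) for a, b in zip(wave, rebuilt))
    print("distinct levels used:", len(set(steps)))
    print(f"worst-case error: {worst:.4f} of full scale")
```

With this mapping, the reconstructed amplitude can be off by up to one fifteenth of full scale, which gives a feel for why the result is intelligible but coarser than tape.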
[Image: Kipling's Jungle Book by Fisher-Price]

Kennedy Approach

[Image: Kennedy Approach by MicroProse]
All of this technology is pretty impressive, but it's up to the software companies to put it to use. I asked George Geary of MicroProse Software, publisher of Kennedy Approach, an air traffic control simulation, why MicroProse had decided to use speech synthesis in the program, and his answer was simple and to the point: "To enhance game play." The voice from the airport control tower (you) alternates with the voices from the various airplanes in giving and receiving instructions, and it really does add a considerable amount of realism to the simulation. Listen carefully, and you will notice that the voices of the different pilots are pitched differently. It's a subtle touch, but I found that even before I was aware that the voices were different, my ear knew the difference. MicroProse, which has its speech digitizing done by ESS, is so happy with the effect of speech in Kennedy Approach that it is currently adding a male and a female voice to Solo Flight so that it can re-release an enhanced version. The company does plan to limit its use of speech synthesis to programs where the game play itself will be enhanced by the electronic voice.
Other uses of synthesized speech are more whimsical. No one would argue that speech is a necessary part of Ghostbusters, but it certainly adds a distinctive and humorous touch. According to Brad Fregger, Director of Software Development at Activision, they wanted to "give the game the same feeling as the movie," and voice was one way of accomplishing this. Activision considers voice to be "The icing on the cake - we wouldn't leave out the eggs in order to have the icing," but in this case there was room for both. Personally, I'm glad. What other game says "He slimed me" when I miss? Likewise, the voices in Jump Jet and Impossible Mission, while adding to the enjoyment and character of the software, are not essential to the game. Robert Botch, Epyx's Vice President of Marketing, said speech was put into Impossible Mission "to add something extra - some realism"; the cry that occurs as your character falls through one of the holes in the floor is certainly realistic enough.
A more serious use of speech synthesis is in educational programs. According to Todd Mozer, this is the area where ESS expects to see the greatest use of electronic voices in the future. He said, "There have been a lot of studies done about the effectiveness of speech in learning and the results have been extremely positive. Children will sit in front of a computer longer if it's giving them verbal feedback, and it provides a much more effective mechanism for teaching. I would expect that to be a realm where speech takes off." ESS has already produced speech for several educational programs, including Talking Teacher by Imagic and Cave of the Word Wizard by Timeworks.
[Image: Cave of the Word Wizard by Timeworks]






