This document is copyright © 2001-2003 by Volker Kuhlmann

Created 9, 10 September 2001
Updated 11 Sep 2001: links section
Updated 18 Oct 2001: IBM Java version
Updated 05 Nov 2001: SuSE 7.3
Updated 22 Nov 2001: clarified some points
Updated 04 Jan 2002: how to enroll correctly; reducing waits
Updated 24 Feb 2003: SuSE 8.1
Updated 26 Feb 2003: fixed some details
Updated 14 Mar 2003: added some details; fixed sentences + typos
Updated 05 Apr 2003: added text of error message in 4.1

	1.  Introduction
	2.  Software components
	2.1 Java
	3.  Installation
	4.  Bug fixing before wasting time on anything else
	4.1 /usr/bin/vvsetenv
	4.2 /bin/sh
	4.3 /usr/bin/vvstartdictation
	4.4 /usr/bin/vvstartuserguru
	4.5 /etc/viavoiceps.conf
	4.6 /usr/bin/vvuser
	4.7 /usr/bin/viavoice
	5.  Preparing for starting up ViaVoice
	5.1 $HOME/viavoice/
	5.2 Creating a ViaVoice user
	5.3 Enrollment (Creating a Voice Model)
	5.4 Uninstalling ViaVoice
	6.  Proceed as per instructions
	6.1 Sound drivers
	6.2 Audio mixer
	6.3 Adjustment of audio setup
	6.4 Creating a personal voice model
	6.5 Training the recognition engine
	6.6 Start dictation
	7.  To Do
	8.  Tips
	A.  Links
	B.  Quick Start

1.  Introduction

Apart from some projects which don't look too promising, IBM's ViaVoice
Dictation is the only voice recognition software available for linux. It
consists of two parts: the recognition engine, and a user interface. The user
interface is programmed in java, and handles user registration + training, and
allows dictation into a text field. Compared with software available for
Microsoft windows, the dictation interface is not very advanced, though (some
of the?) source code is available for anyone wishing to improve on it.

The recognition engine is available for free download from IBM together with a
time-limited evaluation license and a software development kit. There are
projects which make use of the recognition engine, most notably xvoice (see
appendix A). Unfortunately, IBM stopped making the development kit available
in Mar 2002.

This split into engine and user interface is very smart and leaves room for
some interesting possibilities of making use of the engine.

The dictation software and engine runtime could be purchased from IBM for
US$60; unfortunately this was sold out/discontinued in Feb 2003. It includes
some kind of headset and is only available for US English. Unfortunately, IBM
only sells it to people in the USA. Whatever the reason for this daft decision
is, we'll have to live with it for the foreseeable future. (It is perhaps
unwise to complain to the speech recognition group about this via their mailing
lists; these people are very helpful and have probably not made that decision.)

Finding a dealer in the USA who is willing to ship a box to New Zealand for a
lesser fee than FedEx is asking is not something I'd like to get into if I can
avoid it.

An alternative (the only alternative?) is to go via Mandrake. Mandrake has
licensed viavoice for linux for US + UK English, German and French, and there
seem to be some hooks for Spanish. Obviously this is not included in the GPL
version of Mandrake, only in the powerpack (version 8.0, and I believe 7.3).

The powerpack works out at about the same price as IBM's dictation pack, albeit
without headset. I would have preferred to pay my support directly to IBM, but
sorry - see above.

PLEASE NOTE that the instructions in here are SPECIFICALLY FOR THE VIAVOICE
FROM MANDRAKE. Mandrake 8.0 is the one I have. Because I have no copy of IBM's
boxed set available, I am unable to tell to what extent this applies to the
boxed set. It is probable that some of it will, but feedback (thanks Jim Mohr)
has it that not all of it is directly applicable.

The patch for some of the viavoice dictation scripts applies to IBM's boxed set
as well, and to any version of Linux. Pretty much everyone should apply it.

Thanks to David Morgan who said the instructions in here are also useful for
making ViaVoice work on Mandrake (which version?). There are some minor
differences: (section 4.1) SPCH_JAVA needs to be set to /usr/lib/java-1.3/jre,
and the line with LD_LIBRARY_PATH is somewhere else - remove it anyway.

2.  Software components

a) From IBM / Mandrake:

The viavoice software is from CD 4 (Commercial Apps 2) of a Mandrake linux
powerpack 8.0, ISBN 1-57595-493-1. File dates (in GMT), sizes, md5 sums etc:

    5675307 2001-04-18 10:56:28 ViaVoice_Dictation-1.1-0.0.i386.rpm
    1287476 2001-04-18 10:56:29 ViaVoice_TTS_rtk-5.1-1.2.i386.rpm
    2846764 2001-04-18 10:56:29 ViaVoice_runtime-3.1-0.0.i386.rpm
  185041363 2001-04-18 10:58:34 ViaVoice_runtime_US_LangPack-3.1-0.0.i386.rpm

f01b9a13da086e63c006891686c2f619  ViaVoice_Dictation-1.1-0.0.i386.rpm
0aafee2bb6fb82b89abcf8f2c1baefc8  ViaVoice_TTS_rtk-5.1-1.2.i386.rpm
70c5a72d1c17794b6953e8de5924e293  ViaVoice_runtime-3.1-0.0.i386.rpm
8355ddf8db19ae0e4ff5620649257691  ViaVoice_runtime_US_LangPack-3.1-0.0.i386.rpm
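
To compare your own files against the sums above, md5sum -c can do the
checking. The following is only a sketch: the function name verify_md5 and the
list file name viavoice.md5 are made up for illustration, and it assumes the
GNU md5sum program is installed.

```shell
#!/bin/sh
# Sketch only: verify_md5 is a hypothetical helper, not part of ViaVoice.
# $1 = file with lines of the form "<md5sum>  <filename>"
# $2 = directory containing the files to check
verify_md5() {
    # The list is opened in the current directory (redirection on the
    # subshell), then md5sum runs inside $2 so the bare file names resolve.
    ( cd "$2" && md5sum -c - ) < "$1"
}
```

After saving the four md5 lines above into viavoice.md5, it could be used as

    verify_md5 viavoice.md5 /media/cdrom/Mandrake/RPMS4

A non-zero exit status means at least one file is missing or differs.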

These rpm packages are compiled and packaged by IBM (unless the package
information is forged):

> cd /media/cdrom/Mandrake/RPMS4
> rpm -qp --queryformat "%-28{name}, %-3{packager}, %-1{buildhost}\n" Via*
ViaVoice_Dictation          , IBM,
ViaVoice_TTS_rtk            , IBM,
ViaVoice_runtime            , IBM,
ViaVoice_runtime_US_LangPack, IBM,

According to Damon Lynch, the IBM dictation pack contains:


So the software shipped with Mandrake is actually slightly newer. Apart from
having some problems fixed, it also contains new bugs...

b) From SuSE:

The install is on a stock SuSE 8.1 system, running the supplied 2.4.19-4GB
kernel. The CDs contain the Application ID in the iso9660 filesystem:


(you can display this with isoinfo -d -i /dev/cdrom)

c) From IBM:

   19615296   IBMJava2-JRE-1.3.1-3.0.i386.rpm

This is only needed if you decide to use it instead of the SuSE-8.1-supplied
java, which seems to work as well.

2.1 Java

IBM clearly states that dictation requires java 1.2.2rc4 from blackdown (see
below). This is certainly true for IBM's boxed set of dictation.

The rpms shipped with Mandrake contain a dependency on IBM's java2 1.3, as
shipped with Mandrake and SuSE. As the ViaVoice rpms are packaged by IBM, one
can assume that ViaVoice dictation 3.1 will now also run with IBM's java.

The version of IBM's java shipped with SuSE 7.2 is just fine:

   15983791 2001-05-16 10:37:17 IBMJava2-JRE-1.3-45.i386.rpm

In July 2001 SuSE released an updated java version
(IBMJava2-JRE-1.3-67.i386.rpm), which is fine as well.

The version of IBM's java shipped with SuSE 7.3 is fine as well:

   17263096 2001-09-23 20:51:13 IBMJava2-JRE-1.3-109.i386.rpm

With SuSE 8.1, IBM java is no longer part of the distribution, but it can be
downloaded from IBM (a no-charge registration will be required). See appendix
A, or search for 'java linux download jre'. This java is fine for ViaVoice.

SuSE 8.1 ships with SunJava2 1.3.1, which seems to be fine as well (it warns
about a few missing keysyms at startup).

Note the ViaVoice in IBM's boxed set will not work with IBM's java earlier than
1.3. It may work with IBM java 1.3 (I can't test this myself). You need to
download the prescribed blackdown version. John Smart reports that ViaVoice
from IBM's boxed set (with the updated runtime
ViaVoice_runtime-3.0-1.2.i386.rpm) works with the SunJava2-1.3 shipped with
SuSE 8.1.

Note that depending on which java you're using, the setting of LD_LIBRARY_PATH
in /usr/bin/vvsetenv may have to be adjusted. See section 4.1, and the comments
in vvsetenv (you'll have to apply the patch first).

3.  Installation

This is very easy: install the IBMJava2-JRE package downloaded from IBM, and
the 4 viavoice rpms from Mandrake CD 4. Use yast, yast2, or just type:

    rpm -Uvh IBMJava2-JRE-1.3.1-3.0.i386.rpm
    rpm -Uvh /media/cdrom/Mandrake/RPMS4/ViaVoice*

This requires about 280MB of disk space, and doesn't include any files created
while running viavoice. If you use the SuSE-supplied sun java, you must install
the ViaVoice rpms with --nodeps.
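
Before installing, it may be worth checking that the roughly 280MB are actually
free. This is a minimal sketch, assuming a POSIX df and awk; has_free_mb is a
made-up helper name, and checking /usr is my assumption based on the file
locations mentioned in this document.

```shell
#!/bin/sh
# Sketch only: has_free_mb is a hypothetical helper.
# $1 = directory on the file system to check
# $2 = required free space in MB
has_free_mb() {
    # df -Pk prints one header line and one data line per file system;
    # column 4 of the data line is the available space in 1024-byte blocks.
    hfm_avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$hfm_avail_kb" ] && [ "$hfm_avail_kb" -ge $(( $2 * 1024 )) ]
}
```

For example:

    has_free_mb /usr 280 || echo "not enough space on /usr"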

Some minor things need to be fixed, run (as root):

    chmod a+r /etc/viavoiceps.conf /usr/lib/menu/vv*

Ignore the several errors you see, the rpm install scripts are buggy. (Who had
this idea of calling gless to display the license? Pity it doesn't show if
gless doesn't exist, like on a KDE or non-Gnome system...)

IBM's boxed set probably doesn't contain the 2 files /usr/lib/menu/vv*, they're
Mandrake-specific and useless on a SuSE system. Delete them if you like.

If you think you're done now, better think again and read section 4!

4.  Bug fixing before wasting time on anything else

Before the effort is of any use, we need to do some serious tidying up.

All these modifications are contained in a patch file which can be downloaded
(see appendix A).

Pretty much all of these changes should be applied by everyone, not just those
using SuSE 8.1.

Apply the patch with (as root):

    umask 22
    patch -b -p0 <viavoice-3.1-SuSE8.1.patch
    rm /usr/bin/vvuser.orig /usr/bin/viavoice.orig
    chmod 755 /usr/bin/vvuser /usr/bin/viavoice

If you use tcsh as your shell, run the rehash command afterwards so the newly
installed scripts are found. If you don't know which shell you use, run it
anyway and ignore the error it might give.

Unlike previous versions of this document (for SuSE 7.2/7.3), the following
sections only explain the changes, but don't say what they are or how to apply
them manually. Applying the patch file is much easier, less error prone, and I
don't have time to do things twice.

4.1 /usr/bin/vvsetenv

Several changes (all as root):


Variable SPCH_JAVA is added; this sets the particular java you want to use for
ViaVoice. After applying the patch, edit /usr/bin/vvsetenv and change SPCH_JAVA
to the java you wish to use. Unchanged, it is set up for the IBMJava2 downloaded
from IBM. There are comments in the file.

SPCH_JAVA, once we have made further modifications, will tell the various parts
of dictation where to find the java we want/need to use. IBM should really make
use of such a variable! Especially vvstartdictation will just die otherwise...
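
Since the scripts end up running $SPCH_JAVA/bin/java, it can save some
head-scratching to check that this file really exists before starting
anything. A minimal sketch follows; check_spch_java is a made-up helper name
and not part of ViaVoice or the patch.

```shell
#!/bin/sh
# Sketch only: check_spch_java is a hypothetical helper.
# $1 = the directory SPCH_JAVA is set to
check_spch_java() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1/bin/java"
    else
        echo "no executable java found in $1/bin" >&2
        return 1
    fi
}
```

For example, with the value reported for Mandrake in section 1:

    check_spch_java /usr/lib/java-1.3/jre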


LD_LIBRARY_PATH needs to be set depending on the java version you want to use.
Edit vvsetenv (after applying the patch of course), there are comments in
there. You may have to try different settings.

For the SunJava2 1.3.1 on SuSE 8.1, LD_LIBRARY_PATH needs to be set.


These added variables (together with other modifications to the ViaVoice
shell scripts) set the time to waste doing nothing. Values are significantly
reduced compared with what IBM uses. There may be sense in these after all:
ViaVoice, or parts thereof, don't start up if the recognition engine is still
running from a previous invocation of ViaVoice. The engine takes several
seconds to terminate after the application has finished. If ViaVoice doesn't
start properly after it just terminated, wait a few seconds and try again (the
error is "The Speech System is in use by another application"). Alternatively
change the numbers back to IBM's (comments in file).
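
The "wait a few seconds and try again" can also be left to a small shell
function. This is only a sketch: retry_start is a made-up name, and it assumes
that the command being started exits with a non-zero status when the engine is
still busy, which I have not verified for the ViaVoice programs.

```shell
#!/bin/sh
# Sketch only: retry_start is a hypothetical helper.
# $1 = number of attempts
# $2 = seconds to wait between attempts
# remaining arguments = the command to run
retry_start() {
    rs_tries=$1
    rs_delay=$2
    shift 2
    while [ "$rs_tries" -gt 0 ]; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        rs_tries=$(( rs_tries - 1 ))
        # Only wait if another attempt follows.
        [ "$rs_tries" -gt 0 ] && sleep "$rs_delay"
    done
    return 1
}
```

For example, three attempts which are 5 seconds apart:

    retry_start 3 5 viavoice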

I never noticed this to be an issue before; perhaps other conditions prevented
it from becoming one. Sometimes this can be reproduced, sometimes not.


ECIINI is added to /etc/profile when installing ViaVoice. Good - it also sets
other applications which use the TTS part right. Bad - it only works for bash,
not for other shells. To make it always work for ViaVoice, the setting of it is
also put into vvsetenv (you can remove it from /etc/profile unless you have
applications other than ViaVoice which use text to speech).

4.2 /bin/sh

Various startup scripts do not specify which shell they need to run under, so
the system will take the user's default. If that happens to be e.g. tcsh,
viavoice dictation will only ever spit some garbage onto the screen and
terminate after a second or two.
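
To see which scripts are affected on your system, the first line of each file
can be inspected. A minimal sketch; has_interpreter_line is a made-up helper
name, and the /usr/bin/vv* pattern in the usage example is only my guess at a
reasonable set of files to look at.

```shell
#!/bin/sh
# Sketch only: has_interpreter_line is a hypothetical helper.
# Succeeds if the first line of file $1 starts with "#!".
has_interpreter_line() {
    hil_first=$(head -n 1 "$1") || return 1
    case $hil_first in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example:

    for f in /usr/bin/vv*; do
        has_interpreter_line "$f" || echo "no interpreter line: $f"
    done

Compiled programs will be listed as well; only the shell scripts matter here.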

To fix this, add as very first(!) line (as root):

    #!/bin/sh

to these scripts:


4.3 /usr/bin/vvstartdictation

This script contains a major bug: it runs the java from
/usr/lib/ViaVoice/IBMJava2-13, but Mandrake installs the java into
/opt/IBMJava2. This means it is pure chance that ViaVoice starts at all on
Mandrake!

SuSE installs the java somewhere else again, so we deal with this properly and
make use of the variable SPCH_JAVA introduced into vvsetenv for this reason.
Change the line (towards the end, and as root)

export PATH=/usr/lib/ViaVoice/IBMJava2-13/jre/bin:$PATH

to

export PATH=$SPCH_JAVA/bin:$PATH

4.4 /usr/bin/vvstartuserguru

This is programmed in java as well, and likewise needs to be told properly
where to find the java to use. Change the line towards the end (as root)

export PATH=/opt/IBMJava2-13/jre/bin:$PATH

to

export PATH=$SPCH_JAVA/bin:$PATH
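
Whether both start scripts really use SPCH_JAVA afterwards can be checked with
grep. This is a sketch under assumptions: uses_spch_java is a made-up helper
name, and it assumes a fixed script no longer mentions IBMJava2-13 anywhere
(not even in a comment).

```shell
#!/bin/sh
# Sketch only: uses_spch_java is a hypothetical helper.
# Succeeds if script $1 puts $SPCH_JAVA/bin on the PATH and no longer
# mentions a hard-coded IBMJava2-13 directory.
uses_spch_java() {
    grep -q 'PATH=\$SPCH_JAVA/bin' "$1" || return 1
    if grep -q 'IBMJava2-13' "$1"; then
        return 1
    fi
    return 0
}
```

For example:

    for f in /usr/bin/vvstartdictation /usr/bin/vvstartuserguru; do
        uses_spch_java "$f" || echo "check the java path in $f"
    done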

4.5 /etc/viavoiceps.conf

This is the absolute bummer of a bug. If this file is missing, running
vvstartuserguru (which is the first thing to do after installation) will
terminate with the completely bogus error "The Speech System is in use by
another application". And that before it even attempts to open the sound
device! Of course there isn't another program using the sound system either
(this can be tested with

    lsof +D /dev

*as root*).

Furthermore, this file is not contained in any of the viavoice rpms, nor is it
created directly by the rpms' installation scripts. Somehow by a minor miracle,
this file turns up out of nowhere when installing the rpms, although I'm not
sure whether that is always the case.

Check if you have it. If not, download it (see appendix A) and copy it to /etc.
Set permissions to 644:

    chmod 644 /etc/viavoiceps.conf
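
The check for the file and its permissions can be sketched as a small function.
conf_ok is a made-up helper name; it only looks at the file's mode as printed
by ls, so it works the same whether or not you run it as root.

```shell
#!/bin/sh
# Sketch only: conf_ok is a hypothetical helper.
# Succeeds if $1 is a regular file which "other" users may read,
# which is what mode 644 provides.
conf_ok() {
    [ -f "$1" ] || return 1
    # ls -l prints the mode as e.g. -rw-r--r--; character 8 is the
    # read bit for "other".
    co_mode=$(ls -ld "$1" | cut -c8)
    [ "$co_mode" = "r" ]
}
```

For example:

    conf_ok /etc/viavoiceps.conf || echo "viavoiceps.conf missing or unreadable"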

As Damon Lynch tells me, this problem does not occur with the dictation rpms
version 3.0 which are in IBM's boxed set.

4.6 /usr/bin/vvuser

This is a wrapper script to ViaVoice's internal version of vvuser, which IBM
didn't mean users to run, but which we need to run anyway. Especially in case
vvstartuserguru segfaults, there's no alternative to vvuser. vvuser can also do
a few things like create additional voice models for a user, which the
vvstartuserguru GUI allows to select, but not to create. It also reads in the
settings from vvsetenv for you.

It's installed as part of the patch. Otherwise, download it (see appendix A)
and copy it to /usr/bin. Make sure its permissions are 755.

4.7 /usr/bin/viavoice

This wrapper script to vvstartdictation relieves you from having to source the
vvsetenv file, thus creating a stand-alone application. It's installed as part
of the patch. Otherwise, download it (see appendix A) and copy it to /usr/bin.
Make sure its permissions are 755.

This script also restores mixer settings. It's important to run ViaVoice always
with the same settings.

5.  Preparing for starting up ViaVoice

5.1 $HOME/viavoice/

This directory is created by dictation sometime during the initial
configuration, when running vvstartuserguru. It has the extremely annoying
effect of causing dictation's java program to segfault early on, or to die with
some other error.

The fix is easy (as a normal user):

	rm -rf $HOME/viavoice

At this early stage there aren't any useful files in there, but keep in mind
that this directory contains your viavoice users, voice models, and voice
training results later on so don't delete it once you've started with the voice
training! If you suspect that its existence may cause a crash, rename it and
try again.

It's a major bummer that this directory is created when it's not needed, and
that its existence prevents ViaVoice from getting to a stage where it can be

(IBM: There is more debugging info, see Links section.)
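
Once the directory holds real data, renaming is safer than deleting. A minimal
sketch; stash_dir is a made-up helper name, and the timestamp suffix is just
one possible naming scheme.

```shell
#!/bin/sh
# Sketch only: stash_dir is a hypothetical helper.
# Renames directory $1 to $1.<timestamp> and prints the new name.
stash_dir() {
    [ -d "$1" ] || return 1
    sd_new="$1.$(date +%Y%m%d-%H%M%S)"
    # Refuse to continue if the target name is already taken.
    [ -e "$sd_new" ] && return 1
    mv "$1" "$sd_new" && echo "$sd_new"
}
```

For example (as a normal user):

    stash_dir "$HOME/viavoice"

To go back to the old users and voice models, rename the directory back to
$HOME/viavoice.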

5.2 Creating a ViaVoice user

As per /usr/doc/ViaVoice/en_us.rt.readme.txt, running vvstartuserguru should
finally adjust the sound system and create a ViaVoice user. Unfortunately, it
segfaults when $HOME/viavoice exists, and it also segfaults when $HOME/viavoice
does not exist. Catch 22. No comment as to programming quality.

Luckily, there is another way. Make sure /etc/viavoiceps.conf exists, as per
4.5 above. Then run vvuser.

Now create a ViaVoice user (as a normal linux user):

    vvuser -adduser "Any name you like to be called by ViaVoice" -setdefault

This has also created a voice model for the user. You should create a new voice
model (and repeat the enrollment) for each time you have a change in microphone
or shift location. This adds a new voice model to the currently active ViaVoice
user:

    vvuser -addvoicemodel "My other desk" -setdefault

Make a different user current (the user must have been created with -adduser
before). You need this to add a voicemodel to a user which is not currently
active:

    vvuser -user "User name" -setdefault

As there is no way to add more users or voice models using the graphical user
interface, this is a handy program to know of.

5.3 Enrollment (Creating a Voice Model)

IBM says to use vvstartuserguru for this, however it crashes the very first
time (java segfault). vvstartuserguru does not allow creating additional users
or additional voice models for existing user(s), see 5.2 above. Use vvuser for
that.

After running vvuser to create a viavoice user, you must still run
vvstartuserguru. Although the audio setup and the story reading can be
performed with vvstartaudioguru and vvstartenrollment, a short speech sample is
not processed without running vvstartuserguru.

NOTE (read this twice!):


There is no warning about this, you will just be wasting your time and then
waste even more time trying to figure out why it doesn't work. Been there, done
that. I would gladly spend quite a bit more money on a product which was
engineered soundly!

5.4 Uninstalling ViaVoice

This is only listed here to mention another bug. Removing the ViaVoice rpms
fails because the script(s) these rpms contain are buggy. To clean up
afterwards, run this command as root:

rpm -e ... (the 4 ViaVoice rpms)
rpm -e --noscripts ViaVoice_runtime_US_LangPack
rm -rf /usr/lib/ViaVoiceTTS /etc/viavoiceps.conf

Remove this line from /etc/profile (and/or the shell you are using):
export ECIINI=/usr/lib/ViaVoiceTTS/eci.ini

(Note: The files created by applying the patch are still not removed.)
(Note 2: The install scripts of these rpms are buggy too. See section 3.)

6.  Proceed as per instructions

Finally, we're at a stage where we can do what the instructions say to start

6.1 Sound drivers

ViaVoice requires sound hardware and a linux driver which can record at 22kHz,
mono, and play back (see ViaVoice docs for details).

I have only one sound card to test with, and that's a Soundblaster PCI 64 with
an Ensoniq 1370 chip. It's a pretty cheap card, but it works (recognition
accuracy may perhaps be better with a higher-quality card).

I am using the alsa 0.9.0.cvs20020903 as shipped with SuSE 8.1. This driver
does not support /dev/sndstat for this card/chip, but it's not needed by
ViaVoice. The example in /usr/doc/ViaVoice/en_us.rt.readme.txt tests sound
playing and recording using /dev/audio, but I am not sure whether /dev/audio is
really required for running ViaVoice. /dev/dsp should be enough.
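
Whether the device files are there and accessible to your user can be checked
before starting anything. A minimal sketch; audio_dev_ok is a made-up helper
name. It only tests the device file, not whether the driver behind it works.

```shell
#!/bin/sh
# Sketch only: audio_dev_ok is a hypothetical helper.
# Succeeds if $1 is a character device which the current user may
# read and write.
audio_dev_ok() {
    [ -c "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}
```

For example (as the user who will run ViaVoice):

    for d in /dev/dsp /dev/audio; do
        audio_dev_ok "$d" || echo "cannot use $d"
    done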

Alternatively you can try the Linux kernel drivers, but they didn't use to
support PCI sound cards. There are the commercially available OSS drivers as
well.

Test your audio setup with any program you like, I used sound studio and snd
(both shipped with SuSE 8.1).

It's important that within vvstartuserguru, the volume slider for playback
volume adjustment works, and that the following test of recording quality works
as well. The recording quality check does not try to recognise any words, it
only determines whether the input signal is typical for spoken words. You
should be getting an "excellent". You need at least a "good" to make reasonable
use of ViaVoice.

6.2 Audio mixer

It is important to find some mixer settings (output volume, input gain, etc)
which will work with ViaVoice, and to have these restored each time ViaVoice is
started. If this doesn't happen, expect recognition accuracy to be seriously
affected. The viavoice wrapper script, created by applying the patch,
restores audio levels from a file you previously saved, so each time you start
ViaVoice you're starting with the same mixer settings.

This script looks for these files and restores the mixer settings from the
first one it finds (the names were different for previous versions of the


It then starts vvstartdictation.

The mixer settings file can be created with

    alsactl -f $HOME/viavoice/alsa.state store

or a system-wide one as root with

    alsactl -f /etc/asound.state.viavoice store
    chmod 644 /etc/asound.state.viavoice

after adjusting the mixer settings with a program of your choice. See next
section.
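
The "restore from the first file found" logic can be sketched as below. This is
not the code of the viavoice wrapper script; first_existing is a made-up helper
name, and the order of the two file names in the usage example is my
assumption.

```shell
#!/bin/sh
# Sketch only: first_existing is a hypothetical helper.
# Prints the first argument which names a readable file; fails if
# none of them does.
first_existing() {
    for fe_file in "$@"; do
        if [ -r "$fe_file" ]; then
            echo "$fe_file"
            return 0
        fi
    done
    return 1
}
```

For example, preferring the per-user file over the system-wide one:

    state=$(first_existing "$HOME/viavoice/alsa.state" \
            /etc/asound.state.viavoice) && alsactl -f "$state" restore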

6.3 Adjustment of audio setup

I strongly advise running alsamixer in a separate terminal window for the
following adjustments. This actually shows you what's going on. Run any other
mixer if you like, as long as you have good control with it. When running
ViaVoice, you'll probably want to mute the CD and line inputs, to avoid
unwanted interference with your microphone input.

Run vvstartuserguru. It will overwrite current mixer settings anyway.

For the playback test, when you move the slider, one of the volume sliders in
alsamixer should move accordingly. (For me it's the PCM one.) I have a
head-microphone without earphone, so I have to set the desktop speaker volume
very low so the microphone doesn't pick them up too much.

The next step is adjusting the microphone level. vvstartuserguru sometimes
pushes the mic level up to 100%, possibly creating a noisy feedback loop.
Adjust the mic level down with alsamixer, perhaps you want to reduce the total
output volume a bit as well. I get good results from starting with a mic level
close to zero, alsamixer shows how vvstartuserguru increases the mic level
until it can recognise a good signal.

Save the resulting mixer settings with alsactl as described in the previous
section. It's possible that ViaVoice itself stores the mic level to use for
dictation, but I don't think it stores other things like muting the CD and line
inputs. I find it a good idea to store the alsa values used, you can leave it
out if you like.

6.4 Creating a personal voice model

This is the next step in vvstartuserguru. Follow the instructions of the program. The
audio level in the little indicator should reach the green area.

6.5 Training the recognition engine

Choose one of the stories. I seem to remember rumours that the longer versions
result in better recognition accuracy later. I have observed vvstartuserguru to
whack up the mic gain to 100%, creating a feedback loop; if this happens you
need to set the level back to the value found in the previous step (it's still
useful to keep the window with alsamixer open).

6.6 Start dictation

Run viavoice to test it out.

7.  To Do

Transfer another language from MS-windows to linux, and use that instead. This
will now be difficult - instructions on how to do this are no longer on IBM's
web site, and the required version of ViaVoice Millenium for Windows (demo
version) hasn't been available for download for a while.

Tests of recognition accuracy and system resource use. ViaVoice runs adequately
on a P-III-450MHz with 512MB RAM. I am sure it would run equally well on 256MB

8.  Tips

8.1 More debugging info in the engine log / enable log files

To create router.msg, set the tag api_log_level = 2 in the "defaults" section
of $SPCH_BIN/engine.cfg. The files router.msg and engine.log are written to

8.2 Running ViaVoice as another user

This doesn't seem to be possible any more, and I can't find out why. All of
/dev/{dsp,adsp,snd,audio}* have permissions 666, i.e. are world-writable.
Symptoms are:
a) In vvstartuserguru, the volume adjustment for the output level does nothing,
only the system's mixer's control is effective (e.g. kmix or alsamixer).
b) The input level adjustment, when reading the paragraph for ViaVoice to
adjust its recording quality, plays a short funny sound and highlights the
whole text within a second, and that when the mic input is muted. It then sits
there forever saying "recording is in progress".

This happens with KDE 3.0.5 desktop, both Sun java 2 and IBM java 2, and using
SuSE's very good sux program to change to another user ID inside a konsole.

8.3 Microphone boost

My soundcard's mixer has a "mic boost" setting in alsamixer, as I now
discovered. This hugely increases the microphone gain. Before I needed to run
at 100% mic level, with this boost I can run in the 50s and actually have some
leeway for adjustment. Check your mixer settings, especially if your recording
is quiet and/or you don't get an "excellent" for recording quality.

A.  Links

Here are some related links:

viavoice-3.1-SuSE8.1.patch
	The patch to apply after installing the rpms. Contains all the
	modifications of section 4 (excluding /etc/viavoiceps.conf) and the
	scripts which are newly created.
	This is the latest patch; previous ones can be found in the same
	directory.
vvuser, viavoice
	Up-to-date versions of the files which need to be created. Copy to
	/usr/bin and set permissions to 755. These are included in the patch
	above.
viavoiceps.conf
	If you don't have it. Copy to /etc and set permissions to 644.
Debugging info
	More debugging information for IBM's programmers.

	   IBM ViaVoice for Linux
	   IBM ViaVoice Developer's Corner
	   IBM ViaVoice software development kit + FAQ
	   IBM download area (or search for 'java linux download jre')
	IBM java 2, 1.3.1 - you only need the JRE (runtime) to run ViaVoice
	The xvoice project

B.  Quick Start

## As root:

# Install the java JRE from the distribution, or download
# + install IBM java2 1.3.1 (see appendix A for links)

# Mandrake powerpack 8.0 CD 4:
rpm -Uvh /media/cdrom/Mandrake/RPMS4/ViaVoice*
chmod a+r /etc/viavoiceps.conf /usr/lib/menu/vv*

# Download file(s)

umask 22
patch -b -p0 <viavoice-3.1-SuSE8.1.patch
rm /usr/bin/vvuser.orig /usr/bin/viavoice.orig
chmod 755 /usr/bin/vvuser /usr/bin/viavoice

## As normal linux user:

# ViaVoice user setup
vvuser -adduser "Any name you like to be called by ViaVoice" -setdefault

# Audio mixer settings
# See sections 6.2 and following to adjust mixer settings, then to save them
alsactl -f $HOME/viavoice/alsa.state store

# Run ViaVoice dictation
viavoice
