nvsload
The nvsload utility is a performance test sample application that Nuance uses to test Vocalizer performance. Nuance provides access to this tool to allow users to test performance on their specific configuration. See Performance testing for a detailed explanation of this test application and the test methodology it implements.
To run the sample application, open a command prompt and run the nvsload utility (nvsload.exe on Windows).
The utility is typically found in:
- Linux:
  - /usr/local/Nuance/Common/x86/bin/ (32-bit)
  - /usr/local/Nuance/Common/amd64/bin/ (64-bit)
- Windows:
  - C:\Program Files (x86)\Common Files\Nuance\Common\x86\bin\ (32-bit)
  - C:\Program Files\Common Files\Nuance\Common\amd64\bin\ (64-bit)
You can start the 32-bit version (on Linux) or the 64-bit version (on Windows) from any directory. To start the other version (that is, the 64-bit version on Linux or the 32-bit version on Windows), run it from the directory where it is installed. The syntax is:
> nvsload options
Options
Use double quotes around option values to avoid errors. For example, -a "audio/L16;rate=8000" or -p "vendor=Nuance Communications, Inc".
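The quoting advice above can be illustrated with a small helper that assembles a command line. This is only a sketch: the function name build_command is hypothetical, it uses only options described on this page, and it does not escape double quotes that appear inside a value. Quoting matters because, for example, a POSIX shell treats an unquoted semicolon as the end of the command.

```python
def build_command(options, exe="nvsload"):
    """Join (flag, value) pairs into one command line.

    Every value is wrapped in double quotes, as recommended above.
    A value of None marks a flag that takes no argument (such as -b).
    """
    parts = [exe]
    for flag, value in options:
        parts.append(flag)
        if value is not None:
            parts.append('"' + str(value) + '"')
    return " ".join(parts)


command = build_command([
    ("-a", "audio/L16;rate=8000"),  # semicolon must be protected by quotes
    ("-l", "5 minutes"),            # space must be protected by quotes
    ("-b", None),                   # flag without a value
])
print(command)
```

The printed line can be pasted into a command prompt; the helper does not start the utility itself.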
The available options include:
-c minChan[:maxChan[:chanStep]] | The channel range, starting at the minimum channel value (minChan). Optionally, you can also specify the maximum channel value (maxChan) and the interval to use between channels (chanStep). By default, the utility uses values 1:1:1. |
-f testTextFile | Name of the directory containing test texts. The utility looks for test texts in "%VOCALIZER_SDK%\test_data" by default. |
-o testTextDomain | A comma-separated list of names of the test text domains, such as "standard,ssml". The default is "standard". |
-d initDelay | Maximum initial delay per channel, measured in seconds. The default is 15 seconds. |
-l testLength | Duration of the test, such as "5 minutes", "72 hours", "3 days", or "10 loops". If the unit is not specified, the utility uses 1 loop as the default, where the number of loops is the number of speak requests per channel. |
-x latency | Maximum acceptable audio latency, measured in milliseconds. A value of 0 means no latency restriction. The default is 0. |
-b | Bombard the engine with requests. By default, the utility simulates the audio playback delay. |
-s | Speak texts sequentially. By default, the utility speaks texts in a random order. |
-k | Enable audio data checksum. The default is not enabled. |
-p params | Voice parameters. This option can be specified multiple times. The default parameter, "vendor=Nuance Communications, Inc", tests all installed voices. |
-r reconCount | Restart connections every N speak requests. The default is 0, meaning no restarts. |
-v logLevel | Level of logging detail, ranging from 0 for summary statistics to 4 for full information. The default is 0 (summary statistics). |
-a audioFormat | MIME audio type describing the desired audio output format. By default, the utility picks randomly between 8 kHz mu-law, |
-t stop% | Stops the specified stop% of speak requests. The default is 0%. |
-S size | Audio buffer size, in bytes. No default. |
-h | Shows a help message which includes this summary of options. |
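As an illustration of the -c syntax, the sketch below expands a minChan[:maxChan[:chanStep]] value into the channel counts it describes. It rests on assumptions that this page does not confirm: that the range is inclusive, that an omitted maxChan equals minChan, and that an omitted chanStep is 1. The function name expand_channel_range is hypothetical and is not part of nvsload.

```python
def expand_channel_range(spec):
    """Expand "minChan[:maxChan[:chanStep]]" into a list of channel counts.

    Assumptions (not confirmed by the documentation): the range is
    inclusive, maxChan defaults to minChan, and chanStep defaults to 1.
    """
    parts = [int(p) for p in spec.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected minChan[:maxChan[:chanStep]]")
    min_chan = parts[0]
    max_chan = parts[1] if len(parts) > 1 else min_chan
    step = parts[2] if len(parts) > 2 else 1
    if min_chan < 1 or max_chan < min_chan or step < 1:
        raise ValueError("invalid channel range: " + spec)
    return list(range(min_chan, max_chan + 1, step))


print(expand_channel_range("1:1:1"))     # the documented default
print(expand_channel_range("10:50:20"))
```

Under these assumptions, "10:50:20" describes runs at 10, 30, and 50 channels.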
Modes
There are two main modes of operation for the performance testing application:
- Realtime audio: The realtime audio mode is the default, and simulates an application that makes a speak request, and then waits for approximately the time the audio would play before making the next request. This simulates an application that does constant synthesis with realtime audio playback to end-users.
- Bombard mode: Specified with the -b option, the bombard mode sends requests to the server with only a nominal delay between requests. This simulates the worst case performance.
The user can specify the -s option to use the texts sequentially, rather than the default of choosing each text randomly.
The user can also specify a delay before channels start at the beginning of the test run with the -d option to simulate staggering of requests. Each channel waits a random time up to the specified delay before the first speak request.
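The difference between the two modes, and the effect of the initial delay, can be sketched as a simple timing model for one channel. This is a simplified illustration, not the utility's implementation: it ignores synthesis time and the nominal delay used in bombard mode, and the function name request_start_times is hypothetical.

```python
import random


def request_start_times(audio_durations, init_delay=15.0, bombard=False, rng=None):
    """Return the start time (in seconds) of each speak request on one channel.

    The channel first waits a random time up to init_delay. In realtime
    mode it then waits roughly the playback time of each audio result
    before the next request; in bombard mode it does not wait.
    """
    rng = rng or random.Random()
    t = rng.uniform(0.0, init_delay)  # staggered start, as with -d
    starts = []
    for duration in audio_durations:
        starts.append(t)
        if not bombard:
            t += duration  # simulate the audio playback delay
    return starts


durations = [2.0, 3.5, 1.0]  # hypothetical audio lengths in seconds
print(request_start_times(durations, init_delay=0.0))
print(request_start_times(durations, init_delay=0.0, bombard=True))
```

With no initial delay, the realtime requests are spaced by the audio lengths, while the bombard requests all start immediately in this model.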
Output format
The level of detail provided by the program is specified with the -v option, which takes an integer argument with a default value of 0. The program produces more detailed output as the number increases. Useful values are:
- 0: Just print the test conditions and a table of results.
- 1: As above plus a summary for each individual request and a notification for each under-run.
- 2: As above plus raw timing for each speak request, first audio packet, and leading and ending bookmarks.
- 3: As above plus speak started and ended notifications.
- 4: As above plus raw timing for every audio packet.
- 5: As above plus port state transitions.
The user can also use the -x option to set the maximum acceptable time-to-first-audio latency in milliseconds. If the value is 0 or the -x option is omitted, all latencies are deemed "acceptable". The value is used only to print the percentage of "bad latencies" in the table of results; it has no other effect on the test runs or statistics.
Timing information for individual requests is preceded by the thread number.
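The "bad latencies" percentage can be sketched as follows. This is an illustration under assumptions: the function name bad_latency_percent is hypothetical, and the choice to count a latency as bad only when it is strictly greater than the threshold is a guess, since this page does not say how a latency equal to the threshold is treated.

```python
def bad_latency_percent(latencies_ms, max_latency_ms=0):
    """Return the percentage of latencies above the acceptable maximum.

    A threshold of 0 means no restriction, so nothing counts as bad.
    Assumption: a latency equal to the threshold is still acceptable.
    """
    if not latencies_ms or max_latency_ms == 0:
        return 0.0
    bad = sum(1 for latency in latencies_ms if latency > max_latency_ms)
    return 100.0 * bad / len(latencies_ms)


samples = [120, 250, 90, 400]  # hypothetical time-to-first-audio values in ms
print(bad_latency_percent(samples, max_latency_ms=200))
print(bad_latency_percent(samples))  # no -x given: everything is acceptable
```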
Note that the channel count printed in the table of results reflects the number of channels successfully opened, which can be less than the user-requested number of channels if errors occurred when opening a channel.
Creating a test file
The test file is selected with the -f option, or defaults to a Vocalizer installation directory that contains news text corpora.
A test file consists of a header section followed by a blank line, then all of the text to be used for testing. The header section specifies how to split the text into smaller parts used for each individual request. Each header line consists of an alphanumeric name (no spaces allowed), then the start line and number of lines for each individual test. The line numbers are absolute, counting the first line in the file as 1. The header can have optional lines with embedded parameters relevant to the test file being used. The only parameter currently interpreted is content-type.
The text file must use a character set that is a superset of ASCII (preserves ASCII characters as-is) due to the header being within the same file as the input text. Good choices are UTF-8 for all languages, or ISO-8859-1 or US-ASCII for Western languages.
An example test file is shown below:
#!content-type text/plain;charset=US-ASCII
ex1 5 1
ex2 6 2

This is an example text for testing.
And this is another example, but
it extends over two lines.
When run, the first instruction (ex1 5 1) takes the fifth line of the file (“This is an example text for testing.”). The second instruction (ex2 6 2) starts at line 6 and takes two lines (“And this is another example, but it extends over two lines.”).
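The file layout described above can be sketched as a short parser. This is not the utility's own code: the function name parse_test_file is hypothetical, the "#!" prefix for parameter lines is inferred from the example, and joining multi-line texts with a newline is an assumption.

```python
def parse_test_file(text):
    """Split a test file into header parameters and named test texts.

    Header lines are read until the first blank line. Lines starting
    with "#!" are treated as parameters (inferred from the example);
    all other header lines are "name startLine lineCount".
    Line numbers are absolute and the first line of the file is 1.
    """
    lines = text.splitlines()
    params = {}
    entries = []
    for line in lines:
        if not line.strip():
            break  # the blank line ends the header section
        if line.startswith("#!"):
            key, _, value = line[2:].partition(" ")
            params[key] = value.strip()
        else:
            name, start, count = line.split()
            entries.append((name, int(start), int(count)))
    tests = {}
    for name, start, count in entries:
        # Convert the 1-based start line into a 0-based slice.
        tests[name] = "\n".join(lines[start - 1:start - 1 + count])
    return params, tests


SAMPLE = (
    "#!content-type text/plain;charset=US-ASCII\n"
    "ex1 5 1\n"
    "ex2 6 2\n"
    "\n"
    "This is an example text for testing.\n"
    "And this is another example, but\n"
    "it extends over two lines.\n"
)

params, tests = parse_test_file(SAMPLE)
print(params["content-type"])
print(tests["ex1"])
print(tests["ex2"])
```

Because the blank separator line counts as line 4, the first text line is line 5, which matches the ex1 entry in the example.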