Content
· About this document
· What's new?
· Known issues
· Copyright
What's New?
The Microsoft Speech Platform SDK v11 provides new functionality in the Microsoft Grammar
Development Tools to help you validate, debug, test, and optimize grammars for voice
applications. Here is a summary of the new functionality since the previous public release, v10.2:
The following tools for grammar authoring are new for v11:
CheckPhrase.exe
This new tool takes a phrase and a grammar as input and outputs whether the phrase is in the
grammar. CheckPhrase also emits the custom pronunciations associated with a found phrase.
Confusability.exe
This new tool identifies phrases in a grammar that are phonetically similar. The tool can help
detect phrases that may ultimately cause users to have a poor experience because one phrase
in the grammar is falsely recognized as another phrase that is also in the grammar. The
Confusability tool accepts multiple grammar files as input and performs its analysis on the
entire set of specified input files.
· You can choose whether or not the tool emits confusable results for phrases
whose semantics match. By default, warnings and results are only emitted when
semantics are different.
· For advanced analysis, the /SemanticPath option allows you to filter the results
to compare only the semantic nodes that you specify.
The following is a summary of enhancements made to grammar development tools since the
previous release of Microsoft Speech Platform SDK (v10.2).
GrammarValidator.exe
· Validates the syntax for the grammar element's sapi:alphabet attribute and the
token element's sapi:pron and sapi:display attributes.
PhraseGenerator.exe
· Generates a subset of phrases based on weights in the grammar, and you can
optionally choose not to expand specified rule references.
Simulator.exe
· Reuse of the engine state is now controlled by the structure of the input EMMA
rather than by an option at the command line. That is, the recognition engine state is
now automatically reused only when utterances are contained within an
emma:sequence block. In addition, the /ReuseEngineState command-line option has
been removed.
· Now provides additional data for tuning your grammars, including
out-of-grammar phrase detection and new error-rate categories to give you improved
analysis and metrics for simulated recognition.
· The Excel template for generating graphs and tables from the output of Simulator
Results Analyzer has been discontinued. See the help file section entitled "Generating
Simulations and Performing Analysis" for details on analysis.
· Explicit handling of SRGS "special" rule behavior. Improved handling and more
descriptive output for grammars that contain the NULL, VOID, and GARBAGE rules have
been added across the Grammar Development Tools. See Special Rules Support.
· Support for cookie pass-through. You can now define cookies to use for local or
web provider types in both the Simulator's recognition configuration file and the EMMA
input file. See Simulator Reference Manual and Setting Up the Grammar Development
Tools.
· New installation option. You can choose the path for the installation of the
Speech Platform SDK version 11.
· Windows XP support for SDK tools. The Grammar Development Tools (with the
exception of PrepareGrammar.exe) are now supported on computers running Windows
XP, but only for some providers. Using the "local" provider is not supported on Windows
XP. See Speech Recognition Engine Configuration File Settings for more information.
Important
· All functionality of the SDK outside of the Grammar Development Tools is NOT
supported on Windows XP.
· Leveraging of phrases based on URL. The tools allow you to reference
grammars via URL.
· Floating point representation. Scientific notation for floating point numbers (such
as 0.1E+1) is not supported.
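If you generate numeric values for the tools programmatically (for example, weights written into a grammar, which is an assumption about where such values arise), it may help to format them as plain decimals so that scientific notation never appears. The following is a minimal Python sketch and is not part of the SDK; the helper name plain_decimal is hypothetical.

```python
from decimal import Decimal


def plain_decimal(value):
    """Format a number as a plain decimal string, never in scientific notation.

    Hypothetical helper, not part of the Speech Platform SDK.
    """
    # str() gives the shortest text that round-trips the float (for example
    # "0.1"), and the "f" format of Decimal always expands the exponent.
    text = format(Decimal(str(value)), "f")
    # Trim redundant trailing zeros and a dangling decimal point.
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text
```

With this sketch, plain_decimal(0.1E+1) returns "1" and plain_decimal(1e-7) returns "0.0000001".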
· 32-Bit (x86)
· 64-Bit (x64)
· Windows Vista (x86 & x64) with Service Pack 2 - all editions except Starter Edition
In addition, Windows XP Service Pack 3 has limited support under the following conditions:
· x86 only
· The SDK tools on Windows XP will only work when configured against a hosted
(non-local) recognizer
· 1 GHz CPU
· 512 MB RAM
· 10 GB hard drive
· MicrosoftSpeechPlatformSDK.msi
· SpeechPlatformRuntime.msi
· Ensure that you have downloaded and installed .NET 4.0, which is a requirement
for the grammar development tools (download here:
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992&displaylang=en).
After the above steps, navigate to where you installed the SDK (the default location is
"%ProgramFiles%\Microsoft SDKs\Speech\<version>\Tools"), where you’ll find tools helpful for
grammar development and tuning:
· GrammarValidator.exe
· Simulator.exe
· SimulatorResultsAnalyzer.exe
· PhraseGenerator.exe
· CompileGrammar.exe
· CheckPhrase.exe
· Confusability.exe
· PrepareGrammar.exe
For directions on using the tools, go to "<Install location>\Docs" and open the help file
entitled "MicrosoftSpeechPlatformSDK.chm". In the navigation pane on the left, under the
"Contents" tab, navigate to the node "Microsoft Grammar Development Tools" for specific tool
instructions.
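To confirm that the tools listed above were installed, a short script can check the Tools folder for each executable. This is a minimal Python sketch and is not part of the SDK; the function name missing_tools is hypothetical, and the folder path is supplied by the caller because the <version> part of the default path depends on your installation.

```python
import os

# Tool names as listed in this document.
EXPECTED_TOOLS = [
    "GrammarValidator.exe",
    "Simulator.exe",
    "SimulatorResultsAnalyzer.exe",
    "PhraseGenerator.exe",
    "CompileGrammar.exe",
    "CheckPhrase.exe",
    "Confusability.exe",
    "PrepareGrammar.exe",
]


def missing_tools(tools_dir):
    """Return the expected tool executables that are absent from tools_dir.

    Hypothetical helper; tools_dir is the Tools folder of your installation.
    """
    return [
        name
        for name in EXPECTED_TOOLS
        if not os.path.isfile(os.path.join(tools_dir, name))
    ]
```

An empty list means every tool named in this document was found in the folder you passed in.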
Known issues
1. Regarding the PrepareGrammar.exe tool:
· A grammar with an inline custom pronunciation cannot be found using the tool.
This is true both for an input phrase file and for a phrase entered at the command line.
· DTMF handling and CheckPhrase.exe: DTMF grammars that use the
repeat attribute of the item element (e.g., <item repeat="1-1">, where "1-1" could be
"2-2", "3-3", etc.) may generate phrases in which the digits are space separated.
Consider running PhraseGenerator.exe first to see which forms of digit phrases are
consistent with a grammar before passing digit phrases into CheckPhrase with a DTMF
grammar. For example, "3 3 4" may be found in a DTMF grammar with CheckPhrase.exe
while "334" may not be found, depending on the implementation of the DTMF grammar
and the 'repeat' attribute.
4. When a grammar contains JavaScript in its semantics, JavaScript that does not use the
'out' object may not be reported correctly. For example, the following
will not generate returned semantics from CheckPhrase:
· out.answer='MY_NOMATCH'; foobar=hello
· out.answer='MY_NOMATCH'; out.foobar='hello';
5. Speech synthesis bookmark events may fire in reverse order if the bookmarks are
directly adjacent to each other (with no content between them).
6. Regarding setup: v10.0 engines (voice, tele, trans, TTS) are not supported with the v11
platform.
7. Side-by-side installation of multiple versions of the Platform SDK, such as v10.2 and
v11, is not supported. Please uninstall old Speech platforms or SDKs before
using the new ones.
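For the DTMF issue described in item 1 above, one possible workaround is to rewrite digit strings in the space-separated form before passing them to CheckPhrase.exe. The sketch below is in Python and is not part of the SDK; the function name space_separate_keys is hypothetical, and it assumes that every key press is a single character (such as 0-9, * or #).

```python
def space_separate_keys(phrase):
    """Return a DTMF phrase with each key press separated by one space.

    Hypothetical helper. Assumes each key press is one character, so
    spelled-out key names are not handled.
    """
    # Drop any existing whitespace, then rejoin the keys with single spaces.
    keys = [ch for ch in phrase if not ch.isspace()]
    return " ".join(keys)
```

For example, space_separate_keys("334") returns "3 3 4", and a phrase that is already space separated is returned unchanged. Whether the spaced form is the one a given grammar accepts still depends on how that grammar is written.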
Yes. The SDK grammar authoring tools require .NET 4.0 to run. You can download and install
.NET from here:
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992&displaylang=en.
Are there separate SDK installations for 32-bit and 64-bit operating systems?
Yes. You can download both 32-bit and 64-bit installations. Note that you can install the 32-bit
version of the SDK on a 64-bit operating system.
In the current version of the SDK and Simulator.exe, you can mimic the reuse of the engine
state (e.g., a single caller speaking across dialogue states) through the structure of the input
data to Simulator. In short, if you put the set of utterances under a single <Sequence> element,
the engine state is reused between recognition requests. Note that this differs from
previous SDK versions, in which the reuse was specified for all utterances at the command
line.
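As an illustration of the grouping described above, the following Python sketch builds a skeleton EMMA document in which several utterances sit under one emma:sequence element. It is not part of the SDK and makes several assumptions: the function name and the ids are hypothetical, the sequence element is written in its EMMA 1.0 form (emma:sequence), and the per-utterance content that Simulator actually requires is left out. See the Simulator Reference Manual in the help file for the real input format.

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace, as defined by the W3C EMMA specification.
EMMA_NS = "http://www.w3.org/2003/04/emma"
ET.register_namespace("emma", EMMA_NS)


def build_sequence_document(utterance_ids):
    """Return EMMA text with one emma:sequence wrapping all utterances.

    Hypothetical helper. Each utterance is only a placeholder
    emma:interpretation; the data Simulator needs per utterance is omitted.
    """
    root = ET.Element(f"{{{EMMA_NS}}}emma", {"version": "1.0"})
    # A single sequence container groups the utterances of one caller.
    sequence = ET.SubElement(root, f"{{{EMMA_NS}}}sequence", {"id": "seq1"})
    for utterance_id in utterance_ids:
        ET.SubElement(
            sequence, f"{{{EMMA_NS}}}interpretation", {"id": utterance_id}
        )
    return ET.tostring(root, encoding="unicode")
```

Calling build_sequence_document(["utt1", "utt2"]) yields one document whose two placeholder utterances share a single sequence container, which is the arrangement the text above describes for reusing the engine state.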
Copyright
Information in this document, including URL and other Internet Web site references, is subject
to change without notice. Unless otherwise noted, the companies, organizations, products,
domain names, e-mail addresses, logos, people, places, and events depicted in examples herein
are fictitious. No association with any real company, organization, product, domain name,
e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all
applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in or introduced into a retrieval
system, or transmitted in any form or by any means (electronic, mechanical, photocopying,
recording, or otherwise), or for any purpose, without the express written permission of
Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you
any license to these patents, trademarks, copyrights, or other intellectual property.