HoloLens Unity Development: Speech Recognition

2019-01-28

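I. Overview
Three kinds of speech input are available when developing for HoloLens with Unity and HoloToolkit. The classes themselves live in Unity's UnityEngine.Windows.Speech namespace:
· Phrase Recognition
  * KeywordRecognizer: recognizes individual keywords
  * GrammarRecognizer: recognizes phrases defined by a grammar
· Dictation Recognition
  * DictationRecognizer: converts speech into text
Note: none of the three recognizers listens until you call its Start() method. Once started, KeywordRecognizer and GrammarRecognizer fire a callback whenever they hear one of the phrases you registered, while DictationRecognizer transcribes free-form speech and reports the text through its events.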
II. Enabling the Microphone Capability in Unity
The official documentation explains how to enable the Microphone capability:
The Microphone capability must be declared for an app to leverage Voice input.
1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player"
2. Click on the "Windows Store" tab
3. In the "Publishing Settings > Capabilities" section, check the Microphone capability
III. Phrase Recognition
To enable your app to listen for specific phrases spoken by the user then take some action, you need to:
1. Specify which phrases to listen for using a KeywordRecognizer or GrammarRecognizer
2. Handle the OnPhraseRecognized event and take action corresponding to the phrase recognized
In short: register the phrases or keywords you care about, then react to them in the OnPhraseRecognized callback.
1. Keyword recognition (demo code)
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceInputDemo : MonoBehaviour
{
    public Material yellow;
    public Material red;
    public Material blue;
    public Material green;

    /// <summary>
    /// The keyword recognizer.
    /// </summary>
    private KeywordRecognizer keywordRecognizer;

    /// <summary>
    /// Maps each keyword to the action to run when it is heard.
    /// </summary>
    private Dictionary<string, System.Action> keywords = new Dictionary<string, System.Action>();

    // Use this for initialization
    void Start()
    {
        // Add the keywords: the key is the keyword, the value is an anonymous action.
        keywords.Add("yellow", () =>
        {
            Debug.Log("Heard yellow");
            transform.GetComponent<MeshRenderer>().material = yellow;
        });

        keywords.Add("red", () =>
        {
            Debug.Log("Heard red");
            transform.GetComponent<MeshRenderer>().material = red;
        });

        keywords.Add("green", () =>
        {
            Debug.Log("Heard green");
            transform.GetComponent<MeshRenderer>().material = green;
        });

        keywords.Add("blue", () =>
        {
            Debug.Log("Heard blue");
            transform.GetComponent<MeshRenderer>().material = blue;
        });

        // Create the recognizer from the dictionary keys.
        keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());

        // Subscribe to the recognition event.
        keywordRecognizer.OnPhraseRecognized += KeywordRecognizer_OnPhraseRecognized;

        // Required: nothing is recognized until Start() is called.
        keywordRecognizer.Start();
    }

    private void KeywordRecognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        System.Action keywordAction;
        // If the recognized keyword is in our dictionary, call its action.
        if (keywords.TryGetValue(args.text, out keywordAction))
        {
            Debug.Log("Recognized keyword: " + args.text);

            // Run the action.
            keywordAction.Invoke();
        }
    }
}
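The demo above starts its recognizer but never releases it. As a minimal sketch (the class name `KeywordLifecycleDemo` and its two keywords are made up for illustration), the component below uses the constructor overload that takes a minimum ConfidenceLevel and cleans up in OnDestroy. It assumes a Unity project targeting a Windows platform with the Microphone capability enabled.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical component: shows only the start/stop lifecycle of a KeywordRecognizer.
public class KeywordLifecycleDemo : MonoBehaviour
{
    private KeywordRecognizer keywordRecognizer;

    void Start()
    {
        // Illustrative keywords; the second argument is the minimum confidence
        // a phrase needs before the callback fires.
        keywordRecognizer = new KeywordRecognizer(
            new[] { "red", "blue" }, ConfidenceLevel.Medium);
        keywordRecognizer.OnPhraseRecognized += OnPhraseRecognized;
        keywordRecognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // args.confidence reports how sure the recognizer was about this phrase.
        Debug.Log("Heard '" + args.text + "' with confidence " + args.confidence);
    }

    void OnDestroy()
    {
        if (keywordRecognizer == null)
        {
            return;
        }

        // Unsubscribe, stop listening, then release the native resources.
        keywordRecognizer.OnPhraseRecognized -= OnPhraseRecognized;
        if (keywordRecognizer.IsRunning)
        {
            keywordRecognizer.Stop();
        }
        keywordRecognizer.Dispose();
    }
}
```

Lowering the threshold to ConfidenceLevel.Low is likely to accept more utterances at the cost of more false positives, so treat the value as a starting point rather than a recommendation.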
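2. Grammar recognition with GrammarRecognizer
According to the official documentation, you create an SRGS XML file and place it in the StreamingAssets folder. I have not needed grammar-based English input myself; if you are interested, see the official SRGS reference at https://msdn.microsoft.com/en-us/library/hh378349(v=office.14).aspx
The walkthrough below is quoted from the official documentation: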
Once you have your SRGS grammar, and it is in your project in a StreamingAssets folder:
<PROJECT_ROOT>/Assets/StreamingAssets/SRGS/myGrammar.xml
Create a GrammarRecognizer and pass it the path to your SRGS file:
private GrammarRecognizer grammarRecognizer;
grammarRecognizer = new GrammarRecognizer(Application.streamingAssetsPath + "/SRGS/myGrammar.xml");
Now register for the OnPhraseRecognized event:
grammarRecognizer.OnPhraseRecognized += Grammar_OnPhraseRecognized;
You will get a callback containing information specified in your SRGS grammar which you can handle appropriately. Most of the important information will be provided in the semanticMeanings array.
private void Grammar_OnPhraseRecognized(PhraseRecognizedEventArgs args)
{
    SemanticMeaning[] meanings = args.semanticMeanings;
    // do something
}
Finally, start recognizing!
grammarRecognizer.Start();
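To make the `// do something` placeholder concrete, here is a hedged sketch of a handler that logs what the grammar returned. The class name `GrammarInputDemo` is hypothetical and the file path is the one from the quoted documentation. The sketch assumes the SRGS file attaches semantic tags to its rules; without tags the semanticMeanings array may well be empty or null, which is why the handler checks for it.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical component: loads an SRGS grammar and logs its semantic results.
public class GrammarInputDemo : MonoBehaviour
{
    private GrammarRecognizer grammarRecognizer;

    void Start()
    {
        // Path taken from the walkthrough above; the file itself is assumed to exist.
        string grammarPath = Application.streamingAssetsPath + "/SRGS/myGrammar.xml";
        grammarRecognizer = new GrammarRecognizer(grammarPath);
        grammarRecognizer.OnPhraseRecognized += Grammar_OnPhraseRecognized;
        grammarRecognizer.Start();
    }

    private void Grammar_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Recognized phrase: " + args.text);

        SemanticMeaning[] meanings = args.semanticMeanings;
        if (meanings == null)
        {
            return;
        }

        // Each SemanticMeaning carries a key and the values the grammar assigned to it.
        foreach (SemanticMeaning meaning in meanings)
        {
            Debug.Log(meaning.key + " = " + string.Join(", ", meaning.values));
        }
    }

    void OnDestroy()
    {
        if (grammarRecognizer == null)
        {
            return;
        }

        grammarRecognizer.OnPhraseRecognized -= Grammar_OnPhraseRecognized;
        grammarRecognizer.Dispose();
    }
}
```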
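IV. Dictation
1. Overview
DictationRecognizer converts speech input into text. Using it takes three steps:
1. Create a DictationRecognizer object
2. Register for the dictation events
3. Start dictation
2. Enabling the Internet Client capability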
The "Internet Client" capability, in addition to the "Microphone" capability mentioned above, must be declared for an app to leverage dictation.
1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player" page
2. Click on the "Windows Store" tab
3. In the "Publishing Settings > Capabilities" section, check the InternetClient capability
3. Demo code
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceDictationDemo : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    // Use this for initialization
    void Start()
    {
        // Create the dictation recognizer.
        dictationRecognizer = new DictationRecognizer();

        // Final result for a recognized phrase.
        dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;
        // The dictation session has ended (finished, timed out, or failed).
        dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;
        // An error occurred.
        dictationRecognizer.DictationError += DictationRecognizer_DictationError;
        // Interim guess while the user is still speaking.
        dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;

        dictationRecognizer.Start();
    }

    private void DictationRecognizer_DictationHypothesis(string text)
    {
        Debug.Log("Hypothesis callback: " + text);
    }

    private void DictationRecognizer_DictationError(string error, int hresult)
    {
        Debug.Log("Error callback: " + error + ", HRESULT " + hresult);
    }

    private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
    {
        Debug.Log("Complete callback: " + cause);

        // The session has ended at this point. Restart after a silence
        // timeout so the demo keeps listening.
        if (cause == DictationCompletionCause.TimeoutExceeded)
        {
            dictationRecognizer.Start();
        }
    }

    private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
    {
        // The session is still running here, so there is no need to call Start() again.
        Debug.Log("Result callback: " + text + ", confidence " + confidence);
    }

    void OnDestroy()
    {
        // Unsubscribe from the events and release the recognizer.
        dictationRecognizer.DictationResult -= DictationRecognizer_DictationResult;
        dictationRecognizer.DictationComplete -= DictationRecognizer_DictationComplete;
        dictationRecognizer.DictationHypothesis -= DictationRecognizer_DictationHypothesis;
        dictationRecognizer.DictationError -= DictationRecognizer_DictationError;
        dictationRecognizer.Dispose();
    }
}
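I tested this informally with short English videos from Youdao and saw a recognition rate of roughly 98% or better. Microsoft has done an impressive job here.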
V. Using Phrase Recognition and Dictation Together (from the official documentation)
If you want to use both phrase recognition and dictation in your app, you'll need to fully shut one down before you can start the other. If you have multiple KeywordRecognizers running, you can shut them all down at once with:
PhraseRecognitionSystem.Shutdown();
In order to restore all recognizers to their previous state, after the DictationRecognizer has stopped, you can call:
PhraseRecognitionSystem.Restart();
You could also just start a KeywordRecognizer, which will restart the PhraseRecognitionSystem as well.
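The hand-off between the two systems can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the class name `SpeechModeSwitcher` and the methods `BeginDictation` and `EndDictation` are hypothetical, and because it is not clear from the documentation quoted here whether Shutdown() returns only after the phrase system has fully stopped, the sketch polls PhraseRecognitionSystem.Status in a coroutine as a precaution.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical helper: pauses phrase recognition while dictation runs, then restores it.
public class SpeechModeSwitcher : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    void Awake()
    {
        dictationRecognizer = new DictationRecognizer();
        dictationRecognizer.DictationResult += OnDictationResult;
        dictationRecognizer.DictationComplete += OnDictationComplete;
    }

    // Call this, for example from a keyword action, to switch to dictation.
    public void BeginDictation()
    {
        StartCoroutine(SwitchToDictation());
    }

    private IEnumerator SwitchToDictation()
    {
        // Stop every running KeywordRecognizer and GrammarRecognizer at once.
        if (PhraseRecognitionSystem.Status == SpeechSystemStatus.Running)
        {
            PhraseRecognitionSystem.Shutdown();
        }

        // Precaution (assumption): wait until the phrase system reports it has stopped.
        while (PhraseRecognitionSystem.Status == SpeechSystemStatus.Running)
        {
            yield return null;
        }

        dictationRecognizer.Start();
    }

    // Call this when the user is done dictating.
    public void EndDictation()
    {
        if (dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            dictationRecognizer.Stop();
        }
    }

    private void OnDictationResult(string text, ConfidenceLevel confidence)
    {
        Debug.Log("Dictated: " + text);
    }

    private void OnDictationComplete(DictationCompletionCause cause)
    {
        // Dictation has stopped, so the phrase recognizers can be restored.
        PhraseRecognitionSystem.Restart();
    }

    void OnDestroy()
    {
        dictationRecognizer.DictationResult -= OnDictationResult;
        dictationRecognizer.DictationComplete -= OnDictationComplete;
        dictationRecognizer.Dispose();
    }
}
```

Restoring phrase recognition from DictationComplete keeps the two modes from overlapping, whichever way the dictation session ends (manual stop, timeout, or error).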

Original article: https://yq.aliyun.com/articles/689316
