Friday, January 12, 2018, 13:00-16:10: FRIIS Lecture "Personalised voices for people with Motor Neurone Disease (also known as ALS)" by Prof. Simon King (University of Edinburgh, UK) and other speakers

Prof. Simon King of the University of Edinburgh and two other speakers will give lectures on the state of the art in speech synthesis research. Master's and doctoral students are strongly encouraged to attend.

Date and time: Friday, January 12, 2018, 13:00-16:10 (periods 5-8)

Venue: Meeting Room 3, 2nd floor, Building 4

Host: Frontier Research Institute for Information Science (FRIIS)
Faculty in charge: Prof. Keiichi Tokuda

[Lecture 1] 13:00-14:30 (periods 5-6)
Speaker: Prof. Simon King, University of Edinburgh, UK
Title: Personalised voices for people with Motor Neurone Disease (also known as ALS)
Summary: Prof. King, director of the Centre for Speech Technology Research at the University of Edinburgh, will speak on technology for personalising synthetic voices for people who have lost their own voice to ALS and similar conditions.

[Lecture 2] 14:40-15:25 (period 7)
Speaker: Xin Wang, National Institute of Informatics
Title: Autoregressive neural models for statistical parametric speech synthesis
Summary: Xin Wang of the National Institute of Informatics will speak on the latest speech synthesis systems based on deep neural networks.

[Lecture 3] 15:25-16:10 (period 8)
Speaker: Dr. Gustav Eje Henter, National Institute of Informatics
Title: Perceptual debugging of speech synthesis
Summary: Dr. Gustav Eje Henter of the National Institute of Informatics will speak on methodology for improving the quality of synthesised speech.

================================================================
Personalised voices for people with Motor Neurone Disease (also known as ALS)
================================================================
Speaker:
Simon King
the University of Edinburgh
Abstract:
In the first part of this talk, I'll give an easy-to-understand, non-technical overview of the Speak Unique project, in which we are providing personalised speech communication aids to people who are losing their own voice due to Motor Neurone Disease or other progressive conditions.  We are currently conducting trials in the UK, to measure the improvement to quality of life that these communication aids give.
The second part of the talk will get a little more technical, where I will describe how the technology works.  Using powerful statistical models, and a large database of donated speech from thousands of people, we create accent- and gender-specific "Average Voice Models".  These are then further modified to produce speech that sounds like a particular person.
A unique capability of our approach is that it only needs a small sample of that person's speech, and this sample may be disordered: the person is already becoming hard to understand.  We are able to "repair" the voice by interchanging or interpolating parts of the Average Voice Model into a model learned from the person's own speech.  This results in a computer-generated voice that sounds like a normal, intelligible version of the person.  This is finally installed on a mobile device, such as an iPad, for the person to use in daily life.
https://www.speakunique.org
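
As a very rough sketch of the "repair" step described above (an illustration only, not the Speak Unique implementation; the function name and data here are hypothetical), blending the parameters of two statistical voice models could look like this, assuming the relevant parameters, such as Gaussian means over spectral features, are stored as arrays:

    import numpy as np

    def repair_voice(personal_means, average_means, alpha=0.6):
        # Blend parameters of a model adapted to the person's own
        # (possibly disordered) speech with the matching healthy
        # Average Voice Model: alpha = 1.0 keeps the personal voice
        # unchanged, alpha = 0.0 falls back on the average voice.
        return alpha * personal_means + (1.0 - alpha) * average_means

    # Hypothetical toy data: 5 Gaussian components over 3 dimensions.
    rng = np.random.default_rng(0)
    personal = rng.normal(size=(5, 3))  # from a small, disordered sample
    average = rng.normal(size=(5, 3))   # accent- and gender-matched model
    print(repair_voice(personal, average).shape)  # prints (5, 3)

In practice, different parameter groups (for example, those governing spectrum versus prosody) could be blended with different weights, which is one way to "interchange" parts of the two models.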
Bio:
Simon King holds M.A. and M.Phil. degrees from Cambridge and a Ph.D. from the University of Edinburgh.  He has been with the Centre for Speech Technology Research at the University of Edinburgh since 1993, where he is now Professor of Speech Processing and the director of the centre.  His interests include speech synthesis, recognition and signal processing, and he has around 170 publications in these areas.  He has served on the ISCA SynSIG board and co-organizes the Blizzard Challenge.
He previously served on the IEEE SLTC and as an associate editor of the IEEE Transactions on Audio, Speech and Language Processing, and is a current associate editor of Computer Speech and Language.
See also http://frontier.web.nitech.ac.jp/en/archives/faculty/simon-king-ph-d

================================================================
Autoregressive neural models for statistical parametric speech synthesis
================================================================
Speaker:
Xin Wang
National Institute of Informatics (NII)
Abstract:
The main task of statistical parametric speech synthesis (SPSS) is to convert an input sequence of textual features into a target sequence of acoustic features.  This task can be approached with various types of neural networks, such as recurrent neural networks (RNNs).  Although RNNs have shown great performance in SPSS, the standard RNN architecture and the common way of using it are suboptimal.  This talk will explain why a normal RNN is imperfect at modeling the temporal dependency of the target sequence.  It will then introduce the idea of autoregressive (AR) modeling, which may better capture that temporal dependency.  This idea leads first to shallow AR neural models, which can alleviate the over-smoothing problem, and then to deep AR models, which enable random sampling for fundamental frequency (F0) generation in SPSS.  The talk will also generalize the shallow AR models to a recent method called the autoregressive normalizing flow (NF) and show how NF can be used for SPSS.
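
As a loose illustration of the shallow AR idea (a sketch under simplifying assumptions, not the exact model from the talk; the function and data below are hypothetical), the network's frame-wise predictions can be corrected by a linear filter over previously generated frames, so consecutive output frames are no longer conditionally independent given the input:

    import numpy as np

    def shallow_ar_generate(net_out, ar_coefs):
        # o[t] = net_out[t] + sum_k ar_coefs[k] * o[t - 1 - k]:
        # the AR filter feeds back previously *generated* frames,
        # adding temporal dependency to the network's predictions.
        T, K = len(net_out), len(ar_coefs)
        o = np.zeros(T)
        for t in range(T):
            o[t] = net_out[t] + sum(ar_coefs[k] * o[t - 1 - k]
                                    for k in range(K) if t - 1 - k >= 0)
        return o

    # Toy F0-like trajectory standing in for the network's output.
    h = np.sin(np.linspace(0.0, 3.0, 50))
    print(shallow_ar_generate(h, ar_coefs=[0.5, 0.2])[:5])

In a real system the filter coefficients would be trained jointly with the network rather than fixed by hand.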
Bio:
Xin Wang is a third-year Ph.D. student in the Yamagishi Lab at the National Institute of Informatics.  Before coming to Japan, he obtained a Master's degree from USTC for his work on HMM-based speech synthesis in the iFly speech lab.  His main research topic is parametric speech synthesis using neural networks.

================================================================
Perceptual debugging of speech synthesis
================================================================
Speaker:
Gustav Eje Henter
National Institute of Informatics (NII)
Abstract:
Parametric speech synthesis has not yet reached natural segmental quality.  This talk presents "perceptual debugging": fault-finding techniques that pinpoint which synthesiser design decisions are responsible for perceptual degradations, to guide future synthesis research.  We cover both methods for dissecting existing synthesisers (for example, isolating the improvements behind the success of DNN-based TTS) and methods for identifying the most important naturalness bottlenecks in leading parametric synthesisers.  The second topic is particularly innovative, as it involves listening to hypothetical future synthesisers beyond our current capabilities, whose output we simulate using repeated readings of the same text.  We find that several factors, not just the switch to DNNs, are responsible for recent improvements in TTS technology, and that statistical parametric speech synthesis quality remains hampered both by our independence assumptions and (ultimately) by our decision to generate the mean speech as output.  We round off by showing how the use of repeated speech recordings may be extended in the future, to answer questions previously considered unanswerable.
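
To make the "mean speech" point concrete, here is a toy numerical sketch (hypothetical data, not from the talk): averaging several plausible readings of the same text strips out the natural variation between them, which is the kind of over-smoothing that mean-based generation suffers from.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(0.0, 1.0, 200)

    # Five synthetic "repeated readings" of one utterance: the same
    # underlying contour with small variation in timing and detail.
    readings = np.stack([
        np.sin(2 * np.pi * (3.0 * t + rng.uniform(-0.05, 0.05)))
        + 0.1 * rng.normal(size=t.size)
        for _ in range(5)
    ])

    # A mean-based synthesiser effectively outputs the average reading,
    # which has less variation than any single natural reading.
    mean_speech = readings.mean(axis=0)
    print(readings[0].std().round(3), mean_speech.std().round(3))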
Bio:
Gustav Eje Henter received his MSc and PhD from KTH Royal Institute of Technology in Stockholm, Sweden.  His studies included periods as a visiting student at the University of British Columbia (UBC) in Vancouver, Canada, and at the Victoria University of Wellington (VUW) in Wellington, New Zealand.  Upon completing his PhD in 2013, he took a position as a Marie Curie research fellow at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh in the UK.  In 2016, he moved from CSTR to a project researcher position in Prof. Junichi Yamagishi's lab at the National Institute of Informatics (NII) in Tokyo, where he remains to this day.
Dr. Henter's primary research topic is statistical speech synthesis, where he is interested in issues surrounding probabilistic modelling, machine learning, perception, and evaluation.
================================================================

[Registration for the graduate "Special Exercises" courses]

For students entering in or after academic year 2016, the graduate curriculum includes the courses listed below as a way to gain exposure to cutting-edge research areas. Credit for these courses is awarded on the basis of participation, including attendance at English-language seminars given by overseas researchers and other invited speakers.
Each course is a one-credit exercise: students must attend fifteen seminars of roughly two hours each and complete the assignments set by the faculty member in charge of each session.
Announcements of eligible seminars are posted on the student bulletin board and on the Frontier Research Institute website ( http://frontier.web.nitech.ac.jp/ ), so please take part actively.

・Special Exercises in Materials & Energy 1 and 2 (master's program)
・Special Exercises in Information & Society 1 and 2 (master's program)
・Advanced Special Exercises in Materials & Energy 1 and 2 (doctoral program)
・Advanced Special Exercises in Information & Society 1 and 2 (doctoral program)

* Course administration is as follows:
(1) For Special Exercises 1 and 2 and Advanced Special Exercises 1 and 2, regardless of the nominal year and semester of offering, credit for course 1, or for courses 1 and 2, of the student's own department is awarded in the semester in which the attendance and other participation requirements are met.
(2) No distinction is made between the two fields (Materials & Energy and Information & Society); seminars in either field count toward the attendance requirement.
(3) Only seminars held within the standard program length (master's program: 2 years; doctoral program: 3 years) count toward the attendance requirement.

Inquiries about credit: Student Center, Counter 1

 
