National Institute of Informatics/SOKENDAI
National Institute of Informatics
Waseda University
The University of Shiga Prefecture
Conference overview (conference name, venue, dates, organizer, etc.)
LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09
Abstract (English)
This paper presents the concept and design of our Miraikan SC Corpus. A well-structured and well-prepared corpus would help engineers understand the mechanism of speech production and the nature of social interaction, and thereby inform the design of their systems. Applications of the corpus range from speech recognition and dialogue processing to human-agent interaction systems. We started collecting audio-visual data using multiple video cameras and microphones in October 2012 at a science museum in Tokyo, Japan. In this paper, we describe why we chose the museum as a field site for data collection, how we made audio and video recordings of the interactions, and how we handled personal information in the data set, such as participants’ names, jobs, and places of residence.
Journal title
Proceedings of the LREC 2018 Special Speech Sessions
Pages
30 - 34
Publication date
2018-05-09
Format
application/pdf
Author version flag
publisher
Publisher
Center for Corpus Development, National Institute for Japanese Language and Linguistics