Adjunct Researcher, Spoken Language Division, Research Department, NINJAL
Doctoral Student, Chiba University / Adjunct Researcher, Spoken Language Division, Research Department, NINJAL
Adjunct Researcher, Center for Corpus Development, NINJAL
Center for Corpus Development, NINJAL
Spoken Language Division, Research Department, NINJAL
This paper describes the transcription criteria and transcription methods for the Corpus of Everyday Japanese Conversation, which has been under construction since 2016 and will contain 200 hours of various types of conversation in a balanced distribution. Because some expressions in everyday conversation are extremely informal, hard to hear, or hard to understand, clear transcription criteria are needed to ensure consistent quality across a large number of staff members. Methods are also needed to transcribe as many as 200 hours of conversation efficiently and in a timely manner. As part of this project, procedures for efficient transcription have been examined, supporting tools have been developed, and the transcription criteria have been revised. This paper presents the resulting criteria and methods.