Department of Linguistic Theory and Structure, NINJAL
Kyushu University
Adjunct Researcher, Center for Corpus Development, NINJAL
Kyushu University
Department of Language Change and Variation, NINJAL
Chiba University
In this paper, we report on a preparatory project for building a large-scale corpus of conversational Japanese (NINJAL collaborative research project, 2014/7/1-2015/8/31). The project has three overall aims: i) to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner, ii) to develop a methodology for recording naturally occurring conversations, and iii) to create a transcription system suitable for transcribing natural conversations effectively. This report focuses on the first aim, establishing a corpus design. We first describe a survey of everyday conversational behavior, conducted last year with about 250 Japanese adults, which was intended to reveal how diverse everyday conversational behavior is and to provide an empirical foundation for corpus design. The questionnaire asked when, where, for how long, with whom, and during what kind of activity informants engaged in conversation. We found that ordinary conversations show the following tendencies: i) they mainly consist of chats, business talks, and consultations; ii) in general, the number of participants is small and the duration is short; iii) many conversations take place in private places such as homes, as well as in public places such as offices and schools; and iv) some questionnaire items are related to each other. Based on these results, we discuss how to design a balanced corpus of conversational Japanese.