The National Language Research Institute Research Report ; 124
Abstract (English)
This report explains the techniques that the National Institute for Japanese Language accumulated while constructing the Corpus of Spontaneous Japanese (CSJ), and makes them available to all who are interested. The CSJ is a large-scale database of spoken Japanese produced by the ‘Spontaneous Speech: Corpus and Processing Technology’ project, conducted jointly by the Communications Research Laboratory, the Tokyo Institute of Technology and the National Institute for Japanese Language.
KOISO Hanae and OGURA Hideki were in charge of editing this report.
Contents:
Chapter 1. MAEKAWA Kikuo: Outline
Chapter 2. KOISO Hanae, NISHIKAWA Ken'ya and MABUCHI Yoko: Transcribed text
Chapter 3. OGURA Hideki: Morphological information
Chapter 4. YAMAGUCHI Masaya: Short-unit and long-unit database
Chapter 5. MARUYAMA Takehiko, TAKANASHI Katsuya and UCHIYAMA Kiyotaka: Phrase unit information
Chapter 6. FUJIMOTO Masako, KIKUCHI Hideaki and MAEKAWA Kikuo: Segmental phoneme information
Chapter 7. IGARASHI Yosuke, KIKUCHI Hideaki and MAEKAWA Kikuo: Prosodic information
Chapter 8. KIKUCHI Hideaki and TSUKAHARA Wataru: XML documents
Chapter 9. MAEKAWA Kikuo: Information retrieval of CSJ
Bibliography
Index