Classification of Japanese Compounds Based on the Frequency of Rendaku: A Study Using the Rendaku Database

太田, 聡; オオタ, サトシ; 太田, 真理; オオタ, シンリ; OHTA, Satoshi; OHTA, Shinri

https://doi.org/10.15084/00000814

File: papers1009.pdf (827.5 kB)

Item type

Departmental Bulletin Paper

Publication date

2016-01-29

Title (Japanese)

連濁の生起率に基づく日本語複合語の分類 : 連濁データベースによる研究

Title (English)

Classification of Japanese Compounds Based on the Frequency of Rendaku : A Study Using the Rendaku Database

Language

jpn

Keywords

Subject scheme

Other

Subjects (Japanese)

連濁; 複合語; 生起率; クラスター分析; 混合正規分布モデル

Subjects (English)

rendaku; compound word; frequency; cluster analysis; Gaussian mixture model

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

departmental bulletin paper

ID registration

10.15084/00000814

ID registration type

JaLC

Authors

太田, 聡 (オオタ, サトシ; OHTA, Satoshi), WEKO 6493
太田, 真理 (オオタ, シンリ; OHTA, Shinri), WEKO 6494

Author affiliations

Description type

Other

Description

山口大学 (Yamaguchi University)
東京大学 (The University of Tokyo)

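Abstract (Japanese record, in English translation)

Description type

Abstract

Description

Rendaku is one of the most widely known phonological phenomena in Japanese. Previous studies have proposed that Japanese compounds can be classified into several groups according to differences in the rate at which rendaku occurs. However, those studies had two problems: the criteria for dividing rendaku rates were arbitrary, and the number of groups was assumed in advance. In this study, we therefore used cluster analysis based on Gaussian mixture models together with the Rendaku Database (Irwin and Miyashita 2015) to examine the optimal classification criteria and the optimal number of clusters for classifying Japanese compounds. For both compound nouns and compound verbs, the model assuming two clusters was optimal, and the boundary between the clusters fell at a rendaku rate of 90% for compound nouns and 40% for compound verbs. These results differ from the numbers of clusters and the classification criteria used in previous studies. Our results show that model-based cluster analysis is highly effective for finding the optimal classification of linguistic data.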

Abstract (English)

Description type

Other

Description

Rendaku, one of the best-known phonological phenomena in Japanese, voices the initial obstruent of the second element of a compound. Previous studies have proposed that Japanese compound words can be classified by how frequently they undergo rendaku (the rendaku rate). However, because these studies used arbitrary cluster boundaries, such as 33% and 66%, and arbitrary numbers of clusters, the plausibility of such criteria needs to be examined. In this study, we examined the optimal boundary criteria and the optimal number of clusters using cluster analysis based on Gaussian mixture modeling and the Rendaku Database (Irwin and Miyashita 2015). The analyses showed that a two-cluster model was optimal for classifying both compound nouns and compound verbs. The boundary values of the rendaku rate between the clusters were approximately 90% for compound nouns and 40% for compound verbs. These results are inconsistent with the findings of previous studies. Our findings demonstrate that model-based cluster analysis is an effective method for determining the optimal classification of linguistic data.
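
As a rough illustration of the kind of analysis the abstract describes, the sketch below fits one-dimensional Gaussian mixture models with 1 to 3 components to a list of rendaku rates, picks the number of clusters by the Bayesian information criterion (BIC), and locates the boundary between two clusters. It is not the authors' code. The record does not state which software or which model-selection criterion the paper used, so BIC, the EM details, the variance floor, and every function name here are assumptions, and the rates are synthetic values rather than entries from the Rendaku Database.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, iters=200, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    min_var is an assumed variance floor suited to rates on [0, 1].
    """
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    def mixture_terms(x):
        return [w * normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            terms = mixture_terms(x)
            total = max(sum(terms), 1e-300)
            resp.append([t / total for t in terms])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)

    log_lik = sum(math.log(max(sum(mixture_terms(x)), 1e-300)) for x in xs)
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    # Free parameters: k means + k variances + (k - 1) mixing weights.
    return -2 * log_lik + (3 * k - 1) * math.log(n)


def select_model(xs, max_k=3):
    """Fit models with 1..max_k components and return the lowest-BIC one."""
    fits = {k: fit_gmm_1d(xs, k) for k in range(1, max_k + 1)}
    scores = {k: bic(fits[k][3], k, len(xs)) for k in fits}
    best_k = min(scores, key=scores.get)
    return best_k, fits[best_k], scores


def boundary(fit, step=0.001):
    """For a two-component fit, return the first grid point between the
    component means where the higher-mean component is the more probable."""
    weights, means, variances, _ = fit
    lo, hi = sorted(range(2), key=lambda j: means[j])
    x = means[lo]
    while x < means[hi]:
        p_hi = weights[hi] * normal_pdf(x, means[hi], variances[hi])
        p_lo = weights[lo] * normal_pdf(x, means[lo], variances[lo])
        if p_hi >= p_lo:
            return x
        x += step
    return means[hi]


# Synthetic rendaku rates (NOT real data): two groups of 150 values each,
# placed at the quantiles of N(0.15, 0.05) and N(0.80, 0.05).
low, high = NormalDist(0.15, 0.05), NormalDist(0.80, 0.05)
rates = [d.inv_cdf((i + 0.5) / 150) for d in (low, high) for i in range(150)]

best_k, best_fit, scores = select_model(rates)
print("clusters:", best_k, "boundary:", round(boundary(best_fit), 3))
```

Letting the criterion choose both the number of components and the crossover point is what distinguishes this approach from fixing cut-offs such as 33% and 66% beforehand. With real data, the boundary depends on the fitted weights and variances and need not lie midway between the cluster means.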

Publisher

National Institute for Japanese Language and Linguistics (国立国語研究所)

Bibliographic information

NINJAL Research Papers (国立国語研究所論集)

No. 10, pp. 179-191, issued 2016-01

ISSN

2186-134X

ISSN

2186-1358

Bibliographic record ID (NCID)

AA12536262

Format

Description type

Other

Description

application/pdf

Author version flag

Publication type

VoR

Publication type resource

http://purl.org/coar/version/c_970fb48d4fbd8a85


Versions

Ver.1

2023-05-15 15:18:28.757341
