Continuous Speech Recognition Based on General Factor Dependent Acoustic Models

Suzuki, Hiroyuki; Zen, Heiga; Nankaku, Yoshihiko (南角, 吉彦); Miyajima, Chiyomi; Tokuda, Keiichi (徳田, 恵一); Kitamura, Tadashi

https://nitech.repo.nii.ac.jp/records/5233

Name / File	License
Full text: E88-D_410.pdf (496.9 kB), https://nitech.repo.nii.ac.jp/record/5233/files/E88-D_410.pdf	Copyright (c) 2005 IEICE, http://search.ieice.org/index.html

Item type	Journal Article(1)
Release date	2012-11-07
Title	Continuous Speech Recognition Based on General Factor Dependent Acoustic Models
Language	eng
Resource type	journal article
Resource type identifier	http://purl.org/coar/resource_type/c_6501

Authors

Suzuki, Hiroyuki
Zen, Heiga
Nankaku, Yoshihiko (ja: 南角, 吉彦; ja-Kana: ナンカク, ヨシヒコ); WEKO 8555; NRID 1000080397497
Miyajima, Chiyomi
Tokuda, Keiichi (ja: 徳田, 恵一; ja-Kana: トクダ, ケイイチ); WEKO 464; NRID 1000020217483
Kitamura, Tadashi

Alternative author names

Nankaku, Yoshihiko; 南角, 吉彦; ナンカク, ヨシヒコ (ja-Kana)
Tokuda, Keiichi; 徳田, 恵一; トクダ, ケイイチ (ja-Kana)
北村, 正

Bibliographic information	IEICE transactions on information and systems, Vol. E88-D, No. 3, pp. 410-417, issued 2005-03-01
Publisher	Institute of Electronics, Information and Communication Engineers
ISSN	09168532
Bibliographic record ID (NCID)	AA10826272
Version type	VoR (http://purl.org/coar/version/c_970fb48d4fbd8a85)

Description

Description type	Other

This paper describes continuous speech recognition that incorporates additional complementary information, e.g., voice characteristics, speaking styles, linguistic information, and noise environment, into HMM-based acoustic modeling. In speech recognition systems, context-dependent HMMs, i.e., triphones, and tree-based context clustering are commonly used. In recent years, several attempts have been made to utilize not only phonetic contexts but also additional complementary information through context (factor) dependent HMMs. However, when the additional factors for the test data are unobserved, a method for obtaining factor labels is required before decoding. In this paper, we propose a model integration technique based on general factor dependent HMMs for decoding. The integrated HMMs can be used by a conventional decoder as standard triphone HMMs with Gaussian mixture densities. Moreover, by using the results of context clustering, the proposed method can determine an optimal number of mixture components for each state depending on the degree of influence of the additional factors. Phoneme recognition experiments using voice characteristic labels show significant improvements with a small number of model parameters, and a 19.3% error reduction was obtained in noise environment experiments.
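
The abstract does not state the integration formula, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes that each factor label of a state is mapped by context clustering to a leaf Gaussian, and that an unobserved factor is marginalized out by weighting each leaf with the relative frequency of the factor labels that reach it. Labels that share a leaf collapse into a single component, so a state that clustering never split on the factor ends up with one Gaussian. All names (`integrate_state`, the factor labels, the counts, the one-dimensional Gaussians) are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def integrate_state(factor_to_leaf, leaf_params, factor_counts):
    """Collapse factor-dependent Gaussians of one state into a mixture.

    Assumption: the mixture weight of a leaf is the relative frequency of
    the factor labels mapped to it (a simple stand-in for a factor prior).
    Returns a list of (weight, mean, var) tuples, one per distinct leaf.
    """
    total = sum(factor_counts.values())
    weights = defaultdict(float)
    for factor, leaf in factor_to_leaf.items():
        weights[leaf] += factor_counts[factor] / total
    return [(w, *leaf_params[leaf]) for leaf, w in sorted(weights.items())]


def mixture_logpdf(x, mixture):
    """Log density of the integrated mixture, computed with log-sum-exp."""
    terms = [math.log(w) + gaussian_logpdf(x, m, v) for w, m, v in mixture]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Hypothetical noise-environment labels and their training counts.
counts = {"clean": 50, "car": 30, "babble": 20}

# State A: clustering split on the factor (clean vs. noisy), giving two leaves.
mix_a = integrate_state(
    {"clean": "a1", "car": "a2", "babble": "a2"},
    {"a1": (0.0, 1.0), "a2": (1.5, 2.0)},
    counts,
)

# State B: the factor had no influence, so every label shares one leaf.
mix_b = integrate_state(
    {"clean": "b1", "car": "b1", "babble": "b1"},
    {"b1": (-0.5, 1.0)},
    counts,
)
```

Under these assumptions, state A becomes a two-component mixture and state B stays a single Gaussian, which mirrors the idea that the number of components per state follows from how strongly the additional factor influenced the clustering. The result has the form of an ordinary Gaussian mixture, so a decoder would not need the factor labels at test time.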

Format	application/pdf
