Lip Location Normalized Training for Visual Speech Recognition

Vanegas, Oscar; Tokuda, Keiichi; Kitamura, Tadashi

https://nitech.repo.nii.ac.jp/records/4893

Name / File	License
Full text: E83-D_1969.pdf (2.5 MB)	Copyright (c) 2000 IEICE http://search.ieice.org/index.html

Item type

Journal Article(1)

Release date

2012-11-07

Title

Lip Location Normalized Training for Visual Speech Recognition

Language

eng

Resource type

Resource type identifier	http://purl.org/coar/resource_type/c_6501
Resource type	journal article

Authors

Vanegas, Oscar
Tokuda, Keiichi

WEKO 464
NRID 1000020217483

en	Tokuda, Keiichi
ja	徳田, 恵一
ja-Kana	トクダ, ケイイチ

Kitamura, Tadashi

Alternative author names

en	Tokuda, Keiichi
ja	徳田, 恵一
ja-Kana	トクダ, ケイイチ

Alternative author names

北村, 正

Bibliographic information

IEICE Transactions on Information and Systems

Vol. E83-D, No. 11, pp. 1969-1977, issued 2000-11-20

Publisher

Institute of Electronics, Information and Communication Engineers

ISSN

Source identifier type	ISSN
Source identifier	09168532

Bibliographic record ID (NCID)

Source identifier type	NCID
Source identifier	AA10826272

Author version flag

Version type	VoR
Version type resource	http://purl.org/coar/version/c_970fb48d4fbd8a85

Description

Description type	Other

This paper describes a method for normalizing lip position to improve the performance of a speech recognition system based on visual information. Two types of information are useful in speech recognition: the speech signal itself and the visual information from the lips in motion. This paper addresses problems that arise when using images of the lips in motion, such as the effect of variation in lip location. The proposed lip location normalization method is based on a search algorithm for the lip position, in which the location normalization is integrated into the model training. Speaker-independent isolated word recognition experiments were carried out on the Tulips1 and M2VTS databases. They showed a recognition rate of 74.5% and an error reduction rate of 35.7% for recognition of the ten digit words on the M2VTS database.
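
The idea of integrating a lip-position search into model training can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the frames are 1-D lists rather than images, a single unit-variance Gaussian mean template stands in for whatever statistical model the authors train, and the function names (`crop`, `best_offset`, `train_location_normalized`) are hypothetical. The sketch alternates between searching each frame for the window the current model scores highest and re-estimating the model from the windows found.

```python
def crop(frame, offset, width):
    """Return the `width`-sample window of `frame` starting at `offset`."""
    return frame[offset:offset + width]


def estimate_mean(frames, offsets, width):
    """Average the currently aligned windows, position by position."""
    windows = [crop(f, o, width) for f, o in zip(frames, offsets)]
    return [sum(column) / len(windows) for column in zip(*windows)]


def log_likelihood(window, mean):
    """Unit-variance Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * sum((x - m) ** 2 for x, m in zip(window, mean))


def best_offset(frame, mean):
    """Search every valid offset and keep the one the model scores highest."""
    width = len(mean)
    candidates = range(len(frame) - width + 1)
    return max(candidates,
               key=lambda o: log_likelihood(crop(frame, o, width), mean))


def train_location_normalized(frames, width, iterations=5):
    """Alternate between re-locating each window and re-estimating the model."""
    # Start from centred windows, i.e. no location normalization at all.
    offsets = [(len(f) - width) // 2 for f in frames]
    mean = estimate_mean(frames, offsets, width)
    for _ in range(iterations):
        offsets = [best_offset(f, mean) for f in frames]  # location search
        mean = estimate_mean(frames, offsets, width)      # model update
    return mean, offsets


# Toy data (hypothetical): the same pattern [1, 4, 1] at three positions.
frames = [
    [0, 1, 4, 1, 0, 0, 0],
    [0, 0, 1, 4, 1, 0, 0],
    [0, 0, 0, 1, 4, 1, 0],
]
mean, offsets = train_location_normalized(frames, width=3)
print(offsets, mean)
```

On this toy input the centred starting windows blur the pattern, and after the first search step the offsets settle at `[1, 2, 3]` with the template `[1.0, 4.0, 1.0]`. A 2-D version would search over horizontal and vertical shifts in the same way; whether the paper's search and model look like this is not stated in the abstract.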

Format

Description type	Other
Description	application/pdf
