Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/41221

Title: Domain Adaptation Speech-to-Text for Low-Resource European Portuguese Using Deep Learning
Authors: Medeiros, Eduardo
Corado, Leonel
Rato, Luís
Quaresma, Paulo
Salgueiro, Pedro
Editors: Reina, Daniel Gutiérrez
Keywords: machine learning
deep learning
deep neural networks
speech-to-text
automatic speech recognition
NVIDIA NeMo
GPUs
data-centric
Portuguese language
Issue Date: 24-Apr-2023
Publisher: MDPI
Citation: Medeiros, E., Corado, L., Rato, L., Quaresma, P., & Salgueiro, P. (2023). Domain Adaptation Speech-to-Text for Low-Resource European Portuguese Using Deep Learning. Future Internet, 15(5), 159. https://doi.org/10.3390/fi15050159
Abstract: Automatic speech recognition (ASR), commonly known as speech-to-text, is the process of transcribing audio recordings into text, i.e., transforming speech into the respective sequence of words. This paper presents a deep learning ASR system optimization and evaluation for the European Portuguese language. We present a pipeline composed of several stages for data acquisition, analysis, pre-processing, model creation, and evaluation. A transfer learning approach is proposed considering an English language-optimized model as starting point; a target composed of European Portuguese; and the contribution to the transfer process by a source from a different domain consisting of a multiple-variant Portuguese language dataset, essentially composed of Brazilian Portuguese. A domain adaptation was investigated between European Portuguese and mixed (mostly Brazilian) Portuguese. The proposed optimization evaluation used the NVIDIA NeMo framework implementing the QuartzNet15×5 architecture based on 1D time-channel separable convolutions. Following this transfer learning data-centric approach, the model was optimized, achieving a state-of-the-art word error rate (WER) of 0.0503.
URI: https://www.mdpi.com/1999-5903/15/5/159
ISSN: 1999-5903
Type: article
Appears in Collections: INF - Publicações - Artigos em Revistas Internacionais Com Arbitragem Científica

Files in This Item:

File: futureinternet-15-00159 (1).pdf
Size: 514.62 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
