TY - JOUR
T1 - Code-Switching in Automatic Speech Recognition
T2 - The Issues and Future Directions
AU - Mustafa, Mumtaz Begum
AU - Yusoof, Mansoor Ali
AU - Khalaf, Hasan Kahtan
AU - Abushariah, Ahmad Abdel Rahman Mahmoud
AU - Kiah, Miss Laiha Mat
AU - Ting, Hua Nong
AU - Muthaiyah, Saravanan
N1 - Publisher Copyright: © 2022 by the authors.
PY - 2022/9/23
Y1 - 2022/9/23
AB - Code-switching (CS) in spoken language occurs when a speaker uses two or more languages within a single utterance. It remains an unsolved problem in automatic speech recognition (ASR) research: ASR systems must recognise speech in bilingual and multilingual settings, yet their accuracy declines with CS because of pronunciation variation. Very few reviews have been carried out on CS, and none on bilingual and multilingual CS ASR systems. This study investigates the importance of CS in bilingual and multilingual speech recognition systems. To meet this objective, two research questions were formulated, covering the current issues and the future direction of the research. Our review focuses on databases, acoustic and language modelling, and evaluation metrics. Using selected keywords, we identified 274 papers and selected 42 experimental papers for review, of which 24 (57%) discuss CS, while the rest address multilingual ASR research. The selected papers cover many well-resourced and under-resourced languages, as well as novel techniques for managing CS in ASR systems, namely mapping, combining and merging the phone sets of the languages studied. Our review also examines the performance of those methods. We found significant variation in CS performance in terms of word error rate, indicating that ASR systems handle CS inconsistently. In the conclusion, we suggest several future directions that address the issues identified in this review.
KW - automatic speech recognition system
KW - bilingual speech recognition
KW - code-switching
KW - evaluation metrics
KW - language and acoustic models
KW - multilingual speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85139976524&partnerID=8YFLogxK
U2 - 10.3390/app12199541
DO - 10.3390/app12199541
M3 - Review article
AN - SCOPUS:85139976524
SN - 2076-3417
VL - 12
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 19
M1 - 9541
ER -