Analisis Perbandingan Rasio Training Data Model Speech Recognition untuk Gangguan Bicara, Alfeto, Universitas Multimedia Nusantara
REFERENCES
[1] C. Mitchell et al., “Prevalence of aphasia and dysarthria among inpatient stroke survivors: describing the population, therapy provision and outcomes on discharge,” Aphasiology, vol. 35, no. 7, pp. 950–960, Jul. 2021, doi: 10.1080/02687038.2020.1759772.
[2] H. P. Rowe, S. E. Gutz, M. F. Maffei, K. Tomanek, and J. R. Green, “Characterizing Dysarthria Diversity for Automatic Speech Recognition: A Tutorial From the Clinical Perspective,” Front. Comput. Sci., vol. 4, p. 770210, Apr. 2022, doi: 10.3389/fcomp.2022.770210.
[3] J. R. Deller, M. S. Liu, L. J. Ferrier, and P. Robichaud, “The Whitaker database of dysarthric (cerebral palsy) speech,” J. Acoust. Soc. Am., vol. 93, no. 6, pp. 3516–3518, Jun. 1993, doi: 10.1121/1.405684.
[4] F. Rudzicz, A. K. Namasivayam, and T. Wolff, “The TORGO database of acoustic and articulatory speech from speakers with dysarthria,” Lang. Resour. Eval., vol. 46, no. 4, pp. 523–541, Dec. 2012, doi: 10.1007/s10579-011-9145-0.
[5] R. L. MacDonald et al., “Disordered Speech Data Collection: Lessons Learned at 1 Million Utterances from Project Euphonia,” in Interspeech 2021, ISCA, Aug. 2021, pp. 4833–4837. doi: 10.21437/Interspeech.2021-697.
[6] S. Alharbi et al., “Automatic Speech Recognition: Systematic Literature Review,” IEEE Access, vol. 9, pp. 131858–131876, 2021, doi: 10.1109/ACCESS.2021.3112535.
[7] B. Vachhani, C. Bhat, and S. K. Kopparapu, “Data Augmentation Using Healthy Speech for Dysarthric Speech Recognition,” in Interspeech 2018, ISCA, Sep. 2018, pp. 471–475. doi: 10.21437/Interspeech.2018-1751.
[8] A.-L. Georgescu, A. Pappalardo, H. Cucu, and M. Blott, “Performance vs. hardware requirements in state-of-the-art automatic speech recognition,” EURASIP J. Audio Speech Music Process., vol. 2021, no. 1, p. 28, Dec. 2021, doi: 10.1186/s13636-021-00217-4.
[9] V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, “Librispeech: An ASR corpus based on public domain audio books,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia: IEEE, Apr. 2015, pp. 5206–5210. doi: 10.1109/ICASSP.2015.7178964.
[10] S. Kriman et al., “QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions.” arXiv, Oct. 22, 2019. Accessed: Jun. 12, 2023. [Online]. Available: http://arxiv.org/abs/1910.10261
[11] T. Magnuson and M. Blomberg, “Acoustic analysis of dysarthric speech and some implications for automatic speech recognition.”
[12] H. Kim et al., “Dysarthric speech database for universal access research,” in Interspeech 2008, ISCA, Sep. 2008, pp. 1741–1744. doi: 10.21437/Interspeech.2008-480.
[13] M. Geng et al., “Investigation of Data Augmentation Techniques for Disordered Speech Recognition,” in Interspeech 2020, ISCA, Oct. 2020, pp. 696–700. doi: 10.21437/Interspeech.2020-1161.
[14] R. Ardila et al., “Common Voice: A Massively-Multilingual Speech Corpus.” arXiv, Mar. 05, 2020. Accessed: Jun. 12, 2023. [Online]. Available: http://arxiv.org/abs/1912.06670