Publications & databases

Publications:

Simantiraki, O., Charonyktakis, P., Pampouchidou, A., Tsiknakis, M., Cooke, M. (2017) Glottal Source Features for Automatic Speech-Based Depression Assessment. Proc. Interspeech 2017, 2700-2704, DOI: 10.21437/Interspeech.2017-1251.

Llorach G., Blat J. (2017) Say Hi to Eliza. In: Beskow J., Peters C., Castellano G., O’Sullivan C., Leite I., Kopp S. (eds) Intelligent Virtual Agents. IVA 2017. Lecture Notes in Computer Science, vol 10498. Springer, Cham. DOI: 10.1007/978-3-319-67401-8_34

Hendrikse, M., Llorach, G., Grimm, G., Hohmann, V. Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters. Speech Communication, Vol 101, July 2018, p. 70-84. DOI: 10.1016/j.specom.2018.05.008.

Govender, A., King, S. (2018). Using Pupillometry to Measure the Cognitive Load of Synthetic Speech. Proc. Interspeech 2018, 2838-2842, DOI: 10.21437/Interspeech.2018-1174.

Govender, A., King, S. (2018). Measuring the Cognitive Load of Synthetic Speech Using a Dual Task Paradigm. Proc. Interspeech 2018, 2843-2847, DOI: 10.21437/Interspeech.2018-1199.

Simantiraki, O., Cooke, M., King, S. (2018). Impact of Different Speech Types on Listening Effort. Proc. Interspeech 2018, 2267-2271, DOI: 10.21437/Interspeech.2018-1358.

Shifas PV, M., Tsiaras, V., Stylianou, Y. (2018) Speech Intelligibility Enhancement Based on a Non-causal Wavenet-like Model. Proc. Interspeech 2018, 1868-1872, DOI: 10.21437/Interspeech.2018-2119.

Espic Calderón, F., Govender, A., Ribeiro, M. S., Valentini Botinhao, C., & Watts, O. (2018). The CSTR entry to the 2018 Blizzard Challenge. In Blizzard Challenge 2018 workshop, Hyderabad, India.

Kaplan, E., Wagner, A. & Baskent, D. (2018). Are musicians at an advantage when processing speech on speech? In Parncutt, R., & Sattmann, S. (Eds.) (2018). Proceedings of ICMPC15/ESCOM10 (p. 233-236). Graz, Austria: Centre for Systematic Musicology, University of Graz.

G. Llorach, G. Grimm, M. M. E. Hendrikse, V. Hohmann. Towards Realistic Immersive Audiovisual Simulations for Hearing Research: Capture, virtual scenes and reproduction, Proceedings of 2018 Workshop on Audio-Visual Scene Understanding for Immersive Multimedia (AVSU’18), p. 33-40, 26 October 2018, Seoul, Republic of Korea, DOI: 10.1145/3264869.3264874

Raman, S., Hernaez, I., Navas, E., Serrano, L. (2018) Listening to Laryngectomees: A study of Intelligibility and Self-reported Listening Effort of Spanish Oesophageal Speech. Proc. IberSPEECH 2018, 107-111, DOI: 10.21437/IberSPEECH.2018-23

Serrano, L., Tavarez, D., Sarasola, X., Raman, S., Saratxaga, I., Navas, E., Hernaez, I. (2018) LSTM based voice conversion for laryngectomees. Proc. IberSPEECH 2018, 122-126, DOI: 10.21437/IberSPEECH.2018-26

Llorach G., Agenjo J., Blat J., Sayago S. (2019) Web-Based Embodied Conversational Agents and Older People. In: Sayago S. (eds) Perspectives on Human-Computer Interaction Research with Older People. Human–Computer Interaction Series. Springer, Cham. DOI: 10.1007/978-3-030-06076-3_8

Amy Hall, Jan Rennies-Hochmuth, Axel Winneke (2019). Assessing and reducing listening effort of listening to speech in adverse conditions. DAGA 2019, Conference proceedings, p. 958-961.

Llorach, G., Oetting, D., Krüger M., Vormann, M., Fitschen, C., Schulte, M., Hohmann, V., Meis, M. (2019) Vehicle Noise: Loudness Ratings, Loudness Models and Future Experiments with Audiovisual Immersive Simulations, Proceedings of Internoise 2019.

Raman, S.; Serrano, L.; Winneke, A.; Navas, E.; Hernaez, I. Intelligibility and Listening Effort of Spanish Oesophageal Speech. Appl. Sci. 2019, 9, 3233. DOI: 10.3390/app9163233

Shen C., Janse E. (2019) Articulatory Control in Speech Production. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.) Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 2533-2537). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

Marcoux, K.P., Ernestus, M.T.C. (2019) Pitch in Native and Non-Native Lombard Speech. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.) Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 2605-2609). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

Hendrikse, M. M. E., Llorach, G., Grimm, G., Hohmann, V. (2019). Movement and Gaze Behavior in Virtual Audiovisual Everyday-Life Listening Environments. Trends in Hearing 23 (2019) p. 2331216519872362, DOI: 10.1177/2331216519872362

Sfakianaki, A. (2019). Designing a Modern Greek sentence corpus for audiological and speech technology research. Proceedings of the 14th International Conference on Greek Linguistics (ICGL14), September 5-8, 2019, University of Patras, Greece. Proceedings not yet published.

Cooke, M., King, S., Hazan, V., Stylianou, Y., Janse, E., Baskent, D., Hohmann, V., Winneke, A., Hernaez, I. (2019) Enriched communication across the lifespan - Comunicación enriquecida a lo largo de la vida. Procesamiento del lenguaje natural 63 (2019), pp. 175-178. DOI: 10.26342/2019-63-24

Simantiraki, O., Cooke, M. (2019) Listeners’ Speech Rate Preferences in Stationary and Modulated Maskers. ICA 2019 Conference Proceedings, pp. 5736-5738. DOI: 10.18154/RWTH-CONV-239387

Raman, S., Hernaez, I., Navas, E., Serrano, L. A Multifaceted Enrichment of Oesophageal Speech. ICA 2019 Conference Proceedings, pp. 5739-5741. DOI: 10.18154/RWTH-CONV-239415

Exenberger, A., Iverson, O. Speech enrichment: Listening effort and intelligibility. ICA 2019 Conference Proceedings, pp. 5700-5702. DOI: 10.18154/RWTH-CONV-238874

Paulus, M., Hazan, V., Wagner, A., Adank, P. (2019). Talker intelligibility and listening effort: The role of speaking rate. ICA 2019 Conference Proceedings, pp. 5708-5712. DOI: 10.18154/RWTH-CONV-239169

Chermaz, C., Valentini Botinhao, C., Schepker, H., & King, S. Near End Listening Enhancement in Realistic Environments. In ICA 2019 Proceedings, pp. 5731-5735. DOI: 10.18154/RWTH-CONV-239327

Govender, A., King, S. and Valentini-Botinhao, C. (2019). Evaluating Cognitive Load of Text-To-Speech (TTS) synthesis. In ICA 2019 Proceedings, pp. 5759-5763. DOI: 10.18154/RWTH-CONV-239695

Shen C., Cooke M., Janse E. (2019) Individual Articulatory Control in Speech Enrichment. In ICA 2019 Proceedings, pp. 5761-5765. DOI: 10.18154/RWTH-CONV-239282

Marcoux, K.P., Ernestus, M.T.C. (2019) Differences between Native and Non-Native Lombard Speech in terms of pitch range. In ICA 2019 Proceedings, pp. 5713-5720. DOI: 10.18154/RWTH-CONV-239240

Padinjaru Veettil, M.S., Santelli, C., and Stylianou, Y. (2019) Towards a Neural-Based Single Channel Speech Enhancement Model for Hearing-Aids. In ICA 2019 Proceedings, pp. 5745-5748. DOI: 10.18154/RWTH-CONV-239594

Padinjaru Veettil, M.S., Chermaz, C., Chimona, T., Tsiaras, V. and Stylianou, Y. (2019) Benefits of the WaveNet-Based Speech Intelligibility Enhancement for Normal and Hearing Impaired Listeners. In ICA 2019 Proceedings, pp. 5721-5725. DOI: 10.18154/RWTH-CONV-239258

Paul, D., Pantazis, Y. and Stylianou, Y. (2019). Weighted Generative Adversarial Network for many-to-many Voice Conversion. In ICA 2019 Proceedings, pp. 5742-5744. DOI: 10.18154/RWTH-CONV-239420

Kirwan, J., Wagner, A. and Baskent, D. (2019) Pupillary Correlates of Auditory Emotion Recognition in Hearing-Impaired Listeners. In ICA 2019 Proceedings, pp. 5771-5772. DOI: 10.18154/RWTH-CONV-239805

Kaplan, E.C., Baskent, D. and Wagner, A. (2019) Differences in Processing Speech-on-Speech Between Musicians and Non-musicians: The Role of Prosodic Cues. In ICA 2019 Proceedings, pp. 5756-5758. DOI: 10.18154/RWTH-CONV-239680

Llorach, G. and Hohmann, V. (2019) Word error and confusion patterns in an audiovisual German matrix sentence test (OLSA). In ICA 2019 Proceedings, pp. 5749-5751. DOI: 10.18154/RWTH-CONV-239621

Grimm, G. ; Llorach, G. ; Hendrikse, M. ; Hohmann, V. (2019) Audio-visual stimuli for the evaluation of speech-enhancing algorithms. In ICA 2019 Proceedings, pp. 3883-3889. DOI: 10.18154/RWTH-CONV-238907

Hendrikse, M. ; Llorach, G. ; Grimm, G. ; Hohmann, V. (2019) Realistic Audiovisual Listening Environments in the Lab: Analysis of Movement Behavior and Consequences for Hearing Aids. In ICA 2019 Proceedings, pp. 7616-7622. DOI: 10.18154/RWTH-CONV-239167

Hall, A., Winneke, A., Rennies-Hochmuth, J. (2019) EEG alpha power as a measure of listening effort reduction in adverse conditions. ICA 2019 Conference Proceedings, pp. 5752-5755. DOI: 10.18154/RWTH-CONV-239632

Serrano, L., Raman, S., Tavarez, D., Navas, E., Hernaez, I. (2019) Parallel vs. Non-Parallel Voice Conversion for Esophageal Speech. Proc. Interspeech 2019, 4549-4553, DOI: 10.21437/Interspeech.2019-2194.

Paulus, M., Hazan, V., Adank, P. (2019) Talker Intelligibility and Listening Effort with Temporally Modified Speech. Proc. Interspeech 2019, 3128-3132, DOI: 10.21437/Interspeech.2019-1402.

Chermaz, C., Valentini Botinhao, C., Schepker, H., & King, S. Evaluating Near End Listening Enhancement Algorithms in Realistic Environments. In Proceedings Interspeech 2019, 1373-1377, DOI: 10.21437/Interspeech.2019-1800. Nominated for the best student paper at Interspeech 2019.

Govender, A., Wagner, A.E., King, S. (2019) Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise. Proc. Interspeech 2019, 1551-1555, DOI: 10.21437/Interspeech.2019-1783.

Muhammed Shifas, P. V., Adiga, N., Tsiaras, V., & Stylianou, Y. (2019). A Non-Causal FFTNet Architecture for Speech Enhancement. Proc. Interspeech 2019, p. 1826-1830. DOI: 10.21437/Interspeech.2019-2622.

Paul, D., Pantazis, Y., Stylianou, Y. (2019) Non-Parallel Voice Conversion Using Weighted Generative Adversarial Networks. Proc. Interspeech 2019, 659-663, DOI: 10.21437/Interspeech.2019-2869.

Eloff, R., Nortje, A., Niekerk, B.V., Govender, A., Nortje, L., Pretorius, A., Biljon, E.V., Westhuizen, E.V.D., Staden, L.V., Kamper, H. (2019) Unsupervised Acoustic Unit Discovery for Speech Synthesis Using Discrete Latent-Variable Neural Networks. Proc. Interspeech 2019, 1103-1107, DOI: 10.21437/Interspeech.2019-1518.

Govender, A., Valentini-Botinhao, C., King, S. (2019) Measuring the contribution to cognitive load of each predicted vocoder speech parameter in DNN-based speech synthesis. Proc. 10th ISCA Speech Synthesis Workshop, 121-126, DOI: 10.21437/SSW.2019-22

Y. Pantazis, D. Paul, M. Fasoulakis and Y. Stylianou (2019). Training Generative Adversarial Networks With Weights, 2019 27th European Signal Processing Conference (EUSIPCO), A Coruña, Spain, 2-6 Sept 2019, pp. 1-5, DOI: 10.23919/EUSIPCO.2019.8902934.

Koutsogiannaki, M., Simantiraki, O., Cooke, M., Lallier, M. (2020) Listening effort of natural speaking styles. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p. 61-62.

Simantiraki, O., Cooke, M. (2020) Exploring listeners’ speech modification preferences. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p. 15-16.

Raman, S., Winneke, A., Hernaez, I., Navas, E. (2020) Listening effort and oesophageal speech: An EEG study. SpiN 2020, 9-10 January 2020, Toulouse, France. Abstract p. 60-61.

Simantiraki, O., Cooke, M., Pantazis, Y. (2020) Effects of Spectral Tilt on Listeners’ Preferences And Intelligibility. ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 6254-6258, DOI: 10.1109/ICASSP40776.2020.9054117.

Paulus, M., Hazan, V., Adank, P. (2020). The relationship between talker acoustics, intelligibility, and effort in degraded listening conditions. The Journal of the Acoustical Society of America, 147(5), 3348-3359, DOI: 10.1121/10.0001212.

Shen C., Janse E. (2020). Maximum Speech Performance and Executive Control in Young Adult Speakers. Journal of Speech, Language, and Hearing Research. ePub Ahead of Issue. DOI: 10.1044/2020_JSLHR-19-00257

Simantiraki, O. and Cooke, M. (2020). Exploring listeners’ speech rate preferences. In Proceedings Interspeech 2020, p. 1346-1350. DOI: 10.21437/Interspeech.2020-1832

Chermaz, C., King, S. (2020). A Sound Engineering Approach to Near End Listening Enhancement. In Proceedings Interspeech 2020, p. 1356-1360. DOI: 10.21437/Interspeech.2020-2748

Paul, D., Pantazis, Y., Stylianou, Y. (2020) Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions. Proc. Interspeech 2020, 235-239, DOI: 10.21437/Interspeech.2020-2786.

Paul, D., Muhammed Shifas, P. V., Pantazis, Y. and Stylianou, Y. (2020). Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion. In Proceedings Interspeech 2020, p. 1361-1365. DOI: 10.21437/Interspeech.2020-2793

Rennies, J., Schepker, H., Valentini-Botinhao, C., Cooke, M. (2020) Intelligibility-Enhancing Speech Modifications — The Hurricane Challenge 2.0. Proc. Interspeech 2020, 1341-1345, DOI: 10.21437/Interspeech.2020-1641.

Govender, A., et al. (2020). ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Computer Speech & Language, Vol. 64, November 2020, 101114. DOI: 10.1016/j.csl.2020.101114

Serrano, L., Raman, S., Hernaez, I., Navas, E., Sanchez, J., Saratxaga, I. (2020). A Spanish Multispeaker Database of Esophageal Speech. Computer Speech and Language, Volume 66, March 2021, 101168. DOI: 10.1016/j.csl.2020.101168

Kaplan EC, Wagner AE, Toffanin P and Başkent D (2021) Do Musicians and Non-musicians Differ in Speech-on-Speech Processing? Front. Psychol. 12:623787. DOI: 10.3389/fpsyg.2021.623787

Llorach, G., Kirschner, F., Grimm, G., Zokoll, M.A., Wagener, K.C. and Hohmann, V., (2020). Development and Evaluation of Video Recordings for the OLSA Matrix Sentence Test. arXiv preprint arXiv:1912.04700

Padinjaru Veettil, M.S., S. Claudio, S., Stylianou, Y. (2020) A fully recurrent feature extraction for single channel speech enhancement. arXiv preprint arXiv:2006.05233

Pantazis, Y., Paul, D., Fasoulakis, M., Stylianou, Y., Katsoulakis, M. (2020) Cumulant GAN. arXiv preprint arXiv:2006.06625

Llorach, G., Hendrikse, M.M., Grimm, G. and Hohmann, V., 2020. Comparison of a Head-Mounted Display and a Curved Screen in a Multi-Talker Audiovisual Listening Task. arXiv preprint arXiv:2004.01451

Databases:

Shen, Chen; Janse, Esther; King, Simon. (2018). Radboud Lombard Corpus_Dutch, 2017. Radboud University. Centre for Language Studies.

Marcoux, Katherine; Ernestus, Mirjam; King, Simon. (2018). Dutch English Lombard Speech Native and Non-Native (DELNN). Radboud University. Centre for Language Studies.

Shen, Chen, & Janse, Esther. (2018). Radboud Lombard Corpus (Dutch). Zenodo. DOI: 10.5281/zenodo.4040685 

Katherine P. Marcoux, & Mirjam Ernestus (2019). Dutch English Native Non-Native Lombard (DELNN) Corpus. Zenodo. DOI: 10.5281/zenodo.4267819    

Maartje M. E. Hendrikse, Giso Grimm, Gerard Llorach, & Volker Hohmann. (2018). Audiovisual recordings of acted casual conversations between four speakers in German. Zenodo. DOI: 10.5281/zenodo.1257333   

Llorach, Gerard, Kirschner, Frederike, Grimm, Giso, & Hohmann, Volker. (2020). Video recordings for the female German Matrix Sentence Test (OLSA). Zenodo. DOI: 10.5281/zenodo.3673062 

Llorach, Gerard, Grimm, Giso, Vormann, Matthias, Hohmann, Volker, & Meis, Markus. (2020). Vehicle driving actions for loudness and annoyance perception. Zenodo. DOI: 10.5281/zenodo.3822311

Sfakianaki, Anna; Kafentzis, George; Stylianou, Yannis (2020). Greek Harvard.

Govender, Avashna (2018). Pupillometry toolkit.

Shen, C., Janse, E. (2021). Radboud Tongue Twister Corpus_Dutch. To be uploaded to Zenodo in the near future.

_ _ _ _ _ _ _ _

Unless otherwise noted, this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.