I study computational models of early phonetic learning, i.e. how infants’ speech perception changes over the course of the first year of life, in particular how it becomes attuned to their native language(s). For example, by 10-12 months of age, infants growing up in a Japanese linguistic environment no longer discriminate between American English /r/ and /l/ (as in ‘rap’ versus ‘lap’), whereas infants growing up in an American English environment still do.
Schatz, T., Feldman, N.H. (2018) Neural network vs. HMM speech recognition systems as models of human cross-linguistic phonetic perception Proceedings of CCN, 2018 [pdf]
Riad, R., Dancette, C., Karadayi, J., Zeghidour, N., Schatz, T., Dupoux, E. (2018) Sampling strategies in Siamese Networks for unsupervised speech representation learning Proceedings of INTERSPEECH, 2018 [pdf]
Zeghidour, N., Usunier, N., Kokkinos, I., Schatz, T., Synnaeve, G. & Dupoux, E. (2018) Learning filterbanks from raw speech for phone recognition Proceedings of ICASSP, 2018 [pdf]
Schatz, T., Bach, F., Dupoux, E. (2018) Evaluating automatic speech recognition systems as quantitative models of cross-lingual phonetic category perception The Journal of the Acoustical Society of America [pdf]
Schatz, T., Bach, F., Dupoux, E. (2017) ASR Systems as Models of Phonetic Category Perception Proceedings of CCN, 2017 [pdf]
Schatz, T., Turnbull, R., Bach, F., Dupoux, E. (2017) A Quantitative Measure of the Impact of Coarticulation on Phone Discriminability Proceedings of INTERSPEECH, 2017 [pdf]
Schatz, T. (2016) ABX-Discriminability Measures and Applications PhD Thesis [pdf]
Versteegh, M.*, Thiollière, R.*, Schatz, T.*, Cao, X-N., Anguera, X., Jansen, A., Dupoux, E. (2015), The Zero Resource Speech Challenge 2015. Proceedings of INTERSPEECH, 2015. [pdf]
*These authors contributed equally to this work.
Martin, A., Schatz, T., Versteegh, M., Dupoux, E., Mazuka, R., Miyazawa, K., Cristia, A. (2015), Mothers Speak Less Clearly to Infants: A Comprehensive Test of the Hyperarticulation Hypothesis. Psychological Science 26(3), pp. 341-347. [pdf]
Synnaeve, G., Schatz, T., Dupoux, E. (2014). Phonetics Embedding Learning with Side Information. Proceedings of SLT, 2014. [pdf]
Schatz, T., Peddinti, V., Cao, X-N., Bach, F., Hermansky, H. & Dupoux, E. (2014). Evaluating Speech Features with the Minimal-Pair ABX Task (II): Resistance to Noise. Proceedings of INTERSPEECH, 2014. [pdf]
Fourtassi, A., Schatz, T., Varadarajan, B. & Dupoux, E. (2014). Exploring the Relative Role of Bottom-up and Top-down Information in Phoneme Learning. Proceedings of ACL, 2014. [pdf]
Schatz, T., Peddinti, V., Bach, F., Jansen, A., Hermansky, H. & Dupoux, E. (2013). Evaluating Speech Features with the Minimal-Pair ABX Task: Analysis of the Classical MFC/PLP Pipeline. Proceedings of INTERSPEECH, 2013. [pdf]
Jansen, A., Dupoux, E., Goldwater, S., Johnson, M., Khudanpur, S., Church, K., Feldman, N., Hermansky, H., Metze, F., Rose, R., Seltzer, M., Clark, P., McGraw, I., Varadarajan, B., Bennett, E., Borschinger, B., Chiu, J., Dunbar, E., Fourtassi, A., Harwath, D., Lee, C-y., Levin, K., Norouzian, A., Peddinti, V., Richardson, R., Schatz, T. & Thomas, S. (2013). A Summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition. Proceedings of ICASSP, 2013. [pdf]
Schatz, T. and Oudeyer, P-Y. (2009). Learning Motor Dependent Crutchfield’s Information Distance to Anticipate Changes in the Topology of Sensory Body Maps. Proceedings of ICDL, 2009. [pdf]
(Best student paper award)
A Python package for performing ABX experiments on large corpora.
A standardized format for corpora of speech recordings with tools for performing experiments combining ASR and ABX-Discriminability Measures.
A utility for efficiently managing the complex FLows Of Data in modern machine learning and signal processing pipelines.
Not yet released.
Schatz, T., Cao, X-N., Kolesnikova, A., Bergvelt, T., Wright, J., Dupoux, E. (2015) Articulation Index LSCP LDC2015S12. Web Download. Philadelphia: Linguistic Data Consortium, 2015. [Available for free here]