A computationally-based approach to the understanding of child's phonological development: a case study on a set of longitudinal corpora
Andrea Briglia; Massimo Mucciardi
2020-01-01
Abstract
The “linguistic genius” of babies (Kuhl P., 2006) is an intriguing topic marked by a long-lasting debate over whether acquisition is experience-independent or experience-dependent. Following in the footsteps of ground-breaking experimental designs (Saffran J., 2003), this project aims to combine corpus linguistics and automated processing techniques to shed light on a critical phase of cognitive development occurring between 18 and 36 months: the building of a phonological consciousness (Clements N., 1985). Supposing that “any variation does not randomly vary into any other, but it rather would follow an underlying pattern” (Sauvage J., 2015), mining the data for patterns could allow us to explore the evolution of minimal pairs, and to observe whether and how the successive variations that infants produce on these contrastive units can be considered traces of an ongoing process of setting up a higher-level architecture.

Data. COLAJE (Morgenstern et al., 2012) is a French database made up of longitudinal recordings of seven infants, each filmed for one hour every month from their first year until the age of six. Each recording has been transcribed in three formats: CHAT, orthographic and IPA. Every utterance is transcribed on two complementary lines named “pho” (what the infant says) and “mod” (what the infant should have said according to the phonetic/phonological norm).
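To make the “pho”/“mod” pairing concrete, the following is a minimal sketch of how the two tiers of each utterance could be paired and compared. It is not the authors' pipeline. It assumes CHAT-style dependent tiers written as `%pho:` and `%mod:` after a `*CHI:` main line. The sample utterances are invented for illustration and are not taken from COLAJE, and the helper names `pair_tiers` and `substitutions` are hypothetical.

```python
from difflib import SequenceMatcher

# Invented CHAT-style excerpt (NOT taken from COLAJE); tiers are tab-separated.
SAMPLE = (
    "*CHI:\tchapeau .\n"
    "%pho:\tpapo\n"
    "%mod:\tʃapo\n"
    "*CHI:\tgâteau .\n"
    "%pho:\tkato\n"
    "%mod:\tɡato\n"
)

def pair_tiers(chat_text):
    """Return (pho, mod) pairs, one per utterance that carries both tiers."""
    pairs, current = [], {}
    for line in chat_text.splitlines():
        if line.startswith("*"):
            # A main line opens a new utterance, so discard any partial pair.
            current = {}
        elif line.startswith(("%pho:", "%mod:")):
            tier, _, content = line.partition(":")
            current[tier[1:]] = content.strip()
            if "pho" in current and "mod" in current:
                pairs.append((current["pho"], current["mod"]))
                current = {}
    return pairs

def substitutions(pho, mod):
    """List (target, produced) segments where the two tiers differ."""
    matcher = SequenceMatcher(None, mod, pho, autojunk=False)
    return [
        (mod[i1:i2], pho[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

if __name__ == "__main__":
    for pho, mod in pair_tiers(SAMPLE):
        print(mod, "->", pho, substitutions(pho, mod))
```

The comparison here is character by character, which is only a rough proxy: IPA symbols that carry diacritics or span several code points would need proper segmentation before variations are counted across sessions.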