La norme et la variation dans le cadre du Traitement Automatique du Langage (Norm and Variation in the Context of Natural Language Processing)
Andrea Briglia; Massimo Mucciardi; Giovanni Pirrotta
2023-01-01
Abstract
This article addresses the status of norm and variation in NLP, drawing on examples from earlier research on computational models of French language acquisition. Two case studies illustrate the choices that arise along the norm-variation axis: the automatic computation of a frequency distribution, and the recognition of sequential patterns in words containing syllable sequences that are hard to learn because of their intrinsic phonetic difficulty. Whether the level of analysis is the word (first example) or the phoneme (second example), similar obstacles and trade-offs arise. In both cases a choice, often difficult and constrained, must be made between the accuracy of the linguistic description and the need for uniform data that the machine can handle easily. The article discusses the avoidable and unavoidable biases, the precautions to take beforehand, and the advantages and disadvantages of these types of NLP models. It ends by outlining possible future complementarities between qualitative and quantitative methods in current linguistics.