Can Recurrent Neural Networks Validate Usage-Based Theories of Grammar Acquisition?
https://doi.org/10.3389/fpsyg.2022.741321
Journal: Frontiers in Psychology, 2022
Publisher: Frontiers Media SA
Authors: Ludovica Pannitto, Aurélie Herbelot
Abstract
It has been shown that Recurrent Artificial Neural Networks automatically acquire some grammatical knowledge in the course of performing linguistic prediction tasks. The extent to which such networks can actually learn grammar is still an object of investigation. However, being mostly data-driven, they provide a natural testbed for usage-based theories of language acquisition. This mini-review gives an overview of the state of the field, focusing on the influence of the theoretical framework in the interpretation of results.
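To make the setup concrete, the following is a minimal sketch (not code from the paper) of the kind of prediction task these studies rely on: an LSTM language model is trained purely on next-word prediction, and its grammatical knowledge is then probed by comparing the probabilities it assigns to a minimal pair differing only in subject-verb agreement, in the spirit of the targeted syntactic evaluations cited below (e.g., Marvin; Hu). The toy corpus, model size, and training settings are invented for illustration.

```python
# Illustrative sketch only: an LSTM language model trained on next-word
# prediction, then probed with a subject-verb agreement minimal pair.
# Corpus and hyperparameters are toy values, not the authors' setup.
import torch
import torch.nn as nn

corpus = [
    "the dog chases the cat",
    "the dogs chase the cat",
    "the cat sees the dogs",
    "the cats see the dog",
]
vocab = sorted({w for s in corpus for w in s.split()})
stoi = {w: i for i, w in enumerate(vocab)}

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, ids):
        h, _ = self.lstm(self.embed(ids))
        return self.out(h)  # logits over the next word at each position

def encode(sentence):
    return torch.tensor([[stoi[w] for w in sentence.split()]])

model = LSTMLanguageModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Training is pure next-word prediction: at each position, the model is
# asked to predict the following word of the sentence.
for epoch in range(200):
    for s in corpus:
        ids = encode(s)
        logits = model(ids[:, :-1])
        loss = loss_fn(logits.reshape(-1, len(vocab)), ids[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

def log_prob(sentence):
    """Sum of log-probabilities the model assigns to each next word."""
    ids = encode(sentence)
    with torch.no_grad():
        logits = model(ids[:, :-1])
    logp = torch.log_softmax(logits, dim=-1)
    return logp.gather(-1, ids[:, 1:].unsqueeze(-1)).sum().item()

# Targeted evaluation: a model that has picked up agreement should assign
# higher probability to the grammatical member of the minimal pair.
print(log_prob("the dogs chase the cat"))   # grammatical
print(log_prob("the dogs chases the cat"))  # ungrammatical agreement
```

The probe never inspects the network's internals: grammatical knowledge is inferred behaviorally, from the preference the trained model shows between the two variants.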
List of references
- Alishahi, Analyzing and interpreting neural networks for NLP: a report on the first BlackboxNLP workshop, Nat. Lang. Eng., Vol. 25, p. 543
https://doi.org/10.1017/S135132491900024X
- Arehalli, Neural language models capture some, but not all, agreement attraction effects, CogSci 2020
- Baroni, Linguistic generalization and compositionality in modern artificial neural networks, Philos. Trans. R. Soc. Lond. B Biol. Sci., Vol. 375, p. 1
https://doi.org/10.1098/rstb.2019.0307
- Barsalou, The instability of graded structure: implications for the nature of concepts, Concepts and Conceptual Development: Ecological and Intellectual Factors in Categorization, p. 101
- Boyd, Input effects within a constructionist framework, Mod. Lang. J., Vol. 93, p. 418
https://doi.org/10.1111/j.1540-4781.2009.00899.x
- Chelba, One billion word benchmark for measuring progress in statistical language modeling, arXiv [cs.CL]
- Chowdhury, RNN simulations of grammaticality judgments on long-distance dependencies, Proceedings of the 27th International Conference on Computational Linguistics, p. 133
- Christiansen, Implicit statistical learning: a tale of two literatures, Top. Cogn. Sci., Vol. 11, p. 468
https://doi.org/10.1111/tops.12332
- Christiansen, The now-or-never bottleneck: a fundamental constraint on language, Behav. Brain Sci., Vol. 39, p. 1
https://doi.org/10.1017/S0140525X1500031X
- Christiansen, Creating Language: Integrating Evolution, Acquisition, and Processing
https://doi.org/10.7551/mitpress/10406.001.0001
- Cornish, Sequence memory constraints give rise to language-like structure through iterated learning, PLoS ONE, Vol. 12, p. 1
https://doi.org/10.1371/journal.pone.0168532
- Davis, Discourse structure interacts with reference but not syntax in neural language models, Proceedings of the 24th Conference on Computational Natural Language Learning, p. 396
https://doi.org/10.18653/v1/2020.conll-1.32
- Davis, Recurrent neural network language models always learn English-like relative clause attachment, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 1979
https://doi.org/10.18653/v1/2020.acl-main.179
- Dyer, Recurrent neural network grammars, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016), p. 199
- Elman, On the meaning of words and dinosaur bones: lexical knowledge without a lexicon, Cogn. Sci., Vol. 33, p. 547
https://doi.org/10.1111/j.1551-6709.2009.01023.x
- Fazekas, Do children learn from their prediction mistakes? A registered report evaluating error-based theories of language acquisition, R. Soc. Open Sci., Vol. 7, p. 180877
https://doi.org/10.1098/rsos.180877
- Giulianelli, Under the hood: Using diagnostic classifiers to investigate and improve how language models track agreement information, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, p. 240
https://doi.org/10.18653/v1/W18-5426
- Goldberg, Constructions at Work: The Nature of Generalization in Language
- Gómez, Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge, Cognition, Vol. 70, p. 109
https://doi.org/10.1016/S0010-0277(99)00003-7
- Gómez, Infant artificial language learning and language acquisition, Trends Cogn. Sci., Vol. 4, p. 178
https://doi.org/10.1016/S1364-6613(00)01467-4
- Gulordava, Colorless green recurrent networks dream hierarchically, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1, p. 1195
- Hart, Meaningful Differences in the Everyday Experience of Young American Children
- Hawkins, Investigating representations of verb bias in neural language models, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 4653
https://doi.org/10.18653/v1/2020.emnlp-main.376
- Hochreiter, Long short-term memory, Neural Comput., Vol. 9, p. 1735
https://doi.org/10.1162/neco.1997.9.8.1735
- Hu, A systematic assessment of syntactic generalization in neural language models, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 1725
https://doi.org/10.18653/v1/2020.acl-main.158
- Huebner, BabyBERTa: Learning more grammar with small-scale child-directed language, Proceedings of the 25th Conference on Computational Natural Language Learning, p. 624
https://doi.org/10.18653/v1/2021.conll-1.49
- Jackendoff, Foundations of Language
https://doi.org/10.1093/acprof:oso/9780198270126.001.0001
- Kharitonov, How BPE affects memorization in transformers, arXiv preprint
- Kuncoro, What do recurrent neural network grammars learn about syntax?, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, p. 1249
- Kuncoro, LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, p. 1426
- Lakretz, The emergence of number and syntax units in LSTM language models, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), p. 11
- Lepori, Representations of syntax [MASK] useful: effects of constituency and dependency structure in recursive LSTMs, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 3306
https://doi.org/10.18653/v1/2020.acl-main.303
- Linzen, Syntactic structure from deep learning, Ann. Rev. Linguist., Vol. 7, p. 1
https://doi.org/10.1146/annurev-linguistics-032020-051035
- Linzen, T., Chrupala, G., Alishahi, A. (eds.), Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Brussels: Association for Computational Linguistics, 2018
- Linzen, T., Chrupala, G., Belinkov, Y., Hupkes, D. (eds.), Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Florence: Association for Computational Linguistics, 2019
- Liu, Probing across time: what does RoBERTa know and when?, Findings of the Association for Computational Linguistics: EMNLP 2021, p. 820
https://doi.org/10.18653/v1/2021.findings-emnlp.71
- Marvin, Targeted syntactic evaluation of language models, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 1192
https://doi.org/10.18653/v1/D18-1151
- McCoy, Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks, CogSci, p. 2096
- McCoy, Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks, Trans. Assoc. Comput. Linguist., Vol. 8, p. 125
https://doi.org/10.1162/tacl_a_00304
- McRae, People use their knowledge of common events to understand language, and do so as quickly as possible, Lang. Linguist. Compass, Vol. 3, p. 1417
https://doi.org/10.1111/j.1749-818X.2009.00174.x
- Pannitto, Recurrent babbling: evaluating the acquisition of grammar from limited input data, Proceedings of the 24th Conference on Computational Natural Language Learning, p. 165
https://doi.org/10.18653/v1/2020.conll-1.13
- Pickering, An integrated theory of language production and comprehension, Behav. Brain Sci., Vol. 36, p. 329
https://doi.org/10.1017/S0140525X12001495
- Ramscar, Error and expectation in language learning: the curious absence of mouses in adult speech, Language, Vol. 89, p. 760
https://doi.org/10.1353/lan.2013.0068
- Romberg, Statistical learning and language acquisition, Wiley Interdiscipl. Rev. Cogn. Sci., Vol. 1, p. 906
https://doi.org/10.1002/wcs.78
- Saffran, Statistical learning by 8-month-old infants, Science, Vol. 274, p. 1926
https://doi.org/10.1126/science.274.5294.1926
- Tomasello, Constructing a Language: A Usage-Based Theory of Language Acquisition
- Vaswani, Attention is all you need, Advances in Neural Information Processing Systems. Long Beach, CA: Curran Associates, 2017
- Warstadt, BLiMP: the benchmark of linguistic minimal pairs for English, Trans. Assoc. Comput. Linguist., Vol. 8, p. 377
https://doi.org/10.1162/tacl_a_00321
- Warstadt, Learning which features matter: RoBERTa acquires a preference for linguistic generalizations (eventually), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
https://doi.org/10.18653/v1/2020.emnlp-main.16
- Warstadt, Can neural networks acquire a structural bias from raw linguistic data?, Proceedings of the 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020
- Wilcox, What do RNN language models learn about filler gap dependencies?, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
- Yu, Word frequency does not predict grammatical knowledge in language models, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 4040
https://doi.org/10.18653/v1/2020.emnlp-main.331
Publications that cite this publication
Quantum projections on conceptual subspaces
Alejandro Martínez-Mingo, Guillermo Jorge-Botana, José Ángel Martinez-Huertas, Ricardo Olmos Albacete
https://doi.org/10.1016/j.cogsys.2023.101154
2023, Cognitive Systems Research, p. 101154
About this publication
Number of citations: 0
Number of works in the list of references: 52
Journal indexed in Scopus: Yes
Journal indexed in Web of Science: Yes