Can Recurrent Neural Networks Validate Usage-Based Theories of Grammar Acquisition?
https://doi.org/10.3389/fpsyg.2022.741321
Journal: Frontiers in Psychology, 2022
Publisher: Frontiers Media SA
Authors: Ludovica Pannitto, Aurélie Herbelot
Abstract
It has been shown that Recurrent Artificial Neural Networks automatically acquire some grammatical knowledge in the course of performing linguistic prediction tasks. The extent to which such networks can actually learn grammar is still an object of investigation. However, being mostly data-driven, they provide a natural testbed for usage-based theories of language acquisition. This mini-review gives an overview of the state of the field, focusing on the influence of the theoretical framework in the interpretation of results.
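To make the setup concrete, the following is a minimal sketch (not code from the paper) of the kind of prediction task these studies rely on: an LSTM language model is trained purely on next-word prediction, and its grammatical knowledge is then probed by comparing the probabilities it assigns to a minimal pair differing only in subject-verb agreement, in the spirit of the targeted syntactic evaluations cited below (e.g., Marvin; Hu). The toy corpus, model size, and training settings are invented for illustration.

```python
# Illustrative sketch only: an LSTM language model trained on next-word
# prediction, then probed with a subject-verb agreement minimal pair.
# Corpus and hyperparameters are toy values, not the authors' setup.
import torch
import torch.nn as nn

corpus = [
    "the dog chases the cat",
    "the dogs chase the cat",
    "the cat sees the dogs",
    "the cats see the dog",
]
vocab = sorted({w for s in corpus for w in s.split()})
stoi = {w: i for i, w in enumerate(vocab)}

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, ids):
        h, _ = self.lstm(self.embed(ids))
        return self.out(h)  # logits over the next word at each position

def encode(sentence):
    return torch.tensor([[stoi[w] for w in sentence.split()]])

model = LSTMLanguageModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Training is pure next-word prediction: at each position, the model is
# asked to predict the following word of the sentence.
for epoch in range(200):
    for s in corpus:
        ids = encode(s)
        logits = model(ids[:, :-1])
        loss = loss_fn(logits.reshape(-1, len(vocab)), ids[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

def log_prob(sentence):
    """Sum of log-probabilities the model assigns to each next word."""
    ids = encode(sentence)
    with torch.no_grad():
        logits = model(ids[:, :-1])
    logp = torch.log_softmax(logits, dim=-1)
    return logp.gather(-1, ids[:, 1:].unsqueeze(-1)).sum().item()

# Targeted evaluation: a model that has picked up agreement should assign
# higher probability to the grammatical member of the minimal pair.
print(log_prob("the dogs chase the cat"))   # grammatical
print(log_prob("the dogs chases the cat"))  # ungrammatical agreement
```

The probe never inspects the network's internals: grammatical knowledge is inferred behaviorally, from the preference the trained model shows between the two variants.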
List of references
- Alishahi, Analyzing and interpreting neural networks for NLP: a report on the first BlackboxNLP workshop, Nat. Lang. Eng., Vol. 25, p. 543
https://doi.org/10.1017/S135132491900024X
- Arehalli, Neural language models capture some, but not all, agreement attraction effects, CogSci 2020
- Baroni, Linguistic generalization and compositionality in modern artificial neural networks, Philos. Trans. R. Soc. Lond. B Biol. Sci., Vol. 375, p. 1
https://doi.org/10.1098/rstb.2019.0307
- Barsalou, The instability of graded structure: implications for the nature of concepts, Concepts and Conceptual Development: Ecological and Intellectual Factors in Categorization, p. 101
- Boyd, Input effects within a constructionist framework, Mod. Lang. J., Vol. 93, p. 418
https://doi.org/10.1111/j.1540-4781.2009.00899.x
- Chelba, One billion word benchmark for measuring progress in statistical language modeling, arXiv [cs.CL]
- Chowdhury, RNN simulations of grammaticality judgments on long-distance dependencies, Proceedings of the 27th International Conference on Computational Linguistics, p. 133
- Christiansen, Implicit statistical learning: a tale of two literatures, Top. Cogn. Sci., Vol. 11, p. 468
https://doi.org/10.1111/tops.12332
- Christiansen, The now-or-never bottleneck: a fundamental constraint on language, Behav. Brain Sci., Vol. 39, p. 1
https://doi.org/10.1017/S0140525X1500031X
- Christiansen, Creating Language: Integrating Evolution, Acquisition, and Processing
https://doi.org/10.7551/mitpress/10406.001.0001
- Cornish, Sequence memory constraints give rise to language-like structure through iterated learning, PLoS ONE, Vol. 12, p. 1
https://doi.org/10.1371/journal.pone.0168532
- Davis, Discourse structure interacts with reference but not syntax in neural language models, Proceedings of the 24th Conference on Computational Natural Language Learning, p. 396
https://doi.org/10.18653/v1/2020.conll-1.32
- Davis, Recurrent neural network language models always learn English-like relative clause attachment, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 1979
https://doi.org/10.18653/v1/2020.acl-main.179
- Dyer, Recurrent neural network grammars, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016), p. 199
- Elman, On the meaning of words and dinosaur bones: lexical knowledge without a lexicon, Cogn. Sci., Vol. 33, p. 547
https://doi.org/10.1111/j.1551-6709.2009.01023.x
- Fazekas, Do children learn from their prediction mistakes? A registered report evaluating error-based theories of language acquisition, R. Soc. Open Sci., Vol. 7, p. 180877
https://doi.org/10.1098/rsos.180877
- Giulianelli, Under the hood: Using diagnostic classifiers to investigate and improve how language models track agreement information, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, p. 240
https://doi.org/10.18653/v1/W18-5426
- Goldberg, Constructions at Work: The Nature of Generalization in Language
- Gómez, Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge, Cognition, Vol. 70, p. 109
https://doi.org/10.1016/S0010-0277(99)00003-7
- Gómez, Infant artificial language learning and language acquisition, Trends Cogn. Sci., Vol. 4, p. 178
https://doi.org/10.1016/S1364-6613(00)01467-4
- Gulordava, Colorless green recurrent networks dream hierarchically, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1, p. 1195
- Hart, Meaningful Differences in the Everyday Experience of Young American Children
- Hawkins, Investigating representations of verb bias in neural language models, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 4653
https://doi.org/10.18653/v1/2020.emnlp-main.376
- Hochreiter, Long short-term memory, Neural Comput., Vol. 9, p. 1735
https://doi.org/10.1162/neco.1997.9.8.1735
- Hu, A systematic assessment of syntactic generalization in neural language models, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 1725
https://doi.org/10.18653/v1/2020.acl-main.158
- Huebner, BabyBERTa: Learning more grammar with small-scale child-directed language, Proceedings of the 25th Conference on Computational Natural Language Learning, p. 624
https://doi.org/10.18653/v1/2021.conll-1.49
- Jackendoff, Foundations of Language
https://doi.org/10.1093/acprof:oso/9780198270126.001.0001
- Kharitonov, How BPE affects memorization in transformers, arXiv preprint
- Kuncoro, What do recurrent neural network grammars learn about syntax?, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, p. 1249
- Kuncoro, LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, p. 1426
- Lakretz, The emergence of number and syntax units in LSTM language models, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), p. 11
- Lepori, Representations of syntax [MASK] useful: effects of constituency and dependency structure in recursive LSTMs, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 3306
https://doi.org/10.18653/v1/2020.acl-main.303
- Linzen, Syntactic structure from deep learning, Ann. Rev. Linguist., Vol. 7, p. 1
https://doi.org/10.1146/annurev-linguistics-032020-051035
- Linzen, T., Chrupala, G., Alishahi, A. (eds.), Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Brussels: Association for Computational Linguistics, 2018
- Linzen, T., Chrupala, G., Belinkov, Y., Hupkes, D. (eds.), Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Florence: Association for Computational Linguistics, 2019
- Liu, Probing across time: what does RoBERTa know and when?, Findings of the Association for Computational Linguistics: EMNLP 2021, p. 820
https://doi.org/10.18653/v1/2021.findings-emnlp.71
- Marvin, Targeted syntactic evaluation of language models, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 1192
https://doi.org/10.18653/v1/D18-1151
- McCoy, Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks, CogSci, p. 2096
- McCoy, Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks, Trans. Assoc. Comput. Linguist., Vol. 8, p. 125
https://doi.org/10.1162/tacl_a_00304
- McRae, People use their knowledge of common events to understand language, and do so as quickly as possible, Lang. Linguist. Compass, Vol. 3, p. 1417
https://doi.org/10.1111/j.1749-818X.2009.00174.x
- Pannitto, Recurrent babbling: evaluating the acquisition of grammar from limited input data, Proceedings of the 24th Conference on Computational Natural Language Learning, p. 165
https://doi.org/10.18653/v1/2020.conll-1.13
- Pickering, An integrated theory of language production and comprehension, Behav. Brain Sci., Vol. 36, p. 329
https://doi.org/10.1017/S0140525X12001495
- Ramscar, Error and expectation in language learning: the curious absence of mouses in adult speech, Language, Vol. 89, p. 760
https://doi.org/10.1353/lan.2013.0068
- Romberg, Statistical learning and language acquisition, Wiley Interdiscipl. Rev. Cogn. Sci., Vol. 1, p. 906
https://doi.org/10.1002/wcs.78
- Saffran, Statistical learning by 8-month-old infants, Science, Vol. 274, p. 1926
https://doi.org/10.1126/science.274.5294.1926
- Tomasello, Constructing a Language: A Usage-Based Theory of Language Acquisition
- Vaswani, Attention is all you need, Advances in Neural Information Processing Systems. Long Beach, CA: Curran Associates, 2017
- Warstadt, BLiMP: the benchmark of linguistic minimal pairs for English, Trans. Assoc. Comput. Linguist., Vol. 8, p. 377
https://doi.org/10.1162/tacl_a_00321
- Warstadt, Learning which features matter: RoBERTa acquires a preference for linguistic generalizations (eventually), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
https://doi.org/10.18653/v1/2020.emnlp-main.16
- Warstadt, Can neural networks acquire a structural bias from raw linguistic data?, Proceedings of the 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020
- Wilcox, What do RNN language models learn about filler gap dependencies?, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
- Yu, Word frequency does not predict grammatical knowledge in language models, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 4040
https://doi.org/10.18653/v1/2020.emnlp-main.331
Publications that cite this publication
Quantum projections on conceptual subspaces
Alejandro Martínez-Mingo, Guillermo Jorge-Botana, José Ángel Martinez-Huertas, Ricardo Olmos Albacete
https://doi.org/10.1016/j.cogsys.2023.101154
2023, Cognitive Systems Research, p. 101154
About this publication
Number of citations: 0
Number of works in the list of references: 52
Journal indexed in Scopus: Yes
Journal indexed in Web of Science: Yes