Building another Spanish dictionary, this time with GPT-4
Miguel Ortega-Martín, Óscar García-Sierra, Alfonso Ardoiz, Juan Carlos Armenteros, Ignacio Garrido, Jorge Álvarez, Camilo Torrón, Iñigo Galdeano, Ignacio Arranz, Oleg Vorontsov, Adrián Alonso · June 17, 2024
Summary
The paper presents the Spanish Built Factual Freectianary 2.0 (Spanish-BFF-2), an updated Spanish dictionary generated with GPT-4-turbo. It improves on its predecessor by addressing known limitations, expanding coverage, and adding part-of-speech tags. The study evaluates the definitions GPT-4-turbo generates by comparing them with the Diccionario de la Lengua Española (DLE) and analyzes the model's strengths and weaknesses on monosemous and polysemous words. GPT-4-turbo generally provides accurate definitions, but it struggles with polysemy and occasionally hallucinates. Precision is higher for monosemous words, while recall is lower for polysemous words, which indicates room for improvement in capturing multiple meanings. The error analysis also identifies issues related to subword tokenization, and the authors suggest future work on refining the model's handling of rare and unconventional words, as well as on the responsible use of large language models in lexicography.
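The summary does not say how precision and recall were computed, so the following is only a minimal sketch of one plausible sense-level reading. It assumes each generated definition has already been judged as matching a specific reference sense or none at all. The `Entry` class, the sense identifiers, and the toy words are hypothetical and are not taken from the paper or from the DLE.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple


@dataclass
class Entry:
    """One headword with its reference senses and judged model output (hypothetical structure)."""
    word: str
    reference_senses: Set[str]               # sense ids listed in the reference dictionary
    generated_matches: List[Optional[str]]   # per generated definition: matched sense id, or None


def precision_recall(entries: Sequence[Entry]) -> Tuple[float, float]:
    """Sense-level precision and recall pooled over all entries.

    Precision: share of generated definitions that match some reference sense.
    Recall: share of reference senses covered by at least one generated definition.
    """
    generated = correct = reference = covered = 0
    for entry in entries:
        valid = [m for m in entry.generated_matches if m in entry.reference_senses]
        generated += len(entry.generated_matches)
        correct += len(valid)
        reference += len(entry.reference_senses)
        covered += len(set(valid))  # a sense counts once even if defined twice
    precision = correct / generated if generated else 0.0
    recall = covered / reference if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data, assumed for illustration only.
    monosemous = [Entry("oxígeno", {"s1"}, ["s1"])]
    polysemous = [Entry("banco", {"s1", "s2", "s3"}, ["s1", None])]
    print("monosemous:", precision_recall(monosemous))
    print("polysemous:", precision_recall(polysemous))
```

Under this reading, a model that returns a single correct definition per word scores well on monosemous words and loses recall on polysemous ones, while a hallucinated definition (a `None` match) lowers precision.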
Introduction
Background
Evolution of built factual dictionaries
GPT-4-turbo as a language model innovation
Objective
Improve upon Spanish-BFF-1
Evaluate GPT-4-turbo's performance in lexicography
Address limitations and enhance coverage
Method
Data Collection
GPT-4-turbo generation process
Comparison dataset: Diccionario de la Lengua Española (DLE)
Data Preprocessing
Definition generation from GPT-4-turbo
Part-of-speech tagging and analysis
Performance Evaluation
Monosemy and polysemy analysis
Accuracy, precision, and recall metrics
Error Analysis
Subword tokenization issues
Hallucinations and unconventional word handling
Responsible Use in Lexicography
Large language model ethics
Future directions for refining the model
Results and Discussion
Strengths and weaknesses of GPT-4-turbo in Spanish definitions
Comparison with DLE: GPT-4-turbo's performance
Recommendations for model improvements
Conclusion
Summary of findings
Significance of Spanish-BFF-2 for Spanish language resources
Implications for future language model applications in lexicography
Basic info
papers
computation and language
artificial intelligence
Insights
What is the primary focus of the paper about the Spanish dictionary?
What is the main evaluation method used to compare GPT-4-turbo with Diccionario de la Lengua Española (DLE)?
How does Spanish-BFF-2 improve on its predecessor?
What are the specific challenges GPT-4-turbo faces in handling word meanings, as mentioned in the study?