GERest: A German Dataset for Aspect Sentiment Quadruple Prediction
- Topic:
- GERest: A German Dataset for Aspect Sentiment Quadruple Prediction
- Type:
- Bachelor's thesis
- Supervisor:
- Nils Constantin Hellwig
- Student:
- Niclas Reuse
- Status:
- in progress
- Created:
- 2024-11-21
- Kick-off presentation:
- 2025-01-13
Background
Sentiment analysis has evolved from general text-level sentiment detection to more granular approaches such as Aspect-Based Sentiment Analysis (ABSA) and Aspect Sentiment Quad Prediction (ASQP). In ASQP, each opinion in a sentence is represented as a quadruple of aspect term, aspect category, opinion term, and sentiment polarity. While numerous resources and datasets exist for ASQP in English, annotated resources for German are lacking. This limits the development and evaluation of models that perform ASQP in German. Closing this gap is important for advancing multilingual NLP and fine-grained sentiment analysis in underrepresented languages.
Objective of the thesis
The primary goal of this thesis is to create the first German ASQP dataset by extending an existing ABSA dataset with opinion term annotations. The dataset will be used to train and evaluate a transformer-based model, such as BERT, on aspect-level sentiment. In addition, the model's performance on the German dataset will be compared with its performance on an English ASQP dataset to assess cross-linguistic differences and the model's effectiveness.
Specific tasks
- Convert the existing German ABSA dataset GERestaurant from JSON to CSV format
- Extend the existing Rest16 dataset by annotating opinion terms for ASQP compatibility
- Train a transformer-based model (e.g., BERT) on the annotated dataset
- Evaluate the model's performance and compare it with results on the English ASQP dataset
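The JSON-to-CSV conversion in the first task could look roughly like the following sketch. The input schema is not specified here, so the keys `id`, `text`, `aspects`, `term`, `category`, and `polarity`, as well as the CSV column names, are assumptions made for illustration; the actual GERestaurant files may be structured differently. The sketch writes one CSV row per annotated aspect and leaves an empty `opinion_term` column to be filled in during annotation.

```python
import csv
import io
import json

# Hypothetical input: one JSON object per review sentence. The real
# GERestaurant schema may differ; all keys below are assumptions.
SAMPLE_JSON = """
[
  {"id": "1", "text": "Die Pizza war lecker.",
   "aspects": [{"term": "Pizza", "category": "FOOD#QUALITY", "polarity": "positive"}]},
  {"id": "2", "text": "Leider sehr laut.",
   "aspects": [{"term": "NULL", "category": "AMBIENCE#GENERAL", "polarity": "negative"}]}
]
"""

# Assumed column layout: the four quadruple elements plus sentence id and text.
FIELDS = ["id", "text", "aspect_term", "aspect_category", "opinion_term", "polarity"]


def json_to_rows(records):
    """Flatten each annotated aspect into one row (one row per aspect)."""
    rows = []
    for record in records:
        for aspect in record.get("aspects", []):
            rows.append({
                "id": record["id"],
                "text": record["text"],
                "aspect_term": aspect["term"],
                "aspect_category": aspect["category"],
                # Left empty on purpose: this is what the annotation step adds.
                "opinion_term": "",
                "polarity": aspect["polarity"],
            })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = json_to_rows(json.loads(SAMPLE_JSON))
csv_text = rows_to_csv(rows)
print(csv_text)
```

Keeping one row per aspect means a sentence with several aspects appears on several rows, which makes it straightforward to add exactly one opinion term per aspect.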
Expected prior knowledge
- Basic knowledge of Python for data preprocessing and model training
- Basic knowledge of natural language processing concepts (particularly ABSA)
- Basic experience with machine learning frameworks
Further reading
- Wenxuan Zhang, Xin Li, Yang Deng, Lidong Bing, and Wai Lam. 2023. A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges. IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 11, pages 11019–11038. doi: 10.1109/TKDE.2022.3230975.
- Wenxuan Zhang, Yang Deng, Xin Li, Yifei Yuan, Lidong Bing, and Wai Lam. 2021. Aspect Sentiment Quad Prediction as Paraphrase Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 9209–9219, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.