Towards scalable and reliable coding of semantic property norms: ChatGPT vs. an improved AC-PLT

Research output: Contribution to journal, Article, peer-review

Abstract

When the Property Listing Task (PLT) is used to collect semantic content for a set of concepts (Concept Property Norms, CPNs), coding raw properties into standardized labels poses significant challenges. We address these challenges by enhancing the Assisted Coding for Property Listing Task (AC-PLT) framework, which facilitates the coding process. We conduct an ablation study to optimize AC-PLT by evaluating combinations of text cleaning, embedding models (e.g., Word2Vec, E5, LaBSE), and classification methods (e.g., kNN, SVM, XGBoost). Results show that normalization combined with the E5 embedding model and kNN classification achieves the highest accuracy, with top-1 test accuracies of 0.523 on the CPN27 dataset and 0.608 on the CPN120 dataset, outperforming the original AC-PLT baseline. Comparisons with ChatGPT (fine-tuned and one-shot) reveal AC-PLT’s superior stability and cost-effectiveness, despite ChatGPT’s competitive performance in some cases. The improved AC-PLT framework offers a scalable, efficient solution to the challenges of manual coding, reducing variability and easing time constraints. Future work will explore its role as a recommender system for human coders, further enhancing its practical utility in cognitive psychology and psycholinguistics research.
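To make the pipeline described in the abstract concrete (text cleaning, then embedding, then kNN classification, scored by top-1 accuracy), here is a minimal sketch in Python. It is not the AC-PLT implementation. The character-trigram `embed` function is a toy stand-in for a real embedding model such as E5, the cleaning rules in `normalize` are assumed, and the example properties and code labels are hypothetical. All function names are illustrative.

```python
import math
import re
from collections import Counter


def normalize(text):
    """Assumed cleaning step: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def embed(text, n=3):
    """Toy stand-in for a sentence embedding (e.g. E5): trigram counts."""
    padded = f" {normalize(text)} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_top_codes(raw_property, coded_examples, k=3, top_n=1):
    """Return the top_n codes among the k most similar coded examples.

    Each neighbor votes for its code, weighted by its similarity.
    """
    query = embed(raw_property)
    scored = [(cosine(query, embed(text)), code) for text, code in coded_examples]
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = Counter()
    for similarity, code in neighbors:
        votes[code] += similarity
    return [code for code, _ in votes.most_common(top_n)]


def top1_accuracy(test_pairs, coded_examples, k=3):
    """Fraction of test properties whose first suggestion is the true code."""
    hits = sum(
        knn_top_codes(text, coded_examples, k)[0] == code
        for text, code in test_pairs
    )
    return hits / len(test_pairs)


# Hypothetical coded properties: (raw property, standardized code).
coded_examples = [
    ("has four legs", "has_legs"),
    ("it has legs", "has_legs"),
    ("is furry", "has_fur"),
    ("covered in fur", "has_fur"),
    ("can bark", "barks"),
    ("it barks loudly", "barks"),
]

if __name__ == "__main__":
    print(knn_top_codes("Has 4 legs!", coded_examples))
    print(top1_accuracy([("barks a lot", "barks")], coded_examples))
```

Calling `knn_top_codes` with `top_n` greater than 1 returns a ranked shortlist rather than a single label, which is one way to picture the recommender-system role for human coders mentioned as future work.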

Original language: English
Article number: 302
Journal: Behavior Research Methods
Volume: 57
Issue number: 11
State: Published - Nov 2025

Keywords

  • Ablation study
  • Codification process
  • Large language models
  • Machine learning framework
  • Property listing task
  • Semantic memory
