Beyond Templates and BERT: Headword-centric parsing for semantic question answering in non-English financial domains

  • Jamal Al Qundus
  • Bassam Al-Shargabi*
  • Mario Graff-Guerrero (Editor)

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Despite recent advances, semantic question-answering (QA) systems still struggle with linguistic variability, particularly in non-English domains such as German finance. This work presents INAGQA, a novel QA system that addresses this gap through headword-centric parsing, combining syntactic chunking with knowledge graph embeddings to resolve question ambiguity. The main innovations are as follows. First, a hybrid disambiguation algorithm that achieves an F1 score of 0.91 on German financial queries, validated on 2,100 expert-annotated questions. Second, domain-optimized shallow parsing with customizable grammar rules that reduces relation-linking errors by 35% for compound nouns (e.g., Eigenkapitalrendite). Third, seamless knowledge integration that prioritizes user-curated data and demonstrates a 2.1 s average response time in a case study with financial analysts. Our experiments show that INAGQA outperforms BERT-KGQA (F1: 0.83) and template-based systems (F1: 0.79) while handling temporal/quantitative variants (e.g., When vs. Where was X founded?) with 98% accuracy. The system's editable outputs align with the Corporate Smart Insights frameworks, offering practical value for SMEs. The work thus contributes to Information Systems (IS) research by proposing headword extraction as a replicable IS artifact for non-English QA and by demonstrating language-sensitive design principles applicable to healthcare and legal domains.
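
The abstract does not spell out how headword-centric parsing works internally, so the following is only an illustrative sketch of the general idea, not INAGQA's implementation. It assumes the question is already part-of-speech tagged, groups determiner/adjective/noun runs into noun-phrase chunks, takes the last noun of a chunk as its headword, and splits a German compound by relying on the fact that the head of a German compound is its rightmost constituent. The tag names, the toy lexicon, and all function names are hypothetical.

```python
# Illustrative sketch only: tags, lexicon and function names are assumptions,
# not taken from the paper.

# Hypothetical toy lexicon of German financial nouns (lower-cased).
LEXICON = {"eigenkapital", "rendite", "umsatz", "wachstum"}

# Tags that may appear inside a noun-phrase chunk in this toy grammar.
NP_TAGS = {"DET", "ADJ", "NOUN"}


def noun_chunks(tagged):
    """Group consecutive DET/ADJ/NOUN tokens into chunks that contain a noun."""
    chunks, current = [], []
    for token, tag in tagged:
        if tag in NP_TAGS:
            current.append((token, tag))
            continue
        # A non-NP tag closes the current run; keep it only if it has a noun.
        if any(t == "NOUN" for _, t in current):
            chunks.append(current)
        current = []
    if any(t == "NOUN" for _, t in current):
        chunks.append(current)
    return chunks


def headword(chunk):
    """Take the last noun of the chunk as its headword."""
    nouns = [token for token, tag in chunk if tag == "NOUN"]
    return nouns[-1]


def split_compound(word, lexicon=LEXICON):
    """Return (modifier, head) for the first split whose parts are both known.

    Falls back to (None, word) when no such split exists.
    """
    lower = word.lower()
    for i in range(1, len(lower)):
        modifier, head = lower[:i], lower[i:]
        if modifier in lexicon and head in lexicon:
            return modifier, head
    return None, lower


# "Wie hoch ist die Eigenkapitalrendite von Siemens?" with hand-assigned tags.
question = [
    ("Wie", "ADV"), ("hoch", "ADJ"), ("ist", "VERB"), ("die", "DET"),
    ("Eigenkapitalrendite", "NOUN"), ("von", "ADP"), ("Siemens", "PROPN"),
    ("?", "PUNCT"),
]

chunks = noun_chunks(question)
head = headword(chunks[0])
modifier, compound_head = split_compound(head)
```

In this sketch the compound head (here the part meaning "return") is what a relation linker would match against knowledge-graph properties, while the modifier narrows the candidate set. A production system would replace the lexicon lookup with a proper compound splitter and the hand-written tags with a tagger.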
Original language: English
Journal: PLoS ONE
Volume: 21
Issue number: 5
Early online date: 4 May 2026
DOIs
Publication status: Published - 4 May 2026

Keywords

  • Algorithms
  • Humans
  • Language
  • Semantics
