Abstract
Question-answering systems make extensive use of knowledge bases (KBs, e.g., Wikipedia) when responding to definition queries. Typically, a system extracts facts relevant to the question from articles across KBs and then projects them onto the candidate answers. However, studies have shown that the performance of this kind of method drops sharply whenever the KBs provide only narrow coverage. This work describes a new approach to this problem that constructs context models for scoring candidate answers; more precisely, these are statistical n-gram language models inferred from lexicalized dependency paths extracted from Wikipedia abstracts. Unlike state-of-the-art approaches, the context models are built to capture the semantics of candidate answers (e.g., "novel," "singer," "coach," and "city"). The work is further extended by investigating the impact on the context models of extra linguistic knowledge such as part-of-speech tagging and named-entity recognition. Results show the effectiveness of the context models and identify n-gram lexicalized dependency paths as promising context indicators for the presence of definitions in natural language texts.
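As a rough illustration of the scoring idea in the abstract, the sketch below trains a bigram language model over token sequences standing in for lexicalized dependency paths and uses it to score a candidate answer's context. This is a minimal sketch under assumptions, not the authors' implementation: the toy paths, the add-one smoothing, and the function names are all illustrative, and real dependency-path extraction from Wikipedia abstracts is left out.

```python
# Minimal sketch (not the paper's code): score a candidate answer's context
# with a bigram language model built over lexicalized dependency paths.
# The "paths" here are hypothetical token sequences standing in for paths
# mined from Wikipedia abstracts; a real system would obtain them from a
# dependency parser.

from collections import Counter
from math import log

def train_bigram_lm(paths):
    """Estimate a bigram LM with add-one smoothing from tokenized paths."""
    unigrams, bigrams = Counter(), Counter()
    for path in paths:
        tokens = ["<s>"] + path + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = len(unigrams)

    def logprob(path):
        # Sum of smoothed bigram log-probabilities; unigram counts serve as
        # a simplified estimate of the context frequency.
        tokens = ["<s>"] + path + ["</s>"]
        return sum(
            log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
            for a, b in zip(tokens, tokens[1:])
        )

    return logprob

# Hypothetical paths for the sense "city" (e.g., drawn from abstracts of
# city articles), used to build one sense-specific context model.
city_paths = [
    ["city", "nsubj", "is", "attr", "capital"],
    ["city", "nsubj", "located", "prep", "in"],
]
score_city = train_bigram_lm(city_paths)

# A higher log-probability suggests the candidate's context matches the
# "city" sense; comparing scores across sense models ranks candidates.
print(score_city(["city", "nsubj", "is", "attr", "capital"]))
```

In the paper's setting one such model would exist per candidate-answer sense, with extra linguistic signals such as part-of-speech tags or named-entity labels optionally folded into the path tokens.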
| Original language | English |
|---|---|
| Pages (from-to) | 528-548 |
| Number of pages | 21 |
| Journal | Computational Intelligence |
| Volume | 28 |
| Issue number | 4 |
| DOIs | |
| State | Published - Nov 2012 |
| Externally published | Yes |
Keywords
- context definition models
- definition questions
- feature analysis
- lexicalized dependency paths
- question answering
- statistical language models