Investigating semantic knowledge for text learning
Published in SIGIR-03, 2003
Recent work has made much of using semantic knowledge, derived in particular from domain ontologies, to improve text learning tasks. Semantic knowledge is assumed to capture more in-depth knowledge of the text domain than conventional statistics-based methods, which can rely only on surface, vocabulary-specific characteristics of a data set. It is therefore claimed that using semantic knowledge instead of statistics-based methods will improve performance in text learning tasks significantly. We believe that this claim needs careful scrutiny, and in this paper we examine the validity of the underlying assumption. We explore the usefulness of ontologies for a text classification task, and the use of feature selection methods to extract terms that can function as candidate ontological concepts for building or extending ontologies. We point to a number of issues that arise when trying to use semantic knowledge for text classification. One particularly troublesome issue is that the semantic knowledge encoded in ontologies may simply not correspond to the concepts and terms that are significant for text classification.
Anupriya Ankolekar, Young-Woo Seo, and Katia Sycara, Investigating semantic knowledge for text learning, In Proceedings of the ACM SIGIR Workshop on Semantic Web, pp. 9-17, 2003.