Improving Internet search tools by means of LSA technology.
Nkukwana, S. & Weideman, M.
Poster in Proceedings of the 8th annual Conference on WWW Applications. 5-8 September. Bloemfontein, South Africa.
Nkukwana, S. & Weideman, M. 2006. Improving Internet search tools by means of LSA technology. Poster in Proceedings of the 8th annual Conference on WWW Applications. 5-8 September. Bloemfontein, South Africa. Online: http://web-visibility.co.za/website-visibility-digital-library-seo/
The principal objective of this research project is to investigate the use of Latent Semantic Analysis (LSA) as a mechanism for improving the quality of Internet search results. Specifically, the aim is to improve the quality of search engine results for queries about accommodation in South Africa, using the Ananzi search engine as the point of comparison.
LSA is a theory and a method for extracting and representing the contextual meaning of words by statistical computations applied to a large corpus of text. It analyses word-word, word-passage, and passage-passage relationships, which makes it feasible to compare words with paragraphs, paragraphs with paragraphs, and paragraphs with documents in order to judge relevance. Most existing search engines base their information retrieval purely on keyword matching. Results are therefore retrieved because they contain the query keywords, regardless of what those keywords mean in the context of the documents retrieved.
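The contrast between keyword matching and LSA can be illustrated with a minimal Python sketch. It is not the prototype built in this project: the five-page toy corpus, the function names, and the choice of k = 2 latent dimensions are all assumptions made for illustration. The sketch builds a term-document count matrix, reduces it with a truncated singular value decomposition (NumPy's `numpy.linalg.svd`), folds the query into the reduced space, and scores each page by cosine similarity.

```python
import numpy as np

# Hypothetical toy corpus; each string stands in for one web page.
docs = [
    "hotel accommodation cape town",
    "guest house accommodation durban",
    "hotel lodging cape town",
    "car rental johannesburg",
    "car hire johannesburg airport",
]

def keyword_scores(query, docs):
    """Baseline: count the query words that literally occur in each page."""
    q = set(query.split())
    return [len(q & set(d.split())) for d in docs]

def build_term_doc_matrix(docs):
    """Rows are terms, columns are pages, entries are raw term counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A, index

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def lsa_scores(query, docs, k=2):
    """Score each page against the query in a k-dimensional latent space."""
    A, index = build_term_doc_matrix(docs)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]  # keep the k largest factors
    # Fold the query into the latent space: q_hat = q^T U_k S_k^{-1}
    q = np.zeros(A.shape[0])
    for w in query.split():
        if w in index:
            q[index[w]] += 1.0
    q_hat = (q @ U_k) / s_k
    # Column j of Vt_k is the latent representation of page j.
    return [cosine(q_hat, Vt_k[:, j]) for j in range(len(docs))]

if __name__ == "__main__":
    print("keyword:", keyword_scores("lodging", docs))
    print("LSA:    ", [round(x, 2) for x in lsa_scores("lodging", docs)])
```

In this toy corpus only one page contains the word "lodging", so the keyword baseline matches that page alone. Because "lodging" co-occurs with "hotel", "cape" and "town", the LSA scores should also rank the other accommodation pages well above the car-hire pages, even though they never mention the query word.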
The methodology is to design a working prototype of an LSA search tool and to compare its results with those of Ananzi for a web search on holiday accommodation in South Africa. The first five results of each search will be evaluated, and the outcome plotted as a line graph to show which tool is the more relevant and effective.
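One way the evaluation of the first five results could be scored is precision at five: the fraction of the top five results judged relevant. The abstract does not name the measure used, so this choice is an assumption, and the result lists and relevance judgements below are hypothetical placeholders rather than data from the study.

```python
def precision_at_k(results, relevant, k=5):
    """Fraction of the first k results that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = results[:k]
    return sum(1 for r in top if r in relevant) / k

# Hypothetical ranked result lists from two search tools for one query.
tool_a_results = ["page_a", "page_b", "page_c", "page_d", "page_e", "page_f"]
tool_b_results = ["page_c", "page_a", "page_f", "page_g", "page_b", "page_d"]

# Hypothetical set of pages a human judge marked as relevant.
relevant_pages = {"page_a", "page_c", "page_f"}

if __name__ == "__main__":
    print("tool A:", precision_at_k(tool_a_results, relevant_pages))
    print("tool B:", precision_at_k(tool_b_results, relevant_pages))
```

Computing this value for each query and each tool would give the points needed for a line graph comparing the two.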
In conclusion, the results could indicate that LSA has value in overcoming the disadvantages of keyword search, which lead various search engines to produce poor answers. LSA can be used wherever a semantic, text-based search is required, that is, one that returns results relevant to the meaning of the search query. Such a capability can be integrated with many existing search engines and can help improve the standard of search algorithms.