Search engine exclusion policies: Implications on indexing E-commerce websites.

Mbikiwa, F.N.

Published Online

Mbikiwa, F.N. 2005. Search engine exclusion policies: Implications on indexing E-commerce websites. Unpublished Masters Thesis, Cape Peninsula University of Technology, Cape Town.

ABSTRACT
The aim of this research was to determine how search engine exclusion policies and spam affect the indexing of e-commerce websites. The Internet has brought along new ways of doing business. The unexpected growth of the World Wide Web made it essential for firms to adopt e-commerce as a means of obtaining a competitive edge. The introduction of e-commerce in turn helped break down the physical barriers that were evident in traditional business operations. It is important for e-commerce websites to attract visitors, since without visitors the website content is irrelevant. Websites can be accessed through search engines, and it is estimated that 88% of users start with search engines when completing tasks on the web. As a result, web designers aim to have their websites appear among the top ten search engine results, as high placement in search engine results is one of the strongest contributors to a commercial website's success. To achieve such high rankings, web designers often adopt Search Engine Optimization (SEO) practices. Some of these practices result in undeserving websites achieving top rankings. It is not clear how these SEO practices are viewed by search engines, as some practices that are deemed unacceptable by certain search engines are accepted by others. Furthermore, there are no clear standards for assessing what is considered good or bad SEO practice. This leaves web designers unsure of what constitutes spam, and the amount of search engine spam has consequently increased over time, adversely affecting search engine results.

From the literature reviewed in this thesis, as well as the policies of five top search engines (Google, Yahoo!, AskJeeves, AltaVista, and Ananzi), the author compiled a list of what is generally considered spam. Furthermore, 47 e-commerce websites were analysed to determine whether they contained any form of spam. The five major search engines indexed some of these websites, which enabled the author to determine to what extent search engines adhere to their policies. The analysis returned two major findings. First, a small number of websites contained spam: 21.3% of the websites analysed. Second, of the pre-compiled list of spam tactics, only two were identified in the websites, namely keyword stuffing and page redirects. From these findings, the research concluded that search engines adhere to their own policies but lack stringent controls, as the majority of websites that contained spam were still listed by search engines. Because the author analysed only e-commerce websites, the results cannot be generalised to websites outside e-commerce.
Full text of Thesis No 0089: Search engine exclusion policies: Implications on indexing E-commerce websites.
