Internet Search Mechanisms And Distortions Of The Semantic Space: The Scientific Challenges Facing The “Googles”
Price
Free (open access)
Volume
38
Pages
10
Published
2007
Size
683 kb
Paper DOI
10.2495/DATA070171
Copyright
WIT Press
Author(s)
A. Linhares & C. W. Afonso
Abstract
Ever since the launch of Altavista, internet search engines have become a multi-billion dollar industry, with fierce competition between Google and its three major competitors. One of the challenges involved is to rank search results so that the most meaningful results appear at the top. To do this, the algorithms involved must try to grasp the actual meaning, the semantics, embedded in a search query. In this paper we discuss a problem we call “distortions of semantic space”. Distortions of semantic space occur regularly in people’s texts, writing styles, labeling of images, etc. We present a number of examples of distortions of semantic space and analyze the problem. We also comment on new computational architectures that have tried to handle this problem, although the state of the art remains far from meeting the challenge.

1 Introduction

Google defines its mission as “to organize the world’s information.” Since its launch in 1998, it has achieved enormous financial and marketing success, owing to its superior technology for ranking and indexing data on the Internet. It is now possible to carry out searches in 100 different languages with Google, and in 2005 the company reached the mark of a billion searches per day (Friedman [1]). To sustain this leading strategic position, however, the company faces enormous scientific obstacles: as the types of information available on the web change, new technologies must be able to organize them in an agile form for all to access.
Keywords
search mechanisms, distortions of the semantic space, Google, literal search, new contents of the internet, semantic web.