
Google with Global Search by Topic

I) Summary
"Idea for Google" is a revolutionary idea which, if implemented, would make Internet search a rich, modern and interactive experience. In short, the idea is to add browsing by topic to Google search. The classification of the Internet would be done by the owners of the content themselves, a task that takes only a few minutes. Choosing words and browsing by subject at the same time results in a far more powerful and effective search. It is increasingly difficult to find something on Google, because the amount of manipulation and irrelevance has grown greatly in recent years. For example, when someone searches for a camera, they may want to buy it, read specialized reviews, look for help, see photos, etc. The words used in the search cannot express this well. The basic idea is very simple. Google creates a classification tree organized by topic. Google builds an application that allows owners to classify their content (sites, blogs, social networks) in minutes, inserting a special code for security reasons. The owner uploads their content with this code, and Google then records the classification on its side. The user searches with words, as usual, but can now refine the search by clicking successively on the desired topics, or vice versa.

This idea is totally different from the old directory search, since the thematic classification of the content will not be made by a limited staff or by volunteers, but by everyone who has a site, blog or social network and does not want to be invisible. If Google adopts this idea, the concept would quickly spread throughout the world. The concept of interactive search by subject, mixed with Google's word search, would give a whole new dimension to Internet search. Getting what one wants would become much easier, faster and more interactive. The world would greatly benefit, because the Internet would become a much more manageable chaos, where proper information is far more accessible to everyone. Google would greatly benefit because it would create a new type of ad revenue (ad-subject) that sells not just words, but subjects. I am not looking to make any money off of this idea; my utopian dream is for it to become a reality, because it would be very good for a world that is starving for accurate information and suffering from indigestion due to the excess of irrelevant information.

II) Google Search Engine Problems


With the recent explosion in the number of websites, search engines such as Google return an ever-increasing number of hits. The problem is that the majority of the top-ranked websites end up being useless for the objective of the person seeking relevant information.

One side of irrelevance, which Google handles, is the question of website importance, partially determined by the nature of its ranking. In its algorithm, Google considers, in a weighted manner, an index of the websites that link to the website in question. The other side of irrelevance is more subtle: beyond the words included in the search, there is the issue of context, which is known only to the individual performing the search, and which Google has no way of guessing. For example, when the user types Canon A470, he could want to:
- Buy the camera in a store
- Find the lowest price
- See reviews written by experts
- Read readers' opinions about the camera
- Understand how the camera works
- Etc.

These purposes cannot be properly resolved simply by adding more words to the search, since, for example, e-commerce results may include websites with a review section (most often with no review posted), or product advice that is not expert advice, etc. However, in the life cycle of a camera there is only one moment in which a person is buying it; in all the other moments its owner is interacting with the camera, not buying it. Continuously trying to sell a camera to someone who is not looking to buy one is like selling movie tickets to someone already watching the movie. Furthermore, for most people, thinking of additional words to refine a search is a difficult process. Very few people have the ability, experience and persistence to succeed regularly at complex searches. In many cases, it is simply not possible to contextualize a search precisely and quickly using words alone. The consequence is that many users give up and end up looking at one of their favorite websites, postponing the search, or even taking a different course of action. The irrelevance increases even more with the use of SEO (Search Engine Optimization) techniques, which tend to improve the rank of websites that have the resources and desire to apply them.

III) Proposal: Add topic to word search


Contrary to what it may seem at first, this idea does not repeat anything that has existed before. The Google Directory was discontinued in July 2011 because the directory model based on volunteer work, created in 1998, is doomed to fail: it cannot maintain a standard of quality, consistency and continuity for its users when there are already hundreds of millions of content items, including blogs and social networks. Today this directory (the Open Directory) can be accessed at www.dmoz.org. Another model is a directory maintained by the search engine's own staff. This model is what made the reputation of Yahoo! starting in 1994. However, it is too expensive a mechanism to guarantee a minimum quality standard without the cost of constant updates becoming prohibitive. The Yahoo! directory exists to this day, but it has been absent from Yahoo!'s homepage since 2002.

The proposal here is a search engine using words, like Google, combined with a web directory or directory search. However, it is totally different from previous efforts, because the classification will be maintained not by a small group of people but by millions of content owners. This proposal can be divided into several steps:

1. Google creates a Classification Tree
Google must create a classification tree for websites in the style of the Google Directory. This topic or theme tree can, if desired, be somewhat simpler, in order to facilitate classification for those responsible for each website. The major themes appear at the first level (e-commerce, references, education, etc.). Each theme opens up into sub-themes, and so forth, until reaching the leaf classification, assumed to be the most detailed level of classification desired. The tree does not need to have the same number of levels in each of its branches. Replication of a final classification at different points of the tree is accepted, as already happens in the Google Directory and the like.

2. Google develops a Classification Application
Google should develop a simple application that allows the person responsible for a website or content item to classify it under one theme from the classification tree (for example, Dictionary within Reference => Languages). If a site addresses more than one theme in different sections, additional registrations may be made, provided there is some separation of HTML addresses (sub-sites). This happens, for example, in the case of a blog located on a generic domain.
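The classification tree described in step 1 can be sketched as a simple nested mapping. This is a minimal illustration in Python; the topic names and the validation helper are my own assumptions, not part of the proposal.

```python
# Minimal sketch of a classification tree with branches of unequal depth.
# All topic names are illustrative only.
CLASSIFICATION_TREE = {
    "E-commerce": {
        "Online Stores": {"Electronics": {}, "Books": {}},
        "Price Comparison": {},
    },
    "Reference": {
        "Languages": {"Dictionary": {}},  # e.g. Dictionary within Reference => Languages
        "Encyclopedias": {},
    },
    "Education": {},
}

def path_exists(tree: dict, path: list) -> bool:
    """Return True if the classification path is a valid branch of the tree."""
    node = tree
    for topic in path:
        if topic not in node:
            return False
        node = node[topic]
    return True
```

Because branches need not all reach the same depth, a site can be classified at whatever level is most specific for it.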

This process is technically done by incorporating a specific meta tag, provided by Google, that uniquely identifies the site or sub-site being classified. To prevent the information from being cloned by a competitor (such as Bing), the meta tag is coded and does not contain the theme, which is stored on Google's servers along with the classification assigned by the user. This procedure practically guarantees the authenticity of the classification, because it must be done by someone who has the authority to update the site or sub-site. If the meta tag is later removed from the content, there is no problem, because the certification of authenticity already happened upon registration. Only the person responsible for the content will be able to change the classification of the website.

3. Website is classified using the tree
Once the user runs this application and uploads the updated website, the website becomes available for automatic topic classification, performed during the next round of page indexing done by Google. Thus, every time Google re-indexes the websites, it also re-indexes the theme linked to each website's classification, using its internal database.
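The coded meta tag described above could plausibly be implemented as an opaque keyed hash that identifies the site but reveals nothing about its theme, which stays server-side. The sketch below shows one possible scheme; the function names, the HMAC construction and the tag name are assumptions, not a specification.

```python
import hashlib
import hmac

def issue_site_token(site_url: str, secret: bytes) -> str:
    """Opaque token bound to the site; it carries no topic information."""
    return hmac.new(secret, site_url.encode(), hashlib.sha256).hexdigest()

def verify_site_token(site_url: str, token: str, secret: bytes) -> bool:
    """Server-side check that the token found on the page was issued for this site."""
    return hmac.compare_digest(issue_site_token(site_url, secret), token)

def meta_tag(site_url: str, secret: bytes) -> str:
    """The tag the owner pastes into the page; the theme itself stays server-side."""
    return f'<meta name="topic-verification" content="{issue_site_token(site_url, secret)}">'
```

Because the token is derived from a secret held only by the search engine, a competitor crawling the page learns nothing about the classification, and only someone able to edit the page can prove ownership.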

4. User benefits from the Classification
When the user performs any search on Google, it returns the results it has always returned; until then, nothing has changed. However, there is a major difference: Google would place on the left, in an organized and non-intrusive fashion, links that correspond to the first level of the classification by themes. Below is an example in which some possible first levels are listed, simply as an example, in the form of links. In reality there would be more links. I am not suggesting what the links should be; that is irrelevant to grasping the idea.

If the user clicks on Products, he drops a level in the classification tree and the websites change accordingly:

This is the great innovation. The user now has access to websites that not only contain the desired words but also relate to the topics he would like to see. Thus, the results are much more useful than the initial page, which also lists hits from blogs, news, chronicles and the like that simply pollute the results. All with a single click. Note that the second view shows the sub-branches of Products, while the branch Products is displayed at the top of the frame or table as a link, allowing the user to return to the main topic with just one click. Thus, the user can refine the search either by using the classification tree, quickly reducing the number of websites involved, or by adding new words to the previous search, as can already be done. With another mouse click the user has access to, for example, the various Reviews websites, all together. Overall, in addition to typing the words "Canon", "Rebel" and "EOS", he performs just two clicks.
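Refining by successive clicks amounts to filtering the word-search results down to those whose classification path begins with the topics clicked so far. A minimal sketch, with hypothetical data:

```python
def refine(results: list, topic_prefix: list) -> list:
    """Keep only hits whose classification path begins with the clicked topics."""
    n = len(topic_prefix)
    return [r for r in results if r["topics"][:n] == topic_prefix]

# Hypothetical hits for a word search such as "Canon Rebel EOS".
hits = [
    {"url": "shop.example/canon-eos", "topics": ["Products", "Online Stores", "Cameras"]},
    {"url": "blog.example/canon-post", "topics": ["Blogs", "Photography"]},
    {"url": "reviews.example/eos", "topics": ["Products", "Reviews", "Cameras"]},
]

# First click on Products: the blog hit is dropped.
products = refine(hits, ["Products"])
# Second click, Products > Reviews: only review sites remain.
reviews = refine(hits, ["Products", "Reviews"])
```

Each click narrows the prefix by one level, while going back up the tree simply shortens the prefix again.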

IV) Details of Implementation


If the user does not classify their website, it remains at the first level, called Not Classified. If the user classifies their website incorrectly by accident, they can reclassify it; they need only wait a few days for Google to perform the next round of indexing to have it corrected in Google's classification tree. If the owner intentionally classifies the site incorrectly, users can report the classification error. To reduce fraud, the complaint form should be robot-proof, for example by using the CAPTCHA technique. The number of complaints should be counted over distinct IPs or, even better, over distinct fixed IPs, that is, using a fixed node linked to the dynamic IP of each user. This entire process serves to minimize fraud. The index to be calculated, within a prescribed window of time, is:

Complaint Index = (Number of fixed IPs that filed complaints) / (Total number of fixed IPs that accessed the website via Google)

If this index passes a threshold percentage, the website automatically changes to Not Classified, and the owner receives an automatic e-mail from Google. The website remains Not Classified until the owner reclassifies it.
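The complaint mechanism above can be sketched in a few lines. The 5% threshold is purely illustrative, since the proposal does not fix a value:

```python
def complaint_index(complaining_ips: set, visiting_ips: set) -> float:
    """Complaint Index = fixed IPs that filed complaints / fixed IPs that
    accessed the website via the search engine, over a prescribed window."""
    if not visiting_ips:
        return 0.0
    return len(complaining_ips & visiting_ips) / len(visiting_ips)

def should_unclassify(complaining_ips: set, visiting_ips: set,
                      threshold: float = 0.05) -> bool:
    # The 5% threshold is an illustrative assumption, not a value from the text.
    return complaint_index(complaining_ips, visiting_ips) > threshold
```

Counting only IPs that actually reached the site through the engine keeps a burst of complaints from a handful of addresses from unclassifying a high-traffic site.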

A second invalid classification, reported in the manner described above, generates a quarantine period in which the website remains Not Classified, regardless of reclassification. If, despite all of the precautions, the complaint itself is fraudulent, the website, if it has a large audience, may be singled out for auditing by Google. In that case, the Google team will definitively classify the website and send an e-mail to the person responsible for it, who will have a certain amount of time to accept the website's reclassification or suggest a proper classification, which will be subject to review.

The details of the operation described above follow just one possible line of thought; another approach, more or less automated, could be taken. The important thing is to have a process that requires almost no manual work by Google, while using the help of other users to progressively improve the quality of the classification.

V) Advantages for Google


By adopting this idea, Google would practically become an eternal icon as the gateway to the Internet, because it would become almost insurmountable: a sophisticated and comprehensive search by words, coupled with a topic-based relationship to all of the planet's relevant websites. Interestingly, Google, due to its universality, is the best-placed company to get the people responsible for websites to adopt this classification procedure. For Bing, it would be a somewhat slower process, although still worthwhile for the differentiation it would create. This idea adds great value to the Google search engine without centrally impacting its search and ranking algorithm; it requires changes, but only alongside the current algorithm.

This idea has great commercial potential for Google, because it gives the advertiser a much more contextualized and efficient device, so advertisements tend to receive more clicks. Google would be able to sell subjects (nicknamed here ad-subject) and not just words (ad-words). As users succeed more often in finding what they want, Google's use is encouraged, which ends up increasing the total number of clicks.

Only the person responsible for updating the website can insert, through the Google application, the proper classification for it, which eliminates practically all of the manual work by Google normally associated with classification engines. Note that, through this process, only a hacker would be able to change a website's classification. But this is not a problem: if a hacker can change the classification, he can also change the entire website, which is much more serious. No other process of classifying websites into topics is as secure as this one. Google can also use this mechanism (meta tags) for other purposes yet to be imagined. Many service providers, including Google, eBay, etc., already use this feature.
However, in this context, it involves virtually all relevant web content. Technical note: some sites use meta tags to validate a site or content action as legitimate; others use a text file with a special code in the root of the content. This last method poses difficulties for blogs and social networks, because these generally do not give the user access to the root directory on the server. To prevent comments on blogs or social networks from posting a classification code to defraud the classification, Google can use any hallmark present in the source code to separate the author's content from the comments. Moreover, this can be easily settled with Facebook, LinkedIn, etc.

VI) Benefits to the Community


- It is much easier for the user to click successively than to keep trying to find the perfect words to narrow in on what he wants. The search requires less experience, less ability and less persistence, which in practice means users will find more high-quality results.
- It greatly increases the Internet's relevance through Google. Naturally, abandoned or semi-abandoned websites will have no one to classify them, making the Internet much cleaner as seen via Google. Even irrelevant sites that do get classified are likely to fall into irrelevant topics.
- Interestingly, minor topics will not be forgotten. Quite the contrary: there are user niches that will be able to focus on their points of interest with much more ease. That is, this idea helps enable the long tail of subjects on the Internet.
- Even if nothing is done, all the results of previous searches remain in the initial search, since every website starts out under the subject Not Classified, the initial topic of a non-classified website.
- Since all the trees are the same, Google can present a given classification in any language desired, and optionally suggest a translation of the search words.
- It allows for an alternative view of the directory that requires no manual work and is much more powerful than the current directories built with volunteer collaboration.
- If the website is minimally relevant, there is nobody better than the person responsible for it to say what it means, from the point of view of Google's classification tree. The owner of the website, who is generally interested in the site being seen by those to whom it is relevant, guarantees the quality of the classification. If the site has a good audience, its owner will not risk being reported by other users, or being subjected to auditing, for manipulating the topic classification.
Finally, this approach helps provide the world with more security. Whoever anonymously publishes highly controversial content, such as terrorism or racism, will have to think twice before classifying it by topic, because doing so exposes the person responsible for the website to Google, to the extent that Google requires. Google can even offer users the option to filter out sites without topics.

VII) Parallel Trees


Based on this mechanism, Google may decide to build other parallel trees with new kinds of classification. As many things in life revolve around business, there is the very interesting possibility of creating a parallel tree, in the same manner as the topic tree, specifically for all of the sites that provide a product or service, paid or free: sales, specialized reviews, price comparison, etc. Google would benefit by gaining a powerful commercial tool, while the user would have an even more powerful tool for filtering information. Google could sell Ad-What, allowing a site to be classified under the things that it sells.

This is quite interesting for websites, because it is very common for users to be interested in some product or service without having a clear idea of how to proceed. For example, if the user wants to find all of the websites that sell cameras, he doesn't need to choose any words. He just clicks on Products: Online Stores in one tree and, with another click, on End Consumer: Electronics: Cameras. Once this has been done, only the sites that sell cameras will appear.

IX) Thematic Search with Social Networks


Detailed here are possible extensions for social networks, which can be directly or indirectly associated with the idea of implementing automatic search by topics.

1. Social network is not a topic, but a category
The idea here is to create a search category for social networks in Google (including MySpace, Orkut, Twitter, Quora, etc.). Today, the only categories that exist are videos, blogs, news, books, etc. This idea is independent of the issue of assigning a subject to websites, but related, because it is another dimension of filtering, one that already exists. It creates a new dimension in the search for information from social networks, allowing a unified way to search for information in user profiles, using names, cities, etc.

2. Social network account can be themed
Here I suggest allowing the social network account itself, if related to a specific theme, to be registered under a Google topic, by posting the code that Google provides as proof of authenticity, either manually or automatically. This suggestion enriches the search by topics, because it includes the vast range of social network users. A non-classified social network account can be located under a topic called Personal Life or something similar. The current social search in Google has a different meaning and is not a category. To keep the proposed idea compatible with Google's concept, a Friends filter can be created, so that the search is limited to the user's network of contacts. This approach brings a new importance to themed social networks because, along with the previous idea, it allows an extremely focused search within social networks.

3. Social popularity of a site
The idea is to allow the website, blog or social network account, upon entering its topic classification, to post to some associated social networks a standard message, with a Google logo, quoting the website and the subject under which it is classified (as many sites already do with Twitter and Facebook, when the user permits). The social network user, upon seeing the classification post, can repost it (retweet it, for example) as support for that specific site. This action would have to be performed through the Google cloud and certified by posting the code given by Google. None of this changes the fact that, regardless of the website's classification procedure, a social network user interacting online with Google can post a code provided by Google on their social network and give their approval to the website. This too can be reposted by their contacts, at more than one level.

The different social networks' approval of a website can be transformed into a global index, or into an index per social network, using some formula based on the number of contacts, visitors and other measures of support. This helps ensure that ghost accounts, which follow each other, are not included in the index calculation. This approval may be listed in the search results, directly or indirectly, and measures the site's social popularity. Eventually, it is even possible to use social popularity as part of Google's page rank. This is one more way to stand out from the rest. Websites will work hard to become socially popular and will be careful not to disappoint their supporters. They can also campaign to rally supporters in hopes of increasing their popularity. For an e-commerce website, for example, nothing prevents creating special deals to gain supporters. It cannot be something frivolous, or users may withdraw their support. On the other hand, websites linked to the sale of products or services that offer poor service can become victims of viral campaigns which, depending on the popularity and influence of the one complaining, may result in a loss of support for the site, including the option of officially registering one's dissatisfaction with a particular website, using the same process as above.

In short, I propose that a social network user's scale of approval of a website have three values: support, indifference or ignorance, and disapproval. This idea is much more general than current support mechanisms, such as the Like icon on Facebook or the "+1" from Google+, because it is not tied to any one specific social network and has an explicit disapproval (thumbs down) option, as the site StumbleUpon does. After all, the relationship is between any two pieces of content. Since a complete content-relevance algorithm already exists, expressed by the page ranking, it is possible to measure the relevance of an evaluation in a much more reliable, applicable and comprehensive manner. The presented idea is practically independent of the categorization.
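The text leaves the popularity formula open, suggesting only that it use the number of contacts, visitors and other measures of support, and that ghost accounts be excluded. One toy weighting, with every weight an assumption of mine, could look like this:

```python
import math

def popularity_index(votes: dict, contacts: dict) -> float:
    """votes maps user -> +1 (support), 0 (indifference) or -1 (disapproval);
    contacts maps user -> number of contacts. Each vote is weighted by the
    log of the voter's contact count (an assumed choice), so ghost accounts
    with zero contacts contribute nothing to the index."""
    score = 0.0
    weight_sum = 0.0
    for user, vote in votes.items():
        weight = math.log1p(contacts.get(user, 0))
        score += weight * vote
        weight_sum += weight
    return score / weight_sum if weight_sum else 0.0
```

The logarithm damps the influence of accounts with huge contact lists, while a user with zero contacts, the typical ghost account, is ignored entirely.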
The formal instant of categorization is only the beginning of a process that builds on itself, but it is not necessary for the idea of social popularity. The only common point is the idea of having something posted to serve as authentication for information that will remain on the Google servers. This philosophy does not overload the Google robot.

4. Posting links on a theme
The suggestion here is that Google create a brief and simple code that functions as a link and can be posted anywhere (blogs, social networks, question-and-answer sites, etc.). The idea is to merge words and themes into a link that leads directly to the search results. As searching by words and topics together is much cleaner and more precise, this type of link is extremely expressive, with very precise results. This concept is closely tied to the implementation of automatic search by topics, because, under the current scheme, such a link would not make much sense, due to the large number of irrelevant sites returned: when performing a search using only words, the sites that appear are almost always completely different from the objective of the person performing the search.
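One plausible encoding for such a link is an ordinary URL carrying both the words and the topic path as query parameters. The format, base address and parameter names below are assumptions for illustration only:

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE = "https://search.example/results"  # hypothetical endpoint

def themed_search_link(words: list, topic_path: list) -> str:
    """Encode search words plus a topic path into a postable link."""
    return BASE + "?" + urlencode({"q": " ".join(words),
                                   "topic": "/".join(topic_path)})

def parse_themed_link(url: str):
    """Recover the words and the topic path from such a link."""
    query = parse_qs(urlparse(url).query)
    return query["q"][0].split(), query["topic"][0].split("/")
```

Anyone clicking such a link would land directly on a result page already filtered by both words and topic, which is what makes it expressive enough to post.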

X) Conclusion
The title of a lecture given in 1972 by the American mathematician and meteorologist Edward Lorenz (1917-2008) was "Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?". The so-called Butterfly Effect became so popular that it was turned into a movie, and a good movie at that. If something so insignificant can have such dramatic consequences, what consequences might result from implementing the proposal in this article?

These days, anyone thinking of searching for information thinks of Google. Libraries, friends, everything became insignificant, in a sense. However, irrelevance is widespread. Accurate searches are for heavy users; they are beyond the reach of mere mortals. Thus, I can't even imagine how much people, companies and governments would gain, in time and quality, from a search engine that combines words and topics. Some examples:
- A company discovers information about a technology on the other side of the world that can revolutionize its production system.
- Thematic fragments inspire a writer to produce new musings.
- A traveler improves their itinerary and embarks on an even more memorable journey.
This list has no end. This idea can even contribute, indirectly, to saving lives and stimulating the evolution of science. How many times has a doctor or scientist discovered something almost by accident? History is rich with evidence of the power of chance! Imagine a doctor making a huge breakthrough from an approach suggested in an article that he would normally never have found. The fact that this idea is mine is just a detail. Certainly, the people who read this article, buy into this concept and help spread it will be a part of it. An idea like this is a lit fuse flying through the air. After all, an idea, like sound, does not travel in a vacuum: it needs communication. Lastly, an idea like this needs to be spread!
