
Referencing of Website

UBB - F. PRULHIERE

Summary
I. Introduction
II. Methodology
III. Finding information on the Web
IV. Positioning a website on search tools
V. Methodology for the positioning of a website
VI. Methodology for optimizing a web page
VII. How to miss the referencing of your website
VIII. Meta tags: work and implementation
IX. Popularity index
X. Referencing a website designed in Flash

I. Introduction
The referencing (SEO) of a website is a set of techniques that help search engines understand the theme and content of one or all pages of a website. These techniques are designed to give search engine crawlers as much information as possible about the contents of a web page.
The aim of this process is to guide the positioning of a web page in search engine results for keywords relevant to the main themes of the site. A site is generally considered well positioned when it is ranked among the first ten or twenty results of a search on keywords that precisely match its theme. Understanding how search engines and directories work is essential for good optimization.


II. Methodology
The methodology, to be followed in chronological order and refined with experience, has 12 points:
1. Understand how search engines and directories work.
2. Look at the listings in Romanian, French and English directories and search engines, and do a manual submission.
3. Optimize the web pages of your site by following a methodology based on the relevance criteria of the engines, and ensure that no blocking factor hinders the future ranking of your site.
4. Test your pages on an HTML code audit system, which will give you many indications on how best to define all areas of your pages according to the relevance criteria of search engines.


5. You can then use the checking and manual submission interface to reference your site on the major search tools. On these directories and search engines, always prefer a manual process, which is far more effective than all the software in the world.

6. If some major portals like Free, Orange, SFR, ... seem too broad relative to your target, check who provides them with their engine or directory and submit your site to that provider (for example dmoz).
7. Identify the engines and directories that are specific to your business (e-commerce, water sports, health, food ...) and submit your site to them as well. To find these thematic search tools, use websites that list them. If your site has a strong regional character, also submit it to regional search tools (Romanian, English, ...).

8. Every 3 to 4 weeks, check the presence of your site with the manual interface, because submitting a site is unfortunately not synonymous with being registered in the database of a search tool.

9. During the optimization and referencing phases, always follow the charter of quality and ethics enacted by IPEA-SMA (Internet Positioning European Association - Search engine Marketing Association) and validated at the time by the major French engines and directories. It remains generally valid today.
10. Search engines are also increasingly sensitive to the link popularity of sites (the number of links pointing to your pages from the Web and their "quality", i.e. the popularity of the pages that contain them). Calculate your own link popularity and increase it through a broad campaign of link exchanges with friendly sites (complementary but not competing).
11. Once your site has been submitted everywhere, carry out an SEO "revision" every quarter or every six months: make sure your site is present everywhere and, if not, repeat a submission phase through the manual interface.

12. Finally, this methodology has only covered referencing, i.e. the presence of your pages in the indexes of directories and engines. You can also go further by positioning your site on certain keywords or terms.

III. Finding information on the Web


1. What is a directory?
A directory is a search tool that lists a number of sites through listings that generally include a title, an address (URL) and a brief description of 15 to 25 words or fewer. Each site is placed in one or more categories, also referred to as sections. These tools can thus be seen as the yellow pages of the Web. When a keyword is entered in the search form, the directory looks for occurrences of the term in its site listings, and not in the content of the pages in question. This is the most notable difference from search engines. The most popular directories worldwide are Yahoo! International, Looksmart and Open Directory. In the French-speaking world, the most used are Yahoo! France, Nomade, Le Guide de Voilà, the Lycos France directory and the French Open Directory. But there are many others.


2. What is a search engine?


A search engine operates on a system radically different from that of a directory. Software robots (called crawlers or spiders) scan the Web, going from page to page (actually from link to link), and save the text content of the pages they encounter, thus building an "index", i.e. a larger or smaller collection of Web pages. Most of the time, the indexes of global search engines contain hundreds of millions of web pages.
The software robot returns more or less frequently to the pages it has already indexed to save a newer version. We say that it "refreshes" its database (or index). When the user enters a keyword in the search form, the engine looks the term up in its index, that is to say in the text content of the Web pages stored in advance. Once the set of pages that contain the requested term has been identified, the engine ranks the pages in order of relevance, using an algorithm (based on a set of sorting criteria) that is specific to it.
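The index lookup described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the page URLs and texts are invented, and ranking by number of occurrences is a stand-in for the engine's own (undisclosed) relevance algorithm.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page URLs whose saved text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, pages, term):
    """Return the URLs containing the term, most occurrences first.

    Counting occurrences is an assumed, simplistic relevance criterion.
    """
    term = term.lower()
    matches = index.get(term, set())
    return sorted(matches, key=lambda url: (-tokenize(pages[url]).count(term), url))

# Hypothetical crawled pages: URL -> text content saved by the robot.
pages = {
    "http://example.com/a": "Luxury shoes and shoe shopping",
    "http://example.com/b": "Shoes, shoes and more shoes at low prices",
    "http://example.com/c": "Train ticket sales",
}
index = build_index(pages)
results = search(index, pages, "shoes")
print(results)
```

Note that the lookup works on the stored page text, which is exactly what separates an engine from a directory that searches only its own listings.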

The search engine searches Web pages, while the directory suggests sites. This difference is why it is impossible to compare results from the two types of tools directly.
The best-known search engines are AltaVista, Google, AllTheWeb, MSN Search (Bing) and Yahoo Search. For French, Google France, Voilà and Exalead are some of the most effective.

3. Is a meta engine better than an engine?


Meta-search engines are tools that, for a single query, interrogate several engines simultaneously, retrieve the results and offer a summary of the responses. The idea is attractive ... However, these services do not let you use the advanced features of search engines, simply because those features vary greatly from one engine to another. Implementing them as part of a simultaneous search of multiple engines is far from easy, or simply impossible.

In addition, meta-engines merge the results of several different engines, each of which ranks its results in its own way, without using the same relevance criteria. Is it so simple to merge documents ranked in such disparate ways and, most importantly, is the result more relevant? Can we compare the results given by Yahoo! and by Google for the same keyword? The question remains ...
Using meta-engines creates another problem: almost all the search engines on which they rely are financed through the advertisements (banners, sponsored links) they display. However, the operators of this additional software layer do not always show these ads, sometimes even preferring to display their own. The use of meta-engines therefore substantially reduces the number of visits to the underlying search engine, which compromises its advertising revenue and may eventually sign its death warrant.


There is also an ethical problem: is it fair to use, for one's own benefit and without payment, the technologies and investments made by other companies? Although the concept of meta-search engines is interesting at first sight, it is quite certain that using a single engine, with all its advanced features and an effective methodology, is more fruitful.

4. When to use a directory and when to use a search engine?


In general, we use a directory like Yahoo Directory or the Open Directory to find a general site on a given topic, and a search engine like Google or Yahoo Search to search for a much more precise topic. To give a simple (even simplistic) example, we look for a company's website in a directory, but we look for information on its products (a search within sites, therefore) on search engines.


IV. Positioning a website on search tools


People ask many questions about the positioning of a page or a site on search engines and directories. Many specifications and calls for tender include a "positioning" clause in their referencing requests. To see things more clearly, here are some points about positioning. They may need to be revisited if indexing techniques change.

1. What is positioning?
In the field of SEO, positioning means ensuring that a website, or the pages that make it up, are well ranked in the results of search tools (directories, engines) for a given word or expression.


Example: you sell shoes. Positioning your site means ensuring that a page of your site appears in the top results of the major search engines, as close as possible to first place, when the user types queries such as: sale shoes, luxury shoes, shoes bottom price, shoe shopping, etc. The aim is to appear in the first thirty results, i.e. in the first three pages of results (many search tools show 10 results per page). The best, of course, is to be on the first page (the first 10 links) and, better yet, in first place!

2. Position a site on the first two pages of the major search engines
"Being on the first X pages of search tools" means nothing in itself. This positioning must always be related to keywords or phrases (like "new york", "train ticket sales", "buying a washing machine", etc.) that are defined at the outset.


It is therefore very important, when you are aiming for a position, to state which words, terms or phrases you are targeting.

Warning: every word, every letter is important. If you optimize a page of a site for the expression "sale of air tickets" and get a good ranking on a given engine for this sentence, the page may be ranked much worse if someone enters "sale ticket" or "Sale of tickets" on the same engine. Do you see the difference between the three queries?

3. The positioning is done on keywords


Positioning is usually done on keywords. But how should you choose them? One thing is certain: the more general the word, the more often it is typed by users of search engines, and the harder the positioning. The rarer the word, the easier the positioning, but the less often the word is entered on search engines. It is now, for example, practically impossible to achieve a good manual positioning (excluding advertising positioning) on words like car, books, CDs, computers, industry, commerce, MP3, Star Wars, Literature, etc.

So, the more specific the term is to your business, the better the position will be, provided your pages are optimized according to the sort criteria of search engines. On the other hand, few people will type it on engines. A site like Keyword Search Engine can help you choose by "auditing" the keywords you had in mind at first. You have to make sure the scale does not tip too far to one side or the other. But positioning is not (no longer?) a very effective way to measure the quality of an optimization. It is much better to check the overall traffic coming from search tools by analyzing the "referrer" URLs (the sites from which users come to your site) with statistical tools (most tools of this type give access to "referral traffic"), in order to measure overall traffic from the major search engines, all keywords combined. Excel tables showing the positioning of the site for a particular keyword on a particular engine or directory still exist. They are not useless! But used alone, they show only part of the truth.


In fact, regarding positioning, the question to ask is not only "in which position is my site on the engine engine1 for keyword key1?" but also "how much traffic is generated by the position of my site on the engine engine1 for keyword key1?". Is it useful to be first for a given expression on Google if it does not generate any traffic? This is why a statistical analysis tool has become indispensable for measuring the quality of an optimization.
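As a rough illustration of what such a statistical tool does with referrers, here is a minimal Python sketch that counts visits per engine and keyword. The host names, the sample referrer URLs and the assumption that the query travels in a "q" or "p" URL parameter are all hypothetical; real engines and log formats differ.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumption: the engine passes the typed query in a "q" or "p" URL parameter.
QUERY_PARAMS = ("q", "p")

def keyword_from_referrer(referrer):
    """Return (engine host, keyword) if the referrer carries a query, else None."""
    parsed = urlparse(referrer)
    params = parse_qs(parsed.query)
    for name in QUERY_PARAMS:
        if name in params:
            return parsed.netloc, params[name][0].lower()
    return None

def traffic_by_keyword(referrers):
    """Count visits per (engine host, keyword) pair."""
    counts = Counter()
    for referrer in referrers:
        hit = keyword_from_referrer(referrer)
        if hit is not None:
            counts[hit] += 1
    return counts

# Hypothetical referrer URLs taken from a site's access log.
referrers = [
    "http://www.engine1.example/search?q=luxury+shoes",
    "http://www.engine1.example/search?q=Luxury+shoes",
    "http://www.engine2.example/find?p=shoe+shopping",
    "http://www.partner.example/links.html",
]
counts = traffic_by_keyword(referrers)
for (engine, keyword), visits in counts.most_common():
    print(engine, keyword, visits)
```

A keyword on which the site ranks first but which never shows up in such a count is one that generates no traffic.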

4. I'm badly ranked in the Yahoo! France directory for the word "Machin"
First, we must be very clear on one point: if someone working in the field of SEO guarantees (excluding advertising positioning) a position in a directory like Yahoo! France, the Open Directory or LookSmart, you can be sure you have just met a crook! ... Positioning is a technique that applies only to search engines, not to directories!


To convince yourself, look at the referencing process on such tools: first you find the category or categories of interest, then you fill out a form describing the site: title, address, summary, chosen category or categories. A few days later, a directory editor (a "netsurfer") visits your site and decides whether or not to accept it. If the decision is positive, the netsurfer very often changes the proposed title and summary. You do not control the final description that will be displayed in the directory (and it is displayed only if the site is accepted!). If the word "Machin" appeared in the summary you proposed, the netsurfer may well remove it. He has the right to (and takes it anyway!). It is of course possible to ask a directory to amend a title or a summary, and a form is often available for this. However, the request must be justified. If its only purpose is to obtain a better ranking (and directory netsurfers have some experience of this), the request has no chance of being accepted. It may even make you look bad ...

5. Do I need a big budget on SEO and only SEO to promote my site?


This can be a big mistake. SEO is a necessary phase, but by no means a sufficient one, in the larger process of promoting your website.

To promote your site you should of course invest in good SEO, but also in a link exchange policy, banner ads, sponsorship, quizzes, press releases, etc.
And don't forget that the main generator of traffic to your site in the medium term is a good match between the audience you are targeting, the interests of those users and the content you offer! Do not ask more of referencing than it can deliver, or you will be disappointed!


6. The website is not found when doing a search by keyword.


a. Case for a search engine

This can happen if the keyword appears neither in the title nor in the visible text, even though it is present in the meta tags. Meta tags are hardly taken into account by most search engines, so the engines cannot find your site on that keyword.
b. Case for a search directory
Be careful: many people think that when submitting a site, including to search engines, they provide a number of words that will then serve as search terms for users. This is not the case! Directories (Yahoo Directory, Open Directory, Le Guide de Voilà) search the site listings (title, summary, address, category name) and not a list of keywords.


Engines (Google, AltaVista, Yahoo Search, MSN Search, Voilà ...) search the HTML of your pages: title, visible text, etc.

In any case, the keywords that you give to the search tool (which most often asks for them) are not used directly in later searches on that tool. Some directories require keywords, but this is rather to get an idea of the content of the site and to help netsurfers evaluate it. Nothing more. Once the site is accepted or rejected, these keywords are thrown away (virtually, of course). However, some directories, like Nomade and Le Guide de Voilà, take the provided keywords into account, select or modify them, and keep them as search criteria.


V. Methodology for the positioning of a Website


Referencing (inserting) a site on a search tool is one thing; positioning it is another. Positioning a site means that, for a given keyword or phrase (a sequence of words) that is important to your business, you try to place a page of your site in the first X results of search tools, where "X" is ideally 10 and at worst 30. Here is a methodology that will help you, or so we hope, to carry out this type of operation.


Before starting, it is necessary to understand that a positioning strategy is only possible on search engines. Since directories (Yahoo Directory, Nomade, Open Directory, ...) define for themselves the fields that later searches will cover (title, summary, category names), it is impossible to establish a real positioning methodology for them. And if someone says he has successfully positioned a site in 1st position for a keyword on the "directory" part of Yahoo, it is simply, in 99% of cases ... that he got lucky!
Second point: we only speak here about the position in the organic results of the engines, i.e. those coming from their index. We won't speak about positioning through sponsored links, which involves other criteria such as ... the thickness of the wallet. With these two points settled (and they are important!), we can begin the description of the methodology itself.


The first crucial point is to choose the right keywords. Indeed, it is useless to attempt positioning on keywords (or sequences of keywords) that nobody ever types on a search engine, since this generates no traffic! If your site has already been in operation for several months and is referenced on a number of engines and directories, look at your "referrer" statistics and identify the keywords that users have already entered to find the site. This will give you an initial idea. Otherwise, use a keyword generator to help you choose. Sites like Keyword Search Engine or Google AdWords can also give you valuable assistance. To determine whether a positioning is possible, go, for example, to Google. If a single term interests you, type it as is. If you want to position on a combination of keywords, type it in quotes: "electronic commerce", "sale appliances", "Airbus A380", etc. Then look at the number of results reported by Google. Below 20 000 results, positioning is possible. Beyond 200 000, there are too many results and positioning becomes very complex. Between these two numbers, you will need to "paddle", but there are some possibilities.

However, if you get 2 million hits, drop the term immediately and change keywords ... Again, the site Keyword Search Engine can help you.
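The rule of thumb above can be written down as a small Python sketch. The function name and the handling of the exact boundary values are assumptions made for illustration; the thresholds themselves are the ones quoted in the text.

```python
def positioning_difficulty(result_count):
    """Classify a keyword by the number of results an engine reports for it.

    Thresholds come from the rule of thumb in the text; how the exact
    boundary values are treated is an assumption of this sketch.
    """
    if result_count < 20_000:
        return "possible"
    if result_count <= 200_000:
        return "difficult"      # the zone where you need to "paddle"
    if result_count < 2_000_000:
        return "very complex"
    return "drop"               # change keywords

# Hypothetical result counts noted by hand for three candidate phrases.
candidates = {
    "mechanical engineering cluj": 12_500,
    "sale appliances": 150_000,
    "mp3": 25_000_000,
}
for phrase, count in candidates.items():
    print(phrase, "->", positioning_difficulty(count))
```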
You've chosen your keywords? You are sure that a user can potentially type them in an engine? They are not too specific? You can then move to the next phase ...

The best thing to do is to optimize the REAL home page of your site. And also all the other pages ...
Once the pages have been optimized, you can submit them to the engines.

When these pages are listed in the indexes of the engines, you need to check their positioning on the major engines. You can do this by hand or with a dedicated application that automates this work, like those in the list below.


Application Name         Corresponding URL
Agent Web Ranking        http://www.agentwebranking.com/
Promoteur                http://www.trellian.com/fr/swolf/
Search Engine Commando   http://www.searchenginecommando.com/
SeeYourRank              http://www.yooda.com/
SeoWebRanking            http://www.seowebranking.com/

This list is not exhaustive of course!


If your pages are poorly ranked, you must repeat the entire procedure: further optimization, rechecking, etc. Since the engines update their indexes fairly often, you do not need to re-submit the address of a page that has already been referenced. When you get a good ranking on an engine for one term or a series of keywords, the work is not finished. You must now check that this ranking generates traffic. Indeed, the aim of positioning is not to be in the top ten results of an engine, but to generate traffic on your site. Analyze your statistics with statistical tools (Weborama, eStat, Stats-Reports, XiTi, etc.) and look at the "referrers" (which sites do the users who "land" on your pages come from, and which keywords did they type to get there?). If you find that the keyword on which you rank first is in fact typed by almost no one, you have worked for nothing and you can go back to the first step of this methodology to define new terms ...


Finally, if you are well positioned on a keyword and it generates traffic, you have won. Now follow up on this position and check every month that you remain well positioned. If not, react. To conclude, the aim of positioning is to find the right match between the keyword on which you attempt to position and the traffic it will generate. If you find one or more terms on which positioning is still possible and which generate some traffic, then you win. Not so simple ... Especially since, on many current search engines, the results for the same keyword can change completely from one hour to the next ... This explains why it is impossible for a company that specializes in this type of service to guarantee any outcome in terms of positioning for a particular keyword on a given engine.


VI. Methodology for optimizing a web page
Each search engine has a specific ranking algorithm to present results. This algorithm takes into account a number of criteria of relevance. You must strive to consider these criteria when creating your pages. These "hot zones" (presented here in order of decreasing importance) must absolutely contain the most important keywords to your business:

Title (<TITLE> ... </TITLE>): often the most critical area for the ranking of your documents. Here are some tips to optimize the titles of your pages:


1. Place <TITLE> as high as possible in your HTML code.
2. A web page title is primarily a description of the contents of the page. Avoid titles like "Welcome to our website" ...
3. Insert as many keywords as possible that are decisive for, and characteristic of, your business.
4. Do not exceed 10 words per title (excluding "stop words" like "the", etc.).
5. No bilingual or trilingual titles, etc.
6. The title should perfectly sum up the contents of the page.
7. The title of a home page is inherently generic; titles become more specific further down the site tree.
8. Each page of your site should have a title of its own (which must be optimized).
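Some of the title tips above lend themselves to an automatic check. The following Python sketch is one possible way to do it; the stop-word list, the list of "generic" phrases and the function name are illustrative assumptions, not an established standard.

```python
# Illustrative lists only: adapt them to your language and business.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on"}
GENERIC_PHRASES = ("welcome", "home page", "homepage")

def check_title(title, keywords):
    """Return a list of problems found in a page title (empty if none)."""
    lowered = title.lower()
    problems = []
    significant = [w for w in lowered.split() if w not in STOP_WORDS]
    if len(significant) > 10:
        problems.append("more than 10 significant words")
    if any(phrase in lowered for phrase in GENERIC_PHRASES):
        problems.append("generic wording")
    if not any(keyword.lower() in lowered for keyword in keywords):
        problems.append("no target keyword")
    return problems

bad = check_title("Welcome to our website", ["shoes"])
good = check_title("Luxury shoes and leather boots for sale", ["luxury shoes"])
print(bad)
print(good)
```

Such a check can only flag mechanical problems; whether the title really sums up the page remains a human judgment.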


Body text of the page: the first hundred words in the body of your page (the "visible text") are paramount. These first paragraphs should contain important words to define the content of the page, if possible highlighted (bold, clickable).

Here are some tips to optimize the visible text of your pages:
1. Take great care over the content of the first paragraph of your pages (the first 2 or 3 sentences).
2. Create short pages, each with a single theme and a single language, but still containing at least 100 words.
3. Highlight the keywords important to your business: bold, h1 tag, text links (avoid link texts like "to learn more" or "click here"), etc.
4. Consider the different forms of the words: feminine / masculine, singular / plural, etc.
5. Consider the proximity of the words to one another and their order.
6. Do not try to "cram" your pages with identical keywords; it will have little effect on the "weight" of your pages.
7. Remember that each page of your site can be optimized for a keyword or phrase. Do not bet everything on one page.
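Two of the tips above (at least 100 words, and important words within the first hundred) can be checked with a short Python sketch. The function name and the sample texts are hypothetical, and the keyword match is a plain substring test on the normalized text, which is an assumption of this sketch.

```python
import re

def check_body_text(text, keyword, lead_words=100, min_words=100):
    """Check the page length and whether the keyword appears early in the text."""
    words = re.findall(r"\w+", text.lower())
    lead = " ".join(words[:lead_words])   # the first hundred words, punctuation removed
    return {
        "long_enough": len(words) >= min_words,
        "keyword_in_lead": keyword.lower() in lead,
    }

# Hypothetical visible texts of two pages.
page_ok = "Luxury shoes for every occasion. " + "filler " * 120
page_late = "filler " * 150 + "luxury shoes"
report_ok = check_body_text(page_ok, "luxury shoes")
report_late = check_body_text(page_late, "luxury shoes")
print(report_ok)
print(report_late)
```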

Link popularity (called PageRank at Google): as a first approximation, this is the number of Web pages that have established links to your site. It is essential to run a successful campaign of link exchanges with "cousin sites" (complementary sites, but not competing ones). This will greatly improve the ranking of your pages on most engines. The "quality" of links is also important. A single link from a page with high PageRank is worth much more than several links from pages with low PageRank (which, however, do you no harm) ... Links internal to your site are also important and are taken into account by search engines. Here are some tips to optimize the links in your pages:
1. The optimization of a link matters both for the page that contains it (source page) and for the page to which it points (target page).
2. Your links should be as simple as possible so that engine robots can follow them to index the other documents of your site.


3. Choose text links and avoid image, JavaScript, form or Flash links as much as possible.
4. Consider creating a "site map" page that is "compatible" with the constraints of engines and their crawlers.
5. The link text is important for the relevance of the target pages. A page pointed to by a link that contains the word "insurance" will see its relevance for this term strengthened.
The domain name of the site and the URLs of pages: these may have some significance. If you do mechanical engineering in Cluj, buy the domain "mechanical-engineering-cluj.com", which contains three primary keywords. Separate words with a dash. The domain "mechanicalengineeringcluj.com" is not suitable because the engines do not know how to separate concatenated words. That said, it is also advisable, for off-line advertising in particular, to register the spelling "mechanicalengineeringcluj.com" as well, but to perform SEO on "mechanical-engineering-cluj.com". Here are some tips to optimize the URLs of your pages:

1. Buy a domain name (.com, .ro, .net, ...), without a redirection system.
2. Insert one or two words important to your business in the domain name: your name, your business, etc.
3. If you can, try to establish a network of "small sites" rather than one big portal.
4. Separate keywords with dashes in your domain names.
5. Also use a dash (-), not an underscore (_), to separate words in the URLs of pages.
6. Create sub-domains (keyword.yoursite.com) for the convenience of netsurfers and to increase your visibility in search engine results.
7. Insert, if possible, intelligible and important keywords in the full path of your URLs (www.yoursite.com/groceries/condiments/salt.html).
8. Most important: always play fair and avoid spam, which never pays on engines in the short, medium or long term!
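The dash-separation tips above can be applied mechanically when generating page addresses. Here is a minimal Python sketch; the helper names `slugify` and `page_url` are hypothetical, and the ".html" suffix simply mirrors the example URL given in the list.

```python
import re
import unicodedata

def slugify(phrase):
    """Turn a phrase into lowercase keywords separated by dashes."""
    # Strip accents, then replace every run of other characters
    # (spaces, underscores, punctuation) with a single dash.
    ascii_text = unicodedata.normalize("NFKD", phrase).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

def page_url(domain, *segments):
    """Build a page URL whose path segments are readable keywords."""
    return "http://" + domain + "/" + "/".join(slugify(s) for s in segments) + ".html"

print(slugify("Mechanical Engineering Cluj"))
print(page_url("www.yoursite.com", "Groceries", "Condiments", "Salt"))
```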

META tags (Keywords and Description): if the engine takes them into account, these tags will most often have little importance. If it does not index them, this does not penalize your documents in any way. The current trend (Google, Yahoo Search, MSN Search (Bing)) is rather not to index these fields, as they give rise to too much "spamdexing" (cheating search tools to better position pages that are not necessarily very relevant). Do not count on these fields to get good positioning on search engines!
Here are some tips to get the most out of the meta tags of your pages:
1. <meta name="description"> tags are important to better control how the engines display summaries of your pages.
2. A "good" <meta name="description"> tag expands on the title of the page and summarizes its text content. A "medium" size is 150 to 200 characters.
3. Ideally, each page should provide a <meta name="description"> tag of its own.


4. <meta name="keywords"> tags let you indicate several forms of important words (plural / singular, masculine / feminine, and possibly case and accent variants) to engines that take their content into account.
5. Remember typos and spelling errors, which are always worth adding to <meta name="keywords"> tags. Some thirty to fifty keywords are usually sufficient.
6. Warning: <meta name="keywords"> tags are now ignored by almost every major search engine.
7. Apart from the <meta name="robots"> tag, the other meta tags ("revisit-after" and the like) have no impact on referencing.
Other criteria:
We can also turn our attention to other criteria, which may, at one time or another, have a relative importance:


1. The "hidden areas" of HTML code: the ALT attribute of IMG tags can often be used to add relevant keywords.
2. Comments (<!-- this is a comment in HTML -->) are not taken into account by search engines.
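Several of the "hot zones" discussed in this section (title, description meta tag, ALT attributes) can be inspected with Python's standard `html.parser` module. The sketch below is a simplified audit under stated assumptions: the class and function names are hypothetical, the sample page is invented, and the 150 to 200 character range is the "medium" size quoted earlier.

```python
from html.parser import HTMLParser

class HotZoneParser(HTMLParser):
    """Collect the title, the description meta tag and images lacking ALT text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.images_without_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "img" and not attrs.get("alt"):
            self.images_without_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit(html):
    """Return a small report on the hot zones of one HTML page."""
    parser = HotZoneParser()
    parser.feed(html)
    description = parser.description
    return {
        "title": parser.title.strip(),
        "description_length_ok": description is not None and 150 <= len(description) <= 200,
        "images_without_alt": parser.images_without_alt,
    }

# Hypothetical page used only to exercise the audit.
sample = (
    "<html><head><TITLE>Luxury shoes</TITLE>"
    '<meta name="description" content="' + "x" * 160 + '">'
    '</head><body><img src="a.png"><img src="b.png" alt="red shoes"></body></html>'
)
report = audit(sample)
print(report)
```

HTML comments never reach this report, which matches the behavior attributed to search engines above.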

VII. How to miss the referencing of your website?


When you create a website, it is very important to take into account certain technical points that can be fatal to good referencing on directories and search engines. Before going further, it is important to understand the difference between these two types of tools (directories and search engines), and how they react under certain conditions.


1. Directories: content is everything


When you submit your site to directories (Yahoo, AOL.fr, MSN.fr (Bing) or .en, Open Directory, Guide de Voilà), there are mistakes you should not make. Here is a list (it does not claim to be exhaustive, but it includes the main obstacles to good SEO):
Referencing a hollow site, without content
Directories are increasingly selective and favor the sites whose content is the most substantial, the most interesting, in a word, the best. Directories assume that the sites they list must answer visitors' questions. A "hollow" site does not answer any question! "Brochure" sites (contact details, types of products sold by the company, a word from the President) have very little chance of being included in the major directories.


We must therefore focus on quality from the launch of the site and provide content that is interesting to users (and thereby to directories): thematic features, news, in-depth articles, press releases, frequently updated information, FAQ, a directory of interesting sites, links, etc. In short, we must invest in content. Your customers will thank you first, and then the directories will, by including you quickly in their database (the speed with which a directory like Yahoo or Nomade accepts a site is often a good indicator of the quality of its content!). Content that is a bit offbeat, original, or even marginal may also speed up a listing. Indeed, bear in mind that a directory's netsurfer evaluates an average of 50 to 100 sites per day and may reject at least one third of them, if not many more. Make sure yours is the one he stops on and lists with pleasure.
Surprise them!


Referencing a site under construction


No major directory will accept a site that is not complete or, worse yet, not online. The same is true if your site is only accessible with a password. Wait until your site is complete and fully uploaded before beginning your referencing. Do not submit sections marked "under construction". It is better to leave a section out entirely, and add it later, than to display logos such as "Russia 50s" (the beautiful yellow and black "Men at Work" signs ...) or others reading "construction zone". If your site is accessible only by password and you wish to be listed in a directory, add a content area accessible to all.

The look

Make sure your site looks good. A professional look is always better than a "personal page" look, with its endless animated GIFs, "clip art" taken from 15-euro CD-ROMs or found everywhere on the Web, page backgrounds made of images and textures (the kind of site you created in 1995), a counter ("You are the 20th visitor since January 1, 1998!"), etc. This type of site has not been accepted by the major directories for a long time! Keep abreast of "trends" on the Web. Today, all the major sites follow the same broad directions: a white background, no frames, the use of certain fonts, "rounded" shapes in graphical interfaces, etc. Go and see the sites with the largest audiences in the world and in Europe, and take good ideas from them. Remember, however, to be original!

Website outdated

If the words "Last Updated: July 25, 1997" appear on the home page of your site, rest assured: your site will never be listed in the directory! Keep it up! The same is true if the pages are not dated but the information they present is obviously out of date. Offer only current, "up to date" information. When a date is not needed (the presentation of a company's business, for example), do not show the date of the last update; it adds nothing for the user who visits the page.

Language

Global directories (Yahoo!, Open Directory World ...) index only English-language websites. Francophone directories (Yahoo! France, Guide de Voilà, ...) only take sites in French into account. The same goes for Italian directories and the Italian language, etc. So don't submit a French-only site to Yahoo.com; it will be refused. Don't waste your time!

Submit your site only to "possible" directories, i.e. those that match the language(s) used in your pages.

Gadgets and technology

A homepage built on Flash this, Real that or QuickTime the other can block a listing. Imagine that the directory's netsurfer does not have the required plug-in (in the right version) installed in his browser to view your home page. Just like that, your site goes straight into the trash within a minute!

Test your site with a "classic" configuration, on a computer "fresh from the factory", with Java and JavaScript disabled in the browser, to see what is displayed. In the meantime, do not use "new technology" such as Flash (even in recent versions) or any "out of the ordinary" animation technique before you have been listed in the directories. There will be time to look at this later, within the constraints of the search engines.

The display time


Directory netsurfers look at a great many sites every day, as noted above. The time they spend evaluating each one is limited. If they need a minute just to display your homepage, your site is off to a bad start.

Add up the file size of your homepage's HTML and of all the images it contains, and limit the total weight to 70 KB, or 100 KB at worst. The maximum display time of your main pages over a "classic" telephone network connection should be 10 to 15 seconds, never more.
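The link between page weight and display time can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a 56 kbit/s dial-up modem (a figure not given in the text) and ignores latency and protocol overhead, and the function names are hypothetical.

```python
def total_weight_kb(file_sizes_bytes):
    """Total page weight in kilobytes (HTML file plus all its images)."""
    return sum(file_sizes_bytes) / 1000


def download_seconds(page_kb, line_kbps=56.0):
    """Ideal transfer time in seconds for a page of page_kb kilobytes.

    line_kbps is the line speed in kilobits per second; 56 is an
    assumed value for a dial-up modem, not a figure from the course.
    """
    # One kilobyte is 8 kilobits.
    return page_kb * 8 / line_kbps


if __name__ == "__main__":
    for weight in (70, 100):
        print(weight, "KB ->", round(download_seconds(weight), 1), "s")
```

Under these assumptions, 70 KB takes about 10 seconds and 100 KB a little over 14 seconds, which matches the 10 to 15 second budget recommended above.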
The absence of your own domain name

Directories filter submissions more and more strictly, given the large number of requests. A site that has its own domain name (www.mywebsite.com) will always have a small advantage over a "personal website" (www.multimania.com/meandmydog/), although this is far from a guarantee of being listed. Buy your own domain name, with a .com, .net, .ro or any other extension. It's not that expensive!

Spam and the Law

Several techniques can be regarded as spam (fraud) by directories: a site name such as "123-Doe" or "ABC Widget" chosen to appear first in alphabetical lists, submitting the same site under several different domain names, etc. Directory netsurfers know ALL the tricks; do not take them for fools! In addition, if your site is that of the SANPH (Sect of Anonymous Neo-Nazi Pedophile Hackers), you are unlikely to be listed. Do not even try!

Do not attempt tricks that are too visible, or doomed from the start, to reach the top of alphabetical lists, and don't submit illicit sites. Better still, do not put such sites online; it will save everyone time and energy.

Attempted bribery and rants

If you submit your site at noon, do not call the netsurfers that same afternoon to tell them it is a scandal that your site is not yet online, even if you have bought advertising on their pages. Worse: do not offer money under the table to speed up the procedure. Well, do as you wish!

No "corruption" budget and no rants. Smile and life will be more beautiful! But it's not over. We still have to look at the constraints of the search engines!

Those are the best ways to fail at getting your site listed in directories. Note that blocking factors are perhaps even more significant for search engines, which may also ignore your site if certain technical choices are made in the HTML of your pages.

2. Engines: the constraints are numerous!


Some technical choices are partially or totally incompatible with good referencing of a site on search engines (Google, Yahoo, MSN, ...). You must know them before starting the technical creation of your pages. They are summarized here, once again without claiming to be exhaustive. To avoid seeing your site disappear from engine indexes (or simply never appear in them), here are the technical choices to avoid, or to adopt only with the appropriate fixes and workarounds:

Frames

Many search engines handle pages built with frames poorly. Stop using frames; that is clearly the current trend. If you absolutely want to keep them, use the <NOFRAMES> ... </NOFRAMES> tags and include as many text links as possible, so that the engine can follow the links and index your site more easily. Also use the TITLE and META tags, which can sometimes even turn the problem around and make your site better referenced than a site without frames.

A CGI script in front of your homepage


Detecting a Flash plug-in, the type of browser or operating system used, the machine name, etc.: all these tests, and many others, carried out through CGI (programs run before the HTML page is displayed) can block some "spiders", which cannot go any further.

This means avoiding, as far as possible, CGI scripts that run before the homepage is displayed!

Pages too graphic

If your page is too graphic, with no text (that is, real text that can be selected with the mouse, not text embedded in an image), your site may sometimes (not always) be indexed, but it will always be poorly ranked.

Insert at least 100 words of visible text in your homepage. And not white on white, of course ... A page with no visible text, made entirely of graphics, will always be poorly regarded and poorly ranked by the engines for a given query. Look at the U.S. sites that have a large audience: experience has led them to abandon overly graphic designs and return to much more textual content.
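To check the 100-word guideline, one can count the words a spider would see as ordinary text. The following is a rough sketch using Python's standard html.parser; the class and function names are hypothetical, and the rule that text inside script, style and title elements does not count as visible is an assumption of this example.

```python
from html.parser import HTMLParser


class VisibleWordCounter(HTMLParser):
    """Collects words found outside script, style and title elements."""

    HIDDEN_TAGS = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN_TAGS:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN_TAGS and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        # Text in images (ALT attributes, pixels) never reaches this method.
        if self.hidden_depth == 0:
            self.words.extend(data.split())


def count_visible_words(html_text):
    counter = VisibleWordCounter()
    counter.feed(html_text)
    return len(counter.words)


def has_enough_text(html_text, minimum=100):
    """True if the page reaches the recommended minimum of visible words."""
    return count_visible_words(html_text) >= minimum
```

A page made of a single image and a script would score zero here, which is exactly the "too graphic" case described above.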
Flash

The text in a Flash animation is rarely taken into account by a "spider". Even if the situation is improving month by month, it is far from perfect.

Use the META tags, the TITLE tag and <NOEMBED>, just as you would use the <NOFRAMES> tag for sites with frames. If possible, try to make a copy of the site without Flash and with more textual content, which search engines may handle better.

JavaScript, pop-ups and roll-overs with no plain-text version of the graphic links

Most spiders do not know how to follow links provided as roll-overs or simply through JavaScript. If you have graphic links, whether static or in the form of "roll-overs", ALWAYS duplicate them with text links. Moreover, since the words inside a link (the text shown in color and underlined) carry more weight for the engines, you kill two birds with one stone! However, nothing can be done for pop-up windows!

"Exotic" URLs containing characters such as ?, & or %

Such pages are not always taken into account by some engines. Example: http://www.votresite.com/query=?!yu6Gh.qu&pg=web/ $qwerty

There are a number of possible remedies; they are not complex "hacks", but they do exist. Which one applies depends, most of the time, on your technical configuration. Ideally, avoid this type of URL altogether. Otherwise, create static pages with "clean" addresses that are copies of the actual pages, and reference those. You can also use "URL Rewriting" techniques if your server allows it. However, pages in ASP, CFM or PHP format that are not served with "exotic" addresses pose no real referencing problem on search engines.
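A quick way to spot such addresses in a list of pages is to test the part of the URL that follows the host name. This is a minimal sketch; the function name is hypothetical and the set of characters checked is simply the three named above.

```python
EXOTIC_CHARS = "?&%"


def has_exotic_characters(url):
    """True if the path or query part of the URL contains ?, & or %."""
    without_scheme = url.split("://", 1)[-1]
    slash = without_scheme.find("/")
    # Everything from the first slash onward: path, query string, etc.
    tail = without_scheme[slash:] if slash != -1 else ""
    return any(ch in tail for ch in EXOTIC_CHARS)


if __name__ == "__main__":
    for address in (
        "http://www.yoursite.com/products/printer/inkjet.html",
        "http://www.yoursite.com/catalog.php?id=3&lang=en",
    ):
        print(address, has_exotic_characters(address))
```

Pages flagged this way are the candidates for static copies or URL rewriting.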

Tag <META http-equiv="Refresh" > or redirects made at the DNS level


The tag <META http-equiv="Refresh" > (automatic redirection from one page to another) with a very short delay (from 0 to a few seconds) can be regarded as spam by search engines and penalized. A JavaScript equivalent can be used instead. URL redirects are also not taken into account, even though they are not considered fraud. Example: the address http://www.yoursite.com/ automatically redirects to http://www.yoursite.fr/english/ (which suddenly becomes the address shown in your browser's address bar), or http://www.yoursite.com/ redirects to the "real" address http://www.yourhost.com/yoursite/. In this case, the initial address (www.yoursite.com) may never be referenced.
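To find out whether a page carries a refresh tag with a risky delay, the HTML can be scanned for it. The sketch below uses Python's standard html.parser; the names are hypothetical, and the 5-second threshold is an assumption standing in for "a few seconds".

```python
from html.parser import HTMLParser


class RefreshFinder(HTMLParser):
    """Records the delay (in seconds) of every META refresh tag found."""

    def __init__(self):
        super().__init__()
        self.delays = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        # CONTENT looks like "0; url=http://..." ; the delay comes first.
        delay_text = attributes.get("content", "").split(";", 1)[0].strip()
        try:
            self.delays.append(float(delay_text))
        except ValueError:
            pass  # malformed delay: ignore it in this sketch


def short_refresh_delays(html_text, threshold_seconds=5.0):
    """Return the refresh delays at or below the (assumed) threshold."""
    finder = RefreshFinder()
    finder.feed(html_text)
    return [d for d in finder.delays if d <= threshold_seconds]
```

An empty list means no short refresh was found; any value returned points to a tag worth removing or replacing.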

Make sure, when the domain name yoursite.com is created, that it is not a simple redirection (raise this point with the people who will manage the domain name for you technically). When the user types the address http://www.yoursite.com/ in the address bar of his browser, he should stay on that domain and, as he navigates, reach for example the address http://www.yoursite.com/products/printer/inkjet.html.

Spamdexing, of course, but that was obvious

No stubborn repetition of keywords, no white text on a white background, etc. All these techniques are known to the engines. You are just wasting time, and your pages will be penalized (rejected, downgraded or "blacklisted").

Do not do spamdexing, of course ...

Of course, it will always be possible to get around these obstacles by creating alias pages or satellite pages that do not contain these "defects". Remember that an alias page is a page whose body and appearance are the same as the homepage's but whose title and META tags are different, while a satellite page is a page optimized for a specific keyword and/or a particular engine, with a link to the homepage. Satellite pages are also called "pre-entry" pages (such as "To go to the Doe website, click here") or "ghost pages".

But alias pages are often complex to implement and maintain. Satellite pages, for their part, work less and less well and are being removed from indexes. In addition, more and more engines favor the homepages of sites in their results, or perform clustering (that is, they show only one page per site, often its homepage, which makes satellite pages invisible). The trend is clearly toward sites optimized for the engines rather than satellite or alias pages, which sometimes act as "patches" and may soon be of little use!

Always remember: directories are built by humans, and engines by software. Respect their constraints and you will earn a good presence in both.

VIII. Meta tags: work and implementation


To get a good ranking on search engines, it is still sometimes necessary to fill in what are called the META tags of your web pages. Warning: this is, however, less and less true. In HTML, META tags give search engines a certain amount of information about the contents of a page. The term "Meta" stands for metadata. These tags therefore provide information about information.

To begin with, and to be very clear, note that META tags only affect search engines (Google, Yahoo Search, MSN Search, ...), or at least those that take them into account. Directories (Yahoo, Nomade, Open Directory, ...) ignore them entirely, because they are built by human beings who will never look at the HTML code of your pages, unlike the engines' "spiders" (or robots, or crawlers), which "suck up" this code, copy it into their records and then build a search index from it. Some search engines (Google, Yahoo Search, MSN Search, ...) do not take all of them into account either, however.

1. General description of the META tags


Imagine that you are building a site for your company, named Stela, whose business is selling shoes. In the HTML of the homepage you would include, for example, the following tags:

<META NAME="description" CONTENT="Stela, specialty retailer of sport shoes based in Paris, France">
<META NAME="keywords" CONTENT="stela, sport shoe maker, tennis, running, jogging, stretching, clay, hard, grass, Wimbledon, Flushing Meadows, shoes, Roland Garros, Flinders Park, Grand Chelem">

Both META tags ("description" and "keywords") must be placed after the <TITLE> ... </TITLE> tag and before the end of the header (</HEAD>).

If your HTML editor places them before the <TITLE> tag, move the <TITLE> tag to first position, as we have seen, because this field matters more than the META tags as a relevance criterion.

"Clean" HTML code, as far as search engines are concerned, always begins:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<HTML>
<HEAD>
<TITLE> Page Title </TITLE>
<META NAME="description" CONTENT="...">
<META NAME="keywords" CONTENT="..., ...">

If other accessory lines (including other META tags added automatically by your editor) clutter your HTML code, place them after the <META NAME="keywords" CONTENT=""> tag. META tags let you add, for the engines' benefit and invisibly to the visitor, a description of the page and specific keywords. They are, however, no absolute guarantee that the pages containing them will rank better than others. These tags are useful, but the best guarantee of a good ranking is the presence of relevant keywords in the title of the page and in the visible text, as close as possible to the top of the page (to be precise, as close as possible below the BODY tag of your HTML code).

Some search engines (Google, Yahoo Search, MSN Search) do not take all of the META tags into account in their relevance algorithms, but they may display them in their results (in the case of the <meta name="description" content=""> tag). Finally, some (more rarely) do take them into consideration, but give them quite a low "weight". In short, each engine handles these fields in its own way, but the general trend is to abandon them purely and simply.

2. The tag <Meta name="Description" >


The <META NAME="description"> tag lets you give the search engine a sentence summarizing the page content. Some engines display this description in their results page, under the title of the page found. If the page does not contain a <META NAME="description" content=""> tag, the first words visible on the page are usually displayed instead. This tag therefore also gives you better control over how search engines present the page to the user.

Thus, in the absence of a <META NAME="description"> tag, a search engine may for example display the first words of the text of the page:

1. Sport shoes Stela
Site optimized for Firefox 3.6.2 and Explorer 7. Download the software to take full advantage of your pages.
http://www.stela.com/ - size 5K - 20-Dec-09

This summary can hardly be said to tell us much about the content of the page. If a <META NAME="description"> tag exists and is written in the HTML in the form:

<META NAME="description" CONTENT="Stela, specialty retailer of sport shoes based in Paris, France.">

the engine will display the following summary:

1. Sport shoes Stela
Stela, specialty retailer of sport shoes based in Paris, France.
http://www.stela.com/ - size 5K - 20-Dec-09

The contents of the <META NAME="description"> tag are reproduced in full. Limit the contents of the tag to 150 or 200 characters, including spaces. Most search engines limit the space allotted to summaries (and the amount of that text they take into account in their relevance ranking). If your description is longer, make sure that the sentence still makes sense when cut down to its first 150 characters. When counting, treat an accented character as a single letter, even though its HTML representation is longer: here it is the displayed sentence that counts (the opposite is true for the <META NAME="keywords"> tag). Of course, make sure that the descriptive sentence contains the keywords most representative of your business. One solution is to take the proposed title (<TITLE>) and expand on it. A "good" <TITLE> tag (in the sense of search engine optimization) typically contains 7 to 10 words. These can be reused and extended to produce a perfectly suitable <META NAME="description"> tag, provided you stay within the limit of 150 to 200 characters.
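The advice about the first 150 characters can be tested by simulating the cut an engine might make. The sketch below trims a description at the last whole word that fits; the function name is hypothetical, and cutting at a word boundary is an assumption (an engine may well cut mid-word). Python counts an accented letter as one character, which matches the counting rule given above for this tag.

```python
def trim_description(text, limit=150):
    """Return text cut at the last word boundary within limit characters."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    head = text[:limit + 1]
    if " " not in head:
        return text[:limit]  # a single very long word: hard cut
    # Drop the partial last word, then any dangling punctuation.
    return head.rsplit(" ", 1)[0].rstrip(" ,;:")


if __name__ == "__main__":
    description = ("Stela, specialty retailer of sport shoes "
                   "based in Paris, France.")
    print(len(description), trim_description(description, 30))
```

Reading the trimmed result aloud is a simple way to check that the shortened sentence still has meaning.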

Of course, to be fully effective with the engines that take them into account, each page of your site must contain its own <META NAME="description"> tag, one that accurately describes the contents of that page!

3. The tag <Meta name="Keywords" >


The <META NAME="keywords"> tag is used to give search engines additional keywords. Like the sentence in the <META NAME="description"> tag, they are not displayed in the web page but are "sucked up" by the engine spiders, which then choose whether or not to take them into account. On the search engines that do take them into account, these keywords can improve the page's ranking or supply important words that the document itself does not contain. They are separated by a comma, a space, or a comma followed by a space.

In the case of our company Stela, a manufacturer of sports shoes, a <META NAME="keywords"> tag could be:

<META NAME="keywords" CONTENT="stela, sports shoe maker, tennis, running, jogging, stretching, athletics, clay, hard, grass, Wimbledon, Flushing Meadows, shoes, Roland Garros, Flinders Park, Grand Chelem">

By convention, the <META NAME="keywords"> tag contains at most 100 keywords or 1000 characters. Beyond that, you may be regarded as a "spammer" (fraud) and your page may be penalized by search engines. The same logically applies if several words are repeated. That said, at present, about twenty keywords are more than enough ... When counting toward the 1000-character maximum, be careful to count accented letters as they are coded in HTML. For example, the letter "é", written &eacute; in HTML, counts for 8 characters (since this is the size the engine takes into account).
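The counting rule for accented letters is easy to get wrong by hand. The sketch below counts each non-ASCII character at the length of its named HTML entity, using the entity table from Python's standard library; the function names and the idea of checking both limits in one call are assumptions of this example, and it does not check for repeated words.

```python
from html.entities import codepoint2name


def encoded_length(text):
    """Length of text once accented letters are written as HTML entities."""
    total = 0
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            total += len(name) + 2  # '&' + entity name + ';'
        else:
            total += 1
    return total


def within_keyword_limits(content, max_keywords=100, max_chars=1000):
    """True if a comma-separated keyword list respects both limits."""
    keywords = [k.strip() for k in content.split(",") if k.strip()]
    return len(keywords) <= max_keywords and encoded_length(content) <= max_chars


if __name__ == "__main__":
    print(encoded_length("é"))  # the entity &eacute; is 8 characters long
```

With this rule, a list rich in accented words reaches the 1000-character ceiling much sooner than its on-screen length suggests.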

The <META NAME="keywords"> tag is also used to offer search engines the various spellings of your keywords (plurals, upper and lower case, feminine forms, accented forms, possible spelling mistakes, etc.). It is therefore important for giving the engines a set of keywords that characterize your business and best describe the content of the page in which they appear. These tags also supplement pages that lack important keywords in their visible text, in particular pages that present options or actions in graphical form with no text equivalent. As with the <META NAME="description"> tags, each page of your site must have its own <META NAME="keywords"> tag. If you did not plan for this from the start, it can mean a lot of work ahead.

Caution: if you were thinking of including the names of all your competitors in the <META NAME="keywords"> tags of your pages, so that your pages are found even when the user is primarily interested in your rivals on the Web, be aware that several American companies have already sued other companies guilty of such fraud. Moreover, your competitors, when they notice, will not be content with suing you; they will also be the first to report these practices to the search engines. Some lawyers have even specialized in hunting down "fraudulent" META tags. Avoid any such practice; you risk serious reprisals.

It is also better for the keywords you insert in your documents to be relevant to the contents of the site. The same goes for the title and the text of the page. So, unless you are a former "Playmate of the Year" (a real case, which went to court in 1998 in the United States and involved a certain Terri Welles), do not repeat the word "Playboy" a hundred times in your META tags. The same goes for "sex" and "mp3" (roughly the two keywords most often searched for on search engines today).

4. The other meta tags

Many other META tags exist and are sometimes visible in the HTML of the pages you come across on the Web: "revisit-after", "classification", "distribution", "rating", "identify-URL", "copyright", etc. You should know that no major search engine takes them into account. They are superfluous in your pages if indexing is your only purpose. Only one will interest us here: the <META NAME="robots"> tag. It is never used by the engines as a relevance criterion, but it tells them how they should index the page. This specific <META> tag can be used, in each HTML document, to allow or deny access to engine spiders. It takes the form:

<META NAME="robots" CONTENT="ATTRIBUTE1, ATTRIBUTE2">

where the fields ATTRIBUTE1 and ATTRIBUTE2 can take the following values:

ATTRIBUTE1:
index: the page is to be indexed by the spider
noindex: the page must not be indexed

ATTRIBUTE2:
follow: the spider may follow the links contained in the page to index other documents
nofollow: the spider may not follow the links in the page

The values index, noindex, follow and nofollow can be written in either lowercase or uppercase. Here are the different possibilities offered by this tag:

<META NAME="robots" CONTENT="index,follow">
<META NAME="robots" CONTENT="noindex,follow">
<META NAME="robots" CONTENT="index,nofollow">
<META NAME="robots" CONTENT="noindex,nofollow">

Like the META tags presented earlier, these tags must be placed in the header of the HTML document, between the <HEAD> and </HEAD> tags and after the <META NAME="keywords"> tag. They must appear in every document to which you want to control access, unlike the robots.txt file, which covers the entire tree of a site.

Final points: the first example given above ("index, follow") has no practical use, since the tag specifies the same options as if it did not exist. The following syntaxes are equivalent:

<META NAME="robots" CONTENT="index, follow"> and <META NAME="robots" CONTENT="all">
<META NAME="robots" CONTENT="NOINDEX, NOFOLLOW"> and <META NAME="robots" CONTENT="none">

Most robots take this META tag into account.
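The rules above (four combinations, case-insensitive values, and the "all" and "none" shortcuts) can be summarized in a few lines. This is a sketch of how a robot might read the CONTENT value; the function name is hypothetical, and treating a missing or empty value as "index, follow" follows the remark that the tag is then equivalent to no tag at all.

```python
def robots_directives(content):
    """Return (may_index, may_follow) for a robots META CONTENT value."""
    may_index, may_follow = True, True  # same as having no tag at all
    for token in content.lower().split(","):
        token = token.strip()
        if token == "noindex":
            may_index = False
        elif token == "nofollow":
            may_follow = False
        elif token == "none":  # shorthand for noindex, nofollow
            may_index, may_follow = False, False
        elif token == "all":  # shorthand for index, follow
            may_index, may_follow = True, True
        elif token == "index":
            may_index = True
        elif token == "follow":
            may_follow = True
    return may_index, may_follow


if __name__ == "__main__":
    for value in ("index,follow", "noindex,follow", "NOINDEX, NOFOLLOW", "none"):
        print(value, robots_directives(value))
```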

5. The META Tag Generator


Creating META tags in your pages is not complex. All HTML editors can do it. You can also write them "by hand" directly in the code, which is very easy. Warning: some software slightly (or heavily) "pollutes" the generated HTML by inserting tags that do nothing to optimize your code (which does not mean they serve no purpose at all). You must therefore rework the code of your pages "by hand" if you want to optimize it further and obtain "clean" code, as described earlier in this course.

If you would rather not deal with the technical side of adding these tags to your pages, a number of websites (or programs) can also generate META tags automatically. You give them some information (keywords, a descriptive sentence, etc.), and you receive the tags on the Web or by e-mail. Most of these services add a comment line to indicate that the tags were created by a utility. For example:

<!-- Meta-tags created by the Meta-Tag Generator http://www.websitepromote.com/resources/meta -->

This line can, of course, be removed from the final code.

6. The Robots.txt file


So far we have tried to explain how to make sure your pages are better indexed by search engine robots (spiders). But some of your pages may be confidential, or you may simply not want them widely circulated on search engines. A site or page under construction, for example, does not need to be crawled. It is then necessary to prevent certain spiders from taking it into account.

This is done through a text file called robots.txt on your server. This file tells the engine's spider what it may and may not do on the site. As soon as an engine's spider arrives at a site (for example, http://www.mysite.com/), it looks for the document at http://www.mysite.com/robots.txt before "sucking up" any document. If this file exists, the spider reads it and follows the instructions it contains. If it does not exist, the spider begins reading and saving the HTML page it came to visit, on the assumption that nothing is forbidden.

There can only be one robots.txt file per site, and it must be at the root level, as in the example address above. The file name (robots.txt) must always be in lowercase. The structure of a robots.txt file is as follows:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tempo/
Disallow: /perso/
Disallow: /underconstruction/
Disallow: /subscribers/price.html

In this example, User-agent: * means that the rules apply to all agents (all spiders), whatever they are. The robot will not explore the directories /cgi-bin/, /tempo/, /perso/ and /underconstruction/ on the server, nor the file /subscribers/price.html. The directory /tempo/, for example, corresponds to the address http://www.mysite.com/tempo/. Each directory to be excluded from crawling must have its own Disallow: line. The Disallow: directive indicates that "everything that begins with" the specified expression must not be indexed.

Disallow: /perso blocks the indexing of both http://www.mysite.com/perso/index.html and http://www.mysite.com/perso.html

Disallow: /perso/ blocks the indexing of http://www.mysite.com/perso/index.html, but does not apply to the address http://www.mysite.com/perso.html

In addition, the robots.txt file must not contain any blank lines. The star (*) is accepted in the User-agent field. In the original standard it cannot serve as a wildcard (truncation operator) in a Disallow: line, as in the example: Disallow: /underconstruction/*.

In the original standard there is no permission field such as Allow:. Finally, the field names (User-agent and Disallow) can be written in either lowercase or uppercase. Everything to the right of a hash sign (#) on a line is treated as a comment.
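The "everything that begins with" rule can be tried out with the robots.txt parser in Python's standard library (urllib.robotparser). The sketch below feeds it the example file from this section; the helper names are hypothetical, and the parser is used here only as a stand-in for a real engine's spider, which may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this section, with a comment line added.
ROBOTS_TXT = """\
# Example robots.txt
User-agent: *
Disallow: /cgi-bin/
Disallow: /tempo/
Disallow: /perso/
Disallow: /underconstruction/
Disallow: /subscribers/price.html
"""


def build_parser(robots_text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, url, agent="*"):
    """True if the given agent may fetch the URL under the parsed rules."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for address in (
        "http://www.mysite.com/perso/index.html",
        "http://www.mysite.com/perso.html",
    ):
        print(address, allowed(rules, address))
```

Replacing Disallow: /perso/ with Disallow: /perso in the text makes the second address blocked as well, which illustrates the prefix matching described above.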

IX. Popularity index


Calculating a site's popularity index means measuring the number of pages that link to it. This figure is taken into account by some search engines (Google especially) in their relevance rankings.

Three engines let you perform this calculation: Google, Yahoo Search and MSN Search. Here are the queries to enter in these three tools to determine your popularity index (here for the site iutvichy.univ-bpclermont.fr):

Google: link:iutvichy.univ-bpclermont.fr -site:iutvichy.univ-bpclermont.fr
Yahoo! Search: linkdomain:iutvichy.univ-bpclermont.fr -site:iutvichy.univ-bpclermont.fr
MSN Search: link:www.iutvichy.univ-bpclermont.fr

The link: and linkdomain: operators return pages containing a link to the SRC Vichy website. The -site: operator removes the SRC website's own pages from the results. As the three search engines do not work from the same index of pages, you must run the query on all three tools to get an idea of your index.

X. Reference a site designed in Flash


Flash technology lets you, quite simply, insert animated images and genuine short vector-animated cartoons in your web pages. Of course, these animations have an impact on the referencing of your website. First, let us see how a Flash file is structured. When you create an animation, you get a file named, for example, anim.fla (the .fla extension is characteristic of the Flash format). To display this file in a web page, you must export it in the Shockwave Flash format (.swf extension). It is this exported file that you will use on your website.

If the finished animation contains text, that text will not be taken into account by a search engine that does not specifically handle this format. A page (or even an entire website!) made entirely in Flash will therefore go almost unnoticed by search engines, which, for the most part, index only textual content in HTML format. The HTML file that "launches" the Flash animation, however, is taken into account.

In this case, you do not have much choice: you must make the best possible use of the <TITLE> tag (certainly the most important), the <META> tags (the "keywords" tag is hardly taken into account any more, but the "description" tag still has some importance) and, in our case, the <NOEMBED> tag (used in the same way as the <NOFRAMES> tag for a site designed with frames). In the <NOEMBED> tag, try to insert as much text as possible (without spamming, i.e., without undue repetition of keywords), so that it is "sucked up" by the engine spiders. Another possible field is the ALT attribute of the <img> tag, if your HTML file contains any images.

If the whole site is in Flash, you can only work on the few (rare?) HTML pages available and optimize their content, until the day all search engines can read Flash animations. In any case, it is highly advisable to provide an HTML version of your site online, with a link to the HTML homepage ... placed outside the Flash animation! Finally, there is another way to position a Flash site on search engines: use the paid positioning offers from Espotting, Google and Overture, for example, to be sure of appearing at the top of the list. The major drawback is that you must pay! (There is a free trial version, but it is limited to 15 days!)

In any event, always remember one key point: a spider (i.e. a search engine) can only read HTML! The contents of a Flash animation are off limits to it (except, again, for engines that can read Flash). Maybe one day all search engines will read the text inside Flash animations. For now, that is not yet the case.
