Friday, April 11, 2008

SEO FAQs

Q No 1 :- What is SEO?
SEO stands for Search Engine Optimization and is defined as (in my own words):

"The process of finding out the best keywords for a web site and by the use of optimizing the web site along with other off-page work making that web site attain a higher position in the search engine result pages (SERPs) for those selected words."

Although the exact calculations used by the search engines are kept secret, there is a lot of knowledge and observation in this field from thousands of webmasters worldwide.

It could be said to be a branch of online marketing. In general terms you can say that it means to make a web site more visible and make it look important in the eyes of search engines.

The difference between not being familiar with SEO and not applying it, versus actually doing the right things, can be huge in terms of visitors to your web site.

Q No 2 :- How do I find out the best keywords to target?
The "best" keyword depends on the following main factors:

The Amount of Traffic it will generate
Often people choose keywords based on how popular they think they may be. Mostly this is based on "real world" factors rather than facts which are readily available. For instance, I recently saw someone propose going after the term "nursing homes" due to the aging population.
Although this area may be growing quickly, those interested in finding out more information often do not use a computer, or if they did, would not be researching it online. Although they would still get visitors, they would find there are much better keywords to target with profit in mind. When it comes to traffic, the best measurement is actual searches. This will tell you how many people search in a day or month for that term, and it can be a great indicator. By far the most popular tool for finding this out is the Keyword Suggestion Tool. This tool combines the two most popular ways of judging popularity, Wordtracker and Overture. In addition, it suggests related keywords and lists their traffic. Always remember, however, that this is total searches. These search numbers will always be divided among the SERPs.

The Difficulty of Attaining a Top Ranking
If you simply chose the keywords with the highest amount of traffic, you could still lose money. This is because these keywords typically warrant a lot more work to rank for. A perfect keyword is one that has a lot of searches but little SEO competition and is moderately easy to rank for. The best tool I have found for this is the Keyword Difficulty Tool created by Rand Fishkin. It will give you an indication of the amount of SEO work required, which you can balance against the number of searches.

The Profitability of that Keyword
There are also keywords where you may get thousands of visitors with only one conversion, while with others you can achieve one for every 100. This should be factored in because, unless you make your money per impression, you want the highest number of conversions per visitor. The best way I know to evaluate this is to run an AdWords account. The amount of data you receive by starting a campaign can be very useful in establishing the conversion rate. I believe it is always better to spend $10 to find out a keyword isn't profitable than to spend six months getting it to number one, then find out it's a dog.

The moral of the story is that although there is no such thing as a "perfect" keyword, you can find the best ones for you by using a combination of the factors above.

Q No 3 :- What is KEI and how do I use it?
KEI stands for Keyword Effectiveness Index. KEI is a ranking system based on how popular a keyword is and how much competition it has on the Internet. The higher the KEI number, the more popular your keywords are and the less competition they have, which also means you'll have a much better chance of getting ranked high on a search engine. A low KEI score means not many people are searching for that keyword and it has too much competition. Hence, eliminate all keywords with a low KEI score and choose those with a high one. The higher the score, the more profitable your keywords will be for your web site.
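
One commonly cited way to calculate it is Wordtracker's formula, where KEI is the square of the keyword's search count divided by the number of competing pages. Here is a minimal PHP sketch of that formula; the numbers are made up purely for illustration and the formula itself is an assumption, not an official search engine metric:

<?php
// Wordtracker-style KEI: (number of searches)^2 / (number of competing pages)
function kei($searches, $competingPages)
{
    if ($competingPages == 0) {
        return INF; // no measured competition at all
    }
    return ($searches * $searches) / $competingPages;
}

echo kei(400, 2000000); // 0.08 - popular but very crowded, a weak target
echo "\n";
echo kei(400, 50000);   // 3.2  - same demand, far less competition
?>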

Q No 4 :- What are the most important things for on-page optimization?
On-page optimization is the part of SEO where you deal with the pages themselves, as opposed to off-page optimization and keyword analysis.

Here is my opinion on the most important elements in on-page optimization and some brief information about them.

Content
It has been said over and over for years that a successful way of establishing a web site is to add good, unique content to it on a regular basis. It cannot be stressed enough, yet people are still not doing it and instead want some "magic" to happen.

Make sure that you have your targeted keywords included in the content on your web pages in a natural way. My rule of thumb is to write without thinking about it, and when you finish you can look over the text and add the keyword in maybe one or two extra places where it fits in. If it looks spammy or overdone in any way, then reduce it. You are writing for the visitor and not for the search engines - never forget that.

Also make sure that the text is unique; if it is the same as on other web pages it can raise a red flag at the major search engines, and duplicate web pages and even whole domains can get erased from the index.

Weight factors
By placing your targeted keyword in places such as the title, H1 and H2 headings, strong tags and emphasis tags, you put more weight on those words and the page becomes more relevant for them in the eyes of the search engines.
But don't overdo it: if you place the same word or phrase in all the weight tags on the same page you can get hit by the over-optimization penalty, which basically means the search engines figured out you tried to cheat them and will push you down the rankings.
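
As a quick illustration (the keyword "blue widgets" is invented for the example), the weight tags might be used like this, without stuffing the phrase into every single one of them:

<title>Blue Widgets - Handmade Widgets in Every Shade of Blue</title>
<h1>Blue Widgets</h1>
<h2>Why choose a handmade widget?</h2>
<p>Our <strong>blue widgets</strong> are made to order and ship within a week.</p>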

Navigation structure
Make sure that the search engine spiders can follow the internal links on your site. If you have a site based on a database it is recommended that you use mod_rewrite to get the best benefit.

If you have a lot of pages on your site and they are buried down in the navigation tree, three or more clicks away, then I recommend the use of a sitemap and a link to it from each page of your site. A sitemap makes it easy for search engines to find all your pages and it can also be a great resource for your visitors to find a specific page quickly.

If one or more web pages of your site are more important than the others, like the home page, then get more links to them from the other web pages of the site. A good example is to have a "home" link on each web page back to your home page. That is also useful for your visitors and it gives more power to the home page in the search engine rankings.

Q No 5 :- Is the use of meta tags dead?
Yes; in fact they have not been relied upon for many years. An article from Danny Sullivan stating exactly this was released in October 2002.

There are still some minor "stone age" search engines around that use them.
The main reasons why they ceased to work come down to these factors:
Most webmasters tried to fool the search engines with meta tags unrelated to their content and services.

With improved FTS (full text search) toolkits from Verity and many other companies, search engines can index your web pages and know the theme of each page. With such advanced APIs, search engines like Google can easily decide what your website is about and what your website offers.

Some of the basic features of FTS API are that they can filter out text of your webpage and get important statistics such as:

1) How many times a word gets repeated
2) How far apart the repeated words are from each other
3) How many times a particular word gets repeated in a particular sentence
4) How far a word like 'Online' appears from words like 'Party' or 'Invitations', to see if the sentence makes any sense
5) They can easily figure out if you are doing keyword dumping

So with such APIs available, webmasters should concentrate on the content and layout and not treat the meta tags as a main concern.

However, it has been tested that the meta keyword tag still has a minor influence on the rankings, and the meta description tag should be used as it is sometimes shown in the SERPs (search engine result pages).
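
For completeness, here is what the two meta tags typically look like in the head section; the content here is placeholder text:

<meta name="description" content="A short, natural summary of the page - this is the text that is sometimes shown in the SERPs.">
<meta name="keywords" content="keyword one, keyword two, keyword three">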

Q No 6 :- Where in my code should I put the keywords?
We all know it is not enough to have your keyword in the Meta keyword tag.
Here is a list of places to put it in the source code, ordered by estimated weight (a combined HTML example follows the list):

1) Title tag.
2) H1 and H2 headings.
3) In paragraphs and general text on the site.
4) In STRONG tags: <strong>keyword</strong>
5) In the file names of the web documents: www.domain.com/keyword.html
6) In ALT attributes on image tags: <img src="picture.jpg" alt="keyword">
7) In TITLE attributes on anchor tags: <a href="page.html" title="keyword">
8) In SUMMARY attributes on tables: <table summary="keyword">
9) In the file names of images: keyword.jpg
10) Meta description tag.
11) Meta keyword tag.
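
To tie several of these together, a stripped-down page for a hypothetical keyword could look like the sketch below; the file names and text are invented for illustration:

<html>
<head>
<title>Keyword - A Page All About the Keyword</title>
<meta name="description" content="A short description that mentions the keyword naturally.">
</head>
<body>
<h1>Keyword</h1>
<p>Some readable text that works the <strong>keyword</strong> in where it fits.</p>
<img src="keyword.jpg" alt="keyword">
<a href="http://www.domain.com/keyword.html" title="keyword">Read more about the keyword</a>
</body>
</html>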

Q No 7 :- What is the best way to write the title?
The title is probably the single most important place to put your keyword.
Have the keyword at the beginning of the title and, if you can, also towards the end, and try to vary its form as well.
If you want to brand your company name you should keep that name at the end.
Try to follow this and at the same time make the title look natural and appealing to visitors. Remember that this is what is most visible to the visitor in the SERPs.
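
For example (the keyword and brand are invented for illustration), a title written along these lines could be:

<title>Blue Widgets - Buy Handmade Blue Widgets Online | ExampleCo</title>

The keyword leads, a varied form of it appears again near the end of the descriptive part, and the brand sits last.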

Q No 8 :- What is the best way to write URLs?

In regards to Google, it has been stated by two of their staff members involved in the SEO community (GoogleGuy and Matt Cutts) that dashes (-) are better than underscores (_) when writing URLs. This has also been confirmed by my own tests on the matter.

In regards to Yahoo, MSN and other search engines I actually don't know but I think it varies.

And speaking of putting a dash in URLs, hyphens are often better than underscores [Ed. Note: bolded by Matt :)]. hollywood-hotel.html is seen as two words: "Hollywood" and "Hotel". hollywood_hotel is seen as one word: hollywood_hotel. It's doubtful many people will be searching for that.

Q No 9 :- Which factors are considered unethical or black hat SEO?

Page cloaking is a BIG one. It basically consists of using server-side scripts to determine whether the visitor is a search engine or a human and serving up different pages depending on the answer. The script would serve up a keyword-rich, totally souped-up page to search robots, while giving humans an entirely different page.

Link farms are considered black hat. Link farms are basically sites, which consist of masses of links, for the purpose of getting rankings in search engines and turning a profit (usually off of affiliate program advertisements on the website).

Duplicate content can keep a page from getting indexed, and some people have even reported trouble with entire websites being duplicated by unethical webmasters, causing problems with their rankings.

Spamming keywords, in meta tags, title tags, or in your page's content is a black hat SEO trick that used to work well back in the 90's. It's long since been basically exterminated as a useful trick.

Linking to "bad neighborhoods", or sites that have been banned from search engines for using the above tricks, while not particularly black hat is definitely unhealthy for your own sites rankings.


Q No 10 :- What should good navigation look like?

Good Navigation can be broken down to one word - "Breadcrumbs"

If you remember the fairy tale "Hansel and Gretel" you will recall that when the children were kidnapped, they dropped breadcrumbs so that their rescuers would be able to find them. In our situation, imagine it's your site that has been kidnapped and the Search Engines are your "rescuer".

An example of Breadcrumbs would be the following

*Home-->Sublevel1-->Sublevel2

This type of navigation allows the Search Engines to find all of your pages in the most efficient and thorough way. It also aids "Deep Crawls", which are crucial to dynamic sites that may have 100,000 or more pages. Not only should you use this style of navigation behind the scenes, but displaying the breadcrumbs somewhere on the page will help both the Search Engines and visitors alike (a markup sketch follows below). A perfect example of this is www.dmoz.org.
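
A minimal sketch of how the visible breadcrumb trail might be marked up, assuming an invented category structure:

<p class="breadcrumbs">
<a href="/">Home</a> &gt;
<a href="/widgets/">Widgets</a> &gt;
<a href="/widgets/blue/">Blue Widgets</a>
</p>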

The second part of navigation is the "site map". This is a page which contains a link to every one of your pages. To be fully optimized, the links should have descriptive anchor text to further help the pages you are linking to. In addition, your site map should always be within one click of the index page, as this will help the Search Engines find it quickly.

Using these two methods of navigation will ensure your site gets fully indexed and will add to your users' experience.

Types of Links
The major type of navigation to avoid, or at least compensate for, is JavaScript pulldown menus. Because Search Engine bots will not follow these, it is important to compensate by having text links somewhere else on the page. These can be in the footer or worked in elsewhere in the content. In fact, JavaScript navigation in general has been shown to hinder indexing. There are a few alternative ways to code your JavaScript; however, if you always code a backup plan, you will enjoy easy indexing without the worry.

A very important point to remember is to always link to your main pages from your home page and, if possible, from other pages as well.
If you have a link from your home page to another page of yours, then you are telling the search engine that this page is also important and needs to be indexed soon. The same goes for pages linked from many pages within the website. A good practice is to add a link to the sitemap from your home page. The sitemap contains links to all other pages and helps in faster indexing.
Additionally, all your pages should also have a link to the home page so people can return to it in case they get lost.
This also adds weight to your home page from a search engine's point of view.
The text in the link should describe the page as closely as possible, and if using image links you must add ALT text to the image.
If you really need to have JavaScript navigation then you should also add text links at the bottom of the page.

Q No 11 :- What should my domain name look like?

This is a dilemma that webmasters have faced more and more as there are fewer prime domains to choose from. There is a difference between the optimal domain name and the one you have to settle for. In general, if you are picking a domain for SEO purposes, it should contain the keywords you are targeting. Many will say this is because there is an extra bonus in the SERPs for it (file that under Theory, Assumption, or Speculation), but the true power is in the anchor text that your natural links will garner. When people link to you on their own, they may use the name of your website. Most of the time they will get it from your URL. If your keywords are there, they will then be in the anchor text used.

As far as the other factors of the domain, unless you are after a local market (i.e. UK and therefore .co.uk), always go for the .com. It is trusted in general by newbies who may hesitate at a .net (hard to believe but true).

The last important factor is the use of "-"s. Avoid them if possible, but if you have to use them, do so conservatively. More than two may be viewed as spam, so don't cross that threshold.

If you aren't choosing your domain for SEO reasons, pick something unique. After all, who had heard of www.google.com seven years ago?

Q No 12 :- What is PageRank and how does it work?
One big way in which search engines grant importance to a web page is by looking at how many other sites are linking to it.

PageRank is a system implemented by Google that measures a web page's importance by taking into account only links and link-related factors. The other big search engines (Yahoo, MSN) probably have a similar system, but without an official measurement.

The basics are that each web page gives votes to the pages it links to, and the power of the vote is determined by the number of links pointing to the page that gives the vote.

The votes are divided among the links on the page, so the fewer links there are, the larger the share of the vote given to each link.
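
For reference, the formula published in the original PageRank paper expresses exactly this vote splitting. Here d is a damping factor (commonly quoted as 0.85), T1...Tn are the pages that link to page A, and C(T) is the number of outgoing links on page T:

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + PR(T2)/C(T2) + ... + PR(Tn)/C(Tn) )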

Internal links are given a higher share than external.

The visible scale runs from 0 to 10, but in fact it has many decimals which we cannot see.
0-3 is the most common range, 4 is obtained by getting a decent amount of links, and 5 takes some work to get. Only a short list of sites on the internet has PageRank 10.

Q No 13 :- What are some recommended ways to get more links to my site?
The best way to get more links to your site is through hard work.

In order to find sites that are relevant to yours and allow you to post a link on, you can perform the following searches:

Replace the keyword with the phrase you are targeting.

keyword "add url"
keyword "add site"
keyword "submit site"
keyword "submit url"

If you perform these searches for various keywords you will find plenty of sites that will allow you to post your link on and that are related as well.

You may also submit to directories. This is a long and tedious process, but it is a free way to get some incoming links to your site.

Another way that I just discovered is the following query on Google:

intitle:add+url OR intitle:submit+your+site OR intitle:add+your+site "your keyword"

That will list sites where you could probably add a link to a site related to yours.

Another proven successful way is to write an article about a subject you know. When you finish the article, you write a short bio about yourself at the end with a link to your site, and then you submit it to article submission sites. You can find lists of such sites in various places, with up to 100 places where you can submit your article. If you are lucky, many will accept your article and publish it on their site, and each one will have a link back to you.

If you provided a high quality article and are lucky it is likely that your article will get published on authority pages and your page with the article will have some PageRank as well.

The trick here is to write something good and useful and also to submit it to many places.

It is an excellent form of marketing, not only because of the backlinks you get but also because of the readers who read your name and bio.

Q No 14 :- Can incoming links hurt me?
There has been some recent speculation that links to your site that look spammy may hurt your rankings.

So what do spammy links look like?

1) Thousands from the same IP.
2) All the same anchor text.
3) Only up temporarily and have no real age.
4) Coming from spammy/questionable/not-so-good sites.
5) Has a set pattern.
6) Unnatural growth, like suddenly thousands overnight.

Q No 15 :- What is a Link Farm?
A linkfarm is any type of website in which there is no real content, service, or purpose, but rather just a load of non-related reciprocal links to other places. Generally linkfarms are built to increase search engine rankings and turn a profit, which means they're also generally littered with advertisements from affiliate programs the site owner has partnered with.

Linkfarms are not to be confused with Linkdumps, which are simply places people dump all kinds of links to content on various websites.

Q No 16:- Can I get my site accepted faster into DMOZ?
DMOZ (www.dmoz.org) has the ability to frustrate webmasters with its opaque site selection process. Though the guidelines are clear and fair, the actual implementation is not.

The only 3 things you can do to possibly get into DMOZ faster are:

1) Follow all the editorial guidelines
2) Choose the right subcategory
3) Submit a meaningful site.

Despite following all 3 of these instructions, you might be in a position where your site is not added for years, if at all. This has more to do with the availability of editors for the site, and their choice.

There are stories about people managing to "bribe" editors into getting their sites into DMOZ faster. These are unverified, and I would not recommend this route.

Despite all its faults, the popularity of the DMOZ data is so high that DMOZ submission is well worth the 15 minutes it takes. Good Luck.
Trust level: Proven and confirmed

There is a 4th way to get your site listed in DMOZ: become an editor.

There is a proven way of getting your site listed faster if the site has some kind of connection to a local area (county/town). If so, you can submit it to the local area section (county/town) and then, after getting listed, use the change form to apply for a change to the main directory.


Q No 17 :- Does the web host have anything to do with the ranking?
As such, a web host cannot help to improve the rankings. Search engines will not rank you higher because your site is hosted with a particular web host (buying hosting from Yahoo does not mean your site will rank well in Yahoo), but if you choose a web host that is down a lot, that can certainly have a negative effect on your rankings.

So do keep in mind that though a web host will not help improve your SE rankings, a bad host can certainly have a bad effect on them.

There is an additional factor to this.

Google, and probably the other major search engines, check the IP numbers on links, and it is well known that linking to a "bad neighborhood" can trigger filters which can cause your site to go down in the rankings.

That is when you link to a bad neighborhood; now imagine being in a bad neighborhood. That is, your host is hosting other bad web sites and they have the same IP, or the same class of IP, as your innocent site. Guess what Google will think? Right.

Q No 18 :- What are relative vs. absolute links?
When designing a site, you will always face the question of whether you should use relative or absolute links. Later in this answer I will explain the benefits of each but first here is a definition:

Relative: Relative links usually look something like index.html or /folder/page.html. The way the page knows where to go is all relative to the page the link is placed on. A link to index.html for example, will only work if the file is found in the current folder.

Absolute: Absolute links usually look something like http://www.example.com/page.html. This is a full url and the linked to page will be found regardless of where that link is located on the site.
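
To make the difference concrete, here are two links pointing at the same page; the folder and file names are placeholders:

<a href="/folder/page.html">Relative (root-relative) link</a>
<a href="http://www.example.com/folder/page.html">Absolute link</a>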

Which you use depends on the following:

Speed: When your browser goes to find a page with a relative url it looks within the existing site. When it uses an absolute url it leaves the site for an instant and "refinds" the page. This means when it comes to speed, relative is the way to go.

Ease of Design: When you are designing a site using notepad, the danger with relative urls is that if you move a folder, it can break your entire site. As each page depends on another, if you are not a find and replace whiz, absolute may be your best bet.

SEO: This is the one area that I would place firmly under "Theory or Assumption". The truth is that we know broken links can hurt you, so the most important thing is to choose a technique of linking that works best for your site.

Q No 19 :- How do I get my web site to show up in the search engines?
As you might have noticed the submit URL function on the major search engines does not really help that much.

The best way to get your web site listed faster in the search engines is to get links to it from authority web sites (those with high PageRank).

Getting a great link from, for example, the home page of an established PageRank 7 web site can make your new web site and its subpages get indexed in a matter of hours.

If this is not an option for you, then other good ways are to put a link to it in a big forum's signature, in directories, on friends' sites, etc.

If you wonder why only the home page is listed and not the rest, it is because the search engine spiders first visit one time and then come back for a "deep crawl". To speed up this process, the solution is to get more links to your site.

Q No 20 :- What are poison words?
Poison words are those which Search Engines will decrease your ranking for if found in your URL, Title, or Description. These can be a disaster when it comes to ranking and therefore you should be very careful to avoid them.

Poison words can be divided into three categories:

Ranking Killers (Adult Words and Obscenities)
Use of these words will cause the major Search Engines to classify the website as "Adult". For a mainstream website this can mean your visitors never find your site. If one of these words is necessary, it's always best to star out one of the letters. The most popular forum software and CMS systems will allow you to do this through their admin panels.

Ranking Decreasers
These include words the major Search Engines associate with a lower quality site. These include, but are not limited to, Links, Search Engine, Bookmarks, Resources, Directory, BBS, Paid-to-Surf and Forum. Pages containing these words will still be indexed; however, they will find it very difficult to rank for their main keywords.

Speculative
These words are similar to Ranking Decreasers except that they are merely theory at this point. They include words like "Free" or "Offer". Many of the pages using these words seemed to take a hit in the Bourbon Update.

The moral of the story is that you should be very careful with your keyword choices, especially when writing your title and description. If you have a forum or a CMS system, use the profanity filters whenever possible.

We can also make some assumptions about Google in particular by referencing the stop words used by their AdSense PPC program. These can be found on Vaughn's Summaries. There you will see a comprehensive list of keywords that stop AdSense ads from showing. Although this isn't a direct correlation, it does give some insight into the way Google views keywords.


Q No 21 :- What are stop words?
Stop words are the small, common words that the search engines ignore.

They should be avoided as much as possible in places like the title, headings and maybe some other places.

Q No 22 :- How do I create a sitemap?
A sitemap should contain links to all your pages. A good rule of thumb is to limit the number of links to under 100 per page, so you may consider splitting sitemaps up by topic, alphabetically, by brand or other logical grouping.

A useful tool to create a basic sitemap is Xenu's Link Sleuth, a free site crawler that produces an HTML sitemap which can be edited to fit your site's design.
If your content is dynamically driven from a database, you should also create your sitemap(s) from the same data.
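
A hand-built HTML sitemap can be as simple as a list of links with descriptive anchor text; the page names below are placeholders:

<h1>Site Map</h1>
<ul>
<li><a href="/">Home</a></li>
<li><a href="/widgets/">All widgets</a></li>
<li><a href="/widgets/blue-widgets.html">Blue widgets</a></li>
<li><a href="/about.html">About us</a></li>
<li><a href="/contact.html">Contact</a></li>
</ul>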

Q No 23 :- What is the best way to redirect a site?
The best method is the so-called 301 redirect, as this way you also tell the search engines that the page in question has been moved permanently (302 is temporary and is very risky for SEO purposes).

Here are the ways to do it:

Examples using .htaccess

Redirect 301 /oldfolder http://www.toanewdomain.com
Redirect 301 /oldurl.html http://www.yourdomain.com/newurl.html

If your server runs Windows and you cannot use .htaccess, or you simply prefer to handle the redirect in code, you can use PHP or ASP instead. Just add these lines at the top of the file that is being moved:

PHP

header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.newdomain.com/newdir/newpage.htm");
exit();

ASP 301
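
A commonly used classic ASP (VBScript) snippet for a 301, which I assume is what belongs here, is:

<%
' Send a permanent redirect from classic ASP
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.newdomain.com/newdir/newpage.htm"
Response.End
%>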

Redirecting using a Meta refresh or JavaScript is not recommended and may even be harmful, as this method has been used a lot by spammers and will most likely make the search engines penalize your site.


Q No 24 :- What is TrustRank?
TrustRank, as defined in a whitepaper by Jan Pedersen (Yahoo!), Hector Garcia-Molina (Stanford), and Zoltan Gyongyi (Stanford), located at http://www.vldb.org/conf/2004/RS15P3.PDF, is a system of techniques to determine if a page is reputable or if it is spam. This system is not totally automated, as it does need some human intervention.

TrustRank is designed to help identify pages and sites that are likely to be spam or those that are likely to be reputable. The algorithm first selects a small seed set of pages which will be manually evaluated by humans. To select the seed sites, they use a form of Inverse PageRank, choosing sites that link out to many sites. Of those, many sites were removed, such as DMOZ clones, and sites that were not listed in major directories. The final set was culled down to include only selected sites with a strong authority (such as a governmental or educational institution or company) that controlled the contents of the site. Once the seed set is determined, a human examines each seed page, and rates it as either spam or reputable. The algorithm can now take this reviewed set of seed pages and rate other pages based on their connectivity with the trusted seed pages.

The authors of the TrustRank method assume that spam pages are built to fool search engines, rather than provide useful information. The authors also assume that trusted pages rarely point to spam pages, except in cases where they are tricked into it (such as users posting spam urls in a forum post).

The farther away a page is from a trusted page (via link structure), the less certain is the likelihood that the page is also trusted, with two or three steps away being the maximum. In other words, trust is reduced as the algorithm moves further and further away from the good seed pages. Several formulas are used to determine the amount of trust dampening or splitting to be assigned to each new page. Using these formulas, some portion of the trust level of a page is passed along to other pages to which it links.
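
As a rough sketch of the two ideas (my paraphrase, not the paper's exact notation): with trust dampening, a page q that is one link away from a trusted page p receives t(q) = B * t(p), where B is a constant smaller than 1; with trust splitting, the trust passed on is divided among the outgoing links, t(q) = t(p) / w(p), where w(p) is the number of links on page p. Combining the two, each step away from the seed set passes on roughly B * t(p) / w(p), summed over all trusted pages p that link to q.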

TrustRank can be used alone to filter the index, or in combination with PageRank to determine search engine rankings.

Q No 25 :- What is the sandbox?
The sandbox is a kind of filter implemented by Google in March 2004 that applies to maybe 99% of all new sites.

Its function is to push down new web sites in the SERPs.

My theory is that it was Google's solution to stop new spam sites that were created to rank high in the SERPs. It probably took some time before the Google spiders could detect such a site and ban/penalize it, and by that time the creator had probably made several new ones.

When this phenomenon was first noticed in March 2004 it was seen that it could take two months before a new site was "released" and could rank normally again. However, by now (October 2005), a wait of half a year or more is normal, and waits as long as more than a year have been reported.

From my own observations I have seen that new sites can rank unusually high in the Google SERPs for some weeks before the sandbox filter gets activated.

Q No 26 :- How do I get out of the sandbox faster?
One theory is that it has to do with link aging.

This means that as soon as you put your site live, start getting quality links to it and try to keep them there forever.

The sandbox is a collective filter that still has a lot of confusion and speculation surrounding it. One of the biggest areas is how one can "escape" it or at least make your stay there shorter. The theories (and I stress they are theories) break down into the following areas.

Link Speed
Some SEOs claim the sandbox actually has nothing to do with time and is more a function of linking. There is a double-edged sword here. To rank high you need lots of links, but to avoid the sandbox you can't get too many links. As a compromise, it's proposed that you build your links exponentially, which basically means if you get 10 the first week you would get 15 the next week and 20 the following, etc. It's believed it is this slow, steady and consistent increase in your number of incoming links that causes Google to see it as a legitimate and emerging site. So in the first month you may have accumulated 300 links, but by spreading them out over the month you have avoided the sandbox filter. Alternatively, you can buy your domain before you even start designing your site and slowly accumulate links. This way, by the time you are completely up and ready, you will not be subject to the sandbox.

Link Quality and Relevance
The majority of SEOs and webmasters tend to go after reciprocal and directory links when starting their first campaign. After all, they are most likely a PR0 with a site no one has ever heard of, so they go with the one technique where these factors have little importance. What results is hundreds of unrelated and unimportant links. Google has been devaluing the importance of directories over the last year, and with the addition of hundreds a day, this trend seems here to stay. Many theories hold that if webmasters get the majority of their links from "authority" sites, they will be viewed as important and therefore not subject to the filters referred to as the sandbox.

Purely Age
There are still those who feel that age is the most important factor. If you are registering a new domain name, there is no way around this. However, many SEOs buy old and existing domain names and add their own content. If you are able to get a DMOZ-listed domain, you may help your chances even more, as it will have an existing link structure. There are many services on the web that offer lists of upcoming expirations, so you may be able to grab one at a bargain price. Be aware, though, that a number of SEOs believe these domains will be "reset" when ownership is changed, so you may be subject to the sandbox anyway.

Q No 27 :- How do I write the robots.txt?
As you may know, the robots.txt is a tool used to tell the search engines which pages not to index. It is useful if you have sensitive information or other pages you don't wish to get indexed - that is all it does.

It is simply a text file named robots.txt located in the root of your site.

It contains a line for user-agent and then what to disallow or allow.

Example:

User-agent: *
Disallow: /password.html
Disallow: /temp/


As the * is a wildcard, the above means that all user-agents will index all pages except the file password.html and the directory temp and its content.

Here is another example, disallowing everything for a bad spider:

User-agent: badspider
Disallow: /

Q No 28 :- Does a site keep its PR when ownership changes?
The good thing about PageRank is that it is blind. Unless something inappropriate is going on with an individual, Google does not care about the humans who own/buy/sell websites. They focus entirely on the domain name and the website behind it. Changes in ownership are inconsequential to Google.

PageRank rises and falls entirely on the merits of the website and Google's algorithms. The risks of PageRank falling or rising are purely technical.

If anything, Google is very forgiving of websites that go inactive. There is a huge aftermarket of entrepreneurs seeking inactive domains with PageRank attached for their resale value. Google is oftentimes slow to identify inactive domains. This is mostly due to the fact that even when a PageRanked website goes inactive, its links can remain active for years.

Last year I purchased an inactive domain name with a PageRank of 6 and several hundred links showing up in Google/Yahoo/MSN. Six months later, its PageRank dropped to a 4. In all honesty, it really should have been a 4; I just got to enjoy a PageRank of 6 for a few months. What this tells us is that Google is going to sooner or later correct a website's PageRank. But Google isn't the bad guy here; they are just applying the rules as strictly as they can with what they know. Google doesn't really worry about buying and selling because they know that the technical factors will keep everything even in the long run.

Purchasing a website in itself is no more risky than having owned it for years. A potential buyer needs to understand what the website's SEO strategy has been and how it gets its traffic. If the former owner spent a fortune purchasing traffic and the new buyer isn't planning on pursuing the same kind of traffic, then it isn't unreasonable to expect the performance of the website to degrade. If the former owner was engaged in "black hat" or other inappropriate activities in creating traffic and search engine visibility and Google catches on, then the website will suffer along with whoever might own it. When purchasing anything of value (including websites), the buyer must be diligent to protect themselves.

Q No 29 :- How can I tell my search engine rankings?
There are several ways of checking where your pages are in each Search Engine's Results Pages (a.k.a. SERPs) for a given search word or phrase.


Manually - Go to the Search Engine (SE) of interest and search like a punter would, then scroll down every page to see where you are.

Automated DIY - With most prominent SEs, you can sign up to use the Application Programming Interface (API) to query their database without using their front-end. This way you can simulate normal searches or make use of the more advanced API calls and retrieve ranking positions directly.

Automated 3rd Party - There are several 3rd party tools out there on the web made by people who have implemented an API querying interface for you. All you do is (pay and) sign up. Most are web based; some are downloads. A search in your favorite SE for phrases like "Keyword Tracker" should show some of these.

Automated the wrong way - beware of tools which find your ranking by 'scraping' ordinary SERPs instead of utilizing the purpose built API calls. SE's aren't very happy about these practices for obvious reasons.

RSS Feeds - some SE's offer RSS feeds of SERPs which you could interpret with a script of yours and find where you are. Please refer to the program's Terms of Service for details on allowed usage.

Note: Search Engines often vary SERPs based on user location. What you see in your browser is not necessarily what others see in their browser. Also, most SEs utilize a range of datacenters, all of which could potentially have different datasets or ranking algorithms. What you retrieve with an API call might well differ from what you find manually.

Some SE's now also store your search history to try and understand patterns applicable to your searching behavior. This may also cause discrepancies in your ranking position findings.

Q No 30 :- What are all the Google Operators and what do they do?
The advanced operators, as far as SEO is concerned can be divided into 2 categories: Alternate Query Types and Query Modifiers.

Alternate Query Types

cache: The query [cache:] will show the version of the web page that Google has in its cache. If you include other words in the query, Google will highlight those words within the cached document. For instance, [cache:www.google.com web] will show the cached content with the word "web" highlighted.

link: The query [link:] will list webpages that have links to the specified webpage. For instance, [link:www.google.com] will list webpages that have links pointing to the Google homepage.

Note there can be no space between the "link:" and the web page url.

related: The query [related:] will list web pages that are "similar" to a specified web page. For instance, [related:www.google.com] will list web pages that are similar to the Google homepage.
Note there can be no space between the "related:" and the web page url.

info: The query [info:] will present some information that Google has about that web page. For instance, [info:www.google.com] will show information about the Google homepage.
Note there can be no space between the "info:" and the web page url.

This functionality is also accessible by typing the web page url directly into a Google search box.

Query Modifiers

site: If you include [site:] in your query, Google will restrict the results to those websites in the given domain. For instance, [help site:www.google.com] will find pages about help within www.google.com. [help site:com] will find pages about help within .com urls. Note there can be no space between the "site:" and the domain.

allintitle: If you start a query with [allintitle:], Google will restrict the results to those with all of the query words in the title. For instance, [allintitle: google search] will return only documents that have both "google" and "search" in the title.

intitle: If you include [intitle:] in your query, Google will restrict the results to documents containing that word in the title. For instance, [intitle:google search] will return documents that mention the word "google" in their title, and mention the word "search" anywhere in the document (title or no). Note there can be no space between the "intitle:" and the following word.

Putting [intitle:] in front of every word in your query is equivalent to putting [allintitle:] at the front of your query: [intitle:google intitle:search] is the same as [allintitle: google search].

allinurl: If you start a query with [allinurl:], Google will restrict the results to those with all of the query words in the url. For instance, [allinurl: google search] will return only documents that have both "google" and "search" in the url.

Note that [allinurl:] works on words, not URL components. In particular, it ignores punctuation. Thus, [allinurl: foo/bar] will restrict the results to pages with the words "foo" and "bar" in the URL, but won't require that they be separated by a slash within that URL, that they be adjacent, or that they be in that particular word order. There is currently no way to enforce these constraints.

inurl: If you include [inurl:] in your query, Google will restrict the results to documents containing that word in the url. For instance, [inurl:google search] will return documents that mention the word "google" in their url, and mention the word "search" anywhere in the document (url or no).
Note there can be no space between the "inurl:" and the following word.

Putting "inurl:" in front of every word in your query is equivalent to putting "allinurl:" at the front of your query: [inurl:google inurl:search] is the same as [allinurl: google search].

Q No 31 :- I lost my rankings, what could have happened?
It can be a frightening experience. You check your rankings and suddenly they have either plummeted or are non-existent. You swallow hard and feel sick to your stomach. It's happened to all of us, and here are a few possible explanations.

Fluctuation
This is the most common explanation and happens on a daily basis. Search Engine Result Pages (SERPs) are in a constant state of flux, with Google leading the pack. Gone are the days of the "Google Dance". This has also meant, unfortunately, that you can be Top 10 one day and disappear the next. The only thing you can do in this case is be patient and give it a few days to see if your rankings reappear.

Stop or Poison Words
Have you added new content lately? It is possible you have added what Google calls "Stop" or "Poison" words. These may include adult words or common words like Links, Search Engine, Bookmarks, Resources, Directory, BBS, Paid-to-Surf and Forum which, although they won't knock you out of the SERPs, can drop you a few spots.

Broken Links
The major Search Engines view a site with dead-end links as "broken". In some cases this can cause a major hit to your rankings. It is always advised to run a check using a tool like a Broken Link Checker and fix what it finds. I have seen a site rebound time after time within days of fixing this issue.

Links to Bad Neighborhoods
A single link to a banned site or a bad neighborhood can torpedo your rankings. A banned site can usually be recognized as a site that had previously been indexed but is no longer in Google. A grey toolbar can indicate either a new site or one that has been banned. A bad neighborhood is one that employs black hat tactics. This may include a link farm. Link farms are one of the most often asked about things, and the general rule of thumb is "if it looks like one, it is one". Avoid linking to sites with hundreds of links on their pages and little to no content.

Use of Black Hat Methods
This will usually result in an all-out ban and includes things like hidden text, cloaking, etc.
To get back into the index usually requires you to remove the offending method and email Google (or whichever SE banned you) asking to be reindexed. In most cases you will be successful, although be prepared to wait several weeks if not months.

These are just a few of the reasons for a sudden loss in rankings, though they are the most common. The best rule of thumb is to always build and optimize for visitors first and Search Engines second. If you stick to this, you should enjoy more stable rankings without the fear of your site disappearing overnight.

A few more reasons other than those mentioned are as follows:

An algorithm update.
An algo update is different from a "regular database update". In the case of a database update, you will possibly regain your positions without doing anything in a few days. However, in an algo update, you will have to study the new ranking criteria based on the search results. This could be very difficult if you are unfamiliar with SEO practices, and this is the reason why it is said that SEO is an ongoing process and not a one-time thing.

Site update at your end.
You may have added new content to a web page that is in the search engine results. Generally a change of about 5-10% is considered fine and you will not see a fall in your ranks. However, a change much larger than that might be an indication that a large amount of data has been added or replaced and needs to be reindexed and analyzed. During this reindexing, the search engine may not show your web site in the position you saw it in earlier. This is possibly viewed as a "site redoing" by search engines, and to maintain the quality of the SERPs they may drop your page from its position for a very short time. Whether the positions are retained will depend on factors like the relevancy of the data to the keywords, the richness of the data (popularly known as content value), uniqueness (no duplicate content), and the use of black hat SEO techniques on the pages. The percentage calculated includes the HTML coding, JavaScript, content, and other factors. An example I can mention here is "adding many pages to a web site suddenly". The listings are generally dropped, and the position is only retained if the content is found to be of good quality. But if it is found to be automatically generated content, then it's possible that the listing will be dropped for a long time. Search engines list pages and not web sites in the SERPs, but they do consider the whole web site when listing a single page. If adding poor content to a website can drop a quality page from that web site, then it is certain that if poor content is added to the page itself, it will drop in the listings.

It is best to consult your SEO before making changes to your pages. If you have not followed any black hat techniques, then this could be a possible reason for a drop in listings.

Q No 32 :- What is ModRewrite and when should I use it?
Mod_Rewrite is an Apache module, usually configured through the .htaccess file, which allows for all sorts of nifty tricks with URLs, error pages, etc.

Mod_Rewrite specifically is used mostly for changing dynamic links, such as:

http://www.example.com/shop.php?cat=4&id=123

into search engine friendly static URL's, such as:

http://www.example.com/shop/4/123.html

While the dynamic link won't get indexed easily and likely not at all without external sites linking directly to that URL, the static link will get indexed with simple internal linking.

Mod_Rewrite is a very powerful tool for creating static links for content management systems, forums, and the like. From personal experience, after converting dynamic links on a games website I run to static links using .htaccess, my search engine results have multiplied many times over.
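
A minimal .htaccess sketch for the example above; the parameter names and regular expression are assumptions based on the URLs shown, so adapt them to your own script:

RewriteEngine On
# Map /shop/4/123.html onto the real dynamic script shop.php?cat=4&id=123
RewriteRule ^shop/([0-9]+)/([0-9]+)\.html$ shop.php?cat=$1&id=$2 [L]

The visitor and the search engine spider only ever see the static-looking URL; Apache quietly rewrites it to the dynamic one behind the scenes.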

Q No 33 :- In the long run, how do I get enough or more visitors?
1) Good content and lots of it, updated regularly over a period of months/years

2) Onpage optimization (static URLs, proper title and meta tags, use of H1, H2 tags, etc)

3) Offpage optimization (link popularity, building up inbound links)

4) Community building tools (forums, article comments, whatever applies)

5) Viral marketing (release something people want for free with some tag to your site on it)

6) Promotion and Advertising (promote your site wherever you go however you can)

Q No 34 :- What are "Supplemental Results" in Google?

http://www.google.com/webmasters/faq.html

Q No 35 :- Why is my site labeled "Supplemental"?

Supplemental sites are part of Google's auxiliary index. We're able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.

The index in which a site is included is completely automated; there's no way for you to select or change the index in which your site appears. Please be assured that the index in which a site is included does not affect its Page Rank.

Q No 36 :- How can we attract GoogleBot?
First of all, the GoogleBot must find your page through links from other pages that are already indexed by Google.

Then to get GoogleBot to visit again and again you should add fresh content frequently - for example by using a blog.

Q No 37 :- What is the Future of SEO?
Trust level: Theory, assumption or speculation

The future of SEO is undoubtedly one where:

1) one-way text links from relevant pages continue to be the most valuable links

2) reciprocal linking continue to decline

3) the 'shotgun' approach to link buying declines

4) mass email link requests decline

5) free directory submission declines

6) niche directory submission increases

7) article PR (article submission) increases

8) article submission sites (e.g. EzineArticles, GoArticles, and ArticleBlast) play a much bigger and more important role in helping online publishers locate quality articles (due to the increasing article volume)

9) user popularity is just as important as link popularity, which means:

10) the quality of article PR improves in order to increase site traffic, credibility, and loyalty

11) the quality of website content improves in order to convert traffic and encourage repeat visits

12) Clearly, the choices for SEOs will be pretty much limited to paying for links at niche sites and/or engaging in article PR. Being an SEO copywriter, I may be a little biased, but for me, article PR is the hands-down winner in this comparison:

13) It satisfies Google's criteria for relevance and importance. Linking site owners include your article and link because, in doing so, their site becomes more useful to visitors, and their business gains credibility and authority.

14) It generates hundreds of free links quickly enough to make it worth your while, but not so quickly as to raise red flags at Google (in the form of link dampening).

15) Links are permanent and you don't have to pay to keep them there.

16) You get a lot of qualified referred traffic who already trust you and your expertise. This satisfies Google's visitor popularity criteria, while at the same time bringing you a lot of extra customers.

Q No 38 :- What is Google Analytics?
Google Analytics is a free web-stats solution which not only reports all the regular site stats, but also integrates directly with Google AdWords giving webmasters an insight into the ROI of their pay-per-click ads. According to Google, "Google Analytics tells you everything you want to know about how your visitors found you and how they interact with your site."

Why is this such a landmark move? Because for the first time ever, Google will have access to your real web stats. And these stats will be far more accurate than those provided by Alexa. Furthermore, Google's privacy statement says: "We may also use personal information for auditing, research and analysis to operate and improve Google technologies and services.". Now let's put two and two together:

Google is 'giving' every webmaster in the world free access to quality web-stats.

Millions of webmasters will accept this 'gift', if only because it integrates directly with their Google AdWords campaigns.

Google will then have full access to the actual web stats of millions of commercial websites.

Google will have the right to use these stats to develop new technologies.


Q No 39 :- What is the difference between IP delivery and cloaking?
Here is a comment from Matt Cutts of Google dated April 18, 2006 on his blog:

IP delivery: delivering results to users based on IP address. Cloaking: showing different pages to users than to search engines.

IP delivery includes things like "users from Britain get sent to the .co.uk, users from France get sent to the .fr". This is fine; even Google does this.

It's when you do something *special* or out-of-the-ordinary for GoogleBot that you start to get in trouble, because that's cloaking. In the example above, cloaking would be "if a user is from Googlelandia, they get sent to our Google-only optimized text pages."

So IP delivery is fine, but don't do anything special for GoogleBot. Just treat it like a typical user visiting the site.

So it all comes down to intent.

Wednesday, April 9, 2008

Google SEO Glossary

Aging delay. Term describing a set of filters applied to new websites whereby the site cannot rank well (or at all) for any competitive keywords for 6 – 24 months. Also called the Sandbox.

Algo, Algorithm. A specific mathematical process for achieving a desired result. Google uses a proprietary algorithm that contains over 100 different criteria to rank Web sites in a specific order based on a specific search request.

Algorithmic listing. Any search engine listing that is on the “free” or unpaid section of a search results page. These listings are obtained using SEO techniques without the use of paid advertising. Also called organic, natural or editorial listing.

Anchor text. The clickable portion of text displayed (usually as blue, underlined text) for a link. Also known as link text.

Authority. Site with a high number of incoming links and a relatively low number of outgoing links. Opposite of hub.

Backlinks, backward links. Links from other sites that point to your site. Also known as inbound or incoming links.

Cascading Style Sheet (CSS). Code that defines the visual appearance, style (size, color, font), or positioning of text on a Web page. This code can be located either on the page it is used on or can be stored in a separate (.css) file.

Conversion rate. The percentage of visitors to a website that end up performing a specific action that leads to a sale. Such actions can include the purchase of a product, the submission of a form, or an email requesting more information.

Crawl. The operation of reading or analyzing pages of a website by an automated program called a spider or robot. Spiders crawl your site by following links on each page of your site. After crawling, the spider will return the results back to the search engine for later inclusion into its database for indexing.

Directory. As opposed to search engines, search directories use humans to review and place websites in alphabetical order under defined categories and sub-categories. The best-known directories are Yahoo! and the Open Directory Project (ODP).

DMOZ. Another term for the Open Directory Project.

Editorial listing. Any search engine listing that is on the “free” or unpaid section of a search results page. These listings are obtained using SEO techniques without the use of paid advertising. Also called organic, algorithmic or natural listing.

Everflux. Term used for the constantly changing search results that occur regularly.

External links. Links located on websites other than your own.

Googlebot. The name given to the main Google spider that crawls sites.

Google AdWords™. Google’s Pay-Per-Click (PPC) advertising program, whereby your site is listed in the right-hand side of Google search result pages in a small box. This type of advertising involves an auction where you bid, along with your competitors, for the cost per click for a specific keyword.

Google bombing. Term used to describe the process of artificially altering the ranking of a page by the use of links. It requires a concerted group effort from many different site owners who all agree to use the exact same link text in links that point to the same site. The linked-to site may not even contain the text used anywhere on the page.

Google dance. Older term designating the time period where Google updates their index, which results in site rankings that jump around, sometimes minute by minute. This is caused by Google running PageRank calculations for all pages repeatedly until the values reach a steady-state.

Google Directory™. The Google Directory lists those websites that are in the Open Directory Project (ODP), then ranks them according to PageRank alone.

Google Toolbar™. A downloadable program that attaches to your browser, allowing you to see a public approximation of the PageRank (PR) value of a page, along with the external sites that link to that page.

Hub. Site with a high number of outgoing links and a relatively low number of incoming links. Opposite of authority.

Inbound, incoming links. Links that reside on another website that point to your website. Also known as backlinks or backward links. The opposite of inbound links are outbound links.

Index. Term used to denote the database that stores information about every web page for every website that a search engine has crawled (visited). If your website is included in the Google database (index), it is said to be indexed.

Index page. Another name for a home page. Many home pages are named index.html so that Web servers will display this page by default.

Internal links. Links that are located on pages within the same website. As opposed to external links, which are links that are located on a different website.

Inline links. Links that are part of a sentence in a paragraph on a page, rather than simply listed in a menu bar or a links page without any surrounding text.

IYP. Internet Yellow Page directories such as Verizon Superpages, SMARTPages and other local-based directories like Google Local and Yahoo Local.

Keyword phrase. General term used to define a specific word phrase that best describes the main topic of a web page. Synonymous with a search phrase that a visitor enters into a search engine to find specific information.

Keyword. General term used to define the main topic of a page. Synonymous with search term. A group of keywords used together in a phrase is called a keyword phrase. Google looks for keywords on a page that match searched-for terms.

Keyword density. The number of times a keyword is used on a web page divided by the total number of words on the page. Expressed as a percentage.
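
That formula is simple enough to sketch in a few lines of Python; the sample text and keyword below are made-up examples:

def keyword_density(text, keyword):
    # occurrences of the keyword divided by total words, as a percentage
    words = text.lower().split()
    return words.count(keyword.lower()) / len(words) * 100

sample = "seo tips and seo tools for better seo"
print(f"{keyword_density(sample, 'seo'):.1f}%")   # 3 of 8 words -> 37.5%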

Keyword prominence. How close to the beginning or top of a web page that a keyword is found.

Keyword proximity. How close together the individual words that make up a keyword phrase are to one another, and in what order.

Keyword weight. Also known as keyword density.

Landing page. Generally speaking, the web page that a person reaches when clicking on a search engine listing or ad. This may be any page of the site. For paid advertising, it is common to have multiple ads, each one linking to a specific landing page on the site that is targeted specifically for that ad.

Latent Semantic Indexing. A technology used by Google that factors in synonyms and related keyword phrases when ranking a page for a specific keyword. A page could rank well for a related keyword that may not even appear on the page.

Link quality. A general term referring to link reputation and link strength. Links with high quality are those where the PageRank of the linking page is high, and where your keywords are used in the link text and in the title of the page the link is on.

Link popularity. A term referring to the number of incoming links to your site.

Link reputation. A term referring to how closely link text matches the title of the page the link is on and, more importantly, the text on the page that the link points to.

Link strength. Dependent on the PageRank of the linking page as well as the number of other links on the page. Also referred to as link voting power.

Link text. The clickable portion of text displayed (usually as blue, underlined text) for a link. Also known as anchor text.

LocalRank. A variation of basic PageRank whereby links from sites that share the same Class C IP address block are weighed less (are worth less) than links from a variety of different IP addresses (different servers owned by different businesses).

META tags. HTML tags located in the HEAD section of a web page that specify information that is viewable only to a search engine. The two most commonly-used META tags are the “Keywords” META tag and the “Description” META tag. Most search engines ignore META tags today due to their abuse in the past – however Google and others still use the contents of the Description META tag when listing web pages. In addition, the “Robots” META tag can be used to prevent search engines from indexing a web page.

Natural listing. Any search engine listing that is on the “free” or unpaid section of a search results page. These listings are obtained using SEO techniques without the use of paid advertising. Also called organic, algorithmic or editorial listing.

Off-page factors. Those elements of a website that are not located on your website (such as incoming links). Off-page factors are largely out of your control.

On-page factors. Those elements of a website that are located on your website (such as keywords). You are in control of on-page factors.

Robot. The software program which a search engine runs to read and analyze your site. See also spider. Google’s robot is called Googlebot.

ROI. Return On Investment. The amount of revenue generated from a specific marketing expense, expressed as a percentage.
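
One common way to compute it is profit relative to the expense; here is a Python sketch with made-up figures:

ad_spend = 500.0    # marketing expense (example figure)
revenue = 1200.0    # revenue generated by that expense (example figure)

roi = (revenue - ad_spend) / ad_spend * 100
print(f"ROI: {roi:.0f}%")   # -> ROI: 140%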

Sandbox. Term describing a set of filters applied to new websites whereby the site cannot rank well (or at all) for any competitive keywords for 6 – 24 months. Also called the aging delay.

Search Engine Marketing (SEM). A general term that encompasses both paid and “free” forms of advertising a website using search engines. SEO is one type of SEM. The other major type of SEM is Pay-Per-Click advertising (PPC).

Search Engine Optimization (SEO). A general term used to describe specific techniques that can be used on websites in order to rank them favorably with search engines.

Search Engine Positioning (SEP). A term used interchangeably with SEO. However, since search engine optimizers do not actually "position" pages within the search engines, the term is something of a misnomer. SEP more closely describes Pay-Per-Click (PPC) advertising, since that is the only way a site can actually be positioned in a search engine.

Search Term. The word or words a person enters into a search engine's search box. Also synonymous with keyword or query term.

SEMPO. Search Engine Marketing Professional Organization. A non-profit group whose focus is to increase awareness of, and educate people on, the value of search engine marketing.

SERP. Search Engine Results Page. The page or pages that a search engine displays after a search query for a certain search term or phrase.

Server log. The data file that a Web server produces (usually daily) that lists website traffic activity by domain. Web statistics programs use the server log file to produce graphic reports. See statistics.

Spider. The software program, also known as a robot, which a search engine runs to read through and analyze your site. Google’s spider is called Googlebot.

Statistics, stats. The data associated with visitor traffic to your site over time.

Theme. The overall subject area, topic, or category of a web site.

Topic-Sensitive PageRank. A variation of basic PageRank whereby a web page is assigned a different PageRank score for each topic the page covers.

Tracking URL. Typically used in paid ads, such as Google AdWords, where a unique code is added to the end of a link in order to track visitors who click on that ad. Tracking URLs allow you to measure the popularity of an ad.
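
A tracking URL is simply the landing page address with extra query-string parameters; here is a small Python sketch, where the parameter names are arbitrary assumptions rather than anything AdWords requires:

from urllib.parse import urlencode

landing_page = "http://www.example.com/widgets.html"
tracking_params = {"source": "adwords", "ad": "widget-ad-01"}   # example codes

tracking_url = landing_page + "?" + urlencode(tracking_params)
print(tracking_url)
# -> http://www.example.com/widgets.html?source=adwords&ad=widget-ad-01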

TrustRank. A variation of basic PageRank whereby links from sites that are “trusted” or “white-listed” by Google carry more weight (are more valued) than other links.

Vote, voting. When one website links to another website, it “casts a vote” for the other website. The strength or weight of this “vote” depends on the PageRank of the page and the number of other links on the page.
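
A common simplified model of this (a sketch only, not Google's actual formula) is to divide the linking page's PageRank among all the links it contains:

linking_page_pagerank = 4.0   # PageRank of the page casting the vote (example)
links_on_page = 20            # total links on that page (example)

vote_weight = linking_page_pagerank / links_on_page
print(f"Approximate weight passed by each link: {vote_weight:.2f}")   # -> 0.20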

Yahoo. A popular search directory (as opposed to a search engine). All Web sites listed on Yahoo are first reviewed by a human editor.