Archive for the ‘seo’ Category

Google Giveth and Google Taketh Away

Michael Martinez over at SEO Theory recently posted an interesting article on contract law, terms of service, and how they apply to the web. It’s worth a read and raises some interesting points, but my main beef is the loosely stated complaint about Webmaster Guidelines, specifically those of everyone’s favorite search engine. This complaint, and variations on it, have been bouncing around the SEO/M and Affiliate communities for a while now and I’ve heard more than enough whining on the subject. Frankly stated, it seems they don’t like the fact that they actually need to work in order to keep their spam profitable on the search engine result pages.

Google, as with most other search engines, is a business. In order to actually remain in business, contrary to popular belief on the web, they need to make a profit. In most web business models profit is proportional to the number of users. So far, at least in my opinion, all of this is web business 101. Google has its number of users for one reason: as of now it’s arguably the easiest and most accurate search engine available. Google will retain its users as long as it remains the easiest and most accurate search engine available.

Easy is something that Google has down pat; you can’t get much easier than a single input and a button. Accurate is where it gets interesting. In order to remain accurate Google needs to be unmanipulatable. Their algorithm needs to return the most relevant and authoritative content possible, and that means excluding spam. If you’re not publishing the most relevant and useful content out there you don’t deserve to be listed, let alone rank on the first page.

For better or for worse, the bulk of SEO exists to manipulate the search engines, and if you think otherwise you’re seriously deluding yourself. Don’t get me wrong, I believe SEO is absolutely necessary; if you don’t at least try to be listed in the search engines there’s a pretty good chance your site will never be found. However, SEO is only the start; it’s the framework to build your content upon. Good SEO establishes a solid base for accessibility, findability, and information architecture, which is a good thing. Good SEO, however, is not magic. If you do it the Google-approved “right” way it will probably take a decently long time to get a specific ranking, but once you have it, it should be pretty difficult to lose. Taking a shortcut and ignoring the webmaster guidelines may prove useful and in some cases successful, but it comes with the underlying risk of being delisted altogether.

Basically, what I’m saying is that SEO, be it black hat or white hat, is a gamble. It’s a simple question of risk versus reward, and relies very heavily on your business model. If your business model is to make a quick buck over the short term, by all means, go black hat, but don’t complain when you’re discovered and your profit dries up. However, if your business model is to make a long-term name for yourself or your business, go white hat, take your time producing quality, relevant content, and rely on Google to keep the spam from appearing ahead of you in the SERPs.

Either way, Google will continue doing what they do, producing to the best of their ability the most relevant SERPs for a given query, and they’ll change their algorithm whenever necessary to make it happen. Instead of complaining about the Webmaster Guidelines, thank Google for them; without them you’d be shooting in the dark. Instead of complaining about quality guidelines, thank Google for them; the higher the consistent quality of the ads and results Google displays, the greater the chance they’ll be clicked on.

Complaining about and attempting to change Google’s practices on these points will not help you in the long run. Take a second and think about this. If Google lowers its quality control standards to appease the SEOs and affiliate marketers, Google becomes less useful to the end user. If Google becomes less useful, fewer end users will actually use it. If fewer end users actually use Google, you have fewer potential customers, and like it or not your profits are going to be lower as well.

Fixing the SERPs When Changing Permalink Structure in WordPress

As of today, if you run a Google search to see what pages of my site are in the index you’ll find quite a bit. However, the pages in the index reflect a URL structure that I’m not thrilled with for SEO reasons. In order to remedy this problem I did some playing with Google’s webmaster tools and my .htaccess file. We’re going to employ some 301 redirects to get the SERPs pointing in the right place, remove some old pages from the index, and modify my robots.txt. Hopefully these changes will concentrate my search results to show off the real content of this site and lose some of the fluff. Along the way, we’re also going to find out how long it takes for these corrections to take place.

First things first, here’s the problem: if I do a Google site search like site:www.ericdelabar.com I get results that look like this:

serp.gif

Notice the entries that look something like www.ericdelabar.com/?cat=11? They all work, but they’re not ideal for SEO, so the first stop is the .htaccess file.

redirect 301 /?cat=1 http://www.ericdelabar.com/
redirect 301 /?cat=5 http://www.ericdelabar.com/category/standards
redirect 301 /?cat=6 http://www.ericdelabar.com/category/seo
redirect 301 /?cat=7 http://www.ericdelabar.com/category/firefox-extensions
redirect 301 /?cat=9 http://www.ericdelabar.com/category/eclipse
redirect 301 /?cat=10 http://www.ericdelabar.com/category/the-hard-way
redirect 301 /?cat=11 http://www.ericdelabar.com/category/css
redirect 301 /?cat=12 http://www.ericdelabar.com/category/view-from-the-trenches
redirect 301 /?cat=13 http://www.ericdelabar.com/category/user-behavior
redirect 301 /?cat=14 http://www.ericdelabar.com/category/usability
redirect 301 /?p=5 http://www.ericdelabar.com/2007/02/in-beginning-there-was-doctype.html
redirect 301 /?p=7 http://www.ericdelabar.com/2007/03/lets-talk-about-tools-part-2.html
redirect 301 /?p=8 http://www.ericdelabar.com/2007/03/css-things-i-learned-hard-way-absolute.html

Each line in the file is a rule. The word redirect is the command; the number is the type, in this case 301, which means “moved permanently”; the original URL comes next, and finally the desired URL. Looking at my list, I have two basic kinds of redirect here: redirects that convert from category ID to category name, and redirects that convert from post ID to post slug. These lines must come before the # BEGIN WordPress comment in your .htaccess file; if they do not, they will not work.
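One thing worth checking on your own server: as far as I know, the Redirect directive compares only the URL path and never looks at the query string, so rules keyed on ?cat=N may not fire everywhere. A common alternative is mod_rewrite with a RewriteCond on the query string. The sketch below generates such rules from a category table. The helper names are hypothetical, the table is just a sample of the IDs listed above, and the result is untested against this particular site.

```python
# Hypothetical helper: build mod_rewrite equivalents of the query-string
# redirects. The slugs are a sample from the list above; names are assumptions.

SITE = "http://www.ericdelabar.com"

CATEGORY_SLUGS = {
    5: "standards",
    6: "seo",
    11: "css",
}


def rewrite_rule(param, value, target_path):
    """Return the two .htaccess lines that 301 /?param=value to target_path."""
    return [
        # Match the whole query string, so "cat=5" does not also catch "cat=50".
        f"RewriteCond %{{QUERY_STRING}} ^{param}={value}$",
        # "^$" is the site root in .htaccess context; the trailing "?" drops
        # the old query string from the redirect target.
        f"RewriteRule ^$ {SITE}{target_path}? [R=301,L]",
    ]


def category_rules(slugs):
    """Emit one condition/rule pair per category id, in id order."""
    lines = ["RewriteEngine On"]
    for cat_id, slug in sorted(slugs.items()):
        lines += rewrite_rule("cat", cat_id, f"/category/{slug}")
    return lines


if __name__ == "__main__":
    print("\n".join(category_rules(CATEGORY_SLUGS)))
```

The same rewrite_rule helper would cover the post redirects by passing "p" as the parameter and the post's .html path as the target. Like the plain redirect lines, the generated rules would need to sit above the # BEGIN WordPress comment.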

Now that categories and posts are redirecting, there is one final problem: the last few pages of results contain quite a few search result pages. These pages have a URL pattern something like the following: http://www.ericdelabar.com/index.php?s=firebug. I don’t really want my search result pages in the SERPs, but I’m not ready to get rid of them all yet. To remedy this I simply installed the Search Permalink plug-in, which redirects the ?s=keyword pattern to /search/keyword/. That’s not perfect, but it looks a little nicer than the query string. In the long run I will probably remove them from the SERPs altogether, but I haven’t decided on my search strategy for this site yet, so I’ll leave that for another day.

Next we want to keep Google out of the site admin, so we add the WordPress login page and admin section to the robots.txt file by adding the following lines.

Disallow: /wp-login.php
Disallow: /wp-admin
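
Before asking Google to drop anything, it can help to confirm the rules block what you expect. Here is a minimal sketch using Python's standard urllib.robotparser. The User-agent: * line is an assumption about what sits above these two lines in the real file, and is_allowed is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow lines come from the post; the "User-agent: *" line is an
# assumption about the rest of the robots.txt file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
Disallow: /wp-admin
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if the given crawler may fetch the path under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://www.ericdelabar.com" + path)


if __name__ == "__main__":
    for path in ("/wp-login.php", "/wp-admin/index.php", "/category/seo"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Disallow rules are prefix matches, so /wp-admin also covers everything underneath it.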

Next, we’re going to use the Google Webmaster Tools to remove the wp-login and wp-admin pages from the index. This requires that the pages have either a robots.txt file disallowing them or a robots metatag with noindex specified. Since our robots.txt file should handle this, our request should look like the following:

remove_page.gif

Finally, we resubmit the site for crawling and hope that this gets cleared up within a few days.

Now, this methodology only addresses Google specifically. My site has also been indexed in MSN, Yahoo, and Ask, and steps will have to be taken to resolve those as well, but fixing Google is definitely a good start!

Best Laid Plans

I tried. I even had a rough draft, but when I went to try out my concept it didn’t work. My article on Friday was supposed to be on 301 redirecting from the old WordPress URL structure to a new URL structure, but after doing my research, I’m not sure it’s even possible. My guess is that the Redirect directive (which lives in mod_alias) ignores URL parameters, but I definitely need to do some more homework.

The problem I was trying to solve is that Google has quite a few of my pages indexed, but they’re in the wrong permalink structure and I’d like to convert them to the friendlier structure that I have now. Looking around the web there are a few WordPress plug-ins that are supposed to help you migrate, but it seems they only migrate from one non-default structure to another. Currently, most of the links still work, but I’d prefer to have the new structure in the SERPs to gain the SEO benefits of having keywords in a URL.

So, as of now, the plan is to solve my problem and then post the article. Hopefully, if all goes well, there will also be another new article on this coming Friday, but only time will tell.

In the beginning there was DOCTYPE

Alright, in the beginning DOCTYPE didn’t do much; browsers didn’t start using it to pick a rendering mode until about the time XHTML was released. However, if you want to do the Web 2.0 thing right, it helps to start on a solid base.

My goal here is to get a brand-new HTML document up and running as a good base for designing a Web 2.0 application. Today, we’ll look at the parts of a document that the typical user doesn’t actually see, but that play a huge role in how a user finds your site and how it’s actually rendered on their screen.

First things first: make your DOCTYPE the first line of the file. That’s right, line number one: no XML declaration, no spaces, no server-side output, the first line of the file. This ensures that you don’t end up in quirks mode accidentally. Now, as far as DOCTYPEs go, I really only see two options, HTML 4.01 Strict and XHTML 1.0 Strict. There are theoretically valid arguments for not using XHTML, most of which have to do with the simple fact that most web servers don’t serve it as XML and most browsers don’t read it as XML, but that’s a different article for a different day. Today, we’re using XHTML 1.0 Strict because the client insists that we use the latest technology regardless of whether or not it’s appropriate, so this is what the DOCTYPE declaration for XHTML 1.0 Strict looks like:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


Remember, that’s the very first line of your file!
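
If templates or server-side includes tend to sneak output in ahead of the DOCTYPE, a tiny check can catch it. This is a minimal sketch with a made-up function name. It deliberately follows the strict rule stated above (nothing at all before the DOCTYPE), which may be stricter than some browsers actually require.

```python
XHTML_STRICT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)


def doctype_is_first(markup):
    """Return True only if the markup begins with a DOCTYPE declaration."""
    # No lstrip() on purpose: the rule above allows nothing before the
    # DOCTYPE, not even whitespace or an XML declaration.
    return markup.startswith("<!DOCTYPE")


if __name__ == "__main__":
    good = XHTML_STRICT + "\n<html>...</html>"
    bad = '<?xml version="1.0" encoding="UTF-8"?>\n' + good
    print(doctype_is_first(good), doctype_is_first(bad))
```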

Next, let’s get the basic XHTML in there:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
 <head>
  <meta http-equiv="Content-Type"
        content="text/html; charset=UTF-8"/>
 </head>
 <body>
 ...
 </body>
</html>


Next, let’s get some very basic SEO framework in there and add some meta elements. Meta elements are sometimes called meta tags, but since we’re working in an XHTML document we’ll use the XML terminology. We’ll use description and keywords, and throw author in there for good measure.

<meta name="keywords" content="page, site, title, keyword, doctype"/>
<meta name="author" content="Eric DeLabar"/>
<meta name="description" content="A short description of the page content."/>


Great, now there’s a place for keywords and a description, but why do we need them? Keywords are old-school; I don’t even know if modern crawlers still look at them, but I do know that, like all meta elements, they should be specific to the page you’re writing them for. So, as a means of guiding my SEO, I usually choose the 5-10 keywords that I want to apply to my current page, then make sure that I use those keywords as I’m writing the page copy.

The description meta element is slightly different, because it’s visibly used. Some search engines display it as the description of your page when it appears as a result. As a guide to writing a description, keep it short, around 128-256 characters. Keep in mind that if it’s too long it will most likely be truncated, so just write a sentence describing the current page.
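
The length guideline is easy to check mechanically. A minimal sketch follows; the 128 and 256 limits are simply the rough range suggested above, not a figure published by any search engine, and description_status is a made-up name.

```python
MIN_LEN, MAX_LEN = 128, 256  # the rough range suggested in the post


def description_status(text):
    """Classify a meta description as 'short', 'ok', or 'long' by length."""
    length = len(text.strip())
    if length < MIN_LEN:
        return "short"
    if length > MAX_LEN:
        return "long"
    return "ok"


if __name__ == "__main__":
    print(description_status("A short description of the page content."))
```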

Now, we’ll have a look at the page title:

<title>Descriptive Page Title - EricDeLabar.com</title>


Like the meta description, the title element is displayed as part of a search result list, so a descriptive title can be very helpful for drawing users into your site. Some SEO experts also believe that keywords in your title are more heavily weighted in search rankings. Keep the title shorter than your description, and maybe consider including your branding at the end. I like the branding at the end because if the title does get truncated by the search engine, more of the descriptive part, which matters more, is still displayed.
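
To see why the branding position matters, here is a small sketch that truncates the same title both ways. The 40-character cutoff and the truncate helper are assumptions for illustration; real search engines pick their own cutoffs.

```python
def truncate(title, limit):
    """Cut a title at an assumed display limit, adding an ellipsis if cut."""
    if len(title) <= limit:
        return title
    return title[:limit].rstrip() + "..."


PAGE = "Fixing the SERPs When Changing Permalink Structure"
BRAND = "EricDeLabar.com"

brand_last = f"{PAGE} - {BRAND}"
brand_first = f"{BRAND} - {PAGE}"

if __name__ == "__main__":
    # With the brand last, the visible part is all descriptive words.
    print(truncate(brand_last, 40))
    # With the brand first, part of the visible space goes to the brand.
    print(truncate(brand_first, 40))
```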

Finally, let’s add the stylesheets:

<link rel="stylesheet" type="text/css" href="/style/screen.css" media="screen"/>
<!--[if lte IE 6]>
  <link rel="stylesheet" type="text/css" href="/style/iefix_screen.css" media="screen"/>
<![endif]-->


Now I’m a bit of a purist, and I’ll have a later post describing my process for styling a page, but notice for now that I usually have a minimum of two stylesheets. The first is a clean stylesheet (as in NO hacks/filters/whatever you like to call them) that handles all of the (mostly) standards-compliant browsers; the second sits inside an Internet Explorer conditional comment and handles older versions of IE. In this case, my iefix stylesheet is for lte IE 6, which translates to “less than or equal to Internet Explorer 6.” Notice that the iefix stylesheet comes after the other stylesheet(s); this is so that any rules in the iefix stylesheet with equal or greater CSS specificity will override the rules in the standard stylesheet. (More on specificity in a later post.)
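
The “equal or greater specificity” point can be modeled in a few lines. This is a simplified sketch of the cascade for two author stylesheets only; it ignores !important and user styles, and the Rule type and sample values are made up for illustration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    sheet: str
    specificity: tuple  # (ids, classes, elements)
    order: int          # load order; later stylesheets get higher numbers
    value: str


def winner(rules):
    """Pick the rule that applies: highest specificity, then latest order."""
    return max(rules, key=lambda r: (r.specificity, r.order))


if __name__ == "__main__":
    screen = Rule("screen.css", (0, 1, 0), 0, "width: 200px")
    iefix = Rule("iefix_screen.css", (0, 1, 0), 1, "width: 190px")
    # Equal specificity: the later stylesheet (iefix) wins.
    print(winner([screen, iefix]).sheet)
```

If the rule in the main stylesheet were more specific, it would still win even though the iefix stylesheet loads later, which is why the fix rules need at least matching specificity.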

That’s it! Congratulations, you have a nice, clean base to start your Web 2.0 website on! If you’re curious, the finished product looks something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <meta name="keywords" content="page, site, title, keyword, doctype"/>
  <meta name="author" content="Eric DeLabar"/>
  <meta name="description" content="A short description of the page content."/>
  <title>Descriptive Page Title - EricDeLabar.com</title>
  <link rel="stylesheet" type="text/css" href="/style/screen.css" media="screen"/>
  <!--[if lte IE 6]>
    <link rel="stylesheet" type="text/css" href="/style/iefix_screen.css" media="screen"/>
  <![endif]-->
 </head>
 <body>
 ...
 </body>
</html>