Dev Techniques (Tips & Tricks)
I’ve spent some time today comparing Google Analytics data with the data now shown in Google Webmaster Tools. I was intrigued after hearing about Google’s announcement on Friday that they were now offering more data in Webmaster Tools, including top search queries complete with click-through data. Today I learned that the data is not really accurate yet.
In most cases, my sites are getting many more actual click-throughs (as shown in Analytics and server logs) than Webmaster Tools is even registering in impressions. For example, Analytics shows 61 visits for a certain keyword that Webmaster Tools reports as getting only 58 impressions; since a click requires an impression, the numbers can’t both be right. Time to wait for more data to come through.
It will certainly be nice once the data is more accurate. Makes me wonder if Google will eventually allow you to create goals in Analytics that incorporate impression and click-through data for certain organic keywords, essentially adding one more level of visualization for your data. That would be awesome! As they say in that Coke commercial, “scientists, I’m looking at you!”
I just had to write a quick blurb about the astute post written a few weeks ago by Brent Nef, a brilliant Laboratory Lingo contributor. His article on PubSubHubbub is ahead of the curve on technology trends and how they relate to Google and its tools and services. Recently, Google has made efforts to push (no pun intended) a new protocol that allows it to index new web content in real time. The protocol, PubSubHubbub (PuSH for short), has quickly become an exciting technology that lets Google have the web come to it instead of having to go find the web. ReadWriteWeb published an article highlighting some comments from Dylan Casey, a product manager at Google. This is an example of how the web may become programmable someday, with many applications talking to each other and using APIs to exchange data, all in a way that doesn’t require polling.
I see all this as further indication that Google is trying to shift the way it calculates relevance to be based more on site content than on linking. Google wants to deliver the best possible results to searchers by placing the sites with the most relevant content at the top. For now, the best signal it has is still counting incoming links as votes for a site. Right now, relevant site content + quality inbound links = top rankings. Someday that equation may be more one-sided, with relevant site content trumping all. SEO will then be about creating great content that is relevant for particular searches, rather than creating link-bait or doing endless link building.
Some people say that this will not change the way Google calculates PageRank and is relevant only to searches that require real-time results. That may be true for now, but PuSH is definitely one step toward Google judging the quality of a website’s content by something other than inbound links.
1. Write Good Page Titles and Headings
Make sure that you use keywords that people are searching for in your page’s heading tags (h1 through h6). Also, be sure the h1 tag on each page accurately describes the page’s content. Good headlines are good for your users and good for the search engines. Here’s a link to a previous post that goes into much more detail.
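As a quick illustration, a page targeting a query like “custom steel widgets” might be structured with one descriptive h1 and supporting h2 subheads (the business and keywords here are invented):

```html
<!-- Hypothetical page targeting the search "custom steel widgets" -->
<h1>Custom Steel Widgets Built to Order</h1>
<p>Intro copy that expands on the headline…</p>

<h2>Why Choose Steel Widgets?</h2>
<p>Supporting detail under a descriptive subhead…</p>
```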
2. Utilize Title Tags to Their Utmost
You want your title tag to contain keywords that are important to your SEO efforts. Make sure those keywords show up near the front of your title, as Google only displays roughly the first sixty to seventy characters. Don’t just list a bunch of keywords; take a targeted approach that makes sense to your user. Include a clear call to action! Learn why a call to action is important.
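For example, a hypothetical title that puts the keywords first and ends with a call to action:

```html
<!-- Keywords up front, call to action at the end (wording is illustrative) -->
<title>Custom Steel Widgets | Request a Free Quote Today</title>
```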
3. Use Natural Language in Your Content
Don’t try too hard to stuff keywords in your page just to make it more “keyword dense.” The search engines are getting better and better at identifying natural language. They can spot your keyword fluffed content much easier than they could in the past. What I mean to say is, don’t use the same keyword phrase over and over and over and over and over and over and over and–you get the idea. Try including semantically related words in your writing.
4. Create a Smart Internal Linking Structure
Use keywords and user friendly descriptions in links to help your users (and the spiders) navigate from page to page on your site. Properly position navigation to make it simple for users to know where they are and where they can go.
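For instance, descriptive anchor text beats a generic “click here.” A hypothetical navigation block (URLs and labels are invented):

```html
<!-- Keyword-rich, user-friendly link text helps visitors and spiders alike -->
<ul id="main-nav">
  <li><a href="/services/seo/">Search Engine Optimization Services</a></li>
  <li><a href="/portfolio/">Our Web Design Portfolio</a></li>
  <li><a href="/contact/">Contact Industry Forge</a></li>
</ul>
```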
5. Include a Sitemap File for The Search Engines
We at Industry Forge create sitemaps at http://www.domain-name.com/sitemap.xml. This is the standard way to do it, and we’ve been pretty happy doing this. You may choose something else, but keep it simple. Also consider creating an HTML version of your sitemap for your users. You might put it at http://www.domain-name.com/site_map/ and include the same content you have in the XML version. Then go submit your XML sitemap to Google through Webmaster Tools.
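A minimal sitemap.xml following the sitemaps.org protocol looks like this (the URLs and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.domain-name.com/</loc>
    <lastmod>2009-12-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.domain-name.com/about/</loc>
  </url>
</urlset>
```

Only `<loc>` is required per URL; `<lastmod>`, `<changefreq>`, and `<priority>` are optional hints.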
6. Protect Your Site From 404 Errors!
You can use your new Google Webmaster Tools account to track the 404 errors Googlebot finds while spidering your site. It’s essential that links coming into your site, and your site’s own internal links, never produce 404 errors for your visitors. A redirect strategy for retired pages avoids the problem and makes for happier visitors. We’ll post a full article on a great way to handle this with PHP.
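Until then, here’s a minimal sketch of the idea in PHP; the paths in the redirect map are hypothetical:

```php
<?php
// Map retired URLs to their replacements (paths are hypothetical).
$redirects = array(
    '/old-page.html'     => '/new-page/',
    '/2008/summer-sale/' => '/specials/',
);

$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (isset($redirects[$path])) {
    // A permanent (301) redirect tells browsers and spiders alike
    // that the page has moved, so no one ever sees a 404.
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $redirects[$path]);
    exit;
}
```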
7. Use Static (Pretty) URLs
This is a great usability feature for your site. A “pretty” URL is short, readable, and keyword-rich, e.g. http://www.domain-name.com/products/blue-widgets/ rather than http://www.domain-name.com/index.php?page=products&item=17 (both URLs are just illustrations).
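Under Apache, pretty URLs are commonly produced with a mod_rewrite rule that maps the readable path onto the underlying script. A minimal, hypothetical sketch (file names and parameters are assumptions):

```apache
# .htaccess: serve pretty URLs from a query-string-driven script
RewriteEngine On
# e.g. /products/blue-widgets/ -> index.php?page=products&item=blue-widgets
RewriteRule ^products/([a-z0-9-]+)/?$ index.php?page=products&item=$1 [L,QSA]
```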
8. Design For Accessibility
With the Web 2.0 craze came a huge push by web developers to use XHTML and CSS for page markup and to follow the W3C’s strict compliance guidelines. Thank goodness! Now if we could just get Microsoft to jump on board, we could quit worrying about building websites to be “cross-browser/cross-platform” and just focus on content. Pages built to standard are often much lighter (in data size) and therefore load more quickly. Google gives a bit of preference to a fast-loading site, plus you’ve only got your user’s attention for so long. Make your site accessible and your users will see what you expect them to, and the spiders will index your pages more easily.
9. Use “Spider Visible” Content
Consider using text instead of graphics for your navigation, page titles, and other page elements. Google cannot process the content of some rich media files or dynamic pages, and some search engines still have a hard time with Flash. A few Flash elements on your site can come in handy; however, Flash-only sites are often difficult to maintain and hard for some search engines to index.
10. Don’t Try to Trick The Spiders
We figured we would include at least one don’t in our list of dos. Google warns against shadow domains, doorway pages, spyware and scumware. Don’t use them! They won’t do anything for the long term success of your business. They don’t work and they will get you banned from the search engines (at least from the ones that matter).
Thanks for reading! For your quick reference, here is a list of external resources used (and linked to) in this article.
Keywords and Copywriting
Site Maintenance & Monitoring
Accessibility & User Agent Detection
Are you easily distracted by technology? Do you troll the Interwebs looking for the latest whatsit or whatchamacallit to try? Have you promised yourself just 5 more minutes on the computer and later find an hour has passed while you research some esoteric feature that might or might not have anything to offer you, but you want to know about it anyway? If so, I probably don’t have anything to offer you, but I do want to be your friend — tell me everything you know.
For the rest of you, I offer you this: my latest find, pubsubhubbub.
Syndication feeds (RSS/Atom) have their limitations as a technology. They are fantastic for aggregating large amounts of content across the Internet into an easily accessible format; however, RSS and Atom remain tethered to the concept of polling. Requiring the subscriber to continually ask for updates to your content, instead of receiving that content as it is published, is inefficient design. For instance, looking through server logs, it is common to see Google Reader accessing RSS feeds several times an hour, even if the blog only gets updated on average once a week.
This is wasteful for Google (and any other RSS aggregator) and wasteful for you, the publisher, as your server spends precious CPU cycles that could be better used serving timely content. Alternately, some feeds get so little traffic that Google Reader might not poll them regularly, and several days might pass before your readers are alerted to new content.
The answer to efficient syndication lies in webhooks, or more specifically PubSubHubbub, henceforth called PSHB. PSHB is an effort by some Google employees to define a protocol where syndication is event driven rather than polled. When a publisher creates or updates content, it notifies a “hub” with a POST request. The hub manages a list of “subscribers” (PSHB-speaking clients) for each feed. When the hub learns of new content, it tells all subscribers to pull the latest version of the feed.
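The publisher’s side of that handshake is tiny. Here is a minimal sketch in PHP of a PSHB 0.3 “publish” ping; the feed URL is hypothetical, and the hub shown is Google’s public reference hub:

```php
<?php
// Notify a PSHB hub that a feed has new content (a "publish" ping).
$hub  = 'http://pubsubhubbub.appspot.com/';
$feed = 'http://www.example.com/feed/';   // your feed's topic URL (hypothetical)

$body = http_build_query(array(
    'hub.mode' => 'publish',
    'hub.url'  => $feed,
));

$context = stream_context_create(array('http' => array(
    'method'  => 'POST',
    'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
    'content' => $body,
)));

// Per the spec, a 204 No Content response means the hub accepted the
// ping; it will then fetch the feed and push it to subscribers.
file_get_contents($hub, false, $context);
```

This is what plugins like wp-pubsubhubbub do for you automatically whenever you publish a post.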
After installing the wp-pubsubhubbub WordPress plugin (super easy to install: just activate and it’s done), I’ve seen marked improvements in the timeliness of my content appearing in Google Reader. Granted, the sample size here is currently one, so I would love to hear in the comments from anyone else who has implemented this and what their experience has been.
The nice folks here at industryforge have invited me to share some of my thoughts and feelings on random nuggets of technology. We (meaning probably just me) need to come up with a good title for these thoughts, please leave your suggestions in the comments. I think that I’ll leave several suggestions there myself to avoid any potential embarrassment from lack of comments.
Here is a brief synopsis of this article’s advice on creating user-friendly and search-engine-friendly (SEO-friendly) headlines:
- Make it simple for users by forming clear and descriptive headlines.
- Don’t try to be too tricky or clever (unless your goal is to entertain).
- Break up the page with subheads for easy ingestion and scannability.
- Make it brief and to-the-point.
- Include keywords in a natural way.
Page titles and headlines are a very important part of your page both for your visitors and for search engine spiders. If you are able to create headlines using natural language with keywords that your visitors have used in their search query, you are more likely to capture the attention of those visitors and have a higher conversion rate (i.e. those visitors are more likely to turn into real customers). Here are a couple of tips for creating good headlines. Remember, these are just guidelines; some site designs and layouts may work better with other approaches, but this will at least give you some thoughtful insight as you consider writing your next headline.
Focus on the Headline
First, the headline is the very first thing a user looks at when they land on your page. Don’t let your designer tell you that some other page element needs the focus; 90% of the time it’s the headline that should rightly receive the attention. Users glance at a page headline to answer the question, “does this page have what I’m looking for?” If they can quickly determine that the page is the right one for them, they will continue to scan it.
Make Your Page Scannable With Subheads
Notice how I italicized the word “scan” above. This is very important: your visitors are not going to read your site like they would a newspaper until they absolutely have to. They will pick the page apart, quickly ingesting its parts in an F-shaped pattern, starting at the top left and moving on down. Break up paragraphs with subheads (h2 or other HTML heading tag) to let your users know what to expect. If they just see a huge block of text, they will almost undoubtedly skip it. Hint: if you want users to read, consider smaller text—it encourages users to read instead of scan. But use small text sparingly, only in instances where you’d like the user to read content like they would a book or newspaper.
Be Concise and Clear
The fact that visitors are going to quickly scan a page makes it paramount that you create a headline that is concise, to-the-point, and does little more than quickly tell the user what the page is about. Now, here’s where the search-friendly stuff comes in. The great thing about search strings is that they are typically very concise. In fact, most search queries are not complete sentences. The “search for keywords” behavioral paradigm is well established among internet users.
What you want to do is include keywords in your headline that users have likely used in their search. If your headline has natural language that happens to also have the keywords that the user searched for, they will likely identify your page quickly as a resource that they want to investigate. As you can see in the headline of this article, I’ve incorporated the keywords “SEO Friendly Headlines” into a natural language headline that the reader can quickly identify with and immediately know what to expect.
Don’t try to be too clever in your headlines unless your goal is to entertain. Visitors looking for entertainment want to see something that makes them laugh, or think, or be otherwise entertained. Visitors looking for information want to know quickly what it’s all about without the fluff.
Short story: keep your user in mind; make it simple for them by being clear and descriptive. Don’t try to be too tricky or clever (unless your goal is to entertain—in which case you have a lot of leeway). Break up the page with subheads for easy ingestion and scannability. Make it brief and to-the-point. Include keywords in a natural way so communication is clear and un-muddled.
I use a really simple MVC framework that I’ve developed over the years to create many of the websites I’ve built. I recommend that my clients have some portion of their site that is updated frequently with new relevant text. Sometimes I use WordPress to handle the blog portion of the site. WordPress provides a nice user interface for creating new content, managing comments, and is relatively simple to make search friendly. In most cases, I like to include portions of WordPress generated content outside of the blog itself. We’ve even done this on our own Industry Forge website.
I thought there might be a simple way to include the WordPress functions I needed outside of WordPress. I went to Google and searched for “show wordpress posts outside of wordpress.” I found a few articles on how to include the wp-blog-header.php file from WordPress to give your php script access to all the WordPress functions. They all provided a bad solution:
Wrong Way to Include WordPress Functions In Your Site
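The original snippet isn’t reproduced here, but the approach those articles describe boils down to a bare include along these lines (the path is hypothetical):

```php
<?php
// Pull in all of WordPress just to get at its template functions.
// Problem: WordPress also runs its own query logic, and when the
// current URL isn't in its database, it sends a 404 status header
// even though your own framework goes on to serve a real page.
require('/path/to/wordpress/wp-blog-header.php');
```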
This seemed to be a pretty simple way to get what I wanted quickly, and I was happy with it for a few days, until 404 errors began to show up in my Google Webmaster Tools account for this domain. I was perplexed: Google was recording 404 errors for pages that actually existed. Serving a 404 status on a live page is terrible for search engines, even though users won’t notice unless they closely examine the headers their browser receives. I realized this method was the wrong choice for me and would cost me big-time in my own SEO efforts.
It turns out that because I was loading the wp-blog-header.php file before loading the rest of my own framework, and because WordPress could not find the requested page in its database, WordPress sent a 404 header. I verified this with the Live HTTP Headers add-on for Firefox. I never intended for WordPress to actually deliver the page (I only wanted a few recent blog posts), but how was WordPress to know that? It’s built to be a full CMS (I know, I should rethink using it this way). But wait! There is a simple fix.
The Simple Fix
I spent a bit of time this morning poking through the massive WordPress codebase looking for all the lines that could produce a 404 error. Of course, after just a few minutes I thought, “what am I doing? Certainly some astute web developer has had this issue before and has likely posted a solution on his or her blog.” I was right. Thanks to a writing style much more succinct than my own, I was able to read the post by the nice folks at Adrogen (a Denver website design company) and quickly implement the solution they offer. They suggest a few lines of code that include WordPress cleanly instead of just pulling in the entire framework:
Correct Search Engine Friendly Way to Include WordPress Functions In Your Site
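I won’t reproduce Adrogen’s exact snippet, but the fix amounts to something along these lines: load WordPress with themes disabled, then reset the status header it may have sent (the include path is hypothetical):

```php
<?php
// Load WordPress for its functions only, without letting it
// render a theme or control the HTTP response.
define('WP_USE_THEMES', false);
require('/path/to/wordpress/wp-blog-header.php');

// WordPress may have queued a "404 Not Found" status for this URL;
// reset it to 200, since our own framework is serving a real page.
status_header(200);
```

With the 200 status restored, the recent-posts functions still work, and Googlebot sees a healthy page instead of a phantom 404.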
Thanks, Adrogen. You saved me a lot of headache and a lot of illegitimate 404 errors being sent to Google.