Search Engine Optimization

Authored by Michael A. Peters on October 5th, 2024.
Last modified on November 17th, 2024.

If you want your web site to attract viewers, it is critical that you understand Search Engine Optimization (SEO). If you pay no attention to it, your site will not rank as well as it could in Internet searches.

Before you get too worried about it, I would like to warn you about services that claim to boost your SEO ranking. Many of them use shady techniques that bring short-term gains but ultimately reduce your ranking, or may even get your site banned from search engines.

Some of them are legitimate, but you probably are better off learning good SEO practices yourself and spending your marketing budget on actual marketing.

If you do not understand the basics of SEO, then you do not know enough to judge whether the optimization company you are working with is on the level. If you do understand the basics, then there may not be much a company can do for you that you cannot do for yourself.

The concepts I am presenting here are largely based upon advice from the Google publication Search Engine Optimization Starter Guide (PDF). I highly suggest you read it.

Remember The User

Your web site does not exist in order to get a high search engine ranking. A high ranking is worthless if your visitors do not come back. Your web site should have a clean design. It should be easy to navigate. It should have fresh, well-written content. When your users have a favorable experience with your web site, they will visit more often. Remember that SEO is a means to an end, not the end itself. The end you want is visitors who enjoy your web site and come back.

Generally you should avoid auto-starting multimedia. You should also avoid working with advertisers that auto-start multimedia. You should generally avoid animated graphics because they distract from your content. Please, please, PLEASE avoid advertisements that block your content and force the viewer to take action to remove them. These advertisements are often called Hover Ads and seem to presently be on the increase. The vast majority of the time, the only thing they accomplish is annoying the user. The user often does not even look at the ad content, and sometimes users get so frustrated that they leave your site for good.

You should attempt to have clean valid HTML content with layout kept in a separate style sheet. Test your content in as many browsers as you have access to. Valid markup increases the odds of your web page properly rendering everywhere, but it does not guarantee it.

Avoid telling users they should be using a specific browser. Users usually like the browser they are using and will just find a different web site if your web site does not work with their preferred browser.

If at all possible, avoid content that requires your user have a third party plugin installed. Quite frequently the user will just leave and go somewhere else.

Navigation

Your web site should be easy to navigate both on the site level and on the page level. You should have a site map that users can easily find and refer to that lists at least the primary pages of interest on your web site.

Every page should have a navigation bar listing closely related pages. If you run a large site with a lot of content, users can get lost. In that case you should also provide what is known as breadcrumb navigation.
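
To make the idea of breadcrumb navigation concrete, here is a minimal Python sketch that derives a breadcrumb trail from a page URL. The function name, the labeling rule, and the example.com URL are my own assumptions for illustration; a real content management system would more likely take the labels from its page titles.

```python
from urllib.parse import urlsplit

def breadcrumbs(url, home_label="Home"):
    """Return (label, path) pairs from the site root down to the page."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    trail = [(home_label, "/")]
    path = ""
    for part in parts:
        path += "/" + part
        # Hypothetical labeling rule: drop the extension, turn dashes into spaces.
        label = part.rsplit(".", 1)[0].replace("-", " ").title()
        trail.append((label, path))
    return trail

trail = breadcrumbs("http://example.com/guides/seo/title-tag.html")
print(" > ".join(label for label, _ in trail))
```

Each pair gives the text of one link in the trail and the path it should point to, so a user who is deep in the site can always step back up a level.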

Lengthy pages (such as this one) should have additional navigation near the top of the page using clearly labeled anchors. If a user is interrupted and has to come back to the page, it should not be difficult for the user to find where they were and resume reading.

404 Not Found

When a user requests a resource on a web server that does not exist, it already is very frustrating for that user. It is especially frustrating when the default 404 error page is used to display the message.

Sometimes a 404 error is reached because the user clicked on a link that was mis-typed or the user typed in a bad URL manually. In these cases, it is proper to send the user to an error page. However it should not be the default server error page. You should have your web server configured to pass the 404 error handling to your content management system so that the error page will have the same look and feel as your web site and perhaps even be able to suggest pages that closely match the requested page and may be what the user was actually looking for.

Another cause of 404 errors is that you rearranged your web site. The page still exists but is now at a different location. In those cases, sending the user a 404 error is completely unacceptable. What should happen is that you send the user a 301 Moved Permanently redirect to the new location of the file.

Sending a redirect header will cause the browser to load the correct location and the user gets the content they were looking for.

A good content management system will make this automatic.

Finally, it is possible you have removed the requested content. In those cases, you should send an HTTP 410 Gone response with a page indicating that the resource is no longer available.
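
The three cases described in this section can be sketched as a small decision function. This is only an illustration in Python: the MOVED, REMOVED, and LIVE tables and the paths in them are hypothetical stand-ins for whatever records your content management system keeps.

```python
# Hypothetical records a content management system might keep.
MOVED = {"/old-page.html": "/articles/new-page.html"}
REMOVED = {"/retired.html"}
LIVE = {"/", "/articles/new-page.html"}

def respond(path):
    """Return (status_code, headers) for a requested path."""
    if path in LIVE:
        return 200, {}
    if path in MOVED:
        # 301 Moved Permanently: the browser follows the Location header.
        return 301, {"Location": MOVED[path]}
    if path in REMOVED:
        # 410 Gone: the content was removed on purpose.
        return 410, {}
    # Anything else is genuinely unknown: 404 Not Found.
    return 404, {}

for p in ("/old-page.html", "/retired.html", "/no-such-page.html"):
    print(p, respond(p))
```

The point of the ordering is that a 404 is the last resort, used only after the moved and removed records have been checked.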

CAPTCHA

CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”.

The problem it is trying to solve is that of malicious bots that spam message boards, feedback forms, etc.

Unfortunately they are often poorly implemented. They are frequently an obstacle that frustrates legitimate users and drives them away. I personally have left multiple web sites because I could not get the fucking CAPTCHA to work.

When technology prevents legitimate users from accessing your resource, the technology is broken and should not be used.

Do your users and thus yourself a favor and be very careful about what type of spam defenses you employ. You may (or may not) see a huge reduction of spam, but what you are not seeing is the users who got fed up and simply left, never to return again.

Describe Your Site

You may know what your web site is all about, but can a search engine figure it out? If a search engine has trouble figuring it out, it cannot offer your site to people looking for exactly what you have to offer.

The Title Tag

Every valid HTML page has a title tag. Make sure yours is descriptive of the content contained in the web page. The title not only helps with indexing rank, but users see the title in search engine results. A good descriptive title is far more likely to result in a click through to your web site.

Avoid keyword stuffing in the title tag. The title should be concise and pithy and descriptive of the content within.

Bad Examples:

Acme Home Page
My Blog
Photographs

Good Examples:

Acme Corp: Explosives and Anvils
Life Lessons of a Single Mother
Photographic Tour of the Pacific Crest Trail

Description Meta Tag

Virtually every web page on your site needs to have a description meta tag. This is a tag that lives within the head portion of an HTML document and literally gives a text description of the document's contents.

This description is not displayed by the browser but it is quite possibly the most important part of the web page as far as search engine indexing is concerned.

When present, search engines will use the description meta tag to determine what kind of content the page contains. It should be a brief overview of the web page purpose and content. Do not stuff it with keywords. Even though it is not displayed by the web browser when viewing the web page, it often is displayed by search engines when listing the page in search results or by content scrapers.
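
If you want to check your own pages, the title and the description meta tag can be pulled out with Python's standard html.parser module. The sketch below is a minimal example, not a full validator; the HeadInfo class, the head_info function, and the description text in the sample page are made up for illustration.

```python
from html.parser import HTMLParser

class HeadInfo(HTMLParser):
    """Collect the <title> text and the description meta tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def head_info(html_text):
    """Return (title, description); description is None when the tag is missing."""
    parser = HeadInfo()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.description

page = """<html><head>
<title>Acme Corp: Explosives and Anvils</title>
<meta name="description" content="Catalog of Acme explosives and anvils.">
</head><body></body></html>"""
print(head_info(page))
```

A page that comes back with an empty title or a description of None is one that needs attention.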

When I write the description meta tag for a web page, I like to use what I call the “Facebook Share Rule”.

Facebook Share Rule

When sharing a link on Facebook, if the web page has a description meta tag, the content of the description meta tag is what Facebook scrapes to describe the page on your wall.

Write your description meta tag how you would want the page presented when being shared on Facebook and you are probably good to go.

Facebook also incorporates the page title in the share description, as well as an image from the web page. Many people are visually oriented. Pages you wish to share on Facebook should have at least one image that relates to the content, even if it is just a simple graphic.

Keywords Meta Tag

The keywords meta tag was abused (keyword stuffing) by web masters to the point that most major search engines no longer pay any attention to it. In that respect it is no longer important to use. Many small search engines however do still use it. The site specific search within DOMBlogger (still being coded), for example, allows you to optionally give weight to keywords in the keyword meta tag so that you can use keywords to assist your users in finding specific content within your site.

It does not hurt to use keywords and it does help with smaller search engines. You never really know when your site may be added to a topic specific search engine run by some enthusiast who likes your content and has enabled keyword weighting.

Crawler Friendly

The good news in Search Engine Optimization is that search engines want to index your site. Their business is providing users with content relevant to what the user is searching for, and they have participated in the development of several technologies to help you present a site for easy indexing.

XML Sitemap for Web Crawlers

You should hopefully have a site map for human viewers. While such a site map will be crawled by search engine spiders, it is not an ideal mechanism for web crawlers to obtain the information needed to index your site in the best possible fashion. Google introduced the basic specification for a site map that is friendly to crawlers; you can find more information about it at sitemaps.org.

In a nutshell, an XML sitemap file is a list of pages you want search engine spiders to crawl. You can provide additional information about the links including how important a page is relative to other pages on your site, when the content of the page last changed, and how often you expect the page to change.
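
As a rough illustration of what such a file contains, the following Python sketch builds a small sitemap with the standard xml.etree.ElementTree module. The element names (loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol; the build_sitemap function and the example.com pages are assumptions for the example only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """pages: iterable of dicts with 'loc' and optional lastmod/changefreq/priority."""
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in page:
                child = ET.SubElement(url, "{%s}%s" % (SITEMAP_NS, field))
                child.text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")

pages = [
    {"loc": "http://example.com/", "lastmod": "2024-11-17",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "http://example.com/seo.html", "lastmod": "2024-10-05",
     "priority": 0.8},
]
print(build_sitemap(pages))
```

Only loc is required for each page; the other three fields are the optional hints about change date, change frequency, and relative importance described above.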

An extension to the XML sitemap specification allows you to list images related to the content in the web page, and some basic information about those images.

While it is possible to manually create an XML sitemap file, it is better to let your content management system take care of it for you, as it is important that the file contain accurate, up-to-date information about the pages in your web site.

For an example of an XML sitemap, see the one created by DOMBlogger for this web site. Even for a small site, it is quite a bit of information to try to manually keep up to date, especially if you include image information in the sitemap file.

You can have as many XML sitemap files as you want, but if it is maintained by your CMS you probably will only have one that contains all the information about every page you want indexed.

The rel Attribute

HTML provides an attribute you can use with hyperlinks called rel. If you set the rel attribute to nofollow then some search engines will not follow the link to discover new content.

This is a little bit embarrassing, but when I first wrote this web page, I forgot to use that attribute when linking to the sitemap.xml file. As a result, Google indexed the file as content and it currently shows up in search results for the word eroticaplexus. It is now changed, but it may be some time before that file no longer appears in search results.

You also want to use that attribute when linking to external web sites that are not closely related to the content of your web page. Search engines will sometimes use links in your page to try and figure out what your content is about.

Where this can be a real problem is blog comments. Many blogs (including DOMBlogger) will allow a person writing a comment on a blog to enter their web site address, and their name will appear as a hyperlink to that address. It is imperative that those links use the rel="nofollow" attribute so that search engines understand they are not related to the content of your web site.
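
A comment system can add the attribute automatically when it renders the commenter's name. The Python sketch below shows one way this might look; it is not how DOMBlogger does it, and the comment_author_link function and its http/https check are my own assumptions.

```python
from html import escape

def comment_author_link(name, url):
    """Render a commenter's name as a link that search engines are asked not to follow."""
    safe_name = escape(name)
    # Assumption: only plain web addresses are linked; anything else is shown as text.
    if not url.lower().startswith(("http://", "https://")):
        return safe_name
    return '<a href="%s" rel="nofollow">%s</a>' % (escape(url, quote=True), safe_name)

print(comment_author_link("Wile E. Coyote", "http://example.com/?a=1&b=2"))
print(comment_author_link("Road Runner", "javascript:alert(1)"))
```

Escaping the name and the address matters just as much as the rel attribute, since both values come from an untrusted visitor.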

Robots Exclusion Protocol

The Robots Exclusion Protocol is an old but still used method for instructing web crawlers on how you wish them to behave while they are crawling your site. It is implemented by a set of instructions within a file called robots.txt located in the root directory of your web server.

Not all robots follow the instructions given, but it plays a vital part in helping robots understand how they should crawl your web site. For example, search engines may not know about your XML sitemap unless you have specified its location in the robots.txt file:

Sitemap: http://eroticaplexus.net/sitemap.xml

That is all it takes to inform web crawlers of the name and location of your XML sitemap file.
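
To see how a well behaved crawler reads these instructions, you can feed a robots.txt file to Python's standard urllib.robotparser module. In this sketch the example.com addresses and the Disallow rule are hypothetical, added only to show a second kind of directive; the site_maps() method needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# May any crawler ("*") fetch these pages?
print(parser.can_fetch("*", "http://example.com/index.html"))
print(parser.can_fetch("*", "http://example.com/private/notes.html"))

# Sitemap locations announced by the file.
print(parser.site_maps())
```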

Google Webmaster Tools

Google, which has been the leading search engine for years, has an excellent resource called Google Webmaster Tools (since renamed Google Search Console). Use it. It will assist you in optimizing your web site for best search indexing, and will inform you of problems that may impact your ranking in their search engine.

Of particular interest to me within the Google Webmaster Tools is the notification of site crawl errors. It is important to fix these when they happen, and there has been more than one occasion when this feature informed me of a problem with a web site I was working on that I did not even know about.

They also will inform you of any malware they came across while indexing your web site.

DOMBlogger and SEO

The DOMBlogger engine is designed specifically to assist content producers in producing quality SEO content.

It is not automatic, but you do not have to be a complete geek or hire outside help to figure it out.

Valid W3C Markup

With DOMBlogger it is relatively easy to produce valid W3C HTML5 markup. Valid markup helps to ensure that your content will render on as many browsers as possible both now and in the future. It also makes it easier for search engine crawlers to parse and understand the structure of your content.

Using DOMBlogger obviously does not guarantee your markup is valid, but some mistakes are corrected for by the engine when you upload content and we are always checking the output of our engine as changes and improvements are made to ensure that we continue to produce valid markup on our end.

Site Navigation

DOMBlogger provides an easy interface for grouping like pages together with a horizontal menu across the top of the page and optional document specific navigation in a side column. By the time of release, a full featured site map and optional breadcrumb will be implemented.

A site specific search engine will initially be provided by a customized version of Sphider (patches will be released in compliance with the GPL) to further assist users in finding what they are looking for. Customizations include:

Port to DOMDocument (done)
Port of configuration file to XML (security concern)
Port from MySQL to PostgreSQL with PDO
Parsing of XML sitemap for URL discovery
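
The last item in the list, URL discovery from an XML sitemap, can be sketched in a few lines of Python. This is not the Sphider patch itself, only an illustration of the idea; the sitemap_urls function and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the page addresses listed in an XML sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]

# Hypothetical sitemap content for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/seo.html</loc><lastmod>2024-11-17</lastmod></url>
</urlset>"""

for url in sitemap_urls(SAMPLE):
    print(url)
```

A site search crawler that starts from this list does not have to rely on following links to find every page.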

Handling of 404 and Related Errors

DOMBlogger keeps database records of all content pages so that it can properly send redirect headers when you move stuff around and notify the user of documents that have been removed from the server. You can even specify that you have moved a document to a different server.

At the time of release, the 404 Not Found error handling will include a list of the 5 closest matches to the requested file, increasing the odds that your visitor will find what they were looking for.
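
One simple way to find close matches is string similarity between the requested path and the paths that do exist. The Python sketch below uses the standard difflib module; it is an assumption about how such a feature could work, not a description of the DOMBlogger implementation, and the KNOWN_PATHS list is hypothetical.

```python
import difflib

# Hypothetical list of pages that exist on the site.
KNOWN_PATHS = [
    "/about.html",
    "/contact.html",
    "/blog/index.html",
    "/seo/title-tag.html",
    "/seo/description-meta-tag.html",
    "/seo/sitemap.html",
]

def suggest(requested, known=KNOWN_PATHS, limit=5):
    """Return up to `limit` existing paths that resemble the requested one, best first."""
    return difflib.get_close_matches(requested, known, n=limit, cutoff=0.5)

# A mistyped address: "tga" instead of "tag".
print(suggest("/seo/title-tga.html"))
```

The cutoff value controls how loose a match may be; set it too low and the error page fills up with unrelated suggestions.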

Document Meta Data

Meta data about all documents, images, and multimedia is kept readily available and easily editable allowing you to pay attention to the description meta tags, keywords if you choose to use them, image captions, etc.

You can also request a report of which documents do not have this data associated with them, allowing you to easily remedy the situation.

Sitemap Generation

DOMBlogger generates a valid XML sitemap and automatically keeps track of what images are used on what pages, allowing for accurate image indexing by search engines that index images. As soon as you alter content, the time stamp in the sitemap is updated and any changes to what images the page uses are reflected. Of course you can specify pages and images that you do not want to appear in the sitemap.

The robots.txt file is also automatically generated to point to the XML sitemap, and has an interface you can use to add additional directives if you so choose.
