Make Dynamic URLs Search Engine Friendly

by Peter Lavin

Originally published at DevArticles

Overview

Using a database to create web pages dynamically improves a site in many ways. In the process, dynamic URLs with query strings in the format "http://www.mysite.com/main.php?category=books&subject=biography" are often created. Such URLs are not very search engine friendly: search engines are much better at indexing static pages and do not do a good job of following hyperlinks that contain query strings. The advantages of a dynamic site are obvious, so what is to be done? Well, you can have your cake and eat it too. With a little extra effort you can create a dynamic site that is easily crawled by webbots. You will even reap a side benefit of somewhat increased security because, in the process, you will be masking how you access your database.

We will show how this can be done using an Apache web server with the module "mod_rewrite" installed. Code examples will use PHP to access a MySQL database.


Introduction

The robots used by search engines have problems with dynamic pages. You may review Google’s comments on this subject at http://www.google.com/webmasters/2.html. However, dynamic URLs can be converted into static URLs so that they can be indexed. For example, the dynamic URL "http://www.mysite.com/main.php?category=books&subject=biography" can be rewritten as "http://www.mysite.com/pagebooks-biography.htm". This article will show you how to do this and, in so doing, make your dynamic, database-driven website search engine friendly.

The steps involved are:

  • make sure your web server supports "mod_rewrite"
  • create a ".htaccess" file
  • upload this file to the correct directory on your server
  • test the results by typing the modified URL into the address box of your browser
  • modify your original code to write your links in a search engine friendly form.

Check Your Server

In the introduction we alluded to something called "mod_rewrite". This is a module that is usually compiled into Apache web servers and by all accounts it is fairly complex to configure. But this is the concern of server administrators and need not worry us here. "mod_rewrite" can read its rules from a per-directory file called ".htaccess", and creating this file is all we will need to do. ".htaccess" tells the server how to convert between dynamic and static URLs. Writing this file usually requires some knowledge of regular expressions. Like many of us, you may wish that you had greater facility with regular expressions but have never quite gotten around to learning them properly. Well, don’t worry: creating the ".htaccess" file will require absolutely no knowledge of regular expressions.

To ensure that your web host supports "mod_rewrite" you could send an e-mail to tech support or find out for yourself by creating the following text file:

<?php
phpinfo();
?>

Save it as "info.php", upload it to your server and invoke it by typing "http://www.mysite.com/info.php" into the address box of your browser.

The function "phpinfo" is very useful for determining how your server is configured and for looking at environment variables, cookies and the like, but right now we are only concerned with "mod_rewrite". Look for the heading "Apache" and then "Loaded Modules". You should find "mod_rewrite" amongst the many modules listed there. If so, you are all set to begin; if not, speak to your web host.
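If you would rather not scan the "phpinfo" output by eye, the same check can be sketched in a few lines of PHP. This is only an illustration: the helper name "has_mod_rewrite" is made up for this example, and "apache_get_modules" is only defined when PHP runs as an Apache module, so the call is guarded.

```php
<?php
// Hypothetical helper: reports whether "mod_rewrite" appears in a list of module names.
function has_mod_rewrite(array $modules)
{
    return in_array("mod_rewrite", $modules, true);
}

// apache_get_modules() is only defined when PHP runs as an Apache module,
// so guard the call; elsewhere (CGI, command line) fall back to phpinfo().
if (function_exists("apache_get_modules")) {
    echo has_mod_rewrite(apache_get_modules())
        ? "mod_rewrite is loaded\n"
        : "mod_rewrite is not loaded\n";
} else {
    echo "Cannot list Apache modules from this PHP setup; check phpinfo() instead\n";
}
```

If the script reports that it cannot list the modules, that says nothing about "mod_rewrite" itself; it only means this particular check is unavailable on your host.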

Create a ".htaccess" File

Open the source code of a page on your site and find a URL with a query string similar to: "http://www.mysite.com/main.php?category=$category&subject=$subject". If your URL is relative, rather than absolute, change it to an absolute URL and then copy it to the clipboard. You will have a much clearer understanding of what’s happening if you do it this way. Now open your browser and go to http://www.webmaster-toolkit.com/.

Take a moment to appreciate what is available here. I’m sure you’ll want to return so bookmark the site.

On the left, under the "SEO Tools" heading, find and click the link "Rewrite Rule Generator". Scroll down until your screen looks like this:

[Screenshot: the Rewrite Rule Generator form]

Now paste your dynamic URL into the appropriate textbox and decide if you want to represent your dynamic URL as a page or a directory. It’s up to you, but make sure that you don’t create a directory name that matches an existing directory and that you don’t generate a page name that is longer than 255 characters. If you choose to generate a page name, set the appropriate radio button and enter a page name into the textbox, something that best describes your page. We will be using the page name style of static URL in our examples. Click the generate button and you should have your ".htaccess" file in seconds. Copy the contents of the returned textarea and paste them into your favourite text editor. The text returned when entering the word "type" into the page name textbox and using our example URL, "http://www.mysite.com/main.php?category=$category&subject=$subject", is as follows:

Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteRule type(.*)-(.*)\.htm$ /main\.php?category=$1&subject=$2

This is all you will need for your ".htaccess" file. This file will intercept all page requests within the directory in which it is located and convert specified URLs with a static format to ones using a query string. That may sound like the opposite of what you want to achieve but read on and things will become clearer.
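To get a feel for what the rule does with an incoming request, its pattern can be tried out with PHP’s own regular expression functions. The sketch below is only a simulation, not how Apache is implemented: the function name "simulate_rewrite" is hypothetical, and it assumes the rule sees the requested name without the leading slash, as happens in a ".htaccess" file in the root directory.

```php
<?php
// Hypothetical helper: mimics the generated RewriteRule with PHP's PCRE functions.
// Returns the internal target, or null when the requested name does not match.
function simulate_rewrite($requested)
{
    // Same pattern as the RewriteRule: two captured groups split by a hyphen.
    if (!preg_match('/type(.*)-(.*)\.htm$/', $requested, $m)) {
        return null; // the request would be passed through untouched
    }
    // $m[1] and $m[2] play the role of $1 and $2 in the rule's substitution.
    return "/main.php?category={$m[1]}&subject={$m[2]}";
}

$target = simulate_rewrite("typebooks-biography.htm");
echo $target, "\n";
// prints /main.php?category=books&subject=biography

// Show the parameters that main.php would receive.
parse_str(parse_url($target, PHP_URL_QUERY), $params);
print_r($params);
```

One thing the simulation makes visible: the first group is greedy, so a name with several hyphens is split at the last one. "typesci-fi-novels.htm" yields the category "sci-fi" and the subject "novels".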

As promised, you’ve now created a ".htaccess" file without any reference to regular expressions. Don’t leave the webmaster-toolkit site just yet. Have a look at what the rewritten URL should be. Our example looks like this:

http://www.mysite.com/type\$category-\$subject.htm

Copy it into your text editor, remove the backslashes that precede each dollar sign and save it as a text file called "format.txt". You’ll need this when revising your PHP scripts.

Upload ".htaccess" to Your Server

Now that you have created the ".htaccess" file you need to upload it to your server. It must reside in the same directory as the page that invokes your dynamic URL. Likewise, the script that is invoked is also assumed to be in this same directory. In our example we use the root directory of the server.

If you are only familiar with file naming conventions under Windows, the name of this file will strike you as odd. Why start a file name with a dot? On Unix/Linux systems files named this way are hidden. For this reason you need to make sure your FTP programme is set up to view hidden files. Usually this is done using a site configuration option. With my FTP programme I need to enter "-al" into a "Remote File Mask" textbox. If you can’t configure your FTP programme, most operating systems have an FTP utility that is invoked by typing "ftp" at the command line. Once you’ve connected to your site, you can view hidden files by listing them with the "-al" option, that is, by typing "ls -al". This is an important point because you want to be able to see your file on the server, especially if you are using a graphical programme to transfer it. If you transfer a faulty file and cannot see it, you won’t be able to delete it. Be warned that a misconfigured ".htaccess" file can make your site inaccessible from a browser.

Make sure that you transfer your file in ASCII mode. When automatically transferring files, most FTP programmes determine the transfer mode by looking at file extensions. Under Windows this may mean that a file named ".htaccess" will transfer in binary mode. Configure your software so that files with this "extension" will upload as text files or, alternatively, transfer the file manually.

Test the Results

If there are any problems we want to know about them before proceeding. There is no point in making the wrong modifications to our code! Type the rewritten URL into the address bar of your browser substituting actual literal values for the PHP variables. In our example, we know that there is a category called "books" and a subject called "biography". That would make our URL:

http://www.mysite.com/typebooks-biography.htm

If the right page is returned then you’ve done everything correctly.

If not, here are a couple of suggestions. Make sure you have uploaded ".htaccess" into the right directory. Still have problems? Try the original, "unrewritten" URL in the address bar. If that doesn’t work either, then the format of your URL is incorrect. Go back, recopy it from your code and re-enter it into the rewrite rule generator.

Modify Your Code

Now that you’ve tested your URL and it works, modifying your code is all that remains to be done. Every instance of a dynamically created URL must be revised. Just to clarify, in our example that would be all URLs that invoke the page "main.php" using a query string with two parameters, the first named "category" and the second named "subject". Any other dynamic URLs that do not match this pattern will have to have their own separate rewrite rule. But let’s keep it simple and look at a code example, again referring to our sample URL.

Original Code

Assume that our database has been opened and the result set of a query has been returned into the variable "$rs". Iterating through this result set using a "while loop" creates the dynamic URLs. This is done with the following code:

<?php
// Build one dynamic link per row of the result set $rs
// (the mysql_* functions were removed in PHP 7; mysqli_fetch_array() is the mysqli counterpart)
while ($row = @ mysql_fetch_array($rs)) {
    // Encode each value before placing it in the URL
    $category = $row["category"];
    $category = urlencode(htmlentities($category, ENT_QUOTES));
    $subject = $row["subject"];
    $subject = urlencode(htmlentities($subject, ENT_QUOTES));
    /* format for following HTML result
    http://www.mysite.com/main.php?category=books&subject=biography
    */
    echo "<a href=\"http://www.mysite.com/main.php?category=$category&subject=$subject\">";
    echo "$row[description]</a><br>\n";
}
?>

The fields "category" and "subject" are self-explanatory. "description" is simply the text that will appear as the clickable hyperlink. You can see in this example how a query string with two parameters has been created. The first parameter is separated from the page itself by a "?" and the second is separated from the first by an "&". The line that echoes the opening anchor tag is the one that will need modification.
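As an aside, the query string does not have to be concatenated by hand. A minimal sketch, using PHP’s built-in "http_build_query" and literal values in place of the database row, shows the same "?" and "&" structure being produced. The variable names here are illustrative only, and this is not the approach the scripts in this article take.

```php
<?php
// Illustrative values; in the article these come from the database row.
$params = array("category" => "books", "subject" => "biography");

// http_build_query() URL-encodes each value and joins the pairs with the given separator.
$url = "http://www.mysite.com/main.php?" . http_build_query($params, "", "&");
echo $url, "\n";
// prints http://www.mysite.com/main.php?category=books&subject=biography
```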

Revised Code

<?php
// Build one search engine friendly link per row of the result set $rs
while ($row = @ mysql_fetch_array($rs)) {
    // Encode each value before placing it in the URL
    $category = $row["category"];
    $category = urlencode(htmlentities($category, ENT_QUOTES));
    $subject = $row["subject"];
    $subject = urlencode(htmlentities($subject, ENT_QUOTES));
    /* format for the URL rewrite is as follows
    http://www.mysite.com/type$category-$subject.htm
    */
    echo "<a href=\"http://www.mysite.com/type$category-$subject.htm\">";
    echo "$row[description]</a><br>\n";
}
?>

Notice that a comment showing the format we are aiming for has been inserted into the code. This is the text that was saved in the file "format.txt". It will serve as a handy reference so that mistakes are avoided. You can see that all that has been changed is the one line that actually creates the query string.
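If the same link format is needed in several scripts, one option is to keep it in a single small function. The sketch below is an illustration rather than part of the original scripts: the function name "friendly_url" and its "$base" parameter are hypothetical, and the encoding simply mirrors the calls used in the loop above.

```php
<?php
// Hypothetical helper: builds the static-looking URL in the format saved in format.txt.
function friendly_url($category, $subject, $base = "http://www.mysite.com/")
{
    // Same encoding steps as in the loop above.
    $category = urlencode(htmlentities($category, ENT_QUOTES));
    $subject  = urlencode(htmlentities($subject, ENT_QUOTES));
    return $base . "type" . $category . "-" . $subject . ".htm";
}

echo friendly_url("books", "biography"), "\n";
// prints http://www.mysite.com/typebooks-biography.htm
```

Inside the loop, the anchor line could then echo the result of friendly_url($row["category"], $row["subject"]) instead of spelling out the format each time.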

Conclusion

With minimal effort, a website can have all the advantages of being created dynamically from a database without compromising its ability to be indexed by search engines. This is achieved by using "mod_rewrite", a ".htaccess" file and by making minor adjustments to the original scripts. The robots used by search engines have no trouble following links to the resulting "static" HTML pages. No knowledge of configuring "mod_rewrite" was required, nor any knowledge of regular expressions.

Resources

Google on dynamic pages - http://www.google.com/webmasters/2.html

Generate a ".htaccess" file - http://www.webmaster-toolkit.com/

About the Author

Peter Lavin runs a Web Design/Development firm in Toronto, Canada. He has been published in a number of magazines and online sites, including UnixReview.com, php|architect and International PHP Magazine. He is a contributor to the recently published O'Reilly book, PHP Hacks and is also the author of Object Oriented PHP, published by No Starch Press.

Please do not reproduce this article in whole or part, in any form, without obtaining written permission.
