Understanding Website Logs

Web logs describe the traffic patterns and visitor levels of your site.  As you can imagine, this information is very valuable for understanding what is working on a website and what is not.  Web logs are often misunderstood, or worse, ignored, because site owners are unfamiliar with them.  Understanding web log terms and how to analyze traffic numbers will help site owners make sound decisions about their website.

Glossary of Terms

Hit – A request for a single file from the web server.  Because a web page is usually made up of multiple files (one hit for the page itself, plus one for each image, script and style sheet it uses), hits have very little meaning except to administrators gauging server load.

Page View or Page – A request for a web page or document.  Unlike hits, this is an accurate indicator of the number of pages viewed on a website.

Visitor – When indicated as unique, a distinct visitor to the website during the specified time frame.  Unique visitors are usually tracked by day and by month.  Over longer periods those counts are added together, so the same person can be counted more than once and the total is no longer truly unique.

Number of Visits – The total number of visits in a time period.  A site may have 10 unique visitors in a month but 300 visits, meaning the 10 visitors averaged 30 visits each that month.

Bandwidth – The amount of data transferred by the website.  This is used for capacity planning and for understanding load issues.

Spider, Robot or Crawler – An automated program that crawls the web, following links, in order to index content for search engines or other aggregators.

Entry Page – The page, by best estimate, that a visitor entered the site on.

Exit Page – The page, by best estimate, that a visitor left the site on.

Landing Page – The page a visitor first arrived on when directed from a search engine.

Sitemap – An XML formatted file provided for search engines describing the pages available on a website.

Referrer or Referring Site – The site that referred a visitor to the site via a link.

Hot link – A link to a resource on a website (usually an image) without requesting the full page.  This usually indicates someone has used an image from your site on their own site, effectively stealing your image and your bandwidth.
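
The terms above can be made concrete with a short script.  What follows is a minimal Python sketch, not a replacement for a real log analyzer.  It assumes the server writes the widely used "combined" log format, treats any request that does not end in a typical asset extension as a page view, identifies a visitor by IP address alone, and starts a new visit after 30 minutes of inactivity.  The function name, the extension list, the timeout and the sample lines are all assumptions chosen for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: the "combined" log format. Other layouts need another pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: requests ending in these extensions are assets, not pages.
ASSET_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")

# Assumption: a gap longer than this between page views starts a new visit.
VISIT_TIMEOUT = timedelta(minutes=30)


def summarize(lines):
    """Count hits, page views, unique visitors, visits and bytes sent.

    Lines are expected in chronological order, as a server writes them.
    """
    hits = page_views = visits = bandwidth = 0
    last_seen = {}  # visitor IP -> time of that visitor's latest page view
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        hits += 1  # every logged request is one hit
        if match["size"].isdigit():  # size is "-" when no body was sent
            bandwidth += int(match["size"])
        path = match["path"].split("?", 1)[0].lower()
        if path.endswith(ASSET_EXTENSIONS):
            continue  # images, scripts and styles are hits, not page views
        page_views += 1
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        previous = last_seen.get(match["ip"])
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[match["ip"]] = when
    return {
        "hits": hits,
        "page_views": page_views,
        "unique_visitors": len(last_seen),
        "visits": visits,
        "bandwidth": bandwidth,
    }


# Made-up sample lines using documentation-only IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512 "http://example.com/index.html" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:14:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://www.google.com/search?q=web+logs" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:15:10:00 +0000] "GET /about.html HTTP/1.1" 200 1000 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```

In the sample, the style sheet request counts as a hit but not as a page view, and the first visitor's return more than 30 minutes later counts as a second visit.  Identifying visitors by IP address is a simplification: several people behind one shared connection would be counted as one visitor.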

Analyzing Data

The base numbers that are most telling about a website’s traffic are unique visitors, visits and page views.  These are normally analyzed on a monthly basis.  Growth in these three areas indicates growth in the number of people browsing a site.  Hits and bandwidth should be ignored for everything except load planning.  Visits and page views can be broken down in a number of ways: by day of the month, day of the week, hour of the day or geographic location of the visitor.  It is also possible to see numbers on the operating systems and browsers of a site’s visitors.  Looking at page views per page is also useful, as it shows the most and least popular pages.
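
These breakdowns amount to counting the same page views under different keys.  As a rough Python sketch, assuming the page views have already been pulled out of the log as (timestamp, path) pairs; the function name and the sample data are made up for illustration:

```python
from collections import Counter
from datetime import datetime

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def break_down(page_views):
    """Tally page views by hour of day, day of week and page.

    page_views is an iterable of (datetime, path) pairs (assumed input shape).
    """
    by_hour = Counter()
    by_weekday = Counter()
    by_page = Counter()
    for when, path in page_views:
        by_hour[when.hour] += 1
        by_weekday[WEEKDAY_NAMES[when.weekday()]] += 1
        by_page[path] += 1
    return by_hour, by_weekday, by_page


# Made-up sample data.
SAMPLE_VIEWS = [
    (datetime(2023, 10, 9, 9, 15), "/index.html"),
    (datetime(2023, 10, 9, 9, 40), "/about.html"),
    (datetime(2023, 10, 10, 14, 5), "/index.html"),
]

if __name__ == "__main__":
    by_hour, by_weekday, by_page = break_down(SAMPLE_VIEWS)
    print("Busiest hours:", by_hour.most_common(3))
    print("Busiest days:", by_weekday.most_common(3))
    print("Most popular pages:", by_page.most_common(3))
```

Sorting each tally in the opposite order gives the least popular pages or the quietest hours.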

Beyond those base statistics there are other numbers that can provide insight into visitor traffic patterns.  Valuable to any site owner is how their site is being found.  Statistics can be provided on the amount of traffic that came from a direct address or bookmark, a search engine referral or a referral from another website.  Search engine traffic can additionally be broken down by search engine, and the search terms used to arrive on the site can be aggregated.  Further, search engine landing pages are a valuable resource for determining which pages are driving traffic from search engines.
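
Sorting traffic into these sources comes down to inspecting the referrer recorded with each request.  The Python sketch below is one simple way to do it, under stated assumptions: the small table of search engines and their query parameters is illustrative rather than complete, the function name is made up, and an empty or "-" referrer is taken to mean a direct address or bookmark.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative, incomplete table: host label -> query parameter holding the terms.
SEARCH_QUERY_PARAMS = {"google": "q", "bing": "q", "yahoo": "p"}


def classify_referrer(referrer, own_host):
    """Return (source, engine, terms) for one referrer string.

    source is "direct", "internal", "search" or "referral".
    engine and terms are None unless source is "search"; terms may
    still be None when the referrer carries no search query.
    """
    if not referrer or referrer == "-":
        return ("direct", None, None)
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    if host == own_host:
        return ("internal", None, None)  # a link from the site's own pages
    for label in host.split("."):
        if label in SEARCH_QUERY_PARAMS:
            param = SEARCH_QUERY_PARAMS[label]
            terms = parse_qs(parsed.query).get(param, [None])[0]
            return ("search", label, terms)
    return ("referral", None, None)


if __name__ == "__main__":
    for ref in ["-",
                "https://www.google.com/search?q=web+logs",
                "https://blog.example.org/post",
                "http://example.com/index.html"]:
        print(ref, "->", classify_referrer(ref, "example.com"))
```

Many search engines now leave the search terms out of the referrer, so the terms value will often be empty even when the source is correctly identified as a search engine.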

Web logs can provide plenty of numbers to analyze and display trends in traffic.  In addition to what I’ve mentioned, there are other statistics that can be extracted from web logs.  Web logs are a good starting place for extracting this data, but more advanced users may require additional information or more real-time data than what web log systems can provide.

Other Options

Complementing web logs are search engine tools.  Google, Yahoo and Microsoft all run webmaster tool sites that offer information on a site from the search engine’s perspective.  Google provides very valuable information on search queries, showing where a site appeared in the search results for specific search terms even when the searcher did not visit the site.

To satisfy the extra needs of some site owners, other packages are available to provide more advanced and real-time statistics.  They work by using a script to record information about a visit instead of analyzing web logs.  This allows them to provide information that web logs cannot, such as screen resolution.

One of the leaders in this area is Mint.  Mint is a very affordable, addictive, nicely designed statistics package.  The most impressive aspect of Mint is its extensibility through plugins called Peppers.  There is a wealth of Peppers already available for most people’s needs, and someone with programming experience can create their own.

Google Analytics is an alternative to Mint.  Google’s solution transmits your visitor information to Google’s servers, which is a downside, but it is free.  This old comparison of Google Analytics and Mint describes some of the differences.

Finally, Crazyegg is a paid service with very advanced functionality that may benefit high-traffic e-commerce sites.  This article describes some of the features of Crazyegg.

Further Reading

Understanding web traffic statistics can be daunting for business people whose focus is not technology and the Internet.  There is plenty of information available on the web to explain web analytics.  Review these articles or search the web for more information.

Web Analytics Basics: Learn to Measure Your Web Site

A Quick Guide to Reading Webstats

Know Your Site

Measure Online Advertising with Google Analytics

How to measure the success of your web app