An Introduction to CSS and Structured Markup, Part I
by Sandra Clark
Introduction
Those who know me are probably aware that besides programming in ColdFusion and Fusebox, I also tend to voice my opinion on two other subjects: Accessibility and Cascading Style Sheets (CSS). This is the first in a series of articles designed to introduce CSS to you. The series will examine the steps necessary to convert a site from the standard HTML that most of us are familiar with to a site that uses only CSS for all layout and styling.
The article assumes that you've already decided to convert your application to CSS, and provides an example to guide you through that process. It doesn't address the question, "Why bother converting my application to CSS?" That is a topic for a different article (and if you'd like me to write that, you should let me know.)
When I cast about for a site to convert, whose should appear but that of the person we all know and love, Ben Forta. (Those who know me know that I love Ben; I even staged a Rock Star rush for him at CFUN02.) With his kind permission, this article series will focus on CSS while changing his current site to one that uses only CSS for presentation.
Ben's Site - Challenges
I took two pages from his site (http://www.forta.com) and will work over this series of articles on converting these pages into an xHTML/CSS structured site. I'm sure some of you are asking, "Why Ben's site?" There were a variety of reasons it was chosen, one of them being that most people who would read this article know who Ben Forta, the Macromedia ColdFusion evangelist, is, and have probably hit his site at some time in their ColdFusion career.
More importantly from my point of view, Ben's pages are built using the old table/clear-pixel method and will convert well to a more standards-based approach. The pages that will be converted in this series of articles will be Ben's home page (which is his blog) and the ColdFusion page.
So what are the challenges with Ben's site? First and most importantly, content and presentation are combined, rather than being separated. Wrapping content in presentational markup does several things.
- It bloats the code. As seen in the table below, removing all presentational code that does not affect content results in a 10-25% reduction in the file size without doing anything else.
- Code that is presentational in nature is code that gets in the way of maintenance. Nested tables used for layout make finding the code that needs to change that much harder.
- Pages that are coded with presentational markup sacrifice structure. When a table is removed, and images are hidden, the resulting page is harder to understand.
Table 1: Presentational Markup vs. Structural Markup
| | Home.html (orig.) | Home.html (structured) | CF.html (orig.) | CF.html (structured) |
| HTML | 19.98 | 12.04 | 13.54 | 10.25 |
| DEPENDENCIES | 12.98 | 12.89 | 23.05 | 23.01 |
| Total | 32.96 | 24.93 | 36.59 | 33.25 |
Table 1: After all presentational markup is stripped out and the pages are re-rendered in structural (semantic) markup, the total size of each page and its dependencies goes down by about 25% for the home page and about 10% for the ColdFusion page.
Figure 1: The home page to forta.com with all tables stripped and images hidden.
In this series, we will be converting these pages slowly to correspond to the lessons in each article. Thus, in the beginning, our pages will definitely not look as nice as Ben's original. Instead, we will be laying the foundation for the CSS that is to come by deconstructing Ben's pages, making them cleaner and simpler. By the end of this series, we hope that the pages will be close to the originals in look, with a much cleaner structure that is easier to maintain.
To guide you through the initial process of converting to CSS, we've provided you with five files per page that you can download and view as you read. Each file represents a stage in the process, explained in our Files Table at the end of the article.
DocTypes, Sniffing, and Rendering Modes
One of the first things I noticed with these pages is that there is no DocType. All the pages simply start with an html tag. While DocTypes have not been commonly used in the past, the W3C specification requires that a <!DOCTYPE> declaration be the first item on a page. A valid DocType tells the browser which version of HTML or xHTML is being used and dictates how that page will be rendered. While many pages in the past haven't required the use of a DocType, any page that relies on CSS absolutely must have one.
Modern browsers still have to support their mistakes of the past. Therefore, they have two rendering modes:
- Standards Mode is the mode in which a browser will attempt to render a web page and its CSS following the standards specifications from the W3C.
- Quirks Mode is the mode in which a browser will render a web page and any CSS following its old, proprietary methods from the past. These rendering modes differ from browser to browser and are not guaranteed to work the same way across browsers.
The two modes are necessary since there are so many pages out there that were written for older browsers, back when the browsers themselves dictated how a page should be written. (Remember "Best viewed with Netscape"?) Browsers use DocType Sniffing to determine which mode to render a page in. Without exception, having no DocType on a page will result in a browser rendering that page in "Quirks Mode".
Unfortunately, different DocTypes trigger different rendering modes on different browsers. At this time, there are only two DocTypes that trigger standards mode across all browsers: HTML 4.01 Strict and XHTML 1.0 Strict (without the XML prolog). For some reason, the XML prolog (<?xml version="1.0" encoding="UTF-8"?>) throws Internet Explorer 6 into quirks mode.
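For reference, here is a sketch of what those two declarations look like, using the standard W3C public identifiers. The comments are labels for this listing only: a real page uses exactly one of the declarations, as its very first line, with nothing in front of it.

```html
<!-- Option 1: HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- Option 2: XHTML 1.0 Strict, with no XML prolog in front of it -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```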
So out of the two DocTypes available, which should we use? Most people are more familiar with HTML 4. However, I opt to use xHTML 1.0. The reasons I prefer using xHTML are:
- It's more stringent in requiring that a user follow the XML standards. While I know many people don't like that, I have found that by forcing myself to code in this way, I make far fewer syntax mistakes and the ones I do make are far easier to track using a validator. In the long run, coding to xHTML is actually faster.
- Because xHTML requires that all tags be closed (either self closing or in pairs), I find it far easier to determine what is affected by a CSS rule. Since HTML 4 does not require paragraphs be closed, tracking down CSS problems becomes far harder (especially since a validator won't catch something like this).
- xHTML is XML, which means that if I need to in the future, I can parse xHTML using ColdFusion's XML functionality.
One of the challenges with ColdFusion is managing white space. Because the specification requires that a DOCTYPE be the first line of a page, I tend to place my DOCTYPE in Application.cfm (as the first line). In these examples, however, since I am using HTML source, I'll simply place it at the top of each page.
Converting a Page
So now that I have a DOCTYPE on the page, it's time to start converting. The conversion process will be shown in a series of articles, with this first article devoted mainly to giving us a clean slate to begin with. The first step to a clean document is all about removal.
Removing Tables and Spacer GIFs
The first step in preparing a document for conversion is removing any presentational markup. This includes tables, spacing images, font tags, and any other purely presentational markup (such as bold and italic) as well as line breaks <br>. At the same time, I'm also removing all the inline CSS that is in the document. (To see the result, compare home1.htm and cf1.htm (the original pages from Ben's website) to home2.htm and cf2.htm, the pages that have been stripped of presentational markup.)
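To give a feel for what this removal step does, here is a small sketch. The markup is a hypothetical fragment written for illustration, not code copied from Ben's pages; the file name clear.gif and the paragraph text are assumptions.

```html
<!-- Before: hypothetical table layout with a spacer image, font tag, bold, and line break -->
<table border="0" cellpadding="0" cellspacing="0" width="600">
  <tr>
    <td><img src="clear.gif" width="10" height="1" alt=""></td>
    <td><font face="Arial" size="2"><b>About ColdFusion</b><br>
      Hypothetical paragraph text.</font></td>
  </tr>
</table>

<!-- After: the presentational markup is gone and only the content is left -->
About ColdFusion
Hypothetical paragraph text.
```

The "after" half is deliberately bare. It is not yet valid or well structured; those problems are handled in the steps that follow.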
xHTML
The next thing to do is to make sure the page conforms to its stated DocType (in this case, xHTML 1.0). This means converting any other markup in the pages to xHTML and then validating the pages and fixing problems until we have a valid xHTML 1.0 page.
xHTML 1.0 follows the same specification as HTML 4.01. It goes further by requiring that all markup be XML-compliant. This means that the following rules must be followed:
- All tags must be lowercase.
- All attributes must be lowercase and their values must be enclosed in double quotes.
- All attributes must be in name/value pairs.
- In HTML, there are some attributes which have no value. For example, an option tag could be written as <option selected>. In xHTML, this must be written as <option selected="selected">. Just make sure the attribute name and the value are the same.
- All tags must be closed.
- Therefore, all paragraphs must be opened and then closed. If a tag does not usually have a closing tag, use a space and a slash to close the tag. Examples would be <br />, <input />, and <img />.
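The rules above can be seen side by side in a short sketch. The fragment is hypothetical (the field name "topic" and the image file logo.gif are made up for illustration); the first half is the kind of markup HTML 4.01 tolerates, and the second half is one way to write the same thing as xHTML 1.0.

```html
<!-- HTML 4.01 style: uppercase tags, unquoted and minimized attributes, unclosed tags -->
<P>Pick a topic:
<SELECT NAME=topic>
  <OPTION SELECTED>ColdFusion
  <OPTION>Bookshelf
</SELECT>
<BR>
<IMG SRC=logo.gif ALT=Logo>

<!-- xHTML 1.0 style: lowercase, quoted name/value pairs, every tag closed -->
<p>Pick a topic:
  <select name="topic">
    <option selected="selected">ColdFusion</option>
    <option>Bookshelf</option>
  </select>
  <br />
  <img src="logo.gif" alt="Logo" />
</p>
```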
If you are efficient, like I am (also read as lazy), you can use an open source program called HTML Tidy to massage most of your HTML into xHTML. The program can be found at http://tidy.sourceforge.net/. For those who want to access it via a web page, you may do so at http://infohound.net/tidy/. Also, HTML Tidy is integrated into a number of editors, including my favorite CSS editor, TopStyle Pro (http://www.bradsoft.com/topstyle/index.asp).
Once a document has been turned into xHTML, it still needs to be validated. Think of validating a page as debugging your site. A page that validates will have fewer bugs in it, either CSS or structural. Validating also helps ensure that any problems that show up in your CSS are either your own or browser bugs (and minimizes the latter considerably). By far the best validators out there are the ones offered by the W3C itself: the markup validator at http://validator.w3.org/ for the xHTML, and the CSS validator at http://jigsaw.w3.org/css-validator/ for style sheets (validation is also available through TopStyle!). I ran home2.htm and cf2.htm (the pages that had had all of their presentational markup taken out) through the validator. It found 22 errors in home2.htm and 38 errors in cf2.htm.
In this case, the errors were items that were not in compliance with the HTML 4.01 specification. (Remember, xHTML 1.0 is HTML 4.01 in an XML format.) HTML 4.01 took away a lot of presentational markup in favor of CSS.
After all the errors are fixed, and the page validates to the proper xHTML 1.0 format, we are ready for the next step. (The resulting pages are cf3.htm and home3.htm.)
Structuring Content
As I mentioned before, one of the biggest drawbacks to using tables for layout is the lack of structural markup in a page when the tables are removed. Structural markup, also known as "semantic markup," gives your content structure in such a way that it can be easily understood without any other formatting or layout.
This means that headings are structured as headings <hn> with the most important heading on a page being given an <h1>. Subsequent headings are always marked in order with <h2>'s descending from the <h1> and <h3>'s descending from <h2>'s. Headings should always be in order and should never be skipped. I've found that many authors use an <h4> instead of or after an <h1> simply because that is the size they are looking for. Any heading can be styled any way we want using CSS. So use headings properly.
On the ColdFusion page (cf4.htm), the most important item on the page is currently the logo. (We will discuss changing the logo to text in a later article.) So for now, the logo needs to be surrounded by an <h1>.
I have further identified two other items that are important on this page. One of them is "Ben's Books". The other is "About ColdFusion". In my mind, both are equally important so both should be structured with an <h2>.
For the home page, I structured the blog as follows:
Dates are structured as <h3>. (The image showing the word "News" is currently structured as an <h2>.) Entries under a date are structured as <h4>. The formatting of <strong> is taken out.
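As a sketch, the heading outline described above might look like the following. The image file names, the alt text, and the date and entry text are placeholders, and wrapping the home page logo in an <h1> is an assumption carried over from the ColdFusion page rather than something stated for the home page.

```html
<!-- Assumed: the logo is the top-level heading, as on the ColdFusion page -->
<h1><img src="logo.gif" alt="Hypothetical logo text" /></h1>

<!-- The image showing the word "News" -->
<h2><img src="news.gif" alt="News" /></h2>

<!-- One date, with one entry beneath it -->
<h3>Hypothetical date</h3>
<h4>Hypothetical entry title</h4>
<p>Hypothetical entry text, with the old strong formatting removed.</p>
```

Each heading level descends from the one above it and no level is skipped, so the outline still makes sense with no styling applied.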
All text should be enclosed by headings or paragraphs. A group of related items, such as links or notes, should always be marked up as a list. Ben's CF page (cf3.htm) shows some links:
|
• <a href="./">ColdFusion</a>
• <a href="bookshelf">Bookshelf</a>
• <a href="tips">Tip-of-the-Day</a>
|
I assume that Ben wanted to style the links with a particular bullet and chose to do so using the actual special character. Using CSS, we can style a list bullet, so it's not really an issue anymore. Always mark up groups of related items as a list (ordered <ol>, unordered <ul>, or definition <dl>, depending on the content of the list itself). In this case the links should be an unordered list. I've also gone ahead and structured the items for the horizontal navigation as a list, taking out the images that were placed in between them as separators.
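Marked up as an unordered list, the three links shown earlier would look something like this. It is a minimal sketch; the markup in the downloadable cf4.htm may differ in detail.

```html
<!-- The bullet characters are gone; the list markup now supplies the bullets -->
<ul>
  <li><a href="./">ColdFusion</a></li>
  <li><a href="bookshelf">Bookshelf</a></li>
  <li><a href="tips">Tip-of-the-Day</a></li>
</ul>
```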
To see the difference, look at home3.htm or cf3.htm and then go ahead and look at home4.htm or cf4.htm. The differences in looking at a page with just content and structural markup are striking.
Structuring for Display
While structuring our content is all well and good, there are some things we need to group together to facilitate styling and positioning. To this end, we will use <div>'s, which are structural markup tags that have no semantic meaning on their own. Giving each div its own id will facilitate styling later on. (We will go into id's and classes in the next article.)
Home Page
The home page is styled with a header and two columns contained in a white, page-like area. So we basically need a div for the header and two divs wrapped in another div for the columns. (See home5.htm.) Divs may be nested within each other. The format (as seen with everything else taken out) is:
|
<div id="header"></div>
<div id="main">
  <div id="col1"></div>
  <div id="col2"></div>
</div>
<p id="footer"></p>
|
So why not wrap the paragraph with the footer text in a div? There really is no reason to. Since the footer paragraph is already a block-level element and there is nothing else to group with it, adding extra markup is unnecessary.
CF Page
The CF Page at this time is structured a bit differently. We still have a header, but Ben stylistically presents the top level navigation differently while adding a side navigation area. So our structure must accommodate that.
|
<div id="heading"></div>
<div id="mainnav"></div>
<div id="sidenav"></div>
<div id="content"></div>
<p id="foot"></p>
|
Notice that I gave different ID's to the CF page structure than I did to the home page. I know that these pages will be styled differently, and rather than using two style sheets to apply different styles to the same id names, I can instead create one style sheet for both pages.
End Result
While there is a good chance that we will have to change some of the structural markup in subsequent articles to support our aim of styling the page, these changes should be few and far between. The page itself looks boring at this point, but it is far more readable without any formatting than it was before. By using this well-structured skeleton, we will be in a much better position to begin styling the page.
Compare the differences between the end result of this article and the same pages when tables and other visual formatting were first taken out (cf3.htm compared to cf5.htm and home3.htm compared to home5.htm). If you were to choose a more readable page, which would it be?
We still have a ways to go to get this page looking nice. In the next article, I'll take you through selecting elements using CSS and fonts.
Table 2: Article Files
| File Name | Description |
| Home1.htm | Original Blog Page as taken from Ben's Site |
| Home2.htm | Blog Page with Presentational Markup Removed |
| Home3.htm | Blog Page with xHTML errors fixed and validated to DocType compliance |
| Home4.htm | Blog Page structured for display |
| Home5.htm | Blog Page structured to facilitate styling and positioning |
| CF1.htm | ColdFusion Page as taken from Ben's Site |
| CF2.htm | ColdFusion Page with Presentational Markup Removed |
| CF3.htm | ColdFusion Page with xHTML errors fixed and validated to DocType compliance |
| CF4.htm | ColdFusion Page structured for display |
| CF5.htm | ColdFusion Page structured to facilitate styling and positioning |
Sandra Clark, an advanced Macromedia Certified ColdFusion developer, is a Senior Software Developer with the Constella Group in Bethesda, Maryland. She has contributed material to the ColdFusion 5.0 Certified Developer Study Guide published by Syngress Media/Osborne McGraw Hill and to the ColdFusion Developers Journal. She has also spoken at various CFUGs and ColdFusion user conferences around the country. Sandra is an active proponent of applying accepted and proven web standards to development as a way of improving accessibility as well as making life easier for developers.