HTML4 variations
Since its inception, HTML and its associated protocols gained acceptance relatively quickly. However, no clear standards existed in the early years of the language. Though its creators originally conceived of HTML as a semantic language devoid of presentation details, practical uses pushed many presentational elements and attributes into the language, driven largely by the various browser vendors. The latest standards surrounding HTML reflect efforts to overcome the sometimes chaotic development of the language and to create a rational foundation for building both meaningful and well-presented documents. To return HTML to its role as a semantic language, the W3C has developed style languages such as CSS and XSL to shoulder the burden of presentation. In conjunction, the HTML specification has slowly reined in the presentational elements.
There are two axes differentiating the variations of HTML as currently specified: SGML-based HTML versus XML-based HTML (referred to as XHTML) on one axis, and strict versus transitional (loose) versus frameset on the other axis.
SGML-based versus XML-based HTML
One difference in the latest HTML specifications lies in the distinction between the SGML-based specification and the XML-based specification. The XML-based specification is usually called XHTML to distinguish it clearly from the more traditional definition. However, the root element name continues to be "html" even in the XHTML-specified HTML. The W3C intended XHTML 1.0 to be identical to HTML 4.01 except where limitations of XML over the more complex SGML require workarounds. Because XHTML and HTML are closely related, they are sometimes documented in parallel. In such circumstances, some authors conflate the two names as (X)HTML or X(HTML).
Like HTML 4.01, XHTML 1.0 has three sub-specifications: strict, transitional and frameset.
Aside from the different opening declarations for a document, the differences between an HTML 4.01 and an XHTML 1.0 document (in each of the corresponding DTDs) are largely syntactic. The underlying syntax of HTML allows many shortcuts that XHTML does not, such as elements with optional opening or closing tags, and even empty elements which must not have an end tag. By contrast, XHTML requires all elements to have an opening tag and a closing tag. XHTML, however, also introduces a new shortcut: an XHTML tag may be opened and closed within the same tag by including a slash before the end of the tag, like this: <br/>. The introduction of this shorthand, which is not used in the SGML declaration for HTML 4.01, may confuse earlier software unfamiliar with this new convention. A fix for this is to include a space before closing the tag, as such: <br />.[67]
To understand the subtle differences between HTML and XHTML, consider the transformation of a valid and well-formed XHTML 1.0 document that adheres to Appendix C (see below) into a valid HTML 4.01 document. Making this translation requires the following steps:
- The language for an element should be specified with a lang attribute rather than the XHTML xml:lang attribute. XHTML uses XML's built-in language-defining attribute.
- Remove the XML namespace (xmlns=URI). HTML has no facilities for namespaces.
- Change the document type declaration from XHTML 1.0 to HTML 4.01 (see the DTD section for further explanation).
- If present, remove the XML declaration (typically <?xml version="1.0" encoding="utf-8"?>).
- Ensure that the document's MIME type is set to text/html. For both HTML and XHTML, this comes from the HTTP Content-Type header sent by the server.
- Change the XML empty-element syntax to an HTML-style empty element (<br/> to <br>).
Those are the main changes necessary to translate a document from XHTML 1.0 to HTML 4.01. Translating from HTML to XHTML would also require the addition of any omitted opening or closing tags. Whether coding in HTML or XHTML, it may be simplest to always include the optional tags rather than remembering which tags can be omitted.
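The document-level steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name xhtml_to_html is hypothetical, the input is assumed to already follow Appendix C (so every element carries a lang attribute alongside xml:lang, and the empty-element syntax is used only for empty elements), and HTML 4.01 Strict is assumed as the target document type. Regular expressions are not a real markup parser, and the MIME type step is not covered because it is set by the server rather than in the document.

```python
import re

# Assumed target: the HTML 4.01 Strict document type declaration.
HTML401_STRICT_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
)


def xhtml_to_html(source: str) -> str:
    """Apply the translation steps to an Appendix C style XHTML 1.0 string."""
    # Remove the XML declaration, if present.
    text = re.sub(r'^\s*<\?xml[^>]*\?>\s*', '', source)
    # Change the document type declaration from XHTML 1.0 to HTML 4.01.
    text = re.sub(r'<!DOCTYPE[^>]*>', HTML401_STRICT_DOCTYPE, text, count=1)
    # Remove the XML namespace; HTML has no facilities for namespaces.
    text = re.sub(r'\s+xmlns="[^"]*"', '', text)
    # Drop xml:lang; the plain lang attribute is assumed to be present already.
    text = re.sub(r'\s+xml:lang="[^"]*"', '', text)
    # Change the XML empty-element syntax (<br /> or <br/>) to <br>.
    text = re.sub(r'\s*/>', '>', text)
    return text
```

Calling xhtml_to_html on a small Appendix C document should yield markup whose first line is the HTML 4.01 doctype and whose html start tag keeps only the lang attribute. A production converter would use a proper parser instead of pattern matching.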
A well-formed XHTML document adheres to all the syntax requirements of XML. A valid document adheres to the content specification for XHTML, which describes the document structure.
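Well-formedness can be checked mechanically, because any XML parser rejects markup that breaks XML's syntax rules. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical. It tests only well-formedness, not validity: checking a document against the XHTML content specification would need a validating parser and the DTD, which this example does not attempt.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup without error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # The parser stops at the first syntax violation it meets.
        return False
    return True
```

Under this check, a fragment such as <p>one<br />two</p> passes, while the HTML-style <p>one<br>two</p> fails because the br element is never closed. A fragment such as <p><body /></p> also passes, even though it is not valid XHTML, which illustrates the difference between the two terms.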
The W3C recommends several conventions to ensure an easy migration between HTML and XHTML. The following steps can be applied to XHTML 1.0 documents only:
- Include both xml:lang and lang attributes on any elements assigning language.
- Use the empty-element syntax only for elements specified as empty in HTML.
- Include an extra space in empty-element tags: for example <br /> instead of <br/>.
- Include explicit close tags for elements that permit content but are left empty (for example, <div></div>, not <div />).
- Omit the XML declaration.
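Some of these conventions lend themselves to an automated check. The following Python sketch is an assumption-laden illustration rather than an official tool: the function name compatibility_problems and its message wording are hypothetical, and the regular expression only approximates real tag syntax. It covers three of the five conventions (the XML declaration, the space before the slash, and the empty-element syntax on non-empty elements) and does not check the language attributes.

```python
import re

# Elements declared EMPTY in the HTML 4.01 DTDs.
HTML4_EMPTY = {
    "area", "base", "basefont", "br", "col", "frame",
    "hr", "img", "input", "isindex", "link", "meta", "param",
}

# A tag that uses the XML empty-element syntax; group 2 captures any
# whitespace that sits directly before the closing slash.
SELF_CLOSING = re.compile(r'<([A-Za-z][A-Za-z0-9]*)\b[^<>]*?(\s*)/>')


def compatibility_problems(source: str) -> list:
    """List departures from three of the compatibility conventions."""
    problems = []
    if source.lstrip().startswith("<?xml"):
        problems.append("XML declaration present")
    for match in SELF_CLOSING.finditer(source):
        name, space = match.group(1), match.group(2)
        if name.lower() not in HTML4_EMPTY:
            # Elements that permit content need an explicit close tag.
            problems.append(f"<{name} /> should be <{name}></{name}>")
        elif not space:
            problems.append(f"<{name}/> needs a space before the slash")
    return problems
```

An empty result means only that none of the three checked patterns was found; it does not establish that a document is valid or fully compatible.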
When an author carefully follows the W3C's compatibility guidelines, a user agent should be able to interpret the document equally as HTML or XHTML. For documents that are XHTML 1.0 and have been made compatible in this way, the W3C permits them to be served either as HTML (with a text/html MIME type), or as XHTML (with an application/xhtml+xml or application/xml MIME type). When delivered as XHTML, browsers should use an XML parser, which adheres strictly to the XML specifications for parsing the document's contents.
Transitional versus strict
HTML 4 defined three different versions of the language: Strict, Transitional (once called Loose) and Frameset. The Strict version is intended for new documents and is considered best practice, while the Transitional and Frameset versions were developed to make it easier to transition documents that conformed to older HTML specifications, or did not conform to any specification, to a version of HTML 4. The Transitional and Frameset versions allow for presentational markup, which is omitted in the Strict version. Instead, cascading style sheets are encouraged to improve the presentation of HTML documents. Because XHTML 1 only defines an XML syntax for the language defined by HTML 4, the same differences apply to XHTML 1 as well.
The Transitional version allows the following parts of the vocabulary, which are not included in the Strict version:
- A looser content model
  - Inline elements and plain text are allowed directly in: body, blockquote, form, noscript and noframes
- Presentation related elements
  - underline (u) (Deprecated. Can confuse a visitor with a hyperlink.)
  - strike-through (s)
  - center (Deprecated. Use CSS instead.)
  - font (Deprecated. Use CSS instead.)
  - basefont (Deprecated. Use CSS instead.)
- Presentation related attributes
  - background (Deprecated. Use CSS instead.) and bgcolor (Deprecated. Use CSS instead.) attributes for body (required element according to the W3C.) element.
  - align (Deprecated. Use CSS instead.) attribute on div, form, paragraph (p) and heading (h1...h6) elements
  - align (Deprecated. Use CSS instead.), noshade (Deprecated. Use CSS instead.), size (Deprecated. Use CSS instead.) and width (Deprecated. Use CSS instead.) attributes on hr element
  - align (Deprecated. Use CSS instead.), border, vspace and hspace attributes on img and object (caution: the object element is only supported in Internet Explorer (from the major browsers)) elements
  - align (Deprecated. Use CSS instead.) attribute on legend and caption elements
  - align (Deprecated. Use CSS instead.) and bgcolor (Deprecated. Use CSS instead.) on table element
  - nowrap (Obsolete), bgcolor (Deprecated. Use CSS instead.), width, height on td and th elements
  - bgcolor (Deprecated. Use CSS instead.) attribute on tr element
  - clear (Obsolete) attribute on br element
  - compact attribute on dl, dir and menu elements
  - type (Deprecated. Use CSS instead.), compact (Deprecated. Use CSS instead.) and start (Deprecated. Use CSS instead.) attributes on ol and ul elements
  - type and value attributes on li element
  - width attribute on pre element
- Additional elements in Transitional specification
  - menu (Deprecated. Use CSS instead.) list (no substitute, though unordered list is recommended)
  - dir (Deprecated. Use CSS instead.) list (no substitute, though unordered list is recommended)
  - isindex (Deprecated.) (element requires server-side support and is typically added to documents server-side; form and input elements can be used as a substitute)
  - applet (Deprecated. Use the object element instead.)
- The language (Obsolete) attribute on script element (redundant with the type attribute).
- Frame related entities
  - iframe
  - noframes
  - target (Deprecated in the map, link and form elements.) attribute on a, client-side image-map (map), link, form and base elements
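As a rough illustration of how the list above could be applied, the Python sketch below scans markup for Transitional-only vocabulary using the standard html.parser module. The class and function names are hypothetical, the attribute table is a deliberately partial sample of the entries listed above, and the looser content model is not checked at all. A real conformance check would validate the document against the Strict DTD instead.

```python
from html.parser import HTMLParser

# Elements the list above gives as Transitional-only.
TRANSITIONAL_ELEMENTS = {
    "u", "s", "center", "font", "basefont",
    "menu", "dir", "isindex", "applet", "iframe", "noframes",
}

# A partial sample of the Transitional-only attributes, keyed by element.
TRANSITIONAL_ATTRIBUTES = {
    "body": {"background", "bgcolor"},
    "br": {"clear"},
    "hr": {"align", "noshade", "size", "width"},
    "script": {"language"},
    "table": {"align", "bgcolor"},
    "a": {"target"},
}


class TransitionalFinder(HTMLParser):
    """Collect uses of vocabulary that the Strict version omits."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag in TRANSITIONAL_ELEMENTS:
            self.findings.append(f"element {tag}")
        for name, _value in attrs:
            if name in TRANSITIONAL_ATTRIBUTES.get(tag, ()):
                self.findings.append(f"attribute {name} on {tag}")


def find_transitional_markup(source):
    """Return the Transitional-only items found, in document order."""
    finder = TransitionalFinder()
    finder.feed(source)
    finder.close()
    return finder.findings
```

Markup that returns an empty list uses none of the sampled items, which is a necessary but not sufficient condition for conforming to the Strict version.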
The Frameset version includes everything in the Transitional version, as well as the frameset element (used instead of body) and the frame element.
Frameset versus transitional
In addition to the above transitional differences, the frameset specifications (whether XHTML 1.0 or HTML 4.01) specify a different content model, with frameset replacing body, that contains either frame elements, or optionally noframes with a body.
Summary of specification versions
As this list demonstrates, the loose versions of the specification are maintained for legacy support. However, contrary to popular misconceptions, the move to XHTML does not imply a removal of this legacy support. Rather the X in XML stands for extensible and the W3C is modularizing the entire specification and opening it up to independent extensions. The primary achievement in the move from XHTML 1.0 to XHTML 1.1 is the modularization of the entire specification. The strict version of HTML is deployed in XHTML 1.1 through a set of modular extensions to the base XHTML 1.1 specification. Likewise, someone looking for the loose (transitional) or frameset specifications will find similar extended XHTML 1.1 support (much of it is contained in the legacy or frame modules). The modularization also allows for separate features to develop on their own timetable. So for example, XHTML 1.1 will allow quicker migration to emerging XML standards such as MathML (a presentational and semantic math language based on XML) and XForms—a new highly advanced web-form technology to replace the existing HTML forms.
In summary, the HTML 4 specification primarily reined in all the various HTML implementations into a single clearly written specification based on SGML. XHTML 1.0 ported this specification, as is, to the new XML-defined specification. Next, XHTML 1.1 takes advantage of the extensible nature of XML and modularizes the whole specification. XHTML 2.0 was intended to be the first step in adding new features to the specification in a standards-body-based approach.
HTML5 variations
WhatWG HTML versus HTML5
The WhatWG considers its work a living standard HTML that reflects the state of the art in major browser implementations by Apple (Safari), Google (Chrome), Mozilla (Firefox), Opera (Opera), and others. HTML5 is specified by the HTML Working Group of the W3C following the W3C process. As of 2013 both specifications are similar and mostly derived from each other, i.e., the work on HTML5 started with an older WhatWG draft, and later the WhatWG living standard was based on HTML5 drafts in 2011.
Hypertext features not in HTML
HTML lacks some of the features found in earlier hypertext systems, such as source tracking, fat links and others. Even some hypertext features that were in early versions of HTML have been ignored by most popular web browsers until recently, such as the link element and in-browser Web page editing.
Sometimes Web services or browser manufacturers remedy these shortcomings. For instance, wikis and content management systems allow surfers to edit the Web pages they visit.
WYSIWYG editor
There are some WYSIWYG editors (What You See Is What You Get), in which the user lays out everything as it is to appear in the HTML document using a graphical user interface (GUI), often similar to word processors. The editor renders the document rather than showing the code, so authors do not require extensive knowledge of HTML.
The WYSIWYG editing model has been criticized, primarily because of the low quality of the generated code; there are voices advocating a change to the WYSIWYM model (What You See Is What You Mean).
WYSIWYG editors remain a controversial topic because of their perceived flaws such as:
- Relying mainly on layout as opposed to meaning, often using markup that does not convey the intended meaning but simply copies the layout.
- Often producing extremely verbose and redundant code that fails to make use of the cascading nature of HTML and CSS.
- Often producing ungrammatical markup, often called tag soup, or semantically incorrect markup (such as <em> for italics).
- As a great deal of the information in HTML documents is not in the layout, the model has been criticized for its "what you see is all you get"-nature.