W3C WD-script-970314


Client-side Scripting and HTML

W3C Working Draft 14-Mar-97

This version:
http://www.w3.org/pub/WWW/TR/WD-script-970314
Latest version:
http://www.w3.org/pub/WWW/TR/WD-script
Author:
Dave Raggett <dsr@w3.org>

Status of This Document

This draft is work under review by the W3C HTML Working Group, for potential incorporation in an upcoming version of the HTML specification, code named Cougar. Please remember this is subject to change at any time, and may be updated, replaced or obsoleted by other documents. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".

A list of current W3C Working Drafts can be found at http://www.w3.org/pub/WWW/TR. This is work in progress and does not imply endorsement by, or the consensus of, either W3C or members of the HTML working group. Further information about Cougar is available at http://www.w3.org/pub/WWW/MarkUp/Cougar/.

Please send detailed comments to www-html-editor@w3.org. We cannot guarantee a personal response, but summaries will be maintained off the Cougar page. Public discussion on HTML features takes place on www-html@w3.org. To subscribe send a message to www-html-request@w3.org with subscribe in the subject.

Abstract

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. This specification extends HTML to support locally executable scripts including JavaScript, VBScript, and other scripting languages and systems.



Introduction

This specification extends HTML to support client-side scripting of HTML documents and objects embedded within HTML documents. Scripts can be supplied in separate files or embedded directly within HTML documents in a manner independent of the scripting language. Scripts allow HTML forms to process input as it is entered: to ensure that values conform to specified patterns, to check consistency between fields and to compute derived fields.

Scripts can also be used to simplify authoring of active documents. The behaviour of objects inserted into HTML documents can be tailored with scripts that respond to events generated by such objects. This enables authors to create compelling and powerful web content. This specification covers extensions to HTML needed for client-side scripting, but leaves out the architectural and application programming interface issues for how scripting engines are implemented and how they communicate with the document and other objects on the same page.


The SCRIPT Element

<!-- SCRIPT is a character-like element for embedding script code
      that can be placed anywhere in the document HEAD or BODY -->

<!ELEMENT script - - CDATA>
<!ATTLIST script
        type         CDATA    #IMPLIED -- media type for script language --
        language     CDATA    #IMPLIED -- predefined script language name --
        src          %URL     #IMPLIED -- URL for an external script --
        >

The content model for the SCRIPT element is defined as CDATA. In this kind of element, only one delimiter is recognized by a conforming parser: the end tag open (ETAGO) delimiter (i.e. the string "</"). The recognition of this delimiter is constrained to occur only when immediately followed by an SGML name start character ([a-zA-Z]). All characters which occur between the SCRIPT start tag and the first occurrence of ETAGO in such a context must be provided to the appropriate script engine.

Note that all other SGML markup (such as comments, marked sections, etc.) appearing inside a SCRIPT element is construed to be actual character content of the SCRIPT element and is not parsed as markup. A particular script engine may choose to treat such markup as it wishes; however a script engine should document such treatment.

The restriction on appearance of the ETAGO delimiter may cause problems with script code which wishes to construct HTML content in the code. For example, the following code is invalid due to the presence of the "</EM>" characters found inside of the SCRIPT element:

    <SCRIPT type="text/javascript">
      document.write ("<EM>This won't work</EM>")
    </SCRIPT>

A conforming parser would treat the "</EM>" data as an end tag and complain that it was an end tag for an element not opened, or perhaps actually close an open element. In any case, it is recognized as markup and not as data.

In JavaScript, this code can be expressed legally as follows by ensuring that the apparent ETAGO delimiter does not appear immediately before an SGML name start character:

    <SCRIPT type="text/javascript">
      document.write ("<EM>This will work<\/EM>")
    </SCRIPT>

In Tcl this looks like:

    <SCRIPT type="text/tcl">
      document write "<EM>This will work<\/EM>"
    </SCRIPT>

In VBScript, you can avoid the problem with the Chr() function, e.g.

    "<EM>This will work<" & Chr(47) & "EM>"

Each scripting language should recommend language specific support for resolving this issue.
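
As a rough illustration of the ETAGO rule described above, the following JavaScript sketch locates the point at which a conforming parser would end the SCRIPT content, and rewrites a script so that no such point remains. The function names findEtago and escapeEtago are hypothetical and are not defined by this specification. The rewrite is only safe where "\/" means the same as "/", as it does inside JavaScript string literals.

```javascript
// Returns the index of the first "</" that is immediately followed by an
// SGML name start character ([a-zA-Z]), or -1 if there is no such delimiter.
function findEtago(text) {
  const match = /<\/[a-zA-Z]/.exec(text);
  return match ? match.index : -1;
}

// Rewrites every "</" + letter as "<\/" + letter. Inside a JavaScript string
// literal the value is unchanged, but the parser no longer sees an ETAGO.
function escapeEtago(jsSource) {
  return jsSource.replace(/<\/(?=[a-zA-Z])/g, "<\\/");
}

const broken = 'document.write ("<EM>This won\'t work</EM>")';
const fixed = escapeEtago(broken);

console.log(findEtago(broken) !== -1); // the "</EM>" would end the element early
console.log(fixed);
console.log(findEtago(fixed));
```

A "</" followed by anything other than a letter, such as a space or a digit, is left untouched because it is not recognized as a delimiter.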

The attributes used with SCRIPT elements, all of which are optional, are described below:

TYPE
The Internet media type specifying the scripting language, for instance: type="text/javascript" or type="text/vbscript".
LANGUAGE
Names the scripting language using well known identifiers, for instance "JavaScript" or "VBScript". This attribute is deprecated in favor of the TYPE attribute.
SRC
The optional SRC attribute gives a URL for an external script. If a SRC attribute is present, the content of the SCRIPT element should be ignored.

HTML documents can include multiple SCRIPT elements, which can be placed in the document HEAD or BODY. This allows script statements for a form to be placed near to the corresponding FORM element. Note that some script engines evaluate script statements dynamically as the document is loaded, so that there is the possibility that references to objects occurring later in the document will fail.

If authors include script elements with different scripting languages in the same document, then user agents should attempt to process statements in each such language. This corresponds to programs with some procedures written in one language, and others in another language, e.g. C and FORTRAN.

Default Scripting Language

The default scripting language in the absence of TYPE or LANGUAGE attributes can be specified by a META element in the document HEAD, for example:

    <META HTTP-EQUIV="Content-Script-Type" CONTENT="text/tcl">

where the CONTENT attribute specifies the media type for the scripting language, and the HTTP-EQUIV attribute is the literal string "Content-Script-Type". If there are several such META elements the last one determines the scripting language.

In the absence of such a META element, the default can be set by a Content-Script-Type HTTP header in the server response, for example:

    Content-Script-Type: text/tcl

If there are several such headers, the last one takes precedence over earlier ones.

Note that in the absence of either a META element or an HTTP header, many user agents assume the default scripting language to be JavaScript.
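
The precedence rules in this section can be sketched in JavaScript as follows. This is only an illustration: the function name resolveScriptType and its argument shapes are hypothetical, the mapping from LANGUAGE names to "text/..." media types is an assumption, and so is the choice to let TYPE win over LANGUAGE when both are present.

```javascript
// Picks the media type for one SCRIPT element.
// script:      { type, language } from the element's attributes (either may be undefined)
// metaTypes:   CONTENT values of Content-Script-Type META elements, in document order
// headerTypes: values of Content-Script-Type HTTP headers, in the order received
function resolveScriptType(script, metaTypes, headerTypes) {
  if (script.type) return script.type;
  // Assumption: a LANGUAGE name such as "VBScript" maps onto "text/vbscript".
  if (script.language) return "text/" + script.language.toLowerCase();
  // The last META element determines the default.
  if (metaTypes.length > 0) return metaTypes[metaTypes.length - 1];
  // Without a META element, the last HTTP header takes precedence.
  if (headerTypes.length > 0) return headerTypes[headerTypes.length - 1];
  // Assumption: this user agent falls back to JavaScript, as many do.
  return "text/javascript";
}

console.log(resolveScriptType({}, ["text/vbscript", "text/tcl"], ["text/javascript"]));
console.log(resolveScriptType({ language: "VBScript" }, ["text/tcl"], []));
```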

Self-Modifying Documents

Some scripting languages permit script statements to be used to modify the document as it is being parsed. For instance, the HTML document:

    <title>Test Document</title>
    <script type="text/javascript">
        document.write("<p><b>Hello World!<\/b>")
    </script>

has the same effect as the document:

    <title>Test Document</title>
    <p><b>Hello World!</b>

From the perspective of SGML, each script element is evaluated by the application and can be modelled as a two-step process: (1) dynamically defining an anonymous CDATA entity corresponding to the combined text written by all of the document.write or equivalent statements in the script element and (2) referencing the entity immediately after parsing the script end tag. HTML documents are constrained to conform to the HTML document type definition both before processing any script elements and after processing all script elements.
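
The two-step model can be approximated with a short JavaScript sketch. It assumes that every script element is JavaScript and that document.write is the only way a script produces text; the function name expandScripts and the stand-in document object are hypothetical, and a real user agent would do this inside its parser rather than on a string.

```javascript
// Replaces each script element with the text its document.write calls produce.
function expandScripts(html) {
  const scriptElement = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  return html.replace(scriptElement, (whole, code) => {
    // Step 1: collect everything written by this script element into one
    // anonymous "entity".
    let entity = "";
    const document = { write: (...parts) => { entity += parts.join(""); } };
    new Function("document", code)(document);
    // Step 2: reference the entity at the position of the script end tag.
    return entity;
  });
}

const source = [
  "<title>Test Document</title>",
  '<script type="text/javascript">',
  '    document.write("<p><b>Hello World!<\\/b>")',
  "</script>",
].join("\n");

console.log(expandScripts(source));
```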

The NOSCRIPT element

    <!ELEMENT noscript - - (%body.content)>

The content of this element is rendered only when the user agent doesn't support client-side scripting, or doesn't support a scripting language used by preceding script elements in the current document. It gives authors a means to provide an invitation to upgrade to a newer browser, and is designed to work with downlevel browsers. The NOSCRIPT element can be placed anywhere you can place an HTML DIV element.

    <NOSCRIPT>
      <P>This document works best with script enabled browsers
    </NOSCRIPT>
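
The rendering rule for NOSCRIPT can be stated as a small predicate. The sketch below is illustrative only; the function name shouldRenderNoscript and the use of media type strings to identify languages are assumptions.

```javascript
// supportedTypes: media types the user agent can execute (empty if it has no
//                 client-side scripting support at all)
// precedingScriptTypes: media types of the script elements that occur before
//                 the NOSCRIPT element in the current document
function shouldRenderNoscript(supportedTypes, precedingScriptTypes) {
  if (supportedTypes.length === 0) return true;
  return precedingScriptTypes.some((type) => !supportedTypes.includes(type));
}

console.log(shouldRenderNoscript([], ["text/javascript"]));
console.log(shouldRenderNoscript(["text/javascript"], ["text/javascript", "text/tcl"]));
console.log(shouldRenderNoscript(["text/javascript"], ["text/javascript"]));
```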

Examples of Event Handlers using SCRIPT

You can include the handler for an event in an HTML document using the SCRIPT element. Here is a VBScript example of an event handler for a text field:

    <INPUT NAME=edit1 size=50>    
    <SCRIPT TYPE="text/vbscript">
      Sub edit1_changed()
        If edit1.value = "abc" Then
          button1.enabled = True
        Else
          button1.enabled = False
        End If
      End Sub
    </SCRIPT>

Here is the same example using Tcl:

    <INPUT NAME=edit1 size=50>
    <SCRIPT TYPE="text/tcl">
      proc edit1_changed {} {
        if {[edit1 value] == "abc"} {
          button1 enable 1
        } else {
          button1 enable 0
        }
      }
      edit1 onChange edit1_changed
    </SCRIPT>

Here is a JavaScript example of event binding within a script. First, a simple click handler:

    <script language=JavaScript>
      function my_onclick() {
         . . .
      }

      document.form.button.onclick = my_onclick
    </script>

Here's a more interesting window handler:

    <script language=JavaScript>
      function my_onload() {
         . . .
      }

      var win = window.open("some/other/URL")
      if (win) win.onload = my_onload
    </script>

In Tcl this looks like:

    <script language=tcl>
        proc my_onload {} {
          . . .
        }
        set win [window open "some/other/URL"]
        if {$win != ""} {
            $win onload my_onload
        }
    </script>

Scoping of Object Names

Scripting engines are responsible for binding object references in scripts to objects associated with documents. Script engines may support more than one language. This allows handlers to be written in one language, and the event binding to be defined in another, thereby avoiding the limitations of particular languages.

Some scripting languages like VBScript provide language conventions for binding objects that source events to script functions that handle events. Other languages typically allow you a run-time mechanism to set up such bindings, e.g. to register call-backs, or a way to poll for events and dispatch them to appropriate handlers.

How do scripts reference objects? In many cases objects associated with HTML elements such as form fields can be identified by virtue of the document markup, e.g. the tag names and attribute values. HTML ID attributes provide identifiers that are unique throughout a given document, while NAME attributes for elements defining form fields are limited in scope to the enclosing FORM element.

Scripting systems may allow authors to script objects that occur within an object associated with an OBJECT or IMG element. Document frames allow one document to be nested in another. Script handlers could be placed in a parent document and used to control the behaviour of a child document. Scripts may also be used for objects external to documents, such as the user agent or other applications. A particularly simple form of scripting is to just wire up objects that source events with ones that sink events. One event may be multicast to several recipient objects.

One way to deal with naming is to introduce language specific naming conventions, e.g. "document.form1.button1" as used by JavaScript. Another is to rely on the context in which a SCRIPT element is located to guide search for a named object. For instance, if the SCRIPT element is within a FORM element, objects associated with form elements with matching NAME values may be sought in preference to elements with matching ID values.

Some scripting languages, such as VBScript, limit the scope of references to a given module, but don't provide language specific means for defining modules. If a module is associated with an HTML document then element ID values can be used for binding handlers to objects. If the module is associated with an HTML form, then form field NAME attribute values could in principle be used by the script engine to unambiguously bind handlers to objects by placing the handlers in a SCRIPT element within the associated FORM element.
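
One possible reading of the context-based lookup described above is sketched here in JavaScript. The data shapes (a document with an elements list, a form with a fields list) and the function name resolveObject are hypothetical; this specification does not define how a script engine performs the search.

```javascript
// identifier:    the name used in the script
// doc:           { elements: [{ id, ... }] } for the whole document
// enclosingForm: { fields: [{ name, ... }] } for the FORM containing the
//                SCRIPT element, or null if the SCRIPT is not inside a FORM
function resolveObject(identifier, doc, enclosingForm) {
  if (enclosingForm) {
    // NAME values are scoped to the enclosing FORM and are tried first.
    const field = enclosingForm.fields.find((f) => f.name === identifier);
    if (field) return field;
  }
  // ID values are unique throughout the document.
  return doc.elements.find((e) => e.id === identifier) || null;
}

const doc = { elements: [{ id: "button1", kind: "document-level" }] };
const form = { fields: [{ name: "button1", kind: "form-field" }] };

console.log(resolveObject("button1", doc, form).kind);
console.log(resolveObject("button1", doc, null).kind);
```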


Intrinsic Events

A number of common events can be handled using attributes placed on the HTML elements associated with the object generating the event. The attribute names for intrinsic events are case insensitive. The attribute value is a scripting language dependent string. It gives one or more scripting instructions to be executed whenever the corresponding event occurs. Note that document.write or equivalent statements in intrinsic event handlers create and write to a new document rather than modifying the current one.

In the following example, userName is a required text field. When a user attempts to leave the field, the onBlur event calls a JavaScript function to confirm that userName has an acceptable value.

    <INPUT NAME="userName" onBlur="validUserName(this.value)">

Here is another JavaScript example:

    <INPUT NAME="num"
        onChange="if (!checkNum(this.value, 1, 10)) 
            {this.focus();this.select();} else {thanks()}"
        VALUE="0">

Scripting Language

The scripting language assumed for intrinsic events is determined by the default scripting language as specified above for the SCRIPT element.

SGML Parsing of intrinsic event handler attributes

The script attributes for intrinsic events are defined as CDATA. The SGML processing of CDATA attribute values requires that (1) entity replacement occur within the attribute value; and (2) that the attribute value be delimited by the appearance of LIT ( " ) or LITA ( ' ). The literal delimiter which terminates the attribute value must be the same as the delimiter used to initiate the attribute value. Given these lexical restrictions, the delimiters LIT or LITA, ERO (entity reference open - '&'), and CRO (character reference open - "&#") may not freely occur as script code within a script event handler attribute. To resolve this issue, it is recommended that script event handler attributes always use LIT delimiters and that occurrences of '"' and '&' inside an event handler attribute be written as follows:

    '"'  should be written as "&quot;" or as "&#34;"
    '&'  should be written as "&amp;"  or as "&#38;"

For example:

    <INPUT NAME="num"
      onChange="if (compare(this.value, &quot;help&quot;)) {gethelp()}"
      VALUE="0">
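
The recommended escaping can be applied mechanically. The following JavaScript sketch converts handler code into a value that is safe inside a LIT-delimited attribute, and reverses the conversion as an SGML parser's entity replacement would; the function names are hypothetical.

```javascript
// Escapes handler code for use inside a double-quoted attribute value.
// "&" must be replaced first so the "&" of "&quot;" is not escaped again.
function toEventAttributeValue(script) {
  return script.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Undoes the escaping in a single pass, accepting both the named and the
// numeric forms listed above.
function fromEventAttributeValue(value) {
  return value.replace(/&(quot|amp|#34|#38);/g, (whole, name) =>
    name === "quot" || name === "#34" ? '"' : "&");
}

const handler = 'if (compare(this.value, "help")) {gethelp()}';
const attribute = toEventAttributeValue(handler);

console.log('onChange="' + attribute + '"');
console.log(fromEventAttributeValue(attribute) === handler);
```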

Note that SGML permits LITA ( ' ) to be included in attribute strings quoted by LIT ( " ), and vice versa. The following is therefore okay:

    "this is 'fine'" and 'so is "this"'

The following is an example of how intrinsic events are specified in the HTML document type definition:

    <!ATTLIST SELECT
        name        CDATA       #REQUIRED
        size        NUMBER      #IMPLIED
        multiple   (multiple)   #IMPLIED
        onFocus     CDATA       #IMPLIED
        onBlur      CDATA       #IMPLIED
        onChange    CDATA       #IMPLIED
        >

The Set of Intrinsic Events

The set of intrinsic events is listed below together with the HTML elements they can be used with. This set is expected to grow slightly:

onLoad
A load event occurs when the browser finishes loading a window or all frames within a FRAMESET. The onLoad event handler executes the scriptlet when a load event occurs. This attribute can only be used with BODY or FRAMESET elements.
onUnload
An unload event occurs when you exit a document. The onUnload event handler executes the scriptlet when an unload event occurs. This attribute can only be used with BODY or FRAMESET elements.
onClick
A click event occurs when an anchor or form field is clicked. The onClick event handler executes the scriptlet when a click event occurs. This event is generated by buttons, checkboxes, radio buttons, hypertext links, reset and submit buttons. This attribute can only be used with INPUT and anchor elements.
onMouseOver
This event is sent as the mouse is moved onto an anchor. This attribute can only be used with anchor and AREA elements.
onMouseOut
This event is sent as the mouse is moved out of an anchor or AREA element. This attribute can only be used with anchor and AREA elements.
onFocus
A focus event occurs when a field gains the input focus by tabbing or clicking with the mouse. Selecting within a field results in a select event, not a focus event. This attribute can only be used with the SELECT, INPUT and TEXTAREA elements.
onBlur
A blur event occurs when a form field loses the input focus. This attribute can only be used with the SELECT, INPUT and TEXTAREA elements.
onSubmit
A submit event occurs when a user submits a form. This may be used to control whether the form's contents are actually submitted or not. For instance, JavaScript won't submit the form if a scriptlet for the onSubmit event returns false. This attribute can only be used with the FORM element.
onSelect
A select event occurs when a user selects some of the text within a single or multi-line text field. This attribute can only be used with the INPUT and TEXTAREA elements.
onChange
A change event occurs when a form field loses the input focus and its value has been modified. This attribute can only be used with the SELECT, INPUT and TEXTAREA elements.

Note that some user agents support onMouseOver, onMouseOut and onClick on a much wider variety of elements, not just the elements listed above.
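
The list above can be captured as a lookup table, for instance to check a document for misplaced event attributes. This JavaScript sketch is illustrative: the names INTRINSIC_EVENTS and isIntrinsicEventAllowed are hypothetical, and the anchor element is written as "A".

```javascript
// Intrinsic event attributes and the elements they may appear on, as listed
// in this section. Keys are lower case because attribute names are case
// insensitive.
const INTRINSIC_EVENTS = new Map([
  ["onload", ["BODY", "FRAMESET"]],
  ["onunload", ["BODY", "FRAMESET"]],
  ["onclick", ["INPUT", "A"]],
  ["onmouseover", ["A", "AREA"]],
  ["onmouseout", ["A", "AREA"]],
  ["onfocus", ["SELECT", "INPUT", "TEXTAREA"]],
  ["onblur", ["SELECT", "INPUT", "TEXTAREA"]],
  ["onsubmit", ["FORM"]],
  ["onselect", ["INPUT", "TEXTAREA"]],
  ["onchange", ["SELECT", "INPUT", "TEXTAREA"]],
]);

function isIntrinsicEventAllowed(attributeName, elementName) {
  const elements = INTRINSIC_EVENTS.get(attributeName.toLowerCase());
  return elements !== undefined && elements.includes(elementName.toUpperCase());
}

console.log(isIntrinsicEventAllowed("onBlur", "input"));
console.log(isIntrinsicEventAllowed("onSubmit", "INPUT"));
```

A user agent that supports events on a wider variety of elements would simply extend the table.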


Reserved Syntax for HTML CDATA attributes

This specification reserves syntax for the future support of script macros in HTML CDATA attributes. The intention is to allow attributes to be set depending on the properties of objects that appear earlier on the page. The syntax is:

   attribute = "... &{ macro body }; ... "

The remainder of this section describes current practice for the use of script macros, but is not a normative part of this specification.

Current Practice for Script Macros

The macro body is made up of one or more statements in the default scripting language (as per intrinsic event attributes). The semicolon following the right brace is always needed, as otherwise the right brace character "}" is treated as being part of the macro body. It is also worth noting that quote marks are always needed for attributes containing script macros.

The processing of CDATA attributes proceeds as follows:

  1. The SGML parser evaluates any SGML entities, e.g. "&gt;"
  2. Next the script macros are evaluated by the script engine
  3. Finally the resultant character string is passed to the application for subsequent processing.

Note that macro processing takes place when the document is loaded (or reloaded) but isn't redone when the document is resized or repainted etc.
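
The three processing steps listed above can be sketched in JavaScript as follows. This is a simplified illustration and not a description of any particular user agent: it assumes the macro body is a single JavaScript expression, it decodes only a handful of entities, and the names decodeEntities, expandAttribute and the scope argument are hypothetical.

```javascript
// Step 1: replace a few SGML entities and numeric character references.
function decodeEntities(text) {
  const named = { quot: '"', amp: "&", lt: "<", gt: ">" };
  return text.replace(/&(?:#(\d+)|(quot|amp|lt|gt));/g, (whole, code, name) =>
    code ? String.fromCharCode(Number(code)) : named[name]);
}

// rawValue: attribute value as written in the document
// scope:    names visible to the macro, e.g. functions defined by earlier scripts
function expandAttribute(rawValue, scope) {
  const decoded = decodeEntities(rawValue);
  const names = Object.keys(scope);
  const values = names.map((name) => scope[name]);
  // Step 2: evaluate each "&{ body };" and substitute its result.
  const expanded = decoded.replace(/&\{([\s\S]*?)\};/g, (whole, body) =>
    String(new Function(...names, "return (" + body + ");")(...values)));
  // Step 3: hand the resulting string to the application.
  return expanded;
}

// Hypothetical helpers standing in for functions defined by the author.
const scope = {
  manufacturer: (widget) => "acme",
  logo: (maker) => maker + ".gif",
};

console.log(expandAttribute("&{logo(manufacturer(&quot;widget&quot;))};", scope));
```

Because the macro ends at the first "};", a macro body that itself contains that sequence would be cut short by this sketch.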

Here are some examples using JavaScript. The first one randomizes the document background color:

    <BODY BGCOLOR='&{randomrbg()};'>

Perhaps you want to dim the background for evening viewing ...

    <BACKGROUND SRC='&{if(new Date().getHours() > 18)...};'>

The next example uses JavaScript to set the coordinates for a client-side image map:

    <MAP NAME=foo>
      <AREA SHAPE="rect" COORDS="&{myrect(imageurl)};" HREF="&{myurl};">
    </MAP>

This example sets the size of an image based upon document properties:

    <IMG SRC=bar.gif WIDTH='&{document.banner.width/2};' HEIGHT='50%'>

You can programmatically set the URL for a link or image:

    <SCRIPT>
      function manufacturer(widget) {
          ...
      }
      function location(manufacturer) {
          ...
      }
      function logo(manufacturer) {
          ...
      }
    </SCRIPT>

    <A HREF='&{location(manufacturer("widget"))};'>widget</A>

    <IMG SRC='&{logo(manufacturer("widget"))};'>

This last example shows how SGML CDATA attributes can be quoted using single or double quote marks. If you use single quotes around the attribute string then you can include double quote marks as part of the attribute string. Another approach is to use &quot; for double quote marks, e.g.

   <IMG SRC="&{logo(manufacturer(&quot;widget&quot;))};">

Using form fields without a NAME attribute

For an INPUT, TEXTAREA or SELECT element to be considered as part of a form, when submitting the form's contents, both of the following conditions must apply:

  1. The element must have a NAME attribute.
  2. The element must be contained by a FORM element.

If either of these two conditions is not met, then the field is not treated as being part of a form. This allows fields such as text fields and buttons to be used together with scripting to build user interfaces independent of the role of these elements for forms.
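
The two conditions amount to a simple test. In the JavaScript sketch below, the field representation (an object with a name and a reference to its enclosing form) and the function name isPartOfForm are hypothetical.

```javascript
// field: { name, form } where name is the NAME attribute value (or undefined)
//        and form is the enclosing FORM element (or null if there is none)
function isPartOfForm(field) {
  const hasName = typeof field.name === "string" && field.name.length > 0;
  const insideForm = field.form !== null && field.form !== undefined;
  return hasName && insideForm;
}

const orderForm = { action: "/order" };

console.log(isPartOfForm({ name: "userName", form: orderForm }));
console.log(isPartOfForm({ name: undefined, form: orderForm }));
console.log(isPartOfForm({ name: "scratch", form: null }));
```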


Deployment Issues

Authors may wish to design their HTML documents to be viewable on older browsers that don't recognise the SCRIPT element. Unfortunately, such browsers will display any script statements placed within a SCRIPT element as document text. Some scripting engines for languages such as JavaScript, VBScript and Tcl allow the script statements to be enclosed in an SGML comment to prevent this, for instance:

    <SCRIPT LANGUAGE="JavaScript">
    <!--  to hide script contents from old browsers
      function square(i) {
        document.write("The call passed ", i ," to the function.","<BR>")
        return i * i
      }
      document.write("The function returned ",square(5),".")
    // end hiding contents from old browsers  -->
    </SCRIPT>

The JavaScript engine allows the string "<!--" to occur at the start of a SCRIPT element, and ignores further characters until the end of the line. JavaScript interprets "//" as starting a comment extending to the end of the current line. This is needed to hide the string "-->" from the JavaScript parser.

Down-level browsers will ignore the SCRIPT start and end tags and interpret the "<!--" string as the start of an SGML comment. In this way the contents of the SCRIPT element are hidden within an SGML comment.
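
To make the engine side of this convention concrete, here is a JavaScript sketch of how a script engine might discard the opening "<!--" line before evaluating the rest. The function name prepareScriptSource is hypothetical and real engines may handle this differently, for example inside their tokenizer.

```javascript
// Drops a leading "<!--" and everything up to the end of that line.
// The closing "-->" needs no special handling here because the author has
// already placed it behind a "//" comment.
function prepareScriptSource(source) {
  return source.replace(/^\s*<!--[^\r\n]*/, "");
}

const hidden = [
  "",
  "<!--  to hide script contents from old browsers",
  "  function square(i) {",
  "    return i * i",
  "  }",
  "// end hiding contents from old browsers  -->",
  "",
].join("\n");

// Evaluate the prepared source and call the function it defines.
const run = new Function(prepareScriptSource(hidden) + "\nreturn square(5);");
console.log(run());
```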

In VBScript a single quote character causes the rest of the current line to be treated as a comment. It can therefore be used to hide the string "-->" from VBScript, for instance:

    <SCRIPT TYPE="text/vbscript">
      <!--
        Sub foo()
         ...
        End Sub
      ' -->
    </SCRIPT>

In Tcl, the "#" character comments out the rest of the line:

    <SCRIPT LANGUAGE="tcl">
    <!--  to hide script contents from old browsers
      proc square {i} {
        document write "The call passed $i to the function.<BR>"
        return [expr $i * $i]
      }
      document write "The function returned [square 5]."
    # end hiding contents from old browsers  -->
    </SCRIPT>

Some browsers close comments on the first ">" character, so to hide script content from such browsers, you can transpose operands for relational and shift operators (e.g. use "y < x" rather than "x > y") or use scripting language dependent escapes for ">".


References

HTML 2.0 Proposed Standard - RFC 1866
T. Berners-Lee, D. Connolly November 1995. This can be found at ftp://ds.internic.net/rfc/rfc1866.txt.
W3C Recommendation for HTML 3.2
Dave Raggett, January 1997. This can be found at http://www.w3.org/pub/WWW/TR
Internet Media Types - RFC 1590
J. Postel. "Media Type Registration Procedure." RFC 1590, USC/ISI, March 1994. This can be found at ftp://ds.internic.net/rfc/rfc1590.txt.
MIME - RFC 1521
Borenstein N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993. This can be found at ftp://ds.internic.net/rfc/rfc1521.txt
OBJECT Elements
The syntax and semantics for OBJECT are defined in http://www.w3.org/pub/WWW/TR/WD-object.html.

Bibliography

JavaScript
An overview is available from http://home.netscape.com/eng/mozilla/Gold/handbook/javascript/index.html
Visual Basic Script
An overview is available from http://www.microsoft.com/vbscript/default.htm
ActiveX Scripting
An introduction to ActiveX(tm) Scripting is available from http://www.microsoft.com/intdev/sdk/
Tcl
The Tcl FAQ can be found at http://www.NeoSoft.com/tcl/tclhtml/tclFAQ/part1/faq.html. You can also look at the Tcl newsgroup comp.lang.tcl.