An initial HTTP attempt to access a "password protected" Web page of this type (without providing suitable "authentication" information) will generate an HTTP error message together with a Web page which explains the nature of the error. Typically the response headers will contain:
    HTTP/1.1 401 Authorization Required
    Date: Wed, 17 Mar 2004 01:17:56 GMT
    Server: Apache/1.2.6
    WWW-Authenticate: Basic realm="ByPassword"
    Last-Modified: Mon, 15 Mar 2004 00:43:51 GMT
    ....etc....

In HTTP/1.0, only the Basic authentication method was available, as used in this example.
Upon receiving this error, the Web browser will normally pop up a dialog box
similar to the above, collect a user-ID and password from the user, and then
retry the request with an additional "Authorization:"
request header containing the additional information.
Let's use as an example a page for which the username is
"student", password "student" -- pretty
:-). The concatenation is thus
"student:student". We can use the Unix commandline base64
encoder mimencode to encode the data (it encodes to
"c3R1ZGVudDpzdHVkZW50"), so that the request header will look
like:

    GET /subjects/int21cn/test/index.html HTTP/1.0
    Authorization: Basic c3R1ZGVudDpzdHVkZW50
    ....etc....

This, of course, begs the obvious question -- why on earth do they do this? The obvious answer is "for security reasons" -- to deter casual network snoopers who might be observing traffic, watching for passing user-IDs and passwords. We are left wondering...
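The encoding step in the example can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names `basic_auth_header` and `decode_basic_auth` are made up for the example. It also shows why the "security" is so weak: base64 is an encoding, not encryption, so anyone who sees the header can reverse it.

```python
import base64


def basic_auth_header(user_id: str, password: str) -> str:
    """Build the value of an Authorization header for Basic authentication."""
    # Join the user-ID and password with a colon, then base64-encode the bytes.
    credentials = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value: str) -> str:
    """Recover the "user:password" string from a Basic header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    return base64.b64decode(encoded).decode("utf-8")


header = basic_auth_header("student", "student")
print("Authorization:", header)    # the header the browser would send
print(decode_basic_auth(header))   # what a network snooper can trivially recover
```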
A server can ask the browser to store a cookie by including a "Set-cookie:" header in its response, for example:

    HTTP/1.0 200 OK
    Set-cookie: myname=myvalue
    ....etc...

A browser which is "cookie-enabled" will normally store this name/value pair, and future requests to the same server will contain an additional request header, thus:

    GET /somefile.html HTTP/1.0
    Cookie: myname=myvalue
    ....etc...

Cookies are extensively used in Web session management, which is discussed later in the unit.
 In fact, cookie operation is
rather more complex than we discuss here -- for example, the
"Set-cookie:" header can take several additional parameters
(which affect how the cookie is interpreted), and the behaviour of browsers with
respect to cookies can be changed by the end-user.
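As a rough sketch of the browser's side of this exchange, the Python standard library's `http.cookies.SimpleCookie` can parse a "Set-cookie:" value and give back the name/value pair for the later "Cookie:" request header. The helper name `cookie_header_from_set_cookie` is hypothetical, and a real browser does much more (matching by domain and path, honouring expiry times, and so on).

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value: str) -> str:
    """Turn a Set-cookie value into the value of a Cookie request header."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)  # parses the name=value pair and any parameters
    # Only the name/value pairs are sent back; extra parameters stay in the browser.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


print("Cookie:", cookie_header_from_set_cookie("myname=myvalue"))
print("Cookie:", cookie_header_from_set_cookie("myname=myvalue; Path=/"))
```

Note that the additional parameter in the second call (`Path=/`, chosen here just for illustration) is not echoed back in the request header.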
A form in HTML is an area of a Web page which is used to gather input from a
human user. The information which is gathered can then be returned to the page's
owner using a
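The form is, as expected, delimited by a
<FORM> ... </FORM> markup pair.
The <FORM> markup has two important attributes:
ACTION, which gives the URL to which the form data is sent, and METHOD, which specifies how the ACTION URL is accessed. There are two methods,
GET and POST. For example:

    <FORM ACTION="http://ironbark.bendigo.latrobe.edu.au/cgi-bin/myprog" METHOD="GET">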
Each INPUT tag has an associated TYPE attribute.
An INPUT type can take several further attributes, eg:

    <INPUT TYPE="TEXT" NAME="Name" MAXLENGTH="64" SIZE="20">

In a browser, this would be presented as a (scrollable) textbox, 20 characters wide (but able to accept 64 characters of input).
There are several other INPUT types:
OPTION markup tag, which can take a couple of extra attributes.
COLS and can have a
NAME attribute and an initial value.
Form data is sent in a format known as application/x-www-form-urlencoded, or simply "URL-encoded". In this format:

Each space character is replaced by the "+" character. This is a hangover from an older format and is normally, but not universally, used -- see next point.

Characters which cannot be sent as-is are encoded as %HH, where the H characters are the two hexadecimal digits of the byte. Sometimes the space character is also sent in this format, as "%20", instead of as "+".

Each form field is sent as name=value, with each name-value pair separated by the "&" (ampersand) character.
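The encoding rules above can be tried out with Python's `urllib.parse` module. This is a small sketch only: the field names `info1` and `info2` are borrowed from the example form later in these notes, and the values are invented to exercise each rule.

```python
from urllib.parse import parse_qs, quote, urlencode

# Hypothetical form values containing a space, an ampersand and an equals sign.
fields = {"info1": "hello world", "info2": "a&b=c"}

encoded = urlencode(fields)
print(encoded)  # info1=hello+world&info2=a%26b%3Dc

# Decoding reverses the process; each name maps to a list of values.
decoded = parse_qs(encoded)
print(decoded)

# The alternative encoding of a space as %20 decodes to the same thing.
print(quote("hello world"))             # hello%20world
print(parse_qs("info1=hello%20world"))
```

The space becomes "+", while "&" and "=" inside a value become %26 and %3D so that they cannot be confused with the separators.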
If the METHOD is GET, an HTTP GET request is issued to the
ACTION URL specified in the
<FORM> markup tag, with the urlencoded form information appended after a separating "?" character. This can generate very long URLs.

If the METHOD is POST, an HTTP POST transaction is performed. The "body" of the transaction contains the urlencoded form data, as a single long line of text. The POST transaction is directed at the URL specified in the
ACTION attribute of the <FORM> markup tag.
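To make the difference concrete, here is a sketch that builds the text of both kinds of request for the same form data. The function names are hypothetical, the `/cgi-bin/myprog` path is taken from the earlier FORM example, and the Content-Type and Content-Length headers on the POST are what a browser would normally add; they are not shown elsewhere in these notes.

```python
from urllib.parse import urlencode


def build_get_request(action_path: str, fields: dict) -> str:
    """GET: the urlencoded data is appended to the URL after a '?'."""
    return f"GET {action_path}?{urlencode(fields)} HTTP/1.0\r\n\r\n"


def build_post_request(action_path: str, fields: dict) -> str:
    """POST: the urlencoded data travels in the body of the request."""
    body = urlencode(fields)  # plain ASCII, so len(body) is also its byte count
    return (
        f"POST {action_path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


form_data = {"info1": "hello world", "info2": "42"}  # invented values
print(build_get_request("/cgi-bin/myprog", form_data))
print(build_post_request("/cgi-bin/myprog", form_data))
```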
In "real life", GET and POST
are used pretty much interchangeably, depending on the programmer's or system
Submit button, you should pay close attention to two things:
The HTML for our FORM looks like:

    <FORM action="/subjects/int21cn/cgi/L06CGIa.cgi" method="GET">
    info1: <INPUT type="text" name="info1" size="20"><br>
    info2: <INPUT type="text" name="info2" size="20"><br>
    <input type="submit" value="Submit">
    <input type="reset" value="Clear Form">
    </FORM>

This is rendered in your Web browser as:
In this case, we're going to try something different -- the CGI program which is the target of this Form is going to show us the actual HTTP request as it was received.
Again, try it.
 Actually, it's a "reconstructed" version of the HTTP request: not all request headers are necessarily shown. But it's close enough for our purposes!
When a user clicks the
SUBMIT button on a form, the HTTP
server starts up the specified CGI program, and makes the form data available to
it.
From a programming perspective, the difference between
GET and POST is the way in which a CGI
program receives the form data. If the method was
GET, the
information is usually obtained by examining the contents of an
environment variable (usually called
QUERY_STRING) containing the URL-encoded form data. Other
environment variables contain additional useful information.
If the method was
POST, the CGI program usually receives
the form data on its standard input stream, with any extra
stuff obtained, as before, from environment variables.
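A CGI program's input handling along these lines might look like the following Python sketch. The function name `read_form_data` is made up for illustration. `QUERY_STRING` is named above; `REQUEST_METHOD` and `CONTENT_LENGTH` are the usual CGI environment variables for the method and the size of the POST body, but they are an assumption here as the notes do not name them. Reading the body as text is a simplification that is adequate for plain ASCII urlencoded data.

```python
import io
import os
import sys
from urllib.parse import parse_qs


def read_form_data(environ=os.environ, stdin=sys.stdin) -> dict:
    """Return the submitted form fields as a dict of name -> list of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the urlencoded data arrives on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the urlencoded data is in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulate what the HTTP server would set up for each method.
fake_get = {"REQUEST_METHOD": "GET", "QUERY_STRING": "info1=a+b&info2=c"}
print(read_form_data(fake_get))

fake_post = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "17"}
print(read_form_data(fake_post, io.StringIO("info1=a+b&info2=c")))
```

Passing the environment and input stream as parameters is just a convenience so that the function can be exercised without a real HTTP server.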
CGI programs can, as a rule, be written in any language (compiled or interpreted) supported on the system running the HTTP server.
On Unix servers, they are commonly written in
C or as Bourne shell (
A CGI program (almost) always generates (to standard output) a Web page which is returned to the browser, in addition to any other effect.
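The output side can be sketched just as briefly. In this hypothetical example (`write_cgi_response` is an invented name), the program writes a "Content-Type:" header, a blank line, and then the HTML page to standard output. The header-then-blank-line convention is standard CGI practice rather than something stated in these notes.

```python
import html
import io
import sys


def write_cgi_response(fields: dict, out=sys.stdout) -> None:
    """Write a minimal CGI response that echoes the form fields as a Web page."""
    # The header block ends with a blank line; the page follows it.
    out.write("Content-Type: text/html\n\n")
    out.write("<html><body><h1>Form data received</h1><ul>\n")
    for name, value in fields.items():
        # Escape user-supplied text so that it cannot inject markup into the page.
        out.write(f"<li>{html.escape(name)} = {html.escape(value)}</li>\n")
    out.write("</ul></body></html>\n")


write_cgi_response({"info1": "hello", "info2": "<b>world</b>"})
```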