Computer Networks Practical #5

INT21CN Computer Networks

Practical Exercises #5

NB: Prac 5 was originally all about HTML -- from way, way back in the dark ages when HTML was new and exciting, and hardly anyone knew about it, and we used to have a whole lecture on HTML. We don't do that anymore. However, if you'd like to see how things used to be, it's still here.

Here's your beginner's introduction to HTTP/0.9, exercising the example from Lecture 2. If you're a Unix user (and you're on the Bendigo campus), give yourself a Unix command shell on one of the SGI systems and type the following command:
```
telnet ironbark.bendigo.latrobe.edu.au 80
```
Once you get the "connected" message, type the following (note uppercase):
```
    GET /
```
Both of these commands are terminated by you hitting the RETURN (enter) key. What should happen? Answer: you should see a large amount of HTML text scrolling past, followed by a "Connection Closed" message. This was how the original Web worked, way back in the early '90s.
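If you'd rather script the exchange than type it, here is a minimal Python sketch of the same HTTP/0.9 conversation. The function names `build_http09_request` and `fetch_http09` are made up for illustration, and the sketch assumes the server still answers bare HTTP/0.9 requests, which many modern servers do not.

```python
import socket


def build_http09_request(path: str = "/") -> bytes:
    """Return an HTTP/0.9 request: the word GET, a path, one line ending."""
    # Telnet sends CR LF when you hit RETURN, so we do the same.
    return f"GET {path}\r\n".encode("ascii")


def fetch_http09(host: str, port: int = 80, path: str = "/") -> bytes:
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_http09_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example call (needs network access; the host is the one from the exercise
# and may no longer be reachable):
# print(fetch_http09("ironbark.bendigo.latrobe.edu.au").decode("latin-1"))
```

There is no response header to parse here: in HTTP/0.9 the end of the document is signalled only by the server closing the connection, which is why the loop reads until `recv` returns nothing.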
Now we'll try HTTP/1.0. First, telnet to ironbark.bendigo.latrobe.edu.au, port 80, as in the previous exercise. Recall from the lecture that you have to add a protocol specifier onto the GET request line, as in:
```
    GET /index.html HTTP/1.0
```
Notice that you have to hit the RETURN key twice before you see the HTML: the empty second line tells the server that your request headers are finished. If you can scroll backwards to the HTTP/1.0 command you gave, you'll see the HTTP/1.0 "Response Headers". Have a look through them, noting the Content-length:, Content-type: and the various date-related headers.
Recall that in HTTP/1.1 (and in HTTP/1.0) you can request a full URL. Repeat the exercise, except this time do:
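To make the layout of an HTTP/1.0 response concrete, the following Python sketch splits a raw response into its status line, headers and body. The helper name `split_response` is hypothetical, and the sample response is made up for illustration rather than captured from ironbark.

```python
def split_response(raw: bytes):
    """Split a raw HTTP/1.0 response into (status_line, headers, body)."""
    # An empty line separates the headers from the document itself.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise to lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Made-up response for illustration only.
sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Wed, 10 Mar 2004 02:22:52 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
status, headers, body = split_response(sample)
print(status)
print(headers["content-type"], headers["content-length"])
```

Note that this simple version keeps only the last value if a header is repeated, which is good enough for browsing headers by eye.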
```
GET http://ironbark.bendigo.latrobe.edu.au/index.html HTTP/1.0
```
Try an HTTP/1.1 request -- nothing changes from the previous example, except the protocol specifier at the end of the request line, which is now set to HTTP/1.1. Does it work? What happens if you request a file instead of a full URL with HTTP/1.1? Try these experiments on a variety of local servers (eg www.latrobe.edu.au). Check the ETag: header for the returned document.
By the way: you can fetch just the headers, and not the whole HTML document, by replacing the word GET with the word HEAD in the above request. If you're doing this regularly, you'll probably find it more convenient.
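The request variations tried so far (GET or HEAD, file path or full URL, HTTP/1.0 or HTTP/1.1) differ only in the text of the request line and any headers that follow it. A small Python sketch makes that explicit; `build_request` is a made-up helper name, and it only builds the bytes that you would otherwise type into telnet.

```python
from typing import Dict, Optional


def build_request(method: str, target: str, version: str = "HTTP/1.0",
                  headers: Optional[Dict[str, str]] = None) -> bytes:
    """Build the request line plus optional headers, ending in a blank line."""
    lines = [f"{method} {target} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The empty line (the second RETURN in telnet) ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Headers only, HTTP/1.0, file path:
print(build_request("HEAD", "/index.html"))

# HTTP/1.1 with a file path. The HTTP/1.1 specification requires a Host
# header, so it is included here.
print(build_request("GET", "/index.html", "HTTP/1.1",
                    {"Host": "ironbark.bendigo.latrobe.edu.au"}))
```

With HTTP/1.1 the server may also keep the connection open after replying, so don't be surprised if the "Connection Closed" message takes a while to appear.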
OK, our final piece of fun -- try a Conditional-GET operation. This requires a bit more typing, and you've got to be very careful to get the date in the If-modified-since: header in exactly the right format. It's worth the effort, IMHO.
```
    GET http://ironbark.bendigo.latrobe.edu.au/index.html HTTP/1.0
    If-modified-since: Wed, 10 Mar 2004 02:22:52 GMT
```
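Because the date format is so fiddly, it can help to let a program generate it. The Python sketch below uses the standard library's `email.utils.format_datetime` to produce the HTTP date format, so the weekday is computed from the date rather than typed by hand. The helper names `http_date` and `conditional_get` are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def conditional_get(target: str, since: datetime) -> bytes:
    """Build an HTTP/1.0 Conditional-GET request for target."""
    return (
        f"GET {target} HTTP/1.0\r\n"
        f"If-Modified-Since: {http_date(since)}\r\n"
        "\r\n"  # blank line ends the request headers
    ).encode("ascii")


stamp = datetime(2004, 3, 10, 2, 22, 52, tzinfo=timezone.utc)
print(http_date(stamp))  # Wed, 10 Mar 2004 02:22:52 GMT
print(conditional_get(
    "http://ironbark.bendigo.latrobe.edu.au/index.html", stamp).decode())
```

If the document has not changed since the date you give, a server that honours the header should answer with a "304 Not Modified" status and no document body; otherwise you get the usual 200 response with the full page.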
If you're using Netscape, have a look at the "Page Info" for this Prac exercise page (or any other Web page) -- it's under the "View" menu. It shows most of the information supplied in the HTTP/1.0 Response Headers.
It's interesting to try out the Proxy Server features of HTTP -- in fact, these were present in HTTP/1.0 "as implemented", although I don't think they were in the standard -- correct me if I'm wrong. In this case, you'll have to telnet to your local proxy server. Within La Trobe Uni, you do this with telnet proxy.latrobe.edu.au 8080. Once connected, the GET command is the same as usual except you request a full URL, eg:
```
GET http://www.unimelb.edu.au/ HTTP/1.0<newline><newline>
```
In fact, from within La Trobe this is the only way you can access Web pages outside the university. Can you tell if the page you fetch was cached at the proxy server? Try it again and see if it was cached as a result of the first request. Try fetching some pages from outside the Australian university sector. Remember, if you're more interested in the headers than the actual document, use HEAD instead of GET.
If you've got some Web pages on redgum, you might be interested to have a look at the HTTP access log to see who's been looking at them. The file is /usr/freeware/apache/var/log/access_log. This file is usually huge, so the easy way to look at your own stuff is to use the Unix grep utility to extract the lines you're interested in. For example, I use the following to see who's reading my (psc's) Web pages on redgum:
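One way to look for signs of caching is to check the response headers for fields that proxies commonly add. The Python sketch below is only a rough aid: `cache_hints` is a made-up helper, the sample headers are invented, and which headers you actually see depends on the proxy software. Age and Via are standard HTTP/1.1 headers; X-Cache is a non-standard header that some proxies emit, and it is listed here as an assumption rather than something the La Trobe proxy is known to send.

```python
def cache_hints(headers: dict) -> dict:
    """Pick out response headers that commonly hint at proxy caching."""
    interesting = ("age", "via", "x-cache")  # x-cache is non-standard
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in interesting if name in lowered}


# Invented headers for illustration; not captured from a real proxy.
sample_headers = {
    "Content-Type": "text/html",
    "Age": "120",
    "Via": "1.0 proxy.example.edu (hypothetical)",
}
print(cache_hints(sample_headers))
```

An Age value greater than zero suggests the response has been sitting in a cache for that many seconds, but the absence of these headers does not prove the page came straight from the origin server.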
```
grep -v latrobe /usr/freeware/apache/var/log/access_log | grep psc | less +G
```
A few words of explanation: the grep -v latrobe only selects records (ie, lines) which don't contain the string "latrobe" -- I'm interested in accesses from outside La Trobe. The result of this is piped into a second grep which selects only those lines containing the string "psc", as all references to my Web pages do. Finally, less +G displays the result starting at the end of the output, which is where the most recent accesses are.
You can copy-and-paste this command into a text file on Unix, run chmod a+x on the file to make it executable, and use it as a command to check your own logs -- changing "psc" to your own username, naturally.
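The same filtering logic can be sketched in Python, which may be handy if you want to do more with the matching lines than read them. The function name `outside_hits` and the sample log lines are made up for illustration; the real log's exact format may differ.

```python
def outside_hits(lines, username, local="latrobe"):
    """Keep lines that mention username but not the local domain string."""
    # Same effect as: grep -v latrobe | grep username
    return [line for line in lines if local not in line and username in line]


# Invented log lines for illustration only.
sample = [
    'host1.latrobe.edu.au - - [10/Mar/2004:02:22:52 +0000] "GET /~psc/ HTTP/1.0" 200 1234',
    'dialup.example.net - - [10/Mar/2004:02:23:10 +0000] "GET /~psc/ HTTP/1.0" 200 1234',
    'dialup.example.net - - [10/Mar/2004:02:24:00 +0000] "GET /~abc/ HTTP/1.0" 200 99',
]
for line in outside_hits(sample, "psc"):
    print(line)
```

To run it against the real log you would pass an open file object instead of the sample list, since iterating over a file yields its lines one at a time.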
Last one -- so far we have been using telnet to fetch pages (and headers). You can avoid this step by simply using the Unix GET and HEAD commands which are installed on (some of) our systems. That is, at the command line you can simply type, without first doing the telnet thing:
```
GET http://www.unimelb.edu.au/
```
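On systems where the GET and HEAD commands are not installed, a few lines of Python can stand in for them. This is a sketch using the standard `urllib.request` module; `make_request` and `fetch` are made-up names, and note that urllib speaks HTTP/1.1 rather than HTTP/1.0, so the headers you get back may differ slightly from the telnet experiments.

```python
import urllib.request


def make_request(url: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request for url using GET or HEAD."""
    return urllib.request.Request(url, method=method)


def fetch(url: str, method: str = "GET"):
    """Send the request; return (status, list of headers, body bytes)."""
    with urllib.request.urlopen(make_request(url, method), timeout=10) as reply:
        return reply.status, list(reply.headers.items()), reply.read()


# Example call (needs network access):
# status, headers, body = fetch("http://www.unimelb.edu.au/", "HEAD")
# for name, value in headers:
#     print(f"{name}: {value}")
```

With the HEAD method the body comes back empty, so you see only the status and headers.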
