Lecture 3: Applications #1: Intro and Telnet

Digression #1: RFCs and Internet Documentation

In this subject, we concentrate fairly heavily on the protocols and architectures used in the global Internet.

Every aspect of the Internet is documented in a series of documents called "RFCs" (Request for Comments). RFCs are the means by which new technologies are introduced to the Internet: after suitable research work has been done, new proposals are published as RFCs. Other RFCs document standard Internet protocols. RFCs are usually in plain text form.

Once an RFC is published, it is not changed. However, it may be "obsoleted" by later work. Unfortunately, there is no easy way to "browse" RFCs to discover which RFC is the latest on a particular topic, although there are various Web Indexes which can be useful^[1]. All RFCs are available on-line on the Internet. In Australia, they are available at several sites: in particular your lecturer's favourite (and fastest) RFC archives are at:

http://mirror.aarnet.edu.au/pub/rfc/
ftp://munnari.oz.au/rfc
ftp://ftp.monash.edu.au/pub/disk2/internet-standards/

You will probably (later) need to learn how to download RFCs to answer some of the assignment questions.
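As a starting point for downloading RFCs, here is a minimal Python sketch. It assumes that the archive stores each RFC as a plain text file named "rfcNNN.txt", and it uses the RFC Editor site as the default base URL. That site is an assumption of this sketch (it is not one of the mirrors listed above), and the names rfc_url and fetch_rfc are made up for illustration.

```python
from urllib.request import urlopen

# Assumption: the RFC Editor archive, not one of the mirrors listed above.
RFC_BASE = "https://www.rfc-editor.org/rfc/"


def rfc_url(number: int, base: str = RFC_BASE) -> str:
    """Build the URL of the plain text version of an RFC."""
    return f"{base}rfc{number}.txt"


def fetch_rfc(number: int, base: str = RFC_BASE, timeout: float = 30.0) -> str:
    """Download an RFC and return its text (needs network access)."""
    with urlopen(rfc_url(number, base), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, fetch_rfc(854) should return the text of RFC 854, the Telnet protocol specification, provided the archive is reachable.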

^[1] See, for example, the excellent index at http://www.faqs.org/rfcs/

Application Protocols

Application protocols define the way in which the TCP reliable service can be used to achieve network-based computing. Because applications can assume reliability, their protocols can be relatively simple.

Most Internet application protocols use commands (and, in some cases, responses) in a human-readable form. They normally also use plain ASCII text (see later) where possible. This makes debugging the protocols quite straightforward in most cases.

Some application protocols include:

Telnet: provides remote login, allowing a user to log into a remote computer as though it were local. We examine Telnet in this lecture.
SMTP: (Simple Mail Transfer Protocol) is the Internet electronic mail delivery protocol. We look at electronic mail in the next lecture.
FTP: (File Transfer Protocol) is used to copy files from one system to another.
HTTP: (HyperText Transfer Protocol) is the protocol basis of the World Wide Web.

Remote Login

Remote login means to "log in" to a remote computer (or, to use the historical term, a "host") as though it were local.

The idea of "remote login", as opposed to "local login", is significant in the history of computing -- originally, a user "logged in" at a basic display terminal which was directly connected to a hardware port on a multi-user host computer. When the Internet's predecessor, ARPANET, was being developed, remote login was viewed as its likely main application.
If the host was connected to a network, remote login allowed users to log in to other networked hosts over the network as though their terminal were directly connected. Either way, the idea of "logging in" is still based on getting a command line shell on the target system. On our Unix systems, we nowadays talk about a shell window, which performs this function.
Some systems do not (even now) support remote login, in most cases because they don't support a decent "command-line" interface.
Different operating systems have (or used to have... ) quite different procedures for handling local logins, making the problem of providing a generic remote login facility (potentially) quite complicated.

Telnet

Telnet is the basic remote login protocol, and is supported on virtually all time-shared operating systems.

Basic Telnet operation:

The user invokes the telnet client process, usually by name from the command line, eg:
telnet redgum
Once running, the client process establishes a TCP connection to the desired telnet server, which is "waiting for connections" at the well-known port 23. Note that we are again ignoring the question of how the name "redgum" gets translated to a network address (see later). If you like, you could simply replace the name "redgum" with its IP address, 149.144.21.3.
In the case of Unix, the telnet server connects the incoming connection to a variation of the standard "login" process on the server host. This may work differently on other systems.
The user's keystrokes are transmitted to the remote server, and output is displayed on the user's screen. Thus, initially the user can "log in", and once authenticated (using a username/password pair) has a normal shell, or command line interface, on the remote host.
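The client side of these steps can be sketched in Python using plain TCP sockets. This is only an illustration under simplifying assumptions, not a real telnet client: the function names are made up for this sketch, and it ignores the option negotiation bytes that a real telnet server normally sends at the start of a session.

```python
import socket

TELNET_PORT = 23  # the well-known telnet port mentioned above


def connect(host: str, port: int = TELNET_PORT, timeout: float = 10.0) -> socket.socket:
    """Open a TCP connection to the server; host may be a name or an IP address."""
    return socket.create_connection((host, port), timeout=timeout)


def send_keystrokes(conn: socket.socket, keys: str) -> None:
    """Transmit the user's keystrokes to the remote server."""
    conn.sendall(keys.encode("ascii"))


def receive_output(conn: socket.socket, max_bytes: int = 4096) -> str:
    """Wait for the next chunk of server output, ready to display on the screen."""
    data = conn.recv(max_bytes)
    return data.decode("ascii", errors="replace")
```

A session would start with conn = connect("redgum"), then alternate between receive_output(conn) and send_keystrokes(conn, ...) for as long as the user keeps typing.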

Digression #2: ASCII Text

Fundamental idea: the most basic unit of data is the byte -- virtually all computers (and network systems) handle data one byte at a time. Recall that a byte is an 8-bit value and thus can take any value between zero and 255 (decimal).

US-ASCII (or, just "ASCII") was the first widely accepted data representation system, and is universally recognised. In its traditional form, it's a 7-bit code, meaning that if an ASCII message is stored or carried in a modern byte-oriented system, the Most Significant Bit (MSB) of every byte will always be zero. For this reason, ASCII messages are sometimes called "7-bit data". An ASCII-valued byte has traditionally been called a "character", and obviously takes any value between zero and 127.
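The "7-bit data" property is easy to check in code. The following Python sketch (the function name is made up for illustration) tests whether the MSB of every byte in a message is zero:

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if the most significant bit of every byte is zero."""
    return all((byte & 0x80) == 0 for byte in data)
```

Any message built only from ASCII characters passes this check, while a byte such as 233 (binary 11101001) fails it because its MSB is one.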

Within the ASCII "character set" there is a further subdivision:

Printable ASCII: characters with values between 32 (the ASCII "space" character) and 126 (the "~" character). This includes all of the uppercase and lowercase letters, the digits and the punctuation characters.
Control Characters: character values between zero and 31, plus 127 (the "DEL" character). These were originally designed for a range of "official functions", most of which are now irrelevant.
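The subdivision can be written as a small Python function. This is a sketch for illustration only (the name classify is made up); note that it treats 127 (DEL) as a control character, which is its usual classification.

```python
def classify(byte: int) -> str:
    """Classify a byte value as 'control', 'printable' or 'not ASCII'."""
    if not 0 <= byte <= 255:
        raise ValueError("a byte must be between 0 and 255")
    if byte > 127:
        return "not ASCII"   # MSB is set, so this is not 7-bit data
    if byte < 32 or byte == 127:
        return "control"     # values 0..31, plus DEL
    return "printable"       # values 32 (space) to 126
```

For example, classify(ord("A")) gives "printable", while classify(13) (carriage return) gives "control".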

Telnet NVT

The telnet protocol defines a "Network Virtual Terminal" (NVT) that provides a standard interface to remote systems, regardless of their particular approach to terminal login. A telnet implementation (client or server) maps the semantics of local terminal operation to the NVT before sending data over the connection. Some aspects of the NVT include:

The basic unit of transmission is the "line of text" -- ideal for command-line interfaces.
An NVT text line contains only standard printable US-ASCII characters, terminated by an NVT "newline" indicator.
The NVT "newline" or "line ending" indicator is the two-character sequence: carriage return (decimal 13) followed by linefeed (decimal 10). Traditionally this has been written as <CR><LF>^[2]. A telnet implementation "maps" the "enter" or "return" key to this sequence before sending the line of text over the TCP connection.
The telnet NVT has a few other interesting characteristics: it defines the meaning of a few other ASCII control codes, permits certain "out of band" commands to be sent to the remote host, and has facilities for "Option Negotiation".
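The line-oriented part of the NVT can be sketched in a few lines of Python. This is an illustration only, with made-up function names: it covers the mapping of a local line of text to an NVT line and back, and deliberately leaves out control codes, "out of band" commands and option negotiation.

```python
CR, LF = 13, 10                 # carriage return and linefeed, in decimal
NVT_NEWLINE = bytes([CR, LF])   # the <CR><LF> line ending


def to_nvt_line(text: str) -> bytes:
    """Map one local line of text to an NVT line ending in <CR><LF>."""
    body = text.rstrip("\r\n")  # drop whatever line ending the local system used
    for ch in body:
        if not 32 <= ord(ch) <= 126:  # printable US-ASCII only (DEL excluded)
            raise ValueError(f"not a printable ASCII character: {ch!r}")
    return body.encode("ascii") + NVT_NEWLINE


def from_nvt_line(data: bytes) -> str:
    """Map a received NVT line back to local text, without the line ending."""
    if not data.endswith(NVT_NEWLINE):
        raise ValueError("an NVT line must end with <CR><LF>")
    return data[:-2].decode("ascii")
```

So a Unix-style line such as "ls -l" followed by a bare linefeed is sent as the bytes for "ls -l" followed by 13 and 10, regardless of the local line ending convention.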

^[2] The "angle brackets" here (ie, < and >) were traditionally used to indicate an ASCII control character. They are now so commonly used in HTML markup (see later) that this older usage is disappearing.

Other Aspects of Remote Login

Programs which implement the telnet protocol are widely (and freely) available, and telnet is much used.

The BSD version of Unix introduced (in the mid 1980s) a remote login utility with enhanced characteristics called "rlogin." Some of its features are:

It supports the idea of "trusted" hosts, whereby a remote login request from a trusted host (providing the usernames match) is not re-authenticated. This can be administered on a per-host basis (/etc/hosts.equiv) or a per-user basis (~/.rhosts).
rlogin exports the user's local "login environment" to the remote host, so that an rlogin session can look almost identical to a local login.

Nowadays, computer users who wish to use remote login facilities normally use a "secure" software package such as "ssh" (for "Secure SHell"). This software encrypts the data (for security, see later) and can optionally compress it (for efficiency) before sending it to the remote host. If you actually want to do remote login nowadays, you should always use ssh!

Telnet as a "Debugging Weapon"

A telnet program can be used to connect to services other than the standard telnet (ie, login) server at port 23. Most telnet implementations allow the user to specify a port number on the command line, and will open a TCP connection to that port. This can be very useful in debugging communications protocols.

The reason this works is that virtually all "traditional" Internet application protocols are based on the telnet idea of exchanging lines of text^[3]. In fact, as we shall see, they usually use the telnet NVT specification.

In this subject, we will use telnet to demonstrate the operation of various Internet application protocols. For example, to investigate the Internet "email delivery" protocol SMTP (see next lecture) we could do:

telnet redgum 25
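What telnet is doing here can be imitated in a few lines of Python, which may help to show why the trick works. The sketch below is an illustration under assumptions: the function names are made up, it assumes the server speaks first with a single greeting line, and it assumes every line ends in <CR><LF>.

```python
import socket

CRLF = b"\r\n"


def read_line(conn: socket.socket) -> str:
    """Read one <CR><LF>-terminated line, a byte at a time (simple, not fast)."""
    buf = bytearray()
    while not buf.endswith(CRLF):
        chunk = conn.recv(1)
        if not chunk:  # the server closed the connection
            break
        buf += chunk
    return buf.decode("ascii", errors="replace").rstrip("\r\n")


def send_line(conn: socket.socket, text: str) -> None:
    """Send one line of text, terminated by <CR><LF>."""
    conn.sendall(text.encode("ascii") + CRLF)


def probe(conn: socket.socket, command=None) -> list:
    """Read the server's greeting line, then optionally send one command
    and read one line of reply."""
    lines = [read_line(conn)]
    if command is not None:
        send_line(conn, command)
        lines.append(read_line(conn))
    return lines
```

The equivalent of the command above would be probe(socket.create_connection(("redgum", 25))), which should return whatever greeting line the mail server sends.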

A final comment: telnet is a valuable tool to learn about network applications. In fact, it's so powerful that in some educational institutions, possession of a copy of telnet is regarded as prima facie evidence of intending to "hack into" computer systems... Be careful!

^[3] In some cases this is only true for "commands" and "responses" -- so-called "8-bit data" can subsequently be transferred.