Personal Study Notes: The Internet

Introduction, addressing scheme, architecture, protocols, client/server ports. Email: addresses, how to find them using finger, whois, X500, knowbot. Mailing lists, List servers: ListServ, Majordomo, UseNet. Email viewers: trn (for Unix), Trumpet (for Windows). UNIX's Internet Utilities: Telnet, FTP. Internet File Types, Archiving & Compression. Search Engines: Archie, Gopher, WAIS. World Wide Web Hypertext Document System.

Introduction

The Internet is a world-wide network of networks. It comprises 20,000 networks with a total of 3,000,000 host computers growing at 100,000 hosts per month. You can connect your computer to the Internet via a permanent phone line or the dial-up phone network. Either way, your computer can be connected to the Internet either as a host within the Internet or as a terminal of a host owned by an access provider. I wonder if there is also a way of connecting free via AMSAT through the Amateur packet radio network?

For a permanent direct high-speed connection, the cost is prohibitive for all but large establishments. For dial-up, the connection rental is about £6.50 a month plus BT's normal local-call rate charges for the time you spend on the phone line. Connection via packet radio is free. Almost all services on the Internet are free, world-wide. They are mostly provided by American government departments, colleges and universities, voluntary organisations and, as good PR exercises, by very large commercial organisations like IBM.

Addressing

All host computers on the Internet are in reality UNIX boxes. To use Internet services effectively and efficiently you need to be familiar with UNIX. Each host on the Internet has a unique 32-bit address:

Form of Internet addresses.
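As a rough illustration of what a 32-bit address means, the sketch below packs the familiar dotted form into a single integer and reads the network class off the leading bits of the first byte. The function names are my own, the sample addresses are documentation and private-range examples, and the class boundaries are those of the traditional classful scheme.

```python
def dotted_to_int(address: str) -> int:
    """Pack a dotted-quad string such as '192.0.2.1' into one 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet   # shift earlier bytes up, add the next one
    return value


def address_class(address: str) -> str:
    """Return the classful network class from the first byte of the address."""
    first = dotted_to_int(address) >> 24
    if first < 128:
        return "A"      # leading bit 0
    if first < 192:
        return "B"      # leading bits 10
    if first < 224:
        return "C"      # leading bits 110
    return "D/E"        # multicast and reserved ranges


print(dotted_to_int("192.0.2.1"))
print(address_class("10.1.2.3"), address_class("172.16.0.1"), address_class("192.0.2.1"))
```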

Class A and Class B networks can be divided into subnets within their organisations. Users who need more than the 256 addresses (254 usable hosts) that a Class C network can provide can have Class C supernets which comprise a group of consecutively addressed Class C networks. Some hosts, especially routers, can have more than one Internet address. This is because they are connected to more than one network. All service-providing hosts (servers) on the Internet also have names whose formats conform to the Domain Naming System as follows:

Structure of an Internet domain name.

These are translated into numeric host addresses by reference to name servers. Each name server maintains a list of the names and addresses of hosts in its vicinity. Given a host name, it returns the address.

Architecture

The Internet is not a single world-wide network of computers. It is a network of networks-of-computers. The things that link the networks together to form the Internet are called routers. They themselves are hosts. They simply contain the necessary routing software. An example of how the Internet is made up is shown below.

Schematic of Internet architecture.

Originally the three hosts shown were probably the main servers for independent company and campus local area networks and a conventional host with a bunch of dumb terminals. Then, when the Internet arrived, they were equipped with the necessary routing and Internet servicing software and linked together to form part of the Internet backbone.

Dumb terminals access Internet services through the UNIX shell of the host to which they are connected. They can also use various items of client software dedicated to specific services like email. LAN-based and dial-up PCs can work in the same way, but PCs have the advantage of being able to run client software directly. Client software for a PC is usually more advanced than what is available to a dumb terminal on the local UNIX host. A UNIX workstation on a LAN has all the UNIX facilities of the host router and also has its own Internet address. It can therefore use all the UNIX Internet client software and indeed the latest Xwindows-based GUI client software.

The local area networks (LANs) can be of different physical types like token ring or Ethernet. They can be different types of Ethernet — thicknet, thinnet or twisted pair & hub. Networks of the same type can be linked directly by 'bridges'. They can be running any of the popular LAN protocols — Novell, 3Com, Banyan, DECnet, SNA and TCP/IP.

Networks of the same physical type but running different protocols can be linked by gateways. However, gateways are application-specific in that the way you convert between protocols is different for email than it is for a terminal session.

Communication Protocols

However, the Internet backbone always uses the Internet Protocol (IP) and the hosts are normally connected via fast digital trunks. Small dial-in UNIX hosts are connected by the Point-to-Point Protocol, PPP.

The Internet Protocol (IP) is the way in which data is sent from one host to another across the Internet. All data to be sent is split up into packets of between 200 and 2000 bytes (1536 bytes is the traditional standard size for a packet). Each packet is headed with the address of the host it is bound for, and also the return address of the host that sent it. The format of an IP packet is shown below:

Internet protocol packet addressing.

Each packet then moves across the Internet like a mail item being forwarded from host to host en route until it reaches its destination host. Each en-route host must be able to understand IP in order to be able to forward each packet to the next waypoint on the fastest permissible route to its final destination.
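The splitting and addressing described above can be sketched in a few lines. This is only a toy model: the Packet class and packetise function are names I made up, the addresses are documentation examples, and a real IP header carries many more fields than a destination and a return address. The default size of 1536 bytes is the figure quoted in these notes.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str   # address of the host the packet is bound for
    source: str        # return address of the host that sent it
    payload: bytes     # this packet's slice of the original data


def packetise(data: bytes, source: str, destination: str, size: int = 1536) -> list[Packet]:
    """Cut data into consecutive slices of at most `size` bytes, each with its own header."""
    return [
        Packet(destination, source, data[i:i + size])
        for i in range(0, len(data), size)
    ]


packets = packetise(b"x" * 4000, source="192.0.2.1", destination="198.51.100.7")
print(len(packets), [len(p.payload) for p in packets])
```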

IP tries its best to deliver packets. However, 1% of packets may not reach their destinations. They get lost due to transmission errors. The Internet's Transmission Control Protocol (TCP) is a higher level protocol than IP. It runs on top of IP. It provides what looks like a dedicated connection or virtual circuit between client and server across the Internet.

Host to host transmission control protocol.

TCP acts like a certified mail system ensuring that all packets sent are in fact received. The sending TCP numbers each packet before it is sent. The receiving TCP tells the other what it has received and what it has not. Any lost packets are then re-sent.
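The number, report and re-send cycle can be imitated with a small simulation. It is a sketch of the idea only, not of real TCP (which uses byte offsets, windows and timers): the function name, the tiny 4-byte segment size and the 30% loss rate are all assumptions chosen to make the losses visible.

```python
import random


def send_reliably(data: bytes, size: int = 4, loss_rate: float = 0.3, seed: int = 1) -> bytes:
    """Deliver data over a lossy link by re-sending whatever the receiver lacks."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be at least 0 and below 1")
    rng = random.Random(seed)
    # The sender numbers each segment before it is sent.
    segments = {seq: data[i:i + size] for seq, i in enumerate(range(0, len(data), size))}
    received = {}
    outstanding = set(segments)
    while outstanding:
        for seq in sorted(outstanding):
            if rng.random() >= loss_rate:        # this segment survived the journey
                received[seq] = segments[seq]
        # The receiver reports what it has; everything else is sent again.
        outstanding = set(segments) - set(received)
    # Reassemble in sequence-number order, whatever order the segments arrived in.
    return b"".join(received[seq] for seq in sorted(received))


print(send_reliably(b"hello, internet!"))
```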

Unlike the X25 protocol which validates the integrity of all packets at each en-route data switch, TCP operates only at the end of the virtual circuit. The routers take no part in the TCP protocol.

Routers only deal with IP. They do not take notice of TCP.

TCP is for interactive work. Other high-level protocols such as FTP are used on the Internet also. These are discussed later.

Client & Server Ports

Inside any given host there will be a number of different servers running simultaneously. Each server provides a particular kind of user service or application. Each server has a port number.

A client program communicates with its remote server by sending it a request. The request is broken down into packets and sent across the Internet to the server host. There the packets are re-assembled into the original request. The request is then put onto a queue to await service.

Internet client-server operation.

As soon as the server concerned is free it processes the request and produces the response. The response is then broken down into packets by the TCP and sent to the client over the Internet. The client's TCP then re-assembles the packets into the response and delivers it to the client program.

Each application server has a permanently-assigned port number which is universally known throughout the Internet. Each time a user starts a client program it is assigned an arbitrary port number at the time. A port number is in effect the address of a piece of software. It is where you send data in order to reach that software.

Each request from a client contains the port number of its server on the remote host together with its own port number. When the request gets to the other end, the port number ensures that it is given to the right server. After the server has processed the request and produced the response, it heads the response with the port number of the client that sent the request. This ensures that when it gets to the sender's host, the sender's host gives it to the right client. In the diagram, Port 13 is used as an example. Note, however, that the universal port number of the telnet server is 23; port 13 belongs to the daytime service.
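The way port numbers steer a request to the right server and the reply back to the right client can be modelled without any networking at all. In this sketch the Host class, its method names and the dictionary-shaped "packets" are inventions for illustration; the port numbers 21, 23 and 25 are the standard ones for FTP, telnet and SMTP, and 49152 is simply a convenient starting point for the arbitrary client ports.

```python
WELL_KNOWN_PORTS = {"ftp": 21, "telnet": 23, "smtp": 25}


class Host:
    def __init__(self):
        self.servers = {}                # port number -> server function
        self.next_client_port = 49152    # arbitrary ports handed out to clients

    def listen(self, port, handler):
        """Register a server on its permanently assigned port."""
        self.servers[port] = handler

    def new_client_port(self):
        """Give a newly started client program an arbitrary port of its own."""
        port = self.next_client_port
        self.next_client_port += 1
        return port

    def deliver(self, request):
        """Hand the request to the server on its destination port, address the reply back."""
        handler = self.servers[request["dst_port"]]
        return {
            "src_port": request["dst_port"],
            "dst_port": request["src_port"],   # reply is headed with the client's port
            "body": handler(request["body"]),
        }


remote = Host()
remote.listen(WELL_KNOWN_PORTS["telnet"], lambda body: "login: ")
remote.listen(WELL_KNOWN_PORTS["smtp"], lambda body: "220 ready")

local = Host()
client_port = local.new_client_port()
reply = remote.deliver({"src_port": client_port, "dst_port": 23, "body": "hello"})
print(reply)
```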

Electronic Mail

Companies and other organisations (domains) have Internet addresses such as ebs.co.uk. Furthermore, individual computers (both hosts and workstations) within a domain can also have individual identifiers. These can be names like sharon, tracy and eustace as shown below:

Schematic showing how electronic mail is exchanged.

Likewise, a person within a domain can have an individual identity. It is quite separate from that of his workstation. It is his mailbox address, examples of which are shown above for Ruby and me. Everyone's mailbox is usually held on the local host, which runs a mail daemon like smail. The daemon uses the Simple Mail Transfer Protocol (SMTP), which requires the host to be on all the time, waiting for electronic mail to arrive off the Internet.

A mailbox is an ordinary text file on the UNIX host. The individual messages within it are separated by a separator which is usually a line containing four Ctrl-A characters. These look like smiley faces in the PC8 character set. As each new message comes in it is appended to the mailbox file.
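A mailbox of this kind is easy to take apart in a program. The sketch below assumes one separator line is written in front of each message; real mail systems differ in the details, and the function names are my own.

```python
SEPARATOR = "\x01\x01\x01\x01\n"   # a line of four Ctrl-A characters


def append_message(mailbox: str, message: str) -> str:
    """Return the mailbox text with one more message appended after a separator line."""
    if not message.endswith("\n"):
        message += "\n"
    return mailbox + SEPARATOR + message


def split_mailbox(mailbox: str) -> list[str]:
    """Split the mailbox text back into its individual messages."""
    return [part for part in mailbox.split(SEPARATOR) if part.strip()]


box = ""
box = append_message(box, "From: ruby\nSubject: lunch\n\nOne o'clock?")
box = append_message(box, "From: rob\nSubject: Re: lunch\n\nFine.\n")
for number, message in enumerate(split_mailbox(box), start=1):
    print(number, message.splitlines()[0])
```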

To send email you need a mail client like Berkeley mail, xmail, pine, elm or Eudora (PC Windows). The last four allow you to write, queue and send outgoing mail. They also allow you to select from a list and read received mail.

Mail robots can respond to certain email messages automatically. They are used for such things as responding to requests for files or news.

MIME - Multipurpose Internet Mail Extensions. This is a standard for encoding binary (exe, dat, image, video, sound, etc.) files as 7-bit text to be sent as email messages and decoded by the recipient. Pine and Eudora can display/play MIME messages.
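To show what "encoding binary as 7-bit text" amounts to, here is a minimal sketch using base64, which is one of the transfer encodings MIME defines. It covers only the encoding step, not the MIME headers that wrap it, and the two function names are hypothetical.

```python
import base64


def encode_for_mail(data: bytes) -> str:
    """Turn arbitrary bytes into lines of plain 7-bit ASCII text."""
    return base64.encodebytes(data).decode("ascii")


def decode_from_mail(text: str) -> bytes:
    """Recover the original bytes from the text produced above."""
    return base64.decodebytes(text.encode("ascii"))


binary = bytes(range(256))                 # every possible byte value
text = encode_for_mail(binary)
print(text.splitlines()[0])                # first line of the mail-safe text
print(all(ord(ch) < 128 for ch in text))   # every character fits in 7 bits
print(decode_from_mail(text) == binary)    # the round trip is lossless
```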

Mail Sorters look for key words on the 'From' and 'Subject' lines of each message as it is received. They put each message into a sub-mailbox according to sender and subject categories. I think building a pair of index files would be slicker. UNIX Mail sorting programs procmail and delivermail require shell scripts to be written. Eudora and Pine have built-in mail filters that do the same thing.

Email Addresses

RFC822 is the standard electronic mail addressing system used on the Internet. Internet mail addresses comprise two parts:

user name: (eg Robert.J.Morton or rob.morton)
domain id: (eg ebs.co.uk)

This is written as rob.morton@ebs.co.uk
That's all there is to it.

RFC822's arch rival is the X400 standard devised by the CCITT.

Some private email services which are connected to the Internet use X400. Among these, X400 is becoming more and more widely used. The most relevant of the labyrinth of pre-defined formal fields that make up an X400 email address are as follows:

S Surname
G Given name
I Initials
Q Generation qualifier (Jr III etc.)
A Administration domain name
P Private domain name
O Organization
C Country
DD Domain Defined Attributes

An X400 email address is of the form:

/S = Morton
/G = Robert
/I = J
/O = Eastern Business Systems
/A = CompuMail
/P = EBS
/C = UK
/DD = rob
/@compuserve.co.uk

The order of the slash-separated items does not matter. The best way to find the exact form of an X400 subscriber's email address is to phone them and ask them to send you a message. Get their address off the 'From' line on the message you receive.
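Because the order of the slash-separated items does not matter, a program comparing two X400 addresses has to split them into fields first. The following is a minimal sketch of that, assuming no field value itself contains a slash; parse_x400 is a made-up name and the pieces without an equals sign (such as the trailing @ part) are simply skipped.

```python
def parse_x400(address: str) -> dict[str, str]:
    """Split an X400-style address into a {field: value} dictionary."""
    fields = {}
    for item in address.split("/"):
        item = item.strip()
        if not item or "=" not in item:
            continue                      # skip empty pieces and the @gateway part
        key, _, value = item.partition("=")
        fields[key.strip().upper()] = value.strip()
    return fields


first = parse_x400("/S = Morton /G = Robert /I = J /C = UK /DD = rob")
second = parse_x400("/DD=rob/C=UK/I=J/G=Robert/S=Morton")
print(first)
print(first == second)    # same fields, different order
```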

And there are more...

Many email systems manufacturers, private email providers and indeed user organisations have their own proprietary addressing systems. This means that some people's email addresses may become very large and complicated. For instance, AT&T Mail provides gateways to companies' internal mail systems. So you can have internal sub-domains appended to the individual's username eg:

alpha!beta!delta@acmecorp.attmail.com

See pages 111 - 116 for private email services and their addressing peculiarities.

Finding Email Addresses

Finger
The UNIX finger command normally returns a list of who is currently logged in on the local host. It gives the Login name, the user's full name, which terminal they are on, when they logged on, and how long it is since they last did a keystroke.

However, if you specify one of the users by name, eg: finger rob, it returns all the above plus which directory they are in, what project they are working on, and what their plan is. The project details are kept in a file called .project and the plan is given in a file called .plan. Only the first line of the .project file is displayed but up to 10 lines of the .plan file are displayed.

You can finger a remote host by specifying its Internet name after the finger command eg: finger @ebs.co.uk. This will return who is logged in at the moment at ebs.co.uk. And you can finger an individual at a remote host viz: finger marylin.monroe@ebs.co.uk. Not all remote hosts allow you to finger their users.

Whois
Some organisations maintain directories of users on their hosts. You query these directories using the whois command. On your local host you could enter something like whois rob. To find somebody on a remote host you would enter something like:

whois -h whois.ebs.co.uk rob

X500
This is an electronic white-pages directory of email addresses. It is a CCITT standard. It is currently the only existing standard system.

The commonest X500 service is called FRED (FRont End to Directories). You use it by telnetting to wp.psi.com or wpl.psi.com and logging in as FRED. If you want to find the email address of Pierre Petit, who you know works at some French university you aren't sure of, you then type in something like:

whois Pierre Petit -org *.ac -geo @c=FR

The -org switch takes a regular expression for the organisation name. The -geo switch takes a regular expression for the domain.

Knowbot
To use this service, you have to telnet to it as follows:

telnet info.cnri.reston.va.us 185

The 185 is the port number of the Knowbot server. When you get the prompt, just type in the person's name and wait. This can be several minutes. Knowbot has access to many private email service directories that the X500 systems do not, so it is worth trying.


Mailing Lists

An email mailing list is associated with a discussion topic. Any one person on a list can send a message to the list which then re-mails it to everyone else who is currently on the list.

Manual Lists

A manual list is run and maintained by a human being. To get on to, send a message to, and get off a manual list called eg save-the-whales at ebs.co.uk, send the following messages respectively:

To: save-the-whales-request@ebs.co.uk
From: rob.morton@hide1.ac.uk
Please add me to the save-the-whales list.

To: save-the-whales@ebs.co.uk
From: rob.morton@hide1.ac.uk
Subject: Whale counts in the Atlantic
Blah blah blah.

To: save-the-whales-request@ebs.co.uk
From: rob.morton@hide1.ac.uk
Please remove me from the save-the-whales list.

Automatic List Servers

An automatic list is run by a list management program.

LISTSERV

A clunky IBM mainframe automatic list management program. To get on to, send a message to, and get off an email list called ECO-L, send the following messages respectively:

To: LISTSERV@ebs.co.uk
From: rob.morton@hide1.ac.uk
SUB ECO-L Rob Morton

To: ECO-L@ebs.co.uk
From: rob.morton@hide1.ac.uk
Subject: Global Warming
Blah blah blah.

To: LISTSERV@ebs.co.uk
From: rob.morton@hide1.ac.uk
SIGNOFF ECO-L

To contact the human life-form in charge of an automatically run list, use the address eg

OWNER-ECO-L@ebs.co.uk

See p120 for more.

Majordomo
A workstation version of LISTSERV. For instance there is a list called 'explosive cargo' run by a computer technical writer in Boston on an automatic list server called Majordomo@world.std.com.

To subscribe, send:  subscribe explosive cargo
To unsubscribe send:  unsubscribe explosive cargo
The human owner is:  owner-Majordomo@world.std.com

UseNet: Global Bulletin Board
A world-wide bulletin board system on which 30,000 articles are posted each day. The system comprises thousands of news server hosts which constantly keep each other posted on all the articles that have been submitted by subscribers all over the world. Articles are classified into a hierarchical structure of news groups defined by multi-part names formatted something like comp.dcom.fax (a 3-level hierarchy). The top level classes of news group are:

comp  Topics relating to computers (lots of fairly meaty discussions)
sci  Topics relating to one of the sciences (also fairly meaty)
rec  Recreational (sport, art, hobbies etc.)
soc  Social (both sociological issues and just plain socialising)
news  Topics to do with NetNews itself
misc  Topics that do not fit anywhere else
talk  Long arguments — frequently political

Each news group contains subdivisions, the lowest of which contains the actual current articles on the topic concerned. All the main groups are, in theory, of world-wide interest. However, regional subdivisions exist, especially for such topics as odd bits of hardware for sale. Geographic division identifiers are things like: world, na = North Americas, usa = United States, can = Canada, uk = United Kingdom, ne = New England, ba = Bay Area (California).

You subscribe and unsubscribe to any news group you wish at any time. You can subscribe to many when you're slack and cut down when busy. You can read articles in news groups you are currently subscribed to. You can respond to them by emailing their authors directly or by sending a response to the relevant news group. You can also submit articles yourself. Most news groups have a human moderator who filters out what, in his perception, are cranky responses and articles.

The technical set-up of the UseNet system is shown below:

Functional schematic of UseNet.


Email Viewers

The UNIX trn and Windows Trumpet news clients provide you with the means to subscribe and unsubscribe to news groups, read and respond to articles, and originate and submit articles of your own. They also enable you to selectively reject articles on certain topics or by specified authors.

Articles are usually text files, but do not have to be. Articles can be in the form of GIF and JPEG image files, MPEG movie-clip files, executable files and data files. They can also be all these things packed into a UNIX 'shar' (shell archive) file which can also contain direct UNIX commands to your host. Beware: such could be a 'Trojan horse'. If you get an article in the form of a 'shar' file, use a 'shar' sanitizer on it.

Unix's Internet Utilities

Telnet: The UNIX Remote Login Facility

If you are on a UNIX host or workstation coupled to the Internet, the 'telnet', 'rlogin' (remote Login) and 'rsh' (remote shell) commands allow you to enter direct UNIX commands on a remote host as if you were one of its local users. To 'telnet' a remote host you enter:

% telnet dellboy.ebs.co.uk
Trying 240.197.016.001 ... Connected to ebs.co.uk
Escape character is '^]'.

System V UNIX (ebs.co.uk)
login: rob
password: ********

Terminal type (default VT100):
...

If the remote host does not echo what you type, you can turn on local echo with Ctrl-E. To close your session on the remote host, type the escape character in this case ^] (meaning you hold down the control key and press the right-hand square bracket key). If this does not work straight away, press the RETURN key. You will then get through to your local telnet prompt. Type quit. Telnet then confirms that it has disconnected from the remote host.

Some hosts operate as IBM 3270 servers. Instead of the UNIX command line, they present you with screens containing fields like a form that you fill in. For these systems use 'tn3270' instead of telnet.

If you are on a PC, you can get a Windows-based TCP/IP package called Chameleon. This allows you to choose a host to log on to from a list box by clicking the host name. You then see the login prompt from the remote host and go on from there as above. It also supports 'tn3270'.

Where a group of hosts all share the same set of users 'rlogin' can be used instead. This does not require you to login to remote hosts as this is done automatically from user-details in a common system file. If you are a registered user on unrelated hosts, you can set up your details in a '.rhosts' file so that you can use 'rlogin'. See page 177.

% rlogin dellboy.ebs.co.uk
Last login: 14:30:28 Fri 08Jan96 from constance
System V UNIX (dellboy.ebs.co.uk)
login: rob
password: ********

Terminal type (default VT100):
...
~.
(The ~. (tilde dot) is rlogin's escape sequence)

If all you want to do on the remote host is execute a single command you can use 'rsh'. If on your own machine you are known as 'rob' and on the remote machine you are known as 'robboge', you log on as follows:

% rsh dellboy.ebs.co.uk -l robboge ls -R

Once the command has been executed, you are back on your own local machine's command line.

FTP: The UNIX File Transfer Protocol

FTP enables you to copy files between your own local host and any other host on the Internet. If you are on a PC you must connect with your local host in the usual way as a terminal to conduct the FTP session.

Schematic of FTP: File Transfer Protocol.

If you want to transfer any files you have acquired from around the world to your PC you must download them using Kermit or zmodem. An FTP session is of the following form:

% ftp dellboy.ebs.co.uk
Connected to dellboy.ebs.co.uk.
220 ebs FTP server (Version 4.1 8/7/95) ready.
Name (dellboy.ebs.co.uk): rob
331 Password required for rob.
Password: ********
230 User rob logged in.
ftp> get README README.TXT
150 Opening ASCII mode data connection for README (12696 bytes)
226 Transfer complete.
local: README.TXT remote: README
12979 bytes received in 28 seconds (0.44 Kbytes/s)
ftp> quit
221 Goodbye.

The commands you can use at the 'ftp>' prompt are as follows:

cd rdirname          changes the directory to rdirname on remote host
lcd dirname          changes the directory to dirname on local host
dir pattern          lists files in current directory
cdup                 changes up to next higher directory
asc                  set to transfer files in ASCII mode (text files)
bin                  transfer files in binary mode (images, data etc)
get rcrap crap.txt   copy file rcrap on remote to crap.txt on local
put crap.txt rcrap   copy file crap.txt on local to rcrap on remote
del rcrap            delete the file rcrap on the remote host
mget pattern         get or put groups of files in remote host's
mput pattern         current directory whose names match pattern
mdel pattern         delete files on remote whose names match pattern
prompt               stop Yes/No prompts at each file for mget et al
quit                 disconnect and get back to local's command line

All the other stuff in the session is messages telling you what is going on. The 3-digit number before each message tells 'ftp' what is going on. FTP clients for PC Windows exist but are slow due to their insistence on loading lots of directories at the start of each session. Host-to-host file copying can also be done with the 'rcp' UNIX command. This has the same conditions and restrictions as 'rlogin' and 'rsh'. You have to be known and authorised by each host you deal with. See p203.

Internet File Types

The Internet supports two fundamental kinds of file transfer: ASCII and binary. FTP automatically translates from one ASCII format to another if the sending and receiving hosts use different text file formats. Binary files are always transferred exactly as they are. A sample of the different types of text and binary files is:

Text Files
TXT  plain English, French, Spanish, German...
C    program source files in C, Pascal etc.
INI  program initialisation and control files
PS   postscript printer control files

Plus any other type of file containing ASCII characters.

Binary Files
GIF   Graphics Interchange Format
JPEG  Joint Photographic Experts Group
MPEG  Moving Picture Experts Group
EXE   Executable program files
DAT   program data files
ARC   archive files, & ZIP compressed files
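The difference between the two transfer modes matters because an ASCII-mode translation will quietly damage a binary file. The sketch below imitates one such translation, UNIX line endings to DOS line endings; the direction, the function names and the made-up "image" bytes are assumptions for illustration only.

```python
def ascii_transfer(data: bytes) -> bytes:
    """Imitate ASCII mode: rewrite every line ending as CR LF for a DOS receiver."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def binary_transfer(data: bytes) -> bytes:
    """Imitate binary mode: the bytes arrive exactly as they were sent."""
    return data


text = b"line one\nline two\n"
image = bytes([0x47, 0x49, 0x46, 0x0A, 0x00, 0xFF])   # made-up bytes that happen to include 0x0A

print(ascii_transfer(text))                       # readable on the receiving system
print(len(text), len(ascii_transfer(text)))       # the byte count grows with each line
print(ascii_transfer(image) == image)             # False: the binary file is corrupted
print(binary_transfer(image) == image)            # True
```

This is also why the byte count reported at the end of an ASCII-mode session can differ from the size announced when the transfer starts.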

You can send files as they are, but for large files it is better to compress them before transmission to reduce the amount of data to be transferred and hence network time. To send large numbers of files it is also better to compile them into a single archive file before compressing and transmitting them. The whole process of archiving and compressing is shown below. You reverse the process at the other end.

Schematic of archiving and compressing.
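The archive-then-compress sequence, and its reversal at the other end, can be tried out in memory with Python's standard tarfile and gzip modules. This is a sketch of the principle rather than of any particular utility listed here; pack and unpack are hypothetical names and the file contents are invented.

```python
import gzip
import io
import tarfile


def pack(files: dict[str, bytes]) -> bytes:
    """Archive several named files into one tar image, then compress it."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return gzip.compress(buffer.getvalue())


def unpack(blob: bytes) -> dict[str, bytes]:
    """Reverse the process: decompress, then pull each file out of the archive."""
    buffer = io.BytesIO(gzip.decompress(blob))
    with tarfile.open(fileobj=buffer, mode="r") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


files = {"notes.txt": b"the internet " * 500, "table.dat": bytes(2000)}
blob = pack(files)
print(sum(len(data) for data in files.values()), "bytes before,", len(blob), "bytes to send")
print(unpack(blob) == files)
```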

Archiving & Compression

compress  classic UNIX compression utility
zcat      views compressed UNIX files without de-compressing
tar       old UNIX disc to tape archiver
cpio      a re-invention of the above wheel
pax       latest UNIX archiver/compressor
PKZIP     archiver and compressor for PCs
gzip      Free Software Foundation compressor
gcat      their compressed file viewer

You can get both text and binary files by email by sending a request to an FTP mail server as follows:

FTP ftpmail@doc.ic.ac.uk uuencode
USER anonymous
binary
cd fyi
get fyi-index.txt
quit

The uuencode option converts binary files into a 7-bit text form that avoids the ASCII control characters 0 - 31. When you get the file from your mailbox you have to uudecode it to get the original binary.
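The heart of uuencoding is available in Python's binascii module, which encodes up to 45 bytes per line. The sketch below shows only that per-line encoding; the real uuencode program also adds "begin" and "end" lines carrying the file name, which are left out here, and the two wrapper functions are my own.

```python
import binascii


def uuencode_lines(data: bytes) -> list[bytes]:
    """Encode data as uuencoded text lines, 45 input bytes per line."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uudecode_lines(lines: list[bytes]) -> bytes:
    """Decode the lines produced above back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


binary = bytes(range(256))
lines = uuencode_lines(binary)
print(lines[0])

# Apart from the newline ending each line, no control characters appear.
printable = all(32 <= byte <= 96 for line in lines for byte in line.rstrip(b"\n"))
print(printable, uudecode_lines(lines) == binary)
```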

Search Engines

Archie: world-wide Filename Search Utility

Archie is a server daemon that searches the Internet for software. Archie can be contacted by telnet, client or email. The only sensible way to use Archie is to email him with a search request. He will then do the search and leave the results in your mailbox. To search the Internet for a program file called mktr, send the following email message:

To: archie@archie.doc.ic.ac.uk
From: robmorton@ebs.com.uk
prog mktr
quit

Archie then searches all software directories on all the servers on the Internet for occurrences of this file name. He then returns the list to you as an email message which he can post to your mailbox. As well as the program files themselves, servers also hold text files that describe the programs. You can ask Archie to search the Internet for occurrences of the name of the program (or any other string for that matter) within the program description files by sending an email message like:

To: archie@archie.doc.ic.ac.uk
From: robmorton@ebs.com.uk
whatis marketeer
quit

This asks Archie to search all program description files on the Internet for all occurrences of the name marketeer. Both the 'prog' and 'whatis' commands take a string argument that is assumed to be a UNIX regular expression. This allows you to do things like:

prog [0-9]          searches for filenames containing digits
prog ^[a-z]         or filenames beginning with a small letter
prog ^birdie.*txt$  This causes Archie to search for filenames that begin with birdie and end with txt. The ^ ties birdie to start of filename, $ ties txt to end of the filename, .* means there can be any number of any type of characters in between.
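These patterns can be tried out locally before mailing them off. The sketch below uses Python's re module, whose syntax agrees with UNIX regular expressions for these simple cases (an assumption worth bearing in mind for anything fancier); the list of file names is invented. Note that a ^ placed outside the brackets anchors the match to the start of the name.

```python
import re

# Invented file names standing in for what a server might hold.
names = ["birdie.txt", "birdie-notes.txt", "mybirdie.txt", "README", "mktr2", "Makefile"]


def matching(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates in which the regular expression matches somewhere."""
    return [name for name in candidates if re.search(pattern, name)]


print(matching(r"[0-9]", names))           # names containing a digit
print(matching(r"^[a-z]", names))          # names beginning with a small letter
print(matching(r"^birdie.*txt$", names))   # names that begin with birdie and end with txt
```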

As well as the 'prog' and 'whatis' commands as shown in the above email request messages, Archie will respond to other commands as follows:

compress  makes Archie compress his reply file before sending it to you as an email message.
path      lets you specify a directory path on your host where you would like Archie to put his reply file (assuming you don't want him to put it in your normal incoming email directory)
servers   makes Archie send you an up-to-date list of Archie servers on the Internet
help      returns the help text for using Archie by email.
quit      ends your email request to Archie

Gopher: A Menu-Based Search System

Gopher finds documents on the Internet using menus. Menus can contain 4 types of item: other menus, search items, telnet items and files:

Gopher search menu structure.

You can 'telnet' gopher, but it is better to use a gopher client program. HGOPHER is a good Windows client written in England by Martyn Hampson and is free. You go trogging through the menus until you get to one listing the files you want. Select the file you want from the menu. Gopher then starts displaying it. Press Q to stop it. Gopher then asks if you want it copied (s) or mailed (m) to you. You can only use (m) if you are 'telnetted'. If you accessed a UNIX Gopher via dial-in from a PC, you can press return and then D to download the file to your PC. Gopher can download any file in its menu whether 'txt' or binary. The UNIX Gopher controls are shown below:

Enter   select current menu item (where cursor is)
u       go back to previous menu (same as left arrow)
+       move to next menu page
-       move back to previous menu page
m       go back to main menu
digits  go to the numbered menu item
/       search menu for the string that follows it
n       search for next match
q       quit - leave Gopher
>=      get description of current item
a       add current item to your bookmark menu
A       add this whole menu to your bookmark menu
v       view bookmark menu
d       delete current bookmark menu
m       mail current file to your mailbox
s       save current file (not for telnet)
p       print current file (not for telnet)
D       download current file

When you come across a menu item you may want to go back to, mark it and then hit 'a' (lower case) to put it on your own bookmark menu. You can build a bookmark menu of all your items of interest as you are trawling through the Internet. Veronica is an index of all Gopher menus which can be accessed on the Internet.

WAIS: The Wide-Area Information Server

This is implemented on a CM5 Connection Machine, a massively parallel supercomputer built by a company called Thinking Machines. Apple and Dow Jones are also involved in WAIS. Archie and Gopher only search filenames and titles for a single given string. WAIS searches the actual contents of files for occurrences of sets of strings. Eg. if you set the search string as "florida pie", it will search documents for occurrences of the words "florida" and "pie". It will capture them whether they appear together as "florida pie" or separately. You can 'telnet' WAIS to get into its absolutely horrible UNIX command-line interface. When you 'telnet', you first get a list of about 500 WAIS servers. Press

o  if required to set WAIS display options (see page 266)
w  to enter your search words
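To make the idea of a content search concrete, here is a small sketch of my own, which is certainly not how WAIS itself works internally. It assumes every search word must occur somewhere in a document (together or apart), counts the occurrences, and scales the counts so that the best match scores 1000, mirroring the 0-1000 relevance ratings WAIS reports. The documents are invented.

```python
import re


def search(documents: dict[str, str], query: str) -> list[tuple[str, int]]:
    """Rank documents containing every query word; the best match scores 1000."""
    words = query.lower().split()
    hits = {}
    for name, text in documents.items():
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        if all(word in tokens for word in words):
            # Relevance here is just the total number of occurrences (an assumption).
            hits[name] = sum(tokens.count(word) for word in words)
    if not hits:
        return []
    top = max(hits.values())
    ranked = [(name, round(1000 * count / top)) for name, count in hits.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


documents = {
    "recipes": "Florida key lime pie. Pie recipes from Florida.",
    "weather": "A pie chart of rainfall in Florida",
    "dessert": "Apple pie",
}
print(search(documents, "florida pie"))
```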

A Search Results Menu appears listing databases, each with a relevance rating from 0-1000. Mark the ones you want as follows:

j        to move down the list
k        to move back up the list
J        page down the list (Ctrl-V and Ctrl-D also do this)
K        page back up the list (Ctrl-U also does this)
/name    search for name within the menu list (can be a partial name)
item N°  positions cursor at a menu list item (ie a database).
.        shows a short description of the database (or hit space bar)
u        mark the database to use for part of your search.
s        return to the sources menu where now are shown the databases you have just marked.

Navigate through this list in the same way as above, using the space bar to select the ones you finally decide to search on. An asterisk appears next to selected databases. Space bar can also de-select an already selected one. Then press w to enter your key words again as before. You then get a list of files. To get them you must then use anonymous FTP to get them to your own Internet host and then if necessary download them to your PC. But for short text items you can type:

.  to display the contents of the file
m  to mail the file to you
s  to return for another search session
q  to quit WAIS

You can get to WAIS access points in Gopher Menus and the World-wide Web. It's best however to use the client software package WinWais written by the United States Geological Survey. To use this you:

  1. Set up sources to search, ie list of servers or databases
  2. Enter the search text in the Tell me about... box
  3. WAIS comes back with a list of sources containing your text words
  4. It gives them a starred relevance rating

The World-Wide Web

Hypertext Document System

The World-Wide Web allows you to hop across hypertext links between a vast number of different topics in all the databases in all the WWW servers all over the world as depicted below.

Schematic of the worldwide web hyperlink structure.

The World-Wide Web's hypertext files can have multimedia file links embedded within them. These can be links to still image files (GIF or JPEG), sound files or movie clips (in MPEG files). Information is transferred from the server's hypertext files to the client's browser program using the hypertext transfer protocol http. Image, sound and video files are very large, even for a small image or movie clip. So they take a long time to arrive across the Internet. Text is fast. If you are doing a serious web-walk, it is therefore best to stick to text only until you have got really close to what you want.

If you have suitable client software on your PC such as Mosaic, you can walk the web quite easily. It is a graphics package with Windows, Xwindows and Mac versions. It was written by the National Center for Supercomputing Applications (NCSA) in the USA and is free. It is full of bugs, but is OK to use. Improved versions are emerging all the time. But because it is graphical and displays all images as they come, it is very very very slow: so slow as to be unusable on anything below a Pentium PC or a SUN or SPARC-type UNIX workstation.

It is probably better to 'telnet' to a WWW server and invoke Lynx. This is a character based hypertext browser. It may look frumpy, but it is fast. Hyperlinks to multimedia items are indicated by [IMAGE]. If you want to see the image or run the movie clip you click on the [IMAGE].

Some of the hyperlinks in WWW documents lead to Gopher Menus and WAIS. WWW can launch these for you but it is better to note them and access them directly later through your Gopher or WAIS client software.


© 1998 Robert John Morton