The project directory is to contain an RCS subdirectory where you are
to keep the source files for two separate programs (a Web server and a
Web client), and a Makefile, all under RCS management. In addition,
the directory ~/man/man1 is to contain man pages for the executable
programs you write for this project. You may leave a README file in the
project directory if you think that one is needed, but I neither expect
nor require you to do so.
It is a requirement for this project that both your programs interact properly with standard Web programs in use on the Web. That is, your server must be able to respond to requests from Netscape and Lynx, and your client must be able to make requests to servers such as the ones running on babbage and qcunix1.
Your programs are to exchange information using the Hypertext Transfer Protocol (HTTP). The full documentation of HTTP is available at The WWW Consortium, but you should not need to read the full document. This assignment page includes a summary of the parts of the protocol that you need to handle.
The steps you are to follow to develop the two programs are listed below. First develop your server, and then start working on your client, which is an open-ended exercise. The list of steps for each program tells you how far you are expected to get this term.
HTTP servers use a stream socket to receive request messages from HTTP clients, and to send response messages in return. Your server is to operate as an iterative server: It accepts a connection from a client, reads the client's request, sends a reply to the client, closes the socket, and then becomes ready to accept a connection from another client.
Your server is to use the default port number assigned to you, based on the last three digits of your ID number. It must accept a command line argument that allows the user to specify an alternate port number.
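One way to handle the port argument is sketched below. It assumes the alternate port is given as the first command line argument, which this page does not actually specify, and the name choose_port() is only an illustration. The default port is passed in by the caller because it depends on your ID number.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Return the port to listen on.
 * With no argument, use default_port. With an argument (assumed to be
 * argv[1]), use it if it is a decimal number in 1..65535.
 * Returns -1 if the argument is not a usable port number. */
int choose_port(int argc, char *argv[], int default_port)
{
    assert(argv != NULL);
    if (argc < 2)
        return default_port;

    char *end;
    errno = 0;
    long value = strtol(argv[1], &end, 10);

    /* Reject empty input, trailing junk, overflow, and out-of-range values. */
    if (errno != 0 || end == argv[1] || *end != '\0')
        return -1;
    if (value < 1 || value > 65535)
        return -1;
    return (int)value;
}
```

A caller would typically print a usage message and exit when the result is -1.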
Each line of a request message ends with CRLF (the ASCII Carriage Return and Linefeed characters). The format of a Request-Line is "GET <uri> HTTP/1.0CRLF", with <uri> (Uniform Resource Identifier) being a Unix pathname relative to the server's root directory.
The optional Request Headers consist of any number of text lines terminated by CRLF. Your client must generate at least one valid request header consisting of a line in the form "User-Agent: <identifier>CRLF". The <identifier> is a string that identifies your client program, including its RCS version number. Your server will not interpret request headers, but will copy them all to its log file. Fuller descriptions of your client and server are given below.
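As a rough illustration of the request format just described, the sketch below formats a Request-Line, one User-Agent header, and the terminating blank line into a buffer. The function name build_request() and the identifier string used in the usage note are made up for this example. Your own client may be organized quite differently.

```c
#include <assert.h>
#include <stdio.h>

/* Format a complete HTTP/1.0 GET request into buf.
 * agent is the identifier string for the User-Agent header.
 * Returns the length of the request, or -1 if buf is too small. */
int build_request(char *buf, size_t size, const char *uri, const char *agent)
{
    assert(buf != NULL && uri != NULL && agent != NULL);

    int n = snprintf(buf, size,
                     "GET %s HTTP/1.0\r\n"   /* Request-Line */
                     "User-Agent: %s\r\n"    /* one request header */
                     "\r\n",                 /* blank line ends the request */
                     uri, agent);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For example, build_request(buf, sizeof buf, "/index.html", "myclient/1.3") produces a request that can be written to the socket in a single call.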
In summary, a request message consists of a Request-Line, zero or more Request-Header lines, and a terminating CRLF. The server knows it has read all of a client's request when it gets a blank line of text.
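The blank-line rule can be checked on the bytes read so far, as in the minimal sketch below. It assumes the server accumulates what it reads from the socket in a single buffer. The name request_complete() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the first len bytes of buf contain the blank line that ends
 * a request, that is, the byte sequence CR LF CR LF. Return 0 otherwise.
 * buf does not have to be NUL-terminated. */
int request_complete(const char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return 1;
    }
    return 0;
}
```

A server using this approach would keep calling read(), appending to the buffer, until request_complete() returns 1 or the buffer fills up.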
A response message consists of a Status Line, one or more header lines, and a terminating CRLF, optionally followed by an Entity-Body.
The Status Line consists of a version string ("HTTP/1.0"), one or more spaces, a 3-digit code number, one or more spaces, some text that describes the status code, and finally the end of line code (CRLF). Here is a list of status codes and suggested text that your server might generate:
200 OK
204 No Content
400 Bad Request
403 Forbidden
404 Not Found
500 Internal Server Error
501 Not Implemented
The meanings of these codes should be pretty self-evident: 200 means the request was all right, 204 means the request was for an empty file, 400 means there was something wrong with the syntax of the request, 403 means the request was for a file that exists but for which the server does not have read access, and 404 means the request was for a file that does not exist. 500 will probably never occur, and 501 is a way of handling requests that your server recognizes but can't handle yet.
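The status codes and suggested text listed above can be kept in one place, for example as in the sketch below. The function names reason_phrase() and format_status_line() are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* Suggested text for each status code this server generates.
 * Returns NULL for a code that is not in the list. */
const char *reason_phrase(int code)
{
    switch (code) {
    case 200: return "OK";
    case 204: return "No Content";
    case 400: return "Bad Request";
    case 403: return "Forbidden";
    case 404: return "Not Found";
    case 500: return "Internal Server Error";
    case 501: return "Not Implemented";
    default:  return NULL;
    }
}

/* Write "HTTP/1.0 <code> <text>CRLF" into buf.
 * Returns the length written, or -1 for an unknown code or a short buffer. */
int format_status_line(char *buf, size_t size, int code)
{
    assert(buf != NULL);
    const char *text = reason_phrase(code);
    if (text == NULL)
        return -1;

    int n = snprintf(buf, size, "HTTP/1.0 %d %s\r\n", code, text);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```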
Your server will return at least one Response-Header, a line containing "Server: <identifier>CRLF". Like the User-Agent line sent by the client, this line is to contain an identifier string giving the name and RCS version of your server.
If the server is going to return a file to the client, it first sends an Entity Header that describes the file being returned. In particular, the server sends an Entity Header line containing "Content-Length: <size>CRLF", where <size> is the length, in bytes, of the file being returned as the Entity Body. Your server must also send a second Entity Header line containing "Content-Type: <media-type>CRLF". The media-type will normally be the string "text/plain", but it might be one of these common names:
text/html - A Hypertext Markup Language web page.
image/gif - A picture in GIF format.
application/octet-stream - An arbitrary binary file.
audio/basic - An audio file.
video/mpeg - An MPEG movie.
Note: Your server might return one of these additional content types, but I don't expect your client to know what to do with them!
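The header lines described above might be assembled as in the following sketch. It writes the Server, Content-Length, and Content-Type lines followed by the empty line that precedes the Entity Body. The function name format_headers() and its parameters are assumptions made for the example, not a required interface.

```c
#include <assert.h>
#include <stdio.h>

/* Write the response header lines, then the empty line that separates
 * the headers from the Entity Body.
 * Returns the length written, or -1 if buf is too small. */
int format_headers(char *buf, size_t size, const char *server_id,
                   long length, const char *media_type)
{
    assert(buf != NULL && server_id != NULL && media_type != NULL);

    int n = snprintf(buf, size,
                     "Server: %s\r\n"
                     "Content-Length: %ld\r\n"
                     "Content-Type: %s\r\n"
                     "\r\n",
                     server_id, length, media_type);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

The Status Line is sent before these lines, and the bytes of the file follow them.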
Unless the server returns a status code of 204 (no content), it must return the contents of the requested file as the Entity Body part of its reply. The client can tell when the Entity Body begins because it is separated from the header lines by an empty line. When the server has written all of the file to the client, it closes its end of the socket, which signals the client that the reply is complete. The client could also tell how many bytes it will receive by examining the Content-Length header line, but this is not a reliable technique. The Content-Length value can be used by clients to provide users with feedback about how much longer it will take to retrieve the requested document.
The server's log file is to be written in the same directory as the server's executable file, and must have the same name as the server's executable file, but with an extension of .log. You must determine the name of the server's executable file from the value of argv[0] passed to main().
Optional: If the environment variable SERVER_LOG is set, use that as the full pathname to the server's log file instead of the name specified above.
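A minimal sketch of choosing the log file name follows. It simply appends ".log" to argv[0] and lets SERVER_LOG override the result. Treating an empty SERVER_LOG as unset is my assumption, and log_pathname() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Store the log file pathname in out.
 * Uses $SERVER_LOG if it is set and non-empty (assumption: an empty value
 * counts as unset); otherwise uses argv0 with ".log" appended.
 * Returns 0 on success, -1 if the pathname does not fit in out. */
int log_pathname(const char *argv0, char *out, size_t size)
{
    assert(argv0 != NULL && out != NULL);

    const char *override = getenv("SERVER_LOG");
    int n;
    if (override != NULL && override[0] != '\0')
        n = snprintf(out, size, "%s", override);
    else
        n = snprintf(out, size, "%s.log", argv0);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Note that argv[0] holds whatever the user typed to start the program. If the server is started through a PATH search, argv[0] has no directory part and this sketch would place the log in the current directory.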
Note: If you write debugging information to the log file (a deprecated practice; use gdb or ddd instead), you must be sure no such messages are written in the version of the assignment that you submit.
Remember! You must code this step, and no more, as a working program before you proceed to work on the next step. The RCS version number for this program is to be 1.1. This policy applies to all steps in this project.
Note: Test this and all subsequent versions of your server using Lynx and Netscape.
Remember! The RCS version number for this step must be 1.2.
Check that the method in the Request-Line is "GET" and the HTTP-version is "HTTP/1.0". If either is wrong, send a "400 Bad Request" response and close the socket. Otherwise send a "501 Not Implemented" response.
Optional: You could accept version numbers other than 1.0, provided they start with "HTTP/."
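The check in this step could look something like the sketch below, which splits the Request-Line into its three fields with sscanf(). The function name and the field size limits are arbitrary choices for the example, and the optional handling of other version numbers is not included.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the status code for a Request-Line at this step:
 * 400 if the line does not have exactly three fields, the method is not
 * GET, or the version is not HTTP/1.0; otherwise 501. */
int check_request_line(const char *line)
{
    assert(line != NULL);
    char method[16], uri[1024], version[16], extra[2];

    /* The fourth conversion only detects unexpected trailing text. */
    int fields = sscanf(line, "%15s %1023s %15s %1s",
                        method, uri, version, extra);
    if (fields != 3)
        return 400;
    if (strcmp(method, "GET") != 0 || strcmp(version, "HTTP/1.0") != 0)
        return 400;
    return 501;
}
```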
Remember! The RCS version number for this step must be 1.3.
SERVER_ROOT". Use stat() to determine whether it is a directory or a file. If it is a directory, return "501 Not Implemented." If it is a file of size zero, return "204 No Content." If it is a file, open it for reading and copy it to the client as the Entity Body of the response. (Be sure to set the "Content-Length" header line properly.) If the call to open() fails, return "403 Forbidden." If the file does not exist, return "404 Not Found."
Remember! The RCS version number for this step must be 1.4.
Remember! The RCS version number for this step must be 1.5.
Remember! The RCS version number for this step must be 1.1.
parseurl.h containing the function prototype for this function. (#include the header file in both your .c files.) parseURL() must receive the command line entered by the user as one argument, and it is to return the parsed URL either as a structure or as an array of strings, as discussed in class. Use gdb or ddd to verify that your parser is working correctly.
Remember! The RCS version number for this step must be 1.2.
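The sketch below shows one possible shape for a parser that returns a structure. Everything about it is an assumption made for illustration: the struct layout, the signature, the "http://host[:port][/path]" input format, and the defaults of port 80 and path "/". Follow the design discussed in class where it differs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result structure; not the one from class. */
struct parsed_url {
    char host[256];
    int  port;          /* 80 when the URL gives none (assumed default) */
    char path[1024];    /* "/" when the URL gives none (assumed default) */
};

/* Parse "http://host[:port][/path]". Returns 0 on success, -1 on error. */
int parseURL(const char *input, struct parsed_url *out)
{
    assert(input != NULL && out != NULL);
    const char *prefix = "http://";

    if (strncmp(input, prefix, strlen(prefix)) != 0)
        return -1;
    const char *host = input + strlen(prefix);

    /* The host name runs up to the first ':' or '/' (or the end). */
    size_t host_len = strcspn(host, ":/");
    if (host_len == 0 || host_len >= sizeof out->host)
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *rest = host + host_len;
    out->port = 80;
    if (*rest == ':') {
        char *end;
        long port = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
        rest = end;
    }

    if (*rest == '\0')
        rest = "/";
    else if (*rest != '/')
        return -1;
    if (strlen(rest) >= sizeof out->path)
        return -1;
    strcpy(out->path, rest);
    return 0;
}
```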
If the environment variable CLIENT_FILE is set to the pathname of a file, write the server's reply to it. If the status code of the server's reply does not begin with the character "2", display the complete Status Line of the server's reply to the user.
Remember! The RCS version number for this step must be 1.3.
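The test on the first character of the status code might be written as in this sketch, which follows the Status Line layout given earlier (version string, one or more spaces, 3-digit code). The function name reply_succeeded() is made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if the Status Line carries a code beginning with '2',
 * 0 if it carries some other 3-digit code,
 * and -1 if the line does not look like a Status Line at all. */
int reply_succeeded(const char *status_line)
{
    assert(status_line != NULL);
    const char *p = status_line;

    if (strncmp(p, "HTTP/", 5) != 0)
        return -1;
    p += strcspn(p, " ");       /* skip the version string */
    p += strspn(p, " ");        /* skip one or more spaces */

    if (!isdigit((unsigned char)p[0]) ||
        !isdigit((unsigned char)p[1]) ||
        !isdigit((unsigned char)p[2]))
        return -1;
    return p[0] == '2';
}
```

When the result is not 1, the client would print the Status Line it received.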
Remember! The RCS version number for this step must be 1.4.