Assignment 2, CS-701, Spring 1997

Ship in a Bottle

Due Date:

May 27, 1997

Deliverables:

When the assignment is complete, send me an email message telling me the path to the project directory. The project directory is to contain an RCS subdirectory, and nothing else. Consult "Using RCS" for information on how to set up your project directory and use RCS for project management. In addition, the ~/man directory tree is to contain man pages for the two executable programs. See "How to Write a Man Page" for information on how to prepare your man pages.

Requirements:

The project is to be done as a sequence of Steps as outlined below. You will not receive credit for work on a particular step unless all previous steps are complete. All the files for a given Step (Makefile, source code, and documentation) must have RCS Revision numbers with the number to the left of the decimal point equal to the Step number given below. After checking out the Makefile for a given step, typing "make STEP=<n> install" must build and install the executable and documentation files for that step. (<n> is the Step number.)

You may also leave a README file in the project directory if you think that one is needed, but I neither expect nor require you to do so.

In addition to the information given here, be sure to consult the Grading Form for this assignment, which includes additional information about the requirements for this project.

Project Description
Reference Tables
- Data Type Codes
- Message Types

Project Description

This is an exercise in distributed computing. There are two programs to write. In this document they will be called fe and be for "front-end" and "back-end" respectively, but you may give them any descriptive names you choose.

The be program is a network server, a program that runs continuously on some host. Its job is to respond to requests from client programs, which are instances of the fe program being run interactively by users. The be maintains a database called a property list. People use the fe program to add items to the property list and request information about it.

The project is called "Ship in a Bottle" because the operations of building and querying a property list are done by commands entered remotely from the computer that maintains the list. The analogy is to building a wooden ship model inside a bottle using special tools from outside the bottle.

A property list is a data structure that contains {<name><type><value>} triples. Each <name> may occur any number of times in the list, but each {<name><type>} pair must occur no more than once in the list. The names are arbitrary text strings, and the values are arbitrary numbers or text strings. For example, a property list for a computer might have two properties named "host ID," one with an integer value giving the computer's IP address and the other with a text string value giving the computer's fully qualified domain name. ("babbage.cs.qc.edu" is an example of a fully-qualified domain name.)
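
One possible in-memory representation of a property list is a singly linked list of nodes. The sketch below is only an illustration under that assumption: the names prop_type, prop_node, and prop_find are hypothetical and are not part of the assignment. It shows how the rule "each {<name><type>} pair occurs at most once" becomes a lookup on both fields.

```c
#include <stddef.h>
#include <string.h>

/* Type codes; the numeric values match the Data Type Codes table. */
enum prop_type { PROP_TEXT = 1, PROP_INTEGER = 2, PROP_FLOAT = 3 };

/* One {<name><type><value>} triple (hypothetical layout). */
struct prop_node {
    const char *name;
    enum prop_type type;
    union {
        const char *text;   /* used when type is PROP_TEXT */
        long integer;       /* used when type is PROP_INTEGER */
        double real;        /* used when type is PROP_FLOAT */
    } value;
    struct prop_node *next;
};

/* Return the node that matches both name and type, or NULL if there is none. */
struct prop_node *prop_find(struct prop_node *head, const char *name,
                            enum prop_type type)
{
    struct prop_node *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->type == type && strcmp(p->name, name) == 0)
            return p;
    }
    return NULL;
}
```

With this layout, a list may hold two "host ID" nodes as long as their types differ, and prop_find() tells the server whether a given pair is already present.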

Code and document each step of the project separately. Use the Makefile provided to you as a starting point for your own Makefile so that any step can be recreated when the project is complete.

Step 1 -- Send strings from client to server

Write a server (be) that accepts socket connections on a "well-known" port number. Your assigned well-known port number is the last four digits of your ID number. (If your assigned port number is less than 1,000, add 1,000 to it.) Allow the user to override the default port number by giving an alternate value as a command line argument. Everyone has at least five port numbers they can use without conflicting with anyone else in the course: your assigned number, and that number plus 1, 2, 3, or 4.
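
The port rule can be sketched as two small functions. This is an illustration only: the function names default_port and choose_port are hypothetical, and the choice to reject overrides outside 1..65535 is an assumption, not a stated requirement.

```c
#include <stdlib.h>

/* Derive the default port from the last four digits of an ID number. */
int default_port(int last_four_digits)
{
    return last_four_digits < 1000 ? last_four_digits + 1000
                                   : last_four_digits;
}

/*
 * Use the command line override if one was given, else the default.
 * Returns -1 if the override is not a valid port number (assumed policy).
 */
int choose_port(const char *override, int last_four_digits)
{
    char *end;
    long value;

    if (override == NULL)
        return default_port(last_four_digits);
    value = strtol(override, &end, 10);
    if (end == override || *end != '\0' || value < 1 || value > 65535)
        return -1;
    return (int)value;
}
```

For example, an ID ending in 0057 gives a default port of 1057, and choose_port("5000", 57) returns the override, 5000.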

Write a client (fe) that takes a host name and an optional port number as command line arguments, and connects to your be running on that host. If the user doesn't supply a port number, use your assigned well-known port number. This program prompts the user to enter a string. If the user types "q", close the connection to the server, and exit the program. Otherwise, write the string to the server, and prompt the user to enter another string.

When you write the string to the server, first send a 4-byte integer (in network byte order) telling how long the string is, and then write the string itself. When be receives a string, it prints it out, and tries to read another string from the client. When be detects end of file on the socket (the read() system call will return zero as the number of bytes read), it closes its end of the socket, and tries to accept a connection from another client.
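
The length-then-string framing might look like the following sketch. The function name frame_string is hypothetical, and the sketch assumes that the terminating NUL byte is not transmitted, which the text above does not specify either way.

```c
#include <arpa/inet.h>  /* htonl */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "4-byte length in network byte order, then the string bytes"
 * in one malloc'd buffer. Sets *out_len to the number of bytes to write.
 * The caller frees the buffer. Returns NULL if malloc fails.
 */
unsigned char *frame_string(const char *s, size_t *out_len)
{
    uint32_t len = (uint32_t)strlen(s);
    uint32_t wire_len = htonl(len);
    unsigned char *buf = malloc(sizeof wire_len + len);

    if (buf == NULL)
        return NULL;
    memcpy(buf, &wire_len, sizeof wire_len);
    memcpy(buf + sizeof wire_len, s, len);   /* NUL is not sent (assumption) */
    *out_len = sizeof wire_len + len;
    return buf;
}
```

The client would then pass the buffer and *out_len to write(), and the server would read four bytes, apply ntohl(), and read that many more bytes.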

For this step, the normal way to terminate be is to send it SIGINT.

Notes: For this and all succeeding steps, fe must run correctly on qcunix1 as well as in the laboratory. Of course you will have to recompile your program to run it on qcunix1, but it must not require any changes to the source code in order to compile and run the program there. You do not have to be able to run be on qcunix1, however.

When you build your programs in the laboratory, they have to be linked with libsocket.so and libnls.so, which is set up in the sample Makefile. However, neither of these libraries is used on qcunix1.

Step 2 -- Terminate and Value messages

In this step we start to develop the protocol for exchanging messages between be and fe processes.

Each message begins with a "siab header", which consists of four 4-byte integers; each integer is transmitted in network byte order. ("siab" stands for "Ship in a Bottle"). The first integer tells how many bytes there are in the entire message, including the header and any additional bytes that are part of the message. The next integer tells the type of message being sent. The message type will be either 0, meaning "Terminate," or 1, meaning "Data Value". For data value messages, the third integer tells the type of data value being sent. The data type will be 1, meaning "text string," 2, meaning "integer," or 3, meaning "floating-point string." The last integer in the header is the value of the integer if the data type is 2. Otherwise, it is the length of the text or floating-point string, in bytes.
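
As an illustration, the four header words can be kept in a struct and converted to and from their 16-byte wire form. The names siab_header, siab_pack, and siab_unpack are hypothetical; they are not taken from the header file supplied for the course.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdint.h>
#include <string.h>

#define SIAB_HEADER_SIZE 16

struct siab_header {
    uint32_t msg_length;       /* header plus any bytes that follow it */
    uint32_t msg_type;         /* 0 = Terminate, 1 = Data Value */
    uint32_t data_type;        /* 1 = text, 2 = integer, 3 = float string */
    int32_t  value_or_length;  /* integer value, or string length in bytes */
};

/* Convert a header to its wire form: four integers in network byte order. */
void siab_pack(const struct siab_header *h, unsigned char out[SIAB_HEADER_SIZE])
{
    uint32_t words[4];

    words[0] = htonl(h->msg_length);
    words[1] = htonl(h->msg_type);
    words[2] = htonl(h->data_type);
    words[3] = htonl((uint32_t)h->value_or_length);
    memcpy(out, words, SIAB_HEADER_SIZE);
}

/* Convert 16 received bytes back into host byte order. */
void siab_unpack(const unsigned char in[SIAB_HEADER_SIZE], struct siab_header *h)
{
    uint32_t words[4];

    memcpy(words, in, SIAB_HEADER_SIZE);
    h->msg_length = ntohl(words[0]);
    h->msg_type = ntohl(words[1]);
    h->data_type = ntohl(words[2]);
    h->value_or_length = (int32_t)ntohl(words[3]);
}
```

For the text string "hello", the header would carry msg_length 21 (16 + 5), msg_type 1, data_type 1, and value_or_length 5. Packing into a byte array first makes it easy to send the whole header with a single write().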

If you send the entire siab header using a single write() system call, the program that receives a message will be able to read the entire header with one call to read(). Strictly speaking, you should not be able to count on this when using SOCK_STREAM sockets, but for our purposes it will "always" work.

[Experiment: A client writes 1,000,000 bytes to a socket endlessly, and a server reads 1,000,000 bytes from the other end of the socket endlessly. Run the client on qcunix1, babbage, and cougar, and run the server on qcunix1 and on babbage.

Server	Client	Total Bytes Transferred	Average Bytes per Read	Minimum Bytes per Read	Maximum Bytes per Read
babbage	babbage	1,006.0M	94,995	576	131,072
babbage	cougar	42.9M	4,222	1,360	8,760
babbage	qcunix1	18.7M	3,029	80	7,300
qcunix1	babbage	19.1M	1,902	1,360	33,580
qcunix1	qcunix1	153.1M	13,317	8	64,240

The important statistic here is the minimum number of bytes received by a single call to read(). Except when both the client and server run on qcunix1, I never observed fewer bytes than the size of a complete siab header being read by the server.]
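
If you would rather not depend on a whole header arriving in one read(), the usual alternative is to call read() in a loop until the requested number of bytes has arrived. The sketch below shows that technique; the function name read_full and its return convention are hypothetical choices made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly count bytes from fd.
 * Returns count on success, 0 if end of file is seen before any byte
 * arrives, and -1 on an error or if end of file cuts the data short.
 */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t got = 0;

    while (got < count) {
        ssize_t n = read(fd, p + got, count - got);

        if (n == 0)                 /* end of file */
            return got == 0 ? 0 : -1;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A return value of 0 corresponds to the "client closed the connection" case from Step 1, so the server can tell a clean disconnect from a truncated message.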

If the data type is 1 or 3, the siab header is followed immediately by the string itself.

For this step, fe again connects to the server, and then prompts the user to enter a string. If the string is "q" the program closes its socket connection to the server and exits. If the string is "exit" the program writes a siab message with the message type field set to 0, then closes its socket connection to the server and exits.

If the user types a string consisting entirely of decimal digits, convert the string to an integer and send it to the server in a properly formatted siab data message. If the string consists entirely of digits and exactly one period, send the string to the server as a floating-point string. (Optional: accept plus and minus characters and an exponent field.) If the string does not meet either of these criteria, send it to the server as a text string.
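
The classification rule can be sketched as a single pass over the input. The function name classify_value is hypothetical, and two details are assumptions of this sketch rather than stated requirements: the empty string and a lone "." are treated as text, and the optional sign and exponent forms are not handled.

```c
#include <ctype.h>

/* The numeric values match the data type codes used in the siab header. */
enum { TYPE_TEXT = 1, TYPE_INTEGER = 2, TYPE_FLOAT = 3 };

/* Decide how a user-entered string should be sent to the server. */
int classify_value(const char *s)
{
    int digits = 0;
    int periods = 0;
    const char *p;

    for (p = s; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p))
            digits++;
        else if (*p == '.')
            periods++;
        else
            return TYPE_TEXT;       /* any other character means text */
    }
    if (digits > 0 && periods == 0)
        return TYPE_INTEGER;
    if (digits > 0 && periods == 1)
        return TYPE_FLOAT;
    return TYPE_TEXT;               /* "", ".", "1.2.3", and so on */
}
```

Note that a very long digit string would classify as an integer but might not fit in a 4-byte value; how to handle that case is left to you.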

As in Step 1, fe continues to run until the user types "q" (or "exit") in response to the prompt.

For this step be displays each data value and its type. It may seem silly, but floating-point strings must be converted to doubles (use strtod()) and printed using printf()'s %f format. (C++ programmers may use cout, but it still must be the double value that is output, not the string.) When a terminate message is received, the server prints an appropriate message and exits.
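
The conversion requirement might be met along these lines. The function name format_float_string is hypothetical, and the sketch formats into a buffer with snprintf() rather than printing directly, only so that the result is easy to check.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Convert a floating-point string to a double with strtod(), then format
 * the double (not the original string) with %f.
 * Returns the number of characters produced, or -1 if s is not a number.
 */
int format_float_string(const char *s, char *out, size_t out_size)
{
    char *end;
    double value = strtod(s, &end);

    if (end == s)   /* strtod() consumed nothing */
        return -1;
    return snprintf(out, out_size, "%f", value);
}
```

With the default precision of %f, the received string "3.5" is displayed as 3.500000, which is one way to confirm that the double value is what gets printed.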

Step 3 -- Command File (fe) and Log File (be)

When the server starts up, it opens a "log file" with a filename that consists of the program's name with ".log" at the end. Use argv[0] to determine the program's name. Instead of writing messages to stdout, write text messages to the log file. Each log file message must include the time and date when it was written, the IP address (or hostname) of the computer that sent the message, the message type, the data type, and the data value. Be sure the log file includes a message telling when the server exited, even if it is terminated by SIGINT. It must be possible to read the latest message written to the log file as soon as it is written.
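
Here is a rough sketch of the two logging pieces described above: building the file name from argv[0], and writing an entry that is visible immediately. The function names, the timestamp layout, and the field labels in the log line are all assumptions made for illustration; any format that contains the required items will do.

```c
#include <stdio.h>
#include <time.h>

/* Build "<program name>.log". Returns 0, or -1 if the result does not fit. */
int log_file_name(const char *argv0, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s.log", argv0);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/*
 * Write one timestamped entry and flush it, so that someone reading the
 * file sees the entry as soon as it is written.
 */
int log_entry(FILE *log, time_t when, const char *peer,
              int msg_type, int data_type, const char *value)
{
    char stamp[32];
    struct tm *tm = localtime(&when);

    if (tm == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", tm) == 0)
        return -1;
    if (fprintf(log, "%s %s msg=%d type=%d value=%s\n",
                stamp, peer, msg_type, data_type, value) < 0)
        return -1;
    return fflush(log) == 0 ? 0 : -1;
}
```

For the SIGINT requirement, one common approach (again, only a suggestion) is to install a handler that sets a flag of type volatile sig_atomic_t, and to have the main loop write the final log entry when it notices the flag.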

Add a command line option to fe that specifies a filename from which it reads commands. Echo commands to the screen as they are read from the file. If end of file is reached without a "q" or "exit" command, the program should do an automatic "q" command. (If you prefer, you may make the program prompt for additional commands from the user if it reaches the end of the command file.) Use "-f" to indicate the presence of a filename on the command line. If there is no command file, fe runs interactively as usual.

Step 4 -- Add and Change messages

Now the client and the server both send messages to each other. An exchange always begins with the client sending a request to the server and the server sending a reply to the client. Requests and replies always alternate. A request may consist of a sequence of siab messages, but for this step a reply always consists of exactly one siab message.

Now fe prompts the user to enter a command instead of just a data value. Each command consists of a sequence of tokens separated by spaces. (Use strtok() to parse command lines.) The first token is a command name, which will be one of the following words:

Command	Meaning
q	Exit fe.
exit	Send a terminate message to be and exit fe.
add	Add a node to the property list. There will be two more tokens on the command line. The first is the name of the property node, and the second is the value of the node, with the data type implied by the format of that token, as in Step 2. In the case of text string values, the value may consist of an arbitrary number of tokens.
change	Change the value of a node that is already in the property list. The command line has the same syntax as add commands.
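
Parsing these commands with strtok() might look like the sketch below. The struct command and the function parse_command are hypothetical names. The sketch assumes property names are single tokens, and it takes the rest of the line as the value so that text values may contain spaces.

```c
#include <string.h>

struct command {
    char *name;    /* "q", "exit", "add", or "change" */
    char *prop;    /* property name, or NULL for q and exit */
    char *value;   /* remainder of the line, or NULL for q and exit */
};

/*
 * Split a command line in place (strtok() overwrites delimiters with NULs).
 * Returns 0 on success, -1 for an empty line or missing arguments.
 */
int parse_command(char *line, struct command *cmd)
{
    cmd->name = strtok(line, " \n");
    cmd->prop = NULL;
    cmd->value = NULL;
    if (cmd->name == NULL)
        return -1;
    if (strcmp(cmd->name, "add") != 0 && strcmp(cmd->name, "change") != 0)
        return 0;                       /* no further tokens expected */

    cmd->prop = strtok(NULL, " \n");
    /* Only newline is a delimiter now, so the value keeps its spaces. */
    cmd->value = strtok(NULL, "\n");
    if (cmd->prop == NULL || cmd->value == NULL)
        return -1;
    cmd->value += strspn(cmd->value, " ");   /* skip extra leading spaces */
    return *cmd->value != '\0' ? 0 : -1;
}
```

Given the line "add color dark blue", this yields the name "add", the property "color", and the value "dark blue". The value would then be classified as text, integer, or floating-point string as in Step 2.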

For both the add and change commands, fe will send two siab messages to the server. The first message will have a message type of 2 for "add" or 3 for "change"; the last word of the header will tell the length of the property name, and the name itself will be sent immediately after the header. The second message will be type 1 ("data"), and will send the property value as described for Step 2.

The server will send a single reply message after receiving a complete add or change request. The message will be type 4 ("positive acknowledge") if the server was able to comply with the request or type 5 ("negative acknowledge") if the server was unable to comply with the request. The server will reply with a positive acknowledgement unless (1) The request was add, but the {<name><type>} tuple already exists in the property list or (2) The request was change but the {<name><type>} tuple does not exist in the property list.
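
The two failure conditions reduce to a small decision function. The name reply_type is hypothetical, and answering an unrecognized request with a negative acknowledgement is an assumption of this sketch.

```c
/* The numeric values match the message type codes used in this step. */
enum { MSG_ADD = 2, MSG_CHANGE = 3, MSG_ACK = 4, MSG_NAK = 5 };

/*
 * Choose the reply for an add or change request, given whether the
 * {<name><type>} tuple is already in the property list.
 */
int reply_type(int request_type, int tuple_exists)
{
    if (request_type == MSG_ADD)
        return tuple_exists ? MSG_NAK : MSG_ACK;   /* add needs a new tuple */
    if (request_type == MSG_CHANGE)
        return tuple_exists ? MSG_ACK : MSG_NAK;   /* change needs an old one */
    return MSG_NAK;   /* assumption: reject anything else */
}
```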

The fe program does not prompt for a new command until it receives a reply from the server. If the reply is a positive acknowledgement, fe just prompts for the next command as usual. However, if the reply is a negative acknowledgement, fe displays an error message before issuing the next prompt.

Note: Now that both programs need to read siab messages, write a single function that will read a complete siab message from a socket. Put the code for this function in a separate source file, put the function prototype for the function into a header file, and link both fe and be to the common object file. Of course, you may define other utility functions and put them in your separately-compiled module also. There is a header file in ~vickery/CS-701/siab_utils.h that you may use as a starting point for your own header file, if you like.
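
A shared message-reading function might be organized as in the following sketch. Everything named here (siab_message, siab_read_message, SIAB_MAX_PAYLOAD) is hypothetical and is not taken from siab_utils.h. The sketch uses the Message Length field to decide how many bytes follow the header, and the payload size limit is an arbitrary safety check.

```c
#include <arpa/inet.h>  /* ntohl */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define SIAB_HEADER_SIZE 16u
#define SIAB_MAX_PAYLOAD 65536u   /* arbitrary sanity limit (assumption) */

struct siab_message {
    uint32_t length, msg_type, data_type;
    int32_t value_or_length;
    unsigned char *payload;   /* bytes after the header, or NULL */
    size_t payload_size;
};

/* Read up to count bytes, stopping early only at end of file. */
static long read_all(int fd, void *buf, size_t count)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < count) {
        ssize_t n = read(fd, p + got, count - got);
        if (n == 0)
            break;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (long)got;
}

/* Returns 1 if a message was read, 0 on clean end of file, -1 on error. */
int siab_read_message(int fd, struct siab_message *msg)
{
    uint32_t words[4];
    long got = read_all(fd, words, sizeof words);

    if (got == 0)
        return 0;
    if (got != (long)sizeof words)
        return -1;
    msg->length = ntohl(words[0]);
    msg->msg_type = ntohl(words[1]);
    msg->data_type = ntohl(words[2]);
    msg->value_or_length = (int32_t)ntohl(words[3]);
    msg->payload = NULL;
    msg->payload_size = 0;
    if (msg->length < SIAB_HEADER_SIZE ||
        msg->length - SIAB_HEADER_SIZE > SIAB_MAX_PAYLOAD)
        return -1;
    msg->payload_size = msg->length - SIAB_HEADER_SIZE;
    if (msg->payload_size > 0) {
        msg->payload = malloc(msg->payload_size + 1);
        if (msg->payload == NULL)
            return -1;
        if (read_all(fd, msg->payload, msg->payload_size)
                != (long)msg->payload_size) {
            free(msg->payload);
            msg->payload = NULL;
            return -1;
        }
        msg->payload[msg->payload_size] = '\0';  /* handy for strings */
    }
    return 1;
}
```

The caller owns msg->payload and should free() it after use. In a real project the struct and the prototype would go in the shared header file, and the function bodies in the separately compiled source file.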

Step 5 -- Get messages

Now add query commands to fe. The new commands and their meanings are as follows:

Command	Meaning
get value	There will be two more tokens on the command line, the first is a property name, and the second is a property type, which will be one of the words, "string," "integer," or "float." fe responds with a message giving the corresponding value or an error message indicating that the specified tuple does not exist in the property list.
get types	There is one more token on the command line, a property name. fe displays a list of types defined for that name in the property list, or an indication that the specified name does not exist in the list at all.
get names	There are no more tokens on the command line. fe displays a list of all property names defined in the property list or a message saying that the list is empty.

For "get value" commands, fe sends a siab message with message type equal to 6, data type of 1, 2, or 3 depending on the type specified by the user, and the rest of the message consists of the name length in the last word of the header and the name string itself following the message header. The server replies with either a message type of 1 (data) or 5 (negative acknowledgement).

For "get types" commands, fe sends a siab message with message type equal to 7, data type of one (text), and the rest of the message consists of the name length in the last word of the header and the name string itself following the header. The server replies with a message type of 4 (Positive Acknowledgement) message with a data type code of 4 (type list) and a count of the number of types in the last word of the header. The message header is followed by that number of additional integers containing codes for the types defined for the property name. The integer codes are the same as always: 1 for text, 2 for integer, and 3 for floating-point.
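
As an illustration of the "get types" reply, the sketch below builds the complete acknowledgement in one buffer. The function name build_type_list_ack is hypothetical. The sketch assumes that each type code in the list is sent as a 4-byte integer in network byte order, like the header fields.

```c
#include <arpa/inet.h>  /* htonl */
#include <stdint.h>
#include <stdlib.h>

enum { MSG_ACK = 4, TYPE_LIST = 4 };

/*
 * Build a positive acknowledgement that carries count type codes.
 * Returns a malloc'd buffer (the caller frees it) and sets *out_len to
 * its size in bytes. Returns NULL if malloc fails.
 */
unsigned char *build_type_list_ack(const int *types, uint32_t count,
                                   size_t *out_len)
{
    size_t total = (4 + (size_t)count) * sizeof(uint32_t);
    uint32_t *words = malloc(total);
    uint32_t i;

    if (words == NULL)
        return NULL;
    words[0] = htonl((uint32_t)total);   /* entire message, header included */
    words[1] = htonl(MSG_ACK);
    words[2] = htonl(TYPE_LIST);
    words[3] = htonl(count);             /* number of codes that follow */
    for (i = 0; i < count; i++)
        words[4 + i] = htonl((uint32_t)types[i]);
    *out_len = total;
    return (unsigned char *)words;
}
```

For a name defined as both text and floating-point, the list is {1, 3}, and the message is 24 bytes long: 16 for the header plus 4 for each code.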

For "get names" commands, fe sends a siab message with message type equal to 8, and the rest of the header must contain zeros. The server replies with a message of type 4 (positive acknowledgement). The last word of the header of this reply tells how many different property names there are. The server then sends a type 1 (data value) message for each different name that occurs in the property list. This list of names should be sent in alphabetical order, but that is not required. If the list is empty, the acknowledgement message will have zero in the last word of its header, and there will be no data value messages after it.
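
Because a name may appear in the property list once for each of its types, the server has to reduce the list to distinct names before replying. One way to do that, sketched here with the hypothetical function distinct_sorted_names, is to sort an array of name pointers with qsort() and then drop adjacent duplicates.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparison function for an array of C strings. */
static int compare_names(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcmp(*sa, *sb);
}

/*
 * Sort names alphabetically in place and remove duplicates.
 * Returns the number of distinct names, which is the count to put in
 * the last word of the acknowledgement header.
 */
size_t distinct_sorted_names(const char **names, size_t count)
{
    size_t i;
    size_t kept = 0;

    qsort(names, count, sizeof names[0], compare_names);
    for (i = 0; i < count; i++) {
        if (kept == 0 || strcmp(names[kept - 1], names[i]) != 0)
            names[kept++] = names[i];
    }
    return kept;
}
```

The first kept entries of the array are then the names to send, one data value message each, in alphabetical order.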

Step 6 -- Curses User Interface

Use the curses package to provide a full-screen user interface to fe instead of the command-line version. The screen shows all property names in alphabetical order in one column. The next two columns give the types and values for that name. The user can use the arrow keys and <tab> key to scroll through the list and to move to different columns, and can edit the list by typing in new values.

Step 7 -- Concurrent Server

This step is optional!

Change from an iterative server to a concurrent server so that two users can view and edit the property list simultaneously without interfering with each other. You will have to introduce a mechanism so that all clients are notified when the list is changed, and you will have to ensure that changes made by one user do not conflict with changes made by another user at the same time.

Ideas for Other Features

Support quoted strings for property names and for text values.
Implement support for deleting nodes from the property list.
Add a command for saving the property list to a file. A command line option would tell the server to load the property list from a file when it starts up.
Add more data types. Money and dates are good candidates.
Timestamp changes to the property list, and let users query creation and last modification times.
Make the property list hierarchical. Each list has a key (such as an IP address(?)) and users can query/edit different lists.
Distribute property lists across the network. When fe starts up it sends out a datagram asking what servers are available, and lets the user select which one is to be the target of each command.

Reference Tables

Data Type Codes

All messages include a Data Type Code field in the header. Here is a table of Data Type Code values, mnemonic names for the codes (that are used in the next table, but which do not need to be used in your code), and their meanings.

Data Type Code	Mnemonic	Meaning
0	UNDEF	Use this value in messages that do not use the Data Type Code for anything else.
1	TEXT	The message is followed by a text string. The length of the text string, in bytes, is in the "Value or Length" field of the message.
2	INTEGER	The "Value or Length" field of the message contains a two's complement integer in network byte order.
3	FLOAT	The message is followed by a text string that represents a floating-point number. The length of the string, in bytes, is in the "Value or Length" field of the message.
4	TYPE_LIST	The message is followed by a list of integers. The number of integers in the list is in the "Value or Length" field of the message. If the Value or Length field contains zero, no integers follow the message header.

Message Types

Here is a summary of all the types of messages. All messages start with a "message header" consisting of:

Message Length. A 4-byte integer in network byte order giving the total number of bytes in the message, including any bytes that follow the message header.
Message Type. A 4-byte integer in network byte order that is a code for the type of message. The table below lists the Message Type Codes and their meanings.
Data Type. A 4-byte integer in network byte order that is a code for the type of data that is conveyed with the message. The table above lists the Data Type Codes and their meanings. Some message types do not use this field, in which case it is zero.
Value or Length. A 4-byte integer in network byte order that contains either an integer value or the length of a string or list that follows the message header. Some message types do not use this field, in which case it is zero.

The message header may or may not be followed by additional information, depending on the Message Type and Data Type values. See the table below for details.

Msg Type	Data Type	Value or Length	Meaning
0	zero	zero	Terminate Server. Message from client to server telling it to shut down.
1	type code	value or length	Data Value. The data type code is one of the values from the previous table, and the remainder of the message depends on the data type code.
2	TEXT	length of name	Add Node. Message from client to server telling it to add a node to the property list if possible. This message is followed immediately by the property name, then by a Data Value message giving the value for the new node. The {<name><data type>} tuple must not exist in the property list for this message to succeed.
3	TEXT	length of name	Change Value. Message from client to server telling it to change the value of a node. The property name follows this message, and the next message gives the new value. The {<name><data type>} tuple must already exist in the property list for this message to succeed.
4	type code	value or length	Ack (Positive Acknowledgement). Sent by server to client if it is able to process an Add or Change request successfully or to reply to any of the "Get" requests (see below). Ack following an "Add" or "Change" request will have the data type code and value or length fields equal to zero. Ack following a "Get Value" message will have the data type code and value or length set the same as for a Message Type 1 (Data Value) message, and the message header will be followed by a string if the type code is TEXT or FLOAT. Ack following a "Get Types" message will have the data type code set to TYPE_LIST and the Value or Length field set to the number of integer type codes that follow immediately after the message header. The Value or Length field will be zero in this message if the Get Types message specified a name that does not exist in the list. Ack following a "Get Names" message will have the data type code set to zero and the Value or Length field set to the number of Data Value messages that will be sent immediately after this message. Each of those Data Value messages will specify a property name as a TEXT item. The value or length field of this Ack will be zero if the list is empty.
5	zero	zero	Nak (Negative Acknowledgement). Sent by server to client if it is not able to process an Add, Change, or any of the "Get" requests successfully.
6	type code	name length	Get Value. Message from client to server asking for the value of a {<name><type>} tuple. The Data Type field specifies the property type, and the property name is sent immediately after the header with its length in the Value or Length field.
7	TEXT	length	Get Types. Message from client to server asking for a list of types that are defined for a particular property name. The length of the property name is in the Value or Length field, and the property name itself follows the header. The server replies with an Ack message that tells the number of types defined for the given name in its Value or Length field, followed immediately by that many integers, each of which contains a type code. If the name does not occur in the property list, the Value or Length field of the Ack message will contain zero.
8	zero	zero	Get Names. Message from client to server asking for a list of the different property names in the property list. The server replies with an Ack message that tells how many names there are in its Value or Length field, followed by a Value message with a Data Type of TEXT for each of the names. (If the list is empty, the Ack will indicate zero names, and no Value messages will be sent.)
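
The two tables translate directly into C enumerations. The identifiers below are hypothetical (the mnemonics in the tables do not have to be used in your code), and siab_msg_name() is only a convenience for log messages.

```c
/* Data Type Codes, in the order of the first reference table. */
enum siab_data_type {
    SIAB_UNDEF = 0,
    SIAB_TEXT = 1,
    SIAB_INTEGER = 2,
    SIAB_FLOAT = 3,
    SIAB_TYPE_LIST = 4
};

/* Message Type Codes, in the order of the second reference table. */
enum siab_msg_type {
    SIAB_TERMINATE = 0,
    SIAB_DATA_VALUE = 1,
    SIAB_ADD = 2,
    SIAB_CHANGE = 3,
    SIAB_ACK = 4,
    SIAB_NAK = 5,
    SIAB_GET_VALUE = 6,
    SIAB_GET_TYPES = 7,
    SIAB_GET_NAMES = 8
};

/* Return a printable name for a message type code. */
const char *siab_msg_name(int msg_type)
{
    static const char *const names[] = {
        "Terminate", "Data Value", "Add Node", "Change Value",
        "Ack", "Nak", "Get Value", "Get Types", "Get Names"
    };

    if (msg_type < SIAB_TERMINATE || msg_type > SIAB_GET_NAMES)
        return "Unknown";
    return names[msg_type];
}
```

Putting definitions like these in the shared header file keeps fe and be in agreement about the numeric codes.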

Christopher Vickery
Queens College of CUNY