The Ouch Shell

Version 2.x

Introduction

In this assignment you will continue developing the Unix shell program you began in Assignment 3. You will build the general structure for an interactive shell, extending the -c option you wrote for Assignment 3. Future assignments will add successively more features to the interactive shell.

Project Management

I assume that all files you wrote for Assignment 3 had RCS version numbers in the range 1.1 to 1.x, where x might vary from one file to another. If that is not the case, you will need to create a new project directory and check in the final version of each Assignment 3 file as the initial versions of the files for this assignment. If you need to do this, be sure to edit out previous $Id$ and $Log$ sections of the files before checking them in to the new RCS subdirectory. But if none of your files have version numbers 2.1 or above, you don't have to worry about this: Just use the same project directory you used for Assignment 3.

All files for this assignment are to have version numbers in the range 2.1 through 2.x. Check out each file from Assignment 3 for editing and, without making any changes, check it in again as version 2.1. For example, you would use the following commands for Makefile:

    % co -l Makefile
    % ci -f -r2.1 Makefile
    > No changes.  Starting work on Assignment 4.
    .
    %
  
The -r option specifies the revision number you want to use, and the -f option forces the check in to succeed even though the file hasn't been changed. Normally, ci won't let you check in a file if you haven't actually changed anything in it.

Future assignments will use version numbers 3.x, 4.x, etc.

Multiple Source Modules

Starting with this assignment, you are required to work with multiple source modules. There is to be a common header file named ouch.h, which will be included in all other source modules for the project. Put all #include directives for system header files (stdio.h, etc.) in this file, along with the function prototypes and type definitions for the project. Be sure to code this header file so it cannot accidentally be included recursively.
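One common way to protect a header against repeated or recursive inclusion is an include guard. The following is only a sketch of what ouch.h might look like. The guard name OUCH_H and the constant OUCH_MAX_LINE are assumptions made for illustration, not names required by the assignment.

```cpp
// ouch.h -- common header for the Ouch shell (sketch only)
#ifndef OUCH_H          // assumed guard name; any unique macro name works
#define OUCH_H

// System headers needed by the project's source modules
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical constant: size of the command line buffer used by main()
const int OUCH_MAX_LINE = 1024;

// Types are only forward-declared here to keep the sketch short
struct token_t;
struct subcommand_t;

// Function prototypes for the project
int           getCommandLine( char *buf, int max );
subcommand_t *getSubcommands( char *buf );
token_t      *tokenize( const subcommand_t *cmd );
char         *expandToken( const char *token );
int           executeCommand( subcommand_t *cmd );

#endif  // OUCH_H
```

The second time the preprocessor sees this file in one compilation, OUCH_H is already defined, so everything between #ifndef and #endif is skipped.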

The main() function is to be defined in a file named ouch.cc. For this assignment you will be coding the following functions, each of which is to be in its own .cc file:

 
    Function          File Name       Purpose
    getCommandLine()  cmd_line.cc     Prints a prompt string and reads an entire command line.
    getSubcommands()  subcmds.cc      Breaks a command line into sub-commands.
    tokenize()        tokenize.cc     Breaks a subcommand into tokens.
    expandToken()     expand.cc       Does variable substitutions and pattern matching for a token.
    executeCommand()  exec_single.cc  Executes a single command.

Requirements for Version 2.x

Each function description starts with the function prototype for the function.

getCommandLine()

    int getCommandLine( char *buf, int max );
If the value of the environment variable "PS1" is set to anything, use that as the prompt string to display before reading the command line from the user. If PS1 is not set, print "no-prompt> " as the prompt string.

Use fgets() to read a command line from the user into buf. If a command line ends with a backslash (\), read another command line and append it to whatever you have read so far. When reading continuation lines, use the value of "PS2" as the prompt string if it is set, and use the string "no-prompt>> " if PS2 is not set. Never store more than max bytes into buf, including the end-of-string '\0'.

This function returns 0 if it is successful. It should never fail, but would return -1 if it did.
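The rules above can be sketched as follows. This is one possible shape, not a required design. The helper readCommandLine() is hypothetical; it takes explicit streams so the logic can be tried without a terminal. Three more assumptions are not stated in the requirements: the trailing newline is dropped, the continuation backslash is dropped, and end of input before any line is read is reported as failure.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper with the same contract as getCommandLine(),
// but reading from 'in' and prompting on 'out'.
int readCommandLine( FILE *in, FILE *out, char *buf, int max )
{
  if ( buf == 0 || max < 1 )
    return -1;

  const char *ps1 = getenv( "PS1" );
  const char *ps2 = getenv( "PS2" );
  const char *prompt = ps1 ? ps1 : "no-prompt> ";
  int used = 0;              // bytes stored so far, not counting the '\0'
  bool gotLine = false;      // true once at least one line has been read
  buf[ 0 ] = '\0';

  while ( max - used >= 2 )  // room for at least one more character
  {
    fputs( prompt, out );
    fflush( out );

    // fgets() stores at most (max - used) bytes, including the '\0'
    if ( fgets( buf + used, max - used, in ) == 0 )
      break;                 // end of input or read error
    gotLine = true;
    used += (int) strlen( buf + used );

    // Assumption: the newline is not part of the command line
    if ( used > 0 && buf[ used - 1 ] == '\n' )
      buf[ --used ] = '\0';

    // Stop unless the line ends with a backslash
    if ( used == 0 || buf[ used - 1 ] != '\\' )
      break;
    buf[ --used ] = '\0';    // assumption: drop the backslash itself
    prompt = ps2 ? ps2 : "no-prompt>> ";
  }
  return gotLine ? 0 : -1;
}

int getCommandLine( char *buf, int max )
{
  return readCommandLine( stdin, stdout, buf, max );
}
```

Passing max - used to fgets() on every read is what keeps the total, including the final '\0', within max bytes.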

getSubcommands()

    subcommand_t* getSubcommands( char *buf );
This function receives a complete command line (the one that getCommandLine() stored in buf), and returns a pointer to the head of a linked list of structures. The structure type includes a char* pointing to a complete command, and a char that contains the character that terminated the command. It is defined like this:
      struct subcommand_t
      {
        char          *cmdString;
        char          termChar;
        token_t       *tokens;
        subcommand_t  *next;
      };
    
The tokens field of this structure will be filled in with the result of a subsequent call to tokenize().

The terminating character will be removed from the command string by this function, and returned in the termChar field of the subcommand_t structure. Possible terminating characters are ';', '&', '|', '\0', and possibly other characters to be added in future versions of the shell. (For example, ksh lets you type a subcommand inside parentheses, which makes the command run in a "subshell." The terminating character would be ')' in that case.)

You need to code this function so it does not get fooled by terminating characters that appear inside strings or which are escaped by backslashes. For example, the following is all one subcommand, terminated by '\0':

      echo 'Semicolons (;) & pipes (|)' \& " ||| " are nice.
  
There are three quote characters this function should recognize: single ('), double ("), and backquote (`). The function does not have to recognize nested quotes, like "The character '!' is nice."

The memory for the linked list will have to be allocated dynamically, but the cmdString pointers may be pointers into the buf parameter. That is, this function may modify buf, much the way strtok() modifies the strings it tokenizes.
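Here is a minimal sketch of the scanning described above, to show how quotes and backslashes can be tracked while looking for terminators. It assumes that blank subcommands (for example, the empty text between the two characters of "||") are simply skipped. The requirements do not say how to treat them.

```cpp
#include <stdlib.h>
#include <string.h>

struct token_t;               // defined elsewhere in the project

struct subcommand_t
{
  char          *cmdString;
  char          termChar;
  token_t       *tokens;
  subcommand_t  *next;
};

subcommand_t *getSubcommands( char *buf )
{
  subcommand_t *head = 0, *tail = 0;
  char *start = buf;          // beginning of the current subcommand
  char quote = '\0';          // the open quote character, if any

  for ( char *p = buf; ; p++ )
  {
    char c = *p;
    if ( c != '\0' )
    {
      if ( quote )                        // inside a string: look only for its end
      {
        if ( c == quote ) quote = '\0';
        continue;
      }
      if ( c == '\\' && p[ 1 ] != '\0' )  // escaped character: skip it
      {
        p++;
        continue;
      }
      if ( c == '\'' || c == '"' || c == '`' )
      {
        quote = c;
        continue;
      }
      if ( c != ';' && c != '&' && c != '|' )
        continue;                         // ordinary character
    }

    // c terminates a subcommand; overwrite it, as strtok() would
    *p = '\0';
    if ( start[ strspn( start, " \t\n" ) ] != '\0' )  // assumption: skip blanks
    {
      subcommand_t *node = (subcommand_t *) malloc( sizeof( subcommand_t ) );
      if ( node == 0 )
        break;                            // out of memory: return what we have
      node->cmdString = start;
      node->termChar  = c;
      node->tokens    = 0;
      node->next      = 0;
      if ( tail ) tail->next = node; else head = node;
      tail = node;
    }
    if ( c == '\0' )
      break;
    start = p + 1;
  }
  return head;
}
```

Because cmdString points into buf, the caller must not reuse or overwrite buf until it has finished with the list.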

tokenize()

  token_t *tokenize( const subcommand_t *cmd );
This function returns the head of a linked list of token_t structures defined as follows:
    struct token_t
    {
      char          *token;
      token_tag_t   tag;
      token_t       *next;
    };
The tag for each token will be taken from the following list of enumerated values:
 
    Value               Meaning
    T_default           A token that is neither a string nor a redirection operator.
    T_double_quote      A token that was enclosed in double quotes ("). The double quotes are not part of the token.
    T_single_quote      A token that was enclosed in single quotes ('). The single quotes are not part of the token.
    T_back_quote        A token that was enclosed in back quotes (`). The back quotes are not part of the token.
    T_redirect_in       A token for input redirection (<).
    T_redirect_out      A token for output redirection (>).
    T_redirect_append   A token for output redirection with the append option (>>).
    T_redirect_clobber  A token for output redirection with the "clobber" option (>!).

This function does not modify the string passed to it because the character that terminates one token might be the beginning of the next one, so there is no easy way to insert '\0' characters to break up the original string. Rather, the memory for the tokens has to be allocated dynamically.

Note that strings enclose tokens. The echo command listed above contains six tokens:

  1. "echo"
  2. "Semicolons (;) & pipes (|)"
  3. "&"
  4. " ||| "
  5. "are"
  6. "nice."
The quotes in the list above are not part of the tokens; they are there to show you where there are spaces inside tokens. This function should recognize the I/O redirection operators as tokens even if they are not separated from other tokens by whitespace.

Although you can't use this code directly, the function strtok2() might give you some ideas how to work on the design of this function.
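The sketch below shows one way the tokenizing rules could fit together. It is a starting point under stated assumptions, not the required design. The helper appendToken() is hypothetical. The sketch also assumes that a quoted string always forms a token by itself, and that a backslash in an unquoted word is dropped and the character after it is copied literally.

```cpp
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum token_tag_t
{
  T_default, T_double_quote, T_single_quote, T_back_quote,
  T_redirect_in, T_redirect_out, T_redirect_append, T_redirect_clobber
};

struct token_t { char *token; token_tag_t tag; token_t *next; };
struct subcommand_t { char *cmdString; char termChar; token_t *tokens; subcommand_t *next; };

// Hypothetical helper: copy len bytes of text into a new token at the list's tail.
static void appendToken( token_t **head, token_t **tail,
                         const char *text, size_t len, token_tag_t tag )
{
  token_t *t = (token_t *) malloc( sizeof( token_t ) );
  char *copy = (char *) malloc( len + 1 );
  if ( t == 0 || copy == 0 ) { free( t ); free( copy ); return; }
  memcpy( copy, text, len );
  copy[ len ] = '\0';
  t->token = copy;  t->tag = tag;  t->next = 0;
  if ( *tail ) ( *tail )->next = t; else *head = t;
  *tail = t;
}

token_t *tokenize( const subcommand_t *cmd )
{
  token_t *head = 0, *tail = 0;
  const char *p = cmd->cmdString;

  while ( *p )
  {
    if ( isspace( (unsigned char) *p ) ) { p++; continue; }

    if ( *p == '"' || *p == '\'' || *p == '`' )     // quoted string
    {
      char q = *p++;
      const char *end = strchr( p, q );
      if ( end == 0 ) end = p + strlen( p );        // unterminated: take the rest
      appendToken( &head, &tail, p, end - p,
                   q == '"' ? T_double_quote
                            : q == '\'' ? T_single_quote : T_back_quote );
      p = *end ? end + 1 : end;
    }
    else if ( *p == '<' )
    {
      appendToken( &head, &tail, p++, 1, T_redirect_in );
    }
    else if ( *p == '>' )                           // >, >> or >!
    {
      token_tag_t tag = p[ 1 ] == '>' ? T_redirect_append
                      : p[ 1 ] == '!' ? T_redirect_clobber : T_redirect_out;
      size_t len = ( tag == T_redirect_out ) ? 1 : 2;
      appendToken( &head, &tail, p, len, tag );
      p += len;
    }
    else                                            // plain word
    {
      char *word = (char *) malloc( strlen( p ) + 1 );
      if ( word == 0 ) break;
      size_t n = 0;
      while ( *p && !isspace( (unsigned char) *p ) && !strchr( "\"'`<>", *p ) )
      {
        if ( *p == '\\' && p[ 1 ] ) p++;            // backslash escapes next char
        word[ n++ ] = *p++;
      }
      appendToken( &head, &tail, word, n, T_default );
      free( word );
    }
  }
  return head;
}
```

With this sketch, the command "sort<in>>out" yields five tokens: sort, <, in, >>, and out.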

expandToken()

    char *expandToken( const char *token );
This function is called by the main program for each token that is of type T_default or T_double_quote. If the token is the string "$?", this function substitutes a string representing the previous command's result code for the token. Otherwise, this function returns the same pointer that was passed to it. Since the memory for tokens is allocated dynamically, this function must free the pointer passed to it if it does perform a substitution.
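A minimal sketch of this behavior follows. The global variable lastResult is an assumption: the requirements do not say where the previous command's result code is kept, so the sketch supposes the main program stores it there. The sketch also assumes tokens were allocated with malloc(), so that free() is the matching release.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Assumption: the main program saves each command's result code here.
int lastResult = 0;

char *expandToken( const char *token )
{
  // Anything other than "$?" comes back unchanged, as the same pointer
  if ( strcmp( token, "$?" ) != 0 )
    return const_cast<char *>( token );

  // Format the result code as text; 16 bytes is ample for an int
  char text[ 16 ];
  snprintf( text, sizeof( text ), "%d", lastResult );

  char *expanded = (char *) malloc( strlen( text ) + 1 );
  if ( expanded == 0 )
    return const_cast<char *>( token );   // out of memory: leave token alone
  strcpy( expanded, text );

  // The old token was allocated dynamically, so release it
  free( const_cast<char *>( token ) );
  return expanded;
}
```

A caller would typically write something like tok->token = expandToken( tok->token ); so the token list always holds the current pointer.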

executeCommand()

    int executeCommand( subcommand_t *cmd );
This function executes a single command, which might be either a builtin command or an external command. It returns the "result code" for the command, which is the exit code for external commands, or the return code for builtin commands. If the command is neither an external command nor a builtin command, this function writes an error message to stderr and returns the value 1.
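For the external-command case, a sketch using fork(), execvp(), and waitpid() might look like the following. It is deliberately incomplete: builtin commands are not handled, because the requirements do not list any. It also makes one simplifying assumption: when execvp() fails, the child process prints the error message and exits with code 1, which the parent then returns.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum token_tag_t
{
  T_default, T_double_quote, T_single_quote, T_back_quote,
  T_redirect_in, T_redirect_out, T_redirect_append, T_redirect_clobber
};

struct token_t { char *token; token_tag_t tag; token_t *next; };
struct subcommand_t { char *cmdString; char termChar; token_t *tokens; subcommand_t *next; };

int executeCommand( subcommand_t *cmd )
{
  // Count the tokens and build a 0-terminated argv array from them
  int argc = 0;
  for ( token_t *t = cmd->tokens; t; t = t->next )
    argc++;
  if ( argc == 0 )
    return 0;                       // nothing to run
  char **argv = (char **) malloc( ( argc + 1 ) * sizeof( char * ) );
  if ( argv == 0 )
    return 1;
  int i = 0;
  for ( token_t *t = cmd->tokens; t; t = t->next )
    argv[ i++ ] = t->token;
  argv[ i ] = 0;

  // (A check for builtin commands would go here.)

  int result = 1;
  pid_t pid = fork();
  if ( pid == 0 )
  {
    // Child: execvp() searches PATH and returns only if it fails
    execvp( argv[ 0 ], argv );
    fprintf( stderr, "ouch: %s: command not found\n", argv[ 0 ] );
    _exit( 1 );
  }
  else if ( pid > 0 )
  {
    // Parent: wait for the child and pick up its exit code
    int status = 0;
    if ( waitpid( pid, &status, 0 ) == pid && WIFEXITED( status ) )
      result = WEXITSTATUS( status );
  }
  else
  {
    perror( "fork" );
  }
  free( argv );
  return result;
}
```

The child calls _exit() rather than exit() so that it does not flush output buffers it inherited from the parent.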

Development Strategy

For Version 2.x of this project, your program is to implement only the following features and no more! You will lose credit if this version of your program does anything more than the following: Recode main() so that something like the following gets executed if the user doesn't type any command line options:

    int result;
    int exitCode = 0;   //  Result code of the most recent command
    char inBuf[ 1024 ];
    while ( true )
    {
      //  Read a command line
      result = getCommandLine( inBuf, sizeof( inBuf ) );
      if ( result != 0 )
      {
        fprintf( stderr, "Error reading command line\n" );
        continue;
      }
      //  strlen() returns a size_t, so cast it to match %d
      fprintf ( stderr, "Result: %d\nStrlen: %d\nCommand: %s\n",
                        result, (int) strlen( inBuf ), inBuf );

      //  Break line into subcommands and tokenize each one
      subcommand_t *cmd_list = getSubcommands( inBuf );
      if ( 0 == cmd_list )
        continue; //  Empty command line
      //  The for loop advances cmd even when a subcommand is skipped
      for ( subcommand_t *cmd = cmd_list; cmd; cmd = cmd->next )
      {
        token_t *token_list = tokenize( cmd );
        if ( 0 == token_list )
          continue; // Not empty, but no tokens
        cmd->tokens = token_list;
        // execute it
        exitCode = executeCommand( cmd );
      }
    }

The main purpose of Version 2.x of the assignment is to be sure your project is set up with the proper source modules and Makefile. Be sure "make clean" leaves your project directory with nothing but the RCS subdirectory in it, be sure that "make" builds the program correctly after you run "make clean", be sure "make" recompiles only those files you have edited since the last make, and be sure that the sequence "touch ouch.h ; make" recompiles everything before linking.

Be sure to test your program carefully to be sure it works. In particular, be sure nothing happens in interactive mode if the user types an empty command line. Be sure prompt strings are handled correctly; you will probably find that the PS1 and PS2 variables are already set up if you log into a Linux account, and the prompt printed for PS1 will not be "pretty." But on forbin you will have to set up your own values for PS1 and PS2.

Due Date and Grading Criteria

The due date for this assignment is midnight Thursday, December 6, but no late points will be deducted for assignments submitted as late as midnight December 13. This assignment will not be accepted after December 13.

The assignment will be graded on a 5-point scale, with each of the following counting approximately equally:

Note: Your program may implement more features than specified here, and that is okay. But only the features listed here will be examined for this assignment.