THE SIMPLEDB DATABASE SYSTEM
                  General Information and Instructions


This document contains the following sections:
    * Release Notes
    * Server Installation
    * Running the Server
    * Running Client Programs
    * SimpleDB Limitations
    * The Organization of the Server Code


I. Release Notes:

  This release of the SimpleDB system is Version 2.10, which was
  uploaded on January 16, 2013.  This release provides the following
  fixes to Version 2.10:

    * The files simpledb.Startup and remote.SimpleDriver have been changed 
      to use a server-specific registry, instead of forcing the user to 
      run rmiregistry as a separate process.
    * The files ConnectionAdapter, DriverAdapter, StatementAdapter, and 
      ResultSetAdapter in simpledb.remote have been changed to handle
      the new Java 7 JDBC methods.
    * A bug was fixed in the file SortScan.java.
    * The new client file StudentMajorNoServer was added.

  SimpleDB is distributed in a WinZip-formatted file. This file contains
  four items:

    * The folder simpledb, which contains the server-side Java code.
    * The folder javadoc, which contains the JavaDoc documentation 
      of the above code.
    * The folder studentClient, which contains some client-side code 
      for an example database.
    * This document.

  The author welcomes all comments, including bug reports, suggestions
  for improvement, and anectodal experiences.  His email address is 
  sciore@bc.edu
  

II. Installation Instructions:

  1)  Install the Java SDK, level 1.5 or higher.

  2)  If you do install Java 1.5, you need to make some minor changes 
      to the package simpledb.remote:
    
      * The classes named xxxAdapter provide default implementations of 
        the interfaces in java.sql. Java 1.6 added several extra methods 
        to these interfaces. If you are using Java 1.5, just comment out 
        those methods. (You can tell which ones they are because you'll 
        get an error when you try to compile them.)
      
      * The classes named SimpleXXX call the SQLException constructor 
        with a Throwable argument.  This constructor is new to version
        1.6. To use in 1.5, rewrite the code "throw new SQLException(e)"
        to be "throw new SQLException(e.getMessage())".

  3)  Decide where you want the server-side software to go. Let's assume 
      that the code will go in the folder C:\javalib in Windows, or the 
      folder ~/javalib in UNIX or MacOS.

  4)  Add that folder to your classpath. In other words, the javalib 
      folder must be mentioned in your CLASSPATH environment variable.
    
      * In UNIX, your home directory has an initialization file, 
        typically called .bashrc.  If the file does not set CLASSPATH,
        add the following line to the file:  
               CLASSPATH =.:~/javalib     
               
        Here, the ‘:’ character separates folder names.  The command 
        therefore says that the folder "." (i.e., your current diretory) 
        and "~/javalib" are to be searched whenever Java needs to find a 
        class.  If the file already contains a CLASSPATH setting, modify 
        it to include the javalib directory.
 
      * In Windows, you must set the CLASSPATH variable via the System 
        control panel.  From that control panel, choose the advanced tab 
        and click on the environment variables button.  You want to have 
        a user variable named CLASSPATH that looks like this:
               .;C:\javalib
               
        Here, the ‘;’ character separates the two folder names.

  5)  Copy the simpledb folder from the distribution file to that
      folder. Within the simpledb folder should be subfolders 
      containing all of the code for SimpleDB.


III. Running the Server:

  SimpleDB has a client-server architecture. You run the server code on 
  a host machine, where it will sit and wait for connections from clients.
  It is able to handle multiple simultaneous requests from clients, 
  each on possibly different machines. You can then run a client program
  from any machine that is able to connect to the host machine.

  To run the SimpleDB server, run Java on the simpledb.server.Startup class.  
  You must pass in the name of a folder that SimpleDB will use to hold the 
  database. For example in Windows, if you execute the command:
      
         > start java simpledb.server.Startup studentdb
             
  then the server will run in a new window, using studentdb as the
  database folder.  You can execute this command from any directory;
  the server will always use the studentdb folder that exists in your
  home directory.  If a folder with that name does not exist, then
  one will be created automatically.
 
  If everything is working correctly, when you run the server with a
  new database folder the following will be printed in the server’s 
  window:

      creating new database
      new transaction: 1
      transaction 1 committed
      database server ready

  If you run the server with an existing database folder, the following
  will be printed instead:

      recovering existing database
      database server ready

  In either case, the server will then sit awaiting connections from
  clients.  As connections arrive, the server will print additional
  messages in its window.

  The server is implemented using RMI, and requires that an RMI registry
  be running on port 1099. If a registry is running when the server is 
  started, it will use that registry; otherwise, it will run the registry
  itself.


IV. Running Client Programs 

  The SimpleDB server accepts connections from any JDBC client. The client
  program makes its connection via the following code:
            Driver d = new SimpleDriver();
            String host = "mymachine.com"; //any DNS name or IP address
            String url = "jdbc:simpledb://" + host;
            Connection conn = d.connect(url, null);

  Note that SimpleDB does not require a username and password, although
  it is easy enough to modify the server code to do so.

  The driver class SimpleDriver is contained in the package 
  simpledb.remote, along with the other classes that it needs. A client
  program will not run unless this package in its classpath. Note that
  you could install the entire SimpleDB server code on a client machine,
  but that is overkill.  All you need is simpledb.remote.

  The studentClient folder contains client code for a simple university
  student-course database.  The folder contains two subfolders, named 
  simpledb and derby.  The simpledb subfolder contains programs that 
  run with the SimpleDB database server. The derby subfolder is not
  relevant here. (It contains programs for the Derby database server,
  which can be downloaded from db.apache.org. That code is used to
  illustrate some examples from my text "Database Design and
  Implementation", published by John WIley.) 

  The following list briefly describes the SimpleDB clients.

    * CreateStudentDB creates and populates the student database used
      by the other clients.  It therefore must be the first client run 
      on a new database. 
    * StudentMajors prints a table listing the names of students and 
      their majors.
    * FindMajors requires a command-line argument denoting the name of 
      a department.  The program then prints the name and graduation
      year of all students having that major.
    * SQLInterpreter repeatedly prints a prompt asking you to enter a 
      single line of text containing an SQL statement.  The program then 
      executes that statement.  If the statement is a query, the output 
      table is displayed.  If the statement is an update command, then
      the number of affected records is printed.  If the statement is ill
      formed, and error message will be printed.  SimpleDB understands 
      only a limited subset of SQL, which is described below.
    * ChangeMajor changes the student named Amy to be a drama major.  
      It is the only client that updates the database (although you can 
      use SQLInterpreter to run update commands).

  These clients connect to the server at "localhost".  If the client is  
  to be run from a different machine than the server, then its source code 
  must be modified so that localhost is replaced by the domain name (or IP 
  address) of the server machine. 
  
  Unlike the server classes, the client classes are not part of an 
  explicit package, and thus they need to be run from the directory that
  they are stored in. For example, suppose we copy the studentClient 
  folder from the distribution file to our home directory.  In Windows
  we could execute the client programs as follows:

             > cd C:\studentClient\simpledb
             > java CreateStudentDB


V. Running SimpleDB as a Standalone Program

  It is possible to write a program that calls the SimpleDB source code 
  directly, instead of calling server.Startup. The demo program 
  StudentMajorNoServer is an example. In this case, the entire database 
  source code is available to the program. Such programs are very useful
  for testing changes to the source code without having to run the server 
  and a client.  


VI. SimpleDB Limitations

  SimpleDB is a teaching tool. It deliberately implements a tiny subset
  of SQL and JDBC, and (for simplicity) imposes restrictions not present
  in the SQL standard.  Here we briefly indicate these restrictions.


  SimpleDB SQL
  
  A query in SimpleDB consists only of select-from-where clauses in which
  the select clause contains a list of fieldnames (without the AS 
  keyword), and the from clause contains a list of tablenames (without
  range variables).
 
  The where clause is optional.  The only Boolean operator is and.  The
  only comparison operator is equality.  Unlike standard SQL, there are
  no other comparison operators, no other Boolean operators, no arithmetic
  operators or built-in functions, and no parentheses.  Consequently,
  nested queries, aggregation, and computed values are not supported.

  Views can be created, but a view definition can be at most 100 
  characters.
 
  Because there are no range variables and no renaming, all field names in
  a query must be disjoint.  And because there are no group by or order by
  clauses, grouping and sorting are not supported.  Other restrictions:

    * The "*" abbreviation in the select clause is not supported.
    * There are no null values.
    * There are no explicit joins or outer joins in the from clause.
    * The union and except keywords are not supported.
    * Insert statements take explicit values only, not queries.
    * Update statements can have only one assignment in the set clause.


  SimpleDB JDBC
  
  SimpleDB implements only the following JDBC methods:

   Driver

      public Connection connect(String url, Properties prop);
      // The method ignores the contents of variable prop.

   Connection

      public Statement createStatement();
      public void      close();

   Statement

      public ResultSet executeQuery(String qry);
      public int       executeUpdate(String cmd);

   ResultSet

      public boolean   next();
      public int       getInt();
      public String    getString();
      public void      close();
      public ResultSetMetaData getMetaData();

   ResultSetMetaData

      public int        getColumnCount();
      public String     getColumnName(int column);
      public int        getColumnType(int column);
      public int getColumnDisplaySize(int column);


VII. The Organization of the Server Code

  SimpleDB is usable without knowing anything about what the code looks
  like. However, the entire point of the system is to make the code
  easy to read and modify.  The basic packages in SimpleDB are structured
  hierarchically, in the following order:

    * file (Manages OS files as a virtual disk.)
    * log (Manages the log.)
    * buffer (Manages a buffer pool of pages in memory that acts as a
              cache of disk blocks.)
    * tx (Implements transactions at the page level.  Does locking
          and logging.)
    * record (Implements fixed-length records inside of pages.)
    * metadata (Maintains metadata in the system catalog.)
    * query (Implements relational algebra operations.  Each operation 
             has a plan class, used by the planner, and a scan class,
             used at runtime.)
    * parse (Implements the parser.)
    * planner (Implements a naive planner for SQL statements.)
    * remote (Implements the server using RMI.)
    * server (The place where the startup and initialization code live. 
              The class Startup contains the main method.)

  The basic server is exceptionally inefficient.  The following packages
  enable more efficient query processing:

    * index (Implements static hash and btree indexes, as well as 
             extensions to the parser and planner to take advantage
             of them.)
    * materialize (Implements implementations of the relational 
                   operators materialize, sort, groupby, and mergejoin.)
    * multibuffer (Implements modifications to the sort and product 
                   operators, in order to make optimum use of available
                   buffers.)
    * opt (Implements a heuristic query optimizer)
 
   The textbook "Database Design and Implementation" describes these
   packages in considerably more detail. For further information, go
   to the URL www.wiley.com/college/sciore