DataSpread is a spreadsheet-database hybrid system, with a spreadsheet frontend, and a database backend. Thus, DataSpread inherits the flexibility and ease-of-use of spreadsheets, as well as the scalability and power of databases. A paper describing DataSpread's architecture, design decisions, and optimization can be found here. DataSpread is a multi-year project, supported by the National Science Foundation via award number 1633755.
Key design innovations in DataSpread include:
- A flexible hybrid data model to represent spreadsheet data within a database
- Speculative fetching to fetch additional data beyond the user's current spreadsheet window
- Asynchronous formula evaluation, so users do not have to wait for long-running operations to complete
- A navigation panel that lets users explore tabular spreadsheet data and obtain additional details on demand via aggregation operations
See the Wiki for full documentation on APIs, developer environment setup, and other information.
The current version is 0.5.1.
You can directly use DataSpread via our cloud-hosted site (Temporarily offline).
DataSpread can be deployed locally through Docker (recommended) or through Apache Tomcat. To start a new book, import a CSV file or use the provided /sample.csv.
- Docker >= 1.13.0
-
Clone the DataSpread repository and go to the directory in your terminal. Alternatively, you can download the source as a zip or tar.gz.
-
Install Docker. Docker makes it easy to separate applications from underlying infrastructure so setting up and running applications is quick and easy.
-
Start Docker and start the application. It should be accessible at http://localhost:8080/. Stop the application with CTRL+C.
docker-compose up
To rebuild after changing the code, add the --build flag when starting the application.
docker-compose up --build
If there are any errors, or the Docker image needs to be built from scratch, run the following.
docker-compose down
docker-compose build --no-cache
docker-compose up
Data is automatically persisted in a Docker volume across shutdowns. Erase the persisted data by running the following.
docker-compose down -v
Docker Compose uses /docker-compose.yml to start up the application. For more information about how the application is deployed, look at /docker-compose.yml, /Dockerfile, and the files in the /build-db and /build-web folders.
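If the containers are started in the background, the application may take a moment before it answers at http://localhost:8080/. The helper below is a minimal sketch of a retry loop for that situation. The name `wait_for` is hypothetical and not part of DataSpread, and the example in the comments assumes `curl` is installed and that the stack was started with `docker-compose up -d`.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Retries COMMAND until it succeeds or ATTEMPTS runs have failed.
# Hypothetical helper for illustration; not shipped with DataSpread.
wait_for() {
  _attempts=$1
  _delay=$2
  shift 2
  _n=1
  while [ "$_n" -le "$_attempts" ]; do
    # Run the command quietly; success ends the wait immediately.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$_n" -lt "$_attempts" ]; then
      sleep "$_delay"
    fi
    _n=$((_n + 1))
  done
  return 1
}

# Example (assumes curl and a detached start):
#   docker-compose up -d
#   wait_for 30 2 curl -fsS -o /dev/null http://localhost:8080/
```

When the stack runs detached, CTRL+C no longer stops it; `docker-compose down` does.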
To host DataSpread locally on Tomcat, you can either use one of the pre-built WAR files, available here, or build the WAR file yourself from the source.
- Java Platform (JDK) >= 8
- PostgreSQL >= 10.5
- PostgreSQL JDBC driver = 42.1.4
- Apache Tomcat >= 8.5.4
- Apache Maven >= 3.5.0
- NodeJS >= 10.9
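The minimum versions above can be checked with a small comparison helper. This is a sketch only: `version_ge` is a hypothetical function name, and it assumes versions made of purely numeric, dot-separated fields (suffixes such as `-ce` or `_292` would need to be stripped first).

```shell
# version_ge HAVE WANT
# Succeeds when version HAVE is greater than or equal to version WANT.
# Assumes dot-separated numeric fields; missing fields count as 0.
version_ge() {
  _have=$1
  _want=$2
  while [ -n "$_have" ] || [ -n "$_want" ]; do
    # Take the leading field of each version string.
    _h=${_have%%.*}
    _w=${_want%%.*}
    if [ "${_h:-0}" -gt "${_w:-0}" ]; then return 0; fi
    if [ "${_h:-0}" -lt "${_w:-0}" ]; then return 1; fi
    # Drop the field just compared; an exhausted string becomes empty.
    case $_have in *.*) _have=${_have#*.} ;; *) _have= ;; esac
    case $_want in *.*) _want=${_want#*.} ;; *) _want= ;; esac
  done
  return 0
}

# Example: node prints versions with a leading "v", which is stripped here.
#   version_ge "$(node --version | sed 's/^v//')" 10.9 || echo "NodeJS is too old"
```

The PostgreSQL JDBC driver is pinned to an exact version, so it calls for an equality check rather than this comparison.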
-
Clone the DataSpread repository. Alternatively, you can download the source as a zip or tar.gz.
-
Use Maven to build the WAR file using the following command. After the build completes, the WAR is available at webapp/target/DataSpread.war.
mvn clean install
-
Install PostgreSQL database. Postgres.app is a quick way to get PostgreSQL working on Mac. For other operating systems check out the guides here.
-
Create a database and a user who has access to the database. Note the database name, username and password. Typically, when PostgreSQL is installed locally, the password is blank.
-
Install Apache Tomcat. You can use the guide here. Make a note of the directory where Tomcat is installed. This is known as TOMCAT_HOME in all documentation.
-
Update the Tomcat configuration. Edit the context.xml file in the conf folder under TOMCAT_HOME by adding the following text at the end of the file, before the closing XML tag.
<Resource name="jdbc/ibd" auth="Container" type="javax.sql.DataSource" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://127.0.0.1:5432/<database_name>" username="<username>" password="<password>" maxTotal="20" maxIdle="10" maxWaitMillis="-1" defaultAutoCommit="false" accessToUnderlyingConnectionAllowed="true"/>
Replace <database_name>, <username> and <password> with your PostgreSQL database name, user name and password respectively.
-
Copy postgresql-42.1.4.jar (Download from here) to the lib folder under TOMCAT_HOME. It is crucial to have the exact version of this file.
-
Deploy the WAR file within Tomcat as the root application. This can be done via Tomcat's web interface by undeploying any application located at / and deploying the WAR file with the context path /. To do this manually, delete the webapps/ROOT folder under TOMCAT_HOME while the application is not running, copy the WAR file to the webapps folder, and rename it to ROOT.war.
-
Now you are ready to run the program. Visit the URL where Tomcat is running. For a local install, this is typically http://localhost:8080/.
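The configuration and manual deployment steps above can be sketched as two shell functions. Both names (`render_resource`, `deploy_root_war`) are hypothetical and not part of DataSpread. The sketch assumes the database name, user name and password contain no characters that need XML escaping, and that Tomcat is stopped before deploying.

```shell
# render_resource DATABASE USERNAME PASSWORD
# Prints the <Resource> element for context.xml with the placeholders filled in.
# Assumption: the values contain no XML-special characters (&, <, ", ...).
render_resource() {
  cat <<EOF
<Resource name="jdbc/ibd" auth="Container" type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://127.0.0.1:5432/$1"
          username="$2" password="$3"
          maxTotal="20" maxIdle="10" maxWaitMillis="-1"
          defaultAutoCommit="false" accessToUnderlyingConnectionAllowed="true"/>
EOF
}

# deploy_root_war WAR_FILE TOMCAT_HOME
# Replaces the root application with the given WAR. Run while Tomcat is stopped.
deploy_root_war() {
  _war=$1
  _tomcat_home=$2
  if [ ! -f "$_war" ]; then
    echo "WAR file not found: $_war" >&2
    return 1
  fi
  if [ ! -d "$_tomcat_home/webapps" ]; then
    echo "No webapps folder under: $_tomcat_home" >&2
    return 1
  fi
  # Remove the existing root application, then install the WAR as ROOT.war.
  rm -rf "$_tomcat_home/webapps/ROOT"
  cp "$_war" "$_tomcat_home/webapps/ROOT.war"
}

# Example (hypothetical credentials; TOMCAT_HOME must be set):
#   render_resource dataspread ds_user ds_password
#   deploy_root_war webapp/target/DataSpread.war "$TOMCAT_HOME"
```

The output of `render_resource` still has to be pasted into context.xml by hand; the sketch deliberately does not edit that file.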
To work with the DataSpread source code, follow the developer setup guide. Read the contributing guide before making a pull request. Contributions are welcome!
For bugs and feedback, please use the GitHub Issues.
MIT