DataSpread is a spreadsheet-database hybrid system, with a spreadsheet frontend, and a database backend. Thus, DataSpread inherits the flexibility and ease-of-use of spreadsheets, as well as the scalability and power of databases. A paper describing DataSpread's architecture, design decisions, and optimization can be found here. DataSpread is a multi-year project, supported by the National Science Foundation via award number 1633755.
The current version is 0.3.
DataSpread is built using PostgreSQL and ZKSpreadsheet, an open-source web-based spreadsheet tool.
DataSpread's version 0.1 enables users to scale to billions of cells and return results for common spreadsheet operations within seconds. It does so via on-demand loading of spreadsheet data.
Like traditional spreadsheet software, DataSpread supports standard spreadsheet book and sheet operations like Load, Rename, Delete, and Import (via XLS and XLSX, and CSV). Any updates to the spreadsheets are automatically saved.
Like traditional spreadsheet software, DataSpread supports the use of 225+ spreadsheet functions, along with formatting and styling operations. It also supports row and column operations like insert, delete, cut, copy, and paste; during insertion and deletion, formulae are updated as is the case in traditional spreadsheet software.
It supports all these operations while scaling to arbitrarily large spreadsheets.
In future releases, DataSpread will support SQL on the spreadsheet frontend, along with other relational algebra-based interactions. It will also support joint formula evaluation and optimization.
- DataSpread employs a flexible hybrid data model to represent spreadsheet data within a database.
- DataSpread uses positional indexing techniques to both locate data by position, and keep it up-to-date as the data is updated.
- DataSpread also employs a LRU caching mechanism to retrieve and keep in memory data from the database on demand.
- DataSpread also employs speculative fetching to fetch additional data beyond the user's current spreadsheet window.
You can directly use DataSpread via our cloud-hosted site.
To host DataSpread locally you can either use one of the pre-build war files, available here, or build the war file yourself from the source.
- Java Platform (JDK) >= 8
- PostgreSQL >= 10.5
- PostgreSQL JDBC driver = 42.1.4
- Apache Tomcat >= 8.5.4
- Apache Maven >= 3.5.0
- NodeJS >= 10.9
-
Clone the DataSpread repository. Alternatively, you can download the source as a zip or tar.gz.
-
Use maven to build the
war
file using the following command. After the build completes the war is available atwebapp/target/DataSpread.war
.mvn clean install
-
Install PostgreSQL database. Postgres.app is a quick way to get PostgreSQL working on Mac. For other operating systems check out the guides here.
-
Create a database and an user who has access to the database. Note the database name, username and password. Typically when you have PostgreSQL installed locally the password is blank.
-
Install Apache Tomcat. You can use the guide here. Make a note of the directory where tomcat is installed. This is known as
TOMCAT_HOME
in all documentation. -
Update the Tomcat configuration. You need to update the following file, which is present in
conf
folder underTOMCAT_HOME
folder.context.xml
by adding the following text at the end of the file before the closing XML tag.
<Resource name="jdbc/ibd" auth="Container" type="javax.sql.DataSource" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://127.0.0.1:5432/<database_name>" username="<username>" password="<password>" maxTotal="20" maxIdle="10" maxWaitMillis="-1" defaultAutoCommit="false" accessToUnderlyingConnectionAllowed="true"/>
Replace
<database_name>
,<username>
and<password>
with your PostgreSQL's database name, user name and password respectively. -
Copy
postgresql-42.1.4.jar
(Download from here) tolib
folder underTOMCAT_HOME
. It is crucial to have the exact version of this file. -
Deploy the war file within Tomcat. This can be done via Tomcat's web interface or by manually copying the war file in the
webapps
folder underTOMCAT_HOME
. -
Now you are ready to run the program. Visit the url where Tomcat is installed. It will be typically http://localhost:8080/DataSpread/ for a local install.
MIT