3.0.1
Schema Wizard is an automation-driven, human-verified data modeling tool.
- Provide data samples
- Verify automated analysis
- Merge similar fields
- Export your schema
Requirements:
- Linux environment
- AppArmor
- Docker 1.11
- Docker Compose 1.7.1
Install docker-compose and clone this repository. Then, from the project root directory, run:
sudo apparmor_parser -r -W sw-script-profile
sudo docker-compose up
We recommend you start with the example files in this project. When you start up Schema Wizard, a guided tour will help you through your first schema creation. And remember, always check the and for guidance!
In addition to automated data sample processing, Schema Wizard has an advanced feature called the Interpretation Engine. When you are ready to get into the more advanced features of Schema Wizard, head over to the Detailed Documentation for more information.
Schema Wizard uses a best-effort detection and parsing model. For any file it receives, Schema Wizard attempts to extract fields as either numbers or strings.
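To illustrate what "numbers or strings" can mean in practice, here is a minimal Python sketch that types the columns of a CSV sample. It is not Schema Wizard's actual implementation; the function names `classify_value` and `classify_csv_fields` are hypothetical, and the rule used (a column is a number only if every non-empty value parses as a float) is an assumption made for the example.

```python
import csv
import io


def classify_value(value):
    """Return "number" if the text parses as a float, otherwise "string"."""
    try:
        float(value)
        return "number"
    except (TypeError, ValueError):
        return "string"


def classify_csv_fields(text):
    """Map each CSV column name to "number" or "string".

    Assumption for this sketch: a column is a number only if every
    non-empty value in it parses as a float.
    """
    reader = csv.DictReader(io.StringIO(text))
    field_types = {}
    for row in reader:
        for name, value in row.items():
            if value is None or value == "":
                continue  # empty cells do not affect the inferred type
            if field_types.get(name) == "string":
                continue  # once a column is a string, it stays a string
            field_types[name] = classify_value(value)
    return field_types


sample = "id,name,score\n1,alice,3.5\n2,bob,n/a\n"
print(classify_csv_fields(sample))
```

Here `score` ends up typed as a string because one of its values (`n/a`) is not numeric. A real tool would likely surface that kind of mixed column for human verification instead of deciding silently.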
Basic Formats:
- CSV (text/csv) - Comma Separated Values
- JSON (application/json) - JavaScript Object Notation
- CEF (application/cef) - Common Event Format
- XML (application/xml) - well-formed Extensible Markup Language
Application Formats:
- PDF (application/pdf) - content must be one of the "Basic Formats"
- MS Word (application/vnd.openxmlformats-officedocument.wordprocessingml.document) - content must be one of the "Basic Formats"
Compressed Formats:
- ZIP (application/zip) - archive of any of the above formats
If Schema Wizard finds a content type that could contain other content types (e.g., a ZIP of CSVs or a PDF containing XML), it will recursively extract the embedded content until it finds numeric or string fields. For more information on Schema Wizard's parsing strategy, see Parsing Details.
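The recursive extraction idea can be sketched in a few lines of Python. This is an illustration only, not Schema Wizard's parser (which, per the technology list, builds on tools such as Apache Tika). The names `extract_fields`, `walk_json`, and `leaf_type` are hypothetical, content types are guessed from file extensions as a simplifying assumption, and only ZIP, JSON, and CSV are handled.

```python
import csv
import io
import json
import zipfile


def leaf_type(value):
    """Classify a leaf value as "number" or "string"."""
    if isinstance(value, bool):
        return "string"
    if isinstance(value, (int, float)):
        return "number"
    try:
        float(value)
        return "number"
    except (TypeError, ValueError):
        return "string"


def walk_json(node, prefix, fields):
    """Descend through nested JSON until leaf values are reached."""
    if isinstance(node, dict):
        for key, child in node.items():
            walk_json(child, f"{prefix}.{key}" if prefix else key, fields)
    elif isinstance(node, list):
        for child in node:
            walk_json(child, prefix, fields)
    else:
        fields[prefix] = leaf_type(node)


def extract_fields(name, data, fields=None):
    """Recursively unpack containers until number/string fields are found.

    Assumption: the file extension stands in for real content detection.
    For simplicity, the last value seen decides a field's type.
    """
    fields = {} if fields is None else fields
    lower = name.lower()
    if lower.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for member in archive.namelist():
                if not member.endswith("/"):  # skip directory entries
                    extract_fields(member, archive.read(member), fields)
    elif lower.endswith(".json"):
        walk_json(json.loads(data.decode("utf-8")), "", fields)
    elif lower.endswith(".csv"):
        reader = csv.DictReader(io.StringIO(data.decode("utf-8")))
        for row in reader:
            for key, value in row.items():
                fields[key] = leaf_type(value)
    return fields
```

Calling `extract_fields("samples.zip", raw_bytes)` on an archive that itself contains another ZIP walks both levels and returns a flat mapping of field names to `"number"` or `"string"`. Unrecognized extensions are silently skipped in this sketch.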
Schema Wizard uses the following open source technologies:
- Docker
- Java
- Python
- H2
- MongoDB
- Apache Tika
- Apache Maven
- Redis
- Jetty
- Flask
- npm
- Bower
- Grunt
- AngularJS
- Ace
- Credit for the geocoding dataset goes to ThematicMapping.
- Credit for conversion of the ThematicMapping dataset goes to Ogre Web Client.
Schema Wizard is happy to be a part of the open source community. See Contribute to help improve Schema Wizard.
For a list of known issues, please visit our Known Issues Page.
The DigitalEdge Schema Wizard is managed by the Leidos DigitalEdge Team. Leidos is headquartered at:
11951 Freedom Drive, Reston, VA 20190, (571) 526-6000
Schema Wizard is licensed for use under the Apache 2.0 license.