An example of Business Intelligence From Origin Data to Dashboard
NOAA daily weather data.
There is a shell script who names noaa-data.sh
.It will download the weather data from the NOAA FTP server.After of all, running this script.
Python code. Process the data using the Pandas data analysis Python library.
Jupyter notebook script.
Python dependence.
Superset using Python2.7 in production, but it support 3.x. Suggestting 2.7.x.
It is recommended to install Superset inside a virtualenv.Python 3 already ships virtualenv, for Python 2 need to install it.
$ sudo pip install virtualenv
Use the command to create the virtual folder any where:
$ virtualenv venv
Then execute this virtual environment:
#source venv\bin\activate
$ . venv\bin\activate
To exit a virtualenv just type
deactivate
.
Modify the default pip server for china:
pip install -ihttp://mirrors.aliyun.com/pypi/simple/ flask。
Put all the chances on your side by getting the very latest pip and setuptools libraries.:
$ pip install --upgrade setuptools pip
Install superset:
$ pip install superset
Create administrator with Flask-AppBuilder.
# Create an admin user (you will be prompted to set username, first and last name before setting a password)
fabmanager create-admin --app superset
# Initialize the database
superset db upgrade
# Load some data to play with
superset load_examples
# Create default roles and permissions
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset runserver -d
Supperset Already install!
Excute the #Root/data/noaa-data.sh
script to download and unzip NOAA weather data from the FTP server.
Or use cURL command directly:
$ curl -O ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/by_year/2015.csv.gz
$ curl -O ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt
After downloading the data, gunzip it:
gunzip 2015.csv.gz
Create virtual environment, and activating it:
virtualenv venv
#activating
. venv\bin\activate
Download the dependencies:
$ pip install -r requirements.txt
Install PostgreSQL(pgsql) 10:
$ sudo apt-get install postgresql-10
Then an account names postgres has bean created.
Modify account password:
$ sudo -u postgres passwd
Login Database:
$ sudo -u postgres psql
Modify the Database password:
ALTER USER postgres WITH PASSWORD '123456';
$ python -m pip install --upgrade pip
python -m pip install jupyter
$ jupyter nodebook
$ wc -l 2015.csv
$ cut -d, -f1 2015.csv | sort | uniq | wc -l
Loading the download data——${project}/data/2015.csv ghcnd-stations.txt)
into PostgreSQL Data base.
We can use jupter noteboot to execute python code:
$ jupyter notebook IngestData.ipynb
But it exists some problems.
$ python src/execute.py
This will take some time to complete.After of all, the data will be loaded into pgSQL, and three table will be created —— weather_data
,station_metadata
and weather_types
.
Execute blow SQL:
CREATE TABLE weather_data_denormalized AS
SELECT wd.station_identifier,
wd.measurement_date,
wd.measurement_type,
wt.weather_description,
wd.measurement_flag,
sm.latitude,
sm.longitude,
sm.elevation
FROM weather_data wd
JOIN station_metadata sm
ON wd.station_identifier = sm.station_id
JOIN weather_types wt
ON wd.measurement_type = wt.weather_type;
CREATE INDEX date_index ON weather_data_denormalized (measurement_date);
CREATE INDEX type_index ON weather_data_denormalized (measurement_type);
CREATE INDEX description_index ON weather_data_denormalized (weather_description);
CREATE INDEX flag_index ON weather_data_denormalized (measurement_flag);
CREATE INDEX elevation_index ON weather_data_denormalized (elevation);
Then using Superset to create your dashboard:
- Sources->Database.
- Input the SQLAlchemy URI string.
- Then "Test Connection".
- Sources->Tables.
- Select new Database.
- Input the table name: weather_data_denormalized.
- Sources->Tables. Simply click on the table name of the list of tables.
It's developed by React where '${install_path}/superset/static/assets'.
- branding: Logo.
- images: all of the images which supports superset web.
- spec: Config file.
- src: develop file.
- stylesheets: style file, css,less e.g.
- verdor: Third part lib implements with React.
- Add path: ${superset_root}/static/assets/src/visualizations
- add custom charts as a React Component who includes .jsx/js and css part.such blow:
import React from 'react';
import PropTypes from 'prop-types';
import ReactDOM from 'react-dom';
import { XYChart, AreaSeries, CrossHair, LinearGradient } from '@data-ui/xy-chart';
class Costom extends React.Component{
//TODO
}
- Add costom component in index.js (${superset_root}/static/assets/src/visualizations):
- Add costom charts in ${superset_root}/static/assets/src/explore/visTypes.jsx for adding data options in slice.
- Add images in ${superset_root}/static/assets/images/viz_thumbnails/
About front development details see https://github.com/chkui/superset(Forked from ryoha13/superset).