/data-science

Repository for all-things-data at Heat Seek, including methodologies, datasets, and data visualization tools and techniques we utilize.

Primary language: Jupyter Notebook. License: MIT.

Data Science at Heat Seek

This repository collects Heat Seek's data work: our methodologies, example datasets, and notes on the data visualization tools and techniques we use when performing data analysis. We split our data projects into two parts, Data Analysis and Data Visualizations, each outlined below.

[Data Analysis and Methodology](doc/Data Analysis.md)

Methodology, tools, and techniques used at Heat Seek for both internal and external analyses.

This section covers:

  • Data Sources Used
    • Open Source
    • APIs
    • Web Scraping
  • Programming Languages
    • R
    • Python
  • ETL (Extract, Transform, Load) Techniques
  • Databases
    • PostgreSQL
  • Data Cleaning/Massaging Using Command-Line Tools
    • awk, sed, etc.
  • Other Resources

[Data Visualizations](doc/Data Visualizations.md)

Applications and libraries used by Heat Seek to visualize datasets and analyses.

  • Applications and Libraries for Prototyping and Publishing, including:
    • Spotfire
    • Tableau
    • Adobe Illustrator
    • ggplot2
  • Other Resources