Approaches to Data Science

DataScope Analytics


Courtesy of Introduction to Data Science

Jeff Hammerbacher

  1. Identify problem
  2. Instrument data sources
  3. Collect data
  4. Prepare data (integrate, transform, clean, impute, filter, aggregate)
  5. Build model
  6. Evaluate model
  7. Communicate results
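
As a rough illustration of step 4 (clean, impute, filter, aggregate), here is a minimal Python sketch. The records, the `-999.0` sentinel, and the `prepare` helper are all hypothetical, invented for this example; they are not from Hammerbacher's material.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing measurement.
raw = [
    {"city": "Chicago", "temp": 20.0},
    {"city": "Chicago", "temp": None},
    {"city": "Chicago", "temp": 24.0},
    {"city": "Boston", "temp": 15.0},
    {"city": "Boston", "temp": -999.0},  # assumed sentinel for a sensor fault
]


def prepare(records, sentinel=-999.0):
    """Clean, impute, filter, and aggregate temperature records by city."""
    # Clean: turn the sentinel value into an explicit missing marker.
    cleaned = [
        {**r, "temp": None if r["temp"] == sentinel else r["temp"]}
        for r in records
    ]

    # Collect the observed (non-missing) values for each city.
    observed = {}
    for r in cleaned:
        if r["temp"] is not None:
            observed.setdefault(r["city"], []).append(r["temp"])

    # Filter: drop cities with no observed values at all.
    # Impute: fill each remaining gap with that city's observed mean.
    imputed = [
        {**r, "temp": r["temp"] if r["temp"] is not None
         else mean(observed[r["city"]])}
        for r in cleaned
        if r["city"] in observed
    ]

    # Aggregate: mean temperature per city.
    by_city = {}
    for r in imputed:
        by_city.setdefault(r["city"], []).append(r["temp"])
    return {city: mean(temps) for city, temps in by_city.items()}


summary = prepare(raw)
```

Mean imputation is only one of several options; whether it is appropriate depends on why the values are missing.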

Peter Huber

  1. Inspection
  2. Error checking
  3. Modification
  4. Comparison
  5. Modeling and model fitting
  6. Simulation
  7. What-if analyses
  8. Interpretation
  9. Presentation of conclusions

Ben Fry

  1. Acquire
  2. Parse
  3. Filter
  4. Mine
  5. Represent
  6. Refine
  7. Interact

Dataists (Hilary Mason and friends)

  1. Obtain
  2. Scrub
  3. Explore
  4. Model (build a model): write an equation or code that describes the process, based on the data alone
  5. Interpret
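
One way to read step 4 above ("write an equation or code that describes the process") is fitting a straight line to observed points. The sketch below uses ordinary least squares in plain Python. The data and the `fit_line` helper are made up for illustration, and a linear model is an assumption, not something the Dataists list prescribes.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that happen to lie on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
```

The fitted slope and intercept are the "equation"; step 5 (Interpret) is then about what those numbers mean for the original question.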

Hilary Mason's Advanced Machine Learning video from O'Reilly

"Today's approach to problem solving"

  1. Motivation (understand the problem from a human point of view)
  2. Look at realistic data (coming from real APIs and web applications)
  3. Explore the potential solutions (different algorithmic approaches)
  4. Make the solution work

Colin Mallows

  1. Identify data to collect and its relevance to your problem
  2. Statistical specification of the problem
  3. Method selection
  4. Analysis of method
  5. Interpret results for non-statisticians

Jim Gray

  1. Capture
  2. Curate
  3. Communicate

Ted Johnson

  1. Assemble an accurate and relevant data set
  2. Choose the appropriate algorithm