/summary-cb-ds-1

Primary language: Jupyter Notebook

Summary-cb-ds-1

This is a summary of the Data Science Diploma from Concordia.
https://concordiabootcamps.ca/courses/data-science-remote/

The Jupyter notebooks are a mix of theory and useful Python code intended to be used as a quick reference.

Table of contents

  • Python

    • Data structures
    • Functions
    • Strings
    • Regex
    • Input / Output
    • Classes
    • Numpy
    • Magic commands
  • Algorithms

    • Sorting
    • Recursion
    • Dynamic Programming
    • Graphs
    • Fast code
  • Pandas

    • Importing
    • Explore
    • Slicing
    • Filtering
    • Plotting
    • Group by
    • Merging
    • Strings
    • JSON
    • Time series
  • Visualization

    • Matplotlib
    • Seaborn
    • Examples
  • Regression

    • Simple regression
    • Multiple regressions
    • Matrix form regression
    • Interpretation
    • Feature and target engineering
    • Regularisation
  • Classification

    • Root finding
    • Optimization
    • Logistic regression
    • Interpretation
    • ROC curve
    • Exotic distributions
    • Two stage modeling
    • Survival model
  • SQL

    • Connect
    • Query
    • Group by
    • Join
    • Nested queries
    • Create table
    • Create DB
  • Clustering

    • K-Means
    • Spectral Clustering
    • Image compression
    • Metrics
    • Model evaluation
    • Selecting the number of clusters
  • Scraping

    • API requests
    • BeautifulSoup
    • Selenium
  • Dimensionality Reduction (Embeddings)

    • PCA
    • UMAP
    • T-SNE, MDA, etc.
  • NLP

    • Tokenisation
    • Stemming
    • Lemmatization
    • Stop Words
    • Matching
    • Named Entity Recognition
    • Feature extraction
    • Word Vectors (embeddings)
    • Sentiment Analysis
    • Topic Modeling (LDA, NMF)
    • Summarization
  • ML Models

    • SVM
    • Decision Tree
    • Gradient Boosting
    • Shapley
  • Deep Learning

    • FFNN
    • CNN
    • RNN: GRU, LSTM
    • RL: GAN
  • Time Series

    • ARIMA models
    • VAR models
    • Panels
  • Model Deployment

    • AWS
    • GCP
  • ML Tools

    • Scaling, Normalizing
    • Polynomial features
    • Test split
    • Cross Validation
    • Grid Search
    • Pipeline
  • Boilerplate (to come)

    • SKLearn
    • TensorFlow (Keras)
    • PyTorch
  • Recommender System (to come)

    • Collaborative Filtering
    • Content-based Filtering
    • Hybrid
  • Other (to come)

    • Computer Vision
    • Naive Bayes? (Classifier)
    • Markov chain? (NN)
