DataAnalysis-Science

A repository containing projects for Data Analysis and Data Science. All projects will be done using Python in Jupyter Notebooks.

Below is a list of the current projects in this repository:

  • Computation with NumPy and N-dimensional arrays
    • First major project using NumPy. Covers how to access individual values and subsets inside an n-dimensional array, how to do linear algebra with NumPy, and how to manipulate images as ndarrays.
  • Data Visualization with Matplotlib-Programming Languages
    • Downloaded an Excel file from Stack Overflow and used Pandas and Matplotlib to manipulate and visualize the data.
  • Movie, budget and revenue using Seaborn and Scikit-learn
    • First major project using Seaborn and Scikit-learn. Used Seaborn to visualize the data with help from the Pandas library. Created a linear regression model to predict the revenue of a movie.
  • College Major vs Salary Expectations
    • This project was done to practice basic data cleaning and processing. Using the cleaned data, a few general questions were answered, for example, "Which degrees have the highest starting salaries?"
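
As a rough illustration of the NumPy techniques named in the first project above (indexing into an n-dimensional array, basic linear algebra, and treating an image as an ndarray), here is a minimal sketch. It is not taken from the notebooks: the arrays, the variable names, and the tiny synthetic "image" are all assumptions made up for the example.

```python
import numpy as np

# Synthetic 3-dimensional array (assumed data, not from the repository).
cube = np.arange(24).reshape(2, 3, 4)

# Access a single value: last index along every axis.
single_value = cube[1, 2, 3]

# Access a subset: first row of each 3x4 block, first two columns.
subset = cube[:, 0, :2]

# Linear algebra: matrix product and solving the system a @ x = b.
a = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
product = a @ a
solution = np.linalg.solve(a, b)

# An image is just an ndarray of shape (height, width, channels).
# Here: a hypothetical 4x6 RGB image whose left half is red.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, :3] = [255, 0, 0]

# Simple manipulations: average the channels and mirror horizontally.
grayscale = image.mean(axis=2)
flipped = image[:, ::-1]

print(single_value, subset.shape, solution, grayscale.shape)
```

In a notebook, an image loaded from disk could stand in for the synthetic array, since the same slicing and arithmetic apply to any ndarray with this layout.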

Requirments.txt

All the projects in this repo were done in the same Conda environment. You can find Requirments.txt in the main repository.