Pinned Repositories
Amazon_Vine_Analysis
Data analysts were tasked with analyzing Amazon reviews written by members of the paid Amazon Vine program. The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. Companies pay a small fee to Amazon and provide products to Amazon Vine members, who are then required to publish a review.
BB_Biodiversity
Biomanufacturing food company Improbable Beef is looking for people who carry a large number of a certain bacterial species. They have enlisted the assistance of a bio-research firm to create a web-based dashboard to sort the most common species of bacteria and then display the results on the webpage.
bikesharing
Analysts inspected New York City Citibike data from August 2019 to determine whether Citibikes would be a good fit for Des Moines, Iowa. The NYC data was analyzed to show trends by time of day, day of the week, gender, starting location, and ending location. The analysis required using Pandas to change the 'tripduration' column from an integer to a datetime datatype. Using the converted datatype, visualizations were created to show how long bikes are checked out for all riders and by gender, the number of bike trips for all riders and by gender for each hour of each day of the week, the number of bike trips for each type of user and gender for each day of the week, the top starting locations, and the top ending locations.
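The 'tripduration' conversion mentioned above might look roughly like the following sketch. It assumes the column holds whole seconds (the description does not state the unit), and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample; the real Citibike export has many more columns.
trips = pd.DataFrame({"tripduration": [90, 3600, 45]})

# Assumption: durations are whole seconds. Interpreting them as offsets
# from the Unix epoch yields datetime values whose time-of-day part
# equals the trip length (e.g. 3600 s becomes 01:00:00).
trips["tripduration"] = pd.to_datetime(trips["tripduration"], unit="s")

print(trips.dtypes)
print(trips)
```

If the durations were recorded in minutes instead, `unit="m"` would be the corresponding choice.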
Credit_Risk_Analysis
Data analysts were asked to examine credit card data from peer-to-peer lending services company LendingClub in order to determine credit risk. Supervised machine learning was employed to find out which model would perform the best against an unbalanced dataset. Data analysts trained and evaluated several models to predict credit risk.
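One reason model comparison on an unbalanced dataset needs care is that plain accuracy can reward a model that ignores the minority class. The sketch below illustrates this with balanced accuracy (the mean of per-class recall). The labels are made up, and the description does not say which metric the project actually used.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for label in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Made-up labels: 8 low-risk loans (0) and 2 high-risk loans (1).
y_true = [0] * 8 + [1] * 2
# A model that always predicts "low risk".
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```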
Cryptocurrencies
Fictional investment bank Accountability Accounting is interested in offering a new cryptocurrency investment portfolio to its clients; however, the company needs assistance in determining which cryptocurrencies it should offer and would like them grouped to create a classification system for this new investment. A clustering algorithm was used and the data was visualized so the findings could be shared with the board.
DataScience
Data science tutorials
Election_Analysis
Election analysis of Colorado counties using Python
Mapping_Earthquakes
Using the Leaflet.js API, a map was populated with GeoJSON earthquake data from a URL. Each earthquake with a magnitude greater than 4.5 is represented with a colored circle, and larger magnitudes are drawn with larger-diameter circles. Additionally, each earthquake marker has a popup that displays the magnitude and location of the earthquake. Users can choose to view the map in street view, satellite street view, or light view. Users can also select from the following layers: Earthquakes, Tectonic Plates, or Major Earthquakes.
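The marker styling described above can be sketched in Python as three small helpers. The project itself uses Leaflet.js; the scale factor, color bands, and function names here are hypothetical, and the `mag` and `place` property names are assumed from the usual shape of earthquake GeoJSON feeds.

```python
def marker_radius(magnitude, scale=4):
    """Circle radius in pixels; grows linearly with magnitude."""
    # Guard against zero or negative magnitudes so the marker stays visible.
    if magnitude <= 0:
        return 1
    return magnitude * scale


def marker_color(magnitude):
    """Pick a fill color from hypothetical magnitude bands."""
    if magnitude > 6:
        return "red"
    if magnitude > 5:
        return "orange"
    return "yellow"


def popup_text(feature):
    """Build the popup string from a GeoJSON feature's properties."""
    props = feature["properties"]
    return f"Magnitude: {props['mag']}<br>Location: {props['place']}"


# Made-up feature; only the fields used above are filled in.
feature = {
    "type": "Feature",
    "properties": {"mag": 5.2, "place": "Example location"},
}

print(marker_radius(feature["properties"]["mag"]))
print(marker_color(feature["properties"]["mag"]))
print(popup_text(feature))
```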
Neural_Network_Charity_Analysis
Fictional company Alphabet Soup requested a binary classifier that is capable of predicting whether organizations will be successful if they are provided funding. A neural network model was created from a data set that contained over 34,000 entries in order to determine if an organization would be successful or not if Alphabet Soup provided them funding.
Police_Violence_Analysis
Police Violence in the United States Analysis
acfthomson's Repositories
acfthomson/Amazon_Vine_Analysis
Data analysts were tasked with analyzing Amazon reviews written by members of the paid Amazon Vine program. The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. Companies pay a small fee to Amazon and provide products to Amazon Vine members, who are then required to publish a review.
acfthomson/BB_Biodiversity
Biomanufacturing food company Improbable Beef is looking for people who carry a large number of a certain bacterial species. They have enlisted the assistance of a bio-research firm to create a web-based dashboard to sort the most common species of bacteria and then display the results on the webpage.
acfthomson/bikesharing
Analysts inspected New York City Citibike data from August 2019 to determine whether Citibikes would be a good fit for Des Moines, Iowa. The NYC data was analyzed to show trends by time of day, day of the week, gender, starting location, and ending location. The analysis required using Pandas to change the 'tripduration' column from an integer to a datetime datatype. Using the converted datatype, visualizations were created to show how long bikes are checked out for all riders and by gender, the number of bike trips for all riders and by gender for each hour of each day of the week, the number of bike trips for each type of user and gender for each day of the week, the top starting locations, and the top ending locations.
acfthomson/Credit_Risk_Analysis
Data analysts were asked to examine credit card data from peer-to-peer lending services company LendingClub in order to determine credit risk. Supervised machine learning was employed to find out which model would perform the best against an unbalanced dataset. Data analysts trained and evaluated several models to predict credit risk.
acfthomson/Cryptocurrencies
Fictional investment bank Accountability Accounting is interested in offering a new cryptocurrency investment portfolio to its clients; however, the company needs assistance in determining which cryptocurrencies it should offer and would like them grouped to create a classification system for this new investment. A clustering algorithm was used and the data was visualized so the findings could be shared with the board.
acfthomson/DataScience
Data science tutorials
acfthomson/Election_Analysis
Election analysis of Colorado counties using Python
acfthomson/Mapping_Earthquakes
Using the Leaflet.js API, a map was populated with GeoJSON earthquake data from a URL. Each earthquake with a magnitude greater than 4.5 is represented with a colored circle, and larger magnitudes are drawn with larger-diameter circles. Additionally, each earthquake marker has a popup that displays the magnitude and location of the earthquake. Users can choose to view the map in street view, satellite street view, or light view. Users can also select from the following layers: Earthquakes, Tectonic Plates, or Major Earthquakes.
acfthomson/Neural_Network_Charity_Analysis
Fictional company Alphabet Soup requested a binary classifier that is capable of predicting whether organizations will be successful if they are provided funding. A neural network model was created from a data set that contained over 34,000 entries in order to determine if an organization would be successful or not if Alphabet Soup provided them funding.
acfthomson/kickstarter-analysis
Analysis performed on Kickstarter data to reveal trends
acfthomson/MechaCar_Statistical_Analysis
Fictional company AutosRUs’ newest prototype, the MechaCar, is suffering from production troubles that are blocking the manufacturing team’s progress. AutosRUs’ senior management enlisted assistance from the data analytics team to review the production data for insights that may help the manufacturing team overcome their production issues.
acfthomson/Mission-to-Mars
A client has been admiring images of Mars’s hemispheres online and realized that the site is scraping-friendly. They would like their current web app re-designed to include all four of the hemisphere images. BeautifulSoup and Splinter were used to scrape full-resolution images of Mars’s hemispheres and the titles of those images. The scraped data was stored in a Mongo database and displayed in a web app re-designed to accommodate these images.
acfthomson/Movies-ETL
Fictional company Amazing Prime needs to create an automated pipeline that takes in new data, performs the appropriate transformations, and loads the data into existing tables. Code will be refactored to create one function that takes in the three files and performs the ETL process by adding the data to a PostgreSQL database.
acfthomson/Pewlett-Hackard-Analysis
Fictional company Pewlett Hackard needs to determine the number of retiring employees per title and identify employees who are eligible to participate in a mentorship program. Both tasks are accomplished using SQL queries.
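A count of retiring employees per title could be queried along the lines of the sketch below. It uses an in-memory SQLite database as a stand-in for the project's database, and the table names, column names, sample rows, and birth-date window are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, birth_date TEXT);
    CREATE TABLE titles (emp_no INTEGER, title TEXT);
    INSERT INTO employees VALUES (1, '1953-04-02'), (2, '1954-11-30'),
                                 (3, '1970-06-15'), (4, '1952-01-09');
    INSERT INTO titles VALUES (1, 'Engineer'), (2, 'Engineer'),
                              (3, 'Staff'), (4, 'Staff');
""")

# Assumption: "retiring" means born inside a fixed window of years.
rows = conn.execute("""
    SELECT t.title, COUNT(*) AS retiring
    FROM employees AS e
    JOIN titles AS t ON t.emp_no = e.emp_no
    WHERE e.birth_date BETWEEN '1952-01-01' AND '1955-12-31'
    GROUP BY t.title
    ORDER BY retiring DESC, t.title
""").fetchall()

print(rows)  # [('Engineer', 2), ('Staff', 1)]
```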
acfthomson/Police_Violence_Analysis
acfthomson/Practice
Practice exercises from a variety of resources
acfthomson/PyBer_Analysis
Fictional ride-sharing app company "PyBer" needed ride-sharing data analyzed using Pandas and Matplotlib. This analysis summarized how data differs by city type and how senior leadership at PyBer can leverage that data.
acfthomson/School_District_Analysis
Comparison between school district metrics with and without altered ninth grade reading and math scores for Thomas High School
acfthomson/stock-analysis
VBA macro to find total daily volume and yearly return for stock
acfthomson/surfs_up
Fictional company Waves and Icecream wants to determine whether opening a surf shop in Oahu, Hawaii has good business value. Temperature data for the months of June and December were analyzed to help determine whether a surf and ice cream shop business is sustainable year-round. The analysis was conducted using Python, Pandas, NumPy, and SQLAlchemy.
acfthomson/Titanic---Machine-Learning-from-Disaster
Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck
acfthomson/UFOs
A client needed a webpage and dynamic table that allowed for in-depth analysis of UFO sightings by letting users filter on multiple criteria at the same time. UFO data is stored in a JavaScript array. The webpage was built using JavaScript and customized using Bootstrap.
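Filtering on several criteria at once can be sketched as below. This is a Python analogue of the idea rather than the project's JavaScript; the field names, sample records, and the `filter_sightings` helper are hypothetical.

```python
def filter_sightings(sightings, criteria):
    """Return the sightings that match every non-empty criterion."""
    # Ignore blank inputs, mirroring a form where unused filters are left empty.
    active = {key: value for key, value in criteria.items() if value}
    return [
        s for s in sightings
        if all(s.get(key) == value for key, value in active.items())
    ]


# Made-up records for illustration.
sightings = [
    {"date": "1/1/2010", "city": "benton", "state": "ar", "shape": "circle"},
    {"date": "1/1/2010", "city": "bonita", "state": "ca", "shape": "light"},
    {"date": "1/2/2010", "city": "el cajon", "state": "ca", "shape": "triangle"},
]

matches = filter_sightings(sightings, {"date": "1/1/2010", "state": "ca", "shape": ""})
print(matches)
```

With no active criteria the helper returns every record, which matches the usual behavior of a table before any filter is entered.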
acfthomson/World_Weather_Analysis
Changes recommended by beta testers of fictional app "PlanMyTrip".