In this project we built a web application that scrapes various websites for data related to the Mission to Mars and displays the information in a single HTML page.
A Jupyter Notebook file, called mission_to_mars.ipynb, was created and used to complete all of the scraping and analysis tasks, using BeautifulSoup, Pandas, and Requests/Splinter.
- The NASA Mars News Site was scraped for the latest news title and paragraph text, which were assigned to variables for later reference.
- Splinter was used to navigate the JPL Featured Space Image site, find the URL of the current Featured Mars Image (full-size .jpg image), and assign the complete URL string to a variable.
- The latest Mars weather tweet was scraped from the Mars Weather Twitter account page, and the tweet text of the weather report was saved as a variable.
- Pandas was used to scrape the table of facts about the planet (diameter, mass, etc.) from the Mars Facts webpage and to convert the data to an HTML table string.
- High-resolution images for each of Mars's hemispheres were obtained from the USGS Astrogeology site.
- The image URL string for the full-resolution hemisphere image and the title containing the hemisphere name were both saved in a Python dictionary using the keys img_url and title.
- The dictionary was then appended to a list that contained one dictionary for each hemisphere.
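
The hemisphere step above can be sketched as follows. This is a minimal illustration and not the notebook's actual code: it assumes the titles and image paths have already been extracted from the page, and the function name `build_hemisphere_list`, the `BASE_URL` value, and the sample inputs are all hypothetical. Only the keys `title` and `img_url` come from the description above.

```python
from urllib.parse import urljoin

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://example.com/"


def build_hemisphere_list(scraped_items, base_url=BASE_URL):
    """Return one {"title", "img_url"} dictionary per hemisphere.

    `scraped_items` is assumed to be an iterable of (title, image_path)
    pairs that were already pulled out of the scraped HTML.
    """
    hemisphere_image_urls = []
    for title, image_path in scraped_items:
        hemisphere = {
            "title": title.strip(),
            # urljoin resolves relative paths against base_url and
            # leaves absolute URLs unchanged.
            "img_url": urljoin(base_url, image_path),
        }
        hemisphere_image_urls.append(hemisphere)
    return hemisphere_image_urls
```

Resolving each path with `urljoin` is one way to make sure the stored value is always a complete URL string, whether the page links to the image with a relative or an absolute address.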
MongoDB with Flask templating was used to create a new HTML page that displays all of the information that was scraped from the URLs above.
- The Jupyter notebook was converted into a Python script called `scrape_mars.py` with a function called `scrape` that executes all of the scraping code from above and returns one Python dictionary containing all of the scraped data.
- Next, a route called `/scrape` was created that imports the `scrape_mars.py` script and calls the `scrape` function. The return value was stored in Mongo as a Python dictionary.
- A root route `/` was created that queries the Mongo database and passes the Mars data into an HTML template to display the data.
- A template HTML file called `index.html` was created. It takes the Mars data dictionary and displays all of the data in the appropriate HTML elements. Bootstrap was used to structure the HTML template.
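
The two routes described above could look roughly like the sketch below. It rests on several assumptions: the `create_app` factory, the inline `INDEX_TEMPLATE` (a stand-in for `index.html` so the example runs without a templates folder), and the keys `news_title` and `news_paragraph` are hypothetical, and the collection and scrape function are passed in as arguments instead of being imported, so the sketch does not need a running MongoDB instance.

```python
from flask import Flask, redirect, render_template_string

# Inline stand-in for index.html; the key names are hypothetical.
INDEX_TEMPLATE = """
<h1>Mission to Mars</h1>
<h2>{{ mars.news_title }}</h2>
<p>{{ mars.news_paragraph }}</p>
"""


def create_app(collection, scrape_func):
    """Build the Flask app.

    collection  -- assumed to be a PyMongo collection, or any object with
                   compatible update_one / find_one methods
    scrape_func -- assumed to be scrape_mars.scrape, returning one dict
    """
    app = Flask(__name__)

    @app.route("/scrape")
    def scrape():
        mars_data = scrape_func()
        # Upsert so the collection holds a single, current document.
        collection.update_one({}, {"$set": mars_data}, upsert=True)
        return redirect("/")

    @app.route("/")
    def index():
        # Fall back to an empty dict if nothing has been scraped yet.
        mars = collection.find_one() or {}
        return render_template_string(INDEX_TEMPLATE, mars=mars)

    return app
```

In a real run, the collection might come from something like `pymongo.MongoClient("mongodb://localhost:27017").mars_db.mars` (the database and collection names here are placeholders), `scrape_func` would be `scrape_mars.scrape`, and the route would render `index.html` with `render_template`.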
Make sure that you have the `chromedriver.exe` that matches your Chrome version before you run `app.py`.
- Python
- BeautifulSoup
- Splinter
- Pandas
- chromedriver
- Flask
- MongoDB
- PyMongo
- HTML5 / CSS
- Bootstrap4