/Log-Analysis

This Repository Contains Log Analysis Project of Updated Udacity Full Stack Nanodegree v 4.0

Primary LanguagePython

Log-Analysis

Project Overview

In this project, you'll work with data that could have come from a real-world web application, with fields representing information that a web server would record, such as HTTP status codes and URL paths. The web server and the reporting tool both connect to the same database, allowing information to flow from the web server into the report.

How to Run?

PreRequisites:

Setup Project:

  1. Install Vagrant and VirtualBox
  2. Download or Clone fullstack-nanodegree-vm repository.
  3. Download the data from here.
  4. Unzip this file after downloading it. The file inside is called newsdata.sql.
  5. Copy the newsdata.sql file and content of this current repository, by either downloading or cloning it from Here

Launching the Virtual Machine:

  1. Launch the Vagrant VM inside Vagrant sub-directory in the downloaded fullstack-nanodegree-vm repository using command:
  $ vagrant up
  1. Then Log into this using command:
  $ vagrant ssh
  1. Change directory to /vagrant and look around with ls.

Setting up the database and Creating Views:

  1. Load the data in local database using the command:
  psql -d news -f newsdata.sql

The database includes three tables:

  • The authors table includes information about the authors of articles.
  • The articles table includes the articles themselves.
  • The log table includes one entry for each time a user has accessed the site.
  1. Use psql -d news to connect to database.

  2. Create view article_view using:

  create view article_view as select title,author,count(*) as views from articles,log where 
  log.path like concat('%',articles.slug) group by articles.title,articles.author 
  order by views desc;
Column Type
title text
author text
views Integer
  1. Create vier error_log_view using:
  create view error_log_view as select date(time),round(100.0*sum(case log.status when '200 OK' 
  then 0 else 1 end)/count(log.status),2) as "Percent Error" from log group by date(time) 
  order by "Percent Error" desc;
Column Type
date date
Percent Error float

Running the queries:

  1. From the vagrant directory inside the virtual machine,run logs.py using:
  $ python3 logs.py

FAQ's: here