Big-Data-Notes

These notes are from my lecturer. I only converted them to Markdown.

Introduction

The process involved in carrying out a Big Data project is outlined below.

Flow in Big Data Project

[Figure: Big Data flow]

Roles in Big Data Project

[Figure: Roles in a Big Data project]

Non-Functional Requirements

  • Performance
  • Scalability
    • Vertical - add more resources (RAM, CPU, storage) to a single machine
    • Horizontal - add more machines (nodes) to the cluster
  • Maintainability
  • Availability
  • Security
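
To make the vertical/horizontal distinction more concrete, here is a minimal sketch of one idea behind horizontal scaling: records are spread over several machines by hashing a key. The names `assign_node` and `partition` and the node counts are hypothetical and chosen only for illustration; real systems use more elaborate placement schemes.

```python
import hashlib


def assign_node(key: str, num_nodes: int) -> int:
    """Map a record key to one of num_nodes machines (hypothetical helper)."""
    # A stable hash ensures the same key always lands on the same node.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def partition(keys, num_nodes):
    """Group keys by the node they are assigned to."""
    nodes = {n: [] for n in range(num_nodes)}
    for key in keys:
        nodes[assign_node(key, num_nodes)].append(key)
    return nodes


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10)]
    # Horizontal scaling: three machines share the data.
    print(partition(keys, 3))
    # Adding machines spreads the same data over more nodes.
    print(partition(keys, 5))
```

Vertical scaling has no equivalent in code: the program stays the same and the single machine simply gets more RAM or CPU.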

Why Big Data?

[Figure: The 4 Vs of Big Data]

Why Relational Database Management System (RDBMS)?

[Figure: RDBMS]

Big Data Framework

A Big Data framework is a solution to these problems:

  • How to acquire/export data?
  • How do you store large data?
  • How do you retrieve large data?
  • How do you process large data?
  • How do you sort large data?
  • How do you analyze large data? Structured? Semi-structured? Unstructured?
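
As a rough illustration of why sorting large data needs a different approach, the sketch below sorts the input in small chunks and then merges the sorted chunks. It is only an assumption-laden toy: the function name `sort_in_chunks` is hypothetical, and the chunks are kept in memory here, whereas a real system would write each sorted chunk to disk or to another machine.

```python
import heapq


def sort_in_chunks(values, chunk_size):
    """Sort an iterable piece by piece, then merge the sorted pieces."""
    chunks = []
    chunk = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            # Each chunk is small enough to sort on its own.
            chunks.append(sorted(chunk))
            chunk = []
    if chunk:
        chunks.append(sorted(chunk))
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*chunks))


if __name__ == "__main__":
    print(sort_in_chunks([5, 3, 8, 1, 9, 2, 7], chunk_size=3))
```

The same split, work on pieces, then combine pattern also underlies how frameworks distribute storage and processing across machines.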

Examples of Big Data frameworks:

  • Hadoop (2006) (covers all categories)
  • Spark (started in 2009) (data processing)
  • Kafka (2011) (content acquisition)
  • NoSQL

Hadoop Ecosystem

[Figure: Hadoop ecosystem. Source: http://www.vikramtakkar.com/2016/12/3-hadoop-ecosystem-hadoop-tutorial.html]

Types of Storage

[Figure: Types of storage]