p3

An open source pcap packet and NetFlow file analysis tool using Hadoop MapReduce and Hive.

This project merges pcap-on-hadoop (https://github.com/ssallys/pcap-on-Hadoop) and nflow-on-hadoop (https://github.com/ssallys/nflow-on-Hadoop).

Installation

To install Apache Hadoop

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

To install Apache Hive

https://cwiki.apache.org/confluence/display/Hive/GettingStarted

Configuration

Copy p3-default.xml to $HADOOP_HOME/conf.

This file is currently not used, although some code that refers to it has not yet been updated.

This file includes:

     <property>
          <name>pcap.file.captime.min</name>
          <value>1168300867</value>
          <description>start time of packet capturing</description>
     </property>
     <property>
          <name>pcap.file.captime.max</name>
          <value>1168387267</value>
          <description>stop time of packet capturing</description>
     </property>
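
The two values above are Unix epoch timestamps in seconds. As a quick sanity check, the sketch below prints the length of the capture window and both bounds in UTC. It assumes GNU date (the `-d @epoch` form); the variable names `min` and `max` are only illustrative and are not read by p3.

```shell
#!/bin/sh
# Values copied from p3-default.xml (Unix epoch seconds).
min=1168300867   # pcap.file.captime.min
max=1168387267   # pcap.file.captime.max

# Length of the capture window in seconds (86400 = exactly one day).
echo "window length: $(( max - min )) seconds"

# Human-readable bounds in UTC; -d @N is GNU date syntax.
date -u -d "@$min" '+%Y-%m-%d %H:%M:%S UTC'   # 2007-01-09 00:01:07 UTC
date -u -d "@$max" '+%Y-%m-%d %H:%M:%S UTC'   # 2007-01-10 00:01:07 UTC
```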

IP Analysis

Total traffic and host/port count statistics

hadoop jar ./p3.jar p3.runner.PcapTotalStats -r[source dir/file] -n[reduces]

Periodic flow statistics

hadoop jar ./p3.jar p3.runner.PcapTotalFlowStats -r[source dir/file] -n[reduces] -p[period]

Periodic simple traffic statistics

hadoop jar ./p3.jar p3.runner.PcapStats -r[source dir/file] -n[reduces]
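
To make the bracketed placeholders concrete, here is a minimal sketch that only prints the three command lines with example values filled in; it does not run Hadoop. The helper `p3_pcap_cmd`, the path `/user/hadoop/pcap`, the reducer count 10 and the period 60 are all hypothetical. It assumes each value is attached directly to its flag, as the usage lines are written; the unit of the period is not stated here, so check the source before relying on it.

```shell
#!/bin/sh
# Print (do not run) a p3 pcap job command line.
# usage: p3_pcap_cmd <runner class> <source dir/file> <reduces> [period]
p3_pcap_cmd() {
    runner=$1
    src=$2
    reduces=$3
    period=${4:-}

    cmd="hadoop jar ./p3.jar p3.runner.$runner -r$src -n$reduces"

    # Append -p only when a period was given (used by PcapTotalFlowStats).
    if [ -n "$period" ]; then
        cmd="$cmd -p$period"
    fi

    printf '%s\n' "$cmd"
}

# Example values below are placeholders, not defaults of p3.
p3_pcap_cmd PcapTotalStats     /user/hadoop/pcap 10
p3_pcap_cmd PcapTotalFlowStats /user/hadoop/pcap 10 60
p3_pcap_cmd PcapStats          /user/hadoop/pcap 10
```

Once the printed line looks right, paste it into the shell to submit the job.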

NetFlow Analysis

Total traffic statistics for NetFlow data

hadoop jar ./p3.jar nflow.runner.Runner -r[source dir/file] -n[reduces] -js
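
A possible end-to-end run is sketched below, assuming the NetFlow files must first be copied into HDFS and that the target directory does not exist yet. The function `run_nflow_total`, the `HADOOP` override variable and the example paths are hypothetical; `hadoop fs -mkdir` and `hadoop fs -put` are standard Hadoop shell commands, and newer Hadoop releases may need `-mkdir -p` to create parent directories. The `-js` flag is passed through unchanged from the usage line above.

```shell
#!/bin/sh
# Command used to invoke Hadoop; override it (e.g. HADOOP=echo) for a dry run.
HADOOP=${HADOOP:-hadoop}

# usage: run_nflow_total <local dir> <hdfs dir> <reduces>
run_nflow_total() {
    local_dir=$1
    hdfs_dir=$2
    reduces=$3

    # Stage the input in HDFS, then submit the job; stop at the first failure.
    $HADOOP fs -mkdir "$hdfs_dir" &&
    $HADOOP fs -put "$local_dir"/* "$hdfs_dir"/ &&
    $HADOOP jar ./p3.jar nflow.runner.Runner -r"$hdfs_dir" -n"$reduces" -js
}

# Example call (placeholder paths and reducer count):
#   run_nflow_total ./netflow-dumps /user/hadoop/nflow 4
```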
