This project implements a program to parse flow log data and map each row to a corresponding tag based on a provided lookup table. The program reads flow logs and a lookup table from CSV files, processes the data, and generates output files that summarize tag counts and port/protocol combination counts.
- Java 17
- Flow Log File: A plain text file containing flow log entries in the specified format. Each entry is structured as follows: 2 123456789012 eni-xxxx 10.0.1.xxx 198.51.100.xxx OK
- Lookup Table: A CSV file mapping destination ports and protocols to tags. The format is: dstport,protocol,tag
- Protocol Numbers File: A CSV file mapping protocol numbers to protocol names, structured as: protocolNumber,protocolName
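Loading the lookup table amounts to reading each `dstport,protocol,tag` row into a map keyed by port and protocol. The sketch below is illustrative only (the class name `LookupTable` is hypothetical, not from the repository) and assumes keys are lower-cased so that matching is case-insensitive, as stated in the assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: load lookup rows ("dstport,protocol,tag") into a map.
// Keys are lower-cased so that "TCP" and "tcp" resolve to the same tag.
public class LookupTable {
    public static Map<String, String> parse(String[] csvLines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : csvLines) {
            String[] parts = line.split(",");
            // Skip malformed lines and the header row.
            if (parts.length != 3 || parts[0].equalsIgnoreCase("dstport")) continue;
            String key = parts[0].trim() + "," + parts[1].trim().toLowerCase();
            lookup.put(key, parts[2].trim());
        }
        return lookup;
    }

    public static void main(String[] args) {
        String[] rows = {"dstport,protocol,tag", "443,TCP,sv_P2", "25,tcp,sv_P1"};
        Map<String, String> lookup = parse(rows);
        System.out.println(lookup.get("443,tcp")); // prints sv_P2
    }
}
```

With up to 10,000 mappings, a `HashMap` keeps each per-row lookup at constant time.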
The program generates two output CSV files:
- Tag Counts: Contains counts of occurrences for each tag.
- Port/Protocol Counts: Contains counts for each port/protocol combination.
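Both outputs are simple frequency maps over the same `dstport,protocol` key. A minimal sketch of the two counters (the class name `TagCounter` is illustrative, and the assumption that unmatched rows are counted under an "Untagged" bucket is mine, not stated in this README):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the two counters the program emits.
// Assumes each flow log row has already been reduced to a "dstport,protocol" key.
public class TagCounter {
    public static Map<String, Integer> countTags(String[] keys, Map<String, String> lookup) {
        Map<String, Integer> tagCounts = new HashMap<>();
        for (String key : keys) {
            // Assumption: rows with no lookup match are counted as "Untagged".
            String tag = lookup.getOrDefault(key.toLowerCase(), "Untagged");
            tagCounts.merge(tag, 1, Integer::sum);
        }
        return tagCounts;
    }

    public static Map<String, Integer> countPortProtocol(String[] keys) {
        Map<String, Integer> combos = new HashMap<>();
        for (String key : keys) {
            combos.merge(key.toLowerCase(), 1, Integer::sum);
        }
        return combos;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = Map.of("443,tcp", "sv_P2");
        String[] keys = {"443,tcp", "443,TCP", "80,tcp"};
        System.out.println(countTags(keys, lookup));
        System.out.println(countPortProtocol(keys));
    }
}
```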
1. Clone the Repository:
   git clone https://github.com/komal98/flow-log-analyzer.git
   cd flow-log-analyzer
2. Compile the Code: Navigate to the src directory and compile the code:
   cd src
   javac Main.java
3. Run the Program: Execute the program directly from the src directory:
   java Main
- Ensure Input Files Are in Place: Make sure the input files (flow-logs.txt, lookup.csv, and protocol-numbers.csv) are located in the src/input directory.
- Check Output: After running, check the src/output directory for the generated CSV files containing the tag counts and port/protocol counts.
- If you are running the code in IntelliJ, make sure to add 'src/' to the file path in the build/run configuration as shown in the image below:
- The program only supports the default log format (version 2) as specified.
- The input flow log file can be up to 10 MB in size.
- The lookup file can contain up to 10,000 mappings.
- Matches are case-insensitive.
- The input data is assumed to be clean and in the required format.
- The large flow log file used for testing is randomly generated; actual data should be used to test edge cases more effectively.
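Given the version-2-only assumption above, extracting the lookup key from a log line reduces to splitting on whitespace. The sketch below assumes the standard v2 field order (version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, ...), in which dstport is the 7th field and the numeric protocol the 8th; the class name `FlowLogLine` is illustrative, not from the repository:

```java
import java.util.Map;

// Sketch: reduce one version-2 flow log line to a "dstport,protocol" key,
// translating the numeric protocol (e.g. "6") to its lower-case name
// (e.g. "tcp") via the protocol-numbers.csv mapping.
public class FlowLogLine {
    public static String toKey(String line, Map<String, String> protocolNames) {
        String[] f = line.trim().split("\\s+");
        String dstport = f[6]; // 7th field in the v2 format
        String protocol = protocolNames.getOrDefault(f[7], f[7]).toLowerCase();
        return dstport + "," + protocol;
    }

    public static void main(String[] args) {
        String line = "2 123456789012 eni-0a1b 10.0.1.5 198.51.100.7 49152 443 6 10 840 1620140761 1620140821 ACCEPT OK";
        System.out.println(toKey(line, Map.of("6", "tcp"))); // prints 443,tcp
    }
}
```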