/firewall

Primary LanguageJava

Design Log:

//10:20 AM Initial Thoughts: 
	//Build a sort of trie decision tree? But more commonly used rules won't have a faster processing time?
	//Researching how firewalls are built
//10:25 AM Learning about iptables, found the filter table
//10:41 AM Set up the CSV reader, as far as I can tell I can't implement an iptable, so I'll have to build a similar one
//10:48 switching from java to python, found a python library that can implement iptables
//11:55 Reread the specs, can't use special python libraries, and i'm pretty sure that qualifies
//11:22 Finished building the decision trie, if I had more time I would definitely go learn more about how the iptable is built
//11:35 Writing accept_packet, moving on to work on reading in ranges, might run out of time though
//12:00 Went a bit over because I rewrote the checkPass to be recursive, but still having errors with the checkPass for port range

Later Added:
// Learned about B-Trees, sounds much more aligned, but I'll try to figure out what I did


Algorithmic Choices:
	In hindsight, I would have preferred to be able to understand more about how IPtables organize the rules. As is, my solution fails to correctly pass packets, and I think perhaps rewriting a Trie class from scratch in 90 minutes was a bit too ambitious. I didn't think about parsing the IPAddresses or ports until much later in the process, and ended up hard coding things like the column index, which was less than ideal. I also didn't really have time to think about efficiency at all. Thinking about the trie model now, I never got to go back and consider how to update rules with a faster processing time. Perhaps adding weights to the rule values saved in each TrieNode, and organizing the hashSet as a heap with those weights as priorities (so that i could investigate the rules that are queried more and have heavier weight, faster) would have been a better tweak. This largely depends on the constant time querying of sets. Otherwise, this implementation would have been no better than simply iterating through all of the rules. 

Testing:
	I didn't have a whole lot of time to write tests. Instead, I had the a trial input csv file, and just checked to make sure the basics were working. Unfortunately, I attacked parsing the IP addresses too late, so although it correctly classifies port ranges, it will still reject all packets that don't exactly match the format of the rule. Given more time, I would have continued working on this, and tested my code to find the edge cases. I would also have written code to generate a much larger input file, so that I could test the 500K boundary as listed in the specs. 

Improvements:
	I'd like to test whether the decisionTrie actually works the way I think it does. I'll work on this some more after I turn it in anyway, but I'm fairly sure there are some errors either in the way the tree is built, or the way it's queried. The printTrie method helped me debug that initially, but my inputfile was not extensive.

Team Ranking:
	Policy Team: Interested in regulating VPN connections
	Platform Team: Interested in learning more about caching, collecting performance data
	Data Team: Interested in making use of policy metadata 

Sources & Inspiration:

https://danielmiessler.com/blog/professional-firewall-iptables/
https://www.digitalocean.com/community/tutorials/a-deep-dive-into-iptables-and-netfilter-architecture
// some previous code for parsing CSV files that I had
https://github.com/ldx/python-iptables/blob/master/iptc/ip4tc.py
http://www.geeksforgeeks.org/trie-insert-and-search/