Programmed by David Hinkle, Commissioned by DerbyTech of Illinois.

Special thanks go to Brice Beaman at brice@beamans.org for releasing the software, testing and debugging, Blaze at ts@spective.net for his excellent logo, Andreas Henriksson for polishing, testing, and fixing, and all the guys at havok for distributing clue.

#### LICENSE #

You may use this software under any version of the GPL that is current as of your download. For exact terms and conditions please see www.gnu.org.

#### WHAT IT IS #

Bandwidthd is a UNIX daemon/Windows service for graphing the traffic generated by each machine on several configurable subnets. It is much easier to configure than MRTG, and provides significantly more useful information. MRTG only tells you how much bandwidth you are using; Bandwidthd tells you that, and who is using it.

Each IP address that has moved any significant volume of traffic has its own graph. The graphs are color coded to help you figure out at a glance if your user is surfing the web, or surfing Kazaa.

Bandwidthd is targeted to run on my routing platforms. It is very low overhead, easily graphing small business traffic on a 133MHz Elan 486 every 2.5 minutes. My entire ISP (2000-3000 IP addresses across 4 states) is graphed on a Celeron 450 every 10 minutes.

#### PORTABILITY #

Bandwidthd compiles clean on:

    ix86
        Solaris 9
        Debian 2.2
        Fedora Core 2
        OpenBSD 3.4
        FreeBSD 4.8
        NetBSD 1.6.1
    AMD64
        Fedora Core 3
    PPC G4
        MacOSX 10.2

Thanks go to SourceForge for providing the test platforms.

#### CONFIGURATION INSTRUCTIONS #

There are now two ways to install Bandwidthd: the fast, easy way, which uses the built-in Bandwidthd graphing system to generate static HTML pages and graphs, and the much more complicated way, which supports multiple sensors, stores its data in a back-end database, and generates reports and graphs with easily customized php scripts.
If you are new to Bandwidthd, I recommend installing it by following the instructions in the bandwidthd.conf file. If you are interested in customizing your output or you need a more scalable solution, you can always come back and jump through the database hoops later.

See "DATABASE SUPPORT" for information on Bandwidthd's advanced configuration.

#### GRAPHING INTERVAL #

Bandwidthd defaults to graphing up to 4000 local IPs every 200 seconds. If you need to track more IPs, change IP_NUM in bandwidthd.h. The weekly graph updates every 10 minutes, the monthly every hour, and the yearly every 12 hours. A graphing run is automatically skipped if the last one hasn't finished before the new one would begin.

#### CDF LOGGING #

Bandwidthd can be made to log to CDF by setting "output_cdf" to true. This logs each interval's traffic, so you can import it into a database and use a tool like Access to create your own graphs, or implement 95th percentile billing, for example.

Sending Bandwidthd a HUP will cause it to rotate its logs. It will rotate out 5 times before deleting the oldest log file. These logs are log.1.0.cdf-log.1.5.cdf for daily, log.2.0.cdf-log.2.5.cdf for weekly, etc.

If you are upgrading from an older version of Bandwidthd from before all 4 logs rotated, you must rename your log files for the new Bandwidthd to find them:

    mv log.cdf log.1.0.cdf
    mv log.1.cdf log.1.1.cdf
    mv log.2.cdf log.1.2.cdf
    mv log.3.cdf log.1.3.cdf
    mv log.4.cdf log.1.4.cdf
    mv log.5.cdf log.1.5.cdf
    mv log2.cdf log.2.1.cdf
    mv log3.cdf log.3.1.cdf
    mv log4.cdf log.4.1.cdf

The log format is best documented in the "StoreIPDataInCDF" function in bandwidthd.c. As of this writing, it consists of one line for each IP address for each interval. The line contains only data for the previous interval.
Fields:

    IP Address,Timestamp,Total Sent,Icmp Sent,Udp Sent,Tcp Sent,Ftp Sent,Http Sent,
    P2P Sent,Total Received,Icmp Received,Udp Received,Tcp Received,Ftp Received,
    Http Received,P2P Received

#### HOW TO KEEP YOUR GRAPHS BETWEEN REBOOTS #

The following is the easy way to keep your graphs between reboots. You can also opt to build and use Bandwidthd with database support, as described in "DATABASE SUPPORT" below.

In the bandwidthd.conf file set:

    output_cdf true
    recover_cdf true

output_cdf will cause Bandwidthd to log all of its data to the log.cdf file in its directory. recover_cdf will cause Bandwidthd to load that file when it starts.

You will also want to make a crontab entry like so:

    0 0 * * * /bin/kill -HUP `cat /var/run/bandwidthd.pid`

This will send Bandwidthd a HUP every night at midnight.

When Bandwidthd receives a HUP, it schedules a rotation of its log files during the next graphing run. Daily log files rotate on each HUP. Weekly/Monthly/Yearly log files rotate every so many HUPs. Log files get rotated out 5 times before deletion.

FYI, if you use killall instead of kill, each of the children will receive the HUP command twice, causing them to rotate their log files twice as often as they should.

#### GRAPHING #

Note that Bandwidthd does not bother to graph an IP that has transmitted less than 1Mbit of data. It does, however, include that IP in the table of IPs at the top of the page, so its traffic can still be viewed. This cutoff can be changed by modifying graph_cutoff in bandwidthd.conf. "graph_cutoff" is in kilobytes.

Graphing can be disabled by setting "graph" to false. Bandwidthd will still log, but will use almost no ram or CPU cycles.
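As a rough illustration of what can be done with the CDF logs described above, here is a minimal Python sketch that reads CDF lines and computes a per-IP 95th percentile of the "Total Sent" column. It assumes the comma-separated field order listed in this README and that every field after the IP address is an integer; check StoreIPDataInCDF in bandwidthd.c before relying on either. The function names (parse_cdf, percentile_95, billing_by_ip) and the sample data are made up for this example and are not part of Bandwidthd, and nearest-rank is only one of several common percentile definitions.

```python
import csv
import io

# Field order as listed in this README (an assumption; verify against
# StoreIPDataInCDF in bandwidthd.c for your version).
CDF_FIELDS = [
    "ip", "timestamp",
    "total_sent", "icmp_sent", "udp_sent", "tcp_sent",
    "ftp_sent", "http_sent", "p2p_sent",
    "total_received", "icmp_received", "udp_received", "tcp_received",
    "ftp_received", "http_received", "p2p_received",
]


def parse_cdf(stream):
    """Yield one dict per CDF line, with every field after the IP as an int."""
    for row in csv.reader(stream):
        if len(row) != len(CDF_FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(CDF_FIELDS, (cell.strip() for cell in row)))
        for name in CDF_FIELDS[1:]:
            record[name] = int(record[name])
        yield record


def percentile_95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def billing_by_ip(stream):
    """Map each IP to the 95th percentile of its per-interval total_sent."""
    per_ip = {}
    for record in parse_cdf(stream):
        per_ip.setdefault(record["ip"], []).append(record["total_sent"])
    return {ip: percentile_95(values) for ip, values in per_ip.items()}


# Made-up sample lines for illustration, not real Bandwidthd output.
SAMPLE = io.StringIO(
    "10.0.0.1,1000,500,0,0,500,0,500,0,900,0,0,900,0,900,0\n"
    "10.0.0.1,1200,700,0,0,700,0,700,0,100,0,0,100,0,100,0\n"
    "10.0.0.2,1000,40,40,0,0,0,0,0,40,40,0,0,0,0,0\n"
)
RESULT = billing_by_ip(SAMPLE)
print(RESULT)
```

To run this against a real log, pass an open file (for example open("log.1.0.cdf", newline="")) to billing_by_ip instead of the sample. The result is in whatever units Bandwidthd writes to the log for each interval; turning that into a rate would mean dividing by the interval length.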
#### COLOR CODES #

    RED     ICMP
    BROWN   UDP
    YELLOW  IP ENCAPSULATED (IP over IP, IPSEC, most VPNs)
    BLUE    HTTP (Port 80 TCP, actually)
    PURPLE  Peer2Peer (Lots of TCP ports generally used by P2P software)
    GREEN   TCP

#### SPECIFYING THE LIBPCAP FILTER #

If you'd like more control over what traffic is counted, you can specify a manual filter to be passed to libpcap. You can use this to remove certain IPs, to graph only certain IPs, or to limit the graphs to certain kinds of traffic. You should always specify "ip" in the filter.

For example:

    filter "ip and host 64.215.40.1"

#### HOW TO IMPROVE PERFORMANCE #

Bandwidthd's primary bottleneck in static HTML mode is the drawing of graphs for IP addresses. To improve Bandwidthd's performance, therefore, the only thing you can really do is reduce the number of graphs it has to draw in any given run.

Adjust graph_cutoff in the bandwidthd.conf file to tune out the IP addresses that don't use much bandwidth. These IP addresses will still have their data displayed in the table at the top of the page, so you can still see what's going on with them.

Alternatively, you can choose to graph less often. Bandwidthd automatically skips a graphing run if the last one is still going when the new one is scheduled to start. If you'd like to graph less often than your server is capable of, you can edit skip_intervals in bandwidthd.conf. A value of 1 means to skip every other interval; 3 would mean to skip three intervals between each run. This doesn't disable Bandwidthd's automatic skipping.

You can also choose to deploy Bandwidthd with database support, which provides significant performance gains.

#### DATABASE SUPPORT #

Since version 2.0, Bandwidthd has supported external databases. This system consists of 3 major parts:

1. The Bandwidthd binary, which acts as a sensor, recording traffic information and storing it in a database across the network or on the local host. In this mode Bandwidthd uses very little ram and CPU.
In addition, multiple sensors can record to the same database.

2. The database system. Currently Bandwidthd only supports Postgresql and SQLite.

3. The webserver and php application. Bundled with Bandwidthd in the "phphtdocs" directory is a php application that reports on and graphs the contents of the database. This has been designed to be easy to customize. Everything is passed around in the urls; just tinker with it a little and you'll see how to generate custom graphs pretty easily.

Using Bandwidthd with a database has many advantages: much lower overhead, because graphs are only drawn on demand, and much more flexibility, because SQL makes building new reports easy and php+sql greatly improves the interactivity of the reports.

My ISP has now switched over to the database driven version of Bandwidthd entirely. We have half a dozen sensors sprinkled around the country, writing millions of data points a day on our customers into the system.

INSTRUCTIONS

As a prerequisite for these instructions, you must have Postgresql installed and working, as well as a web server that supports php.

Database Setup:

1. Create a database for Bandwidthd. You will need to create users that can access the database remotely if you want remote sensors.

2. Bandwidthd's schema is in "schema.postgresql". "psql mydb username < schema.postgresql" should load it and create the 2 tables and 4 indexes.

Bandwidthd Setup:

1. Add the following lines to your bandwidthd.conf file:

    # Standard postgres connect string, just like php, see postgres docs for
    # details
    pgsql_connect_string "user = someuser dbname = mydb host = databaseserver.com"
    # As alternative to the pgsql_connect_string option above, set the SQLite db path:
    # sqlite_filename = "/tmp/bandwidthd.sqlite"
    #
    # Arbitrary sensor name, I recommend the sensor's fully qualified domain
    # name
    sensor_id "sensor1.mycompany.com"
    # Tells Bandwidthd to keep no data and perform no graphing locally
    graph false
    # If this is set to true Bandwidthd will try to recover the daily log
    # into the database. If you set this true on purpose only do it once.
    # Bandwidthd does not track the fact that it has already transferred
    # certain records into the database.
    recover_cdf false

2. Simply start Bandwidthd, and after a few minutes data should start appearing in your database. If not, check syslog for error messages.

Web Server Setup:

1. Copy the contents of phphtdocs into your web tree somewhere.

2. Edit config.conf to set your db connect string.

You should now be able to access the web application and see your graphs. All graphing is done by graph.php, and all parameters are passed to it in its url. You can create custom urls to pull custom graphs from your own index pages, or use the canned reporting system.

In addition, you should schedule bd_pgsql_purge.sh to run every so often. I recommend running it weekly. This script outputs sql statements that aggregate the older data points in your database, to reduce the amount of data that needs to be slogged through to generate yearly, monthly, and weekly graphs.

Example:

    bd_pgsql_purge.sh | psql bandwidthd postgres

This connects to the bandwidthd database on the local host as the user postgres and summarizes the data.

#### KNOWN BUGS AND TROUBLESHOOTING #

If Bandwidthd shows you nothing but a message saying "Bandwidthd has nothing to graph", that means it currently has recorded no data.
The cause is most likely one of these:

1. Its first interval hasn't expired yet (2.5 minutes).
2. It is still chewing through large CDF logs.
3. Bandwidthd's host machine is on a switch and therefore isn't seeing any traffic other than what is destined to it, coming from it, or going through it.
4. You don't have a subnet line.
5. You have a subnet line but it doesn't match any of the packets Bandwidthd is seeing.
6. You have a filter line that is filtering out all the traffic Bandwidthd could be seeing.

Bandwidthd doesn't do a very good job of tracking ftp. This is because only some ftp server software follows the rfc standard of sourcing all ftp transfers from port 20. Surprisingly, Microsoft's ftp daemon is compliant in this regard, whereas most open source daemons are not.

Bandwidthd forks to perform its graphing functions. After this fork finishes, it remains as a zombie until the next interval, at which time it is reaped by the main process. This zombie is nothing to worry about; it's just an entry in the process table waiting to be deleted. With the new weekly, monthly, and yearly graphs you'll have up to 4 zombies now.

By default, Bandwidthd now runs in promiscuous mode. This means it can be used to monitor traffic not routing through its host. Just make sure that the host and the actual router are on the same hub (not switch) and everything will be ok. Under some circumstances, traffic may get counted twice. If traffic routes to the other router, then routes back out along the same wire, it may get counted twice by Bandwidthd. This is much less of a problem than you might think, due to a little known packet called an "icmp redirect" that causes all packets after the first to go directly to their target. If you find that traffic looks like it's getting counted twice, make sure you're not firewalling off the icmp redirects.
If you find that you actually see none of this traffic, it may be because the icmp redirects cause it to ultimately end up going from one port on a switch to another, never touching your hub.

If you want Bandwidthd to see only traffic actually going into and out of the host, set "promiscuous" to false in bandwidthd.conf.

Bandwidthd supports ethernet, Linux cooked sockets, raw, token ring, and most ppp packet encapsulation. If you get an "Unknown Datalink Type" error, Bandwidthd has not been programmed to handle the physical encapsulation of the ip packets on that interface. If you send me a sample capture and a copy-paste of the error message, I'll see if I can make Bandwidthd work for you.