/gilamon

Monitoring for DFSR

Overview

GilaMon is a simple monitoring tool for Windows Distributed File System Replication (DFSR), aimed at Windows system administrators.

It is also a set of tools written in Python that can be used by Windows system administrators who are comfortable with scripting to monitor DFSR or other Windows services through the Windows Management Instrumentation API.

GilaMon is BSD licensed and is designed to have a cleanly separated architecture that will hopefully make hacking on the code easy, even for novice programmers and system administrators who want to extend it. The package described here includes an example implementation of a CherryPy web service and a web page that an administrator can use to view the current status of DFSR on their network. You can also use GilaMon's backend to monitor DFSR or any other WMI namespace without the web service, by calling it as a script.

The backend for GilaMon makes WQL queries against your servers, performs introspection on the COM objects returned by WMI, and then gives you easy-to-comprehend tuples of key-value pairs. The keys will always be the same as the named properties of WMI objects described in Microsoft's documentation, and the values will be friendly types such as strings, ints, and Python datetime objects. For example, if you're making a query against the DfsrReplicatedFolderInfo class, you can check the documentation on MSDN to see that to find out the current size of the Staging folder, you should look at the CurrentStageSizeInMb property.
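To make the idea concrete, here is a minimal sketch of that kind of conversion. It is not GilaMon's actual code: `convert_value`, `properties_to_pairs`, and the two fake classes are hypothetical names used only for illustration, the sample values are invented, and it assumes Python 3. It relies on two facts about WMI: objects returned through COM expose a `Properties_` collection whose items have `Name` and `Value`, and datetimes arrive as CIM strings such as `20100315123045.000000-300`.

```python
import re
from datetime import datetime, timedelta

# CIM datetime: yyyymmddHHMMSS.mmmmmm followed by a signed UTC offset in minutes.
CIM_DATETIME = re.compile(r"^(\d{14})\.(\d{6})[+-]\d{3}$")


def convert_value(value):
    """Turn a raw WMI property value into a friendlier Python type."""
    if isinstance(value, str):
        match = CIM_DATETIME.match(value)
        if match:
            stamp, micros = match.groups()
            # The UTC offset is ignored here, so the result is a naive local time.
            return datetime.strptime(stamp, "%Y%m%d%H%M%S") + timedelta(
                microseconds=int(micros))
    return value


def properties_to_pairs(wmi_object):
    """Return a tuple of (property name, converted value) pairs."""
    return tuple((prop.Name, convert_value(prop.Value))
                 for prop in wmi_object.Properties_)


class FakeProperty(object):
    """Stand-in for a COM property so the sketch runs without Windows."""
    def __init__(self, name, value):
        self.Name, self.Value = name, value


class FakeFolderInfo(object):
    """Stand-in for one DfsrReplicatedFolderInfo result (values are made up)."""
    Properties_ = [
        FakeProperty("CurrentStageSizeInMb", 512),
        FakeProperty("LastConflictCleanupTime", "20100315123045.000000-300"),
    ]


pairs = properties_to_pairs(FakeFolderInfo())
print(dict(pairs)["CurrentStageSizeInMb"])
```

Because the keys are the documented WMI property names, turning the pairs into a `dict` lets you look up a value such as `CurrentStageSizeInMb` directly.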

Motivation

Why build something like this? At Burns Engineering, Inc., where GilaMon was developed, we moved to DFSR in early 2010. We used DFSR to replicate our main file shares between the home office and four remote sites, as well as a backup hub. All was well for quite some time. But then we had a minor catastrophe - a user saved a file, and it disappeared from the share. This happened repeatedly over a short period of time, causing a lot of grief to our users.

We eventually isolated the problem to a particular application's files (Autodesk's AutoCAD) and a particular replication connection to one of our branch offices with poor connectivity. A support case with Microsoft went nowhere, and the only tools that Microsoft had available for us were log grubbers - it seemed impossible to catch DFSR in the act. Eventually the problem "went away" on its own, but we knew that it was only a matter of time until it returned.

We decided we needed a dead-simple monitoring tool to tell us what the current status of DFSR replication was, and so GilaMon was born. An enlightened IT Manager agreed that this tool would be useful to share with our community, and that's why I've been able to release it publicly.

Dependencies

GilaMon relies on COM and is currently supported only on Windows. It can be run from Windows XP, Vista, 7, Server 2008, and Server 2008 R2. The backend for GilaMon requires:

  • Python (Tested with 2.7)
  • pywin32 (Tested with Build 216 and newer)
  • mock (You don't need this to run GilaMon, just to develop on it.)

The example GilaMon web service uses the following libraries:

  • CherryPy
  • Jinja2

Browser Support

The GilaMon example web service has been tested with Internet Explorer 7 and newer, and should work with other modern browsers as well. Internet Explorer 6 and earlier is unsupported.

The backend of GilaMon is browser-independent.

Installation

There are a couple of different ways to install GilaMon, depending on what you want to do with it. You can:

  • Run as a Python service or script. [Recommended]
  • Run the web service from a Windows executable.
  • Run the web service as a Python script.
  • Run DFSR queries as a Python script.
  • Run arbitrary WMI queries as a Python script.

Installing as a Python Service or Script [Recommended]

Running as a Python service takes more setup if you don't already use Python, but it is the recommended way to run GilaMon because it makes it easier to update the software and the libraries it depends on. You can also hack directly on the code without going through a complicated compile step with py2exe. Do the following on the machine where you want to run the GilaMon web service (either your workstation or a server).

Important: If you've never installed Python on Windows before, you must set/modify Windows environment variables in order for your Python install to work.

  • Create a system variable named PYTHON_HOME pointing to the folder where you have Python installed.
  • At the end of your PATH system variable, add %PYTHON_HOME%;%PYTHON_HOME%\Scripts; to support both Python and setuptools.
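If you'd like to confirm that both variables took effect, a small check along these lines can be run from a new command prompt. This is only a sketch and not part of GilaMon; `check_python_env` is a hypothetical helper name, and it assumes Windows has already expanded `%PYTHON_HOME%` inside `PATH` by the time Python reads it.

```python
import os


def check_python_env(environ):
    """Return a list of problems found; an empty list means the setup looks fine."""
    home = environ.get("PYTHON_HOME", "").strip().rstrip("\\/")
    if not home:
        return ["PYTHON_HOME is not set"]

    # Windows separates PATH entries with semicolons and ignores case.
    entries = set(
        entry.strip().rstrip("\\/").lower()
        for entry in environ.get("PATH", "").split(";")
        if entry.strip()
    )
    problems = []
    for wanted in (home, home + "\\Scripts"):
        if wanted.lower() not in entries:
            problems.append("PATH does not contain " + wanted)
    return problems


if __name__ == "__main__":
    for problem in check_python_env(os.environ) or ["Environment looks fine"]:
        print(problem)
```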

If you want to run as a service instead of a script, also do the following:

  • Open a command line into the gilamon directory.
  • python setup.py -install

Installing as a Windows Executable

If you just want to run GilaMon as a Windows service, and don't plan on making any changes to the code, you don't need to install Python or any dependencies. You can just download the Windows executables. You can run the executable from a server and then use your workstation's web browser, or you can run the executable directly from your workstation and browse to localhost instead.

  • Download the file for your architecture (64 or 32 bit).
  • Unzip the archive into your Program Files directory.
  • Open a command line into the gilamon directory.
  • gilamon_service.exe -install

pywin32, cherrypy and jinja2 are bundled in the executable. See the licenses folder in the zip file for the licenses for these libraries.

Running GilaMon

If you want to run GilaMon as a Windows service, whether from the executable or the Python code:

  • Use a text editor to change /gilamon/config/gilamon.conf to the port and address you want your service to listen on. Also add the host names of your DFSR servers under the [dfsr] section.
  • Go to the Windows Start menu, right-click on Computer (or My Computer, depending on your version), and select Manage.
  • Under Services you should now see GilaMon. Go to the service's properties, change the logon account if you need to, and set the service to Automatic start if you'd like.
  • Click Start to start your service. If the service fails to start, you should see an event in your Event Viewer.
  • Point a web browser at the address and port you put in the gilamon.conf file.

If you want to run GilaMon with the web service as a Python script:

  • Use a text editor to change /gilamon/config/gilamon.conf to the port and address you want your service to listen on. Also add the host names of your DFSR servers under the [dfsr] section.
  • Go to the command line and navigate to the gilamon directory.
  • python web_server.py
  • Point a web browser at the address and port you put in the gilamon.conf file.

The gilamon.conf file uses Python syntax. If you don't know Python, that's okay: just follow the pattern that's been provided. The IP address and server names must be surrounded by quotes (single or double, as long as they match), and the port number must not be in quotes. For the log file path, use forward slashes or doubled backslashes.
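The quoting rules are easier to see with a small experiment. The sketch below is not how GilaMon loads its configuration; it simply parses a made-up config with Python 3's `configparser` and evaluates each value as a Python literal, which is the general idea behind a "Python syntax" config file. The key names are assumptions: the `[global]` keys are standard CherryPy settings, `servers` under `[dfsr]` is a hypothetical name, and the address, port, and host names are placeholders. Follow the pattern in the file shipped with GilaMon instead of copying these.

```python
import ast
import configparser

# A made-up config in the same spirit as gilamon.conf (raw string, so the
# backslashes below appear exactly as they would in the file on disk).
SAMPLE_CONF = r"""
[global]
server.socket_host = "0.0.0.0"
server.socket_port = 8080
log.error_file = "C:/Windows/temp/gilamon.log"
# The same path with doubled backslashes would also work:
# log.error_file = "C:\\Windows\\temp\\gilamon.log"

[dfsr]
servers = ["fileserver1", "fileserver2"]
"""


def load_conf(text):
    """Parse config text and evaluate every value as a Python literal."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)
    return {
        section: {key: ast.literal_eval(value)
                  for key, value in parser.items(section)}
        for section in parser.sections()
    }


conf = load_conf(SAMPLE_CONF)
print(conf["global"]["server.socket_host"], conf["global"]["server.socket_port"])
print(conf["dfsr"]["servers"])
```

In this sketch the quoted address comes back as a string and the unquoted port as an integer. An unquoted host name such as `fileserver1` would fail to evaluate at all, which is why the quotes matter.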

If you want to run GilaMon as a script without the web service, you'll want to open your Python interpreter and either import dfsr_query or import wmi_client to get the modules you'll need for your purposes. See the source code for documentation for these calls. (TODO: add this information to Wiki).
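Until that documentation exists, it may help to see what a raw WMI query looks like from Python. The sketch below deliberately does not use `dfsr_query` or `wmi_client`, whose interfaces are documented only in the source; it goes straight to pywin32's `win32com.client` instead. `build_wql`, `build_moniker`, and `run_query` are hypothetical helper names, and `fileserver1` is a placeholder host name. DFSR's WMI classes live in the `root\MicrosoftDFS` namespace.

```python
def build_wql(wmi_class, properties=None):
    """Build a simple WQL SELECT statement for one WMI class."""
    columns = ", ".join(properties) if properties else "*"
    return "SELECT {0} FROM {1}".format(columns, wmi_class)


def build_moniker(server, namespace):
    """Build the winmgmts moniker for a namespace on a remote server."""
    return r"winmgmts:\\{0}\{1}".format(server, namespace)


def run_query(server, namespace, wql):
    """Run a WQL query and return the raw COM result objects (Windows only)."""
    # Imported here so the module still loads where pywin32 is not installed.
    import win32com.client

    service = win32com.client.GetObject(build_moniker(server, namespace))
    return list(service.ExecQuery(wql))


# Example use on a Windows machine with pywin32 installed:
#
#   wql = build_wql("DfsrReplicatedFolderInfo",
#                   ["ReplicatedFolderName", "CurrentStageSizeInMb"])
#   for item in run_query("fileserver1", r"root\MicrosoftDFS", wql):
#       print(item.ReplicatedFolderName)
#       print(item.CurrentStageSizeInMb)
```

The account running the script needs permission to make WMI queries against the target server, just as the GilaMon service account does.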

Support

For general questions or comments, please send me an email. To report a bug or other type of issue, please use the issue tracker.

Troubleshooting

Following are what I suspect might be Frequently Asked Questions about installing and running GilaMon.

The GilaMon service installs, but won't start.

Check the Event Log. It may show you that it's a configuration issue. Make sure the IP and port number are valid. If that's not it, please contact me or file an issue so that we can try to fix the problem (include the text of the event, if possible).

Also, make sure that your Windows environment variables PYTHON_HOME and PATH have been set.

The GilaMon service installs and starts, but I get "Internet Explorer cannot view this page" on the web page.

Make sure that the Windows firewall on the server running the web service allows inbound traffic on the port you've set in gilamon.conf.
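One quick way to tell a firewall problem from a browser problem is to try a plain TCP connection from your workstation. The following is a minimal sketch using only the standard library (Python 3 assumed); `port_is_reachable` is a hypothetical helper name, and you would substitute the host and port from your own gilamon.conf.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If this returns False from your workstation but True when run on the server itself against the same address, a firewall between the two is the likely culprit.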

The GilaMon service installs and starts, but I get "ERROR: Failed to get connector states" on the web page.

Check the log file found at C:/Windows/temp/gilamon.log (if you didn't change this path in your config). You may see an Access Denied error in the stack trace. Make sure the user that you're using for the GilaMon service has permission to make WMI queries against the DFSR server (Server Manager -> Configuration -> WMI Control).

Yeah, I tried that already.

Sorry about that! Please use the issue tracker and file an issue so that I can fix the problem and improve GilaMon for everyone. Please send along any relevant log information.

Contributing

GilaMon is an open source project managed using Git version control. The repository is hosted on GitHub, so contributing is simple: fork and make a pull request. Make sure you've included tests.

Future Features

The following are features I'd like to add in the future:
  • A user-friendly command-line tool for making on-the-fly WQL queries.
  • Active Directory-based authentication to the web page and general security improvements that would make it suitable to run on an Internet-facing page.
  • Set up and register for easy_install installation.
  • Support for running from Linux. There's a Samba-based library for WMI, but it was more trouble than it was worth at the time of release.