Repository for UC Santa Cruz's work on Libresoft's CVSAnalY

Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0)

MininGit

Warning: MininGit is no longer being actively developed or used. It was working last time we looked, but that may change depending on what happens with Git versions. Good luck :)

Description

The MininGit tool extracts information out of source code repository logs and stores it into a database. MininGit is a fork of Libresoft's CVSAnalY.

Quick installation

  1. Get pip: sudo easy_install pip
  2. Use pip: pip install "https://github.com/SoftwareIntrospectionLab/MininGit/tarball/master#egg=master"

Slower installation

Requirements

Note for upgraders: MininGit now uses setuptools for installation. Depending on your PYTHONPATH, the old MininGit/CVSAnalY might not be removed (or worse, override this release). Please check for and remove old installations before installing this version.

MininGit has the following dependencies:

  • Python 2.5 or higher

  • RepositoryHandler (this needs to be placed in your PYTHONPATH)

    git clone https://github.com/SoftwareIntrospectionLab/repositoryhandler.git

  • Guilty (optional. Required for the Blame or HunkBlame extensions, also needs to be discoverable in the PYTHONPATH)

    git clone http://github.com/SoftwareIntrospectionLab/guilty.git

  • CVS (optional. Required for CVS support. Make sure to read the "SCM Support" section.)

  • Subversion (optional. Required for SVN support. Make sure to read the "SCM Support" section.)

  • Git (optional. Required for Git support. Must be >= 1.7.4 for HunkBlame extension to work)

  • Python MySQLdb (optional, but required if you want to use MySQL as your database engine)

  • python-progressbar (http://code.google.com/p/python-progressbar/)

  • Pygments (optional. Required for extension HunkBlame with the option --hb-ignore-comments. This needs to be placed in your PYTHONPATH)
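
Before installing, it can help to check which of the dependencies above are already visible to your interpreter. The following is only a rough diagnostic sketch, not part of MininGit: the helper names are hypothetical, and the import names (repositoryhandler, guilty, MySQLdb, progressbar, pygments) are assumptions about how each package is exposed on your PYTHONPATH. It assumes Python 2.7 or later, and should be run with the same interpreter you use for MininGit.

```python
import re
import subprocess


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


def parse_git_version(output):
    """Extract (major, minor, patch) from `git --version` output, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    return tuple(int(part or 0) for part in match.groups())


def git_supports_hunkblame(output):
    """True if the reported Git version is at least 1.7.4."""
    version = parse_git_version(output)
    return version is not None and version >= (1, 7, 4)


if __name__ == "__main__":
    # Assumed import names; adjust them if your installation differs.
    required = ["repositoryhandler", "progressbar"]
    optional = ["guilty", "MySQLdb", "pygments"]
    print("Missing required modules: %s" % missing_modules(required))
    print("Missing optional modules: %s" % missing_modules(optional))

    try:
        raw = subprocess.check_output(["git", "--version"])
        output = raw.decode("utf-8", "replace")
    except (OSError, subprocess.CalledProcessError):
        output = ""  # git is not installed or not on PATH
    print("Git new enough for HunkBlame: %s" % git_supports_hunkblame(output))
```

An empty list means every module in that group was importable. A module reported as missing usually means its directory is not on your PYTHONPATH.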

Install

You don't need to do anything if you are happy using MininGit from the path you downloaded it to. This is easiest if you intend to stay up-to-date with our releases from our Git repositories. You can also move the directory to wherever you wish.

If you want to install it to a system location, you can do this by running the setup.py script:

python setup.py install

If you do this, you'll need to remember to run this every time you get a new release.

If you don't have root privileges, you can just add MininGit to your $PATH (<MininGit dir> is the directory where MininGit is installed):

export PATH=$PATH:<MininGit dir>

MininGit needs RepositoryHandler. If it is not installed in the usual path for Python packages, PKG_CONFIG_PATH should include the directory where it is installed (repohandlerdir is the path where RepositoryHandler is installed):

export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:repohandlerdir

You are now ready to use MininGit!

Running MininGit if you installed it

Just check out (from Git/SVN/CVS) a local copy of your repository, and then run miningit from inside it. Here's an example using Voldemort:

$ git clone git://github.com/voldemort/voldemort.git ~/Downloads/voldemort
$ cd ~/Downloads/voldemort
~/Downloads/voldemort$ miningit

More options, and more detailed information about them, can be found by running miningit --help.

Running MininGit from its directory

Just check out (from Git/SVN/CVS) a local copy of your repository, and then run miningit, pointing it at the directory you checked out. Here's an example using Voldemort:

$ git clone git://github.com/voldemort/voldemort.git ~/Downloads/voldemort
$ cd [where you downloaded MininGit to]
[MininGit directory]$ ./miningit ~/Downloads/voldemort

More options, and more detailed information about them, can be found by running ./miningit --help.

SCM Support

At this point in time, only Git is fully tested and supported across all of MininGit and its extensions. SVN is supported on a "best effort" basis: things shouldn't break using SVN, but if they do, you're unlikely to get a response to a bug tracker issue unless it comes with a pull request.

MininGit was originally created to support CVS and SVN. Git support appeared later, and Bazaar support was started but abandoned. As development has continued, it has become clear that Git offers the best possibilities for mining source code repositories. Because Git allows the entire source history to be downloaded to local storage, MininGit actions are orders of magnitude faster. For example, the Content extension can get every revision of a file. With CVS and SVN, this requires sending the request to the central server, having the server (slowly) process it, and then getting the content back. We've found that operations which take hours on Git can take weeks with SVN.

If you have an SVN repository that you want to mine, but you can't find a Git mirror for it, we've had good success with svn2git.

If you're having problems

Packet bigger than max_allowed_packet

Sometimes, a lot of data can pass between MininGit and MySQL, and packet limits are set too small.

Follow the instructions here.
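
As a rough sketch of one common way to raise the limit (not necessarily what the linked instructions describe), the server's max_allowed_packet variable can be inspected and increased over an ordinary database connection. The function name and the 64 MB default below are hypothetical choices for illustration. The function accepts any DB-API style cursor, such as one obtained from MySQLdb. Note that SET GLOBAL needs administrative privileges on the server, only affects connections opened afterwards, and is lost when the server restarts unless the value is also set in the server's configuration file.

```python
def ensure_packet_limit(cursor, wanted_bytes=64 * 1024 * 1024):
    """Raise MySQL's max_allowed_packet if it is below wanted_bytes.

    Returns a (before, after) tuple of byte counts.
    """
    # SHOW VARIABLES returns rows of (Variable_name, Value).
    cursor.execute("SHOW VARIABLES LIKE 'max_allowed_packet'")
    row = cursor.fetchone()
    current = int(row[1])
    if current >= wanted_bytes:
        return current, current
    # Needs administrative privileges; applies to new connections only.
    cursor.execute("SET GLOBAL max_allowed_packet = %d" % wanted_bytes)
    return current, wanted_bytes


# Example use (assumes MySQLdb is installed and the account may SET GLOBAL;
# the credentials are placeholders):
#
#   import MySQLdb
#   conn = MySQLdb.connect(host="localhost", user="root", passwd="secret")
#   print(ensure_packet_limit(conn.cursor()))
```

After raising the limit, start a fresh MininGit run so that it opens a new connection and picks up the new value.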

UnicodeEncodeError: 'ascii' codec can't encode character

This happens because Python is trying to print out a Unicode string to a terminal that has told Python it only supports ASCII. You can coerce Python into printing Unicode by setting up your sitecustomize.py.
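
To make the failure concrete, here is a small illustration of what goes wrong and how replacing unencodable characters avoids it. This is a sketch only, not MininGit code and not a substitute for the sitecustomize.py approach: the helper name printable is hypothetical, and the fallback to ASCII when the terminal reports no encoding is an assumption.

```python
import sys


def printable(text, encoding=None):
    """Return `text` with characters the target encoding cannot
    represent replaced by '?', so printing it cannot raise
    UnicodeEncodeError."""
    if encoding is None:
        # Fall back to ASCII when the terminal does not report an encoding.
        encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace").decode(encoding)


if __name__ == "__main__":
    name = u"M\xf3stoles"
    try:
        name.encode("ascii")  # what an ASCII-only terminal forces Python to do
    except UnicodeEncodeError as error:
        print("Strict ASCII encoding fails: %s" % error)
    print(printable(name, "ascii"))
```

The trade-off is that non-ASCII characters are lost in the output, so this only suits diagnostic printing, not data that is stored or compared.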

Credits

CVSAnalY is developed by the GSyC/LibreSoft group at the Universidad Rey Juan Carlos in Móstoles, near Madrid (Spain). It is part of a wider research effort on libre software engineering, aimed at gaining knowledge of how libre software is developed and maintained.

MininGit is actively contributed to by the Software Introspection Lab at the University of California, Santa Cruz, which hosts Git mirrors at https://github.com/SoftwareIntrospectionLab . UCSC can review pull requests and bug reports using GitHub's systems. This is currently more active than the official LibreSoft repository ecosystem, so your issue may be more likely to be reviewed there.

More information

Main authors of CVSAnalY

Contributors of CVSAnalY

Contributors of MininGit