/cvsanaly

Fork of GSyC/LibreSoft's SCM mining tool with some added features coming

Primary LanguagePythonGNU General Public License v2.0GPL-2.0

CVSAnalY

Description

The cvsanaly tool extracts information out of source code repository logs and stores it into a database.

About this fork

This fork of cvsanaly looks to bring the tool up-to-date (such as removing dependencies on SQLite by changing the base requirement to at least Python 2.5), as well as add some extra functionality (such as storing the code of revisions into tables). It is being developed by the Software Introspection Lab at the University of California, Santa Cruz. It is not an official fork, and does not guarantee that it will maintain parity with the original cvsanaly2.

All releases to the master branch should be stable for use.

Differences to original

  • Added an extension called Content, which will save the content of source files at each revision (this takes a long time).
  • Added an extension called Hunks, which will parse diffs and find the start/end lines of changes, changes with just removed lines are ignored (this takes a really long time unless using git).
  • Updated to use sqlite3, bundled with Python 2.5+
  • Fixed Unicode handling of git repositories
  • Installable, with required repositoryhandler dependency, via pip, rather than requiring checking out multiple projects from source and performing manual installation.

Requirements

CVSAnalY has the following dependencies:

  • Python 2.5 or higher

  • RepositoryHandler (can be installed by pip requirements, if you know how to use that. More instructions forthcoming. This needs to be placed in your PYTHONPATH.)

    git clone https://github.com/SoftwareIntrospectionLab/repositoryhandler.git

  • Guilty (If you need to run Blame or HunkBlame extensions, also needs to be discoverable in the PYTHONPATH)

    git clone http://git.libresoft.es/guilty

  • CVS (optional, for CVS support)

  • Subversion (optional, for SVN support)

  • Git (optional, for Git support. Must be >= 1.7.1 for Patches/Hunks extensions to work)

  • Python MySQLDb (optional, but of course required if you wish to actually use MySQL as your database engine!)

Install

You don't need to do anything if you are happy using CVSAnalY from the path you downloaded it to. This is easiest if you intend on staying up-to-date with our releases.

If you want to install it to a system location, you can do this by running the setup.py script:

python setup.py install

If you do this, you'll need to remember to run this every time you get a new release.

If you don't have root privledges, you can just add CVSAnalY to your $PATH (cvsanalydir is the directory where CVSAnalY is installed):

export PATH=$PATH:cvsanalydir

CVSAnalY needs RepositoryHandler. If it is not installed in the usual path for Python packages, PKG_CONFIG_PATH should include the directory where it is installed (repohandlerdir is the path where RepositoryHandler is installed):

export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:repohandlerdir

You are ready to use cvsanaly2!

Running cvsanaly2

For the impatient: just checkout (from svn or cvs) to obtain a local version of your repository, and then run cvsanaly2:

cd project/
~/project$ cvsanaly2 

More options, and a more detailed info about the options, can be learnt by running cvsanaly2 --help

If you're having problems

Packet bigger than max_allowed_packet

Sometimes, a lot of data can pass between CVSAnalY and MySQL, and packet limits are set too small.

Follow the instructions here.

UnicodeEncodeError: 'ascii' codec can't encode character

This happens because Python is trying to print out a Unicode string to a terminal that has told Python it only supports ASCII. You can co-erce Python into printing Unicode by setting up your sitecustomize.py.

How to get the original CVSAnalY2

git clone http://git.libresoft.es/cvsanaly
git clone git://git.libresoft.es/git/cvsanaly	

Credits

CVSAnalY is developed by the GSyC/LibreSoft group at the Universidad Rey Juan Carlos in Móstoles, near Madrid (Spain). It is part of a wider research on libre software engineering, aimed to gain knowledge on how libre software is developed and maintained.

More information

CVSAnalY: https://forge.morfeo-project.org/projects/libresoft-tools/

The GSyC/LibreSoft group: http://libresoft.es

Fork authors

Main authors of CVSAnalY

Contributors