To my surprise, this code has sat on my hard disk for five years, so I thought I had better put it on GitHub.

DESCRIPTION

This code implements the Delta-SimRank algorithm described in paper [1], which won the best paper award at BigMine 2012.

The code is easy to read, and hopefully it will benefit others' research. Note, however, that it was written in 2011 and relies on quite old versions of Hadoop and Dumbo.
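
For readers new to the idea, here is a small in-memory sketch of a delta-style SimRank update in plain Python. It was written for this README as an illustration and is not taken from the package: it uses neither Hadoop nor Dumbo, and the function name, the decay default of 0.8, and the pruning threshold eps are all assumptions. Each round propagates only the non-zero score changes from the previous round, and the changes are accumulated into the scores.

```python
from collections import defaultdict


def delta_simrank(edges, decay=0.8, iterations=10, eps=1e-4):
    """Illustrative in-memory delta-style SimRank (names and defaults are assumptions)."""
    out_nbrs = defaultdict(set)
    in_nbrs = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)
        nodes.update((src, dst))

    # Every node is fully similar to itself; these scores also seed the deltas.
    scores = {(n, n): 1.0 for n in nodes}
    delta = dict(scores)

    for _ in range(iterations):
        new_delta = defaultdict(float)
        # "Map" step: each changed pair (i, j) contributes to the pairs (a, b)
        # that have i and j among their in-neighbors.
        for (i, j), change in delta.items():
            for a in out_nbrs.get(i, ()):
                for b in out_nbrs.get(j, ()):
                    if a != b:  # self-similarity stays fixed at 1
                        norm = len(in_nbrs[a]) * len(in_nbrs[b])
                        new_delta[(a, b)] += decay * change / norm
        # "Reduce" step: keep only changes large enough to matter.
        delta = {pair: d for pair, d in new_delta.items() if abs(d) > eps}
        if not delta:
            break
        for pair, d in delta.items():
            scores[pair] = scores.get(pair, 0.0) + d
    return scores


# Tiny example: nodes "a" and "b" are both pointed to by "u".
example_scores = delta_simrank([("u", "a"), ("u", "b")])
```

Dropping changes smaller than eps trades a little accuracy for less work per round; setting eps to 0 keeps every non-zero change.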

INSTALLATION

There is no need to compile or install the code in this package. To run the program, make sure the settings in the script files are correct, especially those marked with '?='.

REQUIREMENTS

This code uses the following third-party software packages:

  • Hadoop 0.21.0
  • Dumbo library 0.21.31

DATA SETS

The data sets evaluated in our paper are not included in this code package. They can be downloaded from the SNAP network analysis library (http://snap.stanford.edu/data/).
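
As a starting point, a downloaded graph can be read into a list of edges with a few lines of Python. This is a minimal sketch that assumes the plain-text edge-list layout used by many SNAP graph files: lines starting with '#' are comments, and every other line holds a source node ID and a destination node ID separated by whitespace. The function name and the sample lines are hypothetical, and the input format expected by the scripts in this package may differ.

```python
def read_snap_edges(lines):
    """Parse SNAP-style edge-list lines into (source, destination) pairs."""
    edges = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment/header lines.
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((src, dst))
    return edges


# Hypothetical sample in the assumed layout.
sample_lines = [
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "",
    "2\t1",
]
sample_edges = read_snap_edges(sample_lines)
```

Because the function accepts any iterable of lines, an open file object for an uncompressed download can be passed to it directly.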

CITATION

[1] "Delta-SimRank Computing on MapReduce", Liangliang Cao, Hyun Duk Kim, Min-Hsuan Tsai, Brian Cho, Zhen Li, Indy Gupta, ChengXiang Zhai, and Thomas S. Huang. Big Data Mining 2012.

COPYRIGHT

Copyright (c) 2010-2011 Liangliang Cao, Min-Hsuan Tsai and Zhen Li
Beckman Institute, University of Illinois
All rights reserved.

The code provided here is for non-commercial purposes.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.