Pinned Repositories
Arabesque
Scalable Graph Mining
gdfm.github.io
My GitHub user page
ica
Python implementation of the Iterative Classification Algorithm
incubator-samoa
Mirror of Apache Samoa (Incubating)
partial-key-grouping
An implementation and example of Partial Key Grouping for Apache Storm. Partial Key Grouping is a load balancing strategy for distributed stream processing systems.
s4
S4 repository
similarity-self-join
Hadoop code for "Document Similarity Self-Join with MapReduce" (ICDM'10)
sssj
Streaming Similarity Self-Join
twitter-crawler
Crawler for the social network of Twitter
gdfm's Repositories
gdfm/partial-key-grouping
An implementation and example of Partial Key Grouping for Apache Storm. Partial Key Grouping is a load balancing strategy for distributed stream processing systems.
gdfm/sssj
Streaming Similarity Self-Join
gdfm/similarity-self-join
Hadoop code for "Document Similarity Self-Join with MapReduce" (ICDM'10)
gdfm/twitter-crawler
Crawler for the social network of Twitter
gdfm/s4
S4 repository
gdfm/Arabesque
Scalable Graph Mining
gdfm/gdfm.github.io
My GitHub user page
gdfm/ica
Python implementation of the Iterative Classification Algorithm
gdfm/incubator-samoa
Mirror of Apache Samoa (Incubating)
gdfm/kafka
Mirror of Apache Kafka
gdfm/keystone
Simplifying robust end-to-end machine learning on Apache Spark.
gdfm/Mr.LDA
gdfm/okapi
Large-scale ML & graph analytics on Giraph
gdfm/samoa
SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams.
gdfm/shobai-dogu
Tools of the trade
gdfm/storm
Mirror of Apache Storm
gdfm/vowpal_wabbit
John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm