Pinned Repositories
ARCInputFormat
Packages the ARCInputFormat used in Common Crawl in a small jar file that can be used in MapReduce jobs. Implements HdfsARCSource. See README for details.
commoncrawl
CommonCrawl Project Repository
tar2seq
Simple utility to convert an existing compressed file into a Hadoop SequenceFile
noiano's Repositories
noiano/ARCInputFormat
Packages the ARCInputFormat used in Common Crawl in a small jar file that can be used in MapReduce jobs. Implements HdfsARCSource. See README for details.
noiano/commoncrawl
CommonCrawl Project Repository
noiano/tar2seq
Simple utility to convert an existing compressed file into a Hadoop SequenceFile
noiano/aad-pod-identity
Assign Azure Active Directory Identities to Kubernetes applications.
noiano/akka-edge
Examples from Developing an Akka Edge
noiano/artifacts-keyring
Keyring backend for Azure Artifacts
noiano/ml_test
noiano/spark-cassandra-connector
DataStax Spark Cassandra Connector