USC Information Retrieval & Data Science
USC Information Retrieval and Data Science Group
Los Angeles, CA
Pinned Repositories
AgePredictor
Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum
autoextractor
A toolkit for clustering web pages based on various similarity measures.
dl4j-kerasimport-examples
Deeplearning4j examples for importing and using models trained in Keras.
Image-Similarity-Deep-Ranking
Deep Ranking-based ImageSimilarity will be developed as a plugin for ImageSpace. https://users.eecs.northwestern.edu/~jwa368/pdfs/deep_ranking.pdf
NLTKRest
This is a REST Server endpoint built using Flask and Python.
polar.usc.edu
Polar USC activities related to the NSF Polar CyberInfrastructure program at the University of Southern California
SentimentAnalysisParser
Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.
sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
supervising-ui
Web UI for labelling datasets for supervised learning.
tika-dockers
A suite of Machine Learning / Deep Learning Dockerfiles to allow Apache Tika to extract objects and to produce textual captions for images and video
USC Information Retrieval & Data Science's Repositories
USCDataScience/sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
USCDataScience/supervising-ui
Web UI for labelling datasets for supervised learning.
USCDataScience/Image-Similarity-Deep-Ranking
Deep Ranking-based ImageSimilarity will be developed as a plugin for ImageSpace. https://users.eecs.northwestern.edu/~jwa368/pdfs/deep_ranking.pdf
USCDataScience/autoextractor
A toolkit for clustering web pages based on various similarity measures.
USCDataScience/SentimentAnalysisParser
Combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.
USCDataScience/NLTKRest
This is a REST Server endpoint built using Flask and Python.
USCDataScience/tika-dockers
A suite of Machine Learning / Deep Learning Dockerfiles to allow Apache Tika to extract objects and to produce textual captions for images and video
USCDataScience/AgePredictor
Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum
USCDataScience/polar.usc.edu
Polar USC activities related to the NSF Polar CyberInfrastructure program at the University of Southern California
USCDataScience/polar-deep-insights
Conceptual, temporal, and spatial analysis of the TREC polar dataset
USCDataScience/parser-indexer-py
Python tools for parsing documents and building an inverted index with enriched metadata. A Java version with slightly different features is available at https://github.com/USCDataScience/parser-indexer
USCDataScience/uscdatascience.github.io
USC Information Retrieval and Data Science Group
USCDataScience/cmu-fg-bg-similarity
CMU Foreground/Background Similarity Server from DARPA MEMEX
USCDataScience/img2text
Models and associated helper code for the GSoC 2017 project "TensorFlow Image to Text in Apache Tika"
USCDataScience/svm-classifier-memex
USCDataScience/ufo.usc.edu
Collection of projects from IRDS students studying unidentified flying objects
USCDataScience/deepsentirank
Deep Learning based Sentiment Ranking for Multimedia
USCDataScience/file-content-analyzer
A set of Python modules to perform Byte Frequency Analysis, Byte Frequency Correlation, Cross Correlation, and FHT analysis on files
USCDataScience/pdi-topics
LDA Topic Modeling for Polar Data Insights
USCDataScience/PolarDataCollection
Uses the Google Search API to collect URLs relevant to the polar domain for deep insights and intelligent crawling
USCDataScience/PolarPostProcessing
Connects to the Solr database created for Sparkler-crawled data to perform further data extraction, classification, filtering, and insight generation using various machine learning models. The models can take a keyword list from the user, extract features from URL content, classify (score) the output, and update the corresponding Solr parameter. Sparkler link: https://github.com/USCDataScience/sparkler
USCDataScience/sweet-neo4j
A Ruby parser using linkeddata and RDF to fetch the JPL SWEET ontology and load it into Neo4j for cool graph queries and examination.
USCDataScience/liresolr
Putting LIRE into Solr - an ongoing project
USCDataScience/pdftabextract
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
USCDataScience/tika-dl-models
A place to release saved machine learning models for tika-dl
USCDataScience/sparkler-ui
USCDataScience/tika-ner-corenlp
Stanford CoreNLP NER add-on for Apache Tika's NamedEntityParser
USCDataScience/DDToolAnalysis
USCDataScience/Ocean_Observation_FacetView
A FacetView setup for crawled ocean observation data.
USCDataScience/sce-domain-discovery
Domain Discovery for the Sparkler Crawl Environment