Pinned Repositories
Crawler-for-News-Website
A simple web crawler for the C-SPAN website that measures aspects of a crawl, studies the crawl's characteristics, downloads the crawled pages, and gathers page metadata.
d3-cloud
Create word clouds in JavaScript.
Database-Systems-Assignments
CSCI 585 assignments: 1. EER diagram for E-Learn; 2. SQL; 3. KML (nearest neighbors and convex hull code); 4. TinkerPop Gremlin; 5. running the Weka, RapidMiner, and KNIME tools.
deepsentirank
Uses an AlexNet CNN to classify images into one of the classes defined in caffe_classes.py. Images with similar classes can be grouped together and used for image similarity search. To test the model, run testModel.py.
FaceBook-Search
Facebook-Search-Android-App
Heart-Disease-Prediction-System
project related
PollApp
Polling app for Windows Phone, used for surveys. Users can post their own questions and vote for their favourite option on questions posted by others.
Workshop_Management
Allows faculty to set their own slots and manage their assigned slots for workshops in various college courses.
Xml-Parser
XML_Parser
prenastro's Repositories
prenastro/deepsentirank-1
Deep Learning based Sentiment Ranking for Multimedia
prenastro/deepsentirank
Uses an AlexNet CNN to classify images into one of the classes defined in caffe_classes.py. Images with similar classes can be grouped together and used for image similarity search. To test the model, run testModel.py.
prenastro/PolarPostProcessing
Connects to the Solr database created for Sparkler-crawled data to perform further data extraction, classification, filtering, and insight generation using various machine learning models. The models can take a keyword list from the user, extract features from URL content, classify (score) the output, and update the corresponding Solr parameter. Apache Sparkler: https://github.com/USCDataScience/sparkler
prenastro/polar-deep-insights
Conceptual, temporal, and spatial analysis of the TREC Polar dataset
prenastro/polar.usc.edu
Polar USC activities related to NSF Polar CyberInfrastructure program at the University of Southern California
prenastro/Database-Systems-Assignments
CSCI 585 assignments: 1. EER diagram for E-Learn; 2. SQL; 3. KML (nearest neighbors and convex hull code); 4. TinkerPop Gremlin; 5. running the Weka, RapidMiner, and KNIME tools.
prenastro/Crawler-for-News-Website
A simple web crawler for the C-SPAN website that measures aspects of a crawl, studies the crawl's characteristics, downloads the crawled pages, and gathers page metadata.
prenastro/Search-Engine-Enhancement
Adds spell checking, autocomplete, and snippet functionality to the Solr search engine. Solr is enhanced with spelling correction and an autocomplete (suggest) function. An external program, Norvig's spelling corrector, is used together with Solr to improve the autocomplete functionality. Norvig's program reads a text file ("big.txt") to obtain the set of words against which edit distance is calculated. Here, Apache Tika is used for this purpose.
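The edit-distance idea behind Norvig's spelling corrector can be sketched in a few lines of Python. This is only an illustration in the style of that program, not the repository's code: the tiny SAMPLE_TEXT corpus stands in for big.txt, and the function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stand-in for big.txt; the real file is a large text corpus.
SAMPLE_TEXT = "the quick brown fox jumps over the lazy dog the spelling corrector"


def words(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correction(word, counts):
    """Pick the most frequent known word within two edits, else return the input."""
    def known(candidates):
        return {w for w in candidates if w in counts}

    candidates = (
        known([word])
        or known(edits1(word))
        or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
        or {word}
    )
    return max(candidates, key=lambda w: counts.get(w, 0))


counts = Counter(words(SAMPLE_TEXT))
print(correction("speling", counts))
print(correction("teh", counts))
```

A known word is returned unchanged, and a word with no known neighbour within two edits is returned as typed, so the corrector never invents a word that is absent from the corpus.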
prenastro/Solr-Ranking-Algos-Comparison
Imported a set of pages into Apache Solr and compared two ranking algorithms, Lucene's default ranking and PageRank. Solr indexes the documents; Tika and the TagSoup library extract text from any kind of HTML found on the web. A PHP client accepts user input from an HTML form and sends the request to the Solr server. Solr processes the query and returns results, which the PHP program parses and displays. To switch Solr's ranking to PageRank, the app loops through each fetched web page and extracts its outgoing links, then uses a mapping file from web pages to actual URLs to filter out URLs not present in the file. A network graph is built with the NetworkX library, with web pages as vertices and links as edges between pages. Finally, a list of keywords is searched and the two algorithms are compared.
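The link-filtering and PageRank steps can be sketched as follows. The repository describes using NetworkX for the graph; this example swaps in a plain-Python power iteration so that it runs without dependencies. The four-page graph, the damping factor of 0.85, and the function names are assumptions made for illustration.

```python
def filter_links(links, known_urls):
    """Keep only pages and outgoing links that appear in the mapping file's URL set."""
    return {
        page: [t for t in targets if t in known_urls]
        for page, targets in links.items()
        if page in known_urls
    }


def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration over a dict of page -> outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical crawl: "x" is a URL that is missing from the mapping file.
raw_links = {"a": ["b", "c", "x"], "b": ["c"], "c": ["a"], "d": ["c"]}
graph = filter_links(raw_links, known_urls={"a", "b", "c", "d"})
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

The resulting scores could then be written to a file that Solr reads as an external ranking signal, which is one way to compare the two orderings for the same keyword queries.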
prenastro/Inverted-Index-Using-GCP-and-Hadoop-Cluster
Created an inverted index of the words occurring in a set of web pages, using a subset of 74 files out of 408 files (text extracted from HTML tags) derived from the Stanford WebBase project (https://ebiquity.umbc.edu/resource/html/id/351). The files were placed in a Google Cloud Storage bucket, and a Hadoop job was run to read its input from that bucket.
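The map and reduce phases of an inverted-index job can be sketched locally in Python. This is not the Hadoop job from the repository, only an in-memory illustration of the same idea: the two sample documents and the function names are assumptions, and the real job reads its input files from the storage bucket.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit a (word, doc_id) pair for every token in the document."""
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        yield word, doc_id


def reduce_phase(pairs):
    """Group pairs by word and count how often each word occurs per document."""
    index = defaultdict(lambda: defaultdict(int))
    for word, doc_id in pairs:
        index[word][doc_id] += 1
    return {word: dict(postings) for word, postings in index.items()}


def build_inverted_index(docs):
    """Run the map phase over every document, then reduce the emitted pairs."""
    pairs = (pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text))
    return reduce_phase(pairs)


# Hypothetical documents standing in for the extracted web page text.
docs = {
    "doc1": "Hadoop reads the bucket",
    "doc2": "The bucket holds the files",
}
index = build_inverted_index(docs)
print(index["the"])
print(index["bucket"])
```

In a cluster the grouping by word is done by the shuffle between mappers and reducers; here a single dictionary plays that role.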
prenastro/sparkler
Spark-Crawler: Evolving Apache Nutch to run on Spark.
prenastro/image_space
Image similarity and search application
prenastro/uscdatascience.github.io
USC Information Retrieval and Data Science Group
prenastro/pdi-topics
LDA Topic Modeling for Polar Data Insights
prenastro/imagecat
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest tens of millions of files (images, but could be extended to other files) in place, and to extract metadata and OCR information from those files/images using Tika and Tesseract OCR.
prenastro/img2text
Models and associated helper code for the GSoC 2017 project TensorFlow Image to Text in Apache Tika
prenastro/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
prenastro/Xml-Parser
XML_Parser
prenastro/Heart-Disease-Prediction-System
project related
prenastro/Workshop_Management
Allows faculty to set their own slots and manage their assigned slots for workshops in various college courses.
prenastro/PollApp
Polling app for Windows Phone, used for surveys. Users can post their own questions and vote for their favourite option on questions posted by others.
prenastro/Machine-Learning-Data-Analysis
prenastro/Facebook-Search-Android-App
prenastro/FaceBook-Search
prenastro/tika
Fork of Apache Tika with specific customizations for textual content extraction and enrichment
prenastro/npm-bower-yo-grunt
Docker container with node, npm, bower, yeoman and grunt packages.
prenastro/d3-cloud
Create word clouds in JavaScript.
prenastro/osgoculusviewer
An OsgViewer with support for the Oculus Rift