Pinned Repositories
analysis-of-dblp-authors-data-mongodb-and-neo4j
We analysed 2 GB of DBLP author data using MongoDB and Neo4j. Dataset link: https://dblp.uni-trier.de/xml/
feedback-app
React feedback app from React course
gender-detection-elastic-search-kibana-genderize
Gender study of DBLP authors using Elasticsearch, Kibana, and the genderize.io API for Python. Dataset link: https://dblp.uni-trier.de/xml/
github-finder-app
Find GitHub users and display their info
gravitas-agent
First agent to test with simple tasks
house-marketplace
House marketplace built with React and Firebase
IAB-brain-drain-tableau
The dataset is from the German IAB (Institute for Employment Research). It covers 20 OECD destination countries and 195 countries of origin. The data is disaggregated by gender. The time period covered is 1980 to 2010 in 5-year intervals.
madrid-metro-data-retrieval-scrappy
We retrieved data from the Madrid Metro website and used it to complete an existing CSV data file, using Scrapy and Python.
natural-language-processing-nltk
Natural language processing of HTML news files using the NLTK library for Python. The base project for the NLP was provided by Alberto Fernandez Isabel and consists of news clustering with tf and tf-idf weighting and the ARI metric.
occupancy-detection-spark-streaming
Data production and calculation using Spark Streaming and Kafka. The data is from the Occupancy Detection dataset in the UCI repository: https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+#
marinamashina's Repositories
marinamashina/IAB-brain-drain-tableau
The dataset is from the German IAB (Institute for Employment Research). It covers 20 OECD destination countries and 195 countries of origin. The data is disaggregated by gender. The time period covered is 1980 to 2010 in 5-year intervals.
marinamashina/analysis-of-dblp-authors-data-mongodb-and-neo4j
We analysed 2 GB of DBLP author data using MongoDB and Neo4j. Dataset link: https://dblp.uni-trier.de/xml/
marinamashina/feedback-app
React feedback app from React course
marinamashina/gender-detection-elastic-search-kibana-genderize
Gender study of DBLP authors using Elasticsearch, Kibana, and the genderize.io API for Python. Dataset link: https://dblp.uni-trier.de/xml/
marinamashina/github-finder-app
Find GitHub users and display their info
marinamashina/gravitas-agent
First agent to test with simple tasks
marinamashina/house-marketplace
House marketplace built with React and Firebase
marinamashina/madrid-metro-data-retrieval-scrappy
We retrieved data from the Madrid Metro website and used it to complete an existing CSV data file, using Scrapy and Python.
marinamashina/natural-language-processing-nltk
Natural language processing of HTML news files using the NLTK library for Python. The base project for the NLP was provided by Alberto Fernandez Isabel and consists of news clustering with tf and tf-idf weighting and the ARI metric.
marinamashina/occupancy-detection-spark-streaming
Data production and calculation using Spark Streaming and Kafka. The data is from the Occupancy Detection dataset in the UCI repository: https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+#
marinamashina/pathway-forte
A Python package for benchmarking pathway databases with functional enrichment and classification methods
marinamashina/reshaping-data-into-long
Takes in an Excel file object with multiple tabs in wide format, and the index of the tab to be parsed and reshaped. Returns a data frame of the specified tab reshaped to long format, suitable for data processing, for example visualization with Tableau.
marinamashina/telco-customer-churn
Our solution to the telco customer churn binary classification problem, based on the Kaggle dataset: https://www.kaggle.com/blastchar/telco-customer-churn
marinamashina/test
marinamashina/tripadvisor-restaurants-study-spark
Study of Tripadvisor restaurant data using Spark DataFrame operations and SQL queries. Dataset link: https://www.kaggle.com/damienbeneschi/krakow-ta-restaurans-data-raw