audreyle
Linguistic anthropologist and taxonomist aspiring to build her own knowledge graphs and data pipelines for managing labeling workflows
Pinned Repositories
categorical_encoding
Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library
django-blog
Enron-email-parser
Email parser written in Scala to clean an unstructured dataset
iCorruptionHack
Auditing FEC Bulk Data
IMDB-Vector-Search-DB
market-data-stream
A repository for some of the code I used in Kaggle data science & machine learning tasks.
Predicting-ICU-Deaths
Submission for the WiDS 2020 Kaggle competition: Predicting ICU deaths using CatBoost, XGBoost and RandomForest classifiers and the MIT GOSSIS dataset
Real-Time-Stock-Updates
Published a data stream to Kafka (aka Azure Event Hubs), serialized it, and wrote Spark SQL queries against the resulting Delta Lake tables.
social-network-sql-database
Wrote a SQLite backend in Python to add, delete, search and update users, their statuses and images. Ran unit tests, a REST API and a few optimizations in SQLite and MongoDB.
Wikipedia-Property-Graph
Scala function to transform RDF into a property graph using the Spark GraphX API
audreyle's Repositories
audreyle/categorical_encoding
Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library
audreyle/django-blog
audreyle/Enron-email-parser
Email parser written in Scala to clean an unstructured dataset
audreyle/iCorruptionHack
Auditing FEC Bulk Data
audreyle/IMDB-Vector-Search-DB
audreyle/market-data-stream
A repository for some of the code I used in Kaggle data science & machine learning tasks.
audreyle/my-data-ontology
Provides a standardized extensible semantics for representing information about a person’s profile.
audreyle/News-Analysis
Using Gensim and spaCy models for topic modeling in the news, and experimenting with LSTMs and GRUs to explore features such as writing style and sentiment per news category
audreyle/Predicting-ICU-Deaths
Submission for the WiDS 2020 Kaggle competition: Predicting ICU deaths using CatBoost, XGBoost and RandomForest classifiers and the MIT GOSSIS dataset
audreyle/Real-Time-Stock-Updates
Published a data stream to Kafka (aka Azure Event Hubs), serialized it, and wrote Spark SQL queries against the resulting Delta Lake tables.
audreyle/social-network-sql-database
Wrote a SQLite backend in Python to add, delete, search and update users, their statuses and images. Ran unit tests, a REST API and a few optimizations in SQLite and MongoDB.
audreyle/Wikipedia-Property-Graph
Scala function to transform RDF into a property graph using the Spark GraphX API
audreyle/private-data-objects
The Private Data Objects lab provides technology for confidentiality-preserving, off-chain smart contracts.
audreyle/social-network-backend-mongodb
audreyle/social-network-backend-sqlite