Pinned Repositories
pyjanitor
Clean APIs for data cleaning. A Python implementation of the R package janitor.
haystack
🔍 Haystack is an open source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications.
MLND_capstone
Capstone project implementation, report, and proposal for Udacity Machine Learning Engineer Nanodegree
mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
notebooks
Jupyter notebooks for the Natural Language Processing with Transformers book
search_fundamentals_course
Public repository for the Search Fundamentals course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/search-fundamentals?utm_source=daniel.
search_with_machine_learning_course
Public repository for the Search with Machine Learning course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/course/search-with-machine-learning?utm_source=daniel.
shandou's Repositories
shandou/MLND_capstone
Capstone project implementation, report, and proposal for Udacity Machine Learning Engineer Nanodegree
shandou/.tmux
🇫🇷 Oh My Tmux! Pretty & versatile tmux configuration made with ❤️ (imho the best tmux configuration that just works)
shandou/blog-binary-classification-metrics
Codebase for the blog post "24 Evaluation Metrics for Binary Classification (And When to Use Them)"
shandou/Categorical_similarity_measures
Library for the Python community to find the similarity or distance between two entities containing categorical data
shandou/Deep-Semantic-Similarity-Model-PyTorch
My PyTorch implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.
shandou/dotfiles
various tooling configs
shandou/Facial-Similarity-with-Siamese-Networks-in-Pytorch
Implementing Siamese networks with a contrastive loss for similarity learning
shandou/feature-engineering-book
Code repo for the book "Feature Engineering for Machine Learning," by Alice Zheng and Amanda Casari, O'Reilly 2018
shandou/gensim-data
Data repository for pretrained NLP models and NLP corpora.
shandou/geotext
Geotext extracts country and city mentions from text
shandou/Introduction_to_PyMC3
shandou/kaggle-HomeDepot
3rd Place Solution for HomeDepot Product Search Results Relevance Competition on Kaggle.
shandou/learning-to-rank
shandou/LSTM-siamese
Siamese-LSTM PyTorch implementation for CIKM 2018
shandou/nboost
NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch)
shandou/numerical-linear-algebra
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
shandou/preDict
Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts
shandou/pregex
Probabilistic regular expressions
shandou/probability_cheatsheet
A comprehensive 10-page probability cheatsheet that covers a semester's worth of introduction to probability.
shandou/pydata_nyc2018-intro-to-model-interpretability
Notebook and slides for my talk at PyData NYC 2018
shandou/python-cheat-sheets
IPython notebooks demonstrating useful Python code snippets and functionality
shandou/pytorch-examples
Starting with deep learning and PyTorch
shandou/query-segmenter
Query Segmentation for search
shandou/scikit-hts-examples
Example usage of scikit-hts
shandou/statistics-in-R-data-sets
Data sets from the book; also available on the Sage website
shandou/tidytuesday
Official repo for the #tidytuesday project
shandou/TimeSeriesAnalysisWithPython
shandou/XGBoost-lambdaMART
Running LambdaMART using XGBoost