OlivierBinette
Data Scientist @ American Institutes for Research, Duke Statistical Science PhD
Duke University, Durham, NC
Pinned Repositories
assert
Lightweight validation tool for checking function arguments and data analysis scripts.
cache
Easily cache and retrieve computation results in R
CSVMeta
Lightweight csv read/write, keeping track of csv dialect and other metadata.
dgaFast
Multiple Systems Estimation Using Decomposable Graphical Models. This is an efficient re-implementation and extension of the dga R package.
er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
fingermatchR
Fingerprint matching tools based on NIST's mindtct and bozorth3 algorithms.
groupbyrule
Deduplicate data using fuzzy and deterministic matching rules.
StringCompare
Efficient String Comparison Functions and Fuzzy String Matching
VisTree
OlivierBinette's Repositories
OlivierBinette/er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
OlivierBinette/groupbyrule
Deduplicate data using fuzzy and deterministic matching rules.
OlivierBinette/VisTree
OlivierBinette/TessTools
Tools for the use of Tesseract OCR in R
OlivierBinette/simple-typo-tolerant-search
Efficient typo-tolerant search in 76 lines of code, with no dependencies.
OlivierBinette/olivierbinette.github.io
OlivierBinette/streamlit-survey
Survey components for Streamlit apps
OlivierBinette/CSVMeta
Lightweight csv read/write, keeping track of csv dialect and other metadata.
OlivierBinette/JSM-2023
ER-Evaluation Demo for JSM 2023
OlivierBinette/OlivierBinette
OlivierBinette/splink
Implementation in Apache Spark of the EM algorithm to estimate parameters of Fellegi-Sunter's canonical model of record linkage.
OlivierBinette/assignee-search
OlivierBinette/Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
OlivierBinette/Digital-Garden
OlivierBinette/duckdb
DuckDB is an in-process SQL OLAP Database Management System
OlivierBinette/facets
Visualizations for machine learning datasets
OlivierBinette/FeatureStore-lite
A lightweight feature store for Pandas, DuckDB, or your choice of backend.
OlivierBinette/FoFo
OlivierBinette/giskard
🐢 The testing framework for ML models, from tabular to LLMs
OlivierBinette/HandsOnEntityResolution
This repository accompanies the early release of Hands On Entity Resolution
OlivierBinette/imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
OlivierBinette/LLM-Hamza
OlivierBinette/mismo
The SQL/Ibis powered sklearn of record linkage
OlivierBinette/RMarkdown-Reproducibility-Template
Template for a reproducible RMarkdown document
OlivierBinette/seisbench
SeisBench - A toolbox for machine learning in seismology
OlivierBinette/streamlit-example
Example Streamlit app that you can fork to test out share.streamlit.io
OlivierBinette/trubrics-sdk
Validate your ML models and collect human feedback with Trubrics
OlivierBinette/trubrics-tests
OlivierBinette/TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
OlivierBinette/ul-benchmark-datasets-for-entity-resolution-archive
Unofficial archive of https://dbs.uni-leipzig.de/research/projects/benchmark-datasets-for-entity-resolution