finiteautomata
🤖 Machine Learning Engineer at @Accenture 🎓 Professor at Universidad de San Andrés My interests: NLP, LLMs 🦜 and Hate Speech
Argentina
finiteautomata's Stars
apankrat/nullboard
Nullboard is a minimalist kanban board, focused on compactness and readability.
tobymao/sqlglot
Python SQL Parser and Transpiler
opencv/opencv-python
Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
RManLuo/Awesome-LLM-KG
Awesome papers about unifying LLMs and KGs
ftvalentini/BiasPMI
On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach (Valentini et al., ACL 2023)
qurator-spk/eynollah
Document Layout Analysis
microsoft/wslg
Enabling the Windows Subsystem for Linux to include support for Wayland and X server related scenarios
natdebandi/hate_speech_ar
stanfordnlp/stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
intro-stat-learning/ISLP_labs
Up-to-date version of labs for ISLP
freddyaboulton/gradio-pdf
Source code of the gradio_pdf custom component.
VikParuchuri/marker
Convert PDF to markdown + JSON quickly with high accuracy
huggingface/trl
Train transformer language models with reinforcement learning.
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
allenai/gooaq
Question-answers, collected from Google
asg017/sqlite-vss
A SQLite extension for efficient vector search, based on Faiss!
BlueLiteBlocker/BlueLiteBlocker
A Chrome & Firefox extension for filtering out tweets from Twitter Blue users based on if they follow you and their follower count.
state-spaces/mamba
Mamba SSM architecture
somosnlp/corpus-es
List of Spanish NLP corpora ✨ #Somos600M: Help develop inclusive AI that understands the different varieties of our languages ✨ English-speaking contributors welcome!
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
luferrer/ConfidenceIntervals
Confidence interval computation for evaluation in machine learning using the bootstrapping approach
texttron/tevatron
Tevatron - A flexible toolkit for neural retrieval research and development.
dorianbrown/rank_bm25
A Collection of BM25 Algorithms in Python
joacosaralegui/deteccion-automatica-de-frases-chequeables
chequeado/chequeabot
This repository contains all the tools we are working with related to Chequeabot's ecosystem.
vladkens/twscrape
2024! X / Twitter API scraper with authorization support. Allows you to scrape search results, users' profiles (followers/following), tweets (favoriters/retweeters) and more.
kts/gzip-knn
Reimplementation of the paper using gzip + kNN for text classification
erikernst4/entrainment-metrics
Acoustic-prosodic entrainment measurement in spoken dialogue and approximation of the evolution of a speaker’s a/p features.
caserec/Datasets-for-Recommender-Systems
A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)
schmich/marinara
Pomodoro® time management assistant for Chrome