aldmon's Stars
catboost/tutorials
CatBoost tutorials repository
zygmuntz/phraug2
A new version of phraug, which is a set of simple Python scripts for pre-processing large files
rasbt/mlxtend
A library of extension and helper modules for Python's data analysis and machine learning libraries.
MaxHalford/xam
Personal data science and machine learning toolbox
yandex-cloud/terraform-provider-yandex
Terraform Yandex provider
erikbern/ann-benchmarks
Benchmarks of approximate nearest neighbor libraries in Python
spotify/annoy
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
benfred/implicit
Fast Python Collaborative Filtering for Implicit Feedback Datasets
Chuguevskij/ml_system_design_doc_Revenue_in_shops
swuxyj/DeepHash-pytorch
Implementation of Some Deep Hash Algorithms, Including DPSH, DSH, DHN, HashNet, DSDH, DTSH, DFH, GreedyHash, CSQ.
lambdazy/lzy
Platform for a hybrid execution of ML workflows that transparently integrates local and remote runtimes
john-kurkowski/tldextract
Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
kitabisa/mubeng
An incredibly fast proxy checker & IP rotator with ease.
mattes/rotating-proxy
Rotating TOR proxy with Docker
yakimka/python_interview_questions
Questions for preparing for a Python Developer interview
akamhy/waybackpy
Wayback Machine API interface & a command-line tool
microsoft/playwright
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
rusq/slackdump
Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
twintproject/twint
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
ossc-db/pg_hint_plan
Extension adding support for optimizer hints in PostgreSQL
danielgatis/rembg
Rembg is a tool to remove image backgrounds
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
google-ai-edge/mediapipe
Cross-platform, customizable ML solutions for live and streaming media.
open-mmlab/mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
youssefHosni/Data-Science-Interview-Questions-Answers
Curated list of data science interview questions and answers
AgaMiko/waste-datasets-review
List of image datasets with any kind of litter, garbage, waste and trash
cvat-ai/cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows