A common place to keep helpful resources for informatics, data science, and more!
- ML For Healthcare - MIT - Course Website for HST.956 at MIT, Machine Learning for Healthcare. Has slides and problem sets, as well as links to recorded lectures
- StrataScratch - LeetCode for Data Science. An interactive resource for learning to query and manipulate data tables with SQL/Python
- Consensus - A search platform using AI to "Aggregate and distill" findings from scientific research
- AutoRegex - An "English-to-regex" translator built on top of GPT-3
- Regex Generator - A more traditional tool that interactively generates a regex from sample text
- BERTopic - A topic modeling algorithm that discovers topics using BERT transformer language models
- Ray Tune - Automated hyperparameter tuning implemented in Python for libraries like PyTorch, TensorFlow, and scikit-learn
- Opt_List - A list of hyperparameters compiled by Google for various machine learning libraries, which they have tried and found to be effective
- MCA - Multiple Correspondence Analysis is like PCA for categorical variables
- Polars - Performant, multi-threaded DataFrame manipulation library; meant as a replacement for pandas
- PyGWalker - A drag-and-drop library to graph a dataset for Exploratory Data Analysis within a Python environment. Also has some more advanced features for feature selection, etc.
- DuckDB - Creates an in-memory database which can be queried and mutated using SQL syntax. Provides APIs in Python, R, and Java
- Steampipe - A command-line tool that allows the use of SQL to access popular cloud services as well as useful APIs (Twitter, Reddit, etc.)
- RATH - A more advanced Tableau-like data visualization suite. A little tricky to put together as of this note (requires spinning it up via node)
- Tad - Open-source tabular data viewer. Uses DuckDB to handle millions of rows with a handy GUI
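
For a sense of what the regex helpers above (AutoRegex, Regex Generator) are meant to produce, here is a minimal sketch using only Python's standard `re` module. The sample text and the date pattern are made up for illustration and are not output from either tool.

```python
import re

# Hypothetical sample text; the dates are illustrative only.
sample = "Visit on 2023-04-17, follow-up on 2023-05-02."

# A pattern for ISO-style dates (YYYY-MM-DD), the kind of expression one
# might ask a regex helper to build from a plain-English description.
iso_date = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# findall returns one (year, month, day) tuple per match because the
# pattern contains three capture groups.
matches = iso_date.findall(sample)
print(matches)
```

Whatever tool generates a pattern, it is worth checking it against a few known inputs like this before using it on real data.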