A curated list of applied machine learning and data science notebooks and libraries across different industries. The code in this repository is in Python (primarily using Jupyter notebooks) unless otherwise stated. The catalogue is inspired by awesome-machine-learning.
Caution: This is a work in progress. Please contribute, especially if you are a subject expert in any of the industries listed below. If you are an [analytical, computational, statistical, quantitative] researcher/analyst in field X or a field X [machine learning engineer, data scientist, modeler, programmer], then your contribution will be greatly appreciated.
If you want to contribute to this list (please do), send me a pull request or contact me @dereknow. Also, a listed repository should be deprecated if:
- The repository's owner explicitly says that "this library is not maintained".
- It has had no commits for a long time (2 to 3 years).
Help Needed: If there are any contributors out there willing to help first populate and then maintain a Python analytics section in any one of the following sub/industries, please get in contact with me.
Accommodation & Food | Agriculture & Forestry | Banking & Insurance |
Biotechnological & Life Sciences | Construction & Engineering | Education & Research |
Emergency & Police | Entertainment, Recreation & Arts | Goods & Manufacturing |
Government and Public Works | Healthcare and Social Assistance | Media & Publishing |
Mining, Oil & Gas Extraction | Miscellaneous | Professional & Technical Services |
Real Estate, Rental & Leasing | Technology | Telecommunications |
Transportation & Warehousing | Utilities | Wholesale & Retail |
Justice, Law and Regulations | Accounting & Auditing |
- Chart of Account Prediction - Using labeled data to suggest the account name for every transaction.
- Accounting Anomalies - Using deep-learning frameworks to identify accounting anomalies.
- Financial Statement Anomalies - Detecting anomalies before filing, using R.
- Useful Life Prediction (FirmAI) - Predict the useful life of assets using sensor observations and feature engineering.
- AI Applied to XBRL - Standardized representation of XBRL for AI and machine learning.
- Forensic Accounting - Collection of case studies on forensic accounting using data analysis. I am on the lookout for more data to practise forensic accounting; please get in touch.
- General Ledger (FirmAI) - Data processing over a general ledger as exported through an accounting system.
- Bullet Graph (FirmAI) - Bullet graph visualisation helpful for tracking sales, commission and other performance.
- Aged Debtors (FirmAI) - Example analysis to investigate aged debtors.
- Automated FS XBRL - Written in XML; the analysis could possibly be ported to Python.
- Financial Sentiment Analysis - Sentiment, distance and proportion analysis for trading signals.
- Extensive NLP - Comprehensive NLP techniques for accounting research.
- EDGAR - A walk-through in how to obtain EDGAR data.
- IRS - Accessing and parsing IRS filings.
- Financial Corporate - Rutgers corporate financial datasets.
- Non-financial Corporate - Rutgers non-financial corporate dataset.
- PDF Parsing - Extracting useful data from PDF documents.
- PDF Table to Excel - How to output an Excel file from a PDF.
- Understanding Accounting Analytics - An article that tackles the importance of accounting analytics.
- VLFeat - VLFeat is an open and portable library of computer vision algorithms, with a MATLAB toolbox.
- Rutgers Raw - Good digital accounting research from Rutgers.
- Computer Augmented Accounting - A video series from Rutgers University looking at the use of computation to improve accounting.
- Accounting in a Digital Era - Another series by Rutgers investigating the effects the digital age will have on accounting.
- Loan Acceptance - Classification and time-series analysis for loan acceptance.
- Predict Loan Repayment - Predict whether a loan will be repaid using automated feature engineering.
- Loan Eligibility Ranking - System to help banks check whether a customer is eligible for a given loan.
- Home Credit Default (FirmAI) - Predict home credit default.
- Mortgage Analytics - Extensive mortgage loan analytics.
- Credit Approval - A system for credit card approval.
- Loan Risk - Predictive model to help reduce loan charge-offs and losses.
- Amortisation Schedule (FirmAI) - Simple amortisation schedule in Python for personal use.
- Credit Card - Estimate the CLV of credit card customers.
- Survival Analysis - Perform a survival analysis of customers.
- Next Transaction - Deep learning model to predict the transaction amount and days to next transaction.
- Credit Card Churn - Predicting credit card customer churn.
- Bank of England Minutes - Textual analysis over bank minutes.
- Zillow Prediction - Zillow valuation prediction as performed on Kaggle.
- Real Estate - Predicting real estate prices from the urban environment.
- Used Car - Used vehicle price prediction.
- XGBoost - Fraud detection by tuning XGBoost hyper-parameters with simulated annealing.
- Fraud Detection Loan in R - Fraud detection in bank loans.
- AML Finance Due Diligence - Search news articles to support anti-money-laundering (AML) due diligence in finance.
- Credit Card Fraud - Detecting credit card fraud.
- Bank Failure - Predicting bank failure.
- Risk Management - Finance risk engagement course resources.
- VaR GaN - Estimate Value-at-Risk for market risk management using Keras and TensorFlow.
- Actuarial Sciences (R) - A range of actuarial tools in R.
- Bank Note Fraud Detection - Bank note authentication using a TensorFlow DNN classifier and a random forest.
- ATM Surveillance - A use case for ATM surveillance in banks.
- LexPredict - Software package and library.
- AI Para-legal - Lobe is the world's first AI paralegal.
- Legal Entity Detection - NER For Legal Documents.
- Legal Case Summarisation - Implementation of different summarisation algorithms applied to legal case judgements.
- Legal Documents Google Scholar - Using Google Scholar to extract cases programmatically.
- Chat Bot - Chat-bot and email notifications.
- GDPR scores - Predicting GDPR Scores for Legal Documents.
- Driving Factors FINRA - Identify the driving factors that influence the FINRA arbitration decisions.
- Securities Bias Correction - Bias-Corrected Estimation of Price Impact in Securities Litigation.
- Public Firm to Legal Decision - Embed public firms based on their reaction to legal decisions.
- Supreme Court Prediction - Predicting the ideological direction of Supreme Court decisions: ensemble vs. unified case-based model.
- Supreme Court Topic Modeling - Multiple steps necessary to implement topic modeling on supreme court decisions.
- Judge Opinion - Using text mining and machine learning to analyze judges’ opinions for a particular concern.
- ML Law Matching - A machine learning law match maker.
- BERT Multi-label Classification - Fine-grained sentiment analysis from AI.
- Some Computational AI Course - Video series on law from MIT.
- Triage - General Purpose Risk Modeling and Prediction Toolkit for Policy and Social Good Problems.
- World Bank Poverty I - A comparative assessment of machine learning classification algorithms applied to poverty prediction.
- World Bank Poverty II - Repository for the World Bank Pover-t Test Competition solution.
- Overseas Company Land Ownership - Identifying foreign ownership in the UK.
- CFPB - Consumer Financial Protection Bureau complaints analysis.
- Cannabis Legalisation Effect - Effects of cannabis legalization on crime.
- Election Analysis - Election analysis and prediction models.
- American Election Causal - Using ANES data with causal inference models.
- Campaign Finance and Election Results - Investigating the relation between campaign finance and subsequent election results.
- Conflict Prediction - Notebooks on conflict prediction.
- Burglary Prediction - Spatio-Temporal Modelling for burglary prediction.
- Predicting Disease Outbreak - Machine learning approach based on multiple classifier algorithms.
- Road accident prediction - Predicting the type of victims in federal road accidents in Brazil.
- Text Mining - Disaster management using text mining.
- Twitter and disasters - Try to correctly predict whether tweets are about disasters.
- Traffic Prediction - Multi-attention recurrent neural networks for time-series (city traffic).
- Predict Crashes - Crash prediction modeling application that leverages multiple data sources.
- Predict Household Poverty - Predict the poverty of households in Costa Rica using automated feature engineering.
- Air Quality Prediction - Predict air quality (AQ) in Beijing and London in the next 48 hours.
- Water Accounting - Assembles water budget data for the US from existing data sources.
- Electricity French Distribution - An analysis of electricity data provided by the French Distribution Network (RTE).