wsheffel

Pinned Repositories

hospital-chargemaster
hospital chargemaster lists for open source healthcare
Language: Python
nexmon_csi
Channel State Information Extraction on Various Broadcom Wi-Fi Chips
Language: C
predicting-Paid-amount-for-Claims-Data
Introduction: The context is the 2016 public-use NH medical claims files obtained from NH CHIS (Comprehensive Health Care Information System). The dataset contains commercial insurance claims and a small fraction of Medicaid and Medicare payments for dually eligible people. The primary purpose of this assignment is to test machine learning (ML) skills in a real case-analysis setting. You are expected to clean and process the data and then apply various ML techniques, including linear and nonlinear models such as regularized regression, MARS, and partitioning methods. You are expected to use at least two of R, Python, and JMP.
Data details: The medical claims file for 2016 contains ~17 million rows and ~60 columns of data, covering ~6.5 million individual medical claims. These claims are all commercial claims that were filed by healthcare providers in 2016 in the state of NH. About 88% of the claims were for NH residents and the remainder for out-of-state visitors who sought care in NH. Each claim consists of one or more line items, each indicating a procedure done during the doctor’s visit. Two columns, the billed amount and the paid amount for the care provided, are of primary interest. The main objective is to predict “Paid amount per procedure” from the many features available in the dataset. You are also expected to create new features from the existing ones or from external data sources.
Objectives:
Step 1: Take a random sample of 1 million unique claims, such that all line items related to each claim are included in the sample. This will result in a little less than 3 million rows of data.
Step 2: Clean up the data, understand the distributions, and create new features if necessary.
Step 3: Run predictive models using a validation method of your choice.
Step 4: Write a descriptive report (fewer than 10 pages) describing the process and your findings.
Language: Jupyter Notebook
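Step 1 of the claims assignment (sample whole claims, not individual rows) can be sketched as follows. This is a minimal illustration and not the repository's code: the column name claim_id, the helper sample_claims, and the toy rows are all hypothetical, and a real run over ~17 million rows would more likely use a dataframe library.

```python
import random


def sample_claims(rows, n_claims, claim_key="claim_id", seed=0):
    """Return every line item belonging to a random sample of n_claims claims.

    `rows` is a list of dicts; `claim_key` is a hypothetical column name
    identifying which claim each line item belongs to.
    """
    # Distinct claim IDs, kept in first-seen order so a fixed seed is reproducible.
    claim_ids = list(dict.fromkeys(row[claim_key] for row in rows))
    if n_claims > len(claim_ids):
        raise ValueError("requested more claims than the data contains")
    # Sample at the claim level, then keep all rows for the chosen claims.
    chosen = set(random.Random(seed).sample(claim_ids, n_claims))
    return [row for row in rows if row[claim_key] in chosen]


# Toy data: three claims with one to three line items each (made-up values).
toy_rows = [
    {"claim_id": "A", "line": 1, "paid": 40.0},
    {"claim_id": "A", "line": 2, "paid": 15.5},
    {"claim_id": "B", "line": 1, "paid": 120.0},
    {"claim_id": "C", "line": 1, "paid": 8.0},
    {"claim_id": "C", "line": 2, "paid": 22.0},
    {"claim_id": "C", "line": 3, "paid": 5.0},
]

sample = sample_claims(toy_rows, n_claims=2)
```

Sampling claim IDs first, rather than rows, is what guarantees that no claim ends up in the sample with only some of its line items.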
PyTorchNLPBook
Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://amzn.to/3JUgR2L
Language: Jupyter Notebook

wsheffel's Repositories

wsheffel/Adversarial-Tradecraft-in-Cybersecurity
A repo to support the book
wsheffel/AMLSim
The AMLSim project is intended to provide a multi-agent based simulator that generates synthetic banking transaction data together with a set of known money laundering patterns - mainly for the purpose of testing machine learning models and graph algorithms. We welcome you to enhance this effort since the data set related to money laundering is critical to advance detection capabilities of money laundering activities.
Language: Python
wsheffel/api
services to access govinfo content and metadata
wsheffel/Architecting-Enterprise-React-Applications-with-Hooks
Architecting Enterprise React Applications with Hooks, published by Packt
wsheffel/bokeh
Interactive Data Visualization in the browser, from Python
Language: Python
wsheffel/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Language: Python
wsheffel/CogStack-SemEHR
Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment and Clinical Research
wsheffel/dash-sample-apps
Open-source demos hosted on Dash Gallery
wsheffel/Data-science
Collection of useful data science topics along with code and articles
Language: Jupyter Notebook
wsheffel/Data-Science-for-Marketing-Analytics-Second-Edition
wsheffel/datasets
Datasets used in Plotly examples and documentation
wsheffel/ExData_Plotting1
Plotting Assignment 1 for Exploratory Data Analysis
wsheffel/fastai
The fastai deep learning library
wsheffel/fastbook
The fastai book, published as Jupyter Notebooks
wsheffel/fastcore
Python supercharged for the fastai library
Language: Jupyter Notebook
wsheffel/flask
The Python micro framework for building web applications.
wsheffel/Hands-On-Data-Analysis-with-Pandas-2nd-edition
Materials for following along with Hands-On Data Analysis with Pandas – Second Edition
wsheffel/Healthcare-Fraud-Detection
An end-to-end web application, built with Streamlit, that predicts whether a healthcare provider is fraudulent based on inpatient claims, outpatient claims, and beneficiary details.
wsheffel/Jason-Brownlee-books
wsheffel/jekyll
Jekyll-based static site for The Programming Historian
Language: HTML
wsheffel/jinja
A very fast and expressive template engine.
Language: Python
wsheffel/jupyter-dash
Develop Dash apps in the Jupyter Notebook and JupyterLab
wsheffel/MedCAT
Medical Concept Annotation Tool
Language: Python
wsheffel/Microsoft-Azure-Security-Technologies-Certification-and-Beyond
Microsoft Azure Security Technologies Certification and Beyond, published by Packt
wsheffel/my-first-binderhub-
How to install BinderHub in repos to see and run code on GitHub
Language: Python
wsheffel/ovc
the Open Vision Computer
Language: C
wsheffel/plotly.js
Open-source JavaScript charting library behind Plotly and Dash
wsheffel/s3cogstack
Language: JavaScript
wsheffel/spark-nlp-workshop
Public runnable examples of using John Snow Labs' NLP for Apache Spark.
wsheffel/Synthea-Medgraph