Pinned Repositories
42-CFR
Random scripts and work
hospital-chargemaster
hospital chargemaster lists for open source healthcare
nexmon_csi
Channel State Information Extraction on Various Broadcom Wi-Fi Chips
predicting-Paid-amount-for-Claims-Data
Introduction: The context is the 2016 public use NH medical claims files obtained from NH CHIS (Comprehensive Health Care Information System). The dataset contains commercial insurance claims and a small fraction of Medicaid and Medicare payments for dually eligible people. The primary purpose of this assignment is to test machine learning (ML) skills in a real case analysis setting. You are expected to clean and process the data and then apply various ML techniques, including linear and nonlinear models such as regularized regression, MARS, and partitioning methods. You are expected to use at least two of R, Python, and JMP.
Data details: The medical claims file for 2016 contains ~17 million rows and ~60 columns of data, covering ~6.5 million individual medical claims. These are all commercial claims filed by healthcare providers in 2016 in the state of NH. About 88% of the claims were for residents of NH and the remainder for out-of-state visitors who sought care in NH. Each claim consists of one or more line items, each indicating a procedure done during the doctor's visit. Two columns, the billed amount and the paid amount for the care provided, are of primary interest. The main objective is to predict "paid amount per procedure" from the many features available in the dataset. You are also expected to create new features from the existing ones or from external data sources.
Objectives:
Step 1: Take a random sample of 1 million unique claims, such that all line items related to each claim are included in the sample. This will result in a little less than 3 million rows of data.
Step 2: Clean up the data, understand the distributions, and create new features if necessary.
Step 3: Run predictive models using a validation method of your choice.
Step 4: Write a descriptive report (less than 10 pages) describing the process and your findings.
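Step 1 of the assignment (sampling whole claims rather than individual rows) can be sketched as follows. This is only an illustration in plain Python on toy data, not code from the repository: the field names `claim_id` and `paid_amount`, the helper `sample_claims`, and the toy rows are all assumptions, and the real NH CHIS column names may differ.

```python
import random


def sample_claims(rows, n_claims, claim_key="claim_id", seed=0):
    """Pick n_claims distinct claim IDs at random and keep all their line items.

    `rows` is a list of dicts, one per line item. `claim_key` is a
    hypothetical column name used here for illustration only.
    """
    # Sample at the claim level, not the row level, so that no claim
    # ends up in the sample with only some of its line items.
    claim_ids = sorted({row[claim_key] for row in rows})
    rng = random.Random(seed)
    chosen = set(rng.sample(claim_ids, min(n_claims, len(claim_ids))))
    return [row for row in rows if row[claim_key] in chosen]


# Toy data: four claims with one to three line items each (made-up values).
toy_rows = [
    {"claim_id": "A", "paid_amount": 10.0},
    {"claim_id": "A", "paid_amount": 5.0},
    {"claim_id": "B", "paid_amount": 20.0},
    {"claim_id": "C", "paid_amount": 7.5},
    {"claim_id": "C", "paid_amount": 2.5},
    {"claim_id": "C", "paid_amount": 1.0},
    {"claim_id": "D", "paid_amount": 40.0},
]

sampled = sample_claims(toy_rows, n_claims=2, seed=42)
print(len({row["claim_id"] for row in sampled}), "claims,", len(sampled), "rows")
```

Sampling claim IDs first and filtering afterwards is what keeps every line item of a selected claim together; sampling rows directly would split claims across the sample boundary.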
PyTorchNLPBook
Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://amzn.to/3JUgR2L
wsheffel's Repositories
wsheffel/3d-network-graph-visualization
wsheffel/chime
COVID-19 Hospital Impact Model for Epidemics
wsheffel/COVID-19
Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
wsheffel/covid-19-data
An ongoing repository of data on coronavirus cases and deaths in the U.S.
wsheffel/COVID-20
COVID-19 Detector from x-rays using Computer Vision and Deep Learning
wsheffel/COVID-CT
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
wsheffel/COVID-Net
COVID-Net Open Source Initiative
wsheffel/COVID-QA
API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.
wsheffel/covid19_scenarios
Models of COVID-19 outbreak trajectories and hospital demand
wsheffel/covid19india-react
📊 Source code of the main website
wsheffel/covid19model
Code for modelling estimated deaths and cases for COVID19.
wsheffel/covidify
Covidify - corona virus report and dataset generator for python 📈
wsheffel/Fetching-Financial-Data
Fetching financial data for technical & fundamental analysis and algorithmic trading from a variety of python packages and sources.
wsheffel/Fraud-Dashboard
wsheffel/Health_Analytics_Sample_Code
A collection of some analysis done with medical claims data from my class
wsheffel/Health_Insurance_Data_Analysis_and_Prediction
Predicting health insurance costs (using linear regression) and grouping health insurance claims (using logistic regression)
wsheffel/Healthcare-Data-Mining-Projects
Data mining projects include predicting risk score of chronic diseases with NHANES data and analysis of patient and insurance claim data.
wsheffel/hvplot
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
wsheffel/medcat-cogstack-workshop
wsheffel/moai
🗿 Pharma and healthcare competitive intelligence through product website FDA OPDP update frequency.
wsheffel/netflix-titles-dataset
Exploratory Data Analysis on Netflix Movies and TV Shows
wsheffel/pneumonia-detection
Using pretrained ResNet for pneumonia detection
wsheffel/Regression_in_Insurance
This repo is for a logistic regression problem using an insurance company as a case study
wsheffel/rentorbuy
A project that uses Zillow research data on Quandl, Prophet for time series forecasting, Altair for Vega-Lite charts, and Folium for creating an interactive map.
wsheffel/stock-analysis-engine
Backtest 1000s of minute-by-minute trading algorithms for training AI with automated pricing data from: IEX, Tradier and FinViz. Datasets and trading performance automatically published to S3 for building AI training datasets for teaching DNNs how to trade. Runs on Kubernetes and docker-compose. >150 million trading history rows generated from +5000 algorithms. Heads up: Yahoo's Finance API was disabled on 2019-01-03 https://developer.yahoo.com/yql/
wsheffel/stock_cnn_blog_pub
This project is a loose implementation of the paper "Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach"
wsheffel/streamlit-observable
Embed Observable notebooks into Streamlit apps!
wsheffel/Team-Peony-Primer
Primer for BIME 535
wsheffel/world-atlas
Pre-built TopoJSON from Natural Earth.
wsheffel/yellowbrick
Visual analysis and diagnostic tools to facilitate machine learning model selection.