tzuV

Data Scientist at CLINDA-AAU & Health Data Science Sandbox. My main focus is generation and evaluation of synthetic health data.

tzuV's Stars

statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
Language:Python9.8k 282 5.4k2.8k
goldmansachs/gs-quant
Python toolkit for quantitative finance
Language:Jupyter Notebook7k 152 30849
sdv-dev/SDV
Synthetic data generation for tabular data
Language:Python2.2k 45 1.3k295
synthetichealth/synthea
Synthetic Patient Population Simulator
Language:Java2.1k 73 569623
scorpionhiccup/StockPricePrediction
Stock Price Prediction using Machine Learning Techniques
Language:Jupyter Notebook1.3k 84 14413
zalandoresearch/pytorch-ts
PyTorch based Probabilistic Time Series forecasting framework based on GluonTS backend
Language:Python1.2k 26 139191
IBM/differential-privacy-library
Diffprivlib: The IBM Differential Privacy Library
Language:Python801 32 40195
kelvinau/crypto-arbitrage
Automatic Cryptocurrency Trading Bot using Triangular or Exchange Arbitrages
Language:Python720 87 24198
OpenMined/PyDP
The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy
Language:Python494 20 158136
PacktPublishing/Mastering-Python-for-Finance-Second-Edition
Mastering Python for Finance – Second Edition, published by Packt
Language:Jupyter Notebook427 25 7203
rolling-panda-san/notebooks
Analysis on systematic trading strategies (e.g., trend-following, carry and mean-reversion). The result is regularly updated.
Language:Jupyter Notebook413 16 476
marketsentiment/mslive_public
Track live sentiment for stocks from Reddit and Twitter and identify growing stocks
Language:Python351 34 3106
stan-dev/pystan
PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io
Language:Python327 13 19958
loft-br/xgboost-survival-embeddings
Improving XGBoost survival analysis with embeddings and debiased estimators
Language:Python318 85 4453
HongshengHu/membership-inference-machine-learning-literature
256 12 033
ArtLabss/open-data-anonymizer
Python Data Anonymization & Masking Library For Data Science Tasks
Language:Python231 7 929
KRR-Oxford/DeepOnto
A package for ontology engineering with deep learning and language models.
Language:Python177 5 1511
StephanAkkerman/fintwit-bot
FinTwit-Bot is a Discord bot designed to track and analyze financial markets by pulling data from platforms like Twitter, Reddit, and Binance. It features customizable tools for sentiment analysis, market trends, and portfolio tracking to help traders stay informed and make data-driven decisions.
Language:Python54 4 32711
darioradecic/python-1-billion-row-challenge
Language:Python434
IFCA-Advanced-Computing/pycanon
pyCANON is a Python library and CLI to assess the values of the parameters associated with the most common privacy-preserving techniques.
Language:Python25 5 14
glassonion1/anonypy
Anonymization library for python. Protect the privacy of individuals.
Language:Python23 3 28
glasgowcompbio/pyMultiOmics
Python toolbox for multi-omics data mapping and analysis
Language:Jupyter Notebook17 2 244
sambofra/bnstruct
R package for Bayesian Network Structure Learning
Language:R17 4 3211
bhjadhav/EHR_Incentive_Program_Analysis_Python
The Medicare Electronic Health Record (EHR) Incentive Program provides incentives to eligible clinicians and hospitals to adopt electronic health records. This dataset combines meaningful use attestations from the Medicare EHR Incentive Program and certified health IT product data from the ONC Certified Health IT Product List (CHPL) to identify the unique vendors, products, and product types of each certified health IT product used to attest to meaningful use. (data, 2017) Data set merges information about the Centers for Medicare and Medicaid Services, Medicare and Medicaid EHR Incentive Programs attestations with the Office of the National Coordinator for Health IT Certified Health IT Products List. This new dataset enables systematic analysis of the distribution of certified EHR vendors and products among those providers that have attested to meaningful use within the CMS EHR Incentive Programs. The data set can be analyzed by state, provider type, provider specialty, and practice setting. (Technology, 2017) The dataset also includes important provider-specific data, related to the provider's participation and status in the program, unique provider identifiers, and other characteristics unique to each provider, like geography and provider type. Because providers may declare more than one EHR product when attesting, this list also provides a unique ID (i.e. NPI) for each provider. The Medicare EHR Incentive Program provides incentive payments to eligible providers as they adopt, implement, upgrade, or demonstrate meaningful use of certified EHR technology. The CHPL provides the authoritative, comprehensive listing of certified health IT products that have been tested under the ONC Certification Program. (data, 2017) The complete dataset exceeds 1 million rows of data. 
This data is intended to provide the names of EHR products and their vendors, the certification classification of each product (Complete or Modular), the healthcare setting for which the product was certified (Ambulatory or Inpatient), the type of provider attesting to “meaningful use” of an EHR, the Incentive Program the provider attested in (Medicare or Medicare/Medicaid), a unique ID for each attestation, the version of the EHR product, and the Stage of Meaningful Use that the provider attested to (Stage 1/Stage 2). The dataset is 370 MB and has 23 columns. It covers April 2011 to the present, which makes it useful for finding trends.
5
JoshWeiner/ml-impute
A package for synthetic data generation for imputation using single and multiple imputation methods.
Language:Python4 2 00
city-knowledge-graphs/phd-course
PhD course on Knowledge Graphs
Language:Python32
BiomedDAR/copula-tabular
Generate tabular synthetic data using Gaussian copulas
Language:Python23
GoreLab/sorghum-multi-trait
Language:Python2
nscharrenberg/TabuGAN
A Tabular GAN with Attention Mechanisms, Reinforcement Learning, Knowledge Graphs and Clustering
Language:Jupyter Notebook20
suyunu/Markov-Chain-Monte-Carlo
Some cool Markov Chain Monte Carlo implementations
Language:Jupyter Notebook21
