tzuV
Data Scientist at CLINDA-AAU & Health Data Science Sandbox. My main focus is generation and evaluation of synthetic health data.
tzuV's Stars
statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
goldmansachs/gs-quant
Python toolkit for quantitative finance
sdv-dev/SDV
Synthetic data generation for tabular data
synthetichealth/synthea
Synthetic Patient Population Simulator
scorpionhiccup/StockPricePrediction
Stock Price Prediction using Machine Learning Techniques
zalandoresearch/pytorch-ts
PyTorch based Probabilistic Time Series forecasting framework based on GluonTS backend
IBM/differential-privacy-library
Diffprivlib: The IBM Differential Privacy Library
kelvinau/crypto-arbitrage
Automatic Cryptocurrency Trading Bot using Triangular or Exchange Arbitrages
OpenMined/PyDP
The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy
PacktPublishing/Mastering-Python-for-Finance-Second-Edition
Mastering Python for Finance – Second Edition, published by Packt
rolling-panda-san/notebooks
Analysis on systematic trading strategies (e.g., trend-following, carry and mean-reversion). The result is regularly updated.
marketsentiment/mslive_public
Track live sentiment for stocks from Reddit and Twitter and identify growing stocks
stan-dev/pystan
PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io
loft-br/xgboost-survival-embeddings
Improving XGBoost survival analysis with embeddings and debiased estimators
HongshengHu/membership-inference-machine-learning-literature
ArtLabss/open-data-anonymizer
Python Data Anonymization & Masking Library For Data Science Tasks
KRR-Oxford/DeepOnto
A package for ontology engineering with deep learning and language models.
StephanAkkerman/fintwit-bot
FinTwit-Bot is a Discord bot designed to track and analyze financial markets by pulling data from platforms like Twitter, Reddit, and Binance. It features customizable tools for sentiment analysis, market trends, and portfolio tracking to help traders stay informed and make data-driven decisions.
darioradecic/python-1-billion-row-challenge
IFCA-Advanced-Computing/pycanon
pyCANON is a Python library and CLI to assess the values of the parameters associated with the most common privacy-preserving techniques.
glassonion1/anonypy
Anonymization library for Python. Protect the privacy of individuals.
glasgowcompbio/pyMultiOmics
Python toolbox for multi-omics data mapping and analysis
sambofra/bnstruct
R package for Bayesian Network Structure Learning
bhjadhav/EHR_Incentive_Program_Analysis_Python
The Medicare Electronic Health Record (EHR) Incentive Program provides incentives to eligible clinicians and hospitals to adopt electronic health records. This dataset combines meaningful use attestations from the Medicare EHR Incentive Program and certified health IT product data from the ONC Certified Health IT Product List (CHPL) to identify the unique vendors, products, and product types of each certified health IT product used to attest to meaningful use. (data, 2017) Data set merges information about the Centers for Medicare and Medicaid Services, Medicare and Medicaid EHR Incentive Programs attestations with the Office of the National Coordinator for Health IT Certified Health IT Products List. This new dataset enables systematic analysis of the distribution of certified EHR vendors and products among those providers that have attested to meaningful use within the CMS EHR Incentive Programs. The data set can be analyzed by state, provider type, provider specialty, and practice setting. (Technology, 2017) The dataset also includes important provider-specific data, related to the provider's participation and status in the program, unique provider identifiers, and other characteristics unique to each provider, like geography and provider type. Because providers may declare more than one EHR product when attesting, this list also provides a unique ID (i.e. NPI) for each provider. The Medicare EHR Incentive Program provides incentive payments to eligible providers as they adopt, implement, upgrade, or demonstrate meaningful use of certified EHR technology. The CHPL provides the authoritative, comprehensive listing of certified health IT products that have been tested under the ONC Certification Program. (data, 2017) The complete dataset exceeds 1 million rows of data. 
This data is intended to provide names of EHR products and their vendors, the certification classification of each product (Complete or Modular), the healthcare setting for which the product was certified (Ambulatory or Inpatient), the type of provider attesting to “meaningful use” of an EHR, the Incentive Program the provider attested in (Medicare or Medicare/Medicaid), a unique ID for each attestation, the version of the EHR product, and the Stage of Meaningful Use that the provider attested to (Stage 1/Stage 2). The dataset is 370 MB in size and has 23 columns. It covers April 2011 to the present, which makes it useful for finding trends.
JoshWeiner/ml-impute
A package for synthetic data generation for imputation using single and multiple imputation methods.
city-knowledge-graphs/phd-course
PhD course on Knowledge Graphs
BiomedDAR/copula-tabular
Generate tabular synthetic data using Gaussian copulas
GoreLab/sorghum-multi-trait
nscharrenberg/TabuGAN
A Tabular GAN with Attention Mechanisms, Reinforcement Learning, Knowledge Graphs and Clustering
suyunu/Markov-Chain-Monte-Carlo
Some cool Markov Chain Monte Carlo implementations