Pinned Repositories
adam
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
airflow-provider-pulumi
amazon-athena-execution-parameters-blog
amazon-emr-with-delta-lake
Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR
amazon-genomics-cli
amazon-omics-end-to-end-genomics
amazon-omics-tutorials
automation-api-examples
Examples for the Pulumi Automation API https://pkg.go.dev/github.com/pulumi/pulumi/sdk/v3/go/auto?tab=doc
Awesome-Federated-Machine-Learning
Everything about federated learning, including research papers, books, code, tutorials, videos and beyond
swissknife
Collection of scripts for genome analysis
alartin's Repositories
alartin/airflow-provider-pulumi
alartin/amazon-athena-execution-parameters-blog
alartin/amazon-genomics-cli
alartin/amazon-omics-end-to-end-genomics
alartin/amazon-omics-tutorials
alartin/automation-api-examples
Examples for the Pulumi Automation API https://pkg.go.dev/github.com/pulumi/pulumi/sdk/v3/go/auto?tab=doc
alartin/aws-genomics-workflows
Genomics Workflows on AWS
alartin/Bison-Fly
Bison-Fly is the NDSU Spring Wheat UAV Pipeline, developed in partnership with the Drone2Phenome UAV community (D2P). In this pipeline we present, step by step, how we have been applying UAV data in our breeding program. This is open-source R code that anyone can use and adapt for different crops. All suggestions are welcome. Help us to bui
alartin/Book_Multivariate_Statistical_Machine_Learning_For_Genomic_Prediction
alartin/cdk-emrserverless-with-delta-lake
This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the EMR Serverless job, you can also launch an EMR notebook via a cluster template to check the outcome from the EMR Serverless application.
alartin/city_pipeline
A data warehouse tech stack with Postgres, dbt, and Airflow
alartin/cpg-cost-control
A Cloud Function to handle GCP billing budget notifications.
alartin/cpg-infrastructure
This repository is used to manage the infrastructure at the CPG
alartin/devops-for-databricks
devops-for-databricks
alartin/emr-serverless-samples
Example code for running Spark and Hive jobs on EMR Serverless.
alartin/galaxy
Data intensive science for everyone.
alartin/GEFormer
GEFormer is a genome-wide prediction model for genotype-environment interactions based on a deep learning approach designed to predict maize phenotypes using genotype and environment jointly.
alartin/GenomicsDB
Highly performant data storage in C++ for importing, querying and transforming variant data with C/C++/Java/Spark bindings. Used in GATK4.
alartin/glow
An open-source toolkit for large-scale genomic analysis
alartin/GS_TrainingSet_Optimzation1
alartin/HaploNet
Haplotype and population structure inference using neural networks.
alartin/IRRI-OneRice-Simulation
alartin/multitrait-nirs-model
Modeling of nutritional traits from multiple crops using NIRS and machine learning/statistics
alartin/NeuralPLexer
NeuralPLexer: State-specific protein-ligand complex structure prediction with a multi-scale deep generative model
alartin/nirs-protein-prediction
We present here a 1D convolutional neural network model to predict grain protein content using spectroscopic data of multiple cereals
alartin/openapi-agent-examples
alartin/Plant_DNA_LLMs
PDLLMs: A group of tailored DNA large language models (LLMs) for analyzing plant genomes
alartin/Soybean_Trait_Prediction
Jupyter notebooks for the code used in "Machine learning prediction models outperform deep learning models, provide interpretation and facilitate successful feature selection for soybean trait prediction"
alartin/titans-pytorch
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch
alartin/torchdrug
A powerful and flexible machine learning platform for drug discovery