adlsgen2

There are 24 repositories under the adlsgen2 topic.

  • procter-gamble-oss/octopufs

    The OctopuFS library helps manage cloud storage, specifically ADLS Gen2. It lets you operate on files (moving, copying, setting ACLs) efficiently. It is designed to work on Databricks, but should work on other platforms as well.

    Language: Scala
  • oleewere/fluent-plugin-azurestorage-gen2

    Fluentd output plugin for Azure Data Lake Storage Gen2 (append support)

    Language: Ruby
  • gerardwolf/blog

    Repository for all blog scripts and code

    Language: TSQL
  • jlsilva01/adls-azure

    Procedure for creating an Azure Data Lake Storage account using Terraform, through an MS Learn Sandbox subscription

    Language: HCL
  • paolosalvatori/blob-private-endpoint

    This sample demonstrates how to create a Linux Virtual Machine in a virtual network that privately accesses a blob storage account using an Azure Private Endpoint.

    Language: Shell
  • shubhammirajkar/tokyo_olympics_de_project

    Explore the Tokyo Olympics data journey! We ingested a GitHub CSV into Azure via Data Factory, stored it in Data Lake Storage Gen2, performed transformations in Databricks, conducted advanced analytics in Azure Synapse, and visualized insights in Synapse or Power BI.

    Language: Jupyter Notebook
  • ayush9892/Supply-Chain-ETL

    Data engineering project on supply chain ETL. It creates a dynamic ADF pipeline that ingests both full-load and incremental-load data from SQL Server, then transforms these datasets following the medallion architecture using Databricks.

    Language: Jupyter Notebook
  • easonlai/sas_access_to_adls_databricks

    Using SAS to authenticate to and access ADLS Gen2 from Azure Databricks

    Language: Jupyter Notebook
  • iBalajiShanmugam/covid19-adf

    COVID19-ADF is a project that leverages Azure services to collect, analyze, and visualize COVID-19 data. With seamless data integration and advanced analytics, it provides valuable insights into the pandemic's impact, enabling informed decision-making in the fight against COVID-19.

  • just-modeling/jupyterhub-k8s-apache-spark

    Deploy Apache Spark in client mode on a Kubernetes cluster and integrate it with Jupyter notebooks through a JupyterHub server.

    Language: Shell
  • sankamuk/ADLSGen2Admin

    Azure ADLS Gen2 CLI Tool

    Language: PowerShell
  • sumeghasetia/azure-dataplatform-setup

    Implementation of the most useful services of the Azure Data Platform.

    Language: TSQL
  • venkatakamaiah46/Azure

    POC projects working on Cloud Platforms

    Language: HTML
  • anideswandikar1/DataLakeUsageReport

    Utility to recursively traverse a given Azure Data Lake Gen2 account and report the sizes of its containers and folders

    Language: PowerShell
  • ayush9892/SynapseSQLPool-DynamicView

    Creating a pipeline that automatically creates a view of the data in Synapse whenever data arrives in ADLS Gen2.

  • bijoychaudhury/spark_aggregation_framework

    This repo contains the code for a SQL-driven Spark aggregation framework that runs on a Databricks cluster and integrates with an Azure storage account.

    Language: Scala
  • iBalajiShanmugam/formual1

    Explore Formula 1 data analytics with this project. Leveraging the Ergast API, it utilizes Databricks Spark for ingestion, transformation, and analysis. ADLS acts as the storage layer, while Power BI visualizes the ADLS presentation layer. Uncover insights in the world of Formula 1 through powerful data analytics.

    Language: Python
  • SonuRepo/paris_olympic_azure_project

    Explore the Paris Olympics data journey! We ingested a GitHub CSV into Azure via Data Factory, stored it in Data Lake Storage Gen2, performed transformations in Databricks, conducted analytics in Azure Synapse, and visualized insights in Synapse.

    Language: Jupyter Notebook
  • Srilekha-1106/databricksProject

    Implemented Azure Databricks for real-time data processing and governance using Unity Catalog, Spark Structured Streaming, Delta Lake features, Medallion Architecture, and end-to-end CI/CD pipelines. Focused on incremental loading, compute cluster management, maintaining data quality, and creating workflows.

    Language: Python
  • anshul-cached/sync-adls

    Azure Data Lake Gen2 Backup Sync

    Language: Python
  • ds-fau-ck/Near-Real-Time-AirBnB-Data-Pipeline-with-CDC-Implementation-on-Azure

    AirBnB CDC Ingestion Pipeline: Near Real-Time Change Data Capture (CDC) Pipeline on Azure for Seamless Integration of Continuous Data Streams

    Language: Python
  • epomatti/az-datalake

    Azure Data Lake Gen2 with azcopy

    Language: HCL
  • fnu-ankit/meta-data-driven-data-migration

    Azure data migration project that migrates data from an on-premises SQL Server to the Azure cloud using a metadata-driven approach.

  • Naveen018/azure-lendingclub

    Data files for an Azure cloud data engineering project