/portfolio

This repository showcases my data analytics and engineering skills, shares my projects, and tracks my progress.

Data Engineer Portfolio

Greetings! My name is Lucjan, and I'm excited to share my still-developing data engineering portfolio. In this repository you'll find a catalog of projects completed in data analytics and engineering courses or as self-development exercises, each covering essential skills and techniques.

  • Brief overview: The Apache Beam model was used to extract, transform, and load (ETL) data from the BigQuery dataset bigquery-public-data.london_bicycles.cycle_hire, which holds detailed information about London bicycle hires, to gain insight into London cycling behaviour. The task was to count the rides from one station to another and present the results as a text file with entries of the form (start_id, end_id, number_of_rides).
  • Technology used: Python, Apache Beam, GCP, Google Cloud SDK Shell
  • Final results: output text file with the ride counts
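
The ride-counting step described above can be sketched in plain Python. This is only an illustration of the grouping logic, not the Beam pipeline from the project: the sample rows, the function names `count_rides` and `format_lines`, and the "busiest pair first" ordering are all assumptions made for the example. In a Beam pipeline the same idea would typically be expressed by mapping each row to a (start_id, end_id) key and applying a per-key count.

```python
from collections import Counter

# Hypothetical sample rows; in the project the rows come from the
# bigquery-public-data.london_bicycles.cycle_hire table.
sample_rides = [
    {"start_id": 1, "end_id": 2},
    {"start_id": 1, "end_id": 2},
    {"start_id": 2, "end_id": 1},
    {"start_id": 1, "end_id": 3},
]


def count_rides(rides):
    """Count the rides for each (start_id, end_id) station pair."""
    return Counter((ride["start_id"], ride["end_id"]) for ride in rides)


def format_lines(counts):
    """Render one 'start_id,end_id,number_of_rides' line per station pair.

    Pairs are ordered by descending ride count, then by station ids
    (an assumed ordering, chosen only to make the output deterministic).
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{start},{end},{rides}" for (start, end), rides in ordered]


lines = format_lines(count_rides(sample_rides))
for line in lines:
    print(line)
```

Writing `lines` to a file would give a text output in the same (start_id, end_id, number_of_rides) shape as the project result.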

  • Brief overview: Apache Spark was used on an AWS EMR cluster to process a large volume of data. The objective was to analyze 27 million ratings of 58,000 movies provided by 280,000 users and find the movies most similar to a selected movie.
  • Technology used: AWS (EMR on EC2, S3), Apache Spark, PySpark, Python
  • Final results: the top 10 movies most similar to "Star Wars: Episode IV - A New Hope"
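
One common way to find similar movies from ratings is item-to-item similarity over users who rated both movies. The sketch below shows that idea in plain Python on a tiny made-up dataset. It rests on assumptions: the project description does not say which similarity measure was used, so cosine similarity is chosen here for illustration, and the names `pair_similarities`, `most_similar`, and `min_co_ratings` are hypothetical. It is not the PySpark job itself.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical (user_id, movie_id, rating) triples standing in for the real data.
ratings = [
    (1, "A", 5.0), (1, "B", 4.0), (1, "C", 1.0),
    (2, "A", 4.0), (2, "B", 5.0), (2, "C", 2.0),
    (3, "A", 1.0), (3, "C", 5.0),
]


def pair_similarities(ratings):
    """Return {(movie1, movie2): (cosine_similarity, co_rating_count)}."""
    by_user = defaultdict(list)
    for user, movie, rating in ratings:
        by_user[user].append((movie, rating))

    # Running sums per movie pair: [sum(x*x), sum(y*y), sum(x*y), count]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for rated in by_user.values():
        # Sorting keeps each pair key in a stable (smaller, larger) order.
        for (m1, r1), (m2, r2) in combinations(sorted(rated), 2):
            acc = sums[(m1, m2)]
            acc[0] += r1 * r1
            acc[1] += r2 * r2
            acc[2] += r1 * r2
            acc[3] += 1

    result = {}
    for pair, (xx, yy, xy, n) in sums.items():
        denom = sqrt(xx) * sqrt(yy)
        result[pair] = (xy / denom if denom else 0.0, n)
    return result


def most_similar(target, sims, top_n=10, min_co_ratings=1):
    """List (movie, score, co_rating_count) for the movies closest to target."""
    candidates = []
    for (m1, m2), (score, n) in sims.items():
        if n >= min_co_ratings and target in (m1, m2):
            other = m2 if m1 == target else m1
            candidates.append((other, score, n))
    # Highest score first; ties broken by the larger number of co-ratings.
    return sorted(candidates, key=lambda c: (-c[1], -c[2]))[:top_n]


top = most_similar("A", pair_similarities(ratings))
```

On a real dataset, raising `min_co_ratings` filters out pairs that look similar only because very few users rated both movies.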

  • Brief overview: In this project, Apache Kafka was used to test streaming data transmission.
  • Technology used: AWS (EC2, S3, Crawler, Amazon Athena), Python
  • Outcome: data uploaded continuously to an S3 bucket while the program was running

  • Brief overview: This case study was completed as part of the Google Data Analytics Certificate.
  • Methodology: data preprocessing, data cleaning, data analysis, visualization, drawing conclusions, creating a strategy proposal
  • Technology used: Python, pandas, matplotlib, NumPy, seaborn
  • Final results: analysis and visualisation

Others:

My LinkedIn Profile

My Tableau Profile