data-science-portfolio

Hello, I'm Josep! 👋 Here you can find some of my data projects in one place! ✨👨🏻‍💻 Hope you enjoy it :)

Overview

Introduction

I am Josep Ferrer, a Data Scientist, Analytics Engineer and Technical Writer from Barcelona. I love working with data and firmly believe in AI's power to enhance people's lives.

So... I want to share my passion with others and guide them into this vast field through writing and teaching.

Data Visualization

AirBnB

This project includes a Tableau dashboard and a Power BI dashboard that analyze Airbnb activity in the city of Barcelona. They provide insights into pricing trends, geographical distribution, and property types, and the visualizations are designed to help understand the dynamics of Airbnb in Barcelona.

Tourists in Catalonia

This project offers a Power BI dashboard on tourism trends in Catalonia. It covers tourist demographics, seasonal trends, and geographical distribution.

Data Engineering

Chicago Taxis

This project automates the data pipeline for Chicago taxi data, ensuring that historical data is fully fetched and stored and that new data is loaded daily. The repository contains two main functions that handle the historical and recurrent fetching, processing, and integration of Chicago taxi data into BigQuery using Google Cloud Functions. The raw data is modeled into five tables to make it easier to use.
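To illustrate the split between the historical and the recurrent load, here is a minimal sketch of how the date windows for each job could be computed. It is not the code from the repository: the function names, the 30-day batch size, and the rule of only loading complete days are assumptions made for this example, and the actual BigQuery and Cloud Functions calls are left out.

```python
from datetime import date, timedelta


def historical_windows(start: date, end: date, days_per_batch: int = 30):
    """Yield (batch_start, batch_end) pairs covering [start, end] inclusive.

    Hypothetical helper: splits the backfill into fixed-size batches.
    """
    if days_per_batch < 1:
        raise ValueError("days_per_batch must be at least 1")
    cursor = start
    while cursor <= end:
        batch_end = min(cursor + timedelta(days=days_per_batch - 1), end)
        yield cursor, batch_end
        cursor = batch_end + timedelta(days=1)


def daily_window(last_loaded: date, today: date):
    """Return the (start, end) range the daily job still has to fetch.

    Assumption: only complete days are loaded, so the range ends yesterday.
    Returns None when there is nothing new to load.
    """
    start = last_loaded + timedelta(days=1)
    end = today - timedelta(days=1)
    return (start, end) if start <= end else None
```

The historical job would iterate over `historical_windows(...)` once, while the scheduled job would call `daily_window(...)` on every run and skip the load when it returns `None`.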

Barcelona daily KPIs

This repository contains the code supporting a series of Medium articles about data ingestion and data visualization with Google Looker Studio and Elastic Cloud, two of the most popular platforms for that purpose today.

Data Analytics

Articles

This repository contains the code for some of my articles with DataCamp and KDnuggets.

EDA recipe site model development

This repository contains the code and resources used for the DataCamp Data Scientist certification exam project. The project focuses on developing a model to predict which recipes will generate high traffic on a recipe website. The analysis and model development process is documented in detail in the repository.

YouGov Data Modeling

The primary goal of this repository is to demonstrate the transformation of a complex input dataset (an SPSS file with over 2,230 columns) into three comprehensible, interrelated tables. The purpose of this data modeling is to:

  • Capture individual survey responses.
  • Identify global trends by country.
  • Assess the branding of various destinations as tourist spots.
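As a rough sketch of the idea, the snippet below splits a tiny, made-up wide table into three smaller structures that mirror the goals above. Everything here is an assumption for illustration: the column names (`respondent_id`, `country`, the `brand_` prefix), the toy rows, and the choice of a respondent count and a mean score as aggregates. The real repository works on the SPSS file and may model the tables differently.

```python
def split_wide_rows(rows, id_key="respondent_id", country_key="country",
                    brand_prefix="brand_"):
    """Split wide survey rows into three illustrative tables.

    Returns (responses, respondents_by_country, branding):
      - responses: one record per respondent and question (long format)
      - respondents_by_country: number of respondents per country
      - branding: mean score per destination, taken from brand_* columns
    """
    responses = []
    respondents_by_country = {}
    brand_scores = {}
    for row in rows:
        respondent = row[id_key]
        country = row[country_key]
        respondents_by_country[country] = respondents_by_country.get(country, 0) + 1
        for column, value in row.items():
            if column in (id_key, country_key):
                continue
            responses.append(
                {"respondent_id": respondent, "question": column, "answer": value}
            )
            if column.startswith(brand_prefix):
                destination = column[len(brand_prefix):]
                brand_scores.setdefault(destination, []).append(value)
    branding = {dest: sum(s) / len(s) for dest, s in brand_scores.items()}
    return responses, respondents_by_country, branding


# Toy input with invented values, only to show the shape of the data.
toy_rows = [
    {"respondent_id": 1, "country": "ES", "q_age": 34, "brand_barcelona": 4},
    {"respondent_id": 2, "country": "FR", "q_age": 51, "brand_barcelona": 2},
]
responses, respondents_by_country, branding = split_wide_rows(toy_rows)
```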

End-to-end projects

Simple Docker HuggingFace Model

This repository contains the code for the article A Simple to Implement End-to-end Project with HuggingFace: Generating a Ready-to-use HuggingFace Model with FastAPI and Docker. The project demonstrates how to deploy a sentiment analysis model using HuggingFace, FastAPI, and Docker.

LLM, ML & AI

Articles

This repository contains the code for some of my articles with DataCamp and KDnuggets.

Mars: MedicAI receptionist

A virtual agent developed in Python using SciPy, NLTK, and NeuralCoref, among other packages. Protégé was used to implement our custom ontology, which simulates the hospital's doctors, areas, and distributions. The project was developed within the Natural Language Interaction course of the MIIS (UPF). Dialogue generation is implemented through a template-based subsystem.
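For readers unfamiliar with template-based dialogue generation, here is a minimal sketch of the general technique using only the Python standard library. The intents, slot names, and sentences are hypothetical and are not taken from the project, which may organize its templates differently.

```python
from string import Template

# Hypothetical intents and wording, invented for this example.
TEMPLATES = {
    "greet": Template("Hello, welcome to the hospital. How can I help you?"),
    "locate_doctor": Template("Dr. $doctor works in the $area area."),
    "fallback": Template("Sorry, I did not understand that. Could you rephrase?"),
}


def generate_reply(intent, **slots):
    """Fill the template for an intent with the given slot values.

    Falls back to a generic reply when the intent is unknown or a
    required slot is missing.
    """
    template = TEMPLATES.get(intent, TEMPLATES["fallback"])
    try:
        return template.substitute(**slots)
    except KeyError:
        return TEMPLATES["fallback"].substitute()
```

For instance, `generate_reply("locate_doctor", doctor="Smith", area="cardiology")` fills both slots, while an unknown intent or a missing slot returns the fallback sentence.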

ChatPDF

This repository contains the code for implementing a simple ChatPDF application using Large Language Models (LLMs). The project demonstrates how to create a chat interface that allows users to interact with PDF documents through natural language queries.