alhassan10ehab
AI Engineer and Data Scientist | MSc student at Queen's University | BSc in Biomedical Engineering
Cairo, Egypt
Pinned Repositories
Airbnb-price-category-prediction
This was a Kaggle competition. The problem is determining an appropriate price for a new Airbnb listing, which is a common challenge for hosts. The listing's characteristics and images are used to predict its price. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
Arabic-emotion-extraction-from-text-data-using-CAMeL-Lab-bert-base-arabic
The objective of this task is to assign one of three emotion labels (positive, mixed, negative) to a given text. The task involves text preprocessing, fine-tuning an Arabic transformer model (CAMeL-Lab/bert-base-arabic), and saving and loading the model weights for evaluation on separate datasets.
Clustering-and-Frequent-Pattern-Mining-on-a-bank-card-transaction
This project consists of two parts, both on bank card transaction data. The first part covers data exploration, statistics, hypothesis testing, and regression analysis. The second part covers clustering analysis and frequent pattern mining.
COVID-19-Outcome-Prediction
Different classifiers are designed to predict the outcome of COVID-19 (death/recovery) when a new person is admitted to the hospital.
Design-hypothesis-test-and-check-normality
Statistical and graphical tests were performed on the data to check properties such as normality. In addition, a hypothesis test was designed and performed, and the impact of null values was analyzed. The data can be found here: https://www.nyc.gov/site/finance/taxes/property-rolling-sales-data.page
distilbert-base-uncased-for-emotions-classification
Tokenization was performed using two different tokenizers provided by Hugging Face, and the "distilbert-base-uncased" model was used for emotion classification. The data can be found here: https://huggingface.co/datasets/dair-ai/emotion
Fake-Reddit-Prediction
This was a Kaggle competition. The problem was to predict whether each news item is fake or true, based on 60000 rows of fake and true news along with some injected noise that needed to be cleaned. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
Leaf-Classification-dataset-using-a-neural-network-architecture
In this project I tuned the hyperparameters of a fully connected network to get the best performance on the leaf classification dataset.
llm-zoomcamp
Named-Entity-Recognition-NER-from-Arabic-text-and-create-an-API-for-serving-NER-model.
For this task, I used the CAMeL-Lab/bert-base-arabic-camelbert-msa-ner model to extract entities (persons, organizations, and locations) from the provided data and added the entities to each file. In addition, I created and ran an API using Flask and displayed the results using Postman.
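The frequent pattern mining step in Clustering-and-Frequent-Pattern-Mining-on-a-bank-card-transaction can be illustrated with a minimal sketch. This is not the repository's code: the toy transactions, the category names, and the `frequent_itemsets` helper are all hypothetical, and the sketch uses brute-force support counting rather than whichever algorithm the project actually applied.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items.

    Support is the fraction of transactions that contain the itemset.
    Brute-force counting: fine for toy data, not for large datasets.
    """
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # sorted so each itemset has one canonical tuple
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    n = len(transactions)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Hypothetical merchant categories per card holder (illustrative only).
transactions = [
    ["grocery", "fuel"],
    ["grocery", "pharmacy"],
    ["grocery", "fuel", "restaurant"],
    ["fuel", "restaurant"],
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Raising `min_support` shrinks the result to the most common combinations; a real dataset would normally call for a pruning algorithm such as Apriori or FP-Growth instead of enumerating every combination.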
alhassan10ehab's Repositories
alhassan10ehab/Airbnb-price-category-prediction
This was a Kaggle competition. The problem is determining an appropriate price for a new Airbnb listing, which is a common challenge for hosts. The listing's characteristics and images are used to predict its price. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
alhassan10ehab/Arabic-emotion-extraction-from-text-data-using-CAMeL-Lab-bert-base-arabic
The objective of this task is to assign one of three emotion labels (positive, mixed, negative) to a given text. The task involves text preprocessing, fine-tuning an Arabic transformer model (CAMeL-Lab/bert-base-arabic), and saving and loading the model weights for evaluation on separate datasets.
alhassan10ehab/Clustering-and-Frequent-Pattern-Mining-on-a-bank-card-transaction
This project consists of two parts, both on bank card transaction data. The first part covers data exploration, statistics, hypothesis testing, and regression analysis. The second part covers clustering analysis and frequent pattern mining.
alhassan10ehab/COVID-19-Outcome-Prediction
Different classifiers are designed to predict the outcome of COVID-19 (death/recovery) when a new person is admitted to the hospital.
alhassan10ehab/Design-hypothesis-test-and-check-normality
Statistical and graphical tests were performed on the data to check properties such as normality. In addition, a hypothesis test was designed and performed, and the impact of null values was analyzed. The data can be found here: https://www.nyc.gov/site/finance/taxes/property-rolling-sales-data.page
alhassan10ehab/distilbert-base-uncased-for-emotions-classification
Tokenization was performed using two different tokenizers provided by Hugging Face, and the "distilbert-base-uncased" model was used for emotion classification. The data can be found here: https://huggingface.co/datasets/dair-ai/emotion
alhassan10ehab/Fake-Reddit-Prediction
This was a Kaggle competition. The problem was to predict whether each news item is fake or true, based on 60000 rows of fake and true news along with some injected noise that needed to be cleaned. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
alhassan10ehab/Leaf-Classification-dataset-using-a-neural-network-architecture
In this project I tuned the hyperparameters of a fully connected network to get the best performance on the leaf classification dataset.
alhassan10ehab/llm-zoomcamp
alhassan10ehab/Named-Entity-Recognition-NER-from-Arabic-text-and-create-an-API-for-serving-NER-model.
For this task, I used the CAMeL-Lab/bert-base-arabic-camelbert-msa-ner model to extract entities (persons, organizations, and locations) from the provided data and added the entities to each file. In addition, I created and ran an API using Flask and displayed the results using Postman.
alhassan10ehab/time_series_final_project_data_analysis
This was the final project for the data analysis course. My team and I compared the performance of transformers, linear models, and LSTMs on time series forecasting.
alhassan10ehab/performing-NLI-task-using-roberta-and-deberta
In this repo, an NLI task was performed on the ContractNLI dataset using DeBERTa and RoBERTa. It also includes analyses such as performance reporting and error analysis for each model. The data can be found here: https://stanfordnlp.github.io/contract-nli/
alhassan10ehab/Product-Rating-Prediction
This was a Kaggle competition. The problem was to predict product ratings based on features such as merchant_rating, product_colo, price, and others. The output was the product rating, which has five classes from 1 to 5. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
alhassan10ehab/Social-Media-Analytics-using-pyspark
For this project, my team and I used Spark and a suite of related big data tools to analyze social media data and gain insights into user behavior, trends, and sentiment. The goal was to find out which topics are popular, what people are saying about a particular brand or product, and how users engage with social media content.
alhassan10ehab/The-Fashion-MNIST-clothing-classification-dataset-using-CNN-and-transfer-learning
The goal of this project was to classify the Fashion MNIST clothing images. I implemented the LeNet-5 architecture from scratch and used two transfer learning models (VGG-16 and ResNet).
alhassan10ehab/topic_modeling
This lab covers topic modeling using BERTopic, which is considered one of the popular text clustering methods. The data contains 50,000 randomly sampled scientific papers collected from this link: https://www.kaggle.com/datasets/Cornell-University/arxiv
alhassan10ehab/Whether-a-first-date-will-lead-to-a-relationship
This was a Kaggle competition. The problem was to predict whether a first date will lead to a relationship, based on the profiles of the two people. A recommendation system was implemented. The given data had many missing values, which were handled. You can see my Kaggle competitions here: https://www.kaggle.com/hassanehab/competitions?tab=completed
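The normality check described for Design-hypothesis-test-and-check-normality can be sketched with a Jarque-Bera test written in plain Python. This is an illustration under assumptions, not the repository's code: the repository does not say which normality tests were used, and the `jarque_bera` helper and both sample lists are hypothetical. The statistic is n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis; under the null hypothesis of normality it is asymptotically chi-square distributed with 2 degrees of freedom, whose survival function is exp(-x/2).

```python
import math


def jarque_bera(values):
    """Return (statistic, p_value) for the Jarque-Bera normality test.

    The p-value relies on the asymptotic chi-square(2) distribution,
    so it is only a rough guide for small samples.
    """
    n = len(values)
    mean = sum(values) / n
    # Biased central moments, as used in the standard JB definition.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = math.exp(-statistic / 2)  # chi-square(2) survival function
    return statistic, p_value


# Hypothetical samples: one symmetric, one heavily right-skewed.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
skewed = [1.0] * 95 + [100.0] * 5

sym_stat, sym_p = jarque_bera(symmetric)
skew_stat, skew_p = jarque_bera(skewed)
print(f"symmetric: JB={sym_stat:.3f}, p={sym_p:.3f}")
print(f"skewed:    JB={skew_stat:.1f}, p={skew_p:.3g}")
```

A small p-value (for example below 0.05) suggests rejecting normality. On real data a graphical check such as a Q-Q plot is a useful complement, since the test becomes very sensitive on large samples.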