ManikHossain08
Specialized in Data Engineering, SQL, Data Analysis, Machine Learning and Data Science.
Bell Canada
Montreal, Quebec, Canada
Pinned Repositories
AI-Face-Mask-Detector
Bixi-Cloud-ETL-Data-Pipeline-using-Scala-Hive-AWS_Athena_JDBC-Driver
An automated ETL data pipeline that extracts complex JSON data from a web API service (GBFS BIXI data) and converts it to CSV for loading into the HDFS data warehouse. Hive then processes the data further using external and managed tables. The same procedure is also applied with AWS S3 and Athena.
Code-Smell-Java-Code-analysis-tool
Exception handling anti-pattern code analysis tool
Code-Smell-Prediction_ML
Python: Code smell prediction using ML (10 cross-validations)
Customer-Segmentation_K-Means-Clustering-in-R
Which stock prices behave similarly? The organization wants to know which companies are similar to each other, to help identify potential customers for a SaaS solution (e.g. Salesforce CRM or equivalent) in various segments of the market. The Sales Department is very interested in this analysis, which will help it penetrate those market segments more easily.
IntelligentSIDC_JAVA_ADT
Student Identification Code: a Java ADT implementation with O(1) to O(n) insert, update, delete, and find operations for large-scale data
Kafka-Stream-Data-Pipeline-Near-Real-Time
Stream data into a pipeline in near real time using Kafka
PySpark-Recommender-System
A warm-up activity to get practical, hands-on experience with recommender systems in Spark. It uses the MovieLens dataset sample provided with Spark and available in the `data` directory.
Realtime-ETL-DataPipeline-Using-Avro_Schema_Registry-Spark-Kafka-HDFS-Hive-Scala
Big data processing (real-time ETL data pipeline) using Avro Schema Registry, Spark, Kafka, HDFS, Hive, Scala, Docker, and Spark Streaming
Supervised-Binary-Classifier-For-IoT-Data-Stream
Supervised Binary Classifier For IoT Data Stream
ManikHossain08's Repositories
ManikHossain08/Realtime-ETL-DataPipeline-Using-Avro_Schema_Registry-Spark-Kafka-HDFS-Hive-Scala
Big data processing (real-time ETL data pipeline) using Avro Schema Registry, Spark, Kafka, HDFS, Hive, Scala, Docker, and Spark Streaming
ManikHossain08/Bixi-Cloud-ETL-Data-Pipeline-using-Scala-Hive-AWS_Athena_JDBC-Driver
An automated ETL data pipeline that extracts complex JSON data from a web API service (GBFS BIXI data) and converts it to CSV for loading into the HDFS data warehouse. Hive then processes the data further using external and managed tables. The same procedure is also applied with AWS S3 and Athena.
ManikHossain08/IntelligentSIDC_JAVA_ADT
Student Identification Code: a Java ADT implementation with O(1) to O(n) insert, update, delete, and find operations for large-scale data
ManikHossain08/Kafka-Stream-Data-Pipeline-Near-Real-Time
Stream data into a pipeline in near real time using Kafka
ManikHossain08/AI-Face-Mask-Detector
ManikHossain08/Code-Clone-within-SO-Mixed_Model_Building
Research project: a study of clone detection within Stack Overflow code snippets. Analyzes the factors behind vote variance among similar Stack Overflow posts (Q&A) by building mixed models.
ManikHossain08/Evolution-of-the-Stack-Overflow-Over-the-Years
Research project: the evolution of Stack Overflow over the years, using R, the Stack Overflow data dump, MSSQL, and Python
ManikHossain08/ManikHossain08
Big Data Engineer at Bell Canada
ManikHossain08/Real-State-Project
Real estate project using C#/.NET (server, API-based) and AngularJS (client with HTML)
ManikHossain08/Scala-Avro-Confluent-Schema-Registry
ManikHossain08/Scala-Programming-With-SBT
Scala basics, from beginner to advanced level. The language is intuitive to use, requires less code, and avoids the verbosity of Java.
ManikHossain08/Spark-ETL-Data-Pipeline-using-SparkStreaming-HDFS-Kafka-Hive
The objective of this project is to gain coding experience with Spark, Spark SQL, Spark Streaming, Kafka, Scala, and functional programming
ManikHossain08/STM-Data-Enrichedment-With-Hadoop-Scala
STM data enrichment: Extract, Transform, Load (i.e., ETL)
ManikHossain08/Tool2-ParseHTML-Results-Of-NiCad
ManikHossain08/TwitterLytics-using-play-scala-java-sbt
A hands-on project for practicing the Play Framework and reactive programming using Java streams, Scala, and sbt
ManikHossain08/Apache-Spark-RDD-and-DataFrame-APIs
ManikHossain08/beginner-projects
Python beginner projects...
ManikHossain08/Clustering-and-Frequent-Itemsets
ManikHossain08/data-science-interviews
Data science interview questions and answers
ManikHossain08/Data-Science-ML-Full-Stack-2022
Everything you need to know for data science.
ManikHossain08/deploying-machine-learning-models
Example Repo for the Udemy Course "Deployment of Machine Learning Models"
ManikHossain08/etl-python-spark-pandas
ManikHossain08/leetcode-patterns
A curated list of leetcode questions grouped by their common patterns
ManikHossain08/pubsub-streaming-dataflow
ManikHossain08/PythonDataScienceHandbook
Python Data Science Handbook: full text in Jupyter Notebooks
ManikHossain08/Recommendation-Systems-with-Apache-Spark
ManikHossain08/Similarity-Search-Locality-Sensitive-Hashing-
ManikHossain08/spark-bigquery-connector
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
ManikHossain08/training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
ManikHossain08/video-game-training-sql
Hey this is the repo that has all the queries and data for my video game training series!