AnilSener

Pinned Repositories

100days
100 days of algorithms
Language:Jupyter Notebook0 1 00
2018-MachineLearning-Lectures-ESA
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Language:Jupyter Notebook0 1 00
Air_Tranportation_Statistics_Data_Inteview_Case_Study
Analysis of Air Transportation Statistics Data: case study solutions for a Lead Data Engineering position
Language:Jupyter Notebook1 2 00
Axa-Insurance-Telematics-Kaggle
I developed this case study in only 7 days with PySpark (Spark 1.6.0) SQL and MLlib, using a Databricks cluster and AWS. Random Forest achieved 90% AUC without the Trip Matching (Repeated Trips) feature. Ensembles of RF, GBT and Logistic Regression, together with outlier elimination, could be used to improve this result. There are two versions of my code: test execution and full execution. Because AWS costs exceeded my budget, I stopped training my model(s) on the full dataset. There is also a ppt that presents my outputs from the test execution. The full-data execution code is a slightly different, more production-ready version. In that version I had to apply Databricks table caching to the TRAIN and TEST data tables to obtain acceptable performance.
Language:Jupyter Notebook16 3 110
IEDATACHALLANGE-DJANGO-ANALYTICS-PROJECT
IE 2nd term project prototype application based on Telefonica Mobility and BBVA Credit Card Payments data. Provider data is strictly confidential, but you can use the code for any purpose you desire. MVC stack framework using Python Django. API integrations with Expedia and the Twitter Streaming API. Substantial work on TripAdvisor web scraping. NLP (NLTK) for topic-based sentiment analysis (TripAdvisor reviews), time series forecasting, recommendation engine, Leaflet data visualization, NetworkX SNA (Python and JS). BBVA data is left out because it lacks data integrity and the necessary categories. I hope this work can be helpful to practitioners of the Django framework and analytics. This application was developed in a very short time with Agile methodology, so problems and inconsistencies in code quality are to be expected. For example, we tried to use mongoengine and Django framework document models as a common data source, but we ran into difficulties from time to time because accurate documentation was hard to find on the web. Wherever we resolved these issues, we followed the correct coding practice. Please follow the model usage practice shown in the last view in views.py to comply with MVC; do not use pymongo directly. Mongoengine provides features such as DB connection pooling that facilitate a scalable architecture.
Language:JavaScript1 2 04
mortgagebalanceforecastingengine
Mortgage Balance Forecasting Engine Pyspark (Spark 1.3.0), Django, SimPy, Python
Language:Python20
python-NLTK-exercise---sentiwordnet-scoring
python-NLP-Simple Sentiment Analysis
Language:Python6 2 110
semiGridSearchCV
Scikit-learn compliant Semi-supervised learning Grid Search with Cross Validation
Language:Python1 1 00
semiKmeans
scikit-learn compliant Semi-Supervised Kmeans (seeded Kmeans) with probability estimates
Language:Python5 1 11
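The idea named in the semiKmeans description can be sketched as follows. This is a minimal, pure-Python illustration of seeded k-means in general, not the code from the repository: the function names (`seeded_kmeans`, `predict_proba`) are hypothetical, and the probability estimate (a softmax over negative squared distances to the centroids) is an assumption chosen for illustration. Labelled seed points only initialise the centroids; after that, ordinary k-means iterations run over all points.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(points, seeds, n_iter=50):
    """Seeded k-means sketch.

    points: list of tuples (should include the seed points themselves).
    seeds:  dict mapping a cluster label to a list of labelled seed points.
    Returns (labels, centroids), with centroids[i] belonging to labels[i].
    """
    labels = sorted(seeds)
    # Initialise each centroid as the mean of its labelled seeds.
    centroids = [mean(seeds[label]) for label in labels]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Recompute centroids; keep the old one if a cluster is empty.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            updated.append(mean(members) if members else old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def predict_proba(point, centroids):
    """Assumed soft assignment: softmax over negative squared distances."""
    dists = [sq_dist(point, c) for c in centroids]
    d_min = min(dists)  # shift for numerical stability
    weights = [math.exp(-(d - d_min)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Toy data: two well-separated groups, one labelled seed per cluster.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = {"low": [(0.0, 0.0)], "high": [(10.0, 10.0)]}
labels, centroids = seeded_kmeans(points, seeds)
probs = predict_proba((0.0, 0.0), centroids)
```

Because the seeds fix which centroid corresponds to which label, the resulting clusters can be read as class predictions, which is what makes the approach semi-supervised.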
tivi
Currently under development! Tivi is a PySpark Streaming and Django project to build a recommender system for TV channel audiences. Entity resolution and two recommendation engine algorithms will be used, with a drifting principle based on comparison against the training set average threshold during validation. (Content-based/collaborative filtering based on show genres/topics, and Alternating Least Squares collaborative filtering.)
Language:JavaScript1 4 00

AnilSener's Repositories

AnilSener/2018-MachineLearning-Lectures-ESA
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Language:Jupyter Notebook0 1 00
AnilSener/amazon-emr-management-guide
The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this repo or by making proposed changes & submitting a pull request.
1 0
AnilSener/Apache-Spark-Deep-Learning-Cookbook
Apache Spark Deep Learning Cookbook, published by Packt
Language:HTML1 0
AnilSener/awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
Language:Python2 0
AnilSener/aws-devops-essential
In a few hours, quickly learn how to effectively leverage various AWS services to improve developer productivity and reduce the overall time to market for new product capabilities.
1 0
AnilSener/bayeslite
BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself.
Language:Python
AnilSener/boto3
AWS SDK for Python
Language:Python2 0
AnilSener/brick-tutorial-buildsys2017
Language:Jupyter Notebook2 01
AnilSener/Data_Structures_Algorithms_In_Python
My implementation of 80+ popular data structures and algorithms and interview questions in Python 3
Language:Python1 0
AnilSener/DIVE-backend
Codebase for DIVE backend (server, worker, and ORM)
Language:Python2 0
AnilSener/DIVE-frontend
Codebase for DIVE SPA using React and Redux
Language:JavaScript1 0
AnilSener/GPUEnabler
Provides GPU awareness to Spark, Contact: @kmadhugit and @kiszk
Language:Scala2 0
AnilSener/jupyterlab-hub
JupyterLab extension for running JupyterLab with JupyterHub
Language:TypeScript1 0
AnilSener/kepler.gl
Language:JavaScript2 0
AnilSener/kernel_gateway
Jupyter Kernel Gateway
Language:Python1 0
AnilSener/kinesis-sql
Kinesis Connector for Structured Streaming
Language:Scala1 0
AnilSener/machine_learning_examples
A collection of machine learning examples and tutorials.
Language:Python2 0
AnilSener/mleap
MLeap: Deploy Spark Pipelines to Production
Language:Scala2 0
AnilSener/mlflow
Open source platform for the machine learning lifecycle
Language:Python1 0
AnilSener/Optimus
:truck: Agile Data Science Workflows made easy with Python and Spark.
Language:Python2 0
AnilSener/pyspark_dist_explore
Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames.
Language:Python2 0
AnilSener/Python
Python code for YouTube videos.
Language:Jupyter Notebook2 0
AnilSener/python-sortedcontainers
Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set
Language:Python1 0
AnilSener/QUALIFIER
Quality control for gated flow cytometry data
Language:R1 0
AnilSener/sagemaker-spark
A Spark library for Amazon SageMaker.
Language:Scala1 0
AnilSener/scio
A Scala API for Apache Beam and Google Cloud Dataflow.
Language:Scala2 0
AnilSener/spark-notes
2 0
AnilSener/SparkInternals
Notes talking about the design and implementation of Apache Spark
2 0
AnilSener/training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
Language:Jupyter Notebook2 0
AnilSener/VBYO2018
Veri Bilimi Yaz Okulu
Language:Jupyter Notebook1 0