PyData Berlin 2016 Materials
Keynotes
Olivier Grisel, Predictive Modelling with Python
Julia Evans, How to trick a neural network
Wes McKinney, Python Data Ecosystem: Thoughts on Building for the Future
Regular
Daniel Kirsch, Functional Programming in Python
Trent McConaghy, BigchainDB: a Scalable Blockchain Database, in Python
David Higgins, Introduction to Julia for Python programmers
Katharina Rasch, What every Data Scientist should know about data anonymization
Alexander Sibiryakov, Frontera: open source, large scale web crawling framework
Thomas Reineking, Plumbing in Python: Pipelines for Data Science Applications
- Yamal: not yet open-sourced
Ryan Henderson, image-match: a python library for searching for similar images in large corpora
Ian Ozsvald, Statistically Solving Sneezes and Sniffles (a work in progress)
- https://speakerdeck.com/ianozsvald/statistically-solving-sniffles-step-by-step-a-work-in-progress
- http://ianozsvald.com/2016/05/07/statistically-solving-sneezes-and-sniffles-a-work-in-progress-report-at-pydatalondon-2016/
Felix Biessmann, Predicting Political Views From Text
Jie Bao, ExpAn - A Python Library for A/B Testing Analysis
- https://github.com/zalando/expan
- http://www.slideshare.net/JieBao3/expan-presentation-pydata-berlin-2016
Anne Matthies, Zero-Administration Data Pipelines using AWS Simple Workflow
Daniel Moisset, Bridging the gap: from Data Science to service
Katharine Jarmul, Holy D@t*! How to Deal with Imperfect, Unclean Datasets
Nora Neumann, Usable A/B testing – A Bayesian approach
Frank Kaufer, Building Polyglot Data Science Platform on Big Data Systems
Lukasz Czarnecki, Brand recognition in real-life photos using deep learning
Edouard Fouché, Accelerating Python Analytics by In-Database Processing
Delia Rusu, Estimating stock price correlations using Wikipedia
- https://speakerdeck.com/deliarusu/estimating-stock-price-correlations-using-wikipedia
- https://github.com/deliarusu/wikipedia-correlation
Jakob van Santen, The IceCube data pipeline from the South Pole to publication
Moritz Neeb, Bayesian Optimization and its application to Neural Networks
Kashif Rasul, What's new in Deep Learning?
Nathan Epstein, Machine Learning at Scale
Ronert Obst and Dat Tran, PySpark in Practice
Jose Quesada, A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons
Martina Pugliese, Spotting trends and tailoring recommendations: PySpark on Big Data in fashion
Angelos Kapsimanis, The Simple Leads To The Spectacular (Cancelled)
Anton Dubrau, Using small data in the client instead of big data in the cloud
- did not respond, yet
Nils Magnus, Dealing with TBytes of Data in Realtime
- did not respond, yet
Abhishek Thakur, Classifying Search Queries without User Click Data
- did not respond, yet
Jessica Palmer, Python and TouchDesigner for Interactive Experiments
- did not respond, yet
Maciej Gryka, Removing Soft Shadows with Hard Data
- did not respond, yet
Andreas Lattner, Setting up predictive analytics services with Palladium
- did not respond, yet
Andrej Warkentin, Visualizing FragDenStaat.de
- did not respond, yet
James Powell, The kwarg problem
- did not respond, yet
Matthew Honnibal, Designing spaCy: A high-performance natural language processing (NLP) library written in Cython
- did not respond, yet
Valentine Gogichashvili, Data Integration in the World of Microservices
- did not respond, yet
Michelle Tran, Chain, Loop & Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline
- did not respond, yet
Guertel Idai, Artificial Body Representation in Robots, Expectation and Surprise
- did not respond, yet
Robert Meyer, pypet: A Python Toolkit for Simulations and Numerical Experiments
- did not respond, yet
Juha Suomalainen, Visualizing research data: Challenges of combining different datasources
- did not respond, yet
Danny Bickson, Python based predictive analytics with GraphLab Create
- did not respond, yet
Fang Xu, Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata
- did not respond, yet
Dr. Markus Abel, Python Learns to Control Complex Systems
- did not respond, yet
Tutorials
Frank Gerhardt, Using Spark - with PySpark
Mike Müller, Single-source Python 2/3
Katharine Jarmul, Data Wrangling with Python
Lev Konstantinovskiy, Practical Word2vec in Gensim
Shoaib Burq, Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics
Lightning Talks
Oliver Zeigermann
Piotr Migdał, Teaching machine learning
- https://speakerdeck.com/pmigdal/teaching-machine-learning
- http://p.migdal.pl/2016/03/15/data-science-intro-for-math-phys-background.html
Mentioned tools:
- Pybuilder: Tired of writing setup.py? http://pybuilder.github.io/
- Sputnik: Package manager for Data https://github.com/spacy-io/sputnik