Vincent Grégoire, University of Melbourne
This repository contains material for the 2018 Python for financial research workshop for honours and Ph.D. students at the University of Melbourne.
The workshop is divided into four blocks of three hours each:
1. Introduction to Python programming
We will discuss what Python is and you will learn the basic structure of the language. You will also learn your way around the programming environment, including the two main editors for scientific Python: Spyder and Jupyter. You will learn how to import and explore data using pandas by generating summary statistics and plotting graphs using matplotlib.
2. Introduction to data analysis using pandas and matplotlib
You will learn how to import, export and transform data using pandas, the panel data package for Python. You will also see how to do basic portfolio analysis while replicating a classic paper.
Recommended reading: De Bondt, W.F.M. and Thaler, R., 1985. Does the stock market overreact? The Journal of Finance, 40(3), pp.793-805.
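To give a flavour of portfolio formation in pandas, here is a minimal sketch on synthetic data. It is not the procedure from the paper or the workshop notebook: the tickers, the 12-month formation and holding windows, and the three-stock portfolios are all assumptions chosen to keep the example short.

```python
import numpy as np
import pandas as pd

# Synthetic monthly returns for 10 hypothetical stocks (not real data).
rng = np.random.default_rng(0)
months = pd.period_range("2000-01", periods=24, freq="M")
returns = pd.DataFrame(
    rng.normal(0.01, 0.05, size=(24, 10)),
    index=months,
    columns=[f"STOCK{i}" for i in range(10)],
)

# Formation period: cumulative return of each stock over the first 12 months.
formation = (1 + returns.iloc[:12]).prod() - 1

# Sort on past performance; the bottom 3 are "losers", the top 3 "winners".
ranked = formation.sort_values()
losers = ranked.index[:3]
winners = ranked.index[-3:]

# Holding period: equal-weighted monthly portfolio returns over the next 12 months.
holding = returns.iloc[12:]
loser_ret = holding[losers].mean(axis=1)
winner_ret = holding[winners].mean(axis=1)

# Average monthly loser-minus-winner spread over the holding period.
spread = (loser_ret - winner_ret).mean()
print(f"Average loser-minus-winner return: {spread:.4f}")
```

With random data the spread carries no meaning; the point is only the sort-then-average pattern.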
3. More data analysis using pandas and statsmodels
You will learn more advanced features of Python and pandas, including dealing with timestamps and estimating measures from daily and intraday data. You will also learn how to estimate OLS and panel regressions using statsmodels.
Recommended reading: Petersen, Mitchell A., 2009. Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches. Review of Financial Studies. 22, pp.435-480.
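As a rough illustration of what estimating a regression with statsmodels looks like, the sketch below fits the same OLS model twice on a synthetic firm-year panel, once with default standard errors and once with standard errors clustered by firm. The data-generating process, the column names and the panel dimensions are assumptions made up for this example and are not taken from the workshop material.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 50 hypothetical firms observed over 10 years.
rng = np.random.default_rng(42)
n_firms, n_years = 50, 10
df = pd.DataFrame({
    "firm": np.repeat(np.arange(n_firms), n_years),
    "year": np.tile(np.arange(n_years), n_firms),
})

# Firm-level components in both the regressor and the residual, so that
# observations within a firm are correlated.
firm_ids = df["firm"].to_numpy()
x_firm = rng.normal(size=n_firms)[firm_ids]
e_firm = rng.normal(size=n_firms)[firm_ids]
df["x"] = x_firm + rng.normal(size=len(df))
df["y"] = 1.0 + 0.5 * df["x"] + e_firm + rng.normal(size=len(df))

# Same model, two covariance estimators.
ols = smf.ols("y ~ x", data=df).fit()
clustered = smf.ols("y ~ x", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["firm"]}
)

# Point estimates are identical; only the standard errors differ.
print(ols.params["x"], ols.bse["x"])
print(clustered.params["x"], clustered.bse["x"])
```

Only the `cov_type` argument changes between the two fits, which makes it easy to compare standard-error approaches on the same specification.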
4. Other topics
In this block, you will be introduced briefly to other Python packages that can be helpful for research. We will look at an example of web scraping with textual analysis.
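The textual-analysis step can be sketched with the standard library alone. The example below does not download anything: it parses a made-up HTML snippet and scores it with two tiny word lists. The snippet, the word lists and the `sentiment_score` function are all hypothetical and only stand in for a real scraper and a real sentiment dictionary.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# Made-up page content standing in for a scraped press release.
SAMPLE_HTML = (
    "<html><body><h1>Sample release</h1>"
    "<p>The firm reported strong growth despite a weak quarter.</p>"
    "</body></html>"
)

# Hypothetical word lists; real work would use a published dictionary.
POSITIVE = {"strong", "growth", "gain"}
NEGATIVE = {"weak", "loss", "fraud"}


def sentiment_score(html):
    """Return (positive - negative) / (positive + negative) word counts."""
    parser = TextExtractor()
    parser.feed(html)
    words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


score = sentiment_score(SAMPLE_HTML)
print(f"Sentiment score: {score:.2f}")
```

The score ranges from -1 (only negative words) to 1 (only positive words), and is 0 when no listed word appears.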
I recommend the Anaconda distribution, which is available for Windows, Mac OS and Linux. We are using the Python 3.6 version for the workshop.
Note: this code is for illustrative purposes and does not necessarily show the correct or best way to do something. The main goal is to illustrate the Python language, its libraries, and some common use cases in research.
Block 1:
- PythonIntro.py: Introduction to the Python language.
- Kickstarter exploration: Introduction to data exploration using a Kickstarter dataset.
Block 2:
- DeBondt and Thaler 1985: Partial replication of a paper with portfolio formation.
Block 3:
- Introduction to empirical market microstructure in Python: Intro to using pandas with intraday data.
- Estimating standard errors in Python: Using statsmodels and pandas for panel regressions.
Block 4:
- SEC Press Releases: Scrape data from a website and apply textual (sentiment) analysis.