This project explores and analyzes two data sets, `bitstamp.csv` and `coinbase.csv`, which contain the information listed below for the cryptocurrency Bitcoin (BTC) on the exchanges Bitstamp and Coinbase from January 2018 to March 2018. The project consists of two steps: first, cleaning the data so that it can be analyzed; then, analyzing the clean data across different time ranges. The analysis includes creating plots, obtaining summary statistics, and calculating the potential spreads from an arbitrage operation between the two exchanges.
The dataset includes the following:
- Timestamp (the data set contains one record for each minute in our time range)
- Opening Price "Open"
- High Price "High"
- Low Price "Low"
- Close Price "Close"
- Volume Expressed in BTC "BTC Volume"
- Volume Expressed in USD "USD Volume"
- Weighted Price
The following technologies were used to build and deploy this application:
- Python - Version 3.9.7
- Anaconda (Which includes Jupyter Lab and Pandas)
- Path (from pathlib)
- matplotlib
To install Python 3.9.7, you can find the installation files for both Windows and macOS at the following link
If you need help installing it, you can follow these videos for guidance
Navigate to the location where you would like to save the files for this application. You can do this by using the `cd` command followed by a space and the file path inside quotation marks ("file path"). In my example I have gone to Desktop.
Clone this project's repository from GitHub using the following command:
`git clone https://github.com/epocaterrasus/CU-Assignment3-Crypto-Arbitrage.git`
From inside the folder created when you cloned the repository, type `jupyter lab`. This should open Jupyter Lab in a window of your default browser. In the file browser on the left you can see the files inside the repository; open `crypto_arbitrage.ipynb`, which contains all the steps and notes followed to analyze this pair of datasets.
Before we can analyze the data and draw conclusions, we must first import and clean it. The following bullets summarize what was done:
- Importing both datasets using the pandas `read_csv` function, setting the index column to "Timestamp" and having it parse dates and infer the date/time format
- Dropping NaN/missing values using the `dropna()` function
- Removing "$" signs using `str.replace` (this step is very important, as it allows us to convert the data from string to float, which is key to doing quantitative analysis on it)
- Converting the data to the "float" data type using the `astype()` function
- Checking for duplicates using the `duplicated()` function
- Selecting the relevant column for our analysis, in this case "Close", using the `.loc` indexer
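The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the three-row inline sample, the `clean_exchange_data` helper and the assumption that only the "Close" column carries "$" signs are all hypothetical stand-ins for the real CSV files. The date/time format inference option mentioned above is left out here because newer pandas versions deprecate it.

```python
import io

import pandas as pd

# Hypothetical sample standing in for bitstamp.csv. The real file is assumed
# to share these column names, with "$" signs in the Close column.
sample_csv = io.StringIO(
    "Timestamp,Open,High,Low,Close,BTC Volume,USD Volume,Weighted Price\n"
    "2018-01-01 00:00:00,13681.04,13681.04,13637.93,$13646.48,3.33,45482.13,13639.65\n"
    "2018-01-01 00:01:00,13646.48,13658.75,13610.18,$13658.75,2.66,36361.39,13653.33\n"
    "2018-01-01 00:02:00,,,,,,,\n"
)


def clean_exchange_data(source):
    """Hypothetical helper: load one exchange file and clean it."""
    # Use the Timestamp column as the index and parse it into datetimes.
    df = pd.read_csv(source, index_col="Timestamp", parse_dates=True)
    # Drop rows with missing values; copy so later assignments are safe.
    df = df.dropna().copy()
    # Strip the literal "$" sign, then convert the strings to floats.
    df["Close"] = df["Close"].str.replace("$", "", regex=False).astype(float)
    return df


bitstamp = clean_exchange_data(sample_csv)

# Count fully duplicated rows so they can be reviewed.
duplicate_count = int(bitstamp.duplicated().sum())

# Keep only the column used in the analysis.
bitstamp_close = bitstamp.loc[:, "Close"]
print(bitstamp_close)
print("duplicated rows:", duplicate_count)
```

With the real data, the same helper would be called once per exchange, passing a `Path` to each CSV file instead of the inline sample.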
After getting our data squeaky clean and formatted, we can go ahead and start our analysis. The following bullets summarize what was done:
- Generating summary statistics of both datasets using the `describe()` function
- Creating line plots for both datasets using `plot()` and customizing them for size, title and line color
- Selecting specific months and dates within the dataset to focus the scope of our analysis
- Repeating the previous steps of generating summary statistics, line plots and adding boxplots to our specific date dataframes
- Calculating potential spreads between both exchanges
- Conditionally filtering spreads for which the return is (>0)
- Further focusing our conditional filters to gather datapoints in which the % spread return is (>1%) to account for potential trading fees
- Calculating potential profits and creating summary statistics and plots to allow us to better understand our data
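The spread and profit steps above can be sketched as shown below. This is only an illustration under stated assumptions: the prices are made-up values rather than real market data, the trade direction (buy on Bitstamp, sell on Coinbase) is assumed from the conclusion that Coinbase prices tend to be higher, and profit is assumed to be computed per 1 BTC traded. The variable names are hypothetical and may differ from those in the notebook.

```python
import pandas as pd

# Made-up closing prices for four consecutive minutes (not real market data).
index = pd.date_range("2018-01-16 00:00", periods=4, freq="min")
bitstamp_close = pd.Series([11000.0, 11050.0, 11100.0, 11200.0], index=index)
coinbase_close = pd.Series([11200.0, 11060.0, 11090.0, 11350.0], index=index)

# Select a single date with partial string indexing on the datetime index.
day = "2018-01-16"
bitstamp_day = bitstamp_close.loc[day]
coinbase_day = coinbase_close.loc[day]

# Spread: the higher-priced exchange minus the lower-priced one
# (assumed here: sell on Coinbase, buy on Bitstamp).
spread = coinbase_day - bitstamp_day

# Keep only the minutes in which the spread is positive.
positive_spread = spread[spread > 0]

# Return relative to the purchase price on the cheaper exchange.
spread_return = positive_spread / bitstamp_day.loc[positive_spread.index]

# Keep only returns above 1% to leave room for potential trading fees.
profitable_return = spread_return[spread_return > 0.01]

# Assumed profit per 1 BTC traded: return multiplied by the purchase price.
profit_per_btc = profitable_return * bitstamp_day.loc[profitable_return.index]

print(profit_per_btc.describe())
print("total potential profit:", profit_per_btc.sum())
```

In this sample, only the first and last minutes clear the 1% threshold; the second minute has a positive spread that is too small to cover fees, and the third has a negative spread.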
From our analysis we can conclude that Bitcoin prices on Coinbase tended to be higher than prices on Bitstamp during the period from January 2018 to March 2018. Diving deeper into the data, we notice that opportunities for arbitrage between the two exchanges are most prevalent in January, decline steadily in February and reach their lows in March. We can assume that as Bitcoin trading volume increased, paired with a growing number of exchanges/sellers, opportunities for arbitrage decreased because any price discrepancies in the market were quickly closed by traders (sometimes algorithms designed to "scalp").
Summary Statistics of Instances in which spread >1% for three randomly selected dates for the months of (January, February, March 2018)
Plot of Closing Prices for Coinbase and Bitstamp (January-March 2018)
Edgar Pocaterra - epocaterra@protonmail.ch / +1 806 283 5455
MIT License
Copyright (c) 2022 epocaterrasus
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.