In this post we'll fetch crypto market pricing data for analysis and forecasting. Unlike the previous post on crypto data mining with R, we'll use an API instead of web scraping. We'll use R both to fetch and to analyze the data. R data can be exported to almost all major data formats, including those of Excel, SPSS, and SAS, so collecting data this way is useful even if you use other software or another language for analysis.
For illustration, we'll collect historical (daily) pricing of BTC from CryptoCompare (CC from now on) using their API. You can collect any other crypto data and do many more things. See CC help for the available options. There are good reasons for using CC's data, including:
- CC's APIs are free to use under a Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0) license.
- Some great services use CC's pricing API, including EtherChain, EtherScan, Exodus, DAppWallet, nanopool, GasTracker, https://explorer.zcha.in/, https://moon.cryptothis.com/, Ethereum Stats App, and Ethereum Classic Stats.
So we get reliable data for free, as in freedom. But please do not abuse the service: one request every 10 seconds should be more than enough. Please also make sure you credit CC with a link if you use their data on your website or app.
If you haven't already, install R and RStudio, a great open-source IDE for R developed by RStudio, Inc. The RStudio website has a lot of learning materials to get you started and take you further. Now let's get to the actual work. First load the required packages: jsonlite, for fetching JSON data via CC's API; forecast, for time-series forecasting; and ggplot2, for visualizing data:
R> library(jsonlite)
R> library(ggplot2)
R> library(forecast)
Let's do an API request:
R> cc <- fromJSON("https://min-api.cryptocompare.com/")
We'd like to know what our request fetched from CC, i.e., what data the variable cc contains. The first thing we can do is run str(variable_name). str() is a very useful function for examining the data structure of a variable. To learn more about the function, issue ?str in the R console or see its online documentation.
R> str(cc)
List of 3
$ Called : chr "/"
$ Message : chr "Min API Options, works with all symbols, for more options see https://www.cryptocompare.com/api/. If you are requesting signed "| __truncated__
$ AvailableCalls:List of 1
..$ Price:List of 13
... ... output truncated
str() returns a long output for complex data structures. Here we have a list, which can be a confusing data structure, especially for beginners. See more about accessing elements of a list in this excellent stackoverflow post.
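As a rough illustration of list access, here is a minimal sketch using a small hand-made list that only imitates the shape of the API response. The object name resp and its values are made up for this example and do not come from CC.

```r
# A small hand-made list shaped like part of the API response (values are made up)
resp <- list(
  Called = "/",
  AvailableCalls = list(
    Price = list(
      HistoDay = list(Info = list(Examples = c("example-url-1", "example-url-2")))
    )
  )
)

resp$Called                       # `$` extracts an element by name
resp[["Called"]]                  # `[[ ]]` does the same and also accepts a variable
names(resp$AvailableCalls$Price)  # names available one level further down

key <- "HistoDay"                 # `[[ ]]` is handy when the name is stored in a variable
second_example <- resp$AvailableCalls$Price[[key]]$Info$Examples[2]
second_example

class(resp["Called"])             # single `[ ]` keeps the list wrapper
```

Chaining `$` works well at the console, while `[[ ]]` tends to be the safer choice inside functions because it takes names held in variables.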
Studying the output of str(), I navigate to cc$AvailableCalls$Price$HistoDay$Info$Examples, where I can see API request examples.
R> cc$AvailableCalls$Price$HistoDay$Info$Examples
[1] "https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&limit=30&aggregate=3&e=CCCAGG"
[2] "https://min-api.cryptocompare.com/data/histoday?fsym=ETH&tsym=USD&limit=30&aggregate=3&e=Kraken&extraParams=your_app_name"
... ... output truncated
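The example URLs are just a base address followed by name=value pairs (fsym, tsym, limit, and so on). If you expect to change those parameters often, one possible approach is to assemble the URL from a named list. The helper build_histoday_url() below is hypothetical, written in base R for this post, and is not part of jsonlite or of CC's API.

```r
# Hypothetical helper: turns a named list of parameters into a histoday URL
build_histoday_url <- function(params,
                               base = "https://min-api.cryptocompare.com/data/histoday") {
  # Percent-encode each value so special characters survive in the query string
  values <- vapply(params,
                   function(v) utils::URLencode(as.character(v), reserved = TRUE),
                   character(1))
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base, "?", query)
}

url <- build_histoday_url(list(fsym = "BTC", tsym = "USD", limit = 30,
                               aggregate = 3, e = "CCCAGG"))
url
```

The resulting string can be passed to fromJSON() just like a URL typed by hand.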
Collect BTC historical (daily) data:
R> cc_histoday_btc <- fromJSON("https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&allData=true&e=CCCAGG")
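If you later want several symbols rather than just BTC, a loop that pauses between requests is one way to stay within the 10-second guideline mentioned earlier. The sketch below is only an assumption about how you might organize it: fetch_politely() is a hypothetical helper, and the demonstration uses a stand-in fetcher so that it runs offline. In real use you would pass jsonlite's fromJSON as the fetch argument and keep min_gap at 10.

```r
# Hypothetical helper: calls `fetch` on each URL, pausing `min_gap` seconds
# between consecutive requests
fetch_politely <- function(urls, fetch, min_gap = 10) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(min_gap)  # wait before every request after the first
    results[[i]] <- fetch(urls[[i]])
  }
  names(results) <- urls
  results
}

# Offline demonstration: a stand-in fetcher and no waiting
fake_fetch <- function(url) list(url = url, ok = TRUE)
demo <- fetch_politely(c("https://example.com/a", "https://example.com/b"),
                       fetch = fake_fetch, min_gap = 0)
length(demo)
```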
Run str(cc_histoday_btc) to display the fetched data's structure. We now create a time series object from the fetched data and store it in the variable btc_ts. You can choose whatever name you please, within R's naming rules. Run ?ts to learn more about time series objects.
R> btc_ts <- ts(cc_histoday_btc$Data$close, start = cc_histoday_btc$Data$time[1])
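The time values used as the start above appear to be Unix timestamps (seconds since 1970-01-01 UTC), so the series index begins at that number and increases by 1 per observation rather than showing calendar dates. If you want readable dates alongside the series, a conversion along these lines should work. The timestamps below are made up for illustration and are not taken from CC's response.

```r
# Example Unix timestamps, one day (86400 seconds) apart; made up for illustration
unix_time <- c(1500000000, 1500086400, 1500172800)

# Convert seconds since the epoch to date-times, then keep only the date part
dates <- as.Date(as.POSIXct(unix_time, origin = "1970-01-01", tz = "UTC"))
dates

# Gap between consecutive observations, in days
gap_days <- diff(unix_time) / 86400
gap_days
```

Applied to the fetched data, the same call on cc_histoday_btc$Data$time would give one date per row, which can be handy for labelling plots.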
Now we fit an ARIMA model using auto.arima() and forecast BTC's price for the next 50 days.
R> fit_arima <- auto.arima(btc_ts)
R> autoplot(forecast(fit_arima, 50))
Looks like BTC price is only going up! Let's see how accurate our model is with the accuracy(fit_arima) command:
R> accuracy(fit_arima)
                    ME     RMSE      MAE        MPE     MAPE      MASE        ACF1
Training set 0.5483334 27.41288 9.607521 0.02666922 3.533025 0.9876943 0.002047275
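To make the column names less opaque, here is a sketch that computes most of these measures by hand in base R. The numbers are made up for illustration, and the formulas follow my understanding of the usual definitions for a non-seasonal series (the forecast package's documentation is the authority). ACF1, not computed here, is the lag-1 autocorrelation of the errors.

```r
# Made-up observed values and fitted values, for illustration only
actual      <- c(100, 110, 105, 120)
fitted_vals <- c(98, 112, 104, 115)
err <- actual - fitted_vals

me   <- mean(err)                      # ME: mean error (bias)
rmse <- sqrt(mean(err^2))              # RMSE: root mean squared error
mae  <- mean(abs(err))                 # MAE: mean absolute error
mpe  <- mean(100 * err / actual)       # MPE: mean percentage error
mape <- mean(abs(100 * err / actual))  # MAPE: mean absolute percentage error

# MASE: MAE scaled by the in-sample MAE of a naive "tomorrow equals today" forecast
mase <- mae / mean(abs(diff(actual)))

round(c(ME = me, RMSE = rmse, MAE = mae, MPE = mpe, MAPE = mape, MASE = mase), 3)
```

Under this definition, a MASE below 1 means the fitted values beat the naive forecast on the same data, and a value near 1 means they do about as well.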
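The error measures do not look too large: the MAPE, for example, is about 3.5%. Keep in mind that these are training-set figures, so they describe how well the model fits past data rather than how well it will forecast, and a MASE close to 1 suggests the fit is roughly on par with a naive forecast. Still, without complicated parameters or code, we produced some neat and useful results. We'll learn more soon!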