alpacahq/alpaca-backtrader-api

timeframe=bt.TimeFrame.Minutes and sleep 3 seconds

x777 opened this issue · 64 comments

x777 commented

New environment, pip freeze:

alpaca-backtrader-api==0.10.1
alpaca-trade-api==0.50.1
backtrader==1.9.76.123
certifi==2020.6.20
chardet==3.0.4
cycler==0.10.0
idna==2.10
kiwisolver==1.2.0
matplotlib==3.3.2
numpy==1.19.2
pandas==1.1.3
Pillow==7.2.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
requests==2.24.0
six==1.15.0
toolz==0.11.1
trading-calendars==1.11.11
urllib3==1.25.10
websocket-client==0.57.0
websockets==8.1

I am trying to use this sample code https://github.com/alpacahq/alpaca-backtrader-api/blob/master/sample/strategy_sma_crossover.py, but with one difference: instead of bt.TimeFrame.Days I am using bt.TimeFrame.Minutes:

data0 = DataFactory(dataname=symbol,
                    historical=False,
                    timeframe=bt.TimeFrame.Minutes,
                    backfill_start=True,
                    )

And it seems to log this forever:

Starting Portfolio Value: 99920.71
***** DATA NOTIF: DELAYED
2020-10-08 10:55:41,651 connected to: wss://paper-api.alpaca.markets/stream
2020-10-08 10:55:42,294 connected to: wss://data.alpaca.markets/stream
2020-10-08 10:56:39,196 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:56:47,027 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:56:55,404 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:03,546 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:13,243 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:22,087 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:29,888 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:39,282 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:47,317 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:57:55,757 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:04,196 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:04,489 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:14,343 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:22,433 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:31,479 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...
2020-10-08 10:58:39,858 sleep 3 seconds and retrying https://paper-api.alpaca.markets/v2/account 3 more time(s)...

What am I doing wrong?

x777 commented

I am sorry, I just hadn't read all the code of the example. It looks like everything is fine.
Anyway, it would be great if the creators added some samples with real-time intraday timeframes. It would be useful for new users.

x777 commented

Just one question: how do I use resampledata() in live mode? Because I see that we already use compression in DataFactory.

x777 commented

Also, how can I control the backfill_start range? It looks like too much data is loaded on the first run.

x777 commented

#41
From the topic above, I understood that the latest version of the library changed something in real-time mode (maybe I am wrong).
Let me explain what I mean. In previous versions I used the Polygon real-time data feed with the cerebro.resampledata() function, and the first resampled data feed from that call acted as the time range for the next() function (time-driven, not event-driven), right? And it worked for me.

Now I am trying to use paper live trading and I am a little confused. I understood from the topic above that next() in paper live mode is event-driven. When I use cerebro.resampledata() I get an error like this:
Error while connecting to wss://data.alpaca.markets/stream:your connection is rejected while another connection is open under the same account
and as I understand it, that is normal.

So it looks like I need to change my thinking about how live mode works with the Alpaca API (not Polygon).
But I can't understand how to control next() with the time range of my resampled list of datas. I hope my question is understandable to those who know.

Hi,
you have a mix of a few things here:

  1. your connection is rejected while another connection is open under the same account - this is caused by running more than one algorithm at once. You can only execute one strategy per account. You could use this project to overcome that: https://github.com/shlomikushchi/alpaca-proxy-agent
  2. as you read in #41, the next() method is event-driven. You could check the time inside next() and decide whether you want to do something at that time; if not, just return and wait for the next call.
  3. backfill takes time if you request a lot of data. You can control it with fromdate in DataFactory.
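
To make point 3 concrete, here is a minimal sketch of bounding the backfill window with fromdate. The `backfill_kwargs` helper is invented for illustration; only the parameter names used in this thread's DataFactory calls (dataname, historical, backfill_start, fromdate) are assumed:

```python
from datetime import datetime, timedelta

# Hypothetical helper: build DataFactory kwargs with a bounded backfill
# window. Pass the result as DataFactory(**backfill_kwargs('SPY'),
# timeframe=bt.TimeFrame.Minutes) in a real script.
def backfill_kwargs(symbol, minutes=200):
    return dict(
        dataname=symbol,
        historical=False,
        backfill_start=True,
        # backfill only the last `minutes` minutes instead of everything
        fromdate=datetime.utcnow() - timedelta(minutes=minutes),
    )

kwargs = backfill_kwargs('SPY', minutes=200)
print(kwargs['fromdate'] <= datetime.utcnow())  # True
```

The smaller the window, the fewer historical bars are requested and the faster the DELAYED phase finishes.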
x777 commented

your connection is rejected while another connection is open under the same account - this is caused by running more than one algorithm at once. You can only execute one strategy per account. You could use this project to overcome that: https://github.com/shlomikushchi/alpaca-proxy-agent

When I try to use resampled data I see a rejection error too. Is that expected? So in live mode there is no point in using resampling anymore; instead I need to use compression for each DataFactory and feed that data to the needed indicators, right?

x777 commented

as you read in #41, the next() method is event-driven. You could check the time inside next() and decide whether you want to do something at that time; if not, just return and wait for the next call.

Maybe you have some examples of how to divide events by timeframe?

I have the same issue when doing paper trading. While backfill is running, the next() method has the following check:

    def next(self):
        if not self.live_bars and not IS_BACKTEST:
            # only run code if we have live bars (today's bars).
            # ignore if we are backtesting
            return

However, any backfill data that comes into the next() method triggers the /account API, even if we return from next() without performing any operations. Why do we need to call the account API while the data is still backfilling?

x777 commented

I have the same issue when doing paper trading. While backfill is running, the next() method has the following check:

    def next(self):
        if not self.live_bars and not IS_BACKTEST:
            # only run code if we have live bars (today's bars).
            # ignore if we are backtesting
            return

However, any backfill data that comes into the next() method triggers the /account API, even if we return from next() without performing any operations. Why do we need to call the account API while the data is still backfilling?

Yes, it's hard to understand without good examples. All the samples are too basic to use in reality. Just my opinion.

x777 commented

Today, the same test code doesn't work at all. next() is never called. No events, no logs. Enabling/disabling the backfill_start param makes no difference. An actively traded symbol, $SPY. Magic.

@observer-user

However, any backfill data that comes into the next() method triggers the /account API, even if we return from next() without performing any operations. Why do we need to call the account API while the data is still backfilling?

we need to feed the data into the backtrader platform because we built this integration on top of backtrader.
So every piece of data fed into backtrader goes through the backtrader pipeline, no matter whether it's backfill or a live stream.

@x777

Maybe you have some examples of how to divide events by timeframe?

yes, you could do something like this:

    from datetime import datetime, timedelta

    def next(self):
        if datetime.utcnow() - self.timer < timedelta(minutes=5):
            return
        self.timer = datetime.utcnow()
        # <YOUR-CODE-HERE>

and initialize self.timer in the strategy's __init__ method
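
The snippet above can be factored into a small helper so that __init__ only creates it and next() stays tiny. This is just a sketch; MinuteThrottle is a made-up name, not part of the library:

```python
from datetime import datetime, timedelta

class MinuteThrottle:
    """Gate an action to run at most once per `minutes` minutes."""

    def __init__(self, minutes=5):
        self.interval = timedelta(minutes=minutes)
        self.timer = datetime.utcnow()  # plays the role of self.timer above

    def should_run(self, now=None):
        # Return True (and reset the timer) only once the interval elapsed.
        now = now or datetime.utcnow()
        if now - self.timer < self.interval:
            return False
        self.timer = now
        return True
```

Inside next() you would then write `if not self.throttle.should_run(): return`, mirroring the snippet above.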

Also, I think the alpaca data stream is a bit thin this morning.
I'm not getting as many messages as usual.
When switching to the polygon data stream, everything works fine.

x777 commented

Also, I think the alpaca data stream is a bit thin this morning.
I'm not getting as many messages as usual.
When switching to the polygon data stream, everything works fine.

Thank you for your code, I will try it.
But now an interesting thing is happening with the data feed. How do I debug this, or do I just wait? Is there any server status page to check?

try using a different stock; maybe its data feed will be better.
Otherwise, you need to wait

x777 commented

Ok, I hope it will work later.
Another question, about multiple datas: must I use alpaca-proxy-agent if I add more than one data feed to my algo? For example, I need a data feed with a different timeframe for indicators.

if you want more than one data feed you need the proxy project.
More than one indicator, but only one data feed - you don't need it.

x777 commented

if you want more than one data feed you need the proxy project.
More than one indicator, but only one data feed - you don't need it.

Is the compression parameter in DataFactory used as the timeframe for the indicator, or not?

The compression is used to get the backfill data from the servers, so it is used for the indicators on daily bars (1D, 5D, 10D, etc.).
Once the live data starts, it's event-driven with quote data.

x777 commented

you will still receive events to the indicators, but they will be the same events you get in the next() method.
If you want to calculate a different timeframe, yes, calculate it manually.

x777 commented

@shlomikushchi I won't (can't) use the proxy agent, but I need a different timeframe for each indicator.
Your code with timedelta is good, but how can I now calculate a different timeframe for each indicator? I still struggle to understand how to do such simple things without resampledata().

I am going to work on the resample function in the future.
In the meantime you could use talib (or other TA packages, there are a few) to calculate the indicators on samples.
For the samples you could do one of two things:

  1. store the samples you get from the next() function
  2. use the alpaca api to request data with any timeframe you need.
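
Option 1 can be sketched as follows. Everything here (the BarAggregator name, the dict layout) is invented for illustration; a real strategy would call add() from next() with the current bar's datetime and price:

```python
from datetime import datetime

class BarAggregator:
    """Collect prices seen in next() into per-minute OHLC bars
    that a TA package (talib/btalib) can later consume."""

    def __init__(self):
        self.bars = {}  # minute timestamp -> [open, high, low, close]

    def add(self, ts, price):
        minute = ts.replace(second=0, microsecond=0)
        bar = self.bars.get(minute)
        if bar is None:
            # first tick of this minute opens the bar
            self.bars[minute] = [price, price, price, price]
        else:
            bar[1] = max(bar[1], price)  # high
            bar[2] = min(bar[2], price)  # low
            bar[3] = price               # close

agg = BarAggregator()
agg.add(datetime(2020, 10, 8, 10, 0, 5), 100.0)
agg.add(datetime(2020, 10, 8, 10, 0, 40), 101.5)
agg.add(datetime(2020, 10, 8, 10, 0, 55), 99.0)
print(agg.bars[datetime(2020, 10, 8, 10, 0)])  # [100.0, 101.5, 99.0, 99.0]
```

The resulting per-minute bars can be turned into a DataFrame and handed to any TA package.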
x777 commented

I am going to work on the resample function in the future.
In the meantime you could use talib (or other TA packages, there are a few) to calculate the indicators on samples.
For the samples you could do one of two things:

  1. store the samples you get from the next() function
  2. use the alpaca api to request data with any timeframe you need.

Maybe I can use a previous version. Which previous version is good for real-time trading but still has the resample function?

you could try versions before the alpaca data api was introduced (0.7.1 and below), but I don't believe it will change anything, since the data stream was always quote-based.

x777 commented

you could try versions before the alpaca data api was introduced (0.7.1 and below), but I don't believe it will change anything, since the data stream was always quote-based.

I understand, but doesn't resampling work with event data in previous versions? I am just trying to develop a bar-based strategy, and I need to play with different timeframes.

try working with the polygon data stream, it's more stable. Let's see if you get a better result.

x777 commented

try working with the polygon data stream, it's more stable. Let's see if you get a better result.

Do you mean the latest version, or the one I already used with Polygon?

resampling is a backtrader mechanism. It receives ticks/aggs and resamples them, but messages do not arrive as clock events.
One minute you may receive many quotes; another minute you may not receive any.
It depends on many things (were there any trades, did the communication work properly, etc.), and there's no way to know that every time, which means the behavior is hard to predict.
One thing you could do is change the qcheck parameter, as described here: https://www.backtrader.com/docu/live/ib/ib/#live-feeds-and-resamplingreplaying
and/or try to work with the different data streams available

as I said, I don't think you will get a better result with an older version, since this is backtrader code.
But maybe I missed something, so you could try it too.

x777 commented

as I said, I don't think you will get a better result with an older version, since this is backtrader code.
But maybe I missed something, so you could try it too.

I am just trying to avoid using the proxy agent, because resampling the live feed worked fine in previous versions.
But I am confused now; maybe it doesn't work anymore.

It looks like this library doesn't offer that many advantages for a live feed, and maybe it makes more sense to work with the native alpaca trade api. Just thinking ahead for strategy development.

I just tested the latest code with polygon data, doing a resample like so, and it works:

from datetime import datetime, timedelta

data0 = DataFactory(dataname=symbol,
                    historical=False,
                    timeframe=bt.TimeFrame.Minutes,
                    qcheck=2.0,
                    backfill_start=True,
                    # subtract from a datetime, not a .date();
                    # a date minus 200 minutes is the same date
                    fromdate=datetime.utcnow() - timedelta(minutes=200)
                    )

cerebro.resampledata(data0,
                     compression=1,
                     timeframe=bt.TimeFrame.Minutes)

changing the quote data to minute data

x777 commented

@shlomikushchi can you please explain why this only works with polygon data? What is the difference between Alpaca's feed and Polygon's?

And is the next() function triggered by quotes (bid/ask) or by the trade price?

it works with alpaca data too, it just might not be as consistent.
It could be different things:

  • amount of messages
  • delays
  • different exchanges collected from

the default quote data is the bid price, but you can specify the ask price instead

x777 commented

@shlomikushchi But earlier you said that using the resample functionality requires the proxy agent, didn't you?
About quotes: how can I switch between the bid price and the ask price? Can I switch to the last trade?

the proxy-agent is a completely different thing; it has nothing to do with resampling.
You could pass the param useask (the default is False).
You can't use the last trade for now, only quotes.
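
To illustrate what useask effectively selects (the parameter name comes from this thread; the quote-dict keys below are only illustrative), the feed picks one side of the quote to drive the price:

```python
# Sketch of what `useask` selects: which side of a quote drives the
# bar price. The quote-dict keys are hypothetical, for illustration.
def quote_price(quote, useask=False):
    return quote['askprice'] if useask else quote['bidprice']

quote = {'bidprice': 99.98, 'askprice': 100.02}
print(quote_price(quote))               # 99.98 (default: bid)
print(quote_price(quote, useask=True))  # 100.02
```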

x777 commented

the proxy-agent is a completely different thing; it has nothing to do with resampling.

When I used resampling (in my earlier messages) I got the error: Error while connecting to wss://data.alpaca.markets/stream:your connection is rejected while another connection is open under the same account
So I understood that I must use the proxy-agent. What will happen if I add more resampled data feeds to your example?

if you use more than one data feed you need the proxy-agent. It doesn't matter whether you resample or not.

x777 commented

if you use more than one data feed you need the proxy-agent. It doesn't matter whether you resample or not.

This is where all my problems began.

x777 commented

I just tested the latest code with polygon data, doing a resample like so, and it works:

from datetime import datetime, timedelta

data0 = DataFactory(dataname=symbol,
                    historical=False,
                    timeframe=bt.TimeFrame.Minutes,
                    qcheck=2.0,
                    backfill_start=True,
                    fromdate=datetime.utcnow() - timedelta(minutes=200)
                    )

cerebro.resampledata(data0,
                     compression=1,
                     timeframe=bt.TimeFrame.Minutes)

changing the quote data to minute data

I was trying to use your example with the proxy-agent:

***** DATA NOTIF: DELAYED
2020-10-16 15:02:29,362 error while consuming ws messages: [Errno 110] Connect call failed ('192.168.99.100', 8765)
2020-10-16 15:04:40,444 error while consuming ws messages: [Errno 110] Connect call failed ('192.168.99.100', 8765)
2020-10-16 15:06:51,516 error while consuming ws messages: [Errno 110] Connect call failed ('192.168.99.100', 8765)
2020-10-16 15:09:02,578 error while consuming ws messages: [Errno 110] Connect call failed ('192.168.99.100', 8765)

The strategy without resampling works well through the proxy-agent, but the second one, which uses resampling, shows these errors.

Update: It doesn't matter whether I use resampling or not; two parallel strategies don't work for me.

x777 commented

@shlomikushchi Is it possible to convert the live feed manually, without resampledata(), to a data feed type that can be used later in indicators, like after resampling? Right now I am getting the needed data from the Alpaca Data API using a timer and calculating simple indicators manually, but it's not convenient. Maybe you have some examples?

are you running the docker container on '192.168.99.100'?
it doesn't seem like anything is listening there

@shlomikushchi Is it possible to convert the live feed manually, without resampledata(), to a data feed type that can be used later in indicators, like after resampling? Right now I am getting the needed data from the Alpaca Data API using a timer and calculating simple indicators manually, but it's not convenient. Maybe you have some examples?

these are the 2 options you have:

  1. resample
  2. get the data manually using the api
x777 commented

are you running the docker container on '192.168.99.100'?
it doesn't seem like anything is listening there

Yes, and I don't see any logs in the docker terminal when starting the strategy.

x777 commented

@shlomikushchi Is it possible to convert the live feed manually, without resampledata(), to a data feed type that can be used later in indicators, like after resampling? Right now I am getting the needed data from the Alpaca Data API using a timer and calculating simple indicators manually, but it's not convenient. Maybe you have some examples?

these are the 2 options you have:

  1. resample
  2. get the data manually using the api

The problem with the second way is how to convert the response to a format (type) that backtrader indicators or the talib library support. Can you show some examples?
I can only use the new btalib library by the backtrader developers, but it doesn't have that many supported indicators yet.

are you running the docker container on '192.168.99.100'?
it doesn't seem like anything is listening there

Yes, and I don't see any logs in the docker terminal when starting the strategy.

then you are probably not running on 192.168.99.100.
are you using virtualbox? If not, you are probably running on 127.0.0.1:8765

The problem with the second way is how to convert the response to a format (type) that backtrader indicators or the talib library support. Can you show some examples?
I can only use the new btalib library by the backtrader developers, but it doesn't have that many supported indicators yet.

btalib is fine, it has the same indicators you have in backtrader.
But why not use the resample?
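
For completeness, the manual route looks like this. The hand-rolled sma() below stands in for btalib.sma / talib.SMA on a list of closes fetched from the api; it is a sketch, not library code:

```python
# Manual SMA over closing prices, standing in for btalib.sma / talib.SMA.
def sma(prices, period):
    """Simple moving average of the last `period` prices,
    or None if there is not enough data yet."""
    if len(prices) < period:
        return None
    return sum(prices[-period:]) / period

closes = [100.0, 101.0, 102.0, 103.0, 104.0]
print(sma(closes, 3))   # 103.0
print(sma(closes, 10))  # None
```

With bars stored as plain lists or a DataFrame, any indicator can be computed this way, at the cost of re-implementing what resampledata() would give for free.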

x777 commented

are you running the docker container on '192.168.99.100'?
it doesn't seem like anything is listening there

Yes, and I don't see any logs in the docker terminal when starting the strategy.

then you are probably not running on 192.168.99.100.
are you using virtualbox? If not, you are probably running on 127.0.0.1:8765

Changing the env variable to the 127.0.0.1:8765 address solved the problem.

x777 commented

The problem with the second way is how to convert the response to a format (type) that backtrader indicators or the talib library support. Can you show some examples?
I can only use the new btalib library by the backtrader developers, but it doesn't have that many supported indicators yet.

btalib is fine, it has the same indicators you have in backtrader.
But why not use the resample?

If there is more than one timeframe and it's hard to use the proxy-agent, it would be helpful to understand how to convert the API response to a data feed supported by indicators.

x777 commented

are you running the docker container on '192.168.99.100'?
it doesn't seem like anything is listening there

Yes, and I don't see any logs in the docker terminal when starting the strategy.

then you are probably not running on 192.168.99.100.
are you using virtualbox? If not, you are probably running on 127.0.0.1:8765

Changing the env variable to the 127.0.0.1:8765 address solved the problem.

But when I start the same script through the proxy-agent, it only connects, and there are no events after that. I can see that it is connected to docker only.
And it looks like that is why I have problems with resampling.

x777 commented

In my opinion, the main problem in understanding how it works is being sure that the stream from the server is stable.
When you try to play with resampling or with the proxy agent, you never know what you are doing wrong, because a script that worked a few minutes ago suddenly stops receiving any data from the stream. It's very strange behavior, and I always test only highly liquid stocks and ETFs.

x777 commented

In the resample example, line 50 is not working for me:

fromdate=datetime.utcnow() - timedelta(minutes=20)

Only this code starting resample data:

fromdate=datetime.now() - timedelta(minutes=20)

In my opinion, the main problem in understanding how it works is being sure that the stream from the server is stable.
When you try to play with resampling or with the proxy agent, you never know what you are doing wrong, because a script that worked a few minutes ago suddenly stops receiving any data from the stream. It's very strange behavior, and I always test only highly liquid stocks and ETFs.

start with one stock. Test it with resample (without the proxy-agent). When you're sure it works, start complicating it.

The problem with the second way is how to convert the response to a format (type) that backtrader indicators or the talib library support. Can you show some examples?
I can only use the new btalib library by the backtrader developers, but it doesn't have that many supported indicators yet.

btalib is fine, it has the same indicators you have in backtrader.
But why not use the resample?

If there is more than one timeframe and it's hard to use the proxy-agent, it would be helpful to understand how to convert the API response to a data feed supported by indicators.

if you want to retrieve data every minute from the api servers, then pylivetrader is a project designed to work like that. Doing it in backtrader is possible, but it's not by design.

x777 commented

In my opinion, the main problem in understanding how it works is being sure that the stream from the server is stable.
When you try to play with resampling or with the proxy agent, you never know what you are doing wrong, because a script that worked a few minutes ago suddenly stops receiving any data from the stream. It's very strange behavior, and I always test only highly liquid stocks and ETFs.

start with one stock. Test it with resample (without the proxy-agent). When you're sure it works, start complicating it.

It's funny, I just want to run a few basic indicators on a live feed and really can't do it.
Sometimes it spends a lot of time in ***** DATA NOTIF: DELAYED, and then, when it switches to ***** DATA NOTIF: LIVE, not much more happens than in delayed mode.

Just one question: do all users of alpaca-backtrader have these kinds of problems, or is it only me?

x777 commented

The problem with the second way is how to convert the response to a format (type) that backtrader indicators or the talib library support. Can you show some examples?
I can only use the new btalib library by the backtrader developers, but it doesn't have that many supported indicators yet.

btalib is fine, it has the same indicators you have in backtrader.
But why not use the resample?

If there is more than one timeframe and it's hard to use the proxy-agent, it would be helpful to understand how to convert the API response to a data feed supported by indicators.

if you want to retrieve data every minute from the api servers, then pylivetrader is a project designed to work like that. Doing it in backtrader is possible, but it's not by design.

I don't want to do such conversions if resampling works well. But...

use polygon data, it works better with resampling

x777 commented

use polygon data, it works better with resampling

Is there any prospect that it will work better with the Alpaca feed in the future?

I believe it will become more and more like polygon, yes

x777 commented

I believe it will become more and more like polygon, yes

I hope so too. Thanks!

x777 commented

@shlomikushchi any updates about resampling with Alpaca?

no