Only re-render if the model has changed
z0w0 opened this issue · 16 comments
During the development of Helm 1.0, the sample mechanics introduced in previous versions (which prevented excess rendering) were removed. To optimize rendering, the render function should be changed to require an Eq implementation on the engine user's model type and then check that the model has changed before rendering.
This was done by simply marking the model as dirty when the game model is changed by an action; the dirty flag is then turned off after rendering. Better than requiring an Eq constraint.
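For readers following along, the dirty-flag idea can be sketched roughly as below. This is not the code from the Helm repository: `GameState`, `applyAction` and `renderIfDirty` are hypothetical names invented for illustration, and the sketch marks the model dirty on every action rather than trying to detect whether the action really changed it.

```haskell
-- Hypothetical engine state: the user's model plus a dirty flag.
-- No Eq constraint on the model type is needed.
data GameState m = GameState
  { gameModel :: m
  , dirty     :: Bool
  }

-- Applying an action produces a new model and marks the state dirty.
applyAction :: (a -> m -> m) -> a -> GameState m -> GameState m
applyAction update action st =
  st { gameModel = update action (gameModel st), dirty = True }

-- Render only when the flag is set, then clear the flag.
renderIfDirty :: (m -> IO ()) -> GameState m -> IO (GameState m)
renderIfDirty render st
  | dirty st  = render (gameModel st) >> pure st { dirty = False }
  | otherwise = pure st
```

The trade-off is that an action which leaves the model unchanged still triggers one render, which an Eq check would have avoided.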
I don't think you have solved this, unfortunately. Both the flappy and the restored hello examples peg one CPU core at 100% while idle.
Hmm. I'll investigate this weekend. Happy if someone beats me to it :)
It may not be re-rendering but polling for events in a really tight loop. I did some profiling with the hello example:
stack install --profile
stack exec -- ghc -prof -fprof-auto -rtsopts examples/hello/Main.hs
examples/hello/Main
Which gives a Main.prof file:
Wed Jun 28 15:08 2017 Time and Allocation Profiling Report (Final)
Main +RTS -p -RTS
total time = 3.29 secs (3293 ticks @ 1000 us, 1 processor)
total alloc = 938,042,536 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
tick Helm.Engine.SDL.Engine src/Helm/Engine/SDL/Engine.hs:(129,3)-(141,37) 35.0 1.2
pollEvent.\ SDL.Event src/SDL/Event.hs:(679,37)-(683,43) 24.0 0.0
superstep.loop FRP.Elerea.Param FRP/Elerea/Param.hs:(167,5)-(176,64) 19.0 53.4
step Helm src/Helm.hs:(67,1)-(85,65) 5.2 8.7
externalMulti.\.sample FRP.Elerea.Param FRP/Elerea/Param.hs:469:18-89 2.7 8.7
start.\ FRP.Elerea.Param FRP/Elerea/Param.hs:(156,22)-(160,14) 2.4 0.0
pollEvent SDL.Event src/SDL/Event.hs:(679,1)-(683,43) 2.4 9.9
initialize SDL.Init src/SDL/Init.hs:(61,1)-(63,39) 1.5 0.0
superstep FRP.Elerea.Param FRP/Elerea/Param.hs:(164,1)-(176,64) 1.4 3.1
moves Helm.Mouse src/Helm/Mouse.hs:(23,1)-(26,42) 1.1 2.5
start FRP.Elerea.Param FRP/Elerea/Param.hs:(152,1)-(160,14) 1.1 3.1
superstep.deref FRP.Elerea.Param FRP/Elerea/Param.hs:166:5-53 0.4 3.1
addSignal.fin.\ FRP.Elerea.Param FRP/Elerea/Param.hs:(190,37)-(192,56) 0.2 3.1
externalMulti.\ FRP.Elerea.Param FRP/Elerea/Param.hs:(467,27)-(470,69) 0.1 1.2
memoise FRP.Elerea.Param FRP/Elerea/Param.hs:265:1-64 0.1 1.9
I'd thought that the SDL pollEvent wouldn't be intensive.
What exactly are the possible solutions to this?
Should there be a thread delay, perhaps?
Based off the documentation for SDL.Time.Event, I'm inclined to believe we shouldn't be using any SDL2 API wait implementations but instead should just block the thread using GHC's threadDelay (especially since the game loop is on the Haskell end). I think adding two new game configuration options (with sane defaults) will suffice: updateLimit, a delay in milliseconds for sleeping the thread (I think 1ms would be a fine default), and fpsLimit, a soft limit on the FPS the game should render frames at.
There might be games where input lag is acceptable for the gain of minimised CPU usage, so the dev may opt to increase the updateLimit.
Yeah, I think you are right about that although it's not explicitly documented.
Been under the weather and haven't had a chance to look at this yet and am going on holidays for a week tomorrow. Happy to Paypal $10 AUD through to any legend that fixes this before I get back :)
In particular:
- Add the soft limit to the rendering rate as two enum variants, data FPSLimit = Unlimited | Limited Int (defaults to 120 FPS)
- Add the hard limit to the engine tick with threadDelay as two enum variants like the above, data UpdateLimit = Unlimited | Limited Double, where the unit here is either milliseconds or microseconds (threadDelay takes microseconds IIRC)
These two should both go on the SDL engine config, or game config. Both would work, but game config might be more appropriate because those two settings won't be specific to the SDL engine.
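A rough sketch of how the two limits might look as types is below. It is illustrative only: the constructors are prefixed (`FPSUnlimited`, `UpdateUnlimited`, ...) purely so both types can live in one module, `LimitConfig` and the helper functions are hypothetical, and the sketch assumes the UpdateLimit value is in milliseconds. The defaults (120 FPS, 1 ms) are the ones suggested in this thread.

```haskell
import Control.Concurrent (threadDelay)

-- Soft limit on the rendering rate, in frames per second.
data FPSLimit = FPSUnlimited | FPSLimited Int
  deriving (Eq, Show)

-- Hard limit on the engine tick; the value is assumed to be milliseconds.
data UpdateLimit = UpdateUnlimited | UpdateLimited Double
  deriving (Eq, Show)

-- Hypothetical record holding both settings.
data LimitConfig = LimitConfig
  { fpsLimit    :: FPSLimit
  , updateLimit :: UpdateLimit
  }

-- Defaults proposed in the discussion: 120 FPS and a 1 ms update delay.
defaultLimits :: LimitConfig
defaultLimits = LimitConfig (FPSLimited 120) (UpdateLimited 1)

-- threadDelay takes microseconds, so convert from milliseconds.
updateDelayMicros :: UpdateLimit -> Maybe Int
updateDelayMicros UpdateUnlimited    = Nothing
updateDelayMicros (UpdateLimited ms) = Just (max 0 (round (ms * 1000)))

-- Time budget for a single frame, in microseconds.
frameBudgetMicros :: FPSLimit -> Maybe Int
frameBudgetMicros FPSUnlimited = Nothing
frameBudgetMicros (FPSLimited fps)
  | fps <= 0  = Nothing
  | otherwise = Just (1000000 `div` fps)

-- Sleep the current thread according to the update limit, if any.
sleepForUpdate :: UpdateLimit -> IO ()
sleepForUpdate = maybe (pure ()) threadDelay . updateDelayMicros
```

With these defaults, `sleepForUpdate (updateLimit defaultLimits)` would block the calling thread for roughly one millisecond per tick.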
P.S. I realised two other things related to this issue:
- It's only polling one event every engine tick - it should be polling all of the events available in a single tick otherwise it pumps the game actions out of the subscriptions slowly every tick, and commands get run much more than they should..
- Exposed events aren't re-rendering the engine
This one is somehow connected to #118. I implemented FPSLimit Limited and the flickering disappeared. I am not proud of the implementation yet as it was a POC, so I may hold on to it for a little while, but as soon as it is done I will provide a PR. I plan to deliver it in two separate PRs, one for FPSLimit and another for UpdateLimit. The issues mentioned at the very bottom of the previous comment will require separate attention as well.
@z0w0 regarding GameConfig, and using the GameConfig name for FPSLimit and UpdateLimit: what options do you have in mind? I have an idea to name it GameLifecycle, as it contains all the key functions that define the game specifics. Any other ideas?
@z0w0 There is an interesting interdependency. If UpdateLimit is bigger than FPSLimit it will override it, as there is DirtyModel as well. Also, I think there is no reason to update more often than FPSLimit, as it will not provide any benefit anyway; the user will not see the result. Is there an option to simplify DirtyModel, FPSLimit and UpdateLimit, and what are the benefits of having such a granular configuration?
P.S. Thinking out loud.
Our current game loop implementation has the following constraints:
- DirtyModel: indicates that the model was changed. If it was not, there is no reason to render, since the new frame will not differ from the previously rendered one.
- FPSLimit: lets us save resources even if the model was updated, which can be helpful on constrained hardware like mobile devices.
However, the current implementation consumes all the CPU time it is given anyway, since even when we skip frame rendering we keep polling and processing SDL events nonstop. Moreover, with a naive implementation where the model changes on every SDL event and FPS is Unlimited, processing of the game directly depends on hardware speed. The idiomatic way of implementing this right now depends on Time.fps in subscriptions, which generates update actions. This allows generating updates on a regular basis that take delta time as an input for interpolation. While it solves the problem of uncontrollable timing of game state updates, it does not solve the issue of CPU over-utilization, as events are still polled nonstop.
A classic game loop looks like the following:
loopStartTime <- …
loopUpdate updateLimit deltaTime
render
delayToCatchupWithFPSLimit loopStartTime
In our implementation we will not need to calculate deltaTime to pass into the function, as it is up to the developer to use Time signals in their implementation. However, we need to ensure that overall event polling is constrained if needed. This is done by introducing UpdateLimit, which constrains the step function as a whole in the following way:
loopStartTime <- …
updateAsMuchAsUpdateLimitAllows
renderIfDirty
waitUpToFPSLimit
The resulting implementation is not much different from a classic game loop.
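As a reader's sketch of the outline above (not the code from the eventual PR), the loop could look something like this in plain Haskell. Everything here is hypothetical: the `Loop` record, `runLoop` and `remainingMicros` are made-up names, the callbacks stand in for Helm's real polling and rendering, and the sketch simplifies "update as much as UpdateLimit allows" to a single update pass per iteration.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (when)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- Hypothetical callbacks standing in for the engine's real functions.
data Loop s = Loop
  { pollAndUpdate :: s -> IO (s, Bool)  -- new state, and whether the model changed
  , renderFrame   :: s -> IO ()
  , shouldQuit    :: s -> Bool
  }

-- Microseconds left in the frame budget, clamped at zero.
-- Start and current times are in nanoseconds.
remainingMicros :: Int -> Word64 -> Word64 -> Int
remainingMicros budgetMicros startNs nowNs =
  max 0 (budgetMicros - fromIntegral ((nowNs - startNs) `div` 1000))

-- One update pass, an optional render, then sleep for the rest of the frame.
runLoop :: Int -> Loop s -> s -> IO s
runLoop budgetMicros loop = go
  where
    go s
      | shouldQuit loop s = pure s
      | otherwise = do
          start <- getMonotonicTimeNSec
          (s', isDirty) <- pollAndUpdate loop s
          when isDirty (renderFrame loop s')       -- renderIfDirty
          now <- getMonotonicTimeNSec
          threadDelay (remainingMicros budgetMicros start now)  -- waitUpToFPSLimit
          go s'
```

Because the thread sleeps for whatever is left of the frame budget instead of spinning, an idle game should no longer peg a core, at the cost of up to one frame of input latency.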
There are several problems with this loop design that may require further thinking:
- Update + Render time can be above FPSLimit. This will cause frame skips to happen, but the game logic will still update smoothly. We may redefine the prioritization so that FPS is constant but game updates happen less often. It may affect physics accuracy. We need to spend additional time investigating the drawbacks of such an approach.
- UpdateLimit can be lower than the rate of Time.fps events within a game implementation; as a result, the actual updates will be less frequent than requested.
This is what I am going to implement. If anyone has any suggestions for improvement, please chime in.
Sounds exactly like what I was envisioning. In an ideal world, update would be on its own green thread and rendering would stay on the main thread. But that takes a bit of involvement, because some commands need to be on the main thread (things that interact with the underlying SDL window, for example).
Ok, I got a POC working but now have challenges with propagating the engine's quit state across the loop functions. Profiling does show much better numbers. I am going to polish the code, implement better propagation of the quit state and provide a PR.
@z0w0 you mentioned "It's only polling one event every engine tick". Correct me if I am wrong, but the tick function is recursive.
-- Sink everything else into the signals
Just Event.Event { .. } ->
sinkEvent engine eventPayload >>= tick
By the end it will sink all the events before proceeding to processing actions. Am I missing something there?
Also, can you clarify "Exposed events aren't re-rendering the engine"? I am not sure I understand this part.
Yep, you're right that the poll is correct. I misread the code at the time :)
Exposed events are emitted by SDL when a window above the game window is moved out of the way, i.e. the contents of the game window are re-exposed. We need to re-render the game window as soon as this exposed event is emitted so that there's no delay in updating the screen. This complicates the code you've been working on a bit..