Document Non-Synced Bevy Gotcha's ( Events, etc. )
zicklag opened this issue · 12 comments
Is your feature request related to a problem? Please describe.
The current design of Bevy GGRS doesn't snapshot and restore Bevy events, so events that are handled in a frame may not get handled again if the frame is rolled back to.
Describe the solution you'd like
We might need a custom event resource type that is compatible with the rollback and snapshot system.
Describe alternatives you've considered
A quick attempt to snapshot the Events<T>
resource from Bevy didn't turn out so well. Events<T>
isn't Clone
even if T: Clone
, which presents challenges, and the exposed public API doesn't allow us enough access to reasonably snapshot and restore it. Even if we could, we'd have to have a way to reset the read counters on the Local<ManualEventReader<T>>
storage that powers the EventReader<T>
system param.
Essentially bevy's built-in event system probably just isn't suitable for snapshot and restore, and we need some alternative, or we just have to make this as a caveat and say that you have to use components for sending events or something like that.
Additional context
I'm actually a little confused why some events in my game seem to be working fine after rollback and restore. I'm still debugging. I'm still trying to get my game working in the SyncTest mode, and things get a little confusing to debug when the world is rolling back and forth the whole time. :D
There's a small chance that I'm misunderstanding something and events could work fine without changes, but I highly doubt it.
A lot of bevy's internal features are not very snapshot-friendly. As an example, bevy's current states use a driver function that has a Local<bool>
making it impossible to capture states without essentially rewriting the feature yourself (Which we did in bevy jam 1). Doing this for every feature that bevy might have cannot realistically be maintained.
That's absolutely reasonable, ad in that case I'm totally fine closing this with a note in the README about how it's an unsupported feature.
There should probably be a section for well-known pitfalls / bevy features not working as expected.
I just had some trouble with Added
and Changed
queries triggering too often, causing bevy_ecs_ldtk to respawn the level on every frame during synctest sessions.
Perhaps we can try to avoid unnecessary change detection? #39
@johanhelsing I replied on #39.
To get started on a list of things to beware of for the README ( just real quick as I can think of them ):
- Writing/Reading a component/resource in a rollback system without registering it
- Reading input directly from bevy
- Using
Local<T>
parameters in rollback systems- Using any other system parameters that depend on
Local
internally. - Using events
- Using any other system parameters that depend on
- Entity IDs will change over rollbacks if entities get deleted:
- This means if you need to store an entity in a custom component you must implement
MapEntities
.- Actually, this is broken right now and we need a fix for #40
- So you can't depend on the Entity ID itself in any logic that must be deterministic.
- This means if you need to store an entity in a custom component you must implement
- Bevy query iteration order is non-deterministic/
- You must be careful when creating
Rollback
IDs in a non-deterministic loop. - You have to collect and sort your query results by some component, such as
Rollback
- You must be careful when creating
So this one felt kind of obvious once I actually found it, but GlobalTransform
s are only updated in the PostUpdate
stage, so it's probably best to avoid them in gameplay code.
If you really need them, it is possible to register it as a rollback component, but then you also need to make sure that your rollback schedule propagates them properly by adding bevy::transform::transform_propagate_system
to the rollback schedule.
This is what I'm currently doing:
#[derive(Debug, Hash, PartialEq, Eq, Clone, StageLabel)]
pub struct RollbackPreStage;
impl Plugin for GgrsIntegrationPlugin {
fn build(&self, app: &mut App) {
app.init_resource::<FrameCount>();
GGRSPlugin::<GGRSConfig>::new()
.with_input_system(input)
.with_rollback_schedule(
Schedule::default()
.with_stage(
RollbackPreStage,
SystemStage::single_threaded()
.with_system(tick_frame_count)
.with_system(
bevy::transform::transform_propagate_system.after(tick_frame_count),
),
)
.with_stage_after(RollbackPreStage, PhysicsStage, PhysicsStage::create_stage())
.with_stage_after(
PhysicsStage,
"ROLLBACK_STAGE",
SystemStage::parallel()
.with_system(update_action_state_from_ggrs_input)
.with_system(spawn_player_characters)
.with_system_set(
CharacterMovementPlugin::gameplay_systems()
.after(update_action_state_from_ggrs_input),
)
.with_system_set(
ShipModulePlugin::gameplay_systems()
.after(update_action_state_from_ggrs_input),
),
),
)
.register_rollback_component::<Transform>()
// global transform is derived from transform and the parent hierarchy...
// so as long as we propagate transforms in the rollback schedule, we should be good
// .register_rollback_component::<GlobalTransform>()
.register_rollback_types::<PhysicsPlugin>()
.register_rollback_types::<CharacterMovementPlugin>()
.register_rollback_resource::<FrameCount>()
.build(app);
}
}
EDIT:
Note that transform_propagate_system
can be quite a performance hog. Here is a trace from my game:
I suspect this is due to Bevy using Changed<Transform>
filters on the queries in transform_propagate_system
, and bevy_ggrs triggers these on every snapshot restore (#39)
That's a really good point. I hadn't gotten to thinking through the problem completely yet, but I was wondering why global transforms weren't working. I just stopped using them for now, but yeah that makes a lot of sense, and seems obvious once you mention it.
For the sorting/ordering gotcha, here is an example how to sort:
fn solve_body_positions(mut bodies: Query<(&Rollback, &mut Pos, &Mass)>) {
let mut bodies: Vec<_> = bodies.iter_mut().collect();
bodies.sort_by_key(|b| b.0.id());
for (_rollback, mut pos, mass) in bodies {
// safe to mutate body positions in order-dependent way here
}
}
I guess we could consider deriving a couple of traits on Rollback
to make it possible to sort on that directly?
The state implementation in Bevy has changed now, and no longer uses a Local
. It is however still tied to the Main
schedule. I managed to write an extension methods that adds it to the GgrsSchedule
instead, but had to patch bevy 0.11 to add Reflect
to NextState<S>
and State<S>
, but looks like it will land in Bevy 0.12.
Makes the following work:
#[derive(States, Reflect, Hash, Default, Debug, Eq, PartialEq, Clone)]
pub enum GameplayState {
#[default]
InRound,
GameOver,
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
App::new()
.add_ggrs_plugin(
GgrsPlugin::<GgrsConfig>::new()
.with_update_frequency(60)
.with_input_system(input)
.register_rollback_component::<Health>()
// Register the state's resources so GGRS can roll them back
.register_roll_state::<GameplayState>(),
)
.add_plugins((
MinimalPlugins.set(ScheduleRunnerPlugin::run_loop(Duration::from_secs_f64(
1.0 / 10.0,
))),
LogPlugin::default(),
))
// Add the state to a specific schedule, in this case the GgrsSchedule
.add_roll_state::<GameplayState>(GgrsSchedule)
.add_systems(OnEnter(GameplayState::InRound), spawn_player)
.add_systems(OnEnter(GameplayState::GameOver), log_game_over)
.add_systems(
GgrsSchedule,
decrease_health
.after(apply_state_transition::<GameplayState>)
.run_if(in_state(GameplayState::InRound)),
)
.insert_resource(Session::SyncTest(session))
.run();
}
Implementation is here: https://github.com/johanhelsing/bevy_roll#states
Might do some experiments with other things as well.
Found another hierarchy-related gotcha:
If you have a rollbacked entity, and add a non-rollbacked child (i.e. a particle from a particle system or some other unimportant cosmetic stuff), this will result in an invalid hierarchy because Children
is rolled back, and the non-rolled back child entity is not in the entity map used when running map_entities
on Children
... so it is assigned a "dead" entity instead of the still living child.
Another one: Avoid usize
for gameplay components if you care about cross-platform determinism (and don't use them in checksums)
Found another hierarchy-related gotcha:
If you have a rollbacked entity, and add a non-rollbacked child (i.e. a particle from a particle system or some other unimportant cosmetic stuff), this will result in an invalid hierarchy because
Children
is rolled back, and the non-rolled back child entity is not in the entity map used when runningmap_entities
onChildren
... so it is assigned a "dead" entity instead of the still living child.
Maybe related gotcha / issue...
If having a player represented by a rolled back parent without any visual representation and non-rollback children displaying meshes / 3d models, the player seems to freeze in place after being rolled back.
This is very apparent and happens directly on spawn when running your game in sync test mode, or perhaps after a few seconds when running on two different machines. However it probably goes unnoticed if running two instances of the game on the same machine, which typically means low latency and no rollback.
Haven't dug into the details but when this "freeze" occurs, the non visible parent's translation is updated, but the visible non-rollbacked children seem to be stuck in place.