ppy/osu-framework

`BindableList` mutation operations are unsafe with certain binding patterns

bdach opened this issue · 2 comments

bdach commented

As described on discord.

Reproduction test case:

diff --git a/osu.Framework.Tests/Bindables/BindableListTest.cs b/osu.Framework.Tests/Bindables/BindableListTest.cs
index 9df3106dd..202d8eef0 100644
--- a/osu.Framework.Tests/Bindables/BindableListTest.cs
+++ b/osu.Framework.Tests/Bindables/BindableListTest.cs
@@ -995,6 +995,21 @@ public void TestRemoveAtNotifiesBoundLists()
             Assert.That(triggeredArgs.OldStartingIndex, Is.EqualTo(0));
         }
 
+        [Test]
+        public void TestRemoveAtBranchingBinds()
+        {
+            var b1 = new BindableList<int> { 1, 2, 3, 4, 5 };
+
+            var b2 = b1.GetBoundCopy();
+            var b3 = b1.GetBoundCopy();
+
+            var b4 = new BindableList<int>();
+            b2.BindTo(b4);
+            b3.BindTo(b4);
+
+            b1.RemoveAt(b1.Count - 1);
+        }
+
         #endregion
 
         #region .RemoveAll(match)

will fail with

System.ArgumentOutOfRangeException : Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
   at System.Collections.Generic.List`1.get_Item(Int32 index)
   at osu.Framework.Bindables.BindableList`1.removeAt(Int32 index, BindableList`1 caller) in /home/dachb/Documents/opensource/osu-framework/osu.Framework/Bindables/BindableList.cs:line 274
   at osu.Framework.Bindables.BindableList`1.RemoveAt(Int32 index) in /home/dachb/Documents/opensource/osu-framework/osu.Framework/Bindables/BindableList.cs:line 268
   at osu.Framework.Tests.Bindables.BindableListTest.TestRemoveAtBranchingBinds() in /home/dachb/Documents/opensource/osu-framework/osu.Framework.Tests/Bindables/BindableListTest.cs:line 1010

This can also happen for any remove/add operation, not only a RemoveAt() issue specifically (although that one is loudest, as it will lead to a crash, adds will lead to silent corruption). Core issue is that the following binding weak reference graph is allowed:

flowchart TD
    b1 --- b2
    b1 --- b3
    b2 --- b4
    b3 --- b4
Loading

which means that trying to perform partial updates on a bindable list by walking the weak reference graph can lead to some nodes having the same partial update ran on them twice (or more).

Possible resolutions I can think of:

  • Collect instances which had the partial update performed on them to a list and use the list to skip performing it again if the same instance was already encountered before
  • Take a partial (item count? possibly item being removed when removing by index?) snapshot of the state of the list at the root level and use that to determine if instances already had the partial update performed on them
  • Take a full snapshot of the state of the list at the root level and propagate that fully down instead of trying to partially update via the weakref graph
  • Attempt to detect such pathological weakrefs graphs somehow and either refuse to set up this binding pattern or automatically fix it by correcting it to
    flowchart TD
        b1 --- b2
        b1 --- b3
        b1 --- b4
    
    Loading
    (similar to union-find path compression?)
  • Add a versioning struct field to each BindableList instance:
    public struct BindGroup
    {
      public Guid ID;
      public int Version;
    }
    • When a bindable is bound to another bindable, it copies its BindGroup from that bindable. The point of the GUID is to distinguish sets of bound-together bindables.
    • Each top-level change to a bindable list increments BindGroup.Version.
    • During change propagation, if BindGroup.ID matches but BindGroup.Version doesn't, then perform the differential change.

Needs performance consideration.

peppy commented

Solutions 1, 4, 5 all sound like potential candidates to me. 4 sounds like it may have side effects and be the hardest to manage. 1 and 5 both sound like they will work. 1 is the easiest implementation, but 5 likely the most robust.

@smoogipoo interested in your thoughts on this before I get too deep in implementation.

peppy commented

I've gone ahead with solution 1 for now. #5954