StanfordLegion/legion

Default Mapper Overapproximation of Fields


For performance reasons, the default mapper currently over-approximates the set of fields it needs when creating instances. It finds all the fields in the field space and puts them in the layout constraints for the new instance, even fields it doesn't have privileges on. It does that here:

https://gitlab.com/StanfordLegion/legion/-/blob/master/runtime/mappers/default_mapper.cc?ref_type=heads#L2315
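To make the behavior concrete, here is a toy model of the over-approximation. It is a sketch, not the code at the link: `FieldID` is a local alias, and `overapproximate_fields` and `instance_covers` are hypothetical helpers, not Legion API. It also assumes the performance benefit is that a wider instance can be reused by a later request for other fields; the issue does not spell out the motivation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a field identifier; not a Legion type.
using FieldID = std::uint32_t;

// Toy model of the over-approximation: request every field currently in the
// field space, not only the fields the region requirement has privileges on.
std::set<FieldID> overapproximate_fields(
    const std::set<FieldID> &all_fields_in_space,
    const std::set<FieldID> &needed_fields) {
  (void)needed_fields;  // deliberately ignored: that is the over-approximation
  return all_fields_in_space;
}

// An existing instance can serve a request if it holds every needed field.
// std::includes is valid here because std::set iterates in sorted order.
bool instance_covers(const std::set<FieldID> &instance_fields,
                     const std::set<FieldID> &needed_fields) {
  return std::includes(instance_fields.begin(), instance_fields.end(),
                       needed_fields.begin(), needed_fields.end());
}
```

With a field space of `{1, 2, 3, 6}` and needed fields `{1, 2, 3}`, the over-approximated instance also covers a later request for `{3, 6}`, while an instance built from only the needed fields would not.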

Unfortunately, this is not always safe when deletions of some of those fields are in flight in parallel with the operation that is creating the instance. In such a race, the mapper can sample the current set of fields in the field space, the deletion then goes through and removes some of them, and the mapper then completes its call into the runtime to create an instance. At that point the runtime complains that it doesn't know about at least one field and you get an error like this:

[0 - 7f2b28989c00]    0.505730 {6}{runtime}: [fatal 2007] LEGION FATAL: unknown field ID 6 requested during instance creation (from file /home/mebauer/legion/runtime//legion/region_tree.cc:14058)

with a backtrace like this:

#7  0x000055dc63632368 in Legion::Internal::Runtime::report_fatal_message (id=2007,
    file_name=0x55dc6481e738 "/home/mebauer/legion/runtime//legion/region_tree.cc", line=14058,
    message=0x7f2ae04f45b0 "unknown field ID 6 requested during instance creation")
    at /home/mebauer/legion/runtime//legion/runtime.cc:32119
#8  0x000055dc635408e1 in Legion::Internal::FieldSpaceNode::compute_field_layout (this=0x7f2af00018c0,
    create_fields=std::vector of length 8, capacity 8 = {...},
    field_sizes=std::vector of length 8, capacity 8 = {...},
    mask_index_map=std::vector of length 8, capacity 8 = {...},
    serdez=std::vector of length 8, capacity 8 = {...}, mask=...)
    at /home/mebauer/legion/runtime//legion/region_tree.cc:14058
#9  0x000055dc63294762 in Legion::Internal::InstanceBuilder::compute_layout_parameters (this=0x7f2ae04f6860)
    at /home/mebauer/legion/runtime//legion/legion_instances.cc:4150
#10 0x000055dc63293875 in Legion::Internal::InstanceBuilder::initialize (this=0x7f2ae04f6860,
    forest=0x55dc663ac010) at /home/mebauer/legion/runtime//legion/legion_instances.cc:3955
#11 0x000055dc635d65b3 in Legion::Internal::MemoryManager::find_or_create_physical_instance (this=0x7f2afc00be70,
    constraints=..., regions=std::vector of length 1, capacity 1 = {...}, result=...,
    created=@0x7f2ae04f7da9: false, processor=..., acquire=true, priority=0, tight_region_bounds=false,
    unsat_kind=0x0, unsat_index=0x0, footprint=0x7f2ae04f8210, creator_id=16, remote=false)
    at /home/mebauer/legion/runtime//legion/runtime.cc:8831
#12 0x000055dc6361b6ef in Legion::Internal::Runtime::find_or_create_physical_instance (this=0x55dc663ed010,
    target_memory=..., constraints=..., regions=std::vector of length 1, capacity 1 = {...}, result=...,
    created=@0x7f2ae04f7da9: false, processor=..., acquire=true, priority=0, tight_bounds=false, unsat=0x0,
    footprint=0x7f2ae04f8210, creator_id=16) at /home/mebauer/legion/runtime//legion/runtime.cc:26601
#13 0x000055dc637a4f5c in Legion::Internal::MapperManager::find_or_create_physical_instance (this=0x55dc6654d190,
    ctx=0x7f2ae04f82c0, target_memory=..., constraints=..., regions=std::vector of length 1, capacity 1 = {...},
    result=..., created=@0x7f2ae04f7da9: false, acquire=true, priority=0, tight_region_bounds=false,
    footprint=0x7f2ae04f8210, unsat=0x0) at /home/mebauer/legion/runtime//legion/mapper_manager.cc:1271
#14 0x000055dc63453c6f in Legion::Mapping::MapperRuntime::find_or_create_physical_instance (this=0x55dc6625f200,
    ctx=0x7f2ae04f82c0, target_memory=..., constraints=..., regions=std::vector of length 1, capacity 1 = {...},
    result=..., created=@0x7f2ae04f7da9: false, acquire=true, priority=0, tight_bounds=false,
    footprint=0x7f2ae04f8210, unsat=0x0) at /home/mebauer/legion/runtime//legion/legion_mapping.cc:865
#15 0x000055dc63d5b936 in Legion::Mapping::DefaultMapper::default_make_instance (this=0x55dc6654cca0,
    ctx=0x7f2ae04f82c0, target_memory=..., constraints=..., result=...,
    kind=Legion::Mapping::DefaultMapper::TASK_MAPPING, force_new=false, meets=true, req=...,
    footprint=0x7f2ae04f8210) at /home/mebauer/legion/runtime//mappers/default_mapper.cc:2343
#16 0x000055dc63d59d84 in Legion::Mapping::DefaultMapper::default_create_custom_instances (this=0x55dc6654cca0,
    ctx=0x7f2ae04f82c0, target_proc=..., target_memory=..., req=..., index=0,
    needed_fields=std::set with 3 elements = {...}, layout_constraints=..., needs_field_constraint_check=false,
    instances=std::vector of length 1, capacity 1 = {...}, footprint=0x7f2ae04f8210)
    at /home/mebauer/legion/runtime//mappers/default_mapper.cc:2025
#17 0x000055dc63d576df in Legion::Mapping::DefaultMapper::map_task (this=0x55dc6654cca0, ctx=0x7f2ae04f82c0, task=
    ..., input=..., output=...) at /home/mebauer/legion/runtime//mappers/default_mapper.cc:1489
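The interleaving behind this failure can be replayed deterministically in a small single-threaded model. This is only a sketch of the ordering described above: `ToyFieldSpace`, `sample_all_fields`, `unknown_fields`, and `replay_race` are hypothetical names, not Legion types or calls, and the field IDs are arbitrary apart from reusing 6 from the error message.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using FieldID = std::uint32_t;  // hypothetical stand-in, not a Legion type

// Toy field space: only tracks which field IDs are currently allocated.
struct ToyFieldSpace {
  std::set<FieldID> live_fields;
  void free_field(FieldID f) { live_fields.erase(f); }
};

// Mapper side: snapshot every field in the space (the over-approximation).
std::vector<FieldID> sample_all_fields(const ToyFieldSpace &fs) {
  return std::vector<FieldID>(fs.live_fields.begin(), fs.live_fields.end());
}

// Runtime side: collect the requested fields that no longer exist.
// A non-empty result models the "unknown field ID" fatal error.
std::vector<FieldID> unknown_fields(const ToyFieldSpace &fs,
                                    const std::vector<FieldID> &requested) {
  std::vector<FieldID> missing;
  for (FieldID f : requested)
    if (fs.live_fields.count(f) == 0) missing.push_back(f);
  return missing;
}

// Replays the problematic ordering in one thread.
std::vector<FieldID> replay_race() {
  ToyFieldSpace fs{{1, 2, 3, 6}};
  std::vector<FieldID> snapshot = sample_all_fields(fs);  // 1. mapper samples
  fs.free_field(6);                                       // 2. deletion lands
  return unknown_fields(fs, snapshot);                    // 3. creation checks
}
```

In this model `replay_race()` reports field 6 as unknown, because the snapshot taken in step 1 is stale by the time step 3 validates it.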

We probably need a fix that keeps the optimization but is also safe when deletions are in flight concurrently with instance creation.
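One possible direction, offered only as a sketch and not as a decided design: let the creation request distinguish fields that are required from fields that were added opportunistically, and have the check drop opportunistic fields that have since been deleted. `ToyCreateRequest` and `resolve_fields` are hypothetical; nothing like them is claimed to exist in Legion today, and in a real runtime the liveness check would have to be atomic with respect to the deletion.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using FieldID = std::uint32_t;  // hypothetical stand-in, not a Legion type

// Hypothetical creation request that separates the two kinds of fields.
struct ToyCreateRequest {
  std::set<FieldID> required;  // fields the operation actually needs
  std::set<FieldID> optional;  // extras added by the over-approximation
};

// Decide which fields to lay out given the currently live fields.
// A missing required field is still an error (ok = false); a missing
// optional field is silently dropped instead of being fatal.
std::set<FieldID> resolve_fields(const std::set<FieldID> &live_fields,
                                 const ToyCreateRequest &req, bool &ok) {
  std::set<FieldID> result;
  ok = true;
  for (FieldID f : req.required) {
    if (live_fields.count(f) == 0)
      ok = false;
    else
      result.insert(f);
  }
  for (FieldID f : req.optional)
    if (live_fields.count(f) != 0) result.insert(f);
  return result;
}
```

If field 6 is deleted after being sampled as optional, a request with required `{1, 2, 3}` and optional `{6}` resolves to `{1, 2, 3}` and succeeds, so the optimization degrades gracefully instead of aborting.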