apache/accumulo

Lower time to host ondemand tablets

Closed this issue · 3 comments

When an ondemand tablet is requested to be hosted, the following happens in the manager:

  1. Set the requestToHostColumn
  2. Trigger an event that causes the TGW to scan the range
  3. The TGW then:
    1. scans the metadata table and finds tablets w/ the requestToHost column
    2. consults balancer to get assignment location
    3. sets a future location
    4. sends assignment RPC to tablet server

When running SplitMillionIT, the following events happen:

  1. A table w/ million tablets is cloned. This triggers an event that causes the TGW to scan the 1 million tablets.
  2. SplitMillionIT tries to scan 100 of the 1M tablets. This causes a request to the manager to host the 100 tablets.
  3. The above request gets backed up behind the TGW scanning the 1 million new tablets from the clone and that makes scanning 100 tablets take a while.

One possible way this could be improved is to do the following when the manager gets a request to host tablets:

  1. consults balancer to get assignment locations
  2. sets a future location
  3. sends assignment RPC to tablet server
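
A minimal sketch of the three steps above, using a hypothetical in-memory model. None of these types are Accumulo classes: `Balancer`, `futureLocations`, and `sentRpcs` are stand-ins for the balancer plugin, the metadata table's future column, and the assignment RPC. It also models a TGW pass re-sending RPCs for any future location it finds, which is the behavior that would make the approach safe across a manager restart.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model only; these are not Accumulo classes.
class DirectHostingSketch {
  // Stand-in for the balancer plugin: picks a tablet server for a tablet.
  interface Balancer {
    String chooseServer(String tablet);
  }

  // Stand-in for the metadata table: tablet -> future location.
  final Map<String, String> futureLocations = new LinkedHashMap<>();
  // Stand-in for assignment RPCs, recorded as "tablet@server".
  final List<String> sentRpcs = new ArrayList<>();

  private final Balancer balancer;

  DirectHostingSketch(Balancer balancer) {
    this.balancer = balancer;
  }

  // Handles a hosting request directly, without waiting for a TGW scan.
  void hostOnDemand(List<String> tablets) {
    for (String tablet : tablets) {
      if (futureLocations.containsKey(tablet)) {
        continue; // an assignment is already in progress for this tablet
      }
      String server = balancer.chooseServer(tablet); // 1. consult balancer
      futureLocations.put(tablet, server);           // 2. set future location
      sentRpcs.add(tablet + "@" + server);           // 3. send assignment RPC
    }
  }

  // Models a TGW pass after a manager restart: every tablet with a
  // future location gets its assignment RPC sent again.
  void tgwRecoveryScan() {
    futureLocations.forEach((tablet, server) -> sentRpcs.add(tablet + "@" + server));
  }
}
```

In this model, a crash between steps 2 and 3 loses nothing, because the persisted future location is enough for the later TGW pass to finish the assignment.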

So, do what the TGW is doing directly, and instead of setting the requestToHost column set the future column. May be able to drop the requestToHost column. The future column being set has very similar properties to the requestToHost column. When the TGW sees a future column, it will send an RPC to the tablet server to load the tablet. So if the manager is working on a request to host a tablet, sets the future location, and then dies, the TGW will see the future column when the manager starts again and send the RPC.

One pain point in implementing the above strategy is that synchronization may be needed around the balancer plugin. Not sure it's well defined what is expected of a balancer plugin when it gets concurrent calls. Some balancer plugins are stateful but do not have handling for concurrency. This may point to a larger need to revisit the design of the balancer plugin in elasticity and get expectations about its use well documented. A first cut of this change could just synchronize access to the balancer plugin.
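
As a rough illustration of the "just synchronize access" first cut, the sketch below wraps a stateful, non-thread-safe balancer in a decorator that serializes calls. The `Balancer` interface, `RoundRobinBalancer`, and `assignConcurrently` helper are all hypothetical and do not reflect Accumulo's actual balancer plugin API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model only; this is not Accumulo's balancer SPI.
class SynchronizedBalancerSketch {
  interface Balancer {
    String chooseServer(String tablet);
  }

  // A stateful balancer with no concurrency handling of its own.
  static class RoundRobinBalancer implements Balancer {
    private final String[] servers;
    private int next = 0; // unguarded mutable state

    RoundRobinBalancer(String... servers) {
      this.servers = servers;
    }

    @Override
    public String chooseServer(String tablet) {
      String server = servers[next % servers.length];
      next++;
      return server;
    }
  }

  // First-cut fix: one lock around every call into the plugin.
  static class SynchronizedBalancer implements Balancer {
    private final Balancer delegate;

    SynchronizedBalancer(Balancer delegate) {
      this.delegate = delegate;
    }

    @Override
    public synchronized String chooseServer(String tablet) {
      return delegate.chooseServer(tablet);
    }
  }

  // Calls the balancer from several threads and counts picks per server.
  static Map<String, Integer> assignConcurrently(Balancer b, int threads, int callsPerThread) {
    Map<String, Integer> counts = new ConcurrentHashMap<>();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int c = 0; c < callsPerThread; c++) {
          counts.merge(b.chooseServer("tablet"), 1, Integer::sum);
        }
      });
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counts;
  }
}
```

The trade-off is that a single lock turns every balancer call into a serialization point, so this only makes sense as a stopgap until the plugin's concurrency expectations are documented.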

The client side tablet location cache looks at the requestToHost column to attempt to be aware of another client process requesting to host the same tablet. That code could use the future column for the same purpose. If it sees future set, then it knows that it does not need to ask the manager to host the tablet.
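
The client side check described above might look roughly like the following. `TabletMeta` is a hypothetical snapshot of the location columns for one tablet (a `null` field means the column is not set), and `tabletsNeedingHostRequest` is an invented helper name, not part of the real client cache.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model only; not the actual client tablet location cache.
class HostingRequestFilterSketch {
  // Snapshot of the location columns read for one tablet.
  static final class TabletMeta {
    final String tablet;
    final String currentLocation; // null when the tablet is not hosted
    final String futureLocation;  // null when no assignment is in progress

    TabletMeta(String tablet, String currentLocation, String futureLocation) {
      this.tablet = tablet;
      this.currentLocation = currentLocation;
      this.futureLocation = futureLocation;
    }
  }

  // Returns only the tablets the client still has to ask the manager to host.
  static List<String> tabletsNeedingHostRequest(List<TabletMeta> tablets) {
    List<String> toRequest = new ArrayList<>();
    for (TabletMeta tm : tablets) {
      boolean hosted = tm.currentLocation != null;
      boolean assignmentInProgress = tm.futureLocation != null;
      if (!hosted && !assignmentInProgress) {
        toRequest.add(tm.tablet);
      }
    }
    return toRequest;
  }
}
```

A tablet with a future location is skipped because some other process has already triggered its assignment, which is the same signal the requestToHost column gives today.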

Opened #4581. Did not set the future directly in the PR. Tried the approach of directly setting the future and ran into two problems. The first problem was that the future cannot be set if a tablet has write ahead logs. The second problem was that the TGW has a lot of cases it handles, and I did not want to go down the path of duplicating a subset of this functionality. So decided to make ondemand hosting work more directly w/ the TGW in order to lower latency.

Closed by #4581?