apache/accumulo

Tablet with lots of files may not be readable on scan servers for long periods of time.

keith-turner opened this issue · 1 comments

Describe the bug

Tablets can only be scanned when they have fewer than a configurable number of files. Scan servers remember a tablet's files for a configurable time period. If a tablet has too many files to scan, it may remain unreadable on a scan server for the scan server's configured period for remembering tablet files, even if the tablet is compacted in the meantime.
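
The suspected sequence can be modeled with a small, self-contained sketch. This is a toy model and not Accumulo code: the class, fields, and constants (`StaleTabletCacheSketch`, `MAX_FILES_FOR_SCAN`, `CACHE_EXPIRATION_MS`) are hypothetical names chosen for illustration, and the real scan server may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the suspected problem. None of these names are Accumulo APIs. */
class StaleTabletCacheSketch {
  // Hypothetical stand-ins for the two configurable values described above.
  static final int MAX_FILES_FOR_SCAN = 20;
  static final long CACHE_EXPIRATION_MS = 10 * 60_000L; // 10 minutes

  /** Cached file count plus the time at which it was read. */
  static final class CachedEntry {
    final int fileCount;
    final long readTimeMs;

    CachedEntry(int fileCount, long readTimeMs) {
      this.fileCount = fileCount;
      this.readTimeMs = readTimeMs;
    }
  }

  // Stand-in for the metadata table: tablet name -> current number of files.
  final Map<String, Integer> metadata = new HashMap<>();
  // Stand-in for the scan server's cache of tablet files.
  final Map<String, CachedEntry> cache = new HashMap<>();

  /** Returns true if a scan would be allowed, judged only by the cached file count. */
  boolean canScan(String tablet, long nowMs) {
    CachedEntry entry = cache.get(tablet);
    if (entry == null || nowMs - entry.readTimeMs >= CACHE_EXPIRATION_MS) {
      // Cache miss or expired entry: re-read the file count from "metadata".
      entry = new CachedEntry(metadata.getOrDefault(tablet, 0), nowMs);
      cache.put(tablet, entry);
    }
    return entry.fileCount < MAX_FILES_FOR_SCAN;
  }

  public static void main(String[] args) {
    StaleTabletCacheSketch server = new StaleTabletCacheSketch();
    server.metadata.put("t1", 25);                       // tablet has too many files
    System.out.println(server.canScan("t1", 0L));        // rejected, count is cached
    server.metadata.put("t1", 1);                        // a compaction reduces the files
    System.out.println(server.canScan("t1", 60_000L));   // one minute later: stale cache
    System.out.println(server.canScan("t1", 600_000L));  // after expiration: re-read
  }
}
```

In this model the second call is still rejected even though the tablet now has a single file, because the decision is made from the cached count. Whether the real scan server behaves this way is what the proposed test would need to confirm.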

This is a suspected bug; a test needs to be written to confirm it.

To Reproduce

The following outlines a possible test to confirm this problem.

  • Configure max files for scan to 20
  • Somehow temporarily disable compactions on a test table
  • Create more than 20 files on a tablet in the test table
  • Set the scan server tablet metadata cache timeout to 10 mins
  • Attempt to scan the tablet with more than 20 files on a scan server
  • Wait a little bit
  • Enable compactions to reduce the number of files
  • Observe what happens to the scan that was started before the compaction

Expected behavior

Ideally, soon after a tablet is compacted to fewer than the max files for scan, it would become readable on a scan server.
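
One possible way to get this behavior, shown only as a hedged sketch, is to re-read a tablet's files before rejecting a scan whenever the rejection would be based on cached data. The names below (`RefreshOnTooManyFilesSketch`, `readMetadata`, `MAX_FILES_FOR_SCAN`) are hypothetical and are not Accumulo APIs. Time-based expiration is left out to keep the example short, and this is not necessarily how the issue would actually be fixed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one possible remedy. None of these names are Accumulo APIs. */
class RefreshOnTooManyFilesSketch {
  static final int MAX_FILES_FOR_SCAN = 20; // hypothetical limit

  // Stand-in for the metadata table: tablet name -> current number of files.
  final Map<String, Integer> metadata = new HashMap<>();
  // Stand-in for the scan server's cached view of each tablet's file count.
  final Map<String, Integer> cachedFileCounts = new HashMap<>();
  // Counts metadata reads so the extra cost of refreshing is visible.
  int metadataReads = 0;

  /** Reads the current file count from "metadata" and updates the cache. */
  private int readMetadata(String tablet) {
    metadataReads++;
    int count = metadata.getOrDefault(tablet, 0);
    cachedFileCounts.put(tablet, count);
    return count;
  }

  /** Returns true if a scan would be allowed. */
  boolean canScan(String tablet) {
    Integer cached = cachedFileCounts.get(tablet);
    int count = (cached != null) ? cached : readMetadata(tablet);
    if (count >= MAX_FILES_FOR_SCAN && cached != null) {
      // The rejection would rest on cached data, so refresh once before failing.
      count = readMetadata(tablet);
    }
    return count < MAX_FILES_FOR_SCAN;
  }
}
```

The trade-off in this sketch is one extra metadata read for each scan attempt against a tablet that the cache believes has too many files. Scans of tablets under the limit still use the cached value.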