simon-mo/conex

Split layers according to size threshold


Currently, there is a planner pass that is commented out:

conex/src/planner.rs

Lines 99 to 118 in fec621e

// Pass 2: Split and collapse layers so the size is about 512MB.
// let mut new_layer_to_files = Vec::new();
// let mut current_layer_size: usize = 0;
// let mut new_layer = Vec::new();
// for (layer, files) in self.layer_to_files.iter() {
//     for file in files.iter() {
//         if current_layer_size + file.size > 512 * 1024 * 1024 {
//             new_layer_to_files.push((layer.to_owned(), new_layer.clone()));
//             new_layer = Vec::new();
//             current_layer_size = 0;
//         }
//         new_layer.push(file.to_owned());
//         current_layer_size += file.size;
//     }
//     if !new_layer.is_empty() {
//         new_layer_to_files.push((layer.to_owned(), new_layer));
//     }
// }

The task is to uncomment it and fix any compilation and correctness issues associated with it.

The pass should successfully split any layers that are too large. We don't need to worry about combining layers for this task. The threshold needs to be configurable as a CLI parameter for conex push. Assuming a threshold of 10MB, the pass should do the following:

Before (layer sizes in array)    After
[20MB]                           [10MB, 10MB]
[8MB, 7MB]                       [8MB, 7MB]
[12MB, 4MB]                      [10MB, 2MB, 4MB]
[20MB, 12MB]                     [10MB, 10MB, 10MB, 2MB]
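A minimal sketch of one way the split pass could look, not the actual conex implementation. It assumes a layer is closed as soon as its accumulated size reaches the threshold, which is one reading that reproduces the table above as well as the file-ordering examples in this issue. `FileEntry`, `split_layers`, and `layer_sizes` are hypothetical names, and the layer identifier that the commented-out code carries along is left out for brevity. Unlike the commented-out code, the running chunk and its size are reset for every input layer, so nothing is used after being moved and no size leaks across layers.

```rust
/// Hypothetical stand-in for the planner's file record; only `size` matters here.
#[derive(Clone, Debug, PartialEq)]
struct FileEntry {
    path: String,
    size: usize, // bytes
}

/// Split each input layer into chunks, closing a chunk as soon as its
/// accumulated size reaches `threshold` (an assumed policy). Files are never
/// split, and files from different input layers are never combined.
fn split_layers(layers: &[Vec<FileEntry>], threshold: usize) -> Vec<Vec<FileEntry>> {
    let mut out = Vec::new();
    for files in layers {
        // Fresh state per input layer.
        let mut current: Vec<FileEntry> = Vec::new();
        let mut current_size = 0usize;
        for file in files {
            current.push(file.clone());
            current_size += file.size;
            if current_size >= threshold {
                // `std::mem::take` moves the chunk out and leaves an empty Vec behind.
                out.push(std::mem::take(&mut current));
                current_size = 0;
            }
        }
        // Flush whatever is left of this input layer.
        if !current.is_empty() {
            out.push(current);
        }
    }
    out
}

/// Total size of each layer, convenient for comparing against the table.
fn layer_sizes(layers: &[Vec<FileEntry>]) -> Vec<usize> {
    layers
        .iter()
        .map(|layer| layer.iter().map(|f| f.size).sum())
        .collect()
}
```

With a 10MB threshold, a layer holding files of 5MB, 5MB, and 2MB followed by a layer holding a single 4MB file comes out as layers of 10MB, 2MB, and 4MB, matching the third row of the table.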

Of course, this assumes that files are small enough to align with the boundary. This first iteration does not need to split files. For example:

  • With a threshold of 10MB, a layer consisting of files (12MB, 8MB) should be split into two layers, [12MB, 8MB]. A layer consisting of files (8MB, 12MB) should yield a single layer, [20MB], for now.

  • Symbolic links can go anywhere. An absolute link should go into the same layer as the file it references. There should be a pass after this rearrangement to check that this is enforced.
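The verification pass described in the last bullet could be sketched roughly as follows. This is an illustration under assumptions, not conex code: `Entry`, `Kind`, and `misplaced_links` are hypothetical names, the `Colocated` variant stands for whatever the planner treats as a link that must travel with its target, and a target that appears in no layer at all is reported as a violation too.

```rust
use std::collections::HashMap;

/// Hypothetical classification of planner entries.
#[derive(Clone, Debug, PartialEq)]
enum Kind {
    Regular,
    /// Symbolic links are unconstrained and may live in any layer.
    Symlink,
    /// A link that must share a layer with the file at `target`.
    Colocated { target: String },
}

#[derive(Clone, Debug, PartialEq)]
struct Entry {
    path: String,
    kind: Kind,
}

/// Return (link path, target path) for every colocated link whose target is
/// not in the same layer. An empty result means the invariant holds.
fn misplaced_links(layers: &[Vec<Entry>]) -> Vec<(String, String)> {
    // Index every path by the layer it ended up in. If a path occurs in
    // several layers, the last occurrence wins (a simplifying assumption).
    let mut layer_of: HashMap<&str, usize> = HashMap::new();
    for (idx, layer) in layers.iter().enumerate() {
        for entry in layer {
            layer_of.insert(entry.path.as_str(), idx);
        }
    }

    let mut bad = Vec::new();
    for (idx, layer) in layers.iter().enumerate() {
        for entry in layer {
            if let Kind::Colocated { target } = &entry.kind {
                // Missing targets and targets in another layer both fail here.
                if layer_of.get(target.as_str()) != Some(&idx) {
                    bad.push((entry.path.clone(), target.clone()));
                }
            }
        }
    }
    bad
}
```

Running this check right after the split pass, and failing the plan when the returned list is non-empty, would catch a split that separated a link from the file it references.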

While developing this, it would be great to add unit tests for the planner's methods so that the behavior is tested.