ykasten/layered-neural-atlases

How to implement the Multi Foreground Atlases feature?

thiagoambiel opened this issue · 3 comments

Thanks for sharing this amazing code!
I'm trying to implement the Multi Foreground Atlases feature described in Section 4.3 of the arXiv paper, but I can't understand this sentence:

Unlike the one foreground object case, to support occlusions between different foreground objects, the sparsity loss is applied directly on the atlas, by applying 𝑙1 regularization on randomly sampled UV coordinates from foreground regions in the atlas.

What does this mean in practice?
Do I need to apply the standard l1 regularization formulation, error(y, ŷ) + λ * Σ |w|?

If yes:

  • What is the lambda value used for the Lucia results in the paper?
  • Is the error(y, ŷ) term the single-layer sparsity loss (Eq. 14)?
  • Is the |w| term the UV coordinate values given by the multiple foreground mapping models?

If not:

  • How is the sparsity loss calculated, in practice, in the multi-foreground-object case?
  • Please explain with an equation for easier understanding.

Other questions:

  • Do the losses computed for each mapping model (like rigidity_loss and optical_flow_loss) need to be applied to each foreground mapping model and summed at the end?

  • What are the coefficient values used for the user scribble losses in the equations:

    • l_red = -log(alpha_red) (Eq. 20)
    • l_green = -log(alpha_green) (Eq. 21)
  • What is the βtv = 100 variable in Section 3.5?

I would be grateful if you can answer these questions. Thanks!

  • In the case of two separate foreground layers, we apply sparsity constraints by encouraging the foreground atlases to have black (zero) colors. This is done by sampling a batch of UV coordinates directly from the continuous UV space (without using the mapping networks). This is how it should look:
random01 = torch.rand(sparsity_batch_size, 2)  # UV samples in [0, 1)^2
half = sparsity_batch_size // 2
random01[:half, 1] = random01[:half, 1] - 1  # half of the batch belongs to foreground 1, the other half to foreground 2
random_rgb_for_sparsity = (model_F_atlas(random01.cuda()) + 1.0) * 0.5  # atlases' color values, mapped to [0, 1]
rgb_loss_negative = random_rgb_for_sparsity.mean()  # L1 sparsity loss (encourages black atlas colors)
  • Yes, the losses that apply to a single foreground layer should be applied separately for each layer when working with multiple ones (see the first sketch after this list).

  • The coefficient value for both scribble losses is 10,000, and it is zeroed out after 10,000 iterations (see the second sketch after this list).

  • beta_tv is not used; you can ignore it.
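
For concreteness, here is a minimal sketch of how the per-layer losses and the atlas sparsity term could be combined in the training loop with two foreground mapping networks. All names below (model_F_mapping_fg1, model_F_mapping_fg2, rigidity_loss, optical_flow_loss, the *_coeff hyperparameters, xyt_batch, flow_batch) are hypothetical illustrations, not the exact identifiers used in this repository:

# Hypothetical sketch: the per-layer losses are computed for every foreground
# mapping network and summed; the atlas sparsity term is added once.
total_loss = 0.0
for model_F_mapping in [model_F_mapping_fg1, model_F_mapping_fg2]:
    total_loss = total_loss + rigidity_coeff * rigidity_loss(model_F_mapping, xyt_batch)
    total_loss = total_loss + flow_coeff * optical_flow_loss(model_F_mapping, xyt_batch, flow_batch)
total_loss = total_loss + sparsity_coeff * rgb_loss_negative  # sparsity term from the snippet above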
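
Similarly, a minimal sketch of the scribble losses (Eqs. 20-21) with the coefficient schedule mentioned above, assuming alpha_red and alpha_green are the predicted alpha values at the red/green scribble pixels and iteration is the current training step (all names hypothetical):

# Hypothetical sketch of the scribble losses and their coefficient schedule.
scribble_coeff = 10000.0 if iteration < 10000 else 0.0  # zeroed out after 10,000 iterations
eps = 1e-6  # small constant for numerical stability (assumption)
loss_red = -torch.log(alpha_red + eps).mean()      # Eq. 20
loss_green = -torch.log(alpha_green + eps).mean()  # Eq. 21
total_loss = total_loss + scribble_coeff * (loss_red + loss_green)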

Hope this helps!

Thanks so much for your reply!
I implemented your example code for sparsity loss and it works!

[Images: foreground_atlas_0 (Foreground Layer 1) and foreground_atlas_1 (Foreground Layer 2); multi_crop (Multi Foreground Layers) vs. single_crop (Single Foreground Layer)]

Now the algorithm supports occlusions between foreground objects, and the foreground atlases are very similar to the ones in the arXiv paper!

For debugging purposes, I still have some questions:

  • What values did you use for sparsity_batch_size and sparsity_coeff for the Lucia results in the paper?
  • When using the scribble losses, did you draw scribbles for each frame of the video, or is drawing scribbles for a fraction of the frames enough?

Again, I would be grateful if you could answer these questions. Thanks!

There's an implementation available at https://github.com/thiagoambiel/NeuralAtlases#multi-layer-foreground-atlases

EDIT: kudos to the OP, who shared the result of his work.