orpatashnik/StyleCLIP

Error in using AFHQWild

sarmientoj24 opened this issue · 1 comments

I am getting this error when using afhwild dataset

G_main                        Params    OutputShape         WeightShape     
---                           ---       ---                 ---             
latents_in                    -         (?, 512)            -               
labels_in                     -         (?, 0)              -               
G_mapping/Normalize           -         (?, 512)            -               
G_mapping/Dense0              262656    (?, 512)            (512, 512)      
G_mapping/Dense1              262656    (?, 512)            (512, 512)      
G_mapping/Dense2              262656    (?, 512)            (512, 512)      
G_mapping/Dense3              262656    (?, 512)            (512, 512)      
G_mapping/Dense4              262656    (?, 512)            (512, 512)      
G_mapping/Dense5              262656    (?, 512)            (512, 512)      
G_mapping/Dense6              262656    (?, 512)            (512, 512)      
G_mapping/Dense7              262656    (?, 512)            (512, 512)      
G_mapping/Broadcast           -         (?, 16, 512)        -               
dlatent_avg                   -         (512,)              -               
Truncation/Lerp               -         (?, 16, 512)        -               
G_synthesis/4x4/Const         8192      (?, 512, 4, 4)      (1, 512, 4, 4)  
G_synthesis/4x4/Conv          2622465   (?, 512, 4, 4)      (3, 3, 512, 512)
G_synthesis/4x4/ToRGB         264195    (?, 3, 4, 4)        (1, 1, 512, 3)  
G_synthesis/8x8/Conv0_up      2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Conv1         2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Upsample      -         (?, 3, 8, 8)        -               
G_synthesis/8x8/ToRGB         264195    (?, 3, 8, 8)        (1, 1, 512, 3)  
G_synthesis/16x16/Conv0_up    2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Conv1       2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Upsample    -         (?, 3, 16, 16)      -               
G_synthesis/16x16/ToRGB       264195    (?, 3, 16, 16)      (1, 1, 512, 3)  
G_synthesis/32x32/Conv0_up    2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Conv1       2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Upsample    -         (?, 3, 32, 32)      -               
G_synthesis/32x32/ToRGB       264195    (?, 3, 32, 32)      (1, 1, 512, 3)  
G_synthesis/64x64/Conv0_up    2622465   (?, 512, 64, 64)    (3, 3, 512, 512)
G_synthesis/64x64/Conv1       2622465   (?, 512, 64, 64)    (3, 3, 512, 512)
G_synthesis/64x64/Upsample    -         (?, 3, 64, 64)      -               
G_synthesis/64x64/ToRGB       264195    (?, 3, 64, 64)      (1, 1, 512, 3)  
G_synthesis/128x128/Conv0_up  1442561   (?, 256, 128, 128)  (3, 3, 512, 256)
G_synthesis/128x128/Conv1     721409    (?, 256, 128, 128)  (3, 3, 256, 256)
G_synthesis/128x128/Upsample  -         (?, 3, 128, 128)    -               
G_synthesis/128x128/ToRGB     132099    (?, 3, 128, 128)    (1, 1, 256, 3)  
G_synthesis/256x256/Conv0_up  426369    (?, 128, 256, 256)  (3, 3, 256, 128)
G_synthesis/256x256/Conv1     213249    (?, 128, 256, 256)  (3, 3, 128, 128)
G_synthesis/256x256/Upsample  -         (?, 3, 256, 256)    -               
G_synthesis/256x256/ToRGB     66051     (?, 3, 256, 256)    (1, 1, 128, 3)  
G_synthesis/512x512/Conv0_up  139457    (?, 64, 512, 512)   (3, 3, 128, 64) 
G_synthesis/512x512/Conv1     69761     (?, 64, 512, 512)   (3, 3, 64, 64)  
G_synthesis/512x512/Upsample  -         (?, 3, 512, 512)    -               
G_synthesis/512x512/ToRGB     33027     (?, 3, 512, 512)    (1, 1, 64, 3)   
---                           ---       ---                 ---             
Total                         30276583                                      

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-90-9adc47bb73f0> in <module>()
     24 model, preprocess = clip.load("ViT-B/32", device=device)
     25 
---> 26 M=Manipulator(dataset_name='afhqwild')
     27 fs3=np.load('./npy/afhqwild/fs3.npy')
     28 np.set_printoptions(suppress=True)

3 frames
/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1154                 'Cannot feed value of shape %r for Tensor %r, '
   1155                 'which has shape %r' %
-> 1156                 (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
   1157           if not self.graph.is_feedable(subfeed_t):
   1158             raise ValueError('Tensor %s may not be fed.' % subfeed_t)

ValueError: Cannot feed value of shape (1, 16, 512) for Tensor 'G_synthesis_1/dlatents_in:0', which has shape '(?, 14, 512)'

I believe this problem came from training options for encoder4editing. The second dimension of latent output is calculated by

log_size = int(math.log(opts.stylegan_size, 2))
        self.style_count = 2 * log_size - 2

It seemed that you need (1, 14, 512) shaped latent for input, so please set your stylegan_size as 256, then it will be fixed.