naturomics/CapsLayer

Performance issues in /capslayer/data/datasets (by P3)

Hello! I've found a performance issue in /capslayer/data/datasets: batch() should be called before map(), so that the mapped function runs once per batch instead of once per element, which could make your program more efficient. The TensorFlow documentation supports this.

The affected files are listed below:

  • /cifar10/reader.py: dataset.batch(batch_size) should be called before dataset.map(parse_fun).
  • /fashion_mnist/reader.py: dataset.batch(batch_size) should be called before dataset.map(parse_fun).
  • /mnist/reader.py: dataset.batch(batch_size) should be called before dataset.map(parse_fun).
  • /cifar100/reader.py: dataset.batch(batch_size) should be called before dataset.map(parse_fun).
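The reordering suggested for each reader could look roughly like this. It is a minimal sketch, not the CapsLayer code: the toy dataset, the body of `parse_fun`, and the `batch_size` value are assumptions made for illustration; only the `batch()` and `map()` calls come from the readers.

```python
import tensorflow as tf

batch_size = 4  # hypothetical value, for illustration only


def parse_fun(x):
    # Hypothetical stand-in for a reader's parse function: an elementwise
    # cast-and-scale, which behaves the same on a scalar or a batch.
    return tf.cast(x, tf.float32) / 255.0


raw = tf.data.Dataset.from_tensor_slices(tf.range(8, dtype=tf.int32))

# Current order: parse_fun is invoked once per element.
map_then_batch = raw.map(parse_fun).batch(batch_size)

# Suggested order: parse_fun is invoked once per batch.
batch_then_map = raw.batch(batch_size).map(parse_fun)

before = [b.numpy() for b in map_then_batch]
after = [b.numpy() for b in batch_then_map]
```

With an elementwise function like this one, both orders yield the same batches; the second simply calls the function fewer times.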

Besides, you need to check whether the function passed to map() (e.g., parse_fun in dataset.map(parse_fun)) is affected by the change, so that the modified code still works properly. For example, if parse_fun expects input with shape (x, y, z) before the fix, it will receive input with shape (batch_size, x, y, z) after it.
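To illustrate the shape caveat, here is a hedged sketch of a shape-dependent parse function and a batched counterpart. `parse_single`, `parse_batched`, and the toy image tensor are hypothetical names and data, not taken from the CapsLayer readers.

```python
import tensorflow as tf


def parse_single(image):
    # Expects one image of shape (x, y, z); subtracts that image's mean.
    image = tf.cast(image, tf.float32)
    return image - tf.reduce_mean(image)


def parse_batched(images):
    # Expects a batch of shape (batch_size, x, y, z). The mean is taken
    # per image by reducing over every axis except the batch axis.
    images = tf.cast(images, tf.float32)
    mean = tf.reduce_mean(images, axis=[1, 2, 3], keepdims=True)
    return images - mean


# Four toy "images" of shape (2, 2, 3).
images = tf.reshape(tf.range(4 * 2 * 2 * 3, dtype=tf.int32), (4, 2, 2, 3))
ds = tf.data.Dataset.from_tensor_slices(images)

old = ds.map(parse_single).batch(2)   # map() before batch()
new = ds.batch(2).map(parse_batched)  # batch() before map()

old_out = tf.concat(list(old), axis=0).numpy()
new_out = tf.concat(list(new), axis=0).numpy()
```

Passing `parse_single` unchanged to the batched pipeline would still run, but it would subtract one mean computed over the whole batch, so this kind of function needs to be adapted rather than just moved.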

Looking forward to your reply. Btw, I would be glad to open a PR to fix it if you are too busy.

Hello, I'm looking forward to your reply~