Question regarding reproduction of results

Question

Closed this issue 9 months ago · 3 comments

Hello,
Thanks for sharing the code, and congratulations on your publication. I have been trying to reproduce your results on CIFAR-100, but I am not getting the reported average accuracy of 89.96. Here is the config file: https://pastebin.com/FzdDxBD7

Here is the log file: https://pastebin.com/qFMv3kW4

Looking forward to hearing from you :)

Answer 1 · 2023-10-16T02:31:59.000Z

Hi @gulzainali98,

Thank you for your interest in our work.

After reviewing the config file and log file you provided, it seems that you have set `module.num_emas=0` in your config file (https://pastebin.com/FzdDxBD7). Please set `module.num_emas=1` (line 65) to reproduce the experiment.

If you encounter any further issues or have additional questions, please don't hesitate to reach out.

Best regards,

Qiankun Gao

Answer 2 · 2023-11-03T14:39:25.000Z

Here are the new logs on class order 1 with num_emas set to 1. I cloned your repository and ran it without any changes.
https://pastebin.com/mSY6s5Vs

My question is which value in the logs corresponds to \hat{A}. We get the following final accuracies:

Acc: 85.43
Global Per Task Accs: 85.70, 82.30, 87.10, 84.40, 87.00, 78.60, 85.30, 85.00, 89.80, 89.10
Global Task Accs Avg: 85.43
Local Per Task Accs: 99.40, 98.00, 97.20, 97.80, 97.80, 96.40, 96.60, 98.80, 98.90, 99.10

From the formula given in the paper, \hat{A} appears to be the average global task accuracy. However, the accuracy of the last task (task 9) seems to be closer to the \hat{A} reported in the paper. Looking forward to your answer.

Thanks :)

Answer 3 · 2023-12-10T03:18:02.000Z

Hi @gulzainali98,

I'm happy to provide some clarification on the terms you've encountered.

Acc: This refers to $A_{10}$ as mentioned in the paper, indicating the accuracy measured across all 10 tasks. This accuracy is calculated over the entire set of 100 classes.
Global Per Task Accs: These are the accuracies for each of the 10 tasks when tested individually. Here, classification is performed across all 100 classes too.
Global Task Accs Avg: This is the average of the 10 values listed in Global Per Task Accs.
Local Per Task Accs: These accuracies are also calculated for each of the 10 tasks individually. However, unlike Global Per Task Accs, the classification here is limited to the 10 classes relevant to each specific task.
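
As a rough illustration of the difference between the global and local numbers, here is a minimal Python sketch on toy data. It is not the repository's evaluation code: `argmax`, `task_accuracy`, and the 4-class, 2-task setup are hypothetical names and values chosen for illustration. The only figures taken from this thread are the ten Global Per Task Accs, which are averaged at the end to show how Global Task Accs Avg follows from them.

```python
def argmax(scores, candidates):
    """Return the candidate class index with the highest score."""
    return max(candidates, key=lambda c: scores[c])


def task_accuracy(all_scores, labels, task_classes, num_classes, local):
    """Accuracy (in %) on the test samples of one task.

    local=False: predict over all classes seen so far ("Global").
    local=True:  predict only over the task's own classes ("Local").
    """
    candidates = list(task_classes) if local else list(range(num_classes))
    correct = sum(argmax(s, candidates) == y for s, y in zip(all_scores, labels))
    return 100.0 * correct / len(labels)


# Toy setup (hypothetical): 4 classes split into 2 tasks, classes {0, 1} and {2, 3}.
# Two test samples belonging to the first task.
scores = [
    [0.10, 0.50, 0.90, 0.00],  # true class 1, but class 2 scores highest overall
    [0.80, 0.10, 0.05, 0.05],  # true class 0, correct in both settings
]
labels = [1, 0]

global_acc = task_accuracy(scores, labels, [0, 1], num_classes=4, local=False)
local_acc = task_accuracy(scores, labels, [0, 1], num_classes=4, local=True)
print(global_acc, local_acc)

# Global Per Task Accs reported earlier in this thread.
global_per_task = [85.70, 82.30, 87.10, 84.40, 87.00,
                   78.60, 85.30, 85.00, 89.80, 89.10]
global_avg = round(sum(global_per_task) / len(global_per_task), 2)
print(global_avg)
```

In the toy data the first sample is misclassified only when classes from the other task are allowed as predictions, which is why local accuracies are typically higher than global ones.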

To calculate $\bar{A}$, use the command `grep 'Acc:' /path/to/log.txt` to extract the Acc value for each of the 10 learning stages. The average of these 10 Acc values gives you $\bar{A}$.
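
In other words, $\bar{A} = \frac{1}{10}\sum_{t=1}^{10} A_t$, where $A_t$ is the Acc value logged after learning stage $t$. The same extraction and averaging can be sketched in Python. This assumes each stage writes a line containing `Acc: <number>` in the same format as the lines quoted above; the function names and the three-stage log excerpt are hypothetical, and the values in the excerpt are made up.

```python
import re

# Matches "Acc: <number>" but not "Accs: ...", mirroring grep 'Acc:'.
ACC_PATTERN = re.compile(r"Acc:\s*([0-9]+(?:\.[0-9]+)?)")


def stage_accuracies(log_text):
    """Collect the Acc value from every line that contains 'Acc:'."""
    accs = []
    for line in log_text.splitlines():
        match = ACC_PATTERN.search(line)
        if match:
            accs.append(float(match.group(1)))
    return accs


def average_incremental_accuracy(log_text):
    """Average the per-stage Acc values found in the log."""
    accs = stage_accuracies(log_text)
    if not accs:
        raise ValueError("no 'Acc:' lines found in the log")
    return sum(accs) / len(accs)


# Hypothetical three-stage log excerpt; the values are made up for illustration.
example_log = """\
Acc: 98.00
Global Per Task Accs: 98.00
Global Task Accs Avg: 98.00
Acc: 92.00
Global Per Task Accs: 93.00, 91.00
Global Task Accs Avg: 92.00
Acc: 86.00
"""

print(stage_accuracies(example_log))
print(average_incremental_accuracy(example_log))
```

For a real run, the log contents could be read with `open("/path/to/log.txt").read()` and passed to `average_incremental_accuracy`. It is worth checking that exactly 10 values are extracted before trusting the average.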

I hope this helps clarify your doubts. If you have any more questions or need further assistance, feel free to reach out.

Best regards,

Qiankun Gao