hfawaz/InceptionTime

Some questions about the UCR archive

peter943 opened this issue · 8 comments

Hi
I have read your paper and other related work (such as Rocket, TS-CHIEF, Hive-COTE) on TSC for the UCR archive these days. Judging from the results in the Rocket paper, your work is one of the most accurate models.
One of the motivations in your paper is reducing the computational complexity of the model, which is also a goal of Rocket. As you can see, the performance difference between Rocket and InceptionTime is small. It seems that reducing computational complexity while maintaining accuracy is a research direction in recently published papers on the UCR archive. I am unsure whether accuracy can still be improved in a statistically significant way; constructing a more complex network may be a good choice. Could you please give me some suggestions? Thanks a lot.

Indeed, reducing computational complexity was previously neglected for Time Series Classification. I suggest that you look into methods from other domains, such as Computer Vision, and see how to adapt them to time series classification.

@hfawaz Thanks for your quick reply. I have another question about TSC on the UCR archive from reading papers. Some papers choose the model corresponding to the maximum testing accuracy, while others choose the model corresponding to the minimum training loss. As you can see, the former uses the testing set for model selection, while the latter risks overfitting the training set. What do you think about these two different choices?
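For concreteness, here is a minimal Keras sketch of the two selection strategies I mean. The model and data are toy placeholders of my own, not taken from InceptionTime or any of the papers:

```python
import numpy as np
from tensorflow import keras

# Toy stand-ins for a real UCR dataset and classifier (placeholder names).
rng = np.random.default_rng(0)
x_train = rng.random((100, 128, 1))
y_train = keras.utils.to_categorical(rng.integers(0, 2, 100), 2)
x_test = rng.random((50, 128, 1))
y_test = keras.utils.to_categorical(rng.integers(0, 2, 50), 2)

model = keras.Sequential([
    keras.layers.Input(shape=(128, 1)),
    keras.layers.Conv1D(8, 3, activation="relu"),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Choice 1: keep the weights with the minimum *training* loss.
# The test set is never touched, but the chosen model may overfit the train set.
ckpt_train_loss = keras.callbacks.ModelCheckpoint(
    "best_train_loss.keras", monitor="loss", mode="min", save_best_only=True)
model.fit(x_train, y_train, epochs=10, verbose=0,
          callbacks=[ckpt_train_loss])

# Choice 2: keep the weights with the maximum *testing* accuracy.
# Passing the test set as validation data leaks it into model selection.
ckpt_test_acc = keras.callbacks.ModelCheckpoint(
    "best_test_acc.keras", monitor="val_accuracy", mode="max",
    save_best_only=True)
model.fit(x_train, y_train, epochs=10, verbose=0,
          validation_data=(x_test, y_test),
          callbacks=[ckpt_test_acc])
```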

Indeed, this is a very important point. I suggest you take a look at the discussion here and let me know if you have more questions.

@hfawaz Thanks again. I have read the discussion between you and crocodilegogogo. In your reply, you say:
"It is true that the test set should not be used for validating any hyperparameter.
This is a known problem with any open benchmark dataset, where you risk overfitting the public archive.
However, in our paper, we did the ablation study in order to visualize the effect of removing one component from Inception."

I want to make sure: does Section 6 (Architectural Hyperparameter Study) of your paper belong to ablation study or to hyperparameter searching?

Indeed, you are right: the architectural study was performed in order to visualize the effect of changing the hyperparameters, but one could argue that we did the architectural study before designing InceptionTime.
There is no way to prove that we did not do that, hence my argument that any time series classification algorithm potentially overfits the UCR archive.
Hope this clarifies things.

Hi Fawaz,
Thank you for your patient advice; I am glad to see someone holding similar confusions to mine. Actually, I also have the same doubts about whether the 'ablation study' experiments play the role of 'hyperparameter selection'. I have no intention of questioning the contribution of your article; I am just wondering whether we can use the 'ablation experiments' in your article as a standard for deep learning ablation experiments. Considering that taking the test sets as validation sets is prohibited by Bagnall, deep learners in TSC lack not only appropriate methods for model selection, but also schemes for 'hyperparameter selection' and 'ablation study'.

Yes, I understand that it is not a straightforward decision. I think if you would like to do hyperparameter selection, you would be better off splitting the training set into train/validation sets and keeping the test set intact. However, this is expensive for an archive with 128 datasets; but if you have your own dataset, I think you can afford such a scheme.
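A minimal sketch of such a scheme, with placeholder data and a generic scikit-learn classifier standing in for whatever model you actually use:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for one UCR dataset's official train split.
rng = np.random.default_rng(0)
x_train = rng.random((100, 128))
y_train = rng.integers(0, 2, 100)

# Carve a validation set out of the training set; the official test set
# is never touched during hyperparameter selection.
x_tr, x_val, y_tr, y_val = train_test_split(
    x_train, y_train, test_size=0.2, stratify=y_train, random_state=0)

# Hypothetical hyperparameter grid; any model and hyperparameter would do.
best_acc, best_n = -1.0, None
for n_estimators in (50, 100, 200):
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=0)
    clf.fit(x_tr, y_tr)
    acc = clf.score(x_val, y_val)
    if acc > best_acc:
        best_acc, best_n = acc, n_estimators

# Retrain on the full training set with the chosen hyperparameter,
# then evaluate exactly once on the untouched test set.
final_clf = RandomForestClassifier(n_estimators=best_n, random_state=0)
final_clf.fit(x_train, y_train)
```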

@hfawaz Thanks for your patient reply, it helps me a lot.