- consolidate the images: find every image format with glob, then move them all into a single folder
- find duplicate images in the dataset with pHash
- build a model to check whether a certificate is real or not -> a simple CNN would do, I guess?
- training pipeline in Kedro
- all, full size, >90% accuracy on sertif_clf
- resized, non-transparent, converted to 512x512
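The image consolidation item at the top of this list could be sketched roughly as follows. This is only a minimal sketch using the standard library: the extension set `IMAGE_EXTS`, the function name `consolidate_images`, and the `_1`, `_2` renaming on name collisions are all assumptions for illustration, not something fixed in these notes. It uses `Path.rglob`, which does the same glob-style matching as the `glob` module.

```python
import shutil
from pathlib import Path

# Assumed set of extensions; adjust to whatever formats the dataset contains.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}


def consolidate_images(src_root, dest_dir):
    """Move every image found under src_root into the single folder dest_dir."""
    src_root, dest_dir = Path(src_root), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # List everything first so that moving files does not disturb the walk.
    candidates = sorted(
        p for p in src_root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )

    moved = []
    for path in candidates:
        if path.parent.resolve() == dest_dir.resolve():
            continue  # already in the destination folder
        target = dest_dir / path.name
        counter = 1
        # Hypothetical collision rule: append _1, _2, ... to the file stem.
        while target.exists():
            target = dest_dir / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Renaming on collision (instead of overwriting) keeps same-named files from different source folders, which matters if the pHash duplicate check is supposed to run afterwards on the full set.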
training is actually done; first-batch evaluation results:
- log loss didn't converge
fixes to do:
- certificate classifier
- no need to dedupe, 'somewhat similar' is fine -> set the threshold to be >=2
- re-label the certificate classifier images, using better criteria
- watchdog for when the model has finished training -> upload to drive
- watchdog so that only 2 saved models are kept
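The "keep only 2 models" item could be handled by a small cleanup helper that the watcher calls after each save. The following is a rough sketch under assumptions: the function name `prune_checkpoints`, the `*.pt` file pattern, and "newest by modification time" as the rule for which models survive are all hypothetical, since the notes do not say how checkpoints are named or ranked.

```python
from pathlib import Path


def prune_checkpoints(model_dir, keep=2, pattern="*.pt"):
    """Delete all but the `keep` most recently modified files matching pattern."""
    # Newest first, judged by modification time (an assumed ranking rule).
    files = sorted(
        Path(model_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = []
    for old in files[keep:]:
        old.unlink()
        removed.append(old)
    return removed
```

If the best model should be kept rather than the newest, the sort key would need to read a validation metric instead of `st_mtime`.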