EducationalTestingService/rstfinder

ValueError: X has 9371 features, but StandardScaler is expecting 74698 features as input.

mufeili opened this issue · 7 comments

Hi,

When I evaluate a trained discourse parsing model (e.g. using rst_eval rst_discourse_tb_edus_TRAINING_DEV.json -p rst_parsing_model.C1.0 --use_gold_syntax), I get the error in the title.

Since the code uses sparse features, my guess is that the training and test sets end up with different sets of features.
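
To illustrate that guess, here is a minimal pure-Python sketch of how sparse feature dictionaries can produce a different number of columns when the feature-to-column mapping is built from the test data instead of being reused from training. The helper names (build_vocabulary, vectorize) and the toy features are hypothetical; this is not rstfinder or SKLL code, and I have not confirmed that this is what actually happens inside them.

```python
def build_vocabulary(examples):
    """Map each feature name seen in `examples` to a column index."""
    vocabulary = {}
    for features in examples:
        for name in features:
            vocabulary.setdefault(name, len(vocabulary))
    return vocabulary


def vectorize(features, vocabulary):
    """Turn one sparse feature dict into a dense row with len(vocabulary) columns.

    Feature names that are not in the vocabulary are silently dropped.
    """
    row = [0.0] * len(vocabulary)
    for name, value in features.items():
        index = vocabulary.get(name)
        if index is not None:
            row[index] = value
    return row


# Toy data (hypothetical feature names).
train_examples = [
    {"word=the": 1.0, "pos=DT": 1.0},
    {"word=cat": 1.0, "pos=NN": 1.0},
]
test_examples = [{"word=dog": 1.0, "pos=NN": 1.0}]

train_vocab = build_vocabulary(train_examples)
test_vocab = build_vocabulary(test_examples)

# Building the mapping from the test data gives a row of a different width
# than the one the training-time scaler would have been fitted on.
mismatched_row = vectorize(test_examples[0], test_vocab)

# Reusing the training mapping keeps the width consistent.
aligned_row = vectorize(test_examples[0], train_vocab)

print(len(train_vocab), len(mismatched_row), len(aligned_row))
```

If something like this is going on, any step fitted on the training matrix (such as the StandardScaler in the error message) would reject the narrower test matrix.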

@mufeili did you train the model with rstfinder using the instructions?

@desilinguist Thank you for your reply. Yes, I followed the instructions. It seems that the issue does not exist with skll 2.1. So I guess there are some compatibility issues with skll 2.5.
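
In case it helps anyone else, here is a small sketch for checking which skll release is installed before running rst_eval. It only uses importlib.metadata from the standard library; the helper names (installed_version, is_release) are my own, and it assumes the distribution is installed under the name "skll".

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_release(version_string, major, minor):
    """Check whether a version string such as '2.1' or '2.1.0' is major.minor."""
    parts = version_string.split(".")
    try:
        return (int(parts[0]), int(parts[1])) == (major, minor)
    except (IndexError, ValueError):
        return False


skll_version = installed_version("skll")
if skll_version is None:
    print("skll is not installed in this environment")
elif is_release(skll_version, 2, 1):
    print(f"skll {skll_version}: the release that worked for me")
else:
    print(f"skll {skll_version}: not 2.1, so the feature-count error may appear")
```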

Interesting! Yes, it could certainly be that it's a SKLL 2.5 issue since we haven't really tested rstfinder with that yet.

Glad you have a workaround for now. I will try to replicate the issue on my end and see what changes are required.

Thank you for developing and maintaining such a great tool!

Thank you for the kind words! I am glad you find it useful! :)

I'm having this same problem -- just wondering if there are any updates on this issue?

Hi @ashleylew, unfortunately, we have still not gotten around to getting rstfinder to work with SKLL 2.5. Do you still see this issue if you use SKLL 2.1?