feature-engine/feature_engine

Nested directory structure of tests and feature_engine itself

Closed this issue · 2 comments

What was the rationale behind the nested directory structure of the tests?

I think flattening that hierarchy would make the code easier to navigate for new contributors. I also find it easy to get lost inside the current hierarchy.

Here is the screenshot of that nested hierarchy:
image

Here, the module under test is feature_engine/encoding, which itself is not nested. If we flatten the test hierarchy, it will at least resemble the other directories.

Regarding feature_engine itself, the timeseries module contains a nested directory. It is the only module with that structure; all the others are flat.

Here is the screenshot:

image

What about flattening this too?

Hey @Okroshiashvili

Time series is nested because I was planning ahead. With time series you can create features for forecasting, and features for regression or classification. At the moment we do not have functionality for the latter, but I created the subfolder with an eye to what I'd like to add next.

Having said this, for the latter we already have tsfresh, so I am not too sure how much more value we could add with feature-engine. I guess I will know when I start working with time series.

So we could flatten this directory, and then when/if we add more functionality for regression or classification, we could nest it back. I'd do this as long as the imports do not change. Otherwise it'll upset users.
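One way to flatten a directory without changing imports is to leave a thin shim at the old import path that re-exports from the new location. The sketch below only illustrates that idea and is not feature-engine's actual layout: `demo_pkg`, `lag.py`, and `DemoTransformer` are hypothetical names, and the package is built in a temporary directory so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_flattened_package(root: Path) -> None:
    """Write a tiny hypothetical package to disk (all names are made up)."""
    ts = root / "demo_pkg" / "timeseries"
    ts.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("")
    (ts / "__init__.py").write_text("")
    # After flattening, the implementation sits directly under timeseries/.
    (ts / "lag.py").write_text("class DemoTransformer:\n    pass\n")
    # Shim module: keeps the old import path alive by re-exporting the class.
    (ts / "forecasting.py").write_text(
        "from demo_pkg.timeseries.lag import DemoTransformer\n"
        "__all__ = ['DemoTransformer']\n"
    )


tmp_dir = tempfile.mkdtemp()
build_flattened_package(Path(tmp_dir))
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# The old path (via the shim) and the new flat path resolve to the same class.
old_path = importlib.import_module("demo_pkg.timeseries.forecasting")
new_path = importlib.import_module("demo_pkg.timeseries.lag")
same_class = old_path.DemoTransformer is new_path.DemoTransformer
print(same_class)
```

Because the shim only re-exports, user code written against the old path keeps working, and the shim can be turned back into a subpackage later if the nested layout returns.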

Regarding encoding, I divided it into subfolders because I have more than one Python file for the base encoders, and I found that confusing. There is one script per base class, and one of them is already a bit lengthy, so I'm not sure it would help to join the three together.

Anyhow, I don't have strong feelings; I'd be happy to flatten this directory as well. If you pick this up, be mindful that we need to adjust the toctrees in the docs. I can point you to the relevant file if necessary.

Thank you!

Thanks @solegalli for the clarification. If you plan to add new features/functionality, then I guess it's much better to keep the current structure rather than doing things twice, considering that flattening may break imports.