Mahedi-61/MIU_CS_411

Need a suggestion

Closed this issue · 2 comments

I have created 3 new variables by combining some related variables, as shown in the picture.
Screenshot (15)

I need some suggestions now:

  1. Should I drop the variables that were used to create the new variables?
  2. Is it necessary to drop them?
  3. Whether I drop them or not, what effect will that have on my processing?

There is no general rule for that. It depends on your knowledge of the problem domain.

You can keep all the feature variables if the feature dimension is not too high.
For example, the total number of full baths is still important even though you now have a total-number-of-baths variable.

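However, there is a metric called the Pearson correlation coefficient, which measures how strongly two variables are linearly related (a value between -1 and 1). You can use it as a statistical indicator of how relevant each variable is to the target, and of how redundant two features are with each other. From these coefficient values you can decide which features to keep and which to drop.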
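As a rough illustration of that check, here is a minimal sketch in plain Python. The column names (`full_bath`, `half_bath`, `total_bath`, `sale_price`) and all the numbers are made up for the example and are not taken from your dataset; the 0.5 weight for half baths is also just an assumption. With pandas, `DataFrame.corr()` computes the same Pearson coefficient by default.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (illustrative values only).
full_bath = [1, 2, 2, 3, 1, 2]
half_bath = [0, 1, 0, 1, 1, 0]
sale_price = [120, 210, 190, 320, 140, 200]

# Combined feature built from the two source columns (assumed weighting).
total_bath = [f + 0.5 * h for f, h in zip(full_bath, half_bath)]

features = {
    "full_bath": full_bath,
    "half_bath": half_bath,
    "total_bath": total_bath,
}

# Correlation of each feature with the target: a rough relevance signal.
target_corr = {name: pearson(col, sale_price) for name, col in features.items()}

# Correlation between a source feature and the combined one: a redundancy signal.
redundancy = pearson(full_bath, total_bath)

for name, r in target_corr.items():
    print(f"{name} vs sale_price: r = {r:.2f}")
print(f"full_bath vs total_bath: r = {redundancy:.2f}")
```

If a source variable is almost perfectly correlated with the combined one and adds little correlation with the target, that is an argument for dropping it. Keep in mind that Pearson only captures linear relationships, so treat it as a guide rather than a rule.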

OK sir, thank you so much. I will check the PDF.