FeatureRelevance.jl

A package for scoring and selecting relevant features for one or more target variables.

Primary language: Julia. License: MIT.

Quickstart

Tip: Start Julia with --threads N (for example, julia --threads 4) to make use of multithreading.
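To confirm how many threads the current session actually has, a quick check using only Base may help. This is a small sketch and does not depend on FeatureRelevance.jl:

```julia
using Base.Threads: nthreads

# Number of threads available to this Julia session.
# This is 1 unless Julia was started with --threads N
# or the JULIA_NUM_THREADS environment variable was set.
println("Julia threads: ", nthreads())
```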

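# Synthetic example data: two targets and five candidate features.
n = 10000
y = (y1=randn(n), y2=rand(n))   # y1 is standard normal, y2 is uniform on [0, 1)
X = (;
    x1 = y.y1 .+ 0.05randn(n),  # y1 plus a little Gaussian noise
    x2 = y.y2 .+ 0.1randn(n),   # y2 plus Gaussian noise
    xsin = sin.(y.y1),          # nonlinear transform of y1
    xcos = cos.(y.y2),          # nonlinear transform of y2
    xrand = randn(n),           # pure noise, unrelated to either target
)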

Load the report function and the GreedyMRMR selection algorithm. We'll also load DataFrames.jl to make the inputs and results easier to read.

using FeatureRelevance: report, GreedyMRMR
using DataFrames

Wrap the features and targets defined above in DataFrames.

targets = DataFrame(y)
features = DataFrame(X)

Finally, generate a report of the top 5 non-redundant features for each target variable.

report(GreedyMRMR(; n=5, positive=true), features, targets) |> DataFrame
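
For intuition about what "non-redundant" means here, the following is a minimal sketch of the general greedy minimum-redundancy-maximum-relevance (mRMR) idea. It is not the package's implementation: greedy_mrmr_sketch is a hypothetical helper, and absolute Pearson correlation is assumed as a stand-in for both the relevance and the redundancy measure (mRMR is more commonly described in terms of mutual information). At each step it picks the candidate with the highest relevance to the target minus its mean redundancy with the features already selected.

```julia
using Statistics: cor, mean
using Random: seed!

# Hypothetical helper for illustration only; not part of FeatureRelevance.jl.
# Relevance  = |cor(feature, target)|
# Redundancy = mean |cor(feature, s)| over already selected features s
function greedy_mrmr_sketch(features::NamedTuple, target::AbstractVector;
                            n::Int=length(features))
    remaining = collect(keys(features))
    relevance = Dict(name => abs(cor(features[name], target)) for name in remaining)
    selected = Symbol[]
    while !isempty(remaining) && length(selected) < n
        best, best_score = first(remaining), -Inf
        for name in remaining
            redundancy = isempty(selected) ? 0.0 :
                mean(abs(cor(features[name], features[s])) for s in selected)
            score = relevance[name] - redundancy
            if score > best_score
                best, best_score = name, score
            end
        end
        push!(selected, best)           # keep the best-scoring candidate
        filter!(!=(best), remaining)    # and remove it from the pool
    end
    return selected
end

# Small synthetic example in the spirit of the Quickstart data.
seed!(1)
m = 10_000
y1 = randn(m)
candidates = (;
    x1 = y1 .+ 0.05 .* randn(m),  # nearly a copy of the target
    xsin = sin.(y1),              # related to the target, but also to x1
    xrand = randn(m),             # pure noise
)
selected = greedy_mrmr_sketch(candidates, y1; n=2)
println(selected)
```

The first pick is the feature most correlated with the target. Later picks are penalised for overlapping with earlier ones, which is why a feature that is strongly related to the target can still rank low once a near-duplicate has been selected.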