Frcwp

A personal-use Python package for outlier detection, built for identifying black-market (fraud) activity at DiDi.

Primary language: Python. License: MIT.

What is Frcwp

Frcwp stands for fast risk control with Python. It is a lightweight tool that automatically recognizes outliers in a large data pool. The project aims to give people an easy method for anomaly recognition, especially for brute-force password attacks. We hope it can be a useful open-source project that reduces the complexity of data feature engineering.

Theory

@blog: Methods for identifying users in risk control

Compared with common methods

We measured correctness on 29 data sets; however, Frcwp was the slowest of the methods compared.

Usage

You can install it from PyPI with pip install Frcwp.

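import pandas as pd
from Frcwp import Frcwp

# Load the training data from a CSV file
path = '../data/data_all.csv'
traindata = pd.read_csv(path)

# Create the detector and convert the DataFrame with changeformat
frc = Frcwp()
traindata = frc.changeformat(traindata, index=0)

# Keyword arguments passed to fit
params = {
    'na_rate': 0.4,
    'single_dealed': 1,
    'is_scale': 0,
    'distince_method': 'Maha',
    'outlier_rate': 0.05,
    'strange_rate': 0.15,
    'nestimators': 150,
    'contamination': 0.2
}

# Fit on the training data
frc.fit(traindata, **params)

# Keyword arguments passed to predict
predict_params = {
    'output': 20,
    'is_whole': 1
}
# Predict on the potentialdata_set attribute of the fitted object
frc.predict(frc.potentialdata_set, **predict_params)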
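The 'distince_method': 'Maha' setting presumably refers to the Mahalanobis distance, and 'outlier_rate' to the fraction of points treated as outliers. The sketch below is not Frcwp's implementation. It is a minimal, standard-library-only illustration of that general idea for 2-D points, and the function names mahalanobis_2d and top_outliers are hypothetical.

```python
import math


def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


def top_outliers(points, outlier_rate=0.05):
    """Return the sorted indices of the most distant points (at least one)."""
    distances = mahalanobis_2d(points)
    k = max(1, int(outlier_rate * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: distances[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical data: a 5x4 grid of ordinary points plus one point far away
points = [(i % 5, i // 5) for i in range(20)] + [(30, -30)]
flagged = top_outliers(points, outlier_rate=0.05)
print(flagged)
```

The covariance here is estimated from all points, including the outliers themselves, which keeps the sketch short but is one reason real tools often combine a distance measure with other detectors.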

Dependence

Frcwp is implemented in Python 3.6 and uses pandas.DataFrame to store data. These packages can be installed easily with pip.
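To confirm that an environment meets these requirements before installing, a small check like the following may help. It is only a sketch: the helper name check_environment is hypothetical, and the package list is limited to pandas because that is the only dependency named here.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6), required=("pandas",)):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Compare the running interpreter against the minimum version
    if sys.version_info < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    # find_spec returns None when a top-level package cannot be found
    for name in required:
        if importlib.util.find_spec(name) is None:
            problems.append("missing package: " + name)
    return problems


for problem in check_environment():
    print(problem)
```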

TODO

  • feature scanning
  • add new outlier detection methods