Command to anonymize sensitive data. This app helps you anonymize data in a database used for development of a Django project.
This app is based on Django-Database-Anonymizer, using Faker to anonymize the values.
Install using pip:
pip install django-hattori
Then add 'hattori' to your INSTALLED_APPS:
INSTALLED_APPS = [
    ...
    'hattori',
]
You should only run the anonymization process in PRE or development environments. To avoid accidents, anonymization is disabled by default.
To enable it, add ANONYMIZE_ENABLED = True to your settings.
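One way to keep the flag off outside PRE and development is to derive it from the deployment environment. The following is a minimal sketch for a settings module; the environment variable name DJANGO_ENVIRONMENT and the helper anonymize_allowed are hypothetical and not part of django-hattori.

```python
import os

# Environments in which anonymization may run (names are assumptions for this sketch).
SAFE_ENVIRONMENTS = {"development", "pre"}


def anonymize_allowed(environment):
    """Return True only for environments where anonymizing data is safe."""
    return environment.strip().lower() in SAFE_ENVIRONMENTS


# DJANGO_ENVIRONMENT is a hypothetical variable; an unset value is treated
# as production, so the flag stays False unless explicitly opted in.
ANONYMIZE_ENABLED = anonymize_allowed(os.environ.get("DJANGO_ENVIRONMENT", "production"))
```

Defaulting to "production" when the variable is missing keeps the safe behaviour described above: anonymization stays off unless the environment opts in.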
How to execute command:
./manage.py anonymize_db
Possible arguments:
-a, --app: The app you want to anonymize. All anonymizers in this app will be run. E.g. anonymize_db -a shop
-m, --models: Comma-separated list of models you want to anonymize. E.g. anonymize_db -m Customer,Product
-b, --batch-size: Batch size used in the bulk_update of the instances. The best value depends on the database machine; the default is 500.
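If the command needs to be triggered from Python (for example from a script or a test), the arguments above can be assembled programmatically. The helper below is a hypothetical sketch, not part of django-hattori; it only builds the argument list from the documented flags. Passing that list to Django's call_command assumes a configured project with 'hattori' installed.

```python
def anonymize_db_args(app=None, models=None, batch_size=None):
    """Build the command-line arguments for anonymize_db from the documented flags."""
    args = []
    if app:
        args += ["--app", app]
    if models:
        # The command expects the model names as one comma-separated value.
        args += ["--models", ",".join(models)]
    if batch_size is not None:
        args += ["--batch-size", str(batch_size)]
    return args


# Inside a configured Django project this could be used as:
#   from django.core.management import call_command
#   call_command("anonymize_db", *anonymize_db_args(app="shop", batch_size=1000))
```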
To use the management command, you need to define anonymizers.
- Create a module anonymizers.py in the given Django app
- An anonymizer is a simple class that inherits from BaseAnonymizer
- Each anonymizer class represents one model
- An anonymizer has the following members:
  - model: (required) The model class for this anonymizer
  - attributes: (required) List of tuples that determine which fields to replace. The first value of the tuple is the field name, the second value is the replacer
  - get_query_set(): (optional) Define the QuerySet of instances to anonymize
- A replacer is either a str or a callable
- A callable replacer is a Faker method (such as faker.first_name) or a custom replacer
- All Faker methods are available. For more information, see the official Faker documentation.
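A custom replacer can be any callable that returns the new value. The sketch below assumes that replacers are called with keyword arguments only and that the return value is written to the field; the exact keyword arguments are not documented here, so the function accepts and ignores them. The name unique_example_email is hypothetical.

```python
import uuid


def unique_example_email(**kwargs):
    """Return a unique, obviously fake address on the reserved example.com domain."""
    # kwargs are accepted but unused, so the call works whatever the caller passes.
    return f"user-{uuid.uuid4().hex[:12]}@example.com"


# In an anonymizer this would be listed like any other replacer:
#   ('email', unique_example_email),
```

Generating the value from a UUID makes collisions very unlikely, which can help when the field has a unique constraint.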
from hattori.base import BaseAnonymizer, faker
from shop.models import Customer

class CustomerAnonymizer(BaseAnonymizer):
    model = Customer
    attributes = [
        ('card_number', faker.credit_card_number),
        ('first_name', faker.first_name),
        ('last_name', faker.last_name),
        ('phone', faker.phone_number),
        ('email', faker.email),
        ('city', faker.city),
        ('comment', faker.text),
        ('description', 'fix string'),
        ('code', faker.pystr),
    ]

    def get_query_set(self):
        return Customer.objects.filter(age__gt=18)
Use lambdas to extend certain predefined replacers with arguments, like min_chars or max_chars on faker.pystr:
('code', lambda **kwargs: faker.pystr(min_chars=250, max_chars=250, **kwargs)),
Important: don't forget the **kwargs!