faker_elasticsearch is a conda package for generating fake data and storing it in your Elasticsearch server.
Use the package manager conda to install faker_elasticsearch.
conda install -c fakt faker_elasticsearch
Step 1: Create a config file that describes the Elasticsearch server and the index settings.
The schema of the fake documents is also part of this config file: it is given by the format entry, a comma-separated list of field_name:field_type pairs. The sample config file below includes an example schema.
Sample config file:
es_url = "http://localhost:9200/"
index_name = "test_data"
index_type = "test_type"
batch_size = 1000
num_of_shards = 2
http_upload_timeout = 3
count = 100000
format = "name:str,address:words,country_code:country_code,acc_id:account_number,ip_address:ipv4,timestamp:ts"
num_of_replicas = 0
force_init_index = False
set_refresh = False
out_file = "test.csv"
id_type = None
dict_file = None
username = None
password = None
validate_cert = True
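To make the structure of the format entry concrete, here is a minimal sketch that splits the sample schema into field names, field types and optional arguments. The names FieldSpec and parse_format are hypothetical helpers written for this illustration; they are not part of faker_elasticsearch, and the package may parse the string differently.

```python
from typing import List, NamedTuple


class FieldSpec(NamedTuple):
    """One entry of the format string (hypothetical helper type)."""
    name: str        # document field name, e.g. "acc_id"
    kind: str        # field type, e.g. "account_number"
    args: List[str]  # optional extra arguments such as min and max


def parse_format(format_string: str) -> List[FieldSpec]:
    """Split a format string into one FieldSpec per field.

    Assumption: fields are separated by "," and the parts of a
    field are separated by ":", as in the sample config.
    """
    specs = []
    for item in format_string.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        parts = item.split(":")
        if len(parts) < 2:
            raise ValueError(f"expected name:type, got {item!r}")
        specs.append(FieldSpec(parts[0], parts[1], parts[2:]))
    return specs


# The schema from the sample config file.
sample = (
    "name:str,address:words,country_code:country_code,"
    "acc_id:account_number,ip_address:ipv4,timestamp:ts"
)
specs = parse_format(sample)
for spec in specs:
    print(spec.name, spec.kind, spec.args)
```

A field with arguments, such as age:int:1:99, would come back with args equal to ["1", "99"] under the same assumptions.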
Step 2: Generate the data by calling create_fake_data with the path to the config file.
from faker_elasticsearch import elasticsearch_data
path_to_config = "es.config"  # path to the config file
elasticsearch_data.create_fake_data(path_to_config=path_to_config)
That's all, enjoy!!!
Note: Currently supported field types are:
bool
    returns a random true or false
ts
    a timestamp (in milliseconds), randomly picked between now +/- 30 days
ipv4
    returns a random ipv4
tstxt
    a timestamp in the "%Y-%m-%dT%H:%M:%S.000-0000" format, randomly picked between now +/- 30 days
int:min:max
    a random integer between min and max. If min and max are not provided they default to 0 and 100000
str:min:max
    a word (as in, a string), made up of min to max random upper/lowercase and digit characters. min and max are optional, defaulting to 3 and 10
words:min:max
    a random number of strs, separated by space. min and max are optional, defaulting to 2 and 10
dict:min:max
    a random number of entries from the dictionary file, separated by space. min and max are optional, defaulting to 2 and 10
text:words:min:max
    a random number of words, separated by space, from a given list of "-"-separated words. The words are optional, defaulting to text1, text2 and text3. min and max are optional, defaulting to 1 and 1
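As a rough illustration of what some of the field types above produce, the sketch below generates one value for bool, ts, ipv4, int, str and words using only the Python standard library. It is not the package's implementation: fake_value and _bounds are hypothetical names, and treating min and max as inclusive bounds is an assumption.

```python
import random
import string
import time

THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000


def _bounds(args, default_min, default_max):
    """Read optional min/max arguments, falling back to the documented defaults."""
    low = int(args[0]) if len(args) > 0 else default_min
    high = int(args[1]) if len(args) > 1 else default_max
    return low, high


def fake_value(kind, args=()):
    """Return one random value for a field type (illustrative, not the package API)."""
    if kind == "bool":
        return random.choice([True, False])
    if kind == "ts":
        # Milliseconds since the epoch, within 30 days of now.
        now_ms = int(time.time() * 1000)
        return now_ms + random.randint(-THIRTY_DAYS_MS, THIRTY_DAYS_MS)
    if kind == "ipv4":
        return ".".join(str(random.randint(0, 255)) for _ in range(4))
    if kind == "int":
        low, high = _bounds(args, 0, 100000)
        return random.randint(low, high)  # assumption: both bounds inclusive
    if kind == "str":
        low, high = _bounds(args, 3, 10)
        alphabet = string.ascii_letters + string.digits
        length = random.randint(low, high)
        return "".join(random.choice(alphabet) for _ in range(length))
    if kind == "words":
        low, high = _bounds(args, 2, 10)
        count = random.randint(low, high)
        # Each word is a "str" value with the default length bounds.
        return " ".join(fake_value("str") for _ in range(count))
    raise ValueError(f"field type not covered by this sketch: {kind!r}")


for kind in ("bool", "ts", "ipv4", "int", "str", "words"):
    print(kind, "->", fake_value(kind))
```

The remaining types (tstxt, dict and text) are left out to keep the sketch short; dict in particular depends on the contents of the dictionary file.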
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Contributions of unit tests are especially welcome. I will update the package gradually as time allows.