These are scripts for manipulating CSV data.

First, build an index DB that stores the position, size, and filename of each molecule:

`indexer_csv.py --readfromdir mydata/ --db mydata_index.sqlite`

After the index is built, extract molecules to a CSV file:

`fetcher_csv.py --readfromdir mydata/ --db mydata_index.sqlite --input-file in.txt --output-file results.csv`

You can also store the data in a Wasabi.com bucket. First, install the CLI tools and Python modules:

`pip3 install awscli boto3`

Next, put your access key ID and secret key in `~/.aws/credentials`:

```
[default]
aws_access_key_id = YOUR_KEY_ID
aws_secret_access_key = YOUR_KEY
```

Then upload the data:

```
aws s3 sync mydata/ s3://mybucket/
```

You can then fetch molecules from Wasabi with `fetcher_csv_s3.py`:

```
fetcher_csv_s3.py -f 100k.txt --readfrombucket mybucket -o out.csv --db mydata_index.sqlite
```

Index data can be exported to a CSV file, for instance:

`sqlite3 -header -csv index.sqlite 'select sid from data' > output.csv`

This CSV can be imported into MySQL (use `substance1.sql` as an example: `mysql yourdbname < substance1.sql`).
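
If the `sqlite3` command-line tool is not available, the same index export can be approximated from Python with the standard library. This is a minimal sketch, not part of the scripts in this repository: the function name `export_index_to_csv` is hypothetical, and it assumes only what the command above shows, namely that the index DB has a table `data` with a column `sid`.

```python
import csv
import sqlite3


def export_index_to_csv(db_path, csv_path, query="select sid from data"):
    """Write the rows returned by `query` to a CSV file, header row first.

    Hypothetical helper: the default query mirrors the sqlite3 CLI example
    and assumes the index DB has a table `data` with a column `sid`.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(query)
        # cursor.description has one entry per result column; item 0 is its name.
        header = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            # The cursor yields one tuple per row, which csv.writer accepts directly.
            writer.writerows(cursor)
    finally:
        conn.close()
```

Calling `export_index_to_csv("index.sqlite", "output.csv")` should produce a file comparable to the one written by the `sqlite3 -header -csv` command; pass a different `query` to export other columns.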