VIVA is a software tool for building content-based retrieval methods for video archives based on deep learning models. The tool enables visual information retrieval through concept classification and person identification. It allows to easily train models from scratch and adjust them by adding new people or concepts. To reduce the manual effort of collecting training samples, the workflow includes a web crawler, image similarity search, as well as review and user feedback components.
Funded by the German Research Foundation (DFG), the VIVA software was developed at the Department of Distributed Computing (University of Marburg) and the Visual Analytics group (Leibniz Information Centre for Technology and Natural Sciences - TIB) in cooperation with the German Broadcasting Archive (DRA).
For a detailed documentation see wiki
-
Software requirements:
docker
,docker-compose
,pipenv
(for compose generator) -
Create data folders
cd `docker/script/ ./create_data_folders.sh
-
(Optional) Add Elastic Search index to
docker/data/elasticsearch
chown -R 1000 elasticsearch
See wiki for details.
-
Download the required face processing models (see wiki).
-
Set the file permissions of
redis.conf
:chmod o+r docker/build/redis/redis.conf
. -
(Optional) Modify
.env
inside thedocker
folder eg. to set up custom ports and a custom media folder location -
Initialize database
cd docker
- Create schema
bash script/generate_compose.sh -i docker-compose up --abort-on-container-exit
- Import default values and initialize sequences:
./script/run_sql_script.sh sql/base_init.sql
- Import development users example to get access to the app (username
test
and passwordnix
)./script/run_sql_script.sh sql/users_development.sql
Hint: If you want to reinitialize the database make sure to delete the folder
docker/data/postgres/data
otherwise initialization routine of PostgreSQL will not be executed. - Create schema
-
(Optional) Import sample data set
-
Select the docker-compose environment
bash script/generate_compose.sh -d django # development (Django) on CPU bash script/generate_compose.sh -d django -t gpu # development (Django) on GPU bash script/generate_compose.sh -d keras # development (Keras) on CPU bash script/generate_compose.sh -d keras -t gpu # development (Keras) on GPU
Start Docker containers
docker-compose up
For development: Open the app project in an IDE and start the Django development server (e.g in PyCharm Professional). Go to http://localhost:8000/ and log in.
Datasets should contain database information and the corresponding media files. When importing your own media files please make sure to also import corresponding database information according to our database structure.
Note: Make sure when executing the following commands that the working directory is set to docker
folder!
After setting up the docker compose environment datasets can be imported. Database information can be imported in different ways:
-
A sql file containing all required information without the need of additional files: Regardless of the location of the sql file the following command can be executed.
./script/run_sql_script.sh PATH_TO_SQL_FILE
-
Multiple files that depend on each other: Copy all files to
docker/data/postgres/transfer
. To execute a sql file in the transfer folder use:./script/run_transfer_sql_script.sh FILENAME_OF_SQL_FILE # no path allowed here
-
To execute a custom command in the transfer path (above), ensure that the database container is running and execute:
docker exec -it -w "/transfer" "$(docker-compose ps -q db)" COMMAND
-
When importing a dataset that also contains media files make sure to add the media files to corresponding folder (
app/media
folder or custom location). -
Serving media files in production environment requires the folders and files to be world readable including the folder provided in the
docker/.env
file. If you do not want them to be world readable the folders and files have to be owned by the user with id 101 (Nginx daemon). -
Datasets could interfere with each other (violating constraints in database).
There is a sample database that contains some concepts but no media. The database information can be found in docker/sql/sample_concepts.sql
.
Currently we are working on a larger sample media dataset. Stay tuned!