A Streamlit application to generate and search alignments from OPUS OpenSubtitles

Primary language: Python. License: MIT.

OpuSearch

This is a Streamlit Application for the software project OpuSearch.

Setup Overview

  1. Check whether Python is already installed: How to Python. If the command is not found, or the version is not 3.11.x, download Python 3.11 and make sure to add Python to your PATH variable (see here for more information, or follow the instruction linked under 3.3). On Windows, install from the Python website rather than from the Microsoft Store.
  2. Check whether Git is installed: How to Git. If Git is not on your computer, install it for Windows or for macOS/Linux. If necessary, install the package manager Homebrew beforehand (macOS/Linux).
  3. Clone the repository.
  4. Create a virtual environment.
  5. Install the necessary packages into the virtual environment.
  6. Run the app.
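
The first two checks in the overview can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `python_version_ok` and `git_available` are made up for illustration, and the script only reports what it finds on your machine.

```python
import shutil
import sys

REQUIRED = (3, 11)  # this README asks for Python 3.11.x


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True if major.minor of the interpreter matches the required version."""
    return tuple(version_info[:2]) == tuple(required)


def git_available():
    """Return True if a `git` executable can be found on PATH."""
    return shutil.which("git") is not None


if __name__ == "__main__":
    found = sys.version.split()[0]
    status = "OK" if python_version_ok() else "expected 3.11.x"
    print(f"Python {found}: {status}")
    print("Git on PATH:", git_available())
```

Save it under any name (for example `check_setup.py`, a hypothetical file name) and run it with the same `python` or `python3` command you plan to use for the virtual environment.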

Detailed Setup Instructions

2. Cloning the repository

2.1

Open a command line tool of your choice.

2.2

Navigate to your preferred local directory, ideally something like "Documents". Make sure that this is not a cloud-synced folder, as the software stores large amounts of data while it works. On Windows, you can open the directory in the Explorer and start Git Bash or cmd there via the right-click context menu. Alternatively, on any operating system, you can use cd to change the directory on the command line. To make things easier, copy the path of your preferred directory from your file explorer or Finder and paste it after cd.
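
If you are unsure whether a directory is synced to a cloud service, a rough heuristic is to look for well-known provider names in its path. The sketch below is only that: the marker list and the function name `looks_like_cloud_folder` are assumptions for illustration, and a folder can be synced without any of these names appearing in its path.

```python
from pathlib import Path

# Assumed, non-exhaustive list of substrings that commonly appear in
# the paths of cloud-synced folders.
CLOUD_MARKERS = ("onedrive", "dropbox", "icloud", "clouddocs", "google drive", "googledrive")


def looks_like_cloud_folder(path):
    """Return True if any component of `path` contains a known cloud marker."""
    parts = [part.lower() for part in Path(path).parts]
    return any(marker in part for part in parts for marker in CLOUD_MARKERS)


if __name__ == "__main__":
    here = Path.cwd()
    if looks_like_cloud_folder(here):
        print(f"{here} looks like a cloud folder; consider choosing another location.")
    else:
        print(f"{here} does not match any known cloud folder name.")
```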

2.3

Once you have arrived at the desired location, copy and paste the following command:

git clone https://github.com/JR0cky/OpuSearch.git

Remark: You may need to generate a token to be able to clone.

3. Creating a virtual environment

You can create a virtual environment for this project with different tools. This example uses Python's built-in venv module:

3.1

Open a command line tool of your choice.

3.2

Go to the root directory (OpuSearch) of the project. For Windows you can go to the root directory via the Explorer and open a Git Bash or Terminal there using the context menu with right-click. Otherwise, you can navigate there (on all operating systems) using:

cd OpuSearch

3.3

In the command line (bash or cmd) type the following:

python -m venv venv

or, depending on your operating system, use

python3 -m venv venv

Make sure to use "venv" for the environment name, as the script OpuSearch.bat for running the app will not work with a different name.

If Python is not recognized as a command, you need to add it to your system's PATH (see this instruction).

You can check whether the folder venv has been created by using the command dir (Windows) or ls (Linux/macOS) to list the content of the directory you are currently in. Otherwise, you can take a look via the file explorer.
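
The same check can be done from Python. This is a minimal sketch with a hypothetical helper name, `venv_exists`; it relies on the fact that the venv module writes a `pyvenv.cfg` file into every environment it creates.

```python
from pathlib import Path


def venv_exists(project_root=".", name="venv"):
    """Return True if <project_root>/<name> contains a pyvenv.cfg file."""
    return (Path(project_root) / name / "pyvenv.cfg").is_file()


if __name__ == "__main__":
    # Run this from the project's root directory (OpuSearch).
    print("venv found:", venv_exists())
```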

3.4

Now activate the environment with the following command from the project's root directory (OpuSearch):

For Windows (cmd):

venv\Scripts\activate.bat

If you see the name of your environment in parentheses at the beginning of the line then you were successful.

For Windows (PowerShell, e.g. the terminal in VS Code and PyCharm):

venv\Scripts\Activate.ps1

Remark: If you do not have permission to run this command, you may follow these instructions.

For Linux and macOS (including Git Bash for Windows):

source venv/bin/activate

If you see (venv) at the beginning of your command prompt, the virtual environment has been activated.
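
If the prompt is ambiguous, you can also ask the interpreter itself. Inside a venv, `sys.prefix` points to the environment while `sys.base_prefix` points to the Python installation it was created from. The function name `in_virtualenv` below is made up for this sketch.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter runs inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should end in the venv folder of the project.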

4. Installing the packages

Once you have activated your virtual environment, and while you are still in the project's root directory, you can install the required packages with the following command:

pip install -r requirements.txt

The installation is finished when the command prompt reappears, showing your user name and the environment name as before.
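
To double-check the result, you can compare requirements.txt against what is installed in the active environment. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the parser only handles plain lines such as `name==version`; it skips comments and pip options and does not compare version numbers.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract bare package names from simple requirements.txt lines."""
    names = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "-")):
            continue  # skip blank lines, comments, and pip options
        name = re.split(r"[<>=!~;\[\s@]", line, maxsplit=1)[0]
        if name:
            names.append(name)
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req = Path("requirements.txt")
    if req.is_file():
        names = requirement_names(req.read_text(encoding="utf-8").splitlines())
        print("Missing packages:", missing_packages(names) or "none")
    else:
        print("requirements.txt not found; run this from the project's root directory.")
```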

5. Running the Streamlit App

After installing the necessary packages, you can start the application. On Windows, you can do this by double-clicking OpuSearch.bat, which is in the project's root directory. You may need to allow its execution before the application starts running.

Alternatively (including on macOS and Linux), you can run the following command from the project's root directory (OpuSearch) to serve the app on your local host:

streamlit run home.py

This will open a terminal (command line) automatically. You will be asked to enter your email in the terminal if you start the app for the first time. You can skip this step by pressing ENTER.

Afterwards, the app will be opened in a new tab in your default browser.

6. Closing the Streamlit App

You can stop the application either by pressing CTRL + C in the terminal or by clicking the Close Application button. If you only close the browser window, we advise you to close the terminal as well and to run the app again in a new terminal.

7. Updating the Streamlit App

If the repository has been updated, run the following command to download the changes before running the app as usual:

git pull origin

Also make sure to install any new packages in case requirements.txt has been updated:

pip install -r requirements.txt

8. Citing the Software

If you use this software in your work, please cite it using the following metadata:

@software{RockstrohFliessbach2024OpuSearch,
author = {Rockstroh, Johanna and Fliessbach, Jan},
doi = {10.5281/zenodo.12742554},
month = jul,
title = {{OpuSearch: Application and GUI to generate and search alignments from OPUS OpenSubtitles}},
url = {https://github.com/JR0cky/OpuSearch},
version = {1.0},
year = {2024}
}