Primary language: Jupyter Notebook. License: GNU Affero General Public License v3.0 (AGPL-3.0).

bible-explore

Exploring Bible datasets, mainly from Kaggle

This repository contains a simple search and display for the Kaggle Bible Corpus.

My intention with this experiment is to study ways of exploring a text dataset.

Development

  1. Clone the repo

git clone git@github.com:leomrocha/bible-explore.git

  2. Install the dependencies

pip3 install -r requirements.txt
  3. Download the Kaggle Bible Corpus dataset

Get it from Kaggle and select the language you want (this demo is built with the English one, but it can be changed).

  4. Encode the dataset and compute similarities

This description is not complete, but the notebook notebooks/bible-explore-one.ipynb lets you encode and explore everything.

You should end up with 3 pickled Python files as output in a db directory:

db/bible-db.pkl
db/bible-embeddings.pkl
db/graph-db.pkl
  5. Launch the development server
uvicorn src.server:app --reload
  6. Develop. Feel free to create a Pull Request if you make something :)
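
Before launching the server, it can help to confirm that the encoding step actually produced the three files listed above. The following is a minimal sketch, not part of the repository: the helper name check_db_files is hypothetical, and since the contents of each pickle are not documented here, it only reports the top-level Python type of whatever is stored.

```python
import pickle
from pathlib import Path

# File names are taken from the list above. What each pickle contains is
# an unknown here, so this sketch only reports the top-level type.
EXPECTED_FILES = ("bible-db.pkl", "bible-embeddings.pkl", "graph-db.pkl")


def check_db_files(db_dir="db"):
    """Hypothetical helper: map each expected file name to the type name
    of its unpickled content, or to None if the file is missing."""
    report = {}
    for name in EXPECTED_FILES:
        path = Path(db_dir) / name
        if not path.is_file():
            report[name] = None
            continue
        # Only unpickle files you generated yourself: pickle can run
        # arbitrary code while loading.
        with path.open("rb") as handle:
            report[name] = type(pickle.load(handle)).__name__
    return report


if __name__ == "__main__":
    for name, kind in check_db_files().items():
        print(f"{name}: {kind or 'MISSING'}")
```

Run it from the repository root so that the relative db path resolves. If the pickles hold objects from third-party libraries, those libraries need to be installed for the load to succeed, which the requirements.txt install should already cover.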

Testing