This repository contains the code for the MH6301 - Information Retrieval project at Nanyang Technological University (NTU). The project creates a simple app for searching the businesses included in the Yelp Dataset.
The project is divided into three main parts:
- Elasticsearch: The Elasticsearch instance that indexes the Yelp dataset (defined only in the Docker Compose file).
- Indexer (hosted in the `./backend` folder): The Python script that indexes the Yelp dataset into the Elasticsearch instance.
- App (hosted in the `./frontend` folder): The React app that enables searching the indexed data.
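To make the split between the three parts more concrete, here is a minimal sketch of the kind of search request the app might send to Elasticsearch. The index name `yelp-business`, the choice of fields, and the helper `build_search_body` are assumptions for illustration and are not taken from this repository; `multi_match` itself is standard Elasticsearch query DSL.

```python
import json

# Hypothetical index name; the real one is whatever the indexer creates.
INDEX_NAME = "yelp-business"


def build_search_body(text, size=10):
    """Build a full-text query over a few business fields (assumed names)."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "name", "categories" and "city" are fields of Yelp's business.json;
                # "^2" boosts matches on the business name.
                "fields": ["name^2", "categories", "city"],
            }
        },
    }


body = build_search_body("ramen")
print(json.dumps(body, indent=2))
```

Assuming Elasticsearch listens on its default port 9200, this body could be sent as a POST request to `http://localhost:9200/yelp-business/_search`.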
In addition, the `./data` folder contains only the `business.json` file from the Yelp dataset. The full dataset can be downloaded from the Yelp Dataset website.
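The Yelp dataset ships `business.json` as one JSON object per line. The following is a minimal sketch of how an indexer might turn those lines into bulk actions; it is not the code in `./backend`. The index name `yelp-business` and the helper `iter_actions` are hypothetical, and the sample records are made up. The action dictionaries follow the shape accepted by `elasticsearch.helpers.bulk`.

```python
import io
import json

INDEX_NAME = "yelp-business"  # hypothetical index name


def iter_actions(lines, index=INDEX_NAME):
    """Yield one bulk action per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        business = json.loads(line)
        yield {
            "_index": index,
            "_id": business["business_id"],  # reuse Yelp's ID as the document ID
            "_source": business,
        }


# Two made-up records standing in for ./data/business.json.
sample = io.StringIO(
    '{"business_id": "a1", "name": "Cafe One", "city": "Tucson"}\n'
    '{"business_id": "b2", "name": "Noodle Bar", "city": "Reno"}\n'
)
actions = list(iter_actions(sample))
print(len(actions), actions[0]["_id"])
```

With the `elasticsearch` package installed, the same generator could be fed an open file handle and passed to `helpers.bulk(client, iter_actions(f))`, where `client` is an `Elasticsearch` instance pointing at your cluster. Streaming line by line avoids loading the whole file into memory.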
Please view the preview video for a quick overview of the project.
Assuming you have cloned this repo to your local machine, there are two main ways to run the project:
With Docker Compose:
- Ensure you have Docker Compose installed on your machine.
- Navigate to the root directory of the project in your terminal.
- Build and run the Docker containers by running `docker-compose up -d`.
- Open your web browser and navigate to http://localhost:3000 to access the application.
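Because `docker-compose up -d` returns before the services have finished starting, the app may not answer right away. Here is a small, optional sketch for polling until it does, using only the Python standard library. The helper names `is_up` and `wait_until_up` are hypothetical and not part of this repository.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if anything answers HTTP at url, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, e.g. with 401 or 404
    except (urllib.error.URLError, OSError):
        return False  # connection refused, DNS failure, timeout, ...


def wait_until_up(url, attempts=30, delay=1.0, probe=is_up):
    """Poll url until probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # no need to sleep after the last attempt
    return False
```

For example, `wait_until_up("http://localhost:3000")` would return `True` once the frontend responds, or `False` after roughly 30 seconds. The `probe` parameter exists only so the loop can be exercised without a network.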
Without Docker:
- Ensure you have Elasticsearch, Python, Node.js and npm installed on your machine.
- Start Elasticsearch locally.
- Navigate to the root directory of the project in your terminal.
- Install Python dependencies by running `pip install -r requirements.txt` (this assumes `requirements.txt` is in the root directory).
- Prepare and start the Python indexer by running `python index.py`.
- Install Node.js dependencies by running `npm install`.
- Once dependencies are installed, you can start the application with `npm run start`.
- Open your web browser and navigate to http://localhost:3000 to access the application.
Dark modes
- https://github.com/aniftyco/awesome-tailwindcss#tools
- https://tailwindcss.com/docs/dark-mode
- https://blog.logrocket.com/theming-react-components-tailwind-css/
- https://react-typescript-cheatsheet.netlify.app/docs/basic/getting-started/basic_type_example/
- https://tailwindcomponents.com/component/layout-with-header-sidebar-and-rightbar
- https://mantine.dev/core/app-shell/
- https://www.searchkit.co/docs/getting-started/with-react
- https://github.com/searchkit/searchkit/blob/main/examples/with-ui-nextjs-react/pages/index.tsx
- https://www.algolia.com/doc/guides/building-search-ui/getting-started/react/?client=jsx
- Enable array indexing for array fields (e.g. business.categories, checkin.date)
  - Reference: EiA, 3.3.1 Arrays
- Enable nested type indexing for nested fields (e.g. business.hours, business.attributes)
  - Reference: EiA, 8.3 Nested type
- Enable geolocation indexing for long/lat fields (e.g. business.latitude, business.longitude)
  - Reference: EiA, Appendix A Working with geospatial data
- Enable dark mode
- Dockerize everything
Note: EiA = Elasticsearch in Action (1st Edition)
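As a starting point for the three indexing items above, here is a rough sketch of a mapping and a matching document transform. It is not implemented in this repository. The `location` field, the `day`/`span` layout for opening hours, and the helper `to_document` are assumptions. It also assumes `categories` arrives as a comma-separated string and uses the type names of recent Elasticsearch versions (`keyword`, `nested`, `geo_point`), which differ from the older versions covered in EiA.

```python
MAPPING = {
    "properties": {
        # Arrays need no special type: a keyword field accepts one or many values.
        "categories": {"type": "keyword"},
        # Nested indexes each {day, span} object separately, so a query cannot
        # match the day of one entry against the span of another.
        "hours": {
            "type": "nested",
            "properties": {
                "day": {"type": "keyword"},
                "span": {"type": "keyword"},
            },
        },
        # geo_point accepts objects of the form {"lat": ..., "lon": ...}.
        "location": {"type": "geo_point"},
    }
}


def to_document(business):
    """Reshape one business record to fit MAPPING (assumed layout)."""
    doc = dict(business)
    raw = business.get("categories") or ""
    if isinstance(raw, str):
        # Assumed input such as "Ramen, Japanese".
        doc["categories"] = [c.strip() for c in raw.split(",") if c.strip()]
    hours = business.get("hours") or {}
    doc["hours"] = [{"day": day, "span": span} for day, span in hours.items()]
    if "latitude" in doc and "longitude" in doc:
        doc["location"] = {"lat": doc.pop("latitude"), "lon": doc.pop("longitude")}
    return doc


# Made-up record for illustration.
sample = {
    "business_id": "a1",
    "categories": "Ramen, Japanese",
    "hours": {"Monday": "11:0-21:0"},
    "latitude": 32.22,
    "longitude": -110.97,
}
doc = to_document(sample)
print(doc)
```

With the 8.x Python client, a mapping like this could be supplied when the index is created, for example via the `mappings` argument of `client.indices.create`. Merging latitude and longitude into a single `geo_point` field is what makes distance and bounding-box queries possible.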