Gemini Developer Contest August 2024

Online Presence Generator with Google Gemini

YouTube Video

This repository houses our submission for the Google Gemini Developer Contest (August 2024).

Our solution enables businesses to generate useful structured data from a simple video walkthrough of their business.

Using Gemini's video inference capabilities, the app generates structured data that can be used to improve the business's online presence, with the aim of increasing revenue.

The solution is a progressive web app, which can be installed on the user's phone after navigating to the app's base URL. We use Cloudflare Zero Trust and its associated cloudflared program to provide the authorization layer required to use the app.

App Usage

The user first records a video on their phone, narrating and showing specific aspects of their business. Keep the most important parts of your business in mind, and make sure to capture, both visually and audibly, why they matter. Hold each close-up shot for at least 1 second, since a frame is grabbed every 1 second (audio is captured continuously).
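The one-second guideline can be illustrated with a small sketch. It assumes frames are sampled at t = 0, 1, 2, ... seconds from the start of the video; the exact sampling offsets are not documented here, and `shot_is_sampled` is a hypothetical helper, not part of this repository.

```python
import math


def shot_is_sampled(start: float, end: float, interval: float = 1.0) -> bool:
    """Return True if at least one sampled frame falls inside [start, end).

    Assumption: frames are sampled at multiples of `interval` seconds,
    starting at t = 0.
    """
    if interval <= 0 or end < start or start < 0:
        raise ValueError("expected 0 <= start <= end and interval > 0")
    # First sampling instant at or after the start of the shot.
    first_sample = math.ceil(start / interval) * interval
    return first_sample < end


# A 0.4-second close-up can fall between two samples and be missed,
# while a shot held for a full second always contains a sample.
print(shot_is_sampled(1.2, 1.6))
print(shot_is_sampled(1.2, 2.2))
```

Under this assumption, any shot held for at least one interval is guaranteed to contribute a frame, which is why shorter close-ups are risky.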

After the video recording is complete, the user navigates to the app.

  1. Click on "Add New Business"
  2. Click on "Click To Edit" and update the Name and Description of the Entity (business or otherwise).
  3. Click "Upload Video" and select the video to upload.
  4. Keep an eye on the output log. The phone screen must stay on while the video uploads (background upload is not available).
  5. Once the video has been uploaded, return to Step 3 for each additional video to upload. All of the videos will be used when generating the assets.
  6. Once all videos have been successfully uploaded, click on the "Start Inference" button and watch the output log for confirmation that the LLM inference is complete.
  7. Once the LLM inference has completed successfully, click on the "Select a Profile" button and select the online presence that you're interested in.
  8. Leverage the generated assets by updating the appropriate online presence using the respective UI or API.

Project Team

This project is a collaboration between:

  • James Antisdel: Product Manager
  • Kurt Heiden: Software Engineer

Running the Application with Docker

This project is containerized for easy deployment and uses Gunicorn as the WSGI server. The current implementation expects every request to carry a Cf-Access-Authenticated-User-Email header whose value is unique to that user across all users of the application. The value is base64-encoded and saved to disk, where it is used to persist each user's data and to authorize access to the matching user data storage.

This means the application can be deployed using Cloudflare Zero Trust and the Cloudflare daemon (cloudflared). With the Cloudflare Zero Trust authorization flow, each user's requests to the application automatically include the Cf-Access-Authenticated-User-Email header.
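The header-to-storage mapping described above can be sketched as follows. This is an illustration under assumptions, not the repository's actual code: the function names are hypothetical, and the URL-safe base64 variant is an assumption chosen because standard base64 output can contain "/", which is awkward in file names.

```python
import base64

HEADER_NAME = "Cf-Access-Authenticated-User-Email"


def user_storage_key(headers: dict[str, str]) -> str:
    """Derive a per-user storage key from the Cloudflare Access email header.

    Hypothetical helper. Header lookup here is case-sensitive for brevity;
    real web frameworks usually expose case-insensitive headers.
    """
    email = headers.get(HEADER_NAME, "").strip()
    if not email:
        # Without the header there is no identity to authorize against.
        raise PermissionError(f"missing {HEADER_NAME} header")
    # Assumption: URL-safe base64, so the key is safe to use as a directory name.
    return base64.urlsafe_b64encode(email.encode("utf-8")).decode("ascii")


def email_from_storage_key(key: str) -> str:
    """Recover the original header value from a storage key."""
    return base64.urlsafe_b64decode(key.encode("ascii")).decode("utf-8")
```

Because base64 is reversible, the key identifies the user but does not hide the email address; the uniqueness of the header value is what keeps each user's data separate.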

Obtaining the Docker Image:

docker build . -t kheidencom/gemini-developer-contest-aug-2024

or

docker pull kheidencom/gemini-developer-contest-aug-2024

Running the Application with Docker Compose:

Create a .env file from .env.template and populate the GOOGLE_API_KEY.

docker-compose up -d

Setting up the Development Environment

Before running the application locally, you need to install the required Python packages:

pip install -r requirements.txt
pip install -r requirements1.txt

Create a .env file from .env.template and populate the GOOGLE_API_KEY.
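Before starting the server, it can help to confirm that the key was actually filled in. The sketch below is a minimal check that assumes the file uses simple KEY=value lines; `read_env_file` and `has_api_key` are hypothetical helpers, and the application itself may load the file differently.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(path: str = ".env") -> bool:
    """Return True if the file exists and GOOGLE_API_KEY is non-empty."""
    if not Path(path).is_file():
        return False
    return bool(read_env_file(path).get("GOOGLE_API_KEY"))
```

Calling `has_api_key()` from the directory that contains the .env file returns False if the file is missing or the key was left blank.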

To run the application locally:

cd src
python server.py

The web server listens on port 4867.
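One quick way to confirm the local server is up is to test whether something accepts TCP connections on that port. A minimal sketch: `port_is_open` is a hypothetical helper, and 127.0.0.1 is an assumption about the address the server binds to.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 4867, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

This only shows that a process is listening on the port; it does not verify that the process is this application or that requests are authorized.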

Understanding Application Logs

The project generates two log files for monitoring and debugging:

  • access.log: Contains access logs generated by the Gunicorn web server, providing information about incoming requests.
  • main.log: Captures standard output and error logs, including detailed stack traces, which are crucial for troubleshooting issues.

Real-time Log Monitoring (Windows)

You can monitor the main.log file in real-time using PowerShell.

Get-Content <path\to>\logs\main.log -wait

Note: Replace <path\to> with the actual path to the logs directory.
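On systems without PowerShell, a similar effect can be approximated in Python. The sketch below polls the file and prints lines as they are appended; `read_new_lines` and `follow` are hypothetical helpers, and the half-second polling interval is an arbitrary choice.

```python
import os
import time
from typing import TextIO


def read_new_lines(handle: TextIO) -> list[str]:
    """Return the complete lines appended since the previous call."""
    lines: list[str] = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line:
            break  # reached the current end of the file
        if not line.endswith("\n"):
            # Partially written line: rewind and pick it up on the next call.
            handle.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines


def follow(path: str, poll_seconds: float = 0.5) -> None:
    """Print lines appended to `path` until interrupted with Ctrl+C."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)  # skip existing content, like a live tail
        while True:
            for line in read_new_lines(handle):
                print(line)
            time.sleep(poll_seconds)


# Example (runs until interrupted): follow("logs/main.log")
```

Polling keeps the sketch portable and dependency-free; it does not handle log rotation, where the file is replaced while being followed.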

Progressive Web App (PWA) Integration

This project includes a Progressive Web App (PWA) manifest located in the src/geminidevcontest2024 directory. The manifest can be hosted on a static site; if you do so, update the HTML templates to point to the new manifest location.

Colab notebook

We also include a Colab notebook, which can help with rapid iteration on prompts.