This repository houses our submission for the Google Gemini Developer Contest (August 2024).
Our solution enables businesses to generate useful structured data from a simple video walkthrough of their business.
By leveraging Gemini's video inference capabilities, we built a solution that generates structured data which can be used to improve the business's online presence, thus increasing revenue.
The solution is presented in the form of a progressive web app, which can be installed natively on the user's phone after navigating to the app's base URL.
We use Cloudflare Zero Trust with its associated cloudflared program to provide the authorization layer required to use this app.
The user first records a video on their phone, narrating and showing specific aspects of their business. Focus on the most important parts of the business, and capture both visually and audibly why each one matters. Hold each close-up for at least 1 second, because one frame is grabbed every second (audio is captured continuously).
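To illustrate why a one-second hold matters, the sketch below lists the sampling instants that fall inside a single shot. It assumes frames are grabbed at whole-second offsets from the start of the video, which this README does not confirm, and `sampled_instants` is a hypothetical helper, not part of the project.

```python
import math


def sampled_instants(shot_start: float, shot_end: float, interval: float = 1.0) -> list[float]:
    """Return the sampling instants (in seconds) that fall inside a shot.

    Assumption: one frame is grabbed every `interval` seconds, starting at t=0.
    """
    first = math.ceil(shot_start / interval)
    last = math.floor(shot_end / interval)
    return [k * interval for k in range(first, last + 1)]


# A 0.8-second close-up can fall entirely between two samples...
print(sampled_instants(2.1, 2.9))  # []
# ...while a close-up held for a full second contains at least one.
print(sampled_instants(2.1, 3.1))  # [3.0]
```

Under this assumption, any shot shorter than the sampling interval may never reach the model as an image, even though its narration is still captured.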
After the video recording is complete, the user navigates to the app.
- Click on "Add New Business"
- Click on "Click To Edit" and update the Name and Description of the Entity (business or otherwise).
- Click "Upload Video" and select the video to upload.
- Keep an eye on the output log. The phone screen must stay on while the video uploads (background upload is not available).
- Once the video has been uploaded, repeat the "Upload Video" step for each additional video. All of the videos are used when generating the assets.
- Once all videos have been successfully uploaded, click on the "Start Inference" button and watch the output log for confirmation that the LLM inference is complete.
- Once LLM inference has completed successfully, click on the "Select a Profile" button and select the online presence that you're interested in.
- Leverage the generated assets by updating the appropriate online presence using the respective UI or API.
This project is a collaboration between:
- James Antisdel: Product Manager
- Kurt Heiden: Software Engineer
This project is containerized for easy deployment and uses Gunicorn as the WSGI server. The current implementation expects each user's requests to carry a globally unique value in the Cf-Access-Authenticated-User-Email header. This value must differ from that of every other user of the application. It is base64-encoded and saved to disk so that the application can persist each user's data and authorize access to the appropriate user data storage.
This means that this application can be deployed using Cloudflare Zero Trust and the Cloudflare daemon. When using the Cloudflare Zero Trust authorization flow, the user automatically sends the Cf-Access-Authenticated-User-Email header as part of their requests to the application.
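The header-to-storage mapping described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the helper name `user_data_dir`, the `data` root directory, and the choice of the URL-safe base64 variant are all assumptions made for illustration.

```python
import base64
from pathlib import Path

HEADER = "Cf-Access-Authenticated-User-Email"


def user_data_dir(headers: dict[str, str], root: Path) -> Path:
    """Map the authenticated email header to a per-user directory.

    Hypothetical helper: the directory layout and the URL-safe base64
    variant are assumptions, not the project's confirmed scheme.
    """
    email = headers.get(HEADER, "").strip()
    if not email:
        raise PermissionError(f"missing {HEADER} header")
    # URL-safe base64 never emits '/', so the result is a single path component.
    encoded = base64.urlsafe_b64encode(email.encode("utf-8")).decode("ascii")
    return root / encoded


print(user_data_dir({HEADER: "owner@example.com"}, Path("data")))
```

Because base64 is reversible, two different header values can never map to the same directory, which is why the value has to be unique per user. Note that a plain `dict` lookup is case-sensitive, whereas web frameworks usually expose headers through a case-insensitive mapping.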
Obtaining the Docker Image:
docker build . -t kheidencom:gemini-developer-contest-aug-2024
or
docker pull kheidencom/gemini-developer-contest-aug-2024
Running the Application with Docker Compose:
Create a .env file from .env.template and populate the GOOGLE_API_KEY.
docker-compose up -d
Before running the application locally, you need to install the required Python packages:
pip install -r requirements.txt
pip install -r requirements1.txt
Create a .env file from .env.template and populate the GOOGLE_API_KEY.
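Before starting the server, it can help to confirm that the .env file actually contains a key. The snippet below is a small stand-alone check, assuming the file uses plain KEY=VALUE lines; `read_env` and `has_api_key` are hypothetical helpers and are not part of this repository.

```python
from pathlib import Path


def read_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(path: Path) -> bool:
    """Return True when the file exists and GOOGLE_API_KEY is non-empty."""
    return path.is_file() and bool(read_env(path).get("GOOGLE_API_KEY"))


print("GOOGLE_API_KEY set:", has_api_key(Path(".env")))
```

Run it from the directory that holds the .env file. The same check applies to the Docker Compose setup, which reads the same file.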
To run the application locally:
cd src
python server.py
The web server serves on port 4867.
The project generates two log files for monitoring and debugging:
- access.log: Contains access logs generated by the Gunicorn web server, providing information about incoming requests.
- main.log: Captures standard output and error logs, including detailed stack traces, which are crucial for troubleshooting issues.
You can monitor the main.log file in real time using PowerShell:
Get-Content <path\to>\logs\main.log -wait
Note: Replace <path\to> with the actual path to the logs directory.
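On systems without PowerShell, a similar effect to the command above can be had with a short Python script. This is an illustrative sketch only: `read_new_lines` and `follow` are hypothetical helpers, and the polling interval is an arbitrary choice.

```python
import time
from pathlib import Path
from typing import Iterator, TextIO


def read_new_lines(handle: TextIO) -> list[str]:
    """Return the complete lines appended since the last call."""
    lines = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line.endswith("\n"):
            # Nothing new, or a line still being written: rewind and retry later.
            handle.seek(position)
            return lines
        lines.append(line.rstrip("\n"))


def follow(path: Path, poll_seconds: float = 0.5) -> Iterator[str]:
    """Yield existing lines first, then new lines as they are appended."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        while True:
            yield from read_new_lines(handle)
            time.sleep(poll_seconds)


# Usage (runs until interrupted with Ctrl+C):
# for line in follow(Path("logs/main.log")):
#     print(line)
```

Like the PowerShell command, this prints what is already in the file and then waits for new output. Adjust the path to wherever the logs directory lives.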
This project includes a Progressive Web App (PWA) manifest located in the src/geminidevcontest2024 directory. This can be deployed to a static site. Note that the HTML of the templates will need to be updated to reflect the new manifest location.
We also include a Colab notebook to help with rapid prompt development and tuning.