This is a simple personal project for learning and testing, aimed at exploring the capabilities of two models: GPT-4 and Gemini. It also serves as a modest record of my experiences in this area.
The main objectives of the project are:
- To learn and test the models: Gain hands-on experience with GPT-4 and Gemini by calling their APIs and exploring their capabilities.
- To share experiences: Document and share insights and findings from experimenting with these models.
The models used are:
- GPT-4 Vision: OpenAI GPT-4 Vision API
- Gemini: Google AI Gemini
The necessary steps for environment setup and execution are as follows:
For running GPT-4 image description:
- Install Python.
- Install dependency packages.
- Place the image you want described at the expected path.
- Add your API key (a minimal call is sketched after this list).
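Below is a minimal sketch of an image-description call, assuming the `openai` Python SDK (v1+); the model name, file path, and prompt are illustrative placeholders, not values taken from this project.

```python
# Hedged sketch: describe a local image with the GPT-4 Vision API.
# "gpt-4-vision-preview", the file path, and the prompt are placeholders.
import base64
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")  # add your API key here

# Encode the local image as base64 so it can be sent inline.
with open("path/to/your/image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)
```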
For testing Gemini:
- Run it directly through Colab: Colab Notebook for Gemini.
- Alternatively, set up the necessary libraries and dependencies locally (e.g., Jupyter Notebook).
- Set your API key and the image you want loaded for description (a minimal local call is sketched after this list).
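Below is a minimal local sketch, assuming the `google-generativeai` package and Pillow are installed; the model name, file path, and prompt are illustrative placeholders.

```python
# Hedged sketch: describe a local image with the Gemini API.
# "gemini-pro-vision", the file path, and the prompt are placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # add your API key here

model = genai.GenerativeModel("gemini-pro-vision")
image = Image.open("path/to/your/image.jpg")

# Pass the prompt and the image together as one multimodal request.
response = model.generate_content(["Describe this image.", image])
print(response.text)
```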
Note: If you have any questions about the instructions above, please open an issue in this repository.
At present, Gemini appears to yield results similar to those of GPT-4, although this project has not performed extensive or rigorous testing.
- Both Gemini and GPT-4 are outstanding models for AI applications, particularly multimodal tasks such as image recognition.
- As of now, both APIs are functioning well. However, Gemini seems a bit slower to produce results than GPT-4.
- GPT-4 generally provides more stable outputs. In limited testing, Gemini sometimes exhibited hallucinations.
For queries or suggestions, please contact me at: rongx@vt.edu