Gemini Pro and Vision: Free Python API — A Quick Start-Up Guide
In this article, we will explore how to use Gemini Pro and Gemini Pro Vision for free through their API.
For those who prefer a hands-on approach, there’s a Google Colab available for experimentation.
Introduction
Gemini is a family of large language models developed by Google DeepMind. The models are multimodal, designed to work with and interpret text, code, images, video, and audio. Gemini is seen as a rival to GPT-4, the multimodal model developed by OpenAI.
Gemini comes in three different versions:
- Gemini Ultra
a. largest and most capable model for highly complex tasks
b. still in preview access
c. efficiently serveable at scale on TPU accelerators
- Gemini Pro
a. best model for scaling across a wide range of tasks
b. available now
- Gemini Nano
a. most efficient model for on-device tasks
b. still in preview access
c. trained by distilling from larger Gemini models
d. two versions of Nano, with 1.8B (Nano-1) and 3.25B (Nano-2) parameters, targeting low and high memory devices respectively
Free Pricing
Developers have free access to Gemini Pro and Gemini Pro Vision through the API, with up to 60 requests per minute, which is enough for most app development needs.
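If a script sends requests in a loop, it can help to stay under the 60 requests per minute limit on the client side. The following is a minimal sketch of a sliding-window throttle, not part of the Google SDK. The class name RequestThrottle and its parameters are hypothetical, chosen for illustration only.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side sliding-window limiter: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the class can be tested without real waiting
        self._sleep = sleep
        self._stamps = deque()  # times of recent calls, oldest first

    def wait(self):
        """Block until another request is allowed, then record it."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

A typical use would be to create one throttle with the defaults and call throttle.wait() immediately before each model.generate_content(...) call.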
Google’s Gemini Setup in Google Colab
Ready to play — Google Colab
Step 1
Use Google Colab to get started
Step 2
Google AI Python SDK
It enables developers to use Google’s state-of-the-art generative AI models:
1. Gemini
2. PaLM
Supports:
- Generate text from text-only input
- Generate text from text-and-images input (for Gemini only)
- Build multi-turn conversations (chat)
- Embedding
pip install google-generativeai
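As an illustration of the multi-turn chat capability listed above, here is a small sketch. The helper run_conversation is hypothetical, not an SDK function. It only assumes that the chat object has a send_message method whose result exposes a text attribute, which is how the SDK's chat session behaves.

```python
def run_conversation(chat, prompts):
    """Send each prompt in order over one chat session and collect the replies.

    `chat` is expected to behave like the session returned by
    genai.GenerativeModel('gemini-pro').start_chat(history=[]): it keeps the
    history itself, so later prompts can refer to earlier turns.
    """
    replies = []
    for prompt in prompts:
        response = chat.send_message(prompt)
        replies.append(response.text)
    return replies


# Usage in Colab, once the API key has been configured:
#
#   import google.generativeai as genai
#   chat = genai.GenerativeModel('gemini-pro').start_chat(history=[])
#   for reply in run_conversation(chat, ["Name a planet.", "How far is it from the Sun?"]):
#       print(reply)
```

Passing the chat object in, rather than creating it inside the helper, keeps the sketch independent of the API key and makes it easy to try with a stand-in object.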
Step 3
To set up your API key, follow the link and create the API key required for accessing Gemini.
Google API Key generation.
Add the key to the secrets manager in the left panel. Give it the name GOOGLE_API_KEY.
Configure it with the following code:
from google.colab import userdata
import google.generativeai as genai
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
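The userdata module only exists inside Colab, so the same notebook code fails when run elsewhere. One possible workaround, sketched below, is a small lookup that checks an environment variable first and falls back to Colab secrets. The function name resolve_api_key and its env parameter are hypothetical, and checking the environment first is a design choice made here, not something the SDK requires.

```python
import os


def resolve_api_key(name="GOOGLE_API_KEY", env=None):
    """Return the API key from an environment mapping, else from Colab secrets."""
    env = os.environ if env is None else env
    key = env.get(name)
    if key:
        return key
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        raise RuntimeError(f"{name} is not set and Colab secrets are unavailable") from None
    return userdata.get(name)
```

With this in place, the configuration call becomes genai.configure(api_key=resolve_api_key()).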
Step 4
1. Generate text from text inputs
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("What is medium blogs")
print(response.text)
The text returned by Gemini Pro is available as response.text.
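Reading response.text can raise a ValueError when the response contains no usable text, for example when a prompt is blocked. The sketch below shows one way to handle that case. The helper name text_or_reason is hypothetical, and the fallback message format is an assumption made for illustration.

```python
def text_or_reason(response):
    """Return the generated text, or a short explanation if none is available."""
    try:
        return response.text
    except ValueError:
        # Assumption: no usable text came back, e.g. the prompt was blocked.
        feedback = getattr(response, "prompt_feedback", None)
        return f"[no text returned; prompt_feedback={feedback}]"
```

It can be used in place of a direct read, as in print(text_or_reason(response)).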
2. Generate text from image and text inputs
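import PIL.Image
img = PIL.Image.open('image.jpg')  # placeholder file name; point this at your own image
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(img)  # image-only prompt
response = model.generate_content(["How to make this Recipe", img], stream=True)  # text and image prompt
response.resolve()  # wait for the streamed response to finish before reading response.text
print(response.text)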
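With stream=True, the response can be iterated to receive the answer in pieces as they are generated. The following sketch gathers those pieces into one string. The helper collect_stream and its on_chunk parameter are hypothetical names, and it only assumes that each streamed chunk exposes a text attribute.

```python
def collect_stream(chunks, on_chunk=None):
    """Join the text of streamed chunks, optionally reporting each piece as it arrives."""
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk.text)
        parts.append(chunk.text)
    return "".join(parts)
```

For example, full_text = collect_stream(response, on_chunk=print) would print each piece as it arrives and keep the complete answer in full_text.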
The complete code is available in the Google Colab notebook.