Gemini Pro and Vision: Free Python API — A Quick Start-Up Guide

3 min read · Jan 20, 2024

In this article, we will explore how to use Gemini Pro and Gemini Pro Vision for free through their API.

For those who prefer a hands-on approach, there’s a Google Colab available for experimentation.

Introduction

Gemini is a family of large language models developed by Google DeepMind. The models are multimodal, designed to work with and interpret text, code, images, video, and audio. Gemini is seen as a rival to GPT-4, the multimodal model developed by OpenAI.

Gemini comes in three different versions:

Gemini Ultra
a. largest and most capable model for highly complex tasks
b. still in preview access
c. efficiently servable at scale on TPU accelerators
Gemini Pro
a. best model for scaling across a wide range of tasks
b. available now
Gemini Nano
a. most efficient model for on-device tasks
b. still in preview access
c. trained by distilling from larger Gemini models
d. two versions of Nano, with 1.8B (Nano-1) and 3.25B (Nano-2) parameters, targeting low and high memory devices respectively

Free Pricing

Developers have free access to Gemini Pro and Gemini Pro Vision through the API, with up to 60 requests per minute, making it suitable for most app development needs.
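To stay under the 60 requests per minute mentioned above, a small client-side throttle can help. The sketch below is only an illustration: the class name RequestThrottle and its parameters are hypothetical and not part of the Google SDK, and it assumes the limit is counted over a sliding 60-second window.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side limiter: at most `max_calls` calls per `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # start times of recent calls

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self._clock()
        # Drop timestamps that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

Calling throttle.wait() right before each API request would keep a single-threaded script within the quota; it does not coordinate across several processes or notebooks sharing one key.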

Google’s Gemini Setup in Google Colab

Ready to play — Google Colab

Google Colaboratory

colab.research.google.com

Step 1

Use Google Colab to get started

colab.google

Colab is a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing…

Step 2

Google AI Python SDK

It enables developers to use Google’s state-of-the-art generative AI models:
1. Gemini
2. PaLM

Supports:

Generate text from text-only input
Generate text from text-and-images input (for Gemini only)
Build multi-turn conversations (chat)
Embedding

pip install google-generativeai
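The multi-turn chat item in the list above is not demonstrated elsewhere in this guide, so here is a minimal sketch of how it might look. It assumes the SDK is installed and configured with an API key as described in Step 3, and that model is a genai.GenerativeModel instance; the helper name run_chat is hypothetical.

```python
def run_chat(model, prompts):
    """Send prompts in order through one chat session and return the replies."""
    # start_chat keeps the conversation history between messages.
    chat = model.start_chat(history=[])
    replies = []
    for prompt in prompts:
        response = chat.send_message(prompt)
        replies.append(response.text)
    return replies
```

A call such as run_chat(genai.GenerativeModel('gemini-pro'), ["Hello", "Summarize what I just said"]) would send both messages in the same session, so the second prompt can refer back to the first.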

Step 3

Set up your API key: follow the link below and create the API key required to access Gemini.

Google API Key generation.

Add the key to the secrets manager in the left panel. Give it the name GOOGLE_API_KEY.
Configure it with the following code:

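import google.generativeai as genai
from google.colab import userdata

# Read the key stored in Colab's secrets manager
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)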
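Outside Colab the secrets manager is not available, so the same notebook code may fail when run locally. The sketch below shows one possible fallback, assuming the key is also exported as an environment variable with the same name; the helper name load_api_key is hypothetical.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the API key from Colab secrets, else from the environment."""
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
        key = userdata.get(name)
        if key:
            return key
    except Exception:
        # Not running in Colab, or the secret is missing or not shared
        # with this notebook; fall through to the environment.
        pass
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab secrets or environment")
    return key
```

The result can then be passed to genai.configure(api_key=load_api_key()) in either environment.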

Step 4

1. Generate text from text inputs

model = genai.GenerativeModel('gemini-pro')

response = model.generate_content("What is medium blogs")

# Print the generated text
print(response.text)

The response from gemini-pro is returned as text in response.text.

2. Generate text from image and text inputs

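import PIL.Image

model = genai.GenerativeModel('gemini-pro-vision')

# 'image.jpg' is a placeholder path (an assumption, not from the original); point it at your own image
img = PIL.Image.open('image.jpg')

Example output for the image

response = model.generate_content(img)

response = model.generate_content(["How to make this Recipe", img], stream=True)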
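With stream=True the response arrives in chunks, so it has to be iterated before the full text is available. A minimal sketch of consuming it, assuming each chunk exposes a text attribute as the SDK's streamed responses do (the helper name print_stream is hypothetical):

```python
def print_stream(response):
    """Print streamed chunks as they arrive and return the full text."""
    parts = []
    for chunk in response:
        # Show each piece immediately instead of waiting for the whole answer.
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()
    return "".join(parts)
```

Passing the streamed response from the call above to print_stream(response) would display the recipe as it is generated and return the complete text once the stream ends.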

The Google Colab notebook is linked here.

Written by Abhishek Maheshwarappa