Quit Emailing Yourself

Trying out Gemini 3 Pro with audio transcription and a new pelican benchmark

6 min read | Saved February 14, 2026

Tags: gemini, audio-transcription, benchmarks, pricing, multimodal

Do you care about this?

The article reviews Google’s Gemini 3 Pro, highlighting its improved features over Gemini 2.5, including audio transcription capabilities and performance benchmarks compared to other AI models. It details pricing, multimodal input support, and tests involving image analysis and a city council meeting audio transcript.

If you do, here's more

Gemini 3 Pro launched on November 18, 2025, as an upgrade over Gemini 2.5, aiming to compete with top AI models like Claude Sonnet 4.5 and GPT-5.1. It has the same January 2025 knowledge cutoff as Gemini 2.5. The model accepts up to 1 million input tokens and can output up to 64,000 tokens, and it supports text, image, audio, and video input. Benchmarks show it slightly outperforming its main competitors on several tests, though these results have not yet been independently verified.

In terms of pricing, Gemini 3 Pro is more expensive than Gemini 2.5 but remains cheaper than Claude Sonnet 4.5. For instance, it charges $2.00 per million input tokens for prompts of up to 200,000 tokens and $4.00 per million for longer prompts, compared to Claude Sonnet 4.5's $3.00 per million input tokens for prompts of up to 200,000 tokens.

The article details several benchmarks, showing Gemini 3 Pro excelling in areas like academic reasoning and multimodal understanding. It also describes hands-on tests of the model, including generating alt text from an image and transcribing audio of a city council meeting. The initial audio transcription attempt failed, but after the file size was reduced, the model produced a detailed Markdown transcript that outlined key sections and included speaker names and timestamps.
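
To make the tiered input pricing concrete, here is a minimal Python sketch that estimates the input cost of a single prompt. It rests on two assumptions that the summary does not spell out: the quoted rates are in US dollars per million input tokens, and the higher rate applies to the whole prompt once it exceeds 200,000 tokens. The function and constant names are hypothetical, and output-token pricing is not covered.

```python
# Rough input-cost estimator for the tiered pricing described above.
# Assumptions (not confirmed by the summary): rates are USD per million
# input tokens, and the higher rate applies to the entire prompt once it
# exceeds the threshold. All names here are illustrative.

TIER_THRESHOLD_TOKENS = 200_000
RATE_UP_TO_THRESHOLD = 2.00   # USD per million input tokens
RATE_ABOVE_THRESHOLD = 4.00   # USD per million input tokens


def estimate_input_cost(prompt_tokens: int) -> float:
    """Return the estimated input cost in USD for one prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens <= TIER_THRESHOLD_TOKENS:
        rate = RATE_UP_TO_THRESHOLD
    else:
        rate = RATE_ABOVE_THRESHOLD
    return prompt_tokens / 1_000_000 * rate


if __name__ == "__main__":
    for tokens in (50_000, 200_000, 600_000):
        print(f"{tokens:>9,} input tokens: ${estimate_input_cost(tokens):.2f}")
```

Under these assumptions, the jump at the threshold matters for long inputs such as meeting audio: a prompt just over 200,000 tokens costs roughly twice as much per token as one just under it.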
