Apple’s Pico Banana 400k dataset advances text-guided image editing, the task of changing pixels with plain-language instructions. It provides nearly 400,000 text–image–edit examples for training AI models toward precise creative control. For researchers in multimodal AI and developers building editing tools, it is a substantial new resource.

Released as part of Apple’s open research, Pico Banana 400k builds on the Open Images library, with Gemini-generated prompts and Nano-Banana edits judged by Gemini-2.5-Pro. It collects successes, failures, and multi-turn editing conversations for fine-tuning and preference learning. Here’s what makes Pico Banana 400k essential for instruction-aware image manipulation.

What Is Pico Banana 400k?

Pico Banana 400k is a dataset for text-guided image editing—transforming photos with natural language. Examples include adjusting brightness in a landscape or replacing a lion with butterflies in a savanna. It includes ~400K high-res images (512–1024 pixels) from Open Images, covering humans, objects, and text.

The “Pico Banana” name is a playful nod to the Nano-Banana model that generates the edits. The dataset includes three types of samples:

  • Single-Turn SFT Samples: ~257K triplets for one-shot changes.
  • Preference Learning Samples: ~56K pairs of positive and negative edits.
  • Multi-Turn SFT Samples: ~72K chains for iterative editing.

It covers 35 operations in eight categories, from photometric shifts to stylistic changes.
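
To make the three splits concrete, here is a minimal, hypothetical sketch of how their records could be represented in Python; the class and field names are illustrative assumptions, not the dataset’s actual schema.

    from dataclasses import dataclass
    from typing import List

    # Hypothetical record layouts for the three splits described above.
    # Field names are illustrative assumptions, not the dataset's schema.

    @dataclass
    class SingleTurnSample:            # ~257K one-shot edits
        source_image: str              # original photo (path or URL)
        instruction: str               # natural-language edit request
        edited_image: str              # edited result

    @dataclass
    class PreferenceSample:            # ~56K positive/negative pairs
        source_image: str
        instruction: str
        preferred_edit: str            # edit judged successful
        rejected_edit: str             # edit judged a failure

    @dataclass
    class MultiTurnSample:             # ~72K iterative editing chains
        source_image: str
        instructions: List[str]        # one instruction per turn
        edited_images: List[str]       # result after each turn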

Key Features of Pico Banana 400k

Pico Banana 400k is produced by an automated generation-and-judging pipeline. Gemini-2.5-Flash writes the edit prompts, Nano-Banana performs the edits, and Gemini-2.5-Pro evaluates each result with a weighted scorecard: 40% instruction compliance, 25% realism, 20% preservation, and 15% technical quality.
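
As a rough illustration of how such a weighted scorecard rolls up into a single number, here is a minimal Python sketch; the 0-to-1 scale and the example per-criterion scores are assumptions for illustration, not the judge’s exact procedure.

    # Scorecard weights from the article: 40% compliance, 25% realism,
    # 20% preservation, 15% quality. A 0-1 score scale is assumed here.
    WEIGHTS = {
        "compliance": 0.40,
        "realism": 0.25,
        "preservation": 0.20,
        "quality": 0.15,
    }

    def overall_score(scores: dict) -> float:
        """Combine per-criterion scores into one weighted score."""
        return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

    # Example: an edit that follows the instruction well but loses some detail.
    print(overall_score({"compliance": 0.9, "realism": 0.8,
                         "preservation": 0.6, "quality": 0.85}))  # ~0.81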

Edit categories:

Category              | Description                          | Share
----------------------|--------------------------------------|------
Object-Level Semantic | Add/remove/replace/relocate objects  | 35%
Scene Composition     | Lighting or environmental changes    | 20%
Human-Centric         | Outfit or pose adjustments           | 18%
Stylistic             | Artistic styles like oil painting    | 10%
Text & Symbol         | Edit billboards or graffiti          | 8%
Pixel & Photometric   | Contrast or color adjustments        | 5%
Scale & Perspective   | Zooms or viewpoint changes           | 2%
Spatial/Layout        | Expansions or rearrangements         | 2%

Prompts read like everyday requests, such as “Replace the red apple with a green one.” Failure cases are kept to improve model robustness, while high resolution, subject diversity, and a minimum quality score of roughly 0.7 keep the data reliable.

How Pico Banana 400k Is Built

The pipeline starts with Open Images’ 9M+ photos (CC BY 2.0). Gemini-2.5-Flash writes edit instructions, Qwen-2.5-Instruct-7B summarizes them into concise prompts, Nano-Banana applies the edits, and Gemini-2.5-Pro scores the results. Successful edits go to the SFT splits, failed edits become the negative side of preference pairs, and sequences of edits form the multi-turn split.
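
Below is a minimal sketch of that routing step, assuming pre-scored edit records and the roughly 0.7 quality floor mentioned earlier; the record fields and the exact threshold placement are illustrative assumptions, not the authors’ implementation.

    # Hypothetical routing of scored edits into splits. Record fields and the
    # exact threshold are assumptions; the article cites a ~0.7 quality floor.
    SUCCESS_THRESHOLD = 0.7

    def route(records):
        """records: iterable of dicts with 'source', 'prompt', 'edit', 'score'."""
        sft, negatives = [], []
        for r in records:
            if r["score"] >= SUCCESS_THRESHOLD:
                sft.append(r)          # successful edit -> single-turn SFT split
            else:
                negatives.append(r)    # failed edit -> negative side of a preference pair
        return sft, negatives

    sft, negatives = route([
        {"source": "img_001.jpg", "prompt": "Replace the red apple with a green one",
         "edit": "img_001_edit.jpg", "score": 0.82},
        {"source": "img_002.jpg", "prompt": "Turn the scene into an oil painting",
         "edit": "img_002_edit.jpg", "score": 0.55},
    ])
    print(len(sft), len(negatives))    # 1 1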

Each split ships as JSONL manifests that describe the edits and point to downloadable files. A companion Python script maps Open Images URLs to local file paths.
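
Here is a minimal sketch of reading one of those JSONL manifests in Python; the manifest file name in the commented usage is hypothetical, since the exact schema isn’t reproduced here.

    import json

    def load_manifest(path):
        """Read a JSONL file: one JSON record per line."""
        records = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:                      # skip blank lines
                    records.append(json.loads(line))
        return records

    # Hypothetical usage -- the manifest file name is an assumption:
    # records = load_manifest("pico_banana_single_turn.jsonl")
    # print(records[0].keys())              # inspect the available fields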

Applications of Pico Banana 400k

Pico Banana 400k supports:

  • Editing Models: Fine-tune diffusion models for better instruction-following accuracy.
  • Conversational Tools: Enable iterative, multi-turn edits.
  • Training: Use preference pairs for reward modeling (see the sketch after this list).
  • Benchmarking: Test instruction compliance and edit quality.
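
For the reward-modeling use case, here is a minimal sketch of a pairwise (Bradley–Terry style) loss over preferred and rejected edits, a common approach to preference learning rather than a method prescribed by the dataset itself.

    import math

    def pairwise_loss(score_preferred: float, score_rejected: float) -> float:
        """Penalize a reward model that rates the rejected edit too highly."""
        margin = score_preferred - score_rejected
        return -math.log(1.0 / (1.0 + math.exp(-margin)))

    # Preferred edit already scores higher -> small loss.
    print(pairwise_loss(0.8, 0.3))   # ~0.47
    # Rejected edit scores higher -> large loss.
    print(pairwise_loss(0.2, 0.9))   # ~1.10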

It aids photographers, marketers, and educators alike. Early results reportedly show fewer artifacts and closer adherence to user intent.

Getting Started with Pico Banana 400k

The dataset files are hosted on Apple’s CDN; visit the GitHub repo for the manifests and helper scripts.

  1. Download Files: Grab the JSONL manifests linked from the GitHub repo.
  2. Source Images:
  • Use the manifest URLs or the AWS CLI to fetch and extract the Open Images TARs (a sketch of the URL-to-local mapping follows this list):
    aws s3 --no-sign-request --endpoint-url https://s3.amazonaws.com cp s3://open-images-dataset/tar/train_0.tar.gz .
    aws s3 --no-sign-request --endpoint-url https://s3.amazonaws.com cp s3://open-images-dataset/tar/train_1.tar.gz .
    mkdir openimage_source_images
    tar -xvzf train_0.tar.gz -C openimage_source_images
    tar -xvzf train_1.tar.gz -C openimage_source_images
    wget https://storage.googleapis.com/openimages/2018_04/train/train-images-boxable-with-rotation.csv
    python map_openimage_url_to_local.py
  3. Load Data: Parse the JSONL manifests for training.
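
The repo’s map_openimage_url_to_local.py is not reproduced here; below is a minimal sketch of the same idea, joining image IDs from the rotation CSV to the files extracted in step 2. The column names ("ImageID", "OriginalURL") follow the Open Images metadata CSV and should be treated as assumptions.

    import csv
    import os

    def build_url_to_local(csv_path, image_dir):
        """Map Open Images URLs to locally extracted image paths."""
        mapping = {}
        with open(csv_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                local = os.path.join(image_dir, row["ImageID"] + ".jpg")
                if os.path.exists(local):                 # keep only images we downloaded
                    mapping[row["OriginalURL"]] = local
        return mapping

    # Hypothetical usage with the paths from step 2:
    # mapping = build_url_to_local("train-images-boxable-with-rotation.csv",
    #                              "openimage_source_images")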

Licensing and Ethics

The dataset is released under CC BY-NC-ND 4.0 for non-commercial use, while the source images remain under CC BY 2.0. Prompts are written to avoid toxic content. See the arXiv paper for full details.
