Apple’s Pico-Banana-400K dataset features over 400,000 real-world images paired with Nano-Banana model edits, enabling researchers to explore text-guided, context-aware visual transformations across 35 diverse categories, including color changes, object manipulation, and background modifications.
Apple has quietly unveiled Pico-Banana-400K, a large-scale dataset designed to advance research in text-guided image editing. The dataset, introduced through a research paper published on arXiv, marks one of Apple’s most significant open contributions to the global AI research community in recent years.
The Pico-Banana-400K dataset contains over 400,000 real-world images sourced from the OpenImages collection, each paired with an edited counterpart generated using Google’s Nano-Banana image-editing model. This pairing lets researchers study how text prompts can guide precise, context-aware visual transformations. The edits span 35 categories, including color adjustments, object addition or removal, background replacement, and structural modifications, giving model training a diverse range of examples.
To maintain data reliability, each image-edit pair underwent automated scoring by multimodal large language models (MLLMs) followed by human curation, ensuring consistency between textual instructions and visual outcomes.
Structured for advanced AI research
Apple’s dataset is divided into three specialized subsets. One features 72,000 multi-turn examples to support research on sequential editing and reasoning across multiple steps. Another includes 56,000 preference-based samples for alignment and reward model development. A third subset pairs long and short text instructions, helping researchers explore how AI systems interpret or condense commands effectively.
Beyond its size and structure, Pico-Banana-400K also addresses persistent challenges in AI image editing, such as over-reliance on synthetic data and inconsistent quality control. Because the dataset retains failed edit attempts alongside successful ones, models can be trained on contrasting good and bad results for the same instruction, which is intended to improve robustness and generalization.
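As a rough illustration of how successful and failed edits of the same image could feed preference-based training, the Python sketch below groups scored edit attempts and pairs each passing edit with each failing one. It is not Apple’s pipeline: the field names, the file paths, and the 0.7 pass threshold are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EditAttempt:
    # All field names are hypothetical, chosen for illustration only.
    image_id: str      # identifier of the source image
    instruction: str   # text prompt describing the requested edit
    edited_path: str   # location of the edited output image
    score: float       # quality score from an automated judge, assumed in [0, 1]


def build_preference_pairs(attempts, threshold=0.7):
    """Pair every passing edit with every failing edit of the same image and instruction.

    The threshold is an assumed cut-off, not a value taken from the dataset.
    """
    groups = defaultdict(list)
    for attempt in attempts:
        groups[(attempt.image_id, attempt.instruction)].append(attempt)

    pairs = []
    for (image_id, instruction), group in groups.items():
        passed = [a for a in group if a.score >= threshold]
        failed = [a for a in group if a.score < threshold]
        for good in passed:
            for bad in failed:
                pairs.append({
                    "image_id": image_id,
                    "instruction": instruction,
                    "chosen": good.edited_path,
                    "rejected": bad.edited_path,
                })
    return pairs


# Toy data: two attempts at one edit, plus one edit with no failed attempt.
attempts = [
    EditAttempt("img_001", "make the sky orange", "img_001_a.png", 0.91),
    EditAttempt("img_001", "make the sky orange", "img_001_b.png", 0.35),
    EditAttempt("img_002", "remove the bicycle", "img_002_a.png", 0.88),
]
pairs = build_preference_pairs(attempts)
```

Only the first image yields a pair here, since the second has no failed attempt to contrast against. A real pipeline would also need to decide how to handle groups in which every attempt fails.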
Available for non-commercial academic and research purposes, Pico-Banana-400K sets a new benchmark for training and evaluating the next generation of instruction-based image editing systems.