Product · 7 min read · May 28, 2026

Search Your Library by Meaning, Not Filenames

5AM's new semantic search lets you find photos and videos by describing what's in them — and a redesigned AI Search Index in album settings shows you exactly what it costs before you index a single file.

5AM Team

Writer & Creative

Everyone has the same problem with their media library: you know the photo exists, but you have no idea what it's called. Was it IMG_4821.jpg? Was it in the Italy album or the one from the conference? You end up scrolling for ten minutes looking for "that shot of the red bicycle against the blue wall."

Today we're shipping the fix. Semantic search lets you find media by describing what's in it. Type "red bicycle against a blue wall" and you get the photo — even though nobody ever typed those words as a caption, a tag, or a filename.

This post covers how it works, the new AI Search Index controls in album settings, and exactly what it costs.


What "semantic" actually means here

A normal search matches text. If you search "dog," it finds files with "dog" in the name or description, and misses the photo of your golden retriever labeled IMG_0042.jpg.

Semantic search matches meaning. Under the hood, every piece of media gets turned into an embedding — a list of numbers that captures what the content is about. Your search query gets turned into an embedding too, and we find the media whose meaning sits closest to your query.

The result: searching "celebration" surfaces birthday parties, fireworks, and a champagne toast — none of which contain the literal word "celebration." It understands the concept, not just the characters.

We use Google's gemini-embedding-2 model for this, and one detail makes it especially powerful: it's multimodal. It can read the pixels of an image directly, placing photos and text into the same shared "meaning space." That's why a text query can find an untagged photo — the image itself was understood, not just a caption someone wrote about it.
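
To make "closest in meaning" concrete, here is a minimal Python sketch of ranking a library against a query. It assumes cosine similarity as the closeness measure, which is a common choice but not something this post specifies, and it uses made-up three-number vectors in place of real embeddings. The function names and filenames are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_meaning(query_vec, library, limit=3):
    """Return (name, score) pairs, closest meaning first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors invented for illustration; real embeddings have far more dimensions.
library = {
    "IMG_0042.jpg": [0.9, 0.1, 0.0],
    "IMG_4821.jpg": [0.1, 0.9, 0.2],
    "IMG_1234.jpg": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for the embedded search text
results = rank_by_meaning(query, library)
```

Because the query and the media live in the same vector space, the ranking never looks at filenames or captions, only at how closely the vectors point in the same direction.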


How indexing works (and why media types differ)

Before search can find something, that item has to be indexed. Here's what happens per media type:

  • Photos are embedded directly from the image itself. If a photo already has an AI description, we fold that text in too for an even richer match — but a photo with no description at all still becomes fully searchable, because the model reads the image.
  • Videos and audio are different. The embedding model can't watch a long clip, so these first get a short AI-generated summary of their content, and that summary text is what gets embedded. If a video already has a description, we skip the summary step and embed the description directly.

The upshot for you: photos are indexed in a single step, while videos and audio take an extra step to "watch and summarize" first. Both end up equally searchable.
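
The per-type rules above can be sketched as a small decision function. This is an illustration of the branching only, not 5AM's actual code: the `MediaItem` type and the step names are hypothetical, and treating audio with an existing description the same way as video is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaItem:
    kind: str  # "photo", "video", or "audio"
    description: Optional[str] = None


def plan_indexing(item):
    """Return the ordered (hypothetical) steps needed to index one item."""
    if item.kind == "photo":
        # Photos are embedded from the image; a description is folded in if present.
        if item.description:
            return ["embed_image_with_description"]
        return ["embed_image"]
    if item.kind in ("video", "audio"):
        # Clips are represented by text: an existing description, or a fresh summary.
        if item.description:
            return ["embed_description"]
        return ["summarize", "embed_summary"]
    raise ValueError(f"unknown media kind: {item.kind!r}")
```

Only the clip-without-description case needs two steps, which is why those items show up separately in the cost breakdown.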


The AI Search Index, right in album settings

We used to nag you with a banner above every album: "12 images missing AI description." It was in your face, and worse, it asked you to commit to indexing without telling you what it would cost. That banner is gone.

In its place, every album owner now has an AI Search Index section inside Album Settings (it sits between Album Sharing and the Danger Zone). It's calm, on-demand, and honest about cost. When you open it, you'll see one of a few states:

  • All indexed — a green checkmark and "All media in this album is indexed for semantic search." Nothing to do.
  • Work pending — a clear breakdown of what's left, like "12 photos · 3 photos with descriptions · 1 video needs a summary first," an estimated cost in dollars, and a Generate button.
  • In progress — "Indexing already in progress. Batch jobs can take a few minutes to a few hours," with a Refresh link to check back.

You decide when to index, album by album (or your whole library at once). Nothing happens until you click Generate.

Note: This feature is live on the web app today. The 5AM iOS app isn't released yet, but the same AI Search Index surface is built and waiting for it.


What it costs

This is the part most tools hide. We put it front and center, because the honest answer is: for almost everyone, it's a rounding error.

Indexing runs on your own Gemini API key, and we route the bulk indexing through the Gemini Batch API — which Google prices at half the standard rate. The cost line in album settings even annotates this: "Gemini Batch API rate, 50% of standard."

Here's the breakdown of what each piece costs, using current Gemini pricing (all at the discounted batch rate):

  • Embedding a photo or description: gemini-embedding-2 at $0.10 per million tokens.
  • Summarizing a video/audio clip first: gemini-2.5-flash at $0.15 in / $1.25 out per million tokens.

To put that in real numbers:

  • Photos are effectively free to index. A photo embed is about 100 tokens. At $0.10 per million tokens, that's about $0.00001 per photo, roughly a thousandth of a cent. Indexing a 1,000-photo album costs around one cent.
  • Videos and audio cost a little more because they need a summary step first (and that summary runs at standard Flash rates, since it goes through our video service). Figure somewhere in the $0.001–$0.005 per clip range.

The estimate you see in album settings is computed from these exact numbers before you click anything — no surprises on your bill.
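
If you want to sanity-check an estimate yourself, the arithmetic is just tokens times rate. The sketch below is a back-of-the-envelope version, not the calculation the app runs: it assumes the $0.10 per million token embedding rate and the rough 100-tokens-per-photo figure quoted above, reuses the per-clip range quoted above for videos and audio, and all function names are hypothetical.

```python
EMBED_USD_PER_MILLION_TOKENS = 0.10  # gemini-embedding-2 rate quoted in this post
TOKENS_PER_PHOTO = 100               # rough figure from this post; real counts vary


def embedding_cost_usd(photo_count,
                       tokens_per_photo=TOKENS_PER_PHOTO,
                       usd_per_million=EMBED_USD_PER_MILLION_TOKENS):
    """Estimated cost of embedding photo_count photos, in US dollars."""
    total_tokens = photo_count * tokens_per_photo
    return total_tokens * usd_per_million / 1_000_000


def clip_cost_range_usd(clip_count, low_per_clip=0.001, high_per_clip=0.005):
    """Estimated (low, high) cost of summarizing and embedding clips, in US dollars."""
    return clip_count * low_per_clip, clip_count * high_per_clip
```

Pass your own `tokens_per_photo` or rate if Google's pricing changes; treat the result as an order-of-magnitude check rather than a bill.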

For the official, always-current pricing straight from Google, see the Gemini API pricing page.


Built to be safe to re-run

A few things we sweated so you don't have to:

  • It never re-indexes what's already done. Each item is fingerprinted by its content. Re-click Generate on an album that's fully indexed and it'll simply say "nothing to do" — no duplicate work, no duplicate cost.
  • It only re-indexes what changed. Edit one photo's description and re-run, and only that single photo goes back through.
  • It survives crashes. Batch jobs can run for hours on Google's side. Our system tracks every job in the database and resumes cleanly after any restart, so you never lose an indexing run or get charged for a phantom one.
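
The "fingerprint, then skip what's unchanged" idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our production code: it assumes a SHA-256 hash over the media bytes plus the description text, and the function names and in-memory dictionaries are hypothetical stand-ins for a real database.

```python
import hashlib


def fingerprint(media_bytes, description=""):
    """Hash the media bytes and description together (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    h.update(media_bytes)
    h.update(b"\x00")  # separator so bytes and description cannot blur together
    h.update(description.encode("utf-8"))
    return h.hexdigest()


def items_to_index(album, indexed_fingerprints):
    """Return names whose current fingerprint differs from the stored one.

    album: dict of name -> (media_bytes, description)
    indexed_fingerprints: dict of name -> fingerprint recorded at last indexing
    """
    todo = []
    for name, (data, description) in album.items():
        if indexed_fingerprints.get(name) != fingerprint(data, description):
            todo.append(name)
    return todo
```

An empty result corresponds to the "nothing to do" case, and editing a single description changes only that item's fingerprint, so only that item is returned.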

How to try it

  1. Open any album you own and go to Settings.
  2. Scroll to the AI Search Index section.
  3. Review the breakdown and the estimated cost.
  4. Click Generate. (If you haven't saved a Gemini API key yet, you'll be prompted once.)
  5. Give it a few minutes for photos — videos may take longer while they're summarized — then use the album's search box and describe what you're looking for.

That's it. Start with one album, watch a search that should be impossible suddenly work, and then index the rest of your library knowing exactly what it'll cost.


Prefer the terminal? Use the CLI

Everything above is also available from the 5am CLI — handy for scripting, bulk work, or wiring search into your own tools.

Index an album (same batch-API path as the web UI, at the discounted rate):

5am albums generate-summary <albumId>

It's async and idempotent — re-running on an already-indexed album submits nothing and just reports "Already indexed — nothing to do."

Search by meaning:

5am media semantic-search "sunsets at the beach"
5am media semantic-search "wedding photos" --album <albumId> --limit 20

Both need a Gemini key set once with 5am keys set gemini <key> (the index call embeds your media; the search call embeds your query). Results come back as JSON by default — add --pretty for a readable table.
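
For wiring search into your own tools, a thin Python wrapper around the CLI might look like the sketch below. It only uses the command and flags shown above (`--album`, `--limit`) and relies on the default JSON output; the wrapper function names are hypothetical, and because the shape of the JSON is not documented in this post, the parsed result is returned as-is.

```python
import json
import subprocess


def build_search_command(query, album_id=None, limit=None):
    """Build the argv list for `5am media semantic-search`."""
    argv = ["5am", "media", "semantic-search", query]
    if album_id is not None:
        argv += ["--album", album_id]
    if limit is not None:
        argv += ["--limit", str(limit)]
    return argv


def semantic_search(query, album_id=None, limit=None):
    """Run the search via the CLI and return whatever JSON it prints, parsed."""
    completed = subprocess.run(
        build_search_command(query, album_id, limit),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return json.loads(completed.stdout)
```

Passing the query as a single list element avoids shell quoting issues with multi-word descriptions.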
