In today’s data-driven landscape, businesses increasingly rely on the power of similarity and contextual search to enable features like visual search, product recommendations, semantic text retrieval, and more. However, setting up the infrastructure to support scalable, low-latency vector search often requires specialized tools, dedicated vector databases, complex indexing strategies, and deployments. This can create barriers for teams that want to experiment or integrate vector search into their existing pipelines quickly. Recently, AWS announced Amazon S3 Vector Buckets (in preview), which address this challenge by bringing vector storage and similarity search directly into Amazon S3 - turning a familiar, scalable object store into a lightweight vector database.
Amazon S3 Vector Buckets enable scalable storage and efficient retrieval of vector data. Key features include no additional infrastructure to provision, scalability (indexes grow with your data), cost effectiveness, simple API integration, and low-latency queries.
In this blog, we’ll walk through a lightweight implementation of a product search workflow using Amazon S3 Vector Buckets - all within a Jupyter notebook. We’ll explore how to enable both text-based and image-based search, and demonstrate a simple approach to multimodal search by combining results using Reciprocal Rank Fusion (RRF).
At the core of this workflow are vectors: numerical representations of data like text or images. We’ll use Amazon S3 Vector Buckets to create vector indexes, add vectors to those indexes, and perform similarity searches to find the most relevant matches. But before diving into the implementation, let’s take a moment to understand what a vector is at a high level.
What is a Vector?
A vector is a numerical representation of something - in our case, data like images, text, or audio. These numbers are mapped into a high-dimensional space so that similar data ends up close together. This numeric representation enables similarity measurements through mathematical metrics like cosine similarity or Euclidean distance, facilitating accurate and efficient data retrieval.
Simple Image Example: Pixels as Vectors
Consider a 3 x 3 grayscale image (left) and its pixel values (right). Each pixel value ranges from 0 to 255, representing brightness. These values can be flattened into a vector.

Just like this (1 x 9 dimensions), modern AI models take larger and more complex images and convert them into much higher-dimensional vectors (like 512 or 1024 dimensions). These vectors capture not just color or brightness - but complex patterns, shapes, and even semantic meaning.
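To make this concrete, here’s a minimal sketch of the idea in NumPy (the pixel values are made up for illustration):
import numpy as np
# A made-up 3 x 3 grayscale image; each value is a brightness between 0 and 255.
image = np.array([
    [  0, 128, 255],
    [ 64, 192,  32],
    [255,  16, 100],
], dtype=np.float32)
# Flatten the 3 x 3 grid into a single 1 x 9 vector.
vector = image.flatten()
print(vector.shape)  # (9,)
# Cosine similarity compares vector directions; 1.0 means identical direction.
def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
brighter = vector + 10  # a slightly brighter version of the same image
print(cosine_similarity(vector, brighter))  # close to 1.0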
Personalization use case: Product Search Demonstrated via Jupyter Notebook
The example is presented as a practical tutorial in Jupyter notebook format, demonstrating individual text-based and image-based searches, and implementing multimodal search by combining text and image results using Reciprocal Rank Fusion (RRF).
S3 Vector bucket creation
- via AWS CLI (optionally, you can also specify the encryption configuration). Make sure to update the AWS CLI to the latest version; I currently have 2.27.60 installed.
aws s3vectors create-vector-bucket --vector-bucket-name "media-vector-bucket"
- via SDK (Boto3)
import boto3
s3vectors = boto3.client("s3vectors")
response = s3vectors.create_vector_bucket(
    vectorBucketName='media-vector-bucket'
)
Vector index creation
In this example, we’ll create two separate vector indexes - one for images and another for text. This allows us to store and query image and text embeddings independently, while still leveraging the power of Amazon S3 Vectors for high-speed similarity search within each modality.
A vector index is the structure that organizes the stored vector data and enables fast similarity search.
Image index
- via AWS CLI
aws s3vectors create-index \
--vector-bucket-name "media-vector-bucket" \
--index-name "img-index" \
--data-type "float32" \
--dimension 768 \
--distance-metric "cosine"
- via SDK (Boto3)
response = s3vectors.create_index(
    vectorBucketName='media-vector-bucket',
    indexName='img-index',
    dataType='float32',
    dimension=768,
    distanceMetric='cosine',
)
The command above creates an image index named img-index within the vector bucket media-vector-bucket. We set the dimension to 768, the data type to float32 (currently the only supported type), and the distance metric to cosine for similarity search (cosine and Euclidean are currently supported).
The reason we specify 768 as the dimension is that the model we’ll use to generate image embeddings - google/vit-base-patch16-224-in21k - outputs vectors of 768 dimensions. This ensures that the vectors we generate will be compatible with the index we’ve defined.
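If you want to verify a model’s output dimension before creating the index, the model config exposes it; a quick check (assuming the transformers library is installed):
from transformers import AutoConfig
config = AutoConfig.from_pretrained("google/vit-base-patch16-224-in21k")
print(config.hidden_size)  # 768 - must match the index dimension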
Text index
- via AWS CLI
aws s3vectors create-index \
--vector-bucket-name "media-vector-bucket" \
--index-name "txt-index" \
--data-type "float32" \
--dimension 384 \
--distance-metric "cosine"
- via SDK (Boto3)
response = s3vectors.create_index(
    vectorBucketName='media-vector-bucket',
    indexName='txt-index',
    dataType='float32',
    dimension=384,
    distanceMetric='cosine',
)
The reason we specify 384 as the dimension is that the model we’ll use to generate text embeddings - sentence-transformers/all-MiniLM-L6-v2 - outputs vectors of 384 dimensions.
⚠️ Important Note
Amazon S3 Vector Buckets currently support dimension values between 1 and 4096. Higher-dimensional vectors require more storage space, which may impact both performance and cost.
Data preparation
For this demo, we’ll use the Fashion Product Images Dataset available on Kaggle. To keep things lightweight and manageable within a notebook setup, we’ll randomly select 10,000 product images from the dataset to build our image search experience.
images.csv contains the filename and the image link, while styles.csv contains the image metadata. Hence, we merge both dataframes.
import os
from pathlib import Path
import pandas as pd
base_data_path = f"{Path().resolve().parent}/data/"
storage_bucket_name = os.environ.get("STORAGE_BUCKET_NAME")
# Load the image list and the product metadata.
images_df = pd.read_csv(f"{base_data_path}/fashion-dataset/images.csv")
style_df = pd.read_csv(f"{base_data_path}/fashion-dataset/styles.csv", on_bad_lines='skip')
# Normalize the join key: styles.csv uses a numeric id, images.csv a filename.
style_df["id"] = style_df["id"].astype(str)
images_df["id"] = images_df.filename.str.replace(".jpg", "", regex=False)
# Merge metadata onto the image list and drop rows without a usable link or name.
merged_df = images_df.merge(style_df, on="id", how="left")
merged_df = merged_df[merged_df.link != "undefined"]
# Randomly sample 10,000 products to keep the demo lightweight.
sampled_df = merged_df[~merged_df["productDisplayName"].isna()].sample(n=10000)
sampled_df.reset_index(drop=True, inplace=True)
sampled_df.fillna("", inplace=True)
Optionally, if the data is stored on S3, add an s3_uri column to sampled_df.
sampled_df["s3_uri"] = sampled_df["filename"].apply(lambda x: f"s3://{storage_bucket_name}/fashion-dataset/images/{x}")

From Embeddings to Index: Creating and Uploading Vectors
The next step is to define a helper function that takes in raw input (either an image or a product name) and returns its corresponding vector embedding. You’re free to use any model that suits your use case (as long as it satisfies the dimension and data type limitations) for generating these vectors - whether it’s a proprietary model or an open-source one.
For this example, we’ll use open-source models to generate embeddings:
- For images, we’ll use a vision model like google/vit-base-patch16-224-in21k
- For text, we’ll use a lightweight model like sentence-transformers/all-MiniLM-L6-v2
These models will help us convert product data into vector representations that we can store and search in Amazon S3 Vector Buckets.
import numpy as np
import torch
from transformers import AutoImageProcessor, AutoModel
from sentence_transformers import SentenceTransformer
iprocessor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
imodel = AutoModel.from_pretrained("google/vit-base-patch16-224-in21k")
tmodel = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
def get_embeddings(inputs, mode):
    if mode == "image":
        img_inputs = iprocessor(images=inputs, return_tensors="pt")
        with torch.no_grad():
            outputs = imodel(**img_inputs)
        last_hidden_state = outputs.last_hidden_state[:, 0, :]  # [CLS] token, shape: [batch_size, 768]
        features = last_hidden_state / last_hidden_state.norm(dim=1, keepdim=True)  # L2-normalize each row
        features = features.cpu().numpy().astype(np.float32)
        return [f.flatten().tolist() for f in features]
    if mode == "text":
        embeddings = tmodel.encode(inputs)
        return [f.tolist() for f in embeddings]
The get_embeddings helper method accepts batch inputs along with the mode, either image or text.
⚠️ Important Note
Ensure that the returned vector embedding is a list and that each element in the list is of type float32 or lower precision. The length of the embedding list must match the dimension defined during index creation. If you pass data with higher precision, S3 Vectors converts the values to 32-bit floating point before storing them.
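Before uploading, a minimal sanity check along these lines can catch dimension or type mismatches early (a sketch; validate_embedding is a hypothetical helper, and get_embeddings is the function defined above):
import numpy as np
def validate_embedding(embedding, expected_dim):
    # Must be a flat list whose length matches the index dimension.
    assert isinstance(embedding, list), "embedding must be a list"
    assert len(embedding) == expected_dim, f"expected {expected_dim} dims, got {len(embedding)}"
    # Values must be representable as 32-bit floats.
    np.asarray(embedding, dtype=np.float32)
validate_embedding(get_embeddings(["sample product name"], mode="text")[0], expected_dim=384)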
Creating and Uploading Vectors
To efficiently process and upload large volumes of data, we define a function called create_and_upload_vectors. This function processes the input data in batches of 500 records.
For each batch, it:
- Invokes the get_embeddings function to generate vector embeddings for both image and text data.
- Attaches metadata such as gender, category, and product name to each vector for future filtering.
- Uploads the vectors to their respective indexes:
- Image embeddings → img-index
- Text embeddings → txt-index
Batching ensures the upload process remains scalable and performant, especially when dealing with thousands of vectors.
from PIL import Image
def create_and_upload_vectors(df, vector_bucket_name, image_index_name, text_index_name, batch_size=500):
    for i in range(0, len(df), batch_size):
        image_vectors, text_vectors = [], []
        batch = df.iloc[i:i + batch_size]
        images_batch = [Image.open(f"{base_data_path}/fashion-dataset/images/{f}").convert("RGB") for f in batch["filename"]]
        text_batch = batch["productDisplayName"].tolist()
        image_embeddings = get_embeddings(images_batch, mode="image")
        text_embeddings = get_embeddings(text_batch, mode="text")
        for ind, v in enumerate(batch.itertuples()):
            metadata = {"gender": v.gender, "category": v.masterCategory, "type": v.usage, "season": v.season, "productName": v.productDisplayName, "s3_uri": v.s3_uri}
            text_vectors.append({
                "key": f"{v.id}",
                "data": {"float32": text_embeddings[ind]},
                "metadata": metadata})
            image_vectors.append({
                "key": f"{v.id}",
                "data": {"float32": image_embeddings[ind]},
                "metadata": metadata})
        s3vectors.put_vectors(
            vectorBucketName=vector_bucket_name,
            indexName=image_index_name,
            vectors=image_vectors
        )
        s3vectors.put_vectors(
            vectorBucketName=vector_bucket_name,
            indexName=text_index_name,
            vectors=text_vectors
        )
# invocation
create_and_upload_vectors(sampled_df, vector_bucket_name, image_index_name, text_index_name)
⚠️ Important Note
Amazon S3 Vector Buckets currently allow a maximum of 500 vectors per PutVectors invocation. This is why we process and upload the data in batches of 500 to stay within the API limits.
Querying & retrieving similar matches
To query the index, we define a helper function query_s3_vectors that retrieves the top k results.
⚠️ Important Note
query_vectors can return a maximum of 30 matching results.
def query_s3_vectors(bucket_name, index_name, vector, k=5):
    response = s3vectors.query_vectors(
        vectorBucketName=bucket_name,
        indexName=index_name,
        queryVector={"float32": vector},
        topK=k,
        returnDistance=True,
        returnMetadata=True
    )
    return response["vectors"]
Also, create a helper function to display the query and its respective results.
import math
import os
import textwrap
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
def display_results(matches, query_text=None, query_image=None, is_rrf=False):
    max_cols = 7
    total = len(matches) + 1  # +1 for the query itself in the first cell
    n_rows = math.ceil(total / max_cols)
    fig, axes = plt.subplots(n_rows, max_cols, figsize=(3 * max_cols, 5 * n_rows))
    axes = axes.flatten()
    # Show the query image if one exists on disk, otherwise fall back to the query text.
    query_path = os.path.join(f"{base_data_path}/fashion-dataset/test-images/{query_image}")
    if os.path.exists(query_path):
        img = mpimg.imread(query_path)
        axes[0].imshow(img)
        axes[0].axis("off")
        axes[0].set_title(f"Query Text:\n{textwrap.fill(str(query_text), width=25)}", fontsize=9)
    else:
        axes[0].text(0.5, 0.5, f"Query Text:\n{textwrap.fill(str(query_text), width=25)}", ha='center', va='center')
        axes[0].axis("off")
    for i, match in enumerate(matches):
        idx = i + 1
        match_path = os.path.join(f"{base_data_path}/fashion-dataset/images/{match['key']}.jpg")
        if os.path.exists(match_path):
            img = mpimg.imread(match_path)
            axes[idx].imshow(img)
            axes[idx].axis("off")
            product_name = match["metadata"].get("productName", "N/A")
            wrapped_name = textwrap.fill(product_name, width=20)
            if is_rrf:
                title = f"{match['key']}\n{wrapped_name}\nrrf_score: {match['rrf_score']:.4f}"
            else:
                title = f"{match['key']}\n{wrapped_name}\nDist: {match['distance']:.4f}"
            axes[idx].set_title(title, fontsize=8)
        else:
            axes[idx].text(0.5, 0.5, "Image not found", ha="center", va="center")
            axes[idx].axis("off")
    # Hide any unused cells in the grid.
    for i in range(total, len(axes)):
        axes[i].axis("off")
    plt.tight_layout()
    plt.show()
Querying the index using an image vector.
query_image = "t-shirt.webp"
input_image = [Image.open(f"{base_data_path}/fashion-dataset/test-images/{query_image}").convert("RGB")]
i_emb = get_embeddings(input_image, mode="image")
image_match = query_s3_vectors(bucket_name=vector_bucket_name, vector=i_emb[0], k=10, index_name="img-index")
display_results(image_match, query_image=query_image)

Querying the index using a text vector.
input_text = ["mens green polo tshirt"]
t_emb = get_embeddings(input_text, mode="text")
text_match = query_s3_vectors(bucket_name=vector_bucket_name, vector=t_emb[0], k=10, index_name="txt-index")
display_results(text_match, query_text=input_text, query_image=None)
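Since we attached metadata (gender, category, season, and so on) to each vector during upload, queries can also be narrowed with metadata filtering. A hedged sketch reusing the text embedding above - the filter parameter and $eq operator follow the S3 Vectors documentation, but double-check the syntax against the current API reference while the service is in preview:
response = s3vectors.query_vectors(
    vectorBucketName=vector_bucket_name,
    indexName="txt-index",
    queryVector={"float32": t_emb[0]},
    topK=10,
    filter={"gender": {"$eq": "Men"}},  # only return vectors whose metadata matches
    returnDistance=True,
    returnMetadata=True
)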

Multimodal Search with Reciprocal Rank Fusion (RRF)
Reciprocal Rank Fusion (RRF) is a simple rank aggregation technique that combines results from multiple search systems by rewarding items that consistently appear near the top. Each item’s fused score is the sum of 1 / (k + rank) over the result lists it appears in, where k is a smoothing constant (60 is a common default).
Combine individual search results using Reciprocal Rank Fusion:
- Obtain separate ranked lists for text and image searches.
- Apply RRF to merge rankings, enhancing overall search accuracy.
from collections import defaultdict
def rrf_merge(image_results, text_results, k=60, top_k=20):
    scores = defaultdict(float)
    rank_sources = {"image": image_results, "text": text_results}
    # Each appearance contributes 1 / (k + rank + 1); items near the top of either list score higher.
    for source_name, result_list in rank_sources.items():
        for rank, item in enumerate(result_list):
            doc_id = item["key"]
            scores[doc_id] += 1 / (k + rank + 1)
    # Keep one metadata record per key (text results overwrite image results on collision).
    metadata = {item["key"]: item for item in image_results + text_results}
    fused = [
        {
            "key": doc_id,
            "rrf_score": round(score, 6),
            **metadata[doc_id]
        }
        for doc_id, score in sorted(scores.items(), key=lambda x: x[1], reverse=True)
    ]
    return fused[:top_k]
fused_results = rrf_merge(image_results=image_match, text_results=text_match, top_k=20)
display_results(fused_results, query_text=input_text, query_image=query_image, is_rrf=True)

Performance Considerations
- Vector query results are returned in under a second (<1000 ms), enabling fast lookups.
- Dimension size and the number of vectors within the index may affect latency - larger datasets or high-dimensional vectors may increase query time.
Storage and Cost Considerations
- Larger dimensions require more storage space, which can lead to increased cost.
- A single index can store up to 50 million vectors, making it scalable for large workloads.
- Vector dimensionality must be carefully managed to optimize overall costs.
Recall and Accuracy Considerations
When performing vector similarity queries, average recall performance is influenced by:
- The quality and type of embedding model used to generate the vectors.
- The size of the dataset, in both vector count and dimensionality.
- The distribution and diversity of incoming queries.
Design Considerations
- S3 Vectors is ideal for workloads that don’t require real-time results, such as batch processing or periodic search.
- Partial updates to a vector or its associated metadata are not supported. Any update requires re-uploading the full vector entry (see the sketch after this list).
- No direct support for inherently multimodal search (e.g., combining image and text vectors). You’ll need to implement external fusion techniques like Reciprocal Rank Fusion (RRF).
- No infrastructure setup is required - S3 Vectors is fully managed, so you can start storing and querying vectors immediately using API calls, without provisioning or managing servers.
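For example, since partial updates aren’t supported, changing a product’s metadata means re-putting the entire vector entry under the same key. A sketch, assuming overwrite-on-same-key behavior and the helpers defined earlier (the key and metadata values here are hypothetical):
s3vectors.put_vectors(
    vectorBucketName=vector_bucket_name,
    indexName="txt-index",
    vectors=[{
        "key": "12345",  # hypothetical existing product id
        "data": {"float32": get_embeddings(["updated product name"], mode="text")[0]},
        "metadata": {"productName": "updated product name"}  # the full metadata must be re-supplied
    }]
)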
Wrapping Up
Amazon S3 Vector Buckets make it easier than ever to implement vector search without managing a dedicated vector database. In this tutorial, we built a multimodal product search system combining image and text similarity.
By generating embeddings, creating separate indexes for image and text, and retrieving results using simple API calls, we demonstrated how to deliver a powerful and scalable search experience. We also used Reciprocal Rank Fusion (RRF) to combine results across modalities, showing how multimodal search can be layered on top with minimal effort.
S3 Vector Buckets are a great fit for:
- Teams wanting to experiment with vector search quickly
- Applications where query latency under 1 second is acceptable
- Use cases like product recommendations, visual search, or content discovery
They may not suit real-time or high-frequency update scenarios, but they shine when ease of integration, scalability, and cost-efficiency matter most.
Want to try it yourself? Download the notebook and sample metadata to get started.
Thank you for reading!