To update the vector and/or metadata of a single record, use the update operation with the following parameters:
namespace: The namespace containing the record to update. To use the default namespace, set the namespace to "__default__".
id: The ID of the record to update.
One or both of the following:
  Updated values for the vector. Specify one of the following:
    values: For dense vectors. Must have the same length as the existing vector.
    sparse_values: For sparse vectors.
  setMetadata (set_metadata in the Python example below): The metadata to add or change. When updating metadata, only the specified metadata fields are modified. If a specified metadata field does not exist, it is added.
If a non-existent record ID is specified, no records are affected and a 200 OK status is returned.
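The metadata rule above amounts to a field-by-field merge of the fields you pass into the record's existing metadata. The following is a minimal local sketch of that rule in plain Python. It is not Pinecone's implementation, and the helper name apply_set_metadata and the sample values are hypothetical.

```python
def apply_set_metadata(existing: dict, set_metadata: dict) -> dict:
    """Return the metadata a record would have after an update.

    Local model only: fields named in set_metadata are added or
    overwritten; every other existing field is left untouched.
    """
    merged = dict(existing)      # copy so the input is not mutated
    merged.update(set_metadata)  # add new fields, overwrite named ones
    return merged


# Hypothetical metadata, for illustration only.
before = {"type": "doc", "genre": "drama"}
after = apply_set_metadata(before, {"genre": "comedy", "year": 2020})
print(after)
```

Here genre is overwritten, year is added, and type is carried over unchanged.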
In this example, assume you are updating the dense vector values and one metadata value of the following record in the example-namespace namespace:
```python
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# To get the unique host for an index,
# see https://docs.pinecone.io/guides/manage-data/target-an-index
index = pc.Index(host="INDEX_HOST")

index.update(
    namespace="example-namespace",
    id="id-3",
    values=[5.0, 3.0],
    set_metadata={"genre": "comedy"}
)
```
After the update, the dense vector values and the genre metadata value are changed, but the type metadata value is unchanged:
setMetadata: The metadata to add or change. When updating metadata, only the specified metadata fields are modified. If a specified metadata field does not exist, it is added.
dry_run: Optional. If true, the number of records that match the filter expression is returned, but the records are not updated.
Each request updates a maximum of 100,000 records. Use "dry_run": true to check if you need to run the request multiple times. See the example below for details.
For example, let’s say you have records that represent chunks of a single document with metadata that keeps track of chunk and document details, and you want to store the author’s name with each chunk of the document:
```json
{
  "id": "document1#chunk1",
  "values": [0.0236663818359375, -0.032989501953125, ..., -0.01041412353515625, 0.0086669921875],
  "metadata": {
    "document_id": "document1",
    "document_title": "Introduction to Vector Databases",
    "chunk_number": 1,
    "chunk_text": "First chunk of the document content...",
    "document_url": "https://example.com/docs/document1"
  }
},
{
  "id": "document1#chunk2",
  "values": [-0.0412445068359375, 0.028839111328125, ..., 0.01953125, -0.0174560546875],
  "metadata": {
    "document_id": "document1",
    "document_title": "Introduction to Vector Databases",
    "chunk_number": 2,
    "chunk_text": "Second chunk of the document content...",
    "document_url": "https://example.com/docs/document1"
  }
},
...
```
To check how many records match the filter expression, send a request with "dry_run": true:
```shell
# To get the unique host for an index,
# see https://docs.pinecone.io/guides/manage-data/target-an-index
PINECONE_API_KEY="YOUR_API_KEY"
INDEX_HOST="INDEX_HOST"

curl "https://$INDEX_HOST/vectors/update" \
  -H "Api-Key: $PINECONE_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "X-Pinecone-API-Version: unstable" \
  -d '{
        "dry_run": true,
        "namespace": "example-namespace",
        "filter": {
          "document_title": {"$eq": "Introduction to Vector Databases"}
        },
        "setMetadata": {
          "author": "Del Klein"
        }
      }'
```
The response contains the number of records that match the filter expression:
```json
{
  "matchedVectors": 150000
}
```
Since this number exceeds the 100,000 record limit, you’ll need to run the update request multiple times.
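The number of requests follows directly from the match count and the per-request limit. A small sketch of that arithmetic, where the helper name requests_needed is hypothetical:

```python
import math

MAX_RECORDS_PER_REQUEST = 100_000  # per-request limit stated above


def requests_needed(matched_vectors: int) -> int:
    """Number of update requests needed to cover all matching records."""
    return math.ceil(matched_vectors / MAX_RECORDS_PER_REQUEST)


print(requests_needed(150_000))  # 2
```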
Initiate the first update by sending the request without the dry_run parameter:
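As a sketch of this step in Python, the following builds the same request as the curl example above, with dry_run omitted by default. It uses only the standard library and constructs the request without sending it. The helper name build_update_request is hypothetical; the endpoint, headers, and body fields are taken from the curl example.

```python
import json
import urllib.request


def build_update_request(
    index_host: str, api_key: str, dry_run: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a POST to the /vectors/update endpoint."""
    body = {
        "namespace": "example-namespace",
        "filter": {"document_title": {"$eq": "Introduction to Vector Databases"}},
        "setMetadata": {"author": "Del Klein"},
    }
    if dry_run:
        body["dry_run"] = True  # only count matching records, do not update
    return urllib.request.Request(
        url=f"https://{index_host}/vectors/update",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Api-Key": api_key,
            "Content-Type": "application/json",
            "X-Pinecone-API-Version": "unstable",
        },
        method="POST",
    )


request = build_update_request("INDEX_HOST", "YOUR_API_KEY")
print(json.loads(request.data))
# To send it, pass the request to urllib.request.urlopen(request).
```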
Again, the response contains the total number of records that match the filter expression, but only 100,000 will be updated:
```json
{
  "matchedVectors": 150000
}
```
Pinecone is eventually consistent, so there can be a slight delay before your update request is processed. Repeat the dry_run request until the number of matching records shows that the first 100,000 records have been updated:
```shell
# To get the unique host for an index,
# see https://docs.pinecone.io/guides/manage-data/target-an-index
PINECONE_API_KEY="YOUR_API_KEY"
INDEX_HOST="INDEX_HOST"

curl "https://$INDEX_HOST/vectors/update" \
  -H "Api-Key: $PINECONE_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "X-Pinecone-API-Version: unstable" \
  -d '{
        "dry_run": true,
        "namespace": "example-namespace",
        "filter": {
          "document_title": {"$eq": "Introduction to Vector Databases"}
        },
        "setMetadata": {
          "author": "Del Klein"
        }
      }'
```
```json
{
  "matchedVectors": 50000
}
```
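The repeat-until-updated step can be expressed as a simple polling loop. The sketch below is one possible shape, not an official client. The function wait_for_match_count and its parameters are hypothetical, and count_matches stands in for whatever code sends the dry_run request and reads matchedVectors from the response.

```python
import time
from typing import Callable


def wait_for_match_count(
    count_matches: Callable[[], int],
    target: int,
    poll_seconds: float = 5.0,
    max_attempts: int = 60,
) -> int:
    """Poll count_matches() until it returns at most `target`.

    count_matches is a hypothetical callable that sends the dry_run
    request and returns the matchedVectors value from the response.
    """
    for attempt in range(max_attempts):
        matched = count_matches()
        if matched <= target:
            return matched
        if attempt < max_attempts - 1:
            time.sleep(poll_seconds)  # give the update time to be processed
    raise TimeoutError(
        f"match count still above {target} after {max_attempts} attempts"
    )


# Stand-in for real dry_run responses, for illustration only.
fake_responses = iter([150_000, 150_000, 50_000])
remaining = wait_for_match_count(
    lambda: next(fake_responses), target=50_000, poll_seconds=0
)
print(remaining)  # 50000
```

Bounding the loop with max_attempts keeps the caller from waiting indefinitely if the count never drops.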
Once the first 100,000 records have been updated, update the remaining records:
This feature is available only on the unstable version of the API.
Each request updates a maximum of 100,000 records. Use "dry_run": true to check if you need to run the request multiple times. See the example above for details.
You can add or change metadata across multiple records, but you cannot remove metadata fields.
Pinecone is eventually consistent, so there can be a slight delay before updates are visible to queries. You can use log sequence numbers to check whether an update request has completed.