Long context windows are nice. The problem is they're expensive and slow. 💡 As Pinecone's CTO Ram Sriharsha explained at ELC Annual 2025, a 100k-token query costs ~$1, versus ~$0.000025 with retrieval. That's 40,000x cheaper! At scale, this is the difference between $6M/month and $150/month. Long context windows are powerful, but they remain too costly and high-latency for real-world applications. Retrieval offers better economics, lower latency, more accurate factual grounding, and infrastructure costs that scale with data rather than query length. Even with edge cases, retrieval is what makes LLMs affordable and practical at scale. Ram's full talk and slides are in the comments below 👇
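The cost figures above imply a query volume; a quick back-of-the-envelope check (using the illustrative per-query prices from the talk, and assuming 6M queries/month to match the monthly totals) looks like this:

```python
# Back-of-the-envelope cost comparison (illustrative figures from the talk).
COST_LONG_CONTEXT = 1.0    # ~$1 per 100k-token long-context query
COST_RETRIEVAL = 0.000025  # ~$0.000025 per retrieval-backed query

ratio = COST_LONG_CONTEXT / COST_RETRIEVAL
print(f"Retrieval is {ratio:,.0f}x cheaper per query")

# Assumed volume: 6M queries/month reproduces the $6M vs $150 totals.
queries = 6_000_000
print(f"Long context: ${COST_LONG_CONTEXT * queries:,.0f}/month")
print(f"Retrieval:    ${COST_RETRIEVAL * queries:,.2f}/month")
```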
About us
Pinecone is the leading vector database for building accurate and performant AI applications at scale in production. Pinecone's mission is to make AI knowledgeable. More than 5000 customers across various industries have shipped AI applications faster and more confidently with Pinecone's developer-friendly technology. Pinecone is based in New York and raised $138M in funding from Andreessen Horowitz, ICONIQ, Menlo Ventures, and Wing Venture Capital. For more information, visit pinecone.io.
- Website: https://www.pinecone.io/
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: New York, NY
- Type: Privately Held
- Founded: 2019
Locations
- New York, NY 10001, US (Primary)
- San Francisco, California, US
- Tel Aviv, IL
Employees at Pinecone
- Milen Dyankov: Developer Relations and Engineering Executive focused on empowering developers and teams. Experienced in leading enterprise projects, enhancing…
- Jenna Pederson: Developer relations @ Pinecone | Keynote speaker | Software engineer
- Andrew Naber: Fractional Marketing & Strategy Leader
- Mike Sefanov: Leading global communications, analyst relations, and various marketing streams at Pinecone
Updates
Make knowledge hidden in your Google Docs discoverable and actionable. This post by John Ward, a Solutions Engineer here at Pinecone, shows how to load Google Docs into Pinecone Assistant, then ask natural-language questions and quickly surface answers across your notes, PRDs, design specs, and whatever else you store in Docs.
🎬 Happening now at #SFTechWeek: Our Staff Developer Advocate Milen Dyankov is demoing his movie search and recommender system, built using hybrid (dense + sparse) retrieval and reranking. 📖 Read about cascading retrieval: https://lnkd.in/g-m9Cg3w ✨ How Pinecone supports accurate & performant search, recommenders, and agents at scale: https://lnkd.in/gGcp6MfF
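For a feel of what "hybrid retrieval plus reranking" means, here is a minimal, library-agnostic sketch (not Pinecone's API): the `dense_search` and `sparse_search` functions are toy stand-ins for an embedding model and a keyword scorer like BM25, and the fused score stands in for a real reranker.

```python
# Sketch of cascading retrieval: run dense and sparse retrieval,
# merge candidates with a weighted score, return the top results.
# Scoring functions are toy stand-ins for real models.

def dense_search(query, docs, top_k):
    # Stand-in for embedding similarity: fraction of query words shared.
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())) / len(q), d) for d in docs]
    return sorted(scored, reverse=True)[:top_k]

def sparse_search(query, docs, top_k):
    # Stand-in for lexical scoring (e.g. BM25): raw keyword hit counts.
    q = query.lower().split()
    scored = [(sum(d.lower().count(t) for t in q), d) for d in docs]
    return sorted(scored, reverse=True)[:top_k]

def cascade(query, docs, top_k=3, alpha=0.7):
    # Fuse both candidate sets; alpha weights dense vs. sparse evidence.
    fused = {}
    for score, doc in dense_search(query, docs, top_k * 2):
        fused[doc] = fused.get(doc, 0.0) + alpha * score
    for score, doc in sparse_search(query, docs, top_k * 2):
        fused[doc] = fused.get(doc, 0.0) + (1 - alpha) * score
    return sorted(fused, key=fused.get, reverse=True)[:top_k]
```

A real system would replace the fused-score sort with a cross-encoder reranking pass over the merged candidates.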
Aquant delivers expert-level service intelligence at enterprise scale with Pinecone. Aquant's AI platform supports service teams across industries, from diagnosing complex machinery issues to improving customer experiences. But scaling real-time, domain-specific retrieval required a new foundation. With Pinecone, Aquant achieved:
• 98% retrieval accuracy
• 48% increase in weekly question volume
• 49% reduction in time-to-resolution
• 19% lower cost per service case
By powering fast, reliable semantic search with Pinecone, Aquant delivers real-time, context-aware intelligence that improves outcomes for both service teams and their customers. Read the full case study 👉 https://lnkd.in/gazdJSzg
Pinecone reposted this
First BJJ x Pinecone event done for #SFTechWeek. A lot of good conversations about vector databases and implementing AI systems. If you missed this one and plan to be in LA for #LATechWeek, come by our 2nd BJJ event at Meraki Jiu Jitsu. Sign up: https://lnkd.in/gft39fAr
🥋 Join us for Grappling with Vector Databases at #LATechWeek Only 1 week away! No gi BJJ + a 15-min Pinecone demo on building AI agents. First 50 attendees get a custom Pinecone rashguard. 📅 Tue, Oct 14, 8:00am-10:00am PT 📍 Meraki Jiu Jitsu RSVP: https://lnkd.in/gFuy6uyh TECH WEEK by a16z
We're starting a series of Tech Talks! For our first, join our Founder & Chief Scientist, Edo Liberty, for a deep dive into vector search algorithms — why they’re hard, and how approaches like HNSW, PQ, and IVF compare. 🗓️ Thursday, Oct 23 in New York City. ⏰ 5:30pm - 8:30pm 🔗 Reserve your spot: https://lnkd.in/gHfm4cRt
🎙️ Our Staff Developer Advocate, Jenna Pederson, joined the Adventures in DevOps podcast to break down how developers are building smarter AI applications with vector databases. One key insight from the conversation: LLMs have limitations, especially with domain-specific language. The most accurate retrieval systems combine: 🔎 Semantic (dense) search — for understanding meaning and intent 🔑 Lexical (sparse) search — for precise keyword matching This hybrid approach ensures your AI can find the right information whether users search by concept or by exact terminology. 📢 If you're building with RAG, embeddings, or LLMs, this conversation is worth your time. Link in comments 👇
Discover insights from Arjun Patel, Senior Developer Advocate at Pinecone. - Learn effective prompt engineering strategies. - Unlock the power of Retrieval-Augmented Generation (RAG) for better AI performance. - Explore real-world applications for your projects. 🔗 Read more here: https://lnkd.in/gQpW3t5A