DEV Community

# llm

Posts

I Got Gemma Running in the Browser, with No API Keys, and It Changed How I Think About the Edge

Comments
9 min read
I Built a RAG Pipeline. Then I Realized Retrieval Is the Real Model

1
Comments
3 min read
Why We Ditched Bedrock Agents for Nova Pro and Built a Custom Orchestrator

Comments
7 min read
HBM4 Didn't Break the Memory Wall — It Just Moved It

Comments
6 min read
How AI Apps Actually Use LLMs: Introducing RAG

Comments
4 min read
Google Gemma 4: How a 31B Model Beats 600B+ Giants (Benchmarks + NVIDIA Co-Optimization)

Comments
2 min read
LLMKube Now Deploys Any Inference Engine, Not Just llama.cpp

Comments
3 min read
Anthropic Just Released a Model So Dangerous They Gave It to Only Security Researchers

Comments
2 min read
I built an Ollama alternative with TurboQuant, model groups, and multi-GPU support

Comments
4 min read
Running Just One LLM on 8GB VRAM Is a Waste

Comments
8 min read
Light Just Cut KV Cache Memory Traffic to 1/16th

Comments
7 min read
Why Your Agent Doesn't Know What Time It Is

Comments
7 min read
The AI Stack: A Practical Guide to Building Your Own Intelligent Applications

Comments
5 min read
Even at Tool Calling, the Bigger Model Couldn't Win

Comments
4 min read
I benchmarked GPT-4o, Claude 3.5, and Gemini 1.5 for security — the results

Comments
2 min read