
Local AI

25 articles about local AI.

Open Source LLM Releases in 2026: What Has Shipped and What to Expect

·12 min read

A practical guide to every major open source LLM release in 2026 so far, from Llama 4 to Qwen 3, with benchmarks, licensing, and what they mean for local AI agents.

open-source, llm, 2026, ai-models, local-ai, llama, qwen

Agentic AI Only Works If It Runs Locally

·2 min read

Cloud-hosted AI agents face censorship filters, limited system access, and higher latency. Local agents avoid all three - here is why that matters for real-world work.

local-ai, agentic-ai, censorship, latency, desktop-agent, privacy

Another CLI? What Makes It Different from Ollama's Built-In

·2 min read

Why a dedicated AI agent CLI differs from Ollama's built-in commands - tool calling, desktop integration, and persistent memory make the difference.

cli, ollama, local-ai, developer-tools, desktop-agent

Apple Foundation Models in SwiftUI - The Hybrid Local and Cloud Approach

·2 min read

Playing with Apple Foundation Models in SwiftUI reveals the power of on-device models combined with cloud fallback. Hybrid local/cloud is the right approach.

apple, foundation-models, swiftui, on-device, local-ai

Codex-Like Functionality with Local Ollama - Qwen 3 32B Is the Sweet Spot

·2 min read

Running Qwen 3 32B locally on M-series Macs for Codex-like coding agent capabilities. Why 32B is the sweet spot for Apple Silicon.

ollama, qwen, codex, local-ai, apple-silicon

GPU Selection for Local AI Agent Workloads

·7 min read

Concrete benchmark data comparing Apple Silicon M4, NVIDIA RTX 5090, and AMD for local LLM inference. What tokens-per-second numbers actually mean for agent responsiveness.

gpu, local-ai, hardware, llm-inference, apple-silicon

Solving the Hallucination vs Documentation Gap for Local AI Agents

·2 min read

How CLI introspection and skills that tell agents to check docs first can reduce hallucinations in local AI agents.

hallucination, documentation, local-ai, agent-skills, reliability

A Generally Adopted Benchmark for Local AI Inference Speed

·2 min read

llama-bench provides tokens-per-second metrics for local inference. Having a standard benchmark makes hardware and model comparisons meaningful instead of anecdotal.

benchmark, llama-bench, inference-speed, tokens-per-second, local-ai
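The metric llama-bench reports is simply tokens generated divided by wall-clock time. A minimal sketch of that calculation (an illustration of the metric, not llama-bench itself):

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Throughput metric reported by llama-bench-style tools."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed_seconds

# Example: 512 tokens generated in 8 seconds of wall-clock time
print(tokens_per_second(512, 8.0))  # 64.0 tok/s
```

Comparing hardware on the same model and quantization keeps this number apples-to-apples.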

Built a Local AI Coding Agent with Qwen 3.5 9B

·2 min read

How to build a local AI coding agent using Qwen 3.5 9B for desktop automation, and why tool calling format matters more than model size.

local-ai, qwen, tool-calling, coding-agent, ollama

macOS Dictation with Local Whisper - Sub-Second Latency on Apple Silicon

·2 min read

How local Whisper models on M-series chips deliver sub-second voice input latency for AI agents, eliminating cloud roundtrips and enabling real-time

whisper, apple-silicon, voice-input, macos, local-ai, dictation

Building a macOS Tray App with Ollama as Your Knowledge Base

·2 min read

How to build a macOS menu bar app that uses Ollama for a personal AI knowledge base - global shortcut UX, local model inference, and keeping everything on your machine.

macos, ollama, tray-app, menu-bar, knowledge-base, local-ai

Built an Open Source LLM Agent for Personal Finance

·2 min read

Using structured outputs from local LLMs to categorize financial transactions, track spending, and generate reports without sending data to the cloud.

personal-finance, open-source, structured-outputs, local-ai, automation
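The structured-outputs approach above can be sketched as a validation step over the model's JSON reply. The field names and category list here are illustrative assumptions, not the article's actual schema:

```python
import json

# Illustrative category set; a real setup would define these per user
CATEGORIES = {"groceries", "rent", "transport", "dining", "other"}

def parse_transaction(raw: str) -> dict:
    """Validate a local LLM's structured reply against the expected shape."""
    tx = json.loads(raw)
    if set(tx) != {"merchant", "amount", "category"}:
        raise ValueError("unexpected fields in model output")
    if tx["category"] not in CATEGORIES:
        raise ValueError("category outside the allowed set")
    return tx

# A local model prompted for structured output might return:
reply = '{"merchant": "Trader Joes", "amount": 42.17, "category": "groceries"}'
print(parse_transaction(reply)["category"])  # groceries
```

Rejecting malformed replies at this boundary is what keeps the downstream spending reports trustworthy without any cloud round trip.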

DSM and Provable Memory for AI Agents - Why Relevance Beats Proof

·2 min read

Why provable memory systems like DSM are less useful than locally relevant AI profiles - agents need contextual memory, not cryptographically verified memories.

ai-memory, dsm, provable-memory, local-ai, agent-profile

What to Do with Your Idle Custom PC - Convert It to an AI Agent Server

·3 min read

Repurpose your gaming PC as an AI agent homelab with Proxmox. Run local models, host always-on agents, and put that idle GPU to work.

homelab, proxmox, gaming-pc, self-hosted, local-ai

First Speculative Decoding Across GPU and Neural Engine on Apple Silicon

·2 min read

Running two models on the same Apple Silicon chip - a 1B draft model on the Neural Engine and a larger model on GPU for faster local inference.

speculative-decoding, apple-silicon, neural-engine, local-ai, performance
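The draft-and-verify idea behind speculative decoding can be shown with a toy loop: the cheap draft model proposes several tokens, the expensive target model verifies them, and tokens are accepted up to the first disagreement. Both model functions below are stand-ins over hardcoded tables, not real Neural Engine or GPU calls:

```python
def draft_model(prefix):
    # Stand-in for the small 1B draft model: guesses the next few tokens
    guesses = {"the": ["quick", "brown", "fox"], "quick": ["brown", "dog"]}
    return guesses.get(prefix[-1], [])

def target_model(prefix):
    # Stand-in for the large model: the authoritative next token
    truth = {"the": "quick", "quick": "brown", "brown": "fox"}
    return truth.get(prefix[-1])

def speculative_step(prefix, k=3):
    """Accept draft tokens until the first disagreement with the target."""
    accepted = []
    for tok in draft_model(prefix)[:k]:
        if target_model(prefix + accepted) == tok:
            accepted.append(tok)
        else:
            break
    # On a mismatch (or an empty draft), fall back to one target token
    if len(accepted) < k:
        fallback = target_model(prefix + accepted)
        if fallback is not None:
            accepted.append(fallback)
    return accepted

print(speculative_step(["the"]))  # ['quick', 'brown', 'fox']
```

The win is that one target-model pass can confirm several draft tokens at once, which is why splitting the two models across the Neural Engine and GPU speeds up local inference.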

Tiny AI Models for Game NPCs - What Works Under 1B Parameters

·5 min read

Using small language models (500M-1.1B parameters) for game NPC dialogue in survival games. Benchmark data, what tiny models handle well, where they break, and why this matters for desktop agents.

tiny-models, gaming, npcs, local-ai, experiments

Your AI Agent Needs Persistent Memory That Grows with You

·3 min read

Chat history is not memory. Real AI agent memory means a local knowledge graph that learns your contacts, habits, and preferences over time - not just what was said in past conversations.

agent-memory, knowledge-graph, persistence, personalization, local-ai

Dedicated AI Hardware vs Your Existing Mac - Why a Separate Device Is Premature

·2 min read

Your Mac already has everything needed to run a full AI agent locally. Dedicated AI hardware adds cost and complexity without solving real problems.

ai-hardware, mac, apple-silicon, local-ai, pragmatism

Local AI Agents Work Without Cloud Restrictions

·2 min read

Cloud-based agents inherit platform content policies. Local agents running on your Mac use local models or direct API access - no intermediary filtering your requests.

local-ai, censorship, privacy, desktop-agent, freedom

385ms Tool Selection Running Fully Local - No Pixel Parsing Needed

·2 min read

Local agents using macOS accessibility APIs skip the screenshot-parse-click cycle. Structured app data means instant element targeting and sub-second tool selection.

speed, local-ai, accessibility-api, apple-silicon, performance

Once You Go Local with AI Agents, There's No Going Back

·2 min read

After using a truly local AI agent - with instant response, full privacy, and persistent memory - cloud-based tools feel like using a remote desktop.

local-ai, no-going-back, latency, privacy, experience

Local AI Knowledge Bases Should Go Beyond Bookmarks

·2 min read

Bookmarks are one data source. A comprehensive local knowledge base indexes your contacts, email patterns, file usage, app habits, and workflow traces into a single knowledge graph.

knowledge-base, bookmarks, local-ai, knowledge-graph, comprehensive

Local Voice Synthesis for Desktop Agents - Why Latency Matters More Than Quality

·2 min read

System TTS is robotic. Cloud TTS has 2+ second latency. For conversational AI agents on Mac, local synthesis on Apple Silicon hits the sweet spot.

voice-synthesis, tts, local-ai, apple-silicon, latency

Self-Hosting an AI Agent on macOS - What You Need to Know

·2 min read

Self-hosted agents run on your Mac with no cloud dependency. Native Swift, local processing, your data stays on your machine. The trade-off is that you manage everything yourself.

self-hosting, macos, local-ai, privacy, open-source

Running whisper.cpp on Apple Silicon for Local Voice Recognition

·2 min read

The best setup for local voice recognition on Mac: whisper.cpp with large-v3-turbo on Apple Silicon. Here is the model choice, pipeline architecture, and configuration.

whisper, apple-silicon, voice-recognition, local-ai, speech-to-text
