LLM Providers

LLM features in Knowmarks are optional. When configured, they enable:

  • Conversational search — Ask questions and get answers grounded in your saved content
  • Auto-summaries — 2-3 sentence distillations of saved items
  • Query expansion — Broader search via LLM-rewritten queries
  • Project refinement — LLM-assisted re-ranking of project associations
  • Relevance explanations — Human-readable reasons for why items match
  • Collection insights — Background analysis of your knowledge base

All LLM features degrade gracefully when no provider is configured — the core save/search/organize workflow works without any LLM.

OpenRouter (default)

The default configuration points to OpenRouter with Gemini 2.5 Flash:

export KM_LLM_URL=https://openrouter.ai/api/v1
export KM_LLM_MODEL=google/gemini-2.5-flash
export KM_LLM_API_KEY=your-openrouter-key

OpenRouter provides access to many models through a single API key. Sign up at openrouter.ai.
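To confirm the key and model are set up correctly, you can send a minimal request directly to OpenRouter's OpenAI-compatible chat completions route (a sketch; it requires network access and a valid key in KM_LLM_API_KEY):

```shell
# Minimal smoke test against OpenRouter's chat completions endpoint.
# Assumes KM_LLM_API_KEY holds the OpenRouter key exported above.
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $KM_LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-2.5-flash", "messages": [{"role": "user", "content": "Say hello"}]}'
```

A JSON response with a `choices` array means the configuration is working.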

LM Studio

For fully local LLM inference:

  1. Install LM Studio
  2. Download a model (e.g., Llama 3.1 8B Instruct)
  3. Start the local server in LM Studio

export KM_LLM_URL=http://localhost:1234/v1
export KM_LLM_MODEL=meta-llama-3.1-8b-instruct

No API key needed for local LM Studio.
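To check that the LM Studio server is up before pointing Knowmarks at it, you can query its model listing (a sketch; 1234 is LM Studio's default server port, and no auth header is needed):

```shell
# List models loaded in LM Studio's local OpenAI-compatible server.
# A JSON "data" array in the response confirms the server is reachable.
curl -s http://localhost:1234/v1/models
```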

Ollama

Use Ollama's OpenAI-compatible endpoint:

export KM_LLM_URL=http://localhost:11434/v1
export KM_LLM_MODEL=llama3.1

Make sure the model is pulled first: ollama pull llama3.1
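Once pulled, the same model name works in a raw request against Ollama's OpenAI-compatible endpoint (a sketch; 11434 is Ollama's default port, and no API key is required):

```shell
# No-auth chat completion against Ollama's OpenAI-compatible API.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "ping"}]}'
```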

Any OpenAI-Compatible API

Any service implementing the OpenAI chat completions format works:

export KM_LLM_URL=https://your-provider.com/v1
export KM_LLM_MODEL=your-model-name
export KM_LLM_API_KEY=your-api-key
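Most OpenAI-compatible services also expose a model listing, which makes a quick connectivity check (a sketch; assumes the three exports above are set for your provider):

```shell
# A 200 response with a JSON "data" array indicates the endpoint
# speaks the OpenAI format and accepts your key.
curl -s "$KM_LLM_URL/models" \
  -H "Authorization: Bearer $KM_LLM_API_KEY"
```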

Disabling LLM Features

To disable all LLM features:

export KM_LLM_ENABLED=0

When disabled, conversational search falls back to standard hybrid search, and features like auto-summaries and query expansion are skipped. The core experience is unaffected.
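The fallback behavior can be sketched as a simple gate on the variable (the variable name is from the docs; the convention that any value other than `0` means enabled is an assumption for illustration):

```shell
# Sketch: how a wrapper script might gate LLM features on KM_LLM_ENABLED.
export KM_LLM_ENABLED=0

if [ "${KM_LLM_ENABLED:-1}" = "0" ]; then
  echo "LLM features disabled; falling back to hybrid search"
else
  echo "LLM features enabled"
fi
```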

Recommended Models

For best results with Knowmarks:

  • Hosted: Gemini 2.5 Flash (fast, inexpensive, good at structured extraction)
  • Local (high-end): Llama 3.1 8B Instruct or Qwen 2.5 7B
  • Local (lightweight): Phi-3.5 Mini or Gemma 2 2B

Knowmarks uses structured prompts that work well with instruction-tuned models of any size.