Leveraging AI

A Precision-Engineered Pipeline for Your Personal Knowledge

pinakea does not just “use AI.” It orchestrates a sequence of specialized stages so raw items become structured knowledge you can search, summarize, and chat with.

Info: This page explains the pipeline architecture and why each stage exists.

Two AI Modes: Online and Mixed

pinakea supports two operating modes. Online Mode is recommended; Mixed Mode remains available for smaller Sets when local automatic item processing matters more than speed and output quality.

Online Mode

In Online Mode, the full pipeline runs through cloud models via OpenRouter.

  • Uses top-tier cloud models for embeddings and pipeline generation.
  • Supports parallel processing for large imports.
  • Delivers the highest overall quality.
  • Requires internet access and a valid OpenRouter API key.

Mixed Mode

In Mixed Mode, high-volume pipeline stages run locally on Apple Silicon, while chat stays cloud-based. It suits smaller Sets where avoiding OpenRouter cost for automatic item processing matters more than speed and output quality.

  • Local embeddings, summaries, titles, and tags.
  • Sequential processing tuned for local GPU/memory constraints.
  • No OpenRouter cost for automatic item processing.
  • Core pipeline stages can continue offline.
  • Can take hours or days on larger Sets and keep your Mac busy.

As a guideline, pinakea recommends Mixed Mode only for Sets of up to around 1,000 items. See Use Online Mode.

Seamless Switching

Switching modes switches the full pipeline profile, not just one model.

  • Embedding dimensions differ between the cloud and local models, so the two vector spaces are not interchangeable.
  • Online -> Mixed: Online embeddings (generated with OpenRouter credit) are replaced with local embeddings and cannot be reused in Mixed Mode. Existing summaries, titles, and tags are kept.
  • Mixed -> Online: local embeddings, AI-generated summaries, AI-generated titles, and LLM tags are cleared, then regenerated using OpenRouter credit.
  • Starred items and explicit/user tags are preserved.
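
The switching rules above can be sketched as a small state transition. This is only an illustration of the documented behavior, not pinakea's actual data model: the `ItemState` record, its field names, and the `switch_mode` function are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class ItemState:
    """Hypothetical per-item record; field names are illustrative only."""
    embedding_mode: Optional[str] = None  # "online", "mixed", or None if pending
    ai_summary: Optional[str] = None
    ai_title: Optional[str] = None
    llm_tags: List[str] = field(default_factory=list)
    user_tags: List[str] = field(default_factory=list)
    starred: bool = False


def switch_mode(item: ItemState, target: str) -> ItemState:
    """Return the item's state right after a mode switch, before regeneration."""
    if target not in ("online", "mixed"):
        raise ValueError(f"unknown mode: {target!r}")
    if item.embedding_mode == target:
        return item  # nothing to do, already in the target vector space
    if target == "mixed":
        # Online -> Mixed: embeddings must be rebuilt locally;
        # summaries, titles, and tags are kept.
        return replace(item, embedding_mode=None)
    # Mixed -> Online: embeddings and AI-generated fields are cleared,
    # while starred state and user tags are left untouched.
    return replace(item, embedding_mode=None, ai_summary=None,
                   ai_title=None, llm_tags=[])
```

In this sketch, `embedding_mode=None` stands for "embeddings pending regeneration in the new mode"; user tags and the starred flag are never modified by either transition.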

For practical mode behavior details, see Online and Mixed Modes. For the current recommendation, see Use Online Mode.

The AI Pipeline: From Raw Content to Refined Knowledge

When you add sources (folders, clips, mail, notes), each item flows through a staged pipeline.

Stage 1: Content Extraction

pinakea first normalizes content into clean text:

  • Markdown/text: direct parse.
  • PDFs: native extraction, with OCR fallback for scanned pages.
  • Images/screenshots: OCR via Apple Vision.
  • YouTube clips: transcript retrieval so spoken content becomes searchable.
  • Web pages: readability extraction to remove noise and keep article content.
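
To illustrate the "native extraction, with OCR fallback" idea for PDFs, here is a minimal sketch. The function name, the `run_ocr` callback, and the `min_chars` threshold are assumptions made for this example; the page does not state how pinakea decides that a page is scanned.

```python
from typing import Callable


def extract_page_text(native_text: str,
                      run_ocr: Callable[[], str],
                      min_chars: int = 20) -> str:
    """Prefer the PDF's native text layer; fall back to OCR when it is too thin.

    `min_chars` is an assumed heuristic: a page whose text layer has fewer
    characters than this is treated as scanned.
    """
    cleaned = " ".join(native_text.split())  # collapse whitespace runs
    if len(cleaned) >= min_chars:
        return cleaned
    # OCR is only invoked when needed, since it is the expensive path.
    return " ".join(run_ocr().split())
```

Passing OCR as a callback keeps the expensive step lazy, so pages that already carry a usable text layer never trigger it.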

Stage 2: Chunking

Long content is split into overlapping chunks so semantic retrieval remains accurate across boundaries.

  • Typical chunk size is tuned for retrieval efficiency.
  • Overlap preserves context between adjacent chunks.
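
Overlapping chunking can be sketched as a sliding window. The example below splits on whitespace-separated words and uses tiny sizes for readability; pinakea's real chunk size, overlap, and unit of measurement (words, tokens, or characters) are not documented here, so all values are assumptions.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into word-based chunks that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


# Each chunk repeats the last word of the previous one.
print(chunk_text("a b c d e f g h i j", chunk_size=4, overlap=1))
# -> ['a b c d', 'd e f g', 'g h i j']
```

Because adjacent chunks share their boundary words, a sentence that straddles a boundary still appears intact in at least one chunk.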

Stage 3: Embedding

Each chunk is converted into a semantic vector.

Mode   | Model                | Details
Online | Qwen3 Embedding 8B   | Cloud, high-dimensional semantic retrieval
Mixed  | mxbai-embed-large-v1 | Local Apple Silicon embedding model

This is what enables meaning-based search: a query for “delivery schedule” can match an item about “project deadlines”.
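
Meaning-based matching is commonly measured with cosine similarity between vectors. The sketch below uses made-up three-dimensional toy vectors (real embeddings have far more dimensions), and it assumes cosine similarity as the metric, which this page does not confirm. It also shows why vectors of different dimensions, such as those from different modes, cannot be compared.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        # Vectors from different embedding models live in different spaces.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
query_delivery_schedule = [0.9, 0.1, 0.0]
chunk_project_deadlines = [0.8, 0.2, 0.1]
chunk_pasta_recipe = [0.0, 0.1, 0.9]

related = cosine_similarity(query_delivery_schedule, chunk_project_deadlines)
unrelated = cosine_similarity(query_delivery_schedule, chunk_pasta_recipe)
print(related > unrelated)  # -> True
```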

Stage 4: Automatic Summary

Each item receives a summary for fast scanning.

Mode   | Model               | Details
Online | DeepSeek V4 Flash   | Cloud, high throughput
Mixed  | Qwen 2.5 7B Instruct | Local via llama.cpp

Stage 5: Intelligent Title

pinakea generates descriptive titles from each item's content, so your timeline remains readable at scale.

Stage 6: Automatic Tags

pinakea generates conceptual tags (not simple keyword extraction) to improve browsing and retrieval.

Why These Model Choices

Embeddings: Retrieval Quality First

Embedding quality determines search and chat grounding quality. pinakea prioritizes:

  • semantic depth over lexical matching,
  • robust retrieval at large library sizes,
  • consistency within each mode’s vector space.

Pipeline Generation: Speed + Consistency

For summaries, titles, and tags, pinakea optimizes for:

  • throughput during ingest,
  • stable output quality,
  • predictable cost behavior across long-running pipelines.

Online Mode uses DeepSeek V4 Flash for titles, summaries, tags, and chat. It combines a 1M context window with excellent output quality and a much lower listed token price than comparable 1M-context alternatives. Online embeddings use Qwen3 Embedding 8B, which keeps semantic search and chat retrieval in a consistent vector space.

Chat: Conversations with Your Knowledge

Search gets you to relevant items. Chat synthesizes across them.

Retrieval Flow

  1. Your question is embedded into the same semantic space as your indexed chunks.
  2. Retrieval selects the most relevant chunks by semantic proximity.
  3. Across turns, the conversation context keeps the initial (seed) evidence and newly retrieved evidence aligned.
  4. Citations tie generated claims back to source items.
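
Steps 1, 2, and 4 of the flow above can be sketched as a top-k search followed by citation collection. The `Chunk` type, the `retrieve` and `citations` helpers, the toy vectors, and the use of normalized dot products are all assumptions for illustration; the actual ranking and citation logic in pinakea is not described on this page.

```python
import heapq
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    """Hypothetical indexed chunk that remembers its source item."""
    item_id: str
    text: str
    vector: Tuple[float, ...]


def _normalize(v: Sequence[float]) -> Tuple[float, ...]:
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v) if norm else tuple(v)


def retrieve(query_vector: Sequence[float], chunks: List[Chunk], k: int = 3) -> List[Chunk]:
    """Return the k chunks closest to the query, best match first."""
    q = _normalize(query_vector)

    def score(chunk: Chunk) -> float:
        # Dot product of unit vectors equals cosine similarity.
        return sum(a * b for a, b in zip(q, _normalize(chunk.vector)))

    return heapq.nlargest(k, chunks, key=score)


def citations(retrieved: List[Chunk]) -> List[str]:
    """Source item ids in rank order, without duplicates."""
    seen: List[str] = []
    for chunk in retrieved:
        if chunk.item_id not in seen:
            seen.append(chunk.item_id)
    return seen


index = [
    Chunk("note-1", "Project deadlines for Q3", (0.8, 0.2, 0.1)),
    Chunk("note-1", "Milestones and owners", (0.7, 0.3, 0.0)),
    Chunk("recipe-9", "Pasta with lemon", (0.0, 0.1, 0.9)),
]
top = retrieve((0.9, 0.1, 0.0), index, k=2)
print(citations(top))  # -> ['note-1']
```

Keeping the source `item_id` on every chunk is what makes it possible to tie a generated answer back to the items it drew from.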

Citation Guarantees

Answers include clickable references so you can jump directly to supporting source content.

Summaries and Chat

Automatic Summaries

  • Precomputed during pipeline processing.
  • Optimized for quick scanning in timeline workflows.
  • Free on every plan.

On-Demand Summaries

On-demand summaries are day and daypart summaries. They are available on Free, Pro Monthly, and Pro Lifetime, and they do not spend the Free monthly processing allowance.

Chat Prompts

Chat is the synthesis surface for follow-up questions over an item, selection, search result, day, daypart, or current timeline context. Chat prompts do not spend the Free monthly processing allowance.

Privacy and Data Flow

pinakea is local-first by design.

  • Source content stays on your Mac.
  • In Mixed Mode, core pipeline stages can run locally.
  • Cloud-dependent operations send the content needed for that request over encrypted connections to model providers via OpenRouter. Online Mode also sends source content through the cloud pipeline for embeddings, summaries, titles, tags, and tag consolidation.

Cost Control

Cloud usage is bring-your-own-key (BYOK) through OpenRouter:

  • you control spend limits,
  • you see spend both in OpenRouter and in pinakea's in-app status reporting,
  • no pinakea markup on model usage.

Detailed cost examples: LLM Cost (OpenRouter).