Move AI commands to top of command palette, add Clear Chat
Details
When AI is enabled, "Chat with Duncan", "Clear Duncan Chat", and
"Close Duncan" now appear at the very top of the command palette
results. Previously they were buried after navigation and sync commands.
"Clear Chat" calls ClearAll on the AI tray, clearing all messages,
intent suggestions, and content suggestion history in one action.
AI commands are conditionally included — they don't appear when AI
is disabled, keeping the palette clean.
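The ordering-plus-conditional-inclusion logic can be sketched as follows. This is a Python illustration only (the app is C#), and the function and parameter names are hypothetical, not the actual identifiers:

```python
def build_palette_commands(ai_enabled, base_commands):
    """Assemble command palette entries. AI commands go first when AI is
    enabled, and are omitted entirely when it is not (hypothetical names)."""
    ai_commands = ["Chat with Duncan", "Clear Duncan Chat", "Close Duncan"]
    return (ai_commands if ai_enabled else []) + base_commands
```

Prepending rather than sorting keeps the AI entries pinned above navigation and sync commands regardless of how the rest of the palette is ordered.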
Version: 1.50.7 → 1.50.8
Harden local LLM output: minimal prompt, aggressive sanitizer, sentence cap
Details
Local models (Phi-3, Mistral) ignore multi-rule system prompts and parrot
examples verbatim. Three-pronged fix:
1. System prompt collapsed to a single line with minimal directives. No
numbered rules, no examples. Small models handle one dense instruction
paragraph better than structured rule lists.
2. Sanitizer expanded to strip markdown bold/italic, markdown headers,
bullet prefixes, "Note:" disclaimer lines, and all known chat template
tokens (including Llama 3.x eot_id/start_header_id variants).
3. Hard sentence-count truncation per tier: Short caps at 2 sentences,
Medium at 5, Long at 30. Applied after all other sanitization so the
model physically cannot produce a wall of text for a greeting.
Token budgets also tightened: Short 100 → 60, Medium 300 → 250.
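The per-tier sentence cap (point 3) can be sketched like this. A Python illustration of the idea, not the actual C# implementation; the sentence split is a naive punctuation-based heuristic assumed for the sketch:

```python
import re

# Caps per tier, from the changelog: Short 2, Medium 5, Long 30.
SENTENCE_CAPS = {"Short": 2, "Medium": 5, "Long": 30}

def cap_sentences(text, tier):
    """Hard-truncate to the tier's sentence budget. Runs after all other
    sanitization, so the final reply can never exceed the cap."""
    # Naive split: a sentence is a run of text ending in ., !, or ?
    sentences = re.findall(r'[^.!?]+[.!?]+(?:\s+|$)|[^.!?]+$', text)
    return "".join(sentences[:SENTENCE_CAPS[tier]]).strip()
```

Because this runs last, even a model that ignores the prompt's length guidance is mechanically limited to two sentences for a greeting.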
Fix local LLM response quality: sanitize tokens, inject user name, strip examples
Details
Three fixes for the Duncan chat tray:
1. Removed example conversations from system prompt — local models were
parroting them verbatim instead of following the behavioral guidance.
Replaced with terse directive-only prompt.
2. Added response sanitizer that strips raw chat template tokens
(<|assistant|>, <|end|>, <|im_start|>, etc.) and role prefixes
("- User:", "Assistant:", "Duncan:") that local models leak into output.
3. Injected user's display name (from app settings) into the system prompt
so Duncan addresses the user by name naturally.
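The sanitizer in point 2 amounts to literal token removal plus a role-prefix regex. A Python sketch of that logic (the real code is C#; the token list is illustrative and not exhaustive):

```python
import re

# Raw chat template tokens that local models leak into output.
TEMPLATE_TOKENS = ["<|assistant|>", "<|end|>", "<|im_start|>", "<|im_end|>",
                   "<|eot_id|>", "<|start_header_id|>", "<|end_header_id|>"]

# Leading role labels like "- User:", "Assistant:", "Duncan:" at line starts.
ROLE_PREFIX = re.compile(r'^\s*(?:-\s*)?(?:User|Assistant|Duncan)\s*:\s*',
                         re.IGNORECASE | re.MULTILINE)

def sanitize(raw):
    for tok in TEMPLATE_TOKENS:
        raw = raw.replace(tok, "")
    raw = ROLE_PREFIX.sub("", raw)
    return raw.strip()
```

Literal replacement is deliberate for the template tokens: they are fixed strings, so there is no need to risk a regex matching legitimate angle-bracket content.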
Add response tier classifier for context-aware chat response lengths
Details
Introduces a three-tier response budget system (Short/Medium/Long) that
classifies the user's message intent before sending the AI request. Short
tier (greetings, yes/no, can't-do topics) caps at 100 tokens with 1-2
sentence guidance. Medium tier (explanations, how-to, brainstorming) allows
300 tokens and 3-5 sentences. Long tier (summarize a page, draft an email,
long-form writing) allows 800 tokens with paragraph-level output. The
classifier uses keyword matching with message length heuristics, and the
system prompt length rule adapts per tier. All configuration remains
centralized in AiPersona.cs.
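The classifier described above can be sketched as keyword matching with a word-count fallback. This is a hedged Python illustration; the keyword sets and thresholds are assumptions, not the shipped lists in AiPersona.cs:

```python
# Illustrative keyword sets; the real lists live in the C# configuration.
SHORT_KEYWORDS = {"hi", "hello", "hey", "thanks", "yes", "no"}
LONG_KEYWORDS = {"summarize", "draft", "write", "email"}

# Token budgets per tier, from the changelog.
TIER_BUDGETS = {"Short": 100, "Medium": 300, "Long": 800}

def classify_tier(message):
    """Keyword match first, then fall back to a message-length heuristic."""
    words = [w.strip("!?.,").lower() for w in message.split()]
    if any(w in SHORT_KEYWORDS for w in words) and len(words) <= 4:
        return "Short"
    if any(w in LONG_KEYWORDS for w in words):
        return "Long"
    return "Short" if len(words) <= 3 else "Medium"
```

The tier then selects both the request's token budget (`TIER_BUDGETS`) and the length rule injected into the system prompt, so cost and verbosity scale with intent.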
Overhaul Duncan chat system prompt for terse, personality-driven responses
Details
The previous system prompt was a single generic sentence that the model
ignored, producing wall-of-text responses with repetitive AI disclaimers.
Replaced with a strict rule-based prompt enforcing 1-2 sentence replies,
no self-identification, no filler, and clear boundaries around what Duncan
can and cannot do (no internet, local-only). Dropped MaxTokens from 1024
to 200 and Temperature from 0.7 to 0.4 to further constrain verbosity.
Moved the system prompt into AiPersona.cs alongside the Name constant so
all persona configuration lives in one file.
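The "one file" consolidation can be pictured as a single immutable persona object. A hypothetical Python mirror of what AiPersona.cs might hold after this change (values from this entry; the prompt text is a paraphrase, not the shipped prompt):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AiPersona:
    """All persona configuration in one place (hypothetical mirror of AiPersona.cs)."""
    name: str = "Duncan"
    max_tokens: int = 200      # down from 1024
    temperature: float = 0.4   # down from 0.7, to constrain verbosity
    system_prompt: str = (
        "You are Duncan, a terse local assistant. Reply in 1-2 sentences. "
        "Do not identify as an AI. No filler. You run locally and have no "
        "internet access."
    )
```

Keeping name, sampling parameters, and prompt together means a persona change touches exactly one file.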