# ACS Planner & Selection
Abstract: This specification defines the core planning algorithm used by ACS (Adaptive Context Scaling) to select high-value knowledge fragments under cognitive budgets. The algorithm combines a benefit/cost scoring model, deterministic tie-breaking rules, and KV policy management for multi-layered context assembly.
## Objectives

- Maximize marginal benefit under the token budget
- Balance the LOD mix per `lod_mix` and entity coverage
- Ensure deterministic, reproducible selection behavior
- Support graceful degradation under resource constraints
## Benefit/Cost Model (v0)
### Core Scoring Function

```typescript
// Fragment and Gaze are assumed to expose at least the fields used below
// (entities, timestamp, citations, cost_tokens on Fragment; entities on Gaze).

// Documents the weighted components of the benefit score.
interface ScoringComponents {
  entity_overlap: number;         // α·entity_overlap
  recency: number;                // β·recency
  citation_availability: number;  // γ·citation_availability
}

function calculateBenefit(fragment: Fragment, gaze: Gaze): number {
  const α = 0.6; // Entity overlap weight
  const β = 0.3; // Recency weight
  const γ = 0.1; // Citation weight

  const entity_overlap = calculateEntityOverlap(fragment.entities, gaze.entities);
  const recency = calculateRecencyScore(fragment.timestamp);
  const citation_availability = fragment.citations.length > 0 ? 1.0 : 0.5;

  return α * entity_overlap + β * recency + γ * citation_availability;
}

function calculateScore(fragment: Fragment, gaze: Gaze): number {
  const benefit = calculateBenefit(fragment, gaze);
  const cost = fragment.cost_tokens;
  return benefit / (1 + cost / 1000); // Benefit per normalized token cost
}
```
### Entity Overlap Calculation

```typescript
function calculateEntityOverlap(fragmentEntities: string[], gazeEntities: string[]): number {
  if (gazeEntities.length === 0) return 0;
  // Fraction of gaze entities that appear in the fragment (deduplicated so the score stays ≤ 1)
  const fragmentSet = new Set(fragmentEntities);
  const covered = gazeEntities.filter(e => fragmentSet.has(e));
  return covered.length / gazeEntities.length;
}
```
### Recency Scoring

```typescript
function calculateRecencyScore(timestamp: Date): number {
  const ageHours = (Date.now() - timestamp.getTime()) / (1000 * 60 * 60);
  // Exponential decay with a one-week time constant (score ≈ 0.37 after 7 days)
  return Math.exp(-ageHours / (24 * 7));
}
```
## Greedy Selection Algorithm

```typescript
// Helper types referenced by the planner (shapes inferred from the usage below).
interface ScoredFragment {
  fragment: Fragment;
  score: number;
}

interface SelectionResult {
  fragments: Fragment[];
  metrics: {
    total_cost_tokens: number;
    mean_benefit: number;
  };
}

function selectFragments(
  candidates: Fragment[],
  gaze: Gaze,
  budgets: Budgets
): SelectionResult {
  // 1. Score all candidates
  const scoredCandidates = candidates
    .map(fragment => ({
      fragment,
      score: calculateScore(fragment, gaze)
    }))
    .sort((a, b) => b.score - a.score);

  // 2. Apply tie-breaking rules
  const rankedCandidates = applyTieBreaking(scoredCandidates);

  // 3. Greedy selection under the token budget; fragments that do not fit are skipped
  const selected: Fragment[] = [];
  let totalTokens = 0;

  for (const candidate of rankedCandidates) {
    if (totalTokens + candidate.fragment.cost_tokens <= budgets.tokens_max) {
      selected.push(candidate.fragment);
      totalTokens += candidate.fragment.cost_tokens;
    }
  }

  return {
    fragments: selected,
    metrics: {
      total_cost_tokens: totalTokens,
      mean_benefit: calculateMeanBenefit(selected, gaze)
    }
  };
}

// Mean raw benefit of the selected fragments (0 for an empty selection).
function calculateMeanBenefit(selected: Fragment[], gaze: Gaze): number {
  if (selected.length === 0) return 0;
  const total = selected.reduce((sum, f) => sum + calculateBenefit(f, gaze), 0);
  return total / selected.length;
}
```
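For reference, a minimal usage sketch. The fragment field values and the 2,000-token budget are illustrative examples, and the `frag`/`gazeOf`/`budgetsOf` casts are scaffolding for the sketch only, since the full schema types are defined elsewhere:

```typescript
// Sketch-only helpers: cast partial literals to the schema types for illustration.
const frag = (f: Partial<Fragment>) => f as Fragment;
const gazeOf = (g: Partial<Gaze>) => g as Gaze;
const budgetsOf = (b: Partial<Budgets>) => b as Budgets;

const plan = selectFragments(
  [
    frag({ id: "frag-001", entities: ["planner", "budget"], timestamp: new Date(), citations: [], cost_tokens: 420 }),
    frag({ id: "frag-002", entities: ["kv-cache"], timestamp: new Date(Date.now() - 3 * 86_400_000), citations: [], cost_tokens: 900 })
  ],
  gazeOf({ entities: ["planner", "kv-cache"] }),
  budgetsOf({ tokens_max: 2000 })
);

console.log(plan.metrics.total_cost_tokens, plan.fragments.map(f => f.id));
```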
## Tie‑Breaking Rules

Order of preference when scores are equal:

1. Freshness: newer fragment versions over stale ones
2. Citation confidence: higher citation count first
3. Cohesion: fewer unresolved dependencies
4. Diversity: prefer a different source than the last N picks
5. Size: smaller fragments first when near the budget boundary
6. Deterministic fallback: lexicographic by `fragment_id`
```typescript
// v0 comparator: implements rules 1 (freshness), 2 (citations), 5 (size), and 6 (ID fallback).
// Cohesion and diversity (rules 3 and 4) are not applied here; see the sketch below.
function applyTieBreaking(candidates: ScoredFragment[]): ScoredFragment[] {
  return candidates.sort((a, b) => {
    // Primary: score (differences within 0.001 are treated as ties)
    if (Math.abs(a.score - b.score) > 0.001) return b.score - a.score;

    // Tie-break 1: Freshness
    const freshnessA = a.fragment.timestamp.getTime();
    const freshnessB = b.fragment.timestamp.getTime();
    if (freshnessA !== freshnessB) return freshnessB - freshnessA;

    // Tie-break 2: Citation confidence
    const citationsA = a.fragment.citations.length;
    const citationsB = b.fragment.citations.length;
    if (citationsA !== citationsB) return citationsB - citationsA;

    // Tie-break 3: Size (smaller first near the budget boundary)
    const costA = a.fragment.cost_tokens;
    const costB = b.fragment.cost_tokens;
    if (costA !== costB) return costA - costB;

    // Tie-break 4: Deterministic fallback by fragment ID
    return a.fragment.id.localeCompare(b.fragment.id);
  });
}
```
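Cohesion and diversity (rules 3 and 4) are omitted from the v0 comparator. A minimal sketch of how they could slot in, assuming hypothetical `unresolved_deps` and `source_id` fields on `Fragment` and access to the source IDs of the last N picks; these fields are illustrative, not part of the current schema:

```typescript
// Sketch only: unresolved_deps, source_id, and recentSources are assumed/hypothetical.
function cohesionDiversityTieBreak(
  a: Fragment & { unresolved_deps?: string[]; source_id?: string },
  b: Fragment & { unresolved_deps?: string[]; source_id?: string },
  recentSources: string[] // source_ids of the last N picks
): number {
  // Rule 3 (cohesion): fewer unresolved dependencies first
  const depsA = a.unresolved_deps?.length ?? 0;
  const depsB = b.unresolved_deps?.length ?? 0;
  if (depsA !== depsB) return depsA - depsB;

  // Rule 4 (diversity): prefer a source not seen in the last N picks
  const seenA = a.source_id ? recentSources.includes(a.source_id) : false;
  const seenB = b.source_id ? recentSources.includes(b.source_id) : false;
  if (seenA !== seenB) return seenA ? 1 : -1;

  return 0; // fall through to the remaining tie-breaks
}
```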
## KV Policy Heuristics

### Policy Framework

```typescript
interface KVPolicy {
  pin: string[];      // Fragment IDs that must remain in memory
  compress: string[]; // Fragment IDs eligible for compression
  evict: string[];    // Fragment IDs to remove from cache
}

// Assumes benefit and lastAccessTime have been attached to each fragment
// during scoring and cache bookkeeping.
function generateKVPolicy(selectedFragments: Fragment[]): KVPolicy {
  const policy: KVPolicy = { pin: [], compress: [], evict: [] };

  for (const fragment of selectedFragments) {
    if (fragment.lod === 'macro') {
      // Pin macro summaries for stable context
      policy.pin.push(fragment.id);
    } else if (fragment.cost_tokens > 500 && fragment.benefit >= 0.3 && fragment.benefit <= 0.6) {
      // Compress large micros with medium benefit
      policy.compress.push(fragment.id);
    } else if (fragment.benefit < 0.2 && fragment.lastAccessTime < Date.now() - 86400000) {
      // Evict low-benefit, stale items (older than 24h)
      policy.evict.push(fragment.id);
    }
  }

  return policy;
}
```
### Memory Management Rules

- Keys required by a chosen fragment MUST be pinned until rendered or evicted
- Pin macro summaries; compress large micros; evict redundant atomics
- Evict lowest‑reuse keys first; reuse score decays with time since last access
- Promote keys referenced by top‑K candidate fragments to reduce thrash (see the sketch after this list)
- Max KV size: 64 items (default); beyond that, apply soft backoff to reduce new key introductions
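A minimal sketch of the promotion rule, assuming a hypothetical `kv_keys` field that maps a fragment to the cache keys it requires, and a `CacheEntry` carrying `id` and a mutable `lastAccessTime`:

```typescript
// Sketch only: kv_keys is a hypothetical fragment-to-cache-key mapping.
function promoteReferencedKeys(
  cache: CacheEntry[],
  topCandidates: Array<Fragment & { kv_keys?: string[] }>,
  k = 5
): void {
  // Collect the keys referenced by the top-K candidates
  const hot = new Set(topCandidates.slice(0, k).flatMap(f => f.kv_keys ?? []));
  const now = Date.now();

  // Refresh last-access time so these keys score low for eviction
  for (const entry of cache) {
    if (hot.has(entry.id)) entry.lastAccessTime = now;
  }
}
```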
### Cache Eviction Algorithm

```typescript
// Note: pinned entries should be filtered out before calling this,
// per the memory management rules above.
function selectEvictionCandidates(cache: CacheEntry[], maxSize: number): string[] {
  if (cache.length <= maxSize) return [];

  // Score for eviction (higher = more likely to evict)
  const scoredEntries = cache.map(entry => ({
    id: entry.id,
    evictionScore: calculateEvictionScore(entry)
  })).sort((a, b) => b.evictionScore - a.evictionScore);

  const toEvict = cache.length - maxSize;
  return scoredEntries.slice(0, toEvict).map(entry => entry.id);
}

function calculateEvictionScore(entry: CacheEntry): number {
  const ageHours = (Date.now() - entry.lastAccessTime) / (1000 * 60 * 60);
  const agePenalty = Math.log(1 + ageHours);    // older entries score higher
  const sizePenalty = entry.sizeTokens / 1000;  // larger entries score higher
  const benefitReward = entry.lastBenefit || 0; // recently useful entries are kept
  return agePenalty + sizePenalty - benefitReward * 2;
}
```
## Performance Metrics

### Core Metrics Tracked

- `planner_ms`: Time spent in the planning phase
- `candidate_count`: Total candidates evaluated
- `tokens_selected`: Total tokens in selected fragments
- `coverage@entities`: Fraction of gaze entities covered by the selection
- `diversity_score`: Source diversity in the final selection (computation of the last two sketched below)
- `cache_hit_rate`: KV cache effectiveness
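A minimal sketch of how `coverage@entities` and `diversity_score` could be computed; the `source` field used for diversity is illustrative and not part of the current Fragment schema:

```typescript
// Fraction of gaze entities that appear in at least one selected fragment.
function coverageAtEntities(selected: Fragment[], gaze: Gaze): number {
  if (gaze.entities.length === 0) return 0;
  const present = new Set(selected.flatMap(f => f.entities));
  const covered = gaze.entities.filter(e => present.has(e));
  return covered.length / gaze.entities.length;
}

// Unique sources over selection size; `source` is a hypothetical field, with the fragment ID as fallback.
function diversityScore(selected: Array<Fragment & { source?: string }>): number {
  if (selected.length === 0) return 0;
  const sources = new Set(selected.map(f => f.source ?? f.id));
  return sources.size / selected.length;
}
```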
### Observability Implementation

```typescript
interface PlannerMetrics {
  request_id: string;
  planner_ms: number;
  candidate_count: number;
  tokens_selected: number;
  coverage_entities: number;
  diversity_score: number;
  top3_fragment_ids: string[];
  top3_scores: number[];
  kv_cache_stats: {
    hits: number;
    misses: number;
    evictions: number;
  };
}

// Logging guidelines:
// - Log per selection: request_id, k_selected, total_cost_tokens, top3_ids, top3_scores
// - Do not log raw text; hash fragment IDs only if required for privacy
// - Include performance breakdown for optimization
```
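A minimal logging sketch following these guidelines; the `hashId` helper, the `kSelected` parameter, and the structured-log shape are illustrative, with Node's built-in `crypto` module assumed:

```typescript
import { createHash } from "crypto";

// Short, non-reversible handle for a fragment ID (privacy-preserving).
function hashId(id: string): string {
  return createHash("sha256").update(id).digest("hex").slice(0, 12);
}

function logPlannerSelection(metrics: PlannerMetrics, kSelected: number): void {
  // Structured, per-selection log; fragment IDs are hashed and no raw text is emitted.
  console.log(JSON.stringify({
    request_id: metrics.request_id,
    k_selected: kSelected,
    total_cost_tokens: metrics.tokens_selected,
    top3_ids: metrics.top3_fragment_ids.map(hashId),
    top3_scores: metrics.top3_scores,
    planner_ms: metrics.planner_ms
  }));
}
```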
## Edge Cases & Error Handling

### Common Edge Cases

- Empty candidates → empty plan, metrics zeroed (guard sketched below)
- `tokens_max == 0` → empty plan
- Single very expensive candidate → include only if benefit > 0.9 and the explicit `allow_overshoot` flag is set
- All candidates exceed the budget → fall back to the overshoot policy below (highest-benefit fragment, with a warning)
- Identical scores and tie-breakers → deterministic fallback by fragment ID
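Illustrative guards that could sit at the top of `selectFragments` to cover the first two cases; the zeroed-metrics shape mirrors `SelectionResult`:

```typescript
// Sketch only: early exits for degenerate inputs, returning an empty plan with zeroed metrics.
function emptyPlan(): SelectionResult {
  return { fragments: [], metrics: { total_cost_tokens: 0, mean_benefit: 0 } };
}

function guardDegenerateInputs(candidates: Fragment[], budgets: Budgets): SelectionResult | null {
  if (candidates.length === 0) return emptyPlan();
  if (budgets.tokens_max === 0) return emptyPlan();
  return null; // proceed with normal selection
}
```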
### Budget Overshoot Policy

```typescript
// Assumes benefit was computed and attached to each candidate during scoring.
function handleBudgetOvershoot(
  candidates: Fragment[],
  budgets: Budgets,
  options: { allow_overshoot?: boolean }
): Fragment[] {
  if (!options.allow_overshoot) {
    return []; // Strict budget compliance
  }
  if (candidates.length === 0) {
    return []; // Nothing to fall back to
  }

  // Find the single best fragment, even though it exceeds the budget
  const bestCandidate = candidates.reduce((best, current) =>
    current.benefit > best.benefit ? current : best
  );

  if (bestCandidate.benefit > 0.9) {
    console.warn(`Budget overshoot: ${bestCandidate.cost_tokens} > ${budgets.tokens_max}`);
    return [bestCandidate];
  }

  return [];
}
```
## Future Work
- Fusion with Experience‑RAG scores (recency × reinforcement)
- Subgraph expansion pre‑selection for graph hops
- Learned scoring model with offline evaluation
- Multi-objective optimization for quality vs. latency trade-offs
- Dynamic budget allocation based on query complexity
- A/B testing framework for scoring function improvements