Dense attention (OOM risk)
Cost: O(N) memory for N cached tokens. The cache grows without bound, so a long enough stream eventually runs out of memory.
Sliding window only (W = 8)
Cost: O(W) memory. Perplexity collapses once the sequence grows past W tokens and the initial tokens are evicted from the cache.
Sliding window + attention sinks (K = 4, W = 8)
Cost: O(W + K) memory. Perplexity stays stable as the stream grows, because the first K tokens are never evicted.
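
The three retention policies above can be sketched as a small Python example. This is only an illustration of which token positions remain in the cache, not a real attention implementation; the function names `dense_positions` and `kept_positions` and their parameters are hypothetical, chosen here to mirror N, W, and K.

```python
def dense_positions(n_tokens):
    """Dense attention: every token seen so far stays cached, O(N)."""
    return list(range(n_tokens))


def kept_positions(n_tokens, window=8, sinks=0):
    """Positions still cached after n_tokens have been processed.

    sinks=0 gives a plain sliding window (O(W)).
    sinks=K additionally pins the first K tokens (O(W + K)).
    Hypothetical helper, assuming eviction of the oldest non-sink token.
    """
    # The first `sinks` tokens are never evicted.
    sink_part = list(range(min(sinks, n_tokens)))
    # The most recent `window` tokens, without double-counting sinks.
    start = max(len(sink_part), n_tokens - window)
    recent_part = list(range(start, n_tokens))
    return sink_part + recent_part


if __name__ == "__main__":
    n = 20
    print(len(dense_positions(n)))             # cache size equals n
    print(kept_positions(n, window=8))         # initial tokens are gone
    print(kept_positions(n, window=8, sinks=4))
```

With 20 tokens, W = 8, and K = 4, the plain window keeps positions 12 through 19 only, while the sink variant keeps positions 0 through 3 plus 12 through 19, so the cache never exceeds W + K entries.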