Freshness and the Limits of Heuristic Trend Detection in Temporal RAG
Abstract
We present a lightweight, model-agnostic temporal layer for RAG and use cybersecurity data to separate two problems that are usually conflated.
For freshness, a half-life recency prior surfaces the newest relevant item where a cosine-only baseline scores 0.00. On a hard NVD CVE test, in which the freshest item is not the most similar one, it reaches a Latest@10 of 0.60 versus 0.20 for a semantic-then-newest baseline, although the gain remains partial and parameter-sensitive.
For topic evolution, a heuristic tracker's low macro-F1 of 0.08 is driven by the labeling rule, not the clusterer: with HDBSCAN the score is only 0.10, whereas fixing the rule alone reaches 0.49, and 0.96 without clustering noise.
We contribute a reproducible decoupling of the two problems, a candid statement of the real-data scope, and a reference implementation.
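
To make the half-life recency prior concrete, the following is a minimal sketch, not the reference implementation. It assumes the prior is an exponential decay that halves every `half_life_days` and that it is blended linearly with cosine similarity through a weight `alpha`; the abstract does not specify either choice. The names `Doc`, `recency_weight`, `rerank`, `half_life_days`, and `alpha` are hypothetical, and the document IDs and scores are illustrative placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Doc:
    doc_id: str          # hypothetical identifier
    cosine: float        # semantic similarity to the query, assumed in [0, 1]
    published: datetime  # publication timestamp (timezone-aware)


def recency_weight(age_days: float, half_life_days: float) -> float:
    """Exponential decay: 1.0 at age 0, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def rerank(docs, now, half_life_days=30.0, alpha=0.5):
    """Sort by a blend of similarity and recency (assumed linear blend).

    alpha = 0 reduces to a cosine-only ranking; alpha = 1 ranks by age alone.
    """
    def score(d: Doc) -> float:
        age_days = (now - d.published).total_seconds() / 86400.0
        return (1.0 - alpha) * d.cosine + alpha * recency_weight(age_days, half_life_days)

    return sorted(docs, key=score, reverse=True)


# Illustrative data only: an older, slightly more similar item
# versus a fresh, slightly less similar one.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
docs = [
    Doc("older-advisory", cosine=0.90, published=now - timedelta(days=365)),
    Doc("fresh-advisory", cosine=0.80, published=now - timedelta(days=1)),
]

cosine_only = rerank(docs, now, alpha=0.0)
with_prior = rerank(docs, now, half_life_days=30.0, alpha=0.5)
print([d.doc_id for d in cosine_only])
print([d.doc_id for d in with_prior])
```

Both `half_life_days` and `alpha` change the ordering in this sketch, which is one way the parameter sensitivity noted in the abstract can arise: a long half-life or a small `alpha` lets the older, more similar item win again.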