Sampling for Region-Aggregated Spatial Scan Statistics

arXiv Stat

이 뉴스, 어떠셨어요?

한 번의 탭으로 반응을 남겨요 · 로그인 불필요

CC BY

이 매체는 공공·자유 라이선스로 본문을 직접 표시합니다.

Abstract

Anomaly detection in geospatial data is a crucial tool in geographic information science (GIS), with applications ranging from national security to public-health surveillance to the study of societal disparities.

This work focuses on spatial scan statistics and addresses a key mismatch: spatial counts are typically aggregated into predefined regions (census tracts, zip codes, counties), whereas the most efficient scan algorithms operate on spatial point data.

The standard remedy -- collapsing each region to its centroid, as in widely used tools such as SaTScan -- is convenient but, as we show, discards the region's spatial extent and causes a significant loss in statistical power.

To resolve this, we propose a simple yet scalable fix: replace each spatial region with 20-50 points sampled uniformly from its geometry and spread the region's values evenly across them.

This approach improves statistical power while maintaining computational tractability.

A convergence analysis explains why so few samples per region suffice.

We recommend this sampling-based conversion as the default way to apply point-based spatial scan statistics to region-aggregated data for anomaly detection.

전문 보기

Sampling for Region-Aggregated Spatial Scan Statistics

이 뉴스, 어떠셨어요?

Abstract

관련 뉴스

'research' 카테고리 뉴스

PACE: A Neuro-Symbolic Framework for Plausible and Actionable Counterfactual Explanations

Auto-FL-Research: Agentic Search for Federated Learning Algorithms

The Wiola Architecture for Efficient Small Language Models

arXiv의 다른 기사

CreativityNeuro: Steering Language Model Weights to Improve Divergent Thinking and Reduce Mode Collapse

Discrete Diffusion Language Models for Interactive Radiology Report Drafting

Beyond Next-Token Prediction: An RLVR Proof of Concept for Tool-Use Agents on Atlassian Workflows