Post-selection inference for network structure
Abstract
Researchers often use the density of connections between groups of agents, such as communities, blocs, or markets, to characterize the structure of a social or economic network.
In many cases, these groups are selected using the network data, making conventional fixed-group inference procedures potentially invalid.
To address this issue, we develop two new confidence intervals that are universally valid post-selection in the sense that they guarantee simultaneous coverage asymptotically over all pairs of groups whose relative sizes do not vanish.
Our first interval builds on a strategy of \cite{berk2013valid}.
Our second interval is based on a Talagrand-type concentration inequality for empirical processes.
Both intervals are simple to compute and scalable to large networks, but a key technical contribution of our paper is to show that only the second interval achieves the best-possible width asymptotically up to a constant factor.
Three empirical illustrations show that accounting for selection can matter in practice.
Some evidence for homophily in a social network and a hub-and-spoke structure in a trade network survives our correction, while evidence for disjoint market segments in a worker transition network does not.