Nonextensive Statistical Signatures of the Bilaterian Transition in Proteome Length Distributions
Abstract
Protein length distributions across the tree of life carry a quantitative signature of organismal complexity.
Nonextensive statistical mechanics, through the Tsallis generalized entropy formalism, provides a natural framework for describing complex systems characterized by long-range correlations, scale invariance, and hierarchical organization -- features that classical Boltzmann-Gibbs statistics cannot accommodate.
In this work, the complementary cumulative distribution function (CCDF) of protein lengths is analyzed within this framework for the reference proteomes of 22 fully sequenced organisms spanning the domains Archaea, Bacteria, and Eukarya, with deliberate sampling across the animal transition zone from sponges and cnidarians to higher bilaterians.
Maximum likelihood estimation (MLE) of truncated discrete q-exponential distributions, with bootstrap 95% confidence intervals (CIs), reveals that the entropic index q resolves into three statistically distinct regimes: superextensive (q < 1) for prokaryotes, unicellular and non-animal multicellular eukaryotes, and basal animals; a boundary regime (CI on q spanning unity) for the two cnidarians studied and the basal bilaterian C. teleta; and subextensive (q > 1) for all higher bilaterians, with q increasing monotonically across the four deuterostomes sampled, from S. purpuratus (1.033) to H. sapiens (1.147).
The q-exponential outperforms the ordinary exponential distribution across all 22 proteomes and becomes progressively more competitive against alternative two-parameter distributions as proteome complexity increases.
These results identify the Tsallis entropic index as a continuous, physically interpretable indicator of proteome organizational complexity and extend the applicability of nonextensive statistical mechanics to proteomic systems.
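The fitting procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the parametrization (a scale parameter `lam` and weights proportional to e_q(-L/lam)), the truncation bounds `l_min` and `l_max`, the function names, the coarse grid search used in place of a numerical optimizer, and the percentile bootstrap are all assumptions made for illustration.

```python
import math
import random


def q_exp(x, q):
    """Tsallis q-exponential e_q(x) = [1 + (1 - q) x]_+^(1 / (1 - q)).

    Reduces to exp(x) at q = 1. Only used here with x <= 0.
    """
    if abs(q - 1.0) < 1e-9:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        return 0.0  # for q < 1 the distribution has a finite cutoff
    return base ** (1.0 / (1.0 - q))


def truncated_pmf(q, lam, l_min, l_max):
    """Discrete q-exponential on the integers l_min..l_max, normalized to 1."""
    weights = [q_exp(-l / lam, q) for l in range(l_min, l_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(lengths, q, lam, l_min, l_max):
    """Log-likelihood of observed lengths under the truncated discrete model."""
    pmf = truncated_pmf(q, lam, l_min, l_max)
    ll = 0.0
    for l in lengths:
        p = pmf[l - l_min]
        if p <= 0.0:
            return -math.inf  # observation lies beyond the q < 1 cutoff
        ll += math.log(p)
    return ll


def fit_grid(lengths, q_grid, lam_grid, l_min, l_max):
    """Assumed stand-in for an optimizer: maximize the likelihood on a grid."""
    best = max(
        (log_likelihood(lengths, q, lam, l_min, l_max), q, lam)
        for q in q_grid
        for lam in lam_grid
    )
    return best[1], best[2]


def bootstrap_ci_q(lengths, q_grid, lam_grid, l_min, l_max, n_boot=200, seed=0):
    """Percentile bootstrap 95% interval for q (resample lengths with replacement)."""
    rng = random.Random(seed)
    qs = sorted(
        fit_grid(rng.choices(lengths, k=len(lengths)), q_grid, lam_grid, l_min, l_max)[0]
        for _ in range(n_boot)
    )
    lo = qs[round(0.025 * (n_boot - 1))]
    hi = qs[round(0.975 * (n_boot - 1))]
    return lo, hi
```

Under this sketch, a proteome would fall in the boundary regime when the interval returned by `bootstrap_ci_q` contains 1, in the q < 1 regime when the upper bound lies below 1, and in the q > 1 regime when the lower bound lies above 1. A grid search limits the resolution of the estimate to the grid spacing, so a continuous optimizer would be preferable in practice.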