Scholar iON
Academic Synthesis
The collected works present a diverse exploration of statistical physics across various domains, illustrating the versatility and applicability of statistical methodologies in understanding complex systems. Abramov et al. (2023) employ $q$-statistics to analyze human electroencephalograms, suggesting that this non-additive entropy framework is apt for capturing the brain's complexity beyond traditional Boltzmann-Gibbs statistics. Allen and Waclaw (2018) highlight the role of statistical physics in unraveling bacterial growth phenomena, emphasizing the interplay between theoretical challenges and experimental validation. Zhu (2005) contributes to the discourse on statistical significance by refining the correlation between normal distribution and p-values, applicable in both discrete and continuous data contexts. Okun (2000) investigates the impact of static gravity on photons, integrating classical and quantum perspectives. Collectively, these studies underscore the profound insight statistical physics offers into intricate natural processes, fostering advancements across neuroscience, microbiology, statistical theory, and gravitational physics.
The brain is a complex system whose understanding enables potentially deeper approaches to mental phenomena. Dynamics of wide classes of complex systems have been satisfactorily described within $q$-statistics, a current generalization of Boltzmann-Gibbs (BG) statistics. Here, we study electroencephalograms (EEG) of typical human adults, specifically the inter-occurrence times across an arbitrarily chosen threshold of the signal (observed, for instance, at the midparietal location on the scalp). The distributions of these inter-occurrence times differ from those usually emerging within BG statistical mechanics. They are instead well described by $q$-statistical theory, which is based on non-additive entropies characterized by the index $q$. The present method points towards a suitable tool for quantitatively accessing brain complexity, thus potentially opening useful studies of the properties of both typical and altered brain physiology.
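The threshold-crossing analysis described in this abstract can be illustrated with a minimal sketch. It assumes that an "occurrence" is an upward crossing of the threshold and that samples are evenly spaced by `dt`; both are illustrative assumptions, since the abstract does not specify the crossing convention. The function names are hypothetical. The `q_exponential` helper implements the standard Tsallis $q$-exponential, which reduces to the ordinary exponential as $q \to 1$.

```python
import math


def inter_occurrence_times(signal, threshold, dt=1.0):
    """Times between successive upward crossings of `threshold`.

    Assumption (not from the abstract): an occurrence is a sample at which
    the signal goes from below the threshold to at or above it.
    """
    crossings = [
        i for i in range(1, len(signal))
        if signal[i - 1] < threshold <= signal[i]
    ]
    return [(b - a) * dt for a, b in zip(crossings, crossings[1:])]


def q_exponential(x, q):
    """Tsallis q-exponential: e_q(x) = [1 + (1 - q) x]^(1 / (1 - q))."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)  # BG limit q -> 1
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # standard cutoff for q < 1
        raise ValueError("e_q(x) is not defined for this x when q > 1")
    return base ** (1.0 / (1.0 - q))


# Toy signal, not EEG data.
toy_signal = [0, 2, 0, 0, 2, 0, 2]
times = inter_occurrence_times(toy_signal, threshold=1)
```

A histogram of such times could then be compared with a decaying form such as `q_exponential(-b * t, q)`, where the scale `b` and the index `q` are fit parameters.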
Bacterial growth presents many beautiful phenomena that pose new theoretical challenges to statistical physicists, and are also amenable to laboratory experimentation. This review provides some of the essential biological background, discusses recent applications of statistical physics in this field, and highlights the potential for future research.
A definition of statistical significance is proposed, constructed by relating the integral probability of the normal distribution to the p-value observed in an experiment. The definition is suitable for both counting experiments and continuous test statistics.
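The relation between a p-value and a significance expressed in standard deviations of the normal distribution can be sketched as follows. Two common conventions are shown, two-sided and one-sided. Which one the paper adopts is not stated in this abstract, so both functions are illustrative assumptions with hypothetical names.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def significance_two_sided(p_value):
    """Z such that the N(0,1) probability inside [-Z, Z] equals 1 - p."""
    return _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)


def significance_one_sided(p_value):
    """Z such that the N(0,1) probability above Z equals p."""
    return _STD_NORMAL.inv_cdf(1.0 - p_value)


# A two-sided p-value of about 0.0027 corresponds to roughly 3 sigma.
z = significance_two_sided(0.0027)
```

Because the mapping takes only a p-value as input, it applies unchanged whether that p-value comes from a counting experiment or from a continuous test statistic.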
The influence of a static gravitational field on the frequency, wavelength and velocity of photons, and on the energy levels of atoms and nuclei, is considered in the most elementary way. The interconnection between these phenomena is stressed.
Data science has become increasingly essential for the production of official statistics, as it enables the automated collection, processing, and analysis of large amounts of data. With such practices in place, reporting becomes more timely, more insightful and more flexible. However, the quality and integrity of data-science-driven statistics rely on the accuracy and reliability of the data sources and the machine learning techniques that support them. In particular, changes in data sources are inevitable and pose significant risks that are crucial to address in the context of machine learning for official statistics.
This paper gives an overview of the main risks, liabilities, and uncertainties associated with changing data sources in the context of machine learning for official statistics. We provide a checklist of the most prevalent origins and causes of changing data sources; not only on a technical level but also regarding ownership, ethics, regulation, and public perception. Next, we highlight the repercussions of changing data sources on statistical reporting. These include technical effects such as concept drift, bias, availability, validity, accuracy and completeness, but also the neutrality and potential discontinuation of the statistical offering. We offer a few important precautionary measures, such as enhancing robustness in both data sourcing and statistical techniques, and thorough monitoring. In doing so, machine learning-based official statistics can maintain integrity, reliability, consistency, and relevance in policy-making, decision-making, and public discourse.
In this paper, we introduce an approach to the protein folding problem from the point of view of statistical physics. Protein folding is a stochastic process by which a polypeptide folds into its characteristic and functional 3D structure from random coil. The process involves an intricate interplay between global geometry and local structure, and each protein seems to present special problems. We introduce CSAW (conditioned self-avoiding walk), a model of protein folding that combines the features of self-avoiding walk (SAW) and the Monte Carlo method. In this model, the unfolded protein chain is treated as a random coil described by SAW. Folding is induced by hydrophobic forces and other interactions, such as hydrogen bonding, which can be taken into account by imposing conditions on SAW. Conceptually, the mathematical basis is a generalized Langevin equation. To illustrate the flexibility and capabilities of the model, we consider several examples, including helix formation, elastic properties, and the transition in the folding of myoglobin. From the CSAW simulation and physical arguments, we find a universal elastic energy for proteins, which depends only on the radius of gyration $R_{g}$ and the residue number $N$. The elastic energy gives rise to scaling laws $R_{g}\sim N^{\nu}$ in different regions with exponents $\nu=3/5,3/7,2/5$, consistent with the observed unfolded stage, pre-globule, and molten globule, respectively. These results indicate that CSAW can serve as a theoretical laboratory to study universal principles in protein folding.
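The two ingredients named in this abstract, a self-avoiding walk and the radius of gyration $R_{g}$, can be illustrated with a toy sketch. This is not the CSAW model: it samples plain self-avoiding walks on a 2D square lattice by restarting whenever the walk hits itself, with no hydrophobic or hydrogen-bonding conditions, and it makes no attempt to reproduce the exponents quoted above. All function names are hypothetical.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walk(n_steps, rng, max_tries=100000):
    """Sample a self-avoiding walk on the square lattice by simple rejection.

    A walk is grown one random step at a time and discarded as soon as it
    revisits a site. This is only practical for short walks.
    """
    for _attempt in range(max_tries):
        path = [(0, 0)]
        visited = {(0, 0)}
        for _step in range(n_steps):
            x, y = path[-1]
            dx, dy = rng.choice(MOVES)
            nxt = (x + dx, y + dy)
            if nxt in visited:
                break  # self-intersection: reject and restart
            path.append(nxt)
            visited.add(nxt)
        else:
            return path
    raise RuntimeError("no self-avoiding walk found within max_tries")


def radius_of_gyration(path):
    """Root-mean-square distance of the sites from their centroid."""
    n = len(path)
    cx = sum(x for x, _ in path) / n
    cy = sum(y for _, y in path) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in path) / n)


rng = random.Random(0)
walk = self_avoiding_walk(10, rng)
rg = radius_of_gyration(walk)
```

Averaging `radius_of_gyration` over many walks at several lengths and fitting the slope on a log-log plot is the usual way to estimate a scaling exponent $\nu$ numerically.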
While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory-level book for the field of Structural Bioinformatics. This book aims to give an introduction to Structural Bioinformatics, which is where the previous topics meet to explore three-dimensional protein structures through computational analysis. We provide an overview of existing computational techniques to validate, simulate, predict and analyse protein structures. More importantly, the book aims to provide practical knowledge about how and when to use such techniques. We will consider proteins from three major vantage points: Protein structure quantification, Protein structure prediction, and Protein simulation & dynamics.
In this chapter we explore basic physical and chemical concepts required to understand protein folding. We introduce major (de)stabilising factors of folded protein structures such as the hydrophobic effect and backbone entropy. In addition, we consider different states along the folding pathway, as well as natively disordered proteins and aggregated protein states. In this chapter, an intuitive understanding is provided about the protein folding process, to prepare for the next chapter on the thermodynamics of protein folding. In particular, it is emphasized that protein folding is a stochastic process and that proteins unfold and refold in a dynamic equilibrium. The effect of temperature on the stability of the folded and unfolded states is also explained.
The prediction of protein stability changes following single-point mutations plays a pivotal role in computational biology, particularly in areas like drug discovery, enzyme reengineering, and genetic disease analysis. Although deep-learning strategies have pushed the field forward, their use in standard workflows remains limited due to resource demands. Conversely, potential-like methods are fast, intuitive, and efficient. Yet, these typically estimate Gibbs free energy shifts without considering the free-energy variations in the unfolded protein state, an omission that may breach mass balance and diminish accuracy. This study shows that incorporating a mass-balance correction (MBC) to account for the unfolded state significantly enhances these methods. While many machine learning models partially model this balance, our analysis suggests that a refined representation of the unfolded state may improve the predictive performance.
Multi-Agent Pathfinding (MAPF) plays a critical role in various domains. Traditional MAPF methods typically assume unit edge costs and single-timestep actions, which limit their applicability to real-world scenarios. MAPFR extends MAPF to handle non-unit costs with real-valued edge costs and continuous-time actions, but its geometric collision model leads to an unbounded state space that compromises solver efficiency. In this paper, we propose MAPFZ, a novel MAPF variant on graphs with non-unit integer costs that preserves a finite state space while offering improved realism over classical MAPF. To solve MAPFZ efficiently, we develop CBS-NIC, an enhanced Conflict-Based Search framework incorporating time-interval-based conflict detection and an improved Safe Interval Path Planning (SIPP) algorithm. Additionally, we propose Bayesian Optimization for Graph Design (BOGD), a discretization method for non-unit edge costs that balances efficiency and accuracy with a sub-linear regret bound. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in runtime and success rate across diverse benchmark scenarios.
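The idea behind time-interval-based conflict detection and safe intervals can be sketched on integer time as follows. This is a simplified illustration, not CBS-NIC or the paper's improved SIPP. It assumes each agent's plan has already been converted into vertex occupancies `(vertex, start, end)` with half-open integer intervals `[start, end)`; that representation and all function names are hypothetical.

```python
def intervals_overlap(a, b):
    """Half-open intervals [start, end) overlap iff each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def find_vertex_conflicts(occupancy_a, occupancy_b):
    """Return (vertex, start, end) for every time window in which both agents hold a vertex."""
    conflicts = []
    for va, sa, ea in occupancy_a:
        for vb, sb, eb in occupancy_b:
            if va == vb and intervals_overlap((sa, ea), (sb, eb)):
                conflicts.append((va, max(sa, sb), min(ea, eb)))
    return conflicts


def safe_intervals(occupied, horizon):
    """Complement of the occupied intervals of one vertex within [0, horizon)."""
    safe = []
    t = 0
    for start, end in sorted(occupied):
        if start > t:
            safe.append((t, start))
        t = max(t, end)
    if t < horizon:
        safe.append((t, horizon))
    return safe


# Agent 1 holds A during [0, 3) and then B during [3, 5);
# agent 2 holds C during [0, 4) and then B during [4, 6).
agent_1 = [("A", 0, 3), ("B", 3, 5)]
agent_2 = [("C", 0, 4), ("B", 4, 6)]
conflicts = find_vertex_conflicts(agent_1, agent_2)
```

With integer edge costs, every interval endpoint is an integer, so the set of distinct safe intervals per vertex stays finite for a bounded horizon. The abstract names this finite state space as the advantage over real-valued costs.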
Higher-order spacing statistics in the $m$ superposed spectra of circular random matrices of the same class are studied numerically. We conjecture that for given $m$ (or order $k$) and $\beta$, the sequence of modified Dyson index $\beta'(k)$ (or $\beta'(m)$) obtained using the sum of absolute differences between the cumulative distribution functions method (denoted as $D(\beta')$) is unique. Also, for a given $k$, the distribution tends to the corresponding $k$-th order Poisson statistics in the limit $m\rightarrow \infty$. The quantum chaotic kicked top model for various Hilbert space dimensions is studied, and it is found to satisfy our conjecture. This involves the numerical verification of $m=2$ case of COE results. Our result can be used as a tool for the characterization of a system and to determine the symmetry structure of the system without desymmetrization of the spectra. Additionally, the comparative study of the higher-order spacing and ratio distributions in both $m=1$ and $m=2$ cases of COE as well as GOE is performed within and across these ensembles numerically using the $D(\beta')$ method. This study is carried out both by varying the dimension and keeping the number of realizations constant, and vice-versa. The same asymptotic higher-order statistics are observed across COE and GOE in terms of a given spectral fluctuation measure. But, within a given ensemble of COE or GOE, the results of higher-order spacing and ratio distributions agree with each other only up to some lower $k$, and beyond that, they start deviating from each other. Further, the spectral fluctuations of the intermediate map of various dimensions are studied. Various important observations and discussions from the analysis of our extensive numerical computations are presented.
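The quantities named in this abstract can be illustrated with a short sketch. It assumes the commonly used non-overlapping definition of the $k$-th order spacing ratio, $r_i^{(k)} = (E_{i+2k} - E_{i+k}) / (E_{i+k} - E_i)$, and treats superposition as merging the $m$ level sequences into one sorted list. The `cdf_abs_difference` helper is a generic sum of absolute differences between an empirical and a model cumulative distribution function on a grid. The paper's exact construction of $D(\beta')$ and its model distributions are not given here, so these functions and their names are illustrative assumptions.

```python
def superpose(spectra):
    """Merge m level sequences into one sorted sequence."""
    return sorted(level for spectrum in spectra for level in spectrum)


def higher_order_spacing_ratios(levels, k):
    """k-th order spacing ratios (E[i+2k] - E[i+k]) / (E[i+k] - E[i])."""
    e = sorted(levels)
    return [
        (e[i + 2 * k] - e[i + k]) / (e[i + k] - e[i])
        for i in range(len(e) - 2 * k)
        if e[i + k] != e[i]  # skip exact degeneracies to avoid division by zero
    ]


def cdf_abs_difference(sample, model_cdf, grid):
    """Sum over grid points of |empirical CDF - model CDF|."""
    n = len(sample)
    total = 0.0
    for x in grid:
        empirical = sum(1 for s in sample if s <= x) / n
        total += abs(empirical - model_cdf(x))
    return total


# Toy levels, not eigenphases of an actual random matrix.
merged = superpose([[0.0, 2.0, 4.0], [1.0, 3.0, 5.0]])
ratios = higher_order_spacing_ratios(merged, k=2)
```

In a study of this kind, `cdf_abs_difference` would be evaluated for a range of candidate $\beta'$ values, and the minimizing value reported.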