Scholar iON
Academic Synthesis
The selected papers from the "stat.OT" category on arXiv highlight diverse applications of statistical methods in astrophysics, data availability, financial modeling, and quantum computing. Reichart's critique of Guidorzi's work emphasizes the importance of properly accounting for sample variance in modeling GRB variability-luminosity correlations, illustrating a significant debate on methodological rigor in astrophysical data analysis. Grothkopf et al.'s study on ESO data papers underscores the role of open-access dissemination via arXiv in enhancing citation impact, although the underlying causative factors remain speculative. Press and Dannenberg's exploration of Q-variance through a multiplicative Langevin process provides insights into asset volatility modeling, while Chang et al. demonstrate the integration of quantum walks and GPU acceleration for efficient distribution generation, indicating a promising intersection of quantum algorithms and high-performance computing. Collectively, these studies underscore the evolving landscape of statistical methodologies across disciplines, emphasizing both the challenges and advancements in leveraging statistical tools for scientific inquiry.
Guidorzi has now written two papers (astro-ph/0507588 and astro-ph/0508483, both accepted to MNRAS) on the GRB variability-luminosity correlation in which he finds that expanded samples of L vs. V data are not well described by a power law because the scatter of the data around such a model is more than can be accounted for by the data's statistical errors alone (sample variance) -- "in contrast with the original findings by Reichart et al. (2001)" -- but then proceeds to model these data with a power law anyway and finds significantly shallower L vs. V relationships than Reichart et al. (2001) found. However, as Reichart & Nysewander (2005; astro-ph/0508111) pointed out after Guidorzi's first posting but before his second, Reichart et al. (2001) never modeled their L vs. V data with a power law. Instead, they used a power law with a distribution around it to accommodate and measure this sample variance. Ignoring sample variance in a fit that requires it very easily results in incorrect fitted parameter values due to increased sensitivity to outliers, as well as significantly underestimated uncertainties in these fitted parameter values. Fitting to Guidorzi's own data, Reichart & Nysewander (2005) showed that when sample variance is included in the model, L ~ V^3.4(+0.9,-0.6) with a sample variance of sigma_logV = 0.20(+0.04,-0.04), which is in excellent agreement with the original finding of Reichart et al. (2001) -- L ~ V^3.3(+1.1,-0.9) with a sample variance of sigma_logV = 0.18(+0.07,-0.05) -- when the sample was approximately one-third its current size.
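The effect of ignoring sample variance can be illustrated with a small sketch. It is not the fitting procedure of Reichart et al. (2001), which handles errors in both dimensions and places the scatter in log V. As a simplifying assumption, it puts measurement errors and an extra Gaussian scatter term on log L only, and it uses synthetic data rather than Guidorzi's sample. All function names and numbers are hypothetical.

```python
import math
import random


def weighted_line_fit(x, y, variances):
    """Weighted least squares for y = a + b*x; returns (a, b, slope_std_err)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope, math.sqrt(1.0 / sxx)


def neg2_log_likelihood(x, y, err, scatter):
    """-2 ln(likelihood), up to a constant, profiled over the line parameters."""
    variances = [e * e + scatter * scatter for e in err]
    a, b, _ = weighted_line_fit(x, y, variances)
    return sum((yi - a - b * xi) ** 2 / v + math.log(v)
               for xi, yi, v in zip(x, y, variances))


def fit_with_scatter(x, y, err, scatter_grid):
    """Grid search over the scatter; the line is refit at each grid value."""
    best = min(scatter_grid, key=lambda s: neg2_log_likelihood(x, y, err, s))
    variances = [e * e + best * best for e in err]
    a, b, slope_err = weighted_line_fit(x, y, variances)
    return a, b, slope_err, best


# Synthetic (hypothetical) sample: true slope 3.3, true extra scatter 0.5 dex.
rng = random.Random(1)
n = 40
log_v = [rng.uniform(-2.0, -0.5) for _ in range(n)]
err = [rng.uniform(0.05, 0.3) for _ in range(n)]
log_l = [3.3 * xv + 4.0 + rng.gauss(0.0, 0.5) + rng.gauss(0.0, e)
         for xv, e in zip(log_v, err)]

# Fit that ignores sample variance (statistical errors only).
_, naive_slope, naive_err = weighted_line_fit(log_v, log_l, [e * e for e in err])
# Fit that includes a scatter term in the model.
_, slope, slope_err, scatter = fit_with_scatter(
    log_v, log_l, err, [i * 0.01 for i in range(101)])

print("no scatter term:   slope = %.2f +/- %.2f" % (naive_slope, naive_err))
print("with scatter term: slope = %.2f +/- %.2f, scatter = %.2f"
      % (slope, slope_err, scatter))
```

In this toy setting the fit without a scatter term reports a smaller slope uncertainty than the fit that includes one, because its weights come from the statistical errors alone.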
Using the ESO Telescope Bibliography database telbib, we have investigated the percentage of ESO data papers that were submitted to the arXiv/astro-ph e-print server and that are therefore free to read. Our study revealed an availability of up to 96% of telbib papers on arXiv over the years 2010 to 2017. We also compared the citation counts of arXiv vs. non-arXiv papers and found that on average, papers submitted to arXiv are cited 2.8 times more often than those not on arXiv. While simulations suggest that these findings are statistically significant, we cannot yet draw firm conclusions as to the main cause of these differences.
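The abstract does not say which simulations were used to assess significance. One common choice for comparing two groups of citation counts is a permutation test, sketched below as an assumption rather than as the authors' procedure. The citation counts are invented for illustration and the function name is hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """One-sided p-value for mean(group_a) - mean(group_b) > 0.

    Group labels are shuffled n_perm times. The p-value is the fraction of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_a]) / n_a
                - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented citation counts, for illustration only.
on_arxiv = [28, 30, 25, 40, 33, 29, 31, 27]
not_on_arxiv = [10, 12, 9, 11, 14, 8, 13, 10]
print("p-value:", permutation_p_value(on_arxiv, not_on_arxiv))
```

A small p-value would indicate only that the difference is unlikely under random labeling. It says nothing about the cause, which is the caveat the abstract itself raises.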
Q-variance (so-called) posits a statistical relationship $\mathbf{E}(\sigma^2 | z) = \sigma_0^2 + \tfrac{1}{2}z^2$ between an asset's volatility $\sigma^2$, as observed in a time interval $T$, and its (suitably scaled) return $z$ in the same interval. We here show that this relationship is {\em exactly equivalent} to positing an Inverse Gamma probability distribution for $\sigma^2$ itself. We then show that such a distribution is exactly generated by a multiplicative Langevin process with an arbitrary, settable coherence time $\tau_c$, so that very nearly the same Q-variance relationship will hold for all $T \ll \tau_c$.
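The link between an Inverse Gamma law and the quadratic relationship can be checked numerically under one added assumption that the abstract does not state: that $z$ given $\sigma^2$ is Gaussian with zero mean and variance $\sigma^2$. By the standard conjugate update, an Inverse Gamma prior with shape $\alpha$ and scale $\beta$ then gives an Inverse Gamma posterior with shape $\alpha + \tfrac{1}{2}$ and scale $\beta + \tfrac{1}{2}z^2$, whose mean is $(\beta + \tfrac{1}{2}z^2)/(\alpha - \tfrac{1}{2})$. The illustrative choice $\alpha = 3/2$, $\beta = \sigma_0^2$ reproduces the stated form. These values are a sketch, not parameters taken from the paper.

```python
import math


def conditional_mean_variance(z, alpha=1.5, beta=1.0, u_max=60.0, steps=20000):
    """E[sigma^2 | z] by trapezoid integration over the precision u = 1/sigma^2.

    Assumes z | sigma^2 ~ Normal(0, sigma^2) and sigma^2 ~ InvGamma(alpha, beta),
    i.e. u ~ Gamma(shape=alpha, rate=beta). Written for alpha >= 1.5 so that
    both integrands stay finite at u = 0.
    """
    rate = beta + 0.5 * z * z
    h = u_max / steps
    numerator = 0.0    # integral of (1/u) * posterior(u)
    denominator = 0.0  # integral of posterior(u), unnormalized
    for i in range(steps + 1):
        u = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        decay = math.exp(-rate * u)
        numerator += weight * u ** (alpha - 1.5) * decay
        denominator += weight * u ** (alpha - 0.5) * decay
    return numerator / denominator


sigma0_sq = 1.0  # hypothetical baseline variance
for z in (0.0, 1.0, 2.0, 3.0):
    numeric = conditional_mean_variance(z, beta=sigma0_sq)
    closed_form = sigma0_sq + 0.5 * z * z
    print("z = %.1f  numeric = %.4f  sigma0^2 + z^2/2 = %.4f"
          % (z, numeric, closed_form))
```

The integration runs over the precision rather than over $\sigma^2$ because the posterior then decays exponentially and a finite upper limit suffices.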
We present a novel Adaptive Distribution Generator that leverages a quantum walks-based approach to generate target probability distributions with high precision and efficiency. Our method integrates variational quantum circuits with discrete-time quantum walks, specifically split-step quantum walks and their entangled extensions, to dynamically tune coin parameters and drive the evolution of quantum states towards desired distributions. This enables accurate one-dimensional probability modeling for applications such as financial simulation, as well as structured two-dimensional pattern generation, exemplified by digit representations (0 to 9). Implemented within the CUDA-Q framework, our approach exploits GPU acceleration to significantly reduce computational overhead and improve scalability relative to conventional methods. Extensive benchmarks demonstrate that our Quantum Walks-Based Adaptive Distribution Generator achieves high simulation fidelity and bridges the gap between theoretical quantum algorithms and practical high-performance computation.
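To make the idea of tuning coin parameters concrete, the following is a minimal classical simulation of a split-step quantum walk on a ring, with a brute-force grid search standing in for the variational optimization. It does not use CUDA-Q and is not the authors' implementation. The coin convention (a plain rotation by the angle), the initial state, the loss, and all function names are assumptions made for this sketch.

```python
import math


def rotate(up, down, theta):
    """Apply the same real rotation coin to the two-component state at every site."""
    c, s = math.cos(theta), math.sin(theta)
    new_up = [c * u - s * d for u, d in zip(up, down)]
    new_down = [s * u + c * d for u, d in zip(up, down)]
    return new_up, new_down


def split_step_walk(n_sites, n_steps, theta1, theta2):
    """Position probabilities after n_steps of a split-step walk on a ring."""
    up = [0j] * n_sites
    down = [0j] * n_sites
    mid = n_sites // 2
    up[mid] = 1 / math.sqrt(2)      # start localized, coin state (|up> + i|down>)/sqrt(2)
    down[mid] = 1j / math.sqrt(2)
    for _ in range(n_steps):
        up, down = rotate(up, down, theta1)
        up = up[-1:] + up[:-1]      # up component moves one site to the right
        up, down = rotate(up, down, theta2)
        down = down[1:] + down[:1]  # down component moves one site to the left
    return [abs(u) ** 2 + abs(d) ** 2 for u, d in zip(up, down)]


def squared_error(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def fit_coin_angles(target, n_steps, grid):
    """Grid search for the coin angles whose walk best matches the target."""
    candidates = [(t1, t2) for t1 in grid for t2 in grid]
    best = min(candidates, key=lambda a: squared_error(
        split_step_walk(len(target), n_steps, a[0], a[1]), target))
    loss = squared_error(split_step_walk(len(target), n_steps, *best), target)
    return best, loss


# Target generated by the walk itself, so a perfect match exists on the grid.
target = split_step_walk(21, 5, math.pi / 4, math.pi / 8)
angles, loss = fit_coin_angles(target, 5, [k * math.pi / 8 for k in range(5)])
print("best angles:", angles, "loss:", loss)
```

A gradient-based optimizer over many per-step angles would replace the grid search in any realistic setting. The grid is used here only to keep the example short and deterministic.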
This paper addresses the challenges of pricing exotic options and structured products, which traditional models often fail to handle due to their inability to capture real-world market phenomena like fat-tailed distributions and volatility clustering. We introduce a Diffusion-Conditional Probability Model (DDPM) to generate more realistic price paths. Our method incorporates a composite loss function with financial-specific features, and we propose a P-Q dynamic game framework for evaluating the model's economic value through adversarial backtesting. Static validation shows our P-model effectively matches market mean and volatility. In dynamic games, it demonstrates significantly higher profitability than a traditional Monte Carlo-based model for European and Asian options. However, the model shows limitations in pricing products highly sensitive to extreme events, such as snowballs and accumulators, because it tends to underestimate tail risks. The study concludes that diffusion models hold significant potential for enhancing pricing accuracy, though further research is needed to improve their ability to model extreme market risks.
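For context, a traditional Monte Carlo baseline of the kind the abstract compares against might look like the sketch below. The abstract does not specify the baseline's dynamics, so geometric Brownian motion is assumed here. The parameter values and function names are hypothetical.

```python
import math
import random


def simulate_gbm_path(s0, rate, sigma, maturity, n_steps, rng):
    """One risk-neutral geometric Brownian motion path, including the start."""
    dt = maturity / n_steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [s0]
    for _ in range(n_steps):
        path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
    return path


def price_calls(s0, strike, rate, sigma, maturity,
                n_steps=50, n_paths=20000, seed=0):
    """Monte Carlo prices of a European call and an arithmetic-average Asian call."""
    rng = random.Random(seed)
    european = 0.0
    asian = 0.0
    for _ in range(n_paths):
        path = simulate_gbm_path(s0, rate, sigma, maturity, n_steps, rng)
        european += max(path[-1] - strike, 0.0)
        average = sum(path[1:]) / n_steps  # average over the monitoring dates
        asian += max(average - strike, 0.0)
    discount = math.exp(-rate * maturity)
    return discount * european / n_paths, discount * asian / n_paths


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form European call price, used only as a sanity check."""
    d1 = ((math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity)
          / (sigma * math.sqrt(maturity)))
    d2 = d1 - sigma * math.sqrt(maturity)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


european, asian = price_calls(100.0, 100.0, 0.05, 0.2, 1.0)
reference = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
print("European (MC): %.3f  Black-Scholes: %.3f  Asian (MC): %.3f"
      % (european, reference, asian))
```

Paths generated this way have Gaussian log-returns with constant volatility, so they show neither fat tails nor volatility clustering, which is the shortcoming the abstract attributes to traditional models.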
The wide spread of coronavirus has led to a worldwide pandemic with a high mortality rate. Currently, the knowledge accumulated from different studies about this virus is very limited. Leveraging a wide range of biological knowledge, such as gene ontology and protein-protein interaction (PPI) networks from other closely related species, presents a vital approach to infer the molecular impact of a new species. In this paper, we propose the transferred multi-relational embedding model Bio-JOIE to capture the knowledge of gene ontology and PPI networks, which demonstrates superb capability in modeling the SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model components. The knowledge model encodes the relational facts from the protein and GO domains into separate embedding spaces, using a hierarchy-aware encoding technique for the GO terms. On top of that, the transfer model learns a non-linear transformation to transfer the knowledge of PPIs and gene ontology annotations across their embedding spaces. By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species. Furthermore, we demonstrate the potential of leveraging the learned representations for clustering proteins with enzymatic function into enzyme commission families. Finally, we show that Bio-JOIE can accurately identify PPIs between the SARS-CoV-2 proteins and human proteins, providing valuable insights for advancing research on this new disease.
The meaningful comparison of ion mobility (IM) results and of collision cross section (CCS) values on different platforms is a prerequisite for using CCS for identification or structural assignment. The amount of internal energy imparted to the ions prior to the ion mobility cell is a source of experimental variation. Here we investigated the effects of virtually all tuning parameters of the Agilent 6560 IM-Q-TOF on the arrival time distributions of ubiquitin7+, and found conditions in which the native state prevails. We will discuss the effects of solvent evaporation conditions in the source, of the entire pre-IM DC voltage gradient, and of the funnel RF amplitudes, and will also report on ubiquitin7+ conformations in different solvents, including native supercharging conditions. Collision-induced unfolding (CIU) can be conveniently provoked in two distinct regions: behind the source capillary (by changing the fragmentor voltage) and in the trapping funnel (by changing the trap entrance grid delta voltage). The softness of the instrumental conditions was then optimized with the benchmark DNA G-quadruplex [(dG4T4G4)2.(NH4+)3-8H]5-, for which ion activation results in ammonia loss. To reduce the ion internal energy and obtain the intact 3-NH4+ complex, we reduced the post-IM voltage gradient, but this resulted in a lower IM resolving power due to increased diffusion behind the drift tube. The article thus describes the various trade-offs between ion activation, ion transmission, and ion mobility performance for native MS of very fragile structures.
Delays in biological systems may be used to model events for which the underlying dynamics cannot be precisely observed, or to provide an abstraction of some behavior of the system, resulting in more compact models. In this paper we enrich the stochastic process algebra Bio-PEPA with the possibility of assigning delays to actions, yielding a new non-Markovian process algebra: Bio-PEPAd. This is a conservative extension, meaning that the original syntax of Bio-PEPA is retained and the delay specification, which can now be associated with actions, may be added to existing Bio-PEPA models. The semantics of the firing of actions with delays follows the delay-as-duration approach, presented earlier in papers on the stochastic simulation of biological systems with delays. The semantics of the algebra is given in the Starting-Terminating style, meaning that the start and the completion of an action are observed as two separate events, as required by delays. Furthermore, we outline how to perform stochastic simulation of Bio-PEPAd systems and how to automatically translate a Bio-PEPAd system into a set of Delay Differential Equations, the deterministic framework for modeling biological systems with delays. We end the paper with two example models of biological systems with delays to illustrate the approach.
The purpose of this paper is to study the shapes and stabilities of bio-membranes within the framework of exterior differential forms. After a brief review of the current status in theoretical and experimental studies on the shapes of bio-membranes, a geometric scheme is proposed to discuss the shape equation of closed lipid bilayers, the shape equation and boundary conditions of open lipid bilayers and two-component membranes, the shape equation and in-plane strain equations of cell membranes with cross-linking structures, and the stabilities of closed lipid bilayers and cell membranes. The key point of this scheme is to deal with the variational problems on the surfaces embedded in three-dimensional Euclidean space by using exterior differential forms.
This work aims at showing the relevance and the application possibilities of the Fibonacci sequence, and also of its q-deformed or quantum extension, in the study of the genetic code(s). First, after the presentation of a new formula, an indexed double Fibonacci sequence, comprising the first six Fibonacci numbers, is shown to describe the 20 amino acid multiplets and their degeneracy as well as a characteristic pattern for the 61 meaningful codons. Next, the twenty amino acids, classified according to their increasing atom-number (carbon, nitrogen, oxygen and sulfur), exhibit several Fibonacci sequence patterns. Several mathematical relations are given, describing various atom-number patterns. Finally, a simple phenomenological q-Fibonacci model, with q a real deformation parameter, is used to describe, in a unified way, not only the standard genetic code, when q=1, but also all known slight variations of the latter, when q~1, as well as the case of the 21st amino acid (Selenocysteine) and the 22nd one (Pyrrolysine), also when q~1. As a by-product of this elementary model, we also show that, in the limit q=0, the number of amino acids reaches the value 6, in good agreement with old and still persistent claims stating that life, in its early development, could have used only a small number of amino acids.