Scholar iON
Academic Synthesis
The selected body of research applies statistical and mathematical methods to complex biological and computational systems. Margolin et al. (2010) focus on the inference of genetic networks, emphasizing the need to identify higher-order interactions through multivariate dependence and revealing relationships in cellular processes that second-order statistics alone cannot capture. Minichini and Sciarrino (2005) employ mathematical models based on the crystal basis to describe nucleotide mutation, offering a novel way to describe nucleotide sequence distributions, which is essential for understanding genetic variability. Schreck and Yuan (2011) use statistical mechanics to model protein aggregation, a crucial phenomenon in neurodegenerative diseases and biomaterial development, and obtain analytical results that agree well with experimental data. Lastly, Bouneffouf (2013) examines the evolution of user content in recommender systems, highlighting ongoing challenges and the need for dynamic adaptation to user preferences. Collectively, these works show how advanced statistical methods uncover intricate relationships and dynamics within biological, genetic, and computational frameworks.
A critical task in systems biology is the identification of genes that interact to control cellular processes by transcriptional activation of a set of target genes. Many methods have been developed to use statistical correlations in high-throughput datasets to infer such interactions. However, cellular pathways are highly cooperative, often requiring the joint effect of many molecules, and few methods have been proposed to explicitly identify such higher-order interactions, partially due to the fact that the notion of multivariate statistical dependency itself remains imprecisely defined. We define the concept of dependence among multiple variables using maximum entropy techniques and introduce computational tests for their identification. Synthetic network results reveal that this procedure uncovers dependencies even in undersampled regimes, when the joint probability distribution cannot be reliably estimated. Analysis of microarray data from human B cells reveals that third-order statistics, but not second-order ones, uncover relationships between genes that interact in a pathway to cooperatively regulate a common set of targets.
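The gap between second-order and third-order statistics described above can be illustrated with a toy example. The sketch below is not the maximum-entropy test of the abstract; it is a simpler plug-in mutual information estimate on a hypothetical XOR relationship, where each regulator alone carries no information about the target but the pair determines it. The function name `mutual_information` and the XOR data are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Plug-in mutual information (in bits) from a list of (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((marg_a[a] / n) * (marg_b[b] / n)))
    return mi


# Hypothetical toy data: target z = x XOR y, with all four inputs equally likely.
samples = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]

# Pairwise (second-order) dependence between each regulator and the target.
mi_xz = mutual_information([(x, z) for x, y, z in samples])
mi_yz = mutual_information([(y, z) for x, y, z in samples])

# Joint (third-order) dependence between the regulator pair and the target.
mi_xy_z = mutual_information([((x, y), z) for x, y, z in samples])

print(mi_xz, mi_yz, mi_xy_z)
```

Both pairwise values vanish while the joint value is one bit, which is the kind of cooperative relationship that pairwise correlation methods miss.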
A nucleotide sequence is identified, in the two (four) letter alphabet, by the labels of a vector state of an irreducible representation of U_q(sl(2)) (U_q(sl(2) + sl(2))), in the limit q -> 0. A master equation for the distribution function is written, where the intensity of the one-spin flip is assumed to depend on the variation of the labels of the state. In the two-letter approximation, the numerically computed equilibrium distribution for short sequences is well fitted by a Yule distribution, which is the observed distribution of the ranked frequencies of short oligonucleotides in DNA. The four-letter alphabet description, applied to the codons, reproduces the form of the fitted rank-ordered usage frequency distribution.
We develop a theory of aggregation using statistical mechanical methods. An example of a complicated aggregation system with several levels of structure is peptide/protein self-assembly. The problem of protein aggregation is important for the understanding and treatment of neurodegenerative diseases and also for the development of bio-macromolecules as new materials. We write the effective Hamiltonian in terms of interaction energies between protein monomers, between protein and solvent, and between protein filaments. The grand partition function can be expressed in terms of a Zimm-Bragg-like transfer matrix, which is calculated exactly, so that all thermodynamic properties can be obtained. We start with two-state and three-state descriptions of protein monomers using Potts models that can be generalized to include q states, for which the model remains exactly solvable. We focus on n × N lattice systems, corresponding to the ordered structures observed in some real fibrils. We have obtained results on nucleation processes and phase diagrams, in which a protein property such as the sheet content of aggregates is expressed as a function of the number of proteins on the lattice and of the inter-protein or interfacial interaction energies. We have applied our methods to Aβ(1-40) and Curli fibrils and obtained results in good agreement with experiments.
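To show how a transfer matrix yields a partition function exactly, here is a minimal sketch of the classical two-state Zimm-Bragg chain, not the n × N lattice model of the abstract. It assumes statistical weight 1 for a coil residue, `s` for a helical residue that follows another helical residue, and `sigma * s` for a helical residue that starts a new stretch. The function names are hypothetical, and the brute-force sum is included only to check the matrix result.

```python
from itertools import product


def zimm_bragg_partition(n, s, sigma):
    """Partition function of an n-residue chain via a 2x2 transfer matrix."""
    # State order: 0 = coil, 1 = helix. Row = previous residue, column = current.
    m = [[1.0, sigma * s],
         [1.0, s]]
    # Assumption: a virtual coil residue precedes the chain.
    v = [1.0, 0.0]
    for _ in range(n):
        v = [v[0] * m[0][0] + v[1] * m[1][0],
             v[0] * m[0][1] + v[1] * m[1][1]]
    return v[0] + v[1]


def brute_force_partition(n, s, sigma):
    """Same partition function by summing over all 2**n configurations."""
    total = 0.0
    for conf in product([0, 1], repeat=n):
        weight = 1.0
        prev = 0  # virtual coil residue before the chain
        for state in conf:
            if state == 1:
                weight *= s if prev == 1 else sigma * s
            prev = state
        total += weight
    return total


print(zimm_bragg_partition(8, 1.5, 0.01), brute_force_partition(8, 1.5, 0.01))
```

The matrix product costs a number of operations linear in the chain length, whereas the explicit sum grows exponentially. That is why a transfer-matrix formulation keeps such models exactly solvable in practice.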
The evolution of the user's content remains a problem for accurate recommendation. This is why current research aims to design Recommender Systems (RS) that can continually adapt the information they provide to match the user's interests. This paper explains this problem by outlining the proposals that have been made in research, together with their advantages and disadvantages.
We formulate option market making as a constrained, risk-sensitive control problem that unifies execution, hedging, and arbitrage-free implied-volatility surfaces inside a single learning loop. A fully differentiable eSSVI layer enforces static no-arbitrage conditions (butterfly and calendar) while the policy controls half-spreads, hedge intensity, and structured surface deformations (state-dependent rho-shift and psi-scale). Executions are intensity-driven and respond monotonically to spreads and relative mispricing; tail risk is shaped with a differentiable CVaR objective via the Rockafellar-Uryasev program. We provide theory for (i) grid-consistency and rates for butterfly/calendar surrogates, (ii) a primal-dual grounding of a learnable dual action acting as a state-dependent Lagrange multiplier, (iii) differentiable CVaR estimators with mixed pathwise and likelihood-ratio gradients and epi-convergence to the nonsmooth objective, (iv) an eSSVI wing-growth bound aligned with Lee's moment constraints, and (v) policy-gradient validity under smooth surrogates. In simulation (Heston fallback; ABIDES-ready), the agent attains positive adjusted P&L on most intraday segments while keeping calendar violations at numerical zero and butterfly violations at the numerical floor; ex-post tails remain realistic and can be tuned through the CVaR weight. The five control heads admit clear economic semantics and analytic sensitivities, yielding a white-box learner that unifies pricing consistency and execution control in a reproducible pipeline.
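The Rockafellar-Uryasev program mentioned above writes CVaR at level alpha as the minimum over t of t + E[(L - t)^+] / (1 - alpha). The following is a minimal empirical sketch of that program on a fixed loss sample, not the differentiable estimator of the abstract. The function names and the toy losses are assumptions. Because the empirical objective is convex and piecewise linear in t, the sketch searches only over the sample points.

```python
def ru_objective(t, losses, alpha):
    """Rockafellar-Uryasev objective t + mean((L - t)^+) / (1 - alpha)."""
    excess = sum(max(loss - t, 0.0) for loss in losses) / len(losses)
    return t + excess / (1.0 - alpha)


def cvar_ru(losses, alpha):
    """Empirical CVaR as the minimum of the RU objective over t.

    The objective is convex and piecewise linear with kinks at the sample
    values, so its minimum is attained at one of the samples.
    """
    return min(ru_objective(t, losses, alpha) for t in losses)


# Hypothetical loss sample: 1, 2, ..., 10.
toy_losses = [float(k) for k in range(1, 11)]
print(cvar_ru(toy_losses, 0.8))
```

For this sample and alpha = 0.8 the result agrees, up to floating-point rounding, with the average of the two largest losses, which is the usual tail-average reading of CVaR.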
This study investigated the dynamic connectivity patterns between EEG and fMRI modalities, contributing to our understanding of brain network interactions. By employing a comprehensive approach that integrated static and dynamic analyses of EEG-fMRI data, we were able to uncover distinct connectivity states and characterize their temporal fluctuations. The results revealed modular organization within the intrinsic connectivity networks (ICNs) of the brain, highlighting the significant roles of sensory systems and the default mode network. The use of a sliding window technique allowed us to assess how functional connectivity varies over time, further elucidating the transient nature of brain connectivity. Additionally, our findings align with previous literature, reinforcing the notion that cognitive states can be effectively identified through short-duration data, specifically within the 30-60 second timeframe. The established relationships between connectivity strength and cognitive processes, particularly during different visual states, underscore the relevance of our approach for future research into brain dynamics. Overall, this study not only enhances our understanding of the interplay between EEG and fMRI signals but also paves the way for further exploration into the neural correlates of cognitive functions and their implications in clinical settings. Future research should focus on refining these methodologies and exploring their applications in various cognitive and clinical contexts.
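The sliding window technique referred to above can be sketched as a Pearson correlation recomputed over successive segments of two time series. This is a generic illustration rather than the study's pipeline. The function names, the `window` and `step` parameters, and the toy signals are assumptions, and the sketch assumes every window has nonzero variance in both signals.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_corr(x, y, window, step=1):
    """Correlation of x and y inside each window of the given length."""
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(0, len(x) - window + 1, step)
    ]


# Hypothetical signals: coupled in the first half, anti-coupled in the second.
sig_a = [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]
sig_b = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
windowed = sliding_window_corr(sig_a, sig_b, window=4, step=4)
print(windowed)
```

A single correlation over the whole recording would average these two regimes together, whereas the windowed values keep them apart. This is the motivation for dynamic connectivity analysis.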
We present a new explanation for a quantum eraser. The mathematical description of the traditional explanation requires quantum-superposition states. However, the phenomenon can be explained without quantum-superposition states by introducing unobservable potentials, which can be identified as an indefinite metric vector. In addition, a delayed choice experiment can also be explained by the interference between the photons and the unobservable potentials, which resembles an unreal long-range correlation beyond causality.
The analysis of the USA 2001 income distribution shows that it can be described by at least two main components, which obey the generalized Tsallis statistics with different values of the q parameter. Theoretical calculations using the gas kinetics model with a distributed saving propensity factor and two ensembles reproduce the empirical data and provide further information on the structure of the distribution, which shows a clear stratification. This stratification is amenable to different interpretations, which are analyzed. The distribution function is invariant with respect to the average individual income, which implies that the inequity of the distribution cannot be modified by increasing the total income.
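Tsallis statistics is commonly expressed through the q-exponential, e_q(x) = [1 + (1 - q) x]^(1 / (1 - q)), which reduces to the ordinary exponential as q approaches 1. The sketch below implements that standard definition only; it does not reproduce the fitted income components or the kinetic model of the abstract. The function name `q_exponential` and the handling of arguments outside the domain are assumptions.

```python
import math


def q_exponential(x, q):
    """Tsallis q-exponential e_q(x) = [1 + (1 - q) x]^(1 / (1 - q))."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)  # ordinary exponential in the limit q -> 1
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # standard cutoff for q < 1
        raise ValueError("e_q(x) is undefined for q > 1 when 1 + (1 - q) x <= 0")
    return base ** (1.0 / (1.0 - q))


# For q > 1 the decay of e_q(-x) is a power law with exponent 1 / (q - 1).
for x in (1.0, 10.0, 100.0):
    print(x, q_exponential(-x, 1.5), math.exp(-x))
```

For q > 1 the tail of e_q(-x) falls off as x^(-1/(q-1)) instead of exponentially, which is why different q values correspond to differently heavy tails.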
We study the primary DNA structure of four of the most completely sequenced human chromosomes (including chromosome 19, which is the densest in coding), using Non-extensive Statistics. We show that the exponents governing the decay of the coding size distributions vary between $5.2 \le r \le 5.7$ for the short scales and $1.45 \le q \le 1.50$ for the large scales. In contrast, the exponents governing the decay of the non-coding size distributions in these four chromosomes take the values $2.4 \le r \le 3.2$ for the short scales and $1.50 \le q \le 1.72$ for the large scales. This quantitative difference, in particular in the tail exponent $q$, indicates that the non-coding (coding) size distributions have long (short) range correlations. This non-trivial difference in the DNA statistics is attributed to the non-conservative (conservative) evolution dynamics acting on the non-coding (coding) DNA sequences.
We analyze gene expression time-series data of yeast S. cerevisiae measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
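The growth of a giant component in a correlation network can be sketched by linking genes whose correlation exceeds a threshold and tracking the largest connected cluster as the threshold is lowered. The example below is a generic union-find illustration, not the analysis of the abstract. The function name, the use of the absolute correlation, and the 4 × 4 toy matrix are assumptions.

```python
def largest_component_size(corr, threshold):
    """Size of the largest connected component when |corr[i][j]| >= threshold."""
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i][j]) >= threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical correlation matrix for four genes.
toy_corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.6, 0.1],
    [0.2, 0.6, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
]

for thr in (0.95, 0.8, 0.5, 0.25):
    print(thr, largest_component_size(toy_corr, thr))
```

Scanning the threshold downwards and watching where the largest cluster grows abruptly is one simple way to look for a percolation-like transition in such a network.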