Scholar iON
Academic Synthesis
This collection of scholarly papers showcases diverse advancements in statistical methodologies and their applications across various domains. A common theme is the enhancement of predictive accuracy and robustness in complex systems, such as cardiovascular imaging, adversarial language model defense, frequency measurement, and hydrological dynamics. Notable is the application of innovative frameworks like ARIADNE and Sentra-Guard, which leverage machine learning and preference-based learning to improve diagnostic precision and adversarial prompt detection, respectively. The research on the Ω counter and streamflow predictability underscores the significance of novel algorithms, like linear regression and visibility graphs, in refining measurement accuracy and understanding dynamical systems. Collectively, these studies highlight the intersections of statistical techniques with domain-specific challenges, emphasizing the ongoing need for interdisciplinary approaches to address evolving scientific and technological demands.
Conventional pixel-wise loss functions fail to enforce topological constraints in coronary vessel segmentation, producing fragmented vascular trees despite high pixel-level accuracy. We present ARIADNE, a two-stage framework coupling preference-aligned perception with reinforcement learning (RL)-based diagnostic reasoning for topologically coherent stenosis detection. The perception module employs Direct Preference Optimization (DPO) to fine-tune the Sa2VA vision-language foundation model using Betti number constraints as preference signals, aligning the policy toward geometrically complete vessel structures rather than pixel-wise overlap metrics. The reasoning module formulates stenosis localization as a Markov Decision Process with an explicit rejection mechanism that autonomously defers ambiguous anatomical candidates such as bifurcations and vessel crossings, shifting from coverage maximization to reliability optimization. On 1,400 clinical angiograms, ARIADNE achieves a state-of-the-art centerline Dice of 0.838 and reduces false positives by 41% compared to geometric baselines. External validation on the multi-center benchmarks ARCADE and XCAD confirms generalization across acquisition protocols. This represents the first application of DPO for topological alignment in medical imaging, demonstrating that preference-based learning over structural constraints mitigates topological violations while maintaining diagnostic sensitivity in interventional cardiology workflows.
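To make the idea of a topological preference signal concrete, the following minimal sketch counts the zeroth Betti number (the number of connected components) of a binary vessel mask and uses it to order two candidate segmentations. It is an illustration only, not the ARIADNE implementation: the function names, the use of 8-connectivity, the target of a single connected tree, and the restriction to Betti-0 are all assumptions made here.

```python
from collections import deque


def betti0(mask):
    """Count connected components (Betti-0) of a binary mask, 8-connectivity assumed."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New, unvisited foreground pixel: flood-fill its whole component.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return components


def preference_pair(mask_a, mask_b, target=1):
    """Return (chosen, rejected) by closeness of Betti-0 to a hypothetical target.

    Returns None when both masks are equally far from the target, since a tie
    carries no preference information.
    """
    dist_a = abs(betti0(mask_a) - target)
    dist_b = abs(betti0(mask_b) - target)
    if dist_a == dist_b:
        return None
    return (mask_a, mask_b) if dist_a < dist_b else (mask_b, mask_a)


connected = [[1, 1, 1],
             [0, 0, 1]]
fragmented = [[1, 0, 1],
              [0, 0, 0],
              [1, 0, 0]]
chosen, rejected = preference_pair(fragmented, connected)
```

Here the fragmented mask has three components and the connected one has a single component, so the connected mask is the chosen sample. Pairs ordered this way are the kind of (chosen, rejected) input a preference-optimization objective consumes.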
This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks targeting large language models (LLMs). The framework uses a hybrid architecture with FAISS-indexed SBERT embedding representations that capture the semantic meaning of prompts, combined with fine-tuned transformer classifiers, which are machine learning models specialized for distinguishing between benign and adversarial language inputs. It identifies adversarial prompts in both direct and obfuscated attack vectors. A core innovation is the classifier-retriever fusion module, which dynamically computes context-aware risk scores that estimate how likely a prompt is to be adversarial based on its content and context. The framework ensures multilingual resilience with a language-agnostic preprocessing layer. This component automatically translates non-English prompts into English for semantic evaluation, enabling consistent detection across over 100 languages. The system includes a HITL feedback loop, where decisions made by the automated system are reviewed by human experts for continual learning and rapid adaptation under adversarial pressure. Sentra-Guard maintains an evolving dual-labeled knowledge base of benign and malicious prompts, enhancing detection reliability and reducing false positives. Evaluation results show a 99.96% detection rate (AUC = 1.00, F1 = 1.00) and an attack success rate (ASR) of only 0.004%. This outperforms leading baselines such as LlamaGuard-2 (1.3%) and OpenAI Moderation (3.7%). Unlike black-box approaches, Sentra-Guard is transparent, fine-tunable, and compatible with diverse LLM backends. Its modular design supports scalable deployment in both commercial and open-source environments. The system establishes a new state-of-the-art in adversarial LLM defense.
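The abstract does not state how the classifier and retriever outputs are fused, so the sketch below is purely hypothetical: it combines a classifier probability with a similarity-weighted vote over the nearest labeled prompts using a fixed weight `alpha`. The function names, the weighting rule, and the toy two-dimensional embeddings are assumptions; a real deployment would use SBERT embeddings and a FAISS index in place of the brute-force search shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieval_risk(query, knowledge_base, k=3):
    """Similarity-weighted share of malicious prompts among the k nearest entries.

    knowledge_base is a list of (embedding, label) pairs, label 1 = malicious,
    label 0 = benign (a stand-in for the dual-labeled knowledge base).
    """
    scored = sorted(
        ((cosine(query, emb), label) for emb, label in knowledge_base),
        reverse=True,
    )[:k]
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = sum(weights)
    if total == 0:
        return 0.5  # no informative neighbor: stay neutral
    return sum(w * label for w, (_, label) in zip(weights, scored)) / total


def fused_risk(classifier_prob, retrieval_score, alpha=0.5):
    """Hypothetical fusion rule: convex combination of the two risk signals."""
    return alpha * classifier_prob + (1 - alpha) * retrieval_score


kb = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([0.9, 0.1], 1)]
risk = fused_risk(classifier_prob=0.8,
                  retrieval_score=retrieval_risk([1.0, 0.0], kb, k=2))
```

With both nearest neighbors labeled malicious, the retrieval score is 1.0 and the fused risk is the midpoint between it and the classifier probability.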
This article introduces the Ω counter, a frequency counter (or a frequency-to-digital converter, in a different jargon) based on the Linear Regression (LR) algorithm on time stamps. We discuss the noise of the electronics. We derive the statistical properties of the Ω counter on a rigorous mathematical basis, including the weighted measure and the frequency response. We describe an implementation based on a SoC, under test in our laboratory, and we compare the Ω counter to the traditional Π and Λ counters. The LR exhibits optimum rejection of white phase noise, superior to that of the Π and Λ counters. White noise is the major practical problem of wideband digital electronics, both in the instrument internal circuits and in the fast processes which we may want to measure. The Ω counter finds a natural application in the measurement of the Parabolic Variance, described in the companion article arXiv:1506.00687 [physics.data-an].
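As an illustration of frequency estimation by linear regression on time stamps, the sketch below fits a least-squares line to the time stamps of successive signal cycles and takes the reciprocal of the slope. It is a minimal sketch under stated assumptions, not the authors' SoC implementation: it assumes one time stamp per cycle, and the function names are hypothetical. For contrast, `endpoint_frequency` uses only the first and last stamps, which is how an estimate with uniform weighting over the measurement interval behaves.

```python
def lr_frequency(timestamps):
    """Estimate frequency (Hz) from per-cycle time stamps by least squares.

    Fits t_k = t_0 + k * T over all stamps and returns 1 / T, so every stamp
    contributes to the estimate rather than only the two endpoints.
    """
    n = len(timestamps)
    if n < 2:
        raise ValueError("need at least two time stamps")
    k_mean = (n - 1) / 2
    t_mean = sum(timestamps) / n
    num = sum((k - k_mean) * (t - t_mean) for k, t in enumerate(timestamps))
    den = sum((k - k_mean) ** 2 for k in range(n))
    period = num / den  # least-squares slope, seconds per cycle
    return 1.0 / period


def endpoint_frequency(timestamps):
    """Estimate frequency (Hz) from the first and last time stamps only."""
    if len(timestamps) < 2:
        raise ValueError("need at least two time stamps")
    return (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])


# Noise-free 10 MHz signal: one stamp every 100 ns.
stamps = [k * 1e-7 for k in range(1000)]
f_lr = lr_frequency(stamps)
f_end = endpoint_frequency(stamps)
```

On noise-free stamps both estimates agree. The two differ once white noise is added to the stamps, because the regression averages over all of them while the endpoint estimate depends on just two noisy values.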
Streamflow is a dynamical process that integrates water movement in space and time within basin boundaries. The authors characterize the dynamics associated with streamflow time series data from about seventy-one U.S. Geological Survey (USGS) stream-gauge stations in the state of Iowa. They employ a novel approach called the visibility graph (VG), which maps time series into complex networks to investigate the time-evolutionary behavior of a dynamical system. The authors focus on a simple variant of the VG algorithm called the horizontal visibility graph (HVG). The tracking of dynamics, and hence the predictability of streamflow processes, is carried out by extracting two key quantities from the HVG-derived network: the characteristic exponent λ of the degree distribution and the global clustering coefficient GC. The authors use these two measures to identify whether the streamflow process has its origin in random or chaotic processes. They show that the characterization of streamflow dynamics is sensitive to data attributes: through a systematic and comprehensive analysis, they illustrate that it depends on the normalization and the time scale of the streamflow time series. At the daily scale, streamflow at all stations used in the analysis reveals randomness with strong spatial-scale (basin size) dependence. This has implications for the predictability of streamflow and floods. The authors demonstrate that the dynamics transition from a potentially chaotic to a randomly correlated process as the averaging time scale increases. Finally, the temporal trends of λ and GC are statistically significant at about 40% of the stations analyzed. Attributing this trend to factors such as changing climate or land use requires further research.
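The horizontal visibility graph construction can be sketched in a few lines. In the standard definition, two samples are linked when every sample strictly between them is lower than both. The code below builds the edge list and the degree distribution from which an exponent such as λ would be fitted; the function names are hypothetical, and the fitting step and the clustering coefficient are left out to keep the sketch short.

```python
from collections import Counter


def hvg_edges(series):
    """Return the horizontal visibility graph edges (i, j), i < j, of a series."""
    edges = []
    n = len(series)
    for i in range(n - 1):
        blocker = float("-inf")  # highest value strictly between i and j so far
        for j in range(i + 1, n):
            # series[i] > blocker holds here, so only series[j] needs checking.
            if series[j] > blocker:
                edges.append((i, j))
            blocker = max(blocker, series[j])
            if blocker >= series[i]:
                break  # nothing further to the right can be seen from i
    return edges


def degree_distribution(n_nodes, edges):
    """Return {degree: fraction of nodes with that degree}."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree)
    return {k: c / n_nodes for k, c in sorted(counts.items())}


flow = [3, 1, 2, 4]  # toy values standing in for a streamflow record
edges = hvg_edges(flow)
p_k = degree_distribution(len(flow), edges)
```

For the toy series, sample 0 sees samples 1, 2, and 3, while sample 1 cannot see sample 3 because sample 2 is higher than sample 1. On a real record, λ would be obtained from the slope of log P(k) against k.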
The experimental results reported in quant-ph/0612031 are seminal: the authors have realized nondemolition measurements of the photon number. The interpretation of the results, however, seems less than convincing: the treatment of the system state and of the role of measurement is not compatible with the conventional point of view. We propose an adequate treatment, in which the experimental results are a manifestation of a partial Zeno effect (a slowdown of relaxation).
We show that Bell correlations may arise as a special sort of selection artefact, produced by ordinary control of the initial state of the experiments concerned. This accounts for nonlocality, without recourse to any direct spacelike causality or influence. The argument improves an earlier proposal in (arXiv:2101.05370v4 [quant-ph], arXiv:2212.06986 [quant-ph]) in two main respects: (i) in demonstrating its application in a real Bell experiment; and (ii) in avoiding the need for a postulate of retrocausality. This version includes an Appendix, discussing the relation of the proposal to the conclusions of Wood and Spekkens (arXiv:1208.4119 [quant-ph]).
As is well known, existing perturbation theory can be applied to calculations of energies, states, and transition probabilities in many quantum systems. However, in our view there are different paths and methods for improving its calculation precision and efficiency. Following an improved scheme of perturbation theory proposed by An Min Wang [quant-ph/0611217], we reconsider the transition probability and perturbed energy for a hydrogen atom in a constant magnetic field. We find that the results obtained with Wang's scheme are indeed more satisfactory in calculation precision and efficiency. Therefore, Wang's scheme can be regarded as a powerful tool for perturbation calculations in quantum systems.
Large language models (LLMs) for Arabic are still dominated by Modern Standard Arabic (MSA), with limited support for Saudi dialects such as Najdi and Hijazi. This underrepresentation hinders their ability to capture authentic dialectal variation. Using a privately curated Saudi Dialect Instruction dataset (Hijazi and Najdi; 5,466 synthetic instruction-response pairs; 50/50 split), we LoRA-tune ALLaM-7B-Instruct-preview, the first foundation model developed in Saudi Arabia, for Saudi dialect generation. We investigate two variants: (i) Dialect-Token training, which prepends an explicit dialect tag to the instruction, and (ii) No-Token training, which omits the tag at formatting time. Evaluation on a held-out test set combines an external dialect classifier with text fidelity metrics (chrF++ and BERTScore) and diversity measures. The Dialect-Token model achieves the best control, raising the Saudi rate from 47.97% to 84.21% and reducing MSA leakage from 32.63% to 6.21%; fidelity also improves (chrF++ +3.53, BERTScore +0.059). Both LoRA variants outperform strong generic instruction models (Falcon-7B-Instruct, Llama-3.1-8B-Instruct, Qwen-2.5-7B-Instruct, AceGPT-v2-8B-Chat, JAIS-13B-Chat) in dialect control and fidelity, while avoiding metadata-tag echoing that these baselines frequently exhibit. We do not release the dataset or any model weights/adapters; instead, we release training/evaluation/inference code and a detailed datasheet (schema and aggregate statistics) to support independent verification.
We present a tool that, from automatically recognised names, tries to infer inter-person relations in order to present associated people on maps. Based on an in-house Named Entity Recognition tool, applied on clusters of an average of 15,000 news articles per day, in 15 different languages, we build a knowledge base that allows extracting statistical co-occurrences of persons and visualising them on a per-person page or in various graphs.
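The co-occurrence statistics behind such a knowledge base can be sketched as follows. This is a minimal illustration under assumptions, not the tool's actual pipeline: it takes the recognised person names per news cluster as given, counts each pair once per cluster, and uses hypothetical function names and placeholder person names.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(clusters):
    """Count how many clusters mention each unordered pair of persons."""
    counts = Counter()
    for persons in clusters:
        # Deduplicate within a cluster so repeated mentions count once.
        for a, b in combinations(sorted(set(persons)), 2):
            counts[(a, b)] += 1
    return counts


def associates(counts, person, top_n=5):
    """Return the persons most often co-mentioned with `person`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == person:
            related[b] += n
        elif b == person:
            related[a] += n
    return related.most_common(top_n)


clusters = [
    ["Person A", "Person B", "Person C"],
    ["Person A", "Person B"],
    ["Person B", "Person C", "Person B"],
]
counts = cooccurrence_counts(clusters)
top_for_a = associates(counts, "Person A", top_n=1)
```

The ranked list returned by `associates` is the kind of data a per-person page or a relation graph could be drawn from.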
We present Tell Me, a mental well-being system that leverages advances in large language models to provide accessible, context-aware support for users and researchers. The system integrates three components: (i) a retrieval-augmented generation (RAG) assistant for personalized, knowledge-grounded dialogue; (ii) a synthetic client-therapist dialogue generator conditioned on client profiles to facilitate research on therapeutic language and data augmentation; and (iii) a Well-being AI crew, implemented with CrewAI, that produces weekly self-care plans and guided meditation audio. The system is designed as a reflective space for emotional processing rather than a substitute for professional therapy. It illustrates how conversational assistants can lower barriers to support, complement existing care, and broaden access to mental health resources. To address the shortage of confidential therapeutic data, we introduce synthetic client-therapist dialogue generation conditioned on client profiles. Finally, the planner demonstrates an innovative agentic workflow for dynamically adaptive, personalized self-care, bridging the limitations of static well-being tools. We describe the architecture, demonstrate its functionalities, and report evaluation of the RAG assistant in curated well-being scenarios using both automatic LLM-based judgments and a human-user study. This work highlights opportunities for interdisciplinary collaboration between NLP researchers and mental health professionals to advance responsible innovation in human-AI interaction for well-being.