Scholar iON
Academic Synthesis
The selected scholarly works collectively highlight significant advancements and applications in machine learning and artificial intelligence (AI). Ablikim et al. (2021) contribute to the field of particle physics by reporting a candidate for a new tetraquark state, observed in data from high-energy physics experiments. Lu and Sen (2020) study the contextual stochastic block model, proving the conjectured sharp threshold for community detection and weak recovery in sparse graphs, which underscores the intersection of statistical mechanics and machine learning. Dunham et al. (2020) and Leiter et al. (2024) focus on AI's role in scientific research, with Dunham et al. developing methodologies to identify AI-relevant publications across large corpora, and Leiter et al. documenting the evolution of AI research and its impact on academic writing. These studies collectively emphasize the growing integration of machine learning across diverse scientific disciplines, highlighting trends, methodological innovations, and the transformative potential of AI in research.
We report a study of the processes of e^{+}e^{-}→K^{+}D_{s}^{-}D^{*0} and K^{+}D_{s}^{*-}D^{0} based on e^{+}e^{-} annihilation samples collected with the BESIII detector operating at BEPCII at five center-of-mass energies ranging from 4.628 to 4.698 GeV with a total integrated luminosity of 3.7 fb^{-1}. An excess of events over the known contributions of the conventional charmed mesons is observed near the D_{s}^{-}D^{*0} and D_{s}^{*-}D^{0} mass thresholds in the K^{+} recoil-mass spectrum for events collected at √s = 4.681 GeV. The structure matches a mass-dependent-width Breit-Wigner line shape, whose pole mass and width are determined as (3982.5_{-2.6}^{+1.8}±2.1) MeV/c^{2} and (12.8_{-4.4}^{+5.3}±3.0) MeV, respectively. The first uncertainties are statistical and the second are systematic. The significance of the resonance hypothesis is estimated to be 5.3 σ over the contributions only from the conventional charmed mesons. This is the first candidate for a charged hidden-charm tetraquark with strangeness, decaying into D_{s}^{-}D^{*0} and D_{s}^{*-}D^{0}. However, the properties of the excess need further exploration with more statistics.
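To give a sense of scale for the quoted 5.3 σ, the sketch below converts a significance in standard deviations into a tail probability. It assumes the one-sided Gaussian convention that is common in particle physics; the abstract does not state which convention was used, and the function name `one_sided_p_value` is purely illustrative.

```python
import math


def one_sided_p_value(z: float) -> float:
    """Upper-tail probability of a standard normal beyond z standard deviations.

    Assumption: one-sided Gaussian convention (not stated in the abstract).
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Significance quoted in the abstract for the resonance hypothesis.
    p = one_sided_p_value(5.3)
    print(f"one-sided p-value for 5.3 sigma: {p:.2e}")
```

Under this convention, 5.3 σ corresponds to a probability below one in ten million that background alone fluctuates this far.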
We study community detection in the contextual stochastic block model arXiv:1807.09596 [cs.SI], arXiv:1607.02675 [stat.ME]. In arXiv:1807.09596 [cs.SI], the second author studied this problem in the setting of sparse graphs with high-dimensional node-covariates. Using the non-rigorous cavity method from statistical physics, they conjectured the sharp limits for community detection in this setting. Further, the information theoretic threshold was verified, assuming that the average degree of the observed graph is large. It is expected that the conjecture holds as soon as the average degree exceeds one, so that the graph has a giant component. We establish this conjecture, and characterize the sharp threshold for detection and weak recovery.
We describe a strategy for identifying the universe of research publications relevant to the application and development of artificial intelligence. The approach leverages the arXiv corpus of scientific preprints, in which authors choose subject tags for their papers from a set defined by editors. We compose a functional definition of AI relevance by learning these subjects from paper metadata, and then inferring the arXiv-subject labels of papers in larger corpora: Clarivate Web of Science, Digital Science Dimensions, and Microsoft Academic Graph. This yields predictive classification $F_1$ scores between .75 and .86 for Natural Language Processing (cs.CL), Computer Vision (cs.CV), and Robotics (cs.RO). For a single model that learns these and four other AI-relevant subjects (cs.AI, cs.LG, stat.ML, and cs.MA), we see precision of .83 and recall of .85. We evaluate the out-of-domain performance of our classifiers against other sources of topic information and predictions from alternative methods. We find that a supervised solution can generalize to identify publications that belong to the high-level fields of study represented on arXiv. This offers a method for identifying AI-relevant publications that updates at the pace of research output, without reliance on subject-matter experts for query development or labeling.
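The abstract reports precision and recall for the combined model but not its $F_1$ score. As a rough illustration only, the sketch below computes the harmonic mean of the two reported figures, assuming they come from the same evaluation; the function name `f1_score` is an illustrative choice, not the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall reported for the single multi-subject model.
    f1 = f1_score(0.83, 0.85)
    print(f"F1 = {f1:.2f}")
```

With these inputs the result is about .84, which sits inside the .75 to .86 range quoted for the single-subject classifiers.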
The NLLG (Natural Language Learning & Generation) arXiv reports assist in navigating the rapidly evolving landscape of NLP and AI research across the cs.CL, cs.CV, cs.AI, and cs.LG categories. This fourth installment captures a transformative period in AI history, from January 1, 2023, following ChatGPT's debut, through September 30, 2024. Our analysis reveals substantial new developments in the field, with 45% of the top 40 most-cited papers being new entries since our last report eight months ago, and offers insights into emerging trends and major breakthroughs, such as novel multimodal architectures, including diffusion and state space models. Natural Language Processing (NLP; cs.CL) remains the dominant main category in our list of top-40 papers, but its dominance is declining in favor of computer vision (cs.CV) and general machine learning (cs.LG). This report also presents novel findings on the integration of generative AI in academic writing, documenting its increasing adoption since 2022 while revealing an intriguing pattern: top-cited papers show notably fewer markers of AI-generated content compared to random samples. Furthermore, we track the evolution of AI-associated language, identifying declining trends in previously common indicators such as "delve".
In the quiet backwaters of cs.CV, cs.LG and stat.ML, a cornucopia of new learning systems is emerging from a primordial soup of mathematics: learning systems with no need for external supervision. To date, little thought has been given to how these self-supervised learners have sprung into being or the principles that govern their continuing diversification. After a period of deliberate study and dispassionate judgement during which each author set their Zoom virtual background to a separate Galapagos island, we now entertain no doubt that each of these learning machines are lineal descendants of some older and generally extinct species. We make five contributions: (1) We gather and catalogue row-major arrays of machine learning specimens, each exhibiting heritable discriminative features; (2) We document a mutation mechanism by which almost imperceptible changes are introduced to the genotype of new systems, but their phenotype (birdsong in the form of tweets and vestigial plumage such as press releases) communicates dramatic changes; (3) We propose a unifying theory of self-supervised machine evolution and compare to other unifying theories on standard unifying theory benchmarks, where we establish a new (and unifying) state of the art; (4) We discuss the importance of digital biodiversity, in light of the endearingly optimistic Paris Agreement.
In this chapter we review the current theoretical state of the art of small black holes at the LHC. We discuss the production mechanism for small non-thermal black holes at the LHC, as well as new signatures due to a possible discrete mass spectrum of these black holes.
We solve the dynamic equation for the kinetic spherical model that is initially in an arbitrary equilibrium state and is then left to evolve in a heat bath at a different temperature. The renormalization-group flows are determined.
We present the statistical approach to the combining of signal significances.
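The abstract does not say which combination rule it uses. As one common possibility, the sketch below applies Stouffer's method, which combines significances by summing them and dividing by the square root of their count. It assumes independent measurements with equal weights; both the choice of method and the name `combine_significances` are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def combine_significances(z_values: Sequence[float]) -> float:
    """Combine independent significances with equal-weight Stouffer's method.

    Assumption: each z is a standard-normal significance from an independent
    measurement; this is not necessarily the rule used in the paper.
    """
    if not z_values:
        raise ValueError("need at least one significance")
    return sum(z_values) / math.sqrt(len(z_values))


if __name__ == "__main__":
    # Two hypothetical independent channels at 3 and 4 standard deviations.
    combined = combine_significances([3.0, 4.0])
    print(f"combined significance: {combined:.2f} sigma")
```

Two channels at 3 and 4 standard deviations combine to roughly 4.95 under these assumptions, more than either alone but less than their plain sum.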
The optimum interval method for finding an upper limit of a one-dimensionally distributed signal in the presence of an unknown background is extended to the case of high statistics. There is also some discussion of how the method can be extended to the multidimensional case.
An introduction to numerical statistics.