UNiON Scholar
4 scholarly results for eess.IV
Scholar iON Academic Synthesis
These works address signal processing and machine learning problems, with a shared focus on computational efficiency and resolution. Botros et al. simplify the RNN-T decoder through weight tying and a reduced label context, shrinking it from 23M to 2M parameters without affecting word-error rate, which matters for on-device applications. Liu and Zhang extend their theory of the computational resolution limit from one dimension to general multi-dimensional spaces and characterize a phase transition in the number detection and support recovery problems. Leblanc et al. propose a compressive sensing scheme for radio-interferometry based on random beamforming, which reduces the data size while retaining recovery guarantees for sparse image reconstruction. Together, the studies show how algorithmic design can reduce model or data size and sharpen resolution estimates.
semanticscholar.org · scholarly article
Tied & Reduced RNN-T Decoder
Rami Botros; Tara N. Sainath; R. David; Emmanuel Guzman; Wei Li; Yanzhang He
2021 · Interspeech · Cited 56 times · Open Access · DOI: 10.21437/Interspeech.2021-212
Previous works on the Recurrent Neural Network-Transducer (RNN-T) models have shown that, under some conditions, it is possible to simplify its prediction network with little or no loss in recognition accuracy (arXiv:2003.07705 [eess.AS], [2], arXiv:2012.06749 [cs.CL]). This is done by limiting the context size of previous labels and/or using a simpler architecture for its layers instead of LSTMs. The benefits of such changes include reduction in model size, faster inference and power savings, which are all useful for on-device applications. In this work, we study ways to make the RNN-T decoder (prediction network + joint network) smaller and faster without degradation in recognition performance. Our prediction network performs a simple weighted averaging of the input embeddings, and shares its embedding matrix weights with the joint network's output layer (a.k.a. weight tying, commonly used in language modeling arXiv:1611.01462 [cs.LG]). This simple design, when used in conjunction with additional Edit-based Minimum Bayes Risk (EMBR) training, reduces the RNN-T Decoder from 23M parameters to just 2M, without affecting word-error rate (WER).
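The two ideas in this abstract, a prediction network that only averages label embeddings and an output layer that reuses the embedding matrix, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the additive `tanh` joint, and the single set of position weights are assumptions made for illustration.

```python
import numpy as np

def tied_decoder_logits(prev_labels, embedding, position_weights, encoder_frame):
    """One decoder step of a hypothetical tied-and-reduced RNN-T decoder.

    prev_labels:      (N,) integer ids of the last N emitted labels
    embedding:        (V, D) matrix shared by the input and output layers
    position_weights: (N,) positive weights, one per context position
    encoder_frame:    (D,) acoustic encoder output for the current frame
    """
    weights = position_weights / position_weights.sum()
    # Prediction network: weighted average of the previous label embeddings.
    context = weights @ embedding[prev_labels]          # shape (D,)
    # Joint network (assumed form): additive combination plus nonlinearity.
    joint = np.tanh(context + encoder_frame)            # shape (D,)
    # Weight tying: the output layer reuses the embedding matrix.
    return joint @ embedding.T                          # shape (V,)

def decoder_param_count(vocab_size, dim, context_size, tied):
    """Parameter count of this sketch, with or without weight tying."""
    embedding_params = vocab_size * dim
    output_params = 0 if tied else vocab_size * dim
    return embedding_params + output_params + context_size

rng = np.random.default_rng(0)
vocab_size, dim, context_size = 16, 8, 2
embedding = rng.normal(size=(vocab_size, dim))
logits = tied_decoder_logits(
    prev_labels=np.array([3, 7]),
    embedding=embedding,
    position_weights=np.array([1.0, 2.0]),
    encoder_frame=rng.normal(size=dim),
)
```

In this sketch, tying removes the separate `vocab_size * dim` output matrix, which is where most of the savings come from when the vocabulary is large.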
arxiv.org · scholarly article
Tied & Reduced RNN-T Decoder
Rami Botros; Tara N. Sainath; Robert David; Emmanuel Guzman; Wei Li; Yanzhang He
2021 arXiv Open Access DOI: 10.21437/Interspeech.2021-212
arxiv.org · scholarly article
Mathematical Theory of Computational Resolution Limit in Multi-dimensions
Ping Liu; Hai Zhang
2021 arXiv Open Access DOI: 10.1088/1361-6420/ac245b
Resolving a linear combination of point sources from their band-limited Fourier data is a fundamental problem in imaging and signal processing. With the incomplete Fourier data and the inevitable noise in the measurement, there is a fundamental limit on the separation distance between point sources that can be resolved. This is the so-called resolution limit problem. Characterization of this resolution limit is still a long-standing puzzle despite the prevalent use of the classic Rayleigh limit. It is well known that the Rayleigh limit is heuristic and that its drawbacks become prominent when dealing with data subjected to delicate processing, as is done in modern computational imaging methods. Therefore, a more precise characterization of the resolution limit becomes increasingly necessary with the development of data processing methods. For this purpose, we developed a theory of "computational resolution limit" for both number detection and support recovery in one dimension in [arXiv:2003.02917 [cs.IT], arXiv:1912.05430 [eess.IV]]. In this paper, we extend the one-dimensional theory to multi-dimensions. More precisely, we define and quantitatively characterize the "computational resolution limit" for the number detection and support recovery problems in a general k-dimensional space. Our results indicate that there exists a phase transition phenomenon with respect to the super-resolution factor and the signal-to-noise ratio in each of the two recovery problems. Our main results are derived using a subspace projection strategy. Finally, to verify the theory, we propose deterministic subspace-projection-based algorithms for the number detection and support recovery problems in dimensions two and three. The numerical results confirm the phase transition phenomenon predicted by the theory.
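To make the number detection problem concrete, the following one-dimensional sketch generates band-limited Fourier samples of point sources and counts the singular values of a Hankel matrix that exceed a threshold. It is a generic illustration, not the paper's multi-dimensional subspace projection algorithm. The measurement model, the function names, and the thresholding rule are all assumptions.

```python
import numpy as np

def fourier_samples(positions, amplitudes, cutoff, num_samples,
                    noise_level=0.0, seed=0):
    """Sample Y(w) = sum_j a_j * exp(i * y_j * w) + W(w) for |w| <= cutoff.

    The noise W is drawn so that |W(w)| <= noise_level at every sample.
    """
    positions = np.asarray(positions, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=complex)
    omegas = np.linspace(-cutoff, cutoff, num_samples)
    clean = np.exp(1j * np.outer(omegas, positions)) @ amplitudes
    rng = np.random.default_rng(seed)
    real = rng.uniform(-1.0, 1.0, num_samples)
    imag = rng.uniform(-1.0, 1.0, num_samples)
    noise = noise_level * (real + 1j * imag) / np.sqrt(2.0)
    return clean + noise

def estimate_source_number(samples, threshold):
    """Count Hankel singular values above a noise-dependent threshold."""
    num_samples = len(samples)
    rows = num_samples // 2 + 1
    cols = num_samples - rows + 1
    # Hankel matrix: entry (i, j) holds sample i + j.
    hankel = np.array([samples[i:i + cols] for i in range(rows)])
    singular_values = np.linalg.svd(hankel, compute_uv=False)
    return int(np.sum(singular_values > threshold))

# Two sources separated by 2.0, above the Rayleigh length pi / cutoff.
noiseless = fourier_samples([-1.0, 1.0], [1.0, 1.0], cutoff=2.0, num_samples=21)
noisy = fourier_samples([-1.0, 1.0], [1.0, 1.0], cutoff=2.0, num_samples=21,
                        noise_level=1e-3)
count_noiseless = estimate_source_number(noiseless, threshold=1e-8)
count_noisy = estimate_source_number(noisy, threshold=0.05)
```

In the noiseless case the Hankel matrix has rank equal to the number of sources. With noise, the sources stay detectable only while their singular values remain above the noise floor, which is the trade-off between separation distance and signal-to-noise ratio that the abstract describes.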
arxiv.org · scholarly article
Compressive radio-interferometric sensing with random beamforming as rank-one signal covariance projections
Olivier Leblanc; Yves Wiaux; Laurent Jacques
2024 arXiv Open Access
Radio-interferometry (RI) observes the sky at unprecedented angular resolutions, enabling the study of several far-away galactic objects such as galaxies and black holes. In RI, an array of antennas probes cosmic signals coming from the observed region of the sky. The covariance matrix of the vector gathering all these antenna measurements offers, by leveraging the Van Cittert-Zernike theorem, an incomplete and noisy Fourier sensing of the image of interest. The number of noisy Fourier measurements -- or visibilities -- scales as $\mathcal O(Q^2B)$ for $Q$ antennas and $B$ short-time integration (STI) intervals. We address the challenges posed by this vast volume of data, which is anticipated to increase significantly with the advent of large antenna arrays, by proposing a compressive sensing technique applied directly at the level of the antenna measurements. First, this paper shows that beamforming -- a common technique of dephasing antenna signals -- usually used to focus on some region of the sky, is equivalent to sensing a rank-one projection (ROP) of the signal covariance matrix. We build upon our recent work arXiv:2306.12698v3 [eess.IV] to propose a compressive sensing scheme relying on random beamforming, trading the $Q^2$-dependence of the data size for a smaller number $P$ of ROPs. We provide image recovery guarantees for sparse image reconstruction. Second, the data size is made independent of $B$ by applying $M$ Bernoulli modulations of the ROP vectors obtained for the STI intervals. The resulting sample complexities, theoretically derived in a simpler case without modulations and numerically obtained in phase transition diagrams, are shown to scale as $\mathcal O(K)$ where $K$ is the image sparsity. This illustrates the potential of the approach.
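The stated equivalence between beamforming and a rank-one projection can be checked numerically in a simplified setting. The sketch below assumes a single beamforming vector `beam` applied to `Q` antenna signals, so that the average beamformed power equals `beam^H C beam` for the sample covariance `C`. The function names and the Gaussian random beams are illustrative assumptions, not the paper's exact sensing scheme.

```python
import numpy as np

def sample_covariance(snapshots):
    """Sample covariance of antenna snapshots with shape (Q, N)."""
    return snapshots @ snapshots.conj().T / snapshots.shape[1]

def beamformed_power(snapshots, beam):
    """Average power of the beamformed signal beam^H x(t) over N snapshots."""
    outputs = beam.conj() @ snapshots                   # shape (N,)
    return float(np.mean(np.abs(outputs) ** 2))

def rank_one_projection(covariance, beam):
    """ROP of the covariance: beam^H C beam = <beam beam^H, C>."""
    return float(np.real(beam.conj() @ covariance @ beam))

def random_rops(covariance, num_beams, seed=0):
    """Compress a (Q, Q) covariance into num_beams random ROPs (assumed Gaussian beams)."""
    rng = np.random.default_rng(seed)
    num_antennas = covariance.shape[0]
    beams = (rng.normal(size=(num_beams, num_antennas))
             + 1j * rng.normal(size=(num_beams, num_antennas)))
    return np.array([rank_one_projection(covariance, b) for b in beams])

rng = np.random.default_rng(1)
num_antennas, num_snapshots = 6, 200
snapshots = (rng.normal(size=(num_antennas, num_snapshots))
             + 1j * rng.normal(size=(num_antennas, num_snapshots)))
beam = rng.normal(size=num_antennas) + 1j * rng.normal(size=num_antennas)

covariance = sample_covariance(snapshots)
power = beamformed_power(snapshots, beam)
projection = rank_one_projection(covariance, beam)
measurements = random_rops(covariance, num_beams=4)
```

Here `power` and `projection` agree up to floating-point error, and `measurements` holds 4 numbers in place of the 36 entries of the covariance matrix, which mirrors the trade of the $Q^2$-dependence for $P$ ROPs.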