Machine Learning for Speech Processing

Direction of Arrival (DOA) Estimation


Under-determined High Resolution DOA Estimation: A 2pth-Order Source-Signal/Noise Subspace Constrained Optimization


Jinho Choi and Chang D. Yoo


For estimating the direction of arrival (DOA) of non-stationary source signals, including non-Gaussian signals such as speech and audio, the solutions of two constrained optimization problems (COPs) involving the local 2pth-order cumulants are investigated. In the two COPs, only the spatial diversity provided by a uniform linear array (ULA) is considered. When p=1, each COP is equivalent to the COP considered in deriving the Khatri-Rao (KR) subspace-based algorithms, with one additional constraint that limits the interference from other source signals. The DOA estimation algorithms derived by solving one of the COPs can theoretically be shown to identify up to 2p(M-1) sources, where M is the number of sensors. To reduce the DOA estimation error that can result from using a small number of snapshots, and to improve robustness to the varying duration of stationarity, the average of several 2pth-order cumulant matrices estimated using different frame lengths is considered in the COPs. In this paper, we focus on the second- and fourth-order (FO) cumulants (p=1 and p=2). The experimental results demonstrate that the derived algorithms with p=1 outperform the KR subspace-based algorithms and conventional FO cumulant-based algorithms such as 4-MUSIC for quasi-stationary, non-Gaussian synthetic and real speech/audio data under various adverse environments. The derived algorithms with p=2, which are especially useful when the number of sensors is small, are also evaluated for identifiability and effectiveness.
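The p=1 case described above can be illustrated with a short NumPy sketch. It is not the authors' implementation. It assumes the usual KR-subspace preprocessing (local sample covariances per frame, vectorized, with the mean across frames removed), and it assumes that the averaging over frame lengths is applied to the resulting M²×M² matrices. The function names `max_identifiable_sources`, `local_covariances`, `kr_matrix`, and `averaged_kr_matrix` are hypothetical, and the constrained optimization itself is not shown.

```python
import numpy as np


def max_identifiable_sources(p: int, num_sensors: int) -> int:
    """Bound quoted in the abstract: up to 2p(M-1) sources for M sensors."""
    return 2 * p * (num_sensors - 1)


def local_covariances(x: np.ndarray, frame_len: int) -> np.ndarray:
    """Split snapshots x (M x T) into non-overlapping frames and return
    the per-frame sample covariances, shape (num_frames, M, M)."""
    num_sensors, num_snapshots = x.shape
    num_frames = num_snapshots // frame_len
    frames = x[:, :num_frames * frame_len].reshape(
        num_sensors, num_frames, frame_len)
    # R_k = (1/L) * X_k X_k^H for each frame k
    return np.einsum("mkt,nkt->kmn", frames, frames.conj()) / frame_len


def kr_matrix(x: np.ndarray, frame_len: int) -> np.ndarray:
    """Assumed KR-style statistic for one frame length: vectorize each
    local covariance, remove the mean across frames (which cancels a
    stationary noise term), and form the M^2 x M^2 outer-product average."""
    covs = local_covariances(x, frame_len)
    # Column-major vec() of every R_k, stacked as columns: shape (M^2, K)
    ys = np.stack([r.flatten(order="F") for r in covs], axis=1)
    ys = ys - ys.mean(axis=1, keepdims=True)
    return ys @ ys.conj().T / ys.shape[1]


def averaged_kr_matrix(x: np.ndarray, frame_lens) -> np.ndarray:
    """Average the per-frame-length matrices (an assumption about how
    the 'average over different frame lengths' is formed)."""
    return np.mean([kr_matrix(x, length) for length in frame_lens], axis=0)


# Identifiability bound for a 4-sensor array
print(max_identifiable_sources(1, 4))  # 6
print(max_identifiable_sources(2, 4))  # 12
```

With a 4-sensor ULA, the bound gives 6 sources for p=1 and 12 for p=2, which is why the p=2 algorithms are of interest when only a few sensors are available. A subspace method would then take the eigenvectors of the averaged matrix as its signal and noise subspaces.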

Overview of Proposed Algorithms










High resolution test






Root mean squared angle error (RMSE)


Related Papers

3. Jinho Choi and Chang D. Yoo, “Underdetermined High-Resolution DOA Estimation: A 2pth-Order Signal/Noise Subspace Constrained Optimization”, submitted for publication.

2. Jinho Choi and Chang D. Yoo, “A High Resolution Multiple Source Localization based on Generalized Cumulant Structure (GCS) Matrix”, in Proceedings of Interspeech, Florence, Italy, 2011.

1. Jinho Choi and Chang D. Yoo, “A Maximum a Posteriori Sound Source Localization in Reverberant and Noisy Conditions”, in Proceedings of Interspeech, Makuhari, Japan, 2010.