This paper proposes a novel localization scheme for multiple sound sources that imposes relaxed sparsity constraints (not all time-frequency coefficients are overlapped) on the source signals. First, a “DOA convergence” assumption is proposed: if most of the time-frequency (T-F) bins in a T-F zone are derived from only one source (such bins are defined as single source bins, SSBs), the corresponding direction of arrival (DOA) estimates are relatively concentrated, with a high density. This assumption is validated through statistical analysis using a quantitative measure of convergence. Under the “DOA convergence” assumption, the detection of SSBs is converted to a clustering problem, and K-means clustering and density-based spatial clustering of applications with noise (DBSCAN) algorithms are used to solve it in this paper. The cross distortions in localization (localization errors due to the cocktail party phenomenon) caused by multiple simultaneously occurring sources are significantly weakened by conducting DOA estimation among these SSBs, i.e., multiple source localization is reduced to single source localization among these SSBs. Moreover, the proposed SSB detection is applicable to other localization methods and is not limited to a specific microphone topology. Experimental results demonstrate that the localization accuracy of the proposed method outperforms that of the state-of-the-art localization approaches based on single source zone detection. Although the proposed method is capable of real-time processing, its real-time accuracy is insufficient in the current system. If non-real-time processing is allowed, our method can achieve higher accuracy than the conventional ones.
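To illustrate how the “DOA convergence” assumption turns SSB detection into a clustering problem, the following is a minimal sketch rather than the implementation used in this paper. It assumes that each T-F bin in a zone already has a per-bin azimuth estimate in degrees, runs a small pure-Python DBSCAN over those angles with a circular distance, and treats the bins of the largest dense cluster as SSB candidates. The function names (`circ_dist`, `dbscan_angles`, `detect_ssbs`, `circular_mean`), the parameter values `eps` and `min_pts`, the largest-cluster rule, and the sample data are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def circ_dist(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def dbscan_angles(doas, eps=5.0, min_pts=4):
    """Label each DOA estimate with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(doas)
    # Neighborhood of each point (includes the point itself).
    neighbors = [[j for j in range(n) if circ_dist(doas[i], doas[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbors[j])
    return labels


def detect_ssbs(doas, eps=5.0, min_pts=4):
    """Return indices of bins in the largest dense cluster (assumed SSBs)."""
    clusters = {}
    for i, lab in enumerate(dbscan_angles(doas, eps, min_pts)):
        if lab != -1:
            clusters.setdefault(lab, []).append(i)
    if not clusters:
        return []  # no concentration of DOA estimates in this zone
    return max(clusters.values(), key=len)


def circular_mean(angles):
    """Mean direction of a set of azimuths, in degrees within [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


# Hypothetical per-bin DOA estimates for one T-F zone: six bins cluster
# around 45 degrees, the remaining four are scattered.
zone_doas = [44.0, 45.5, 46.0, 45.0, 44.5, 120.0, 200.0, 310.0, 46.5, 10.0]
ssb_idx = detect_ssbs(zone_doas, eps=3.0, min_pts=4)
print(ssb_idx)  # → [0, 1, 2, 3, 4, 8]
print(circular_mean([zone_doas[i] for i in ssb_idx]))
```

In this sketch the scattered estimates are rejected as noise, and the DOA of the zone is taken from the retained bins only, which mirrors the idea of reducing multiple source localization to single source localization among the SSBs. A zone in which no dense cluster is found yields an empty list and would simply be skipped.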