
Sampled and discretized of short-time Fourier transform and non-negative matrix factorization: the single-channel source separation case

Department of Electrical Engineering and Informatics, Vocational College, Universitas Gadjah Mada. Yacaranda St., Sekip Unit IV, Yogyakarta 55281, Indonesia

Received: 5 Aug 2020; Revised: 27 Oct 2020; Accepted: 27 Nov 2020; Published: 31 Jan 2021; Available online: 7 Dec 2020.
Open Access. Copyright (c) 2021 The Authors. Published by Department of Computer Engineering, Universitas Diponegoro.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Abstract
The short-time Fourier transform (STFT) is a popular time-frequency representation in many source separation problems. In this work, the Discrete Gabor Transform (DGT), a sampled and discretized version of the STFT, is proposed to replace the STFT for single-channel source separation within the Non-negative Matrix Factorization (NMF) framework. The results show that NMF-DGT outperforms NMF-STFT in terms of Signal-to-Interference Ratio (SIR), Signal-to-Artifact Ratio (SAR), and Signal-to-Distortion Ratio (SDR). In the supervised scheme, NMF-DGT achieves an SIR of 18.60 dB versus 16.24 dB for NMF-STFT, an SAR of 13.77 dB versus 13.69 dB, and an SDR of 12.45 dB versus 11.16 dB. In the unsupervised scheme, NMF-DGT achieves an SIR of 0.40 dB versus 0.27 dB for NMF-STFT, an SAR of -10.21 dB versus -10.36 dB, and an SDR of -15.01 dB versus -15.23 dB.
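As a rough illustration of the NMF-STFT baseline named in the abstract, the sketch below factorizes the magnitude spectrogram of a toy two-tone mixture using the classic multiplicative updates of Lee and Seung for the Euclidean cost. It is not the authors' implementation: the window length, hop size, rank, iteration count, toy signal, and the helper names `stft_magnitude` and `nmf` are all assumptions made for this example, and a plain Hann-windowed STFT stands in for the DGT.

```python
import numpy as np

def stft_magnitude(x, win_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed STFT (frames as columns).

    win_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V ~ W @ H with non-negative W and H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost;
    eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy single-channel "mixture" (assumed): a 440 Hz tone followed by a 1 kHz tone.
fs = 8000
t = np.arange(fs) / fs
mixture = np.sin(2 * np.pi * 440 * t) * (t < 0.5) \
    + np.sin(2 * np.pi * 1000 * t) * (t >= 0.5)

V = stft_magnitude(mixture)   # non-negative time-frequency matrix
W, H = nmf(V, rank=2)         # spectral bases W, temporal activations H
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.3f}")
```

In a separation setting, each column of `W` would act as a spectral basis for one source and the matching row of `H` as its activation over time. An NMF-DGT variant would swap `stft_magnitude` for the magnitude of DGT coefficients and leave the factorization step unchanged.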
Keywords: DGT; STFT; NMF; time-frequency representation; single-channel source separation
Funding: Vocational College of Universitas Gadjah Mada under contract 83/UN1.SV/KPT/2020

