Sampled and discretized of short-time Fourier transform and non-negative matrix factorization: the single-channel source separation case

Jans Hendry; Isnan Nur Rifai; Yoga Mileniandi

doi:10.14710/jtsiskom.2020.13858



Department of Electrical Engineering and Informatics, Vocational College, Universitas Gadjah Mada. Yacaranda st., Sekip Unit IV, Yogyakarta 55281, Indonesia

Received: 5 Aug 2020; Revised: 27 Oct 2020; Accepted: 27 Nov 2020; Available online: 7 Dec 2020; Published: 31 Jan 2021.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Abstract

The short-time Fourier transform (STFT) is a popular time-frequency representation in many source separation problems. This work proposes replacing the STFT with the Discrete Gabor Transform (DGT), a sampled and discretized version of the STFT, for single-channel source separation within the non-negative matrix factorization (NMF) framework. The results show that NMF-DGT outperforms NMF-STFT in terms of Signal-to-Interference Ratio (SIR), Signal-to-Artifact Ratio (SAR), and Signal-to-Distortion Ratio (SDR). In the supervised scheme, NMF-DGT achieves a SIR of 18.60 dB versus 16.24 dB for NMF-STFT, a SAR of 13.77 dB versus 13.69 dB, and an SDR of 12.45 dB versus 11.16 dB. In the unsupervised scheme, NMF-DGT achieves a SIR of 0.40 dB versus 0.27 dB for NMF-STFT, a SAR of -10.21 dB versus -10.36 dB, and an SDR of -15.01 dB versus -15.23 dB.
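To make the NMF-on-spectrogram idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain Hann-windowed STFT magnitude as the time-frequency representation and the classic multiplicative updates for the Euclidean cost; the frame length, hop size, rank, iteration count, and the synthetic two-tone mixture are all illustrative choices that do not come from the article. The function names `stft_magnitude` and `nmf` are hypothetical.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (illustrative STFT)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # Rows are frequency bins, columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~ W @ H with non-negative W (bases) and H (activations).

    Uses multiplicative updates for the Euclidean cost; this choice is an
    assumption, since the abstract does not state the cost function.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Synthetic single-channel "mixture": two tones active at different times.
fs = 8000
t = np.arange(fs) / fs
mixture = np.sin(2 * np.pi * 440 * t) * (t < 0.5) \
    + np.sin(2 * np.pi * 1000 * t) * (t >= 0.5)

V = stft_magnitude(mixture)
W, H, errors = nmf(V, rank=2)
print(V.shape, errors[0], errors[-1])
```

In a separation pipeline of this kind, each column of `W` acts as a spectral basis and each row of `H` as its activation over time. Swapping the STFT for a DGT would change only how `V` is computed, which is the substitution the abstract evaluates.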


Keywords: DGT; STFT; NMF; time-frequency representation; single-channel source separation

Funding: Vocational College of Universitas Gadjah Mada under contract 83/UN1.SV/KPT/2020


Article Info

Section: Original Research Articles

Language: EN

In Volume 9, Issue 1, Year 2021 (January 2021)




