Investigation of Feature Selection and Pooling Methods for Environmental Sound Classification
Parinaz Binandeh Dehaghani, Danilo Pena, A. Pedro Aguiar
This paper explores the impact of dimensionality reduction and pooling methods on Environmental Sound Classification (ESC) using lightweight CNNs. We evaluate Sparse Salient Region Pooling (SSRP) and its variants, SSRP-Basic (SSRP-B) and SSRP-Top-K (SSRP-T), under various hyperparameter settings and compare them with Principal Component Analysis (PCA). Experiments on the ESC-50 dataset demonstrate that SSRP-T achieves up to 80.69% accuracy, significantly outperforming both the baseline CNN (66.75%) and the PCA-reduced model (37.60%). Our findings confirm that well-tuned sparse pooling strategies provide a robust, efficient, and high-performing solution for ESC tasks, particularly in resource-constrained scenarios where balancing accuracy and computational cost is crucial.