Music genre recognition research based on enhanced AlexNet
LIU Wanjun, MENG Renjie, QU Haicheng, LIU Lamei
(College of Software, Liaoning Technical University, Huludao 125105, China)

Abstract: To address the weak ability of machine learning models to identify music genre features, this paper proposes a music genre recognition model based on a deep convolutional neural network (DCNN-MGR). First, the model extracts audio information through the fast Fourier transform, generates spectrograms that can be input to a DCNN, and cuts them into spectrogram slices. AlexNet is then enhanced by combining the leaky rectified linear unit (Leaky ReLU) function, the hyperbolic tangent (Tanh) function, and a Softplus classifier. Next, the spectrogram slices are fed into the enhanced AlexNet for multi-batch training and validation, in which music features are extracted and learned, yielding a network model that can effectively distinguish music features. Finally, the resulting model is tested on music genre recognition. Experimental results show that the enhanced AlexNet clearly outperforms AlexNet and other commonly used DCNNs in music feature recognition accuracy and network convergence, and that the DCNN-MGR model improves music genre recognition accuracy by 4% to 20% over other machine learning models.

Keywords: music genre recognition; deep convolutional neural network; machine learning; deep learning; AlexNet; audio feature extraction; music feature recognition

Music genre recognition research based on enhanced AlexNet[J]. CAAI Transactions on Intelligent Systems, 2020, 15(4): 750–757.
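The preprocessing step summarized in the abstract (a Fourier transform of the audio, followed by cutting the resulting spectrogram into slices) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spectrogram` and `slice_spectrogram`, the Hann window, the log scaling, and the values of `frame_len`, `hop`, and `slice_width` are all assumptions chosen for illustration, and a synthetic tone stands in for a music file.

```python
import numpy as np


def spectrogram(signal, frame_len=1024, hop=512):
    """Log-magnitude spectrogram from a framed FFT; shape (freq_bins, frames).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Stack overlapping, windowed frames: shape (n_frames, frame_len).
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real-input FFT of each frame gives frame_len // 2 + 1 frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


def slice_spectrogram(spec, slice_width=128):
    """Cut a spectrogram along the time axis into equal-width slices.

    Trailing frames that do not fill a whole slice are dropped
    (an assumption; the paper may handle the remainder differently).
    """
    n_slices = spec.shape[1] // slice_width
    return [
        spec[:, k * slice_width: (k + 1) * slice_width]
        for k in range(n_slices)
    ]


# Synthetic 10-second, 440 Hz tone standing in for a music clip.
sample_rate = 22050
t = np.arange(10 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(tone)
slices = slice_spectrogram(spec)
print(spec.shape, len(slices), slices[0].shape)
```

Each slice has the same shape, so the slices can be batched as fixed-size inputs to a convolutional network; the slice width trades off the temporal context per sample against the number of training samples obtained from each track.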
Received: 2019-09-16.
Foundation item: Youth Program of the National Natural Science Foundation of China (41701479).
Corresponding author: MENG Renjie.

Music genre is one of the most frequently mentioned music tags. As online music libraries grow, retrieving music by genre has become a mainstream approach in music information retrieval, and it is also an important basis on which music service platforms recommend music to users. Automatic and accurate music genre recognition can effectively reduce labor costs. Commonly used music genre recognition models generally comprise two stages, training and testing. In the training stage, a mathematical model is first built to describe discriminative numerical features of music genres; the numerical features of the music files are then extracted by means of pre-emphasis, Mel filtering, cepstral liftering, and similar operations; finally, a classifier is trained on the numerical features and distribution characteristics of the different genres. In the testing stage, as in the training stage,