1673-159X

CN 51-1686/N

An Automatic Identification Method for FM Broadcasts Based on Agglomerative Hierarchical Clustering

Abstract: To address the problem of automatically identifying broadcasts, an FM broadcast identification method based on agglomerative hierarchical clustering is proposed. Radio monitoring equipment is used to collect audio data from both normal broadcasts and so-called "black" (unlicensed) broadcasts. The audio is transcribed into text, the text is analyzed and processed, feature weights are computed, and a vector space model is constructed for hierarchical text clustering. Keywords are then extracted from each resulting class of text to form the initial corpus used to identify broadcast attributes and topic categories. During automatic identification, broadcasts whose attribute is unknown are confirmed manually, and their keywords are extracted to further update the corpus. Measured data show that the method effectively identifies the attributes and categories of broadcasts and can support radio administration agencies.
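The text-processing stages named in the abstract (feature weighting, vector space model, agglomerative hierarchical clustering, keyword extraction) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the weighting scheme, similarity measure, or linkage criterion, so TF-IDF weights, cosine similarity, and average linkage are assumptions here. The function names and the tokenized sample transcripts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized transcripts into sparse {term: weight} vectors.

    TF-IDF is an assumed weighting; the abstract only says "feature weights".
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering: start with one cluster per document and
    repeatedly merge the most similar pair (average linkage, assumed)."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        a, b = max(pairs, key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def top_keywords(vectors, cluster, k=3):
    """Rank terms in a cluster by summed weight; the top k seed the corpus."""
    total = Counter()
    for i in cluster:
        total.update(vectors[i])
    return [term for term, _ in total.most_common(k)]


# Hypothetical tokenized transcripts: two news-style and two advert-style.
docs = [
    ["traffic", "news", "road", "weather"],
    ["weather", "news", "traffic", "report"],
    ["medicine", "hotline", "cure", "call"],
    ["call", "hotline", "medicine", "discount"],
]
vectors = tfidf_vectors(docs)
clusters = agglomerative(vectors, n_clusters=2)
for cluster in clusters:
    print(sorted(cluster), top_keywords(vectors, cluster))
```

In this sketch, the keywords of each cluster would play the role of the initial corpus. A transcript that matches no cluster well would be the case the abstract sends to manual confirmation before its keywords are added.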

     
