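XGBoost-based Intelligent Search Results Sorting System
Cite this article: Zhao Han, Meng Xiaojing, Zhang Chunyong. XGBoost-based Intelligent Search Results Sorting System[J]. 教育技术导刊, 2019, 18(12): 56-60.
Authors: Zhao Han, Meng Xiaojing, Zhang Chunyong
Affiliation: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China
Funding: CERNET Next Generation Internet Technology Innovation Project (NGII20160205)
Abstract: Traditional model-based search engines suffer from slow ranking, slow feature retrieval, and complicated handling of non-numerical features. To address these problems, this paper proposes an intelligent search result ranking model based on XGBoost. The ranking model is built with the XGBoost algorithm; non-numerical features are processed with one-hot encoding and filtered with the Apriori algorithm; user and merchant feature data are cached in Redis; and merchant scores are predicted in parallel to speed up scoring. The final trained model is evaluated with XGBoost's built-in evaluation functions. The prediction accuracy is 0.76, which indicates that the model assigns higher scores to merchants that match user preferences. The AUC is 0.72 on the training set and 0.69 on the test set; the small gap indicates no obvious overfitting, and the accuracy is high enough for the model to be used to build a merchant ranking model.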

Keywords: XGBoost; feature caching; feature screening; parallel prediction
Received: 2019-03-14
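
The feature-processing step named in the abstract (one-hot encoding followed by Apriori-based screening) can be illustrated with a minimal, standard-library-only sketch. The abstract does not say how the authors applied Apriori, so everything here is an assumption: the feature names, the toy transactions, the `min_support` threshold, and the helper names `one_hot` and `frequent_itemsets` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def frequent_itemsets(transactions, min_support, max_size=2):
    """Apriori-style search: keep itemsets whose support >= min_support.

    Candidates of size k are built only from items whose (k-1)-subsets
    were all frequent, which is the pruning idea behind Apriori.
    """
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c / n >= min_support}
    result = {s: counts[next(iter(s))] / n for s in current}
    size = 2
    while current and size <= max_size:
        items = sorted({i for s in current for i in s})
        candidates = {
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(sub) in current for sub in combinations(c, size - 1))
        }
        current = set()
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                current.add(cand)
                result[cand] = support
        size += 1
    return result


# Hypothetical toy data: merchant cuisine types and click logs.
cuisines = ["noodles", "hotpot", "noodles", "bbq"]
categories, rows = one_hot(cuisines)

transactions = [
    {"cuisine=noodles", "clicked"},
    {"cuisine=hotpot"},
    {"cuisine=noodles", "clicked"},
    {"cuisine=bbq", "clicked"},
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
```

In this sketch, a categorical value that never appears in a frequent itemset together with `clicked` would be a candidate for removal before training; that screening rule is one plausible reading of the abstract, not a description of the authors' procedure.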

XGBoost-based Intelligent Search Results Sorting System
ZHAO Han, MENG Xiao-jing, ZHANG Chun-yong. XGBoost-based Intelligent Search Results Sorting System[J]. Introduction of Educational Technology, 2019, 18(12): 56-60.
Authors: ZHAO Han, MENG Xiao-jing, ZHANG Chun-yong
Institution: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Abstract: To address the slow ranking, slow feature acquisition, and complex non-numerical feature processing of traditional model-based search engines, an intelligent ranking model based on XGBoost is proposed. The ranking model is built with the XGBoost algorithm. One-hot encoding and the Apriori algorithm are used to process and filter non-numerical features, Redis is used to cache user and merchant feature data, and parallel prediction is used to speed up the scoring of merchants. Finally, the built-in evaluation functions of XGBoost are used to evaluate the trained model. The prediction accuracy of the model is 0.76, which shows that the model gives higher scores to merchants that match user preferences. The AUC is 0.72 on the training set and 0.69 on the test set. The small difference indicates that the model shows no obvious overfitting, and the accuracy is high, so the model can be used to build a merchant ranking model.
Keywords:XGBoost  feature caching  feature screening  parallel prediction  
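
The parallel scoring and AUC evaluation described in the abstract can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the authors' implementation: `score_merchant` is a hypothetical stand-in for the trained XGBoost model, the feature names and weights are invented, and the AUC is computed by direct pairwise comparison rather than by XGBoost's built-in evaluation.

```python
from concurrent.futures import ThreadPoolExecutor


def score_merchant(features):
    """Hypothetical stand-in for a trained model's per-merchant score."""
    return 0.6 * features["rating"] / 5.0 + 0.4 * features["click_rate"]


def predict_parallel(feature_rows, workers=4):
    """Score merchants concurrently; pool.map() preserves the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_merchant, feature_rows))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical merchant features and user-preference labels (1 = preferred).
merchants = [
    {"rating": 4.5, "click_rate": 0.30},
    {"rating": 3.0, "click_rate": 0.10},
    {"rating": 4.0, "click_rate": 0.05},
    {"rating": 2.0, "click_rate": 0.20},
]
labels = [1, 1, 0, 0]

scores = predict_parallel(merchants)
# Merchant indices ordered from highest to lowest predicted score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
auc_value = auc(labels, scores)
```

Threads are used here only to keep the sketch short; for CPU-bound scoring in pure Python, a process pool or the model library's own multithreaded prediction would likely be the better choice. Computing the same AUC on training and test data and comparing the two values mirrors the overfitting check reported in the abstract (0.72 versus 0.69).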