Establishment of a prediction model of risk factors for psoriasis vulgaris based on machine learning
Abstract:
Objective To identify sensitive indicators and indicator combinations for evaluating the severity of psoriasis vulgaris (PV), and to build a prediction model of psoriasis risk factors that supports diagnostic evaluation and clinical decision-making.
Methods A total of 301 patients with psoriasis vulgaris were included. They visited one of six hospitals between November 1, 2019 and October 31, 2021: Shaanxi Provincial Hospital of Chinese Medicine, Xi'an Hospital of Traditional Chinese Medicine, the Affiliated Hospital of Shaanxi University of Chinese Medicine, Baoji Hospital of Traditional Chinese Medicine, Ankang Hospital of Traditional Chinese Medicine, and Xining First People's Hospital. Demographic indicators, psoriasis scale information, traditional Chinese medicine (TCM) syndrome types, and laboratory test results were collected. Six machine learning algorithms (naive Bayes, logistic regression, support vector machine [SVM], random forest, adaptive boosting [AdaBoost], and stochastic gradient descent [SGD]) were used to construct and validate classification models for predicting mild versus severe psoriasis. Classification performance was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC), precision, recall, and accuracy.
Results The naive Bayes model, an explicit (interpretable) classification model, achieved an AUC of 0.801 4. The five implicit classification models (logistic regression, SVM, random forest, AdaBoost, and SGD) achieved AUC values of 0.759 1, 0.754 6, 0.800 4, 0.712 6, and 0.773 7, respectively. Both the naive Bayes and random forest models reached true positive rates above 80% on the ROC curve.
Conclusion "TCM syndrome", "Onset season", "PDW" (platelet distribution width), "Baso" (basophils), and "DLQI" (Dermatology Life Quality Index) were identified as key indicators for predicting psoriasis severity. Machine learning and data mining can use real-world clinical data to build prediction models of psoriasis risk factors, providing a basis for early disease assessment and medical decision-making.
Key words:
- Psoriasis vulgaris
- Machine learning
- Severity
- Risk factors
- Prediction model
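The abstract ranks the six models by AUC. As a reminder of what that number measures, the AUC can be read as the probability that a randomly chosen severe case receives a higher model score than a randomly chosen mild case. The sketch below computes it that way in plain Python. The function name `auc_from_scores` and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
from itertools import product


def auc_from_scores(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative).

    Tied scores count as half a win. labels use 1 = severe, 0 = mild.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 4 severe and 4 mild cases.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.7, 0.2, 0.6]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(toy_auc)  # 14 of 16 severe/mild pairs are ordered correctly -> 0.875
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; library implementations typically use a sorted-rank formulation instead.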
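Table 1. General characteristics of the 301 patients with PV [cases (%)]

| Characteristic | All patients | PASI <10 (n=112) | PASI ≥10 (n=189) | Statistic | P value |
|---|---|---|---|---|---|
| Sex | | | | 10.466ᵃ | 0.002 |
| Male | 196(65.12) | 60(53.57) | 136(71.96) | | |
| Female | 105(34.88) | 52(46.43) | 53(28.04) | | |
| Ethnicity | | | | 5.178ᵃ | 0.115 |
| Han | 292(97.01) | 107(95.54) | 185(97.88) | | |
| Hui | 3(1.00) | 3(2.68) | 0 | | |
| Manchu | 1(0.33) | 0 | 1(0.53) | | |
| Tu | 5(1.66) | 2(1.79) | 3(1.59) | | |
| Marital status | | | | 4.478ᵃ | 0.193 |
| Unmarried | 60(19.93) | 27(24.11) | 33(17.46) | | |
| Married | 233(77.41) | 82(73.21) | 151(79.89) | | |
| Divorced | 5(1.66) | 3(2.68) | 2(1.06) | | |
| Widowed | 3(1.00) | 0 | 3(1.59) | | |
| Family history | | | | 0.138ᵃ | 0.746 |
| No | 253(84.05) | 93(83.04) | 160(84.66) | | |
| Yes | 48(15.95) | 19(16.96) | 29(15.34) | | |
| Smoking history | | | | 4.800ᵃ | 0.059 |
| No | 217(72.09) | 87(77.68) | 130(68.78) | | |
| Yes | 83(27.57) | 24(21.43) | 59(31.22) | | |
| Quit smoking | 1(0.33) | 1(0.89) | 0 | | |
| Allergy history | | | | 6.297ᵃ | 0.015 |
| No | 272(90.37) | 95(84.82) | 177(93.65) | | |
| Yes | 29(9.63) | 17(15.18) | 12(6.35) | | |
| Cause of onset | | | | 21.168ᵃ | 0.003 |
| None | 216(71.76) | 73(65.18) | 143(75.66) | | |
| Illness | 23(7.64) | 7(6.25) | 16(8.47) | | |
| Season or climate | 26(8.64) | 16(14.29) | 10(5.29) | | |
| Overexertion | 10(3.32) | 5(4.46) | 5(2.65) | | |
| Alcohol consumption | 6(1.99) | 1(0.89) | 5(2.65) | | |
| Trauma | 5(1.66) | 1(0.89) | 4(2.12) | | |
| Medication | 3(1.00) | 0 | 3(1.59) | | |
| Diet | 3(1.00) | 1(0.89) | 2(1.06) | | |
| Other | 9(2.99) | 8(7.14) | 1(0.53) | | |
| Onset season | | | | 19.998ᵃ | <0.001 |
| Spring | 36(11.96) | 13(11.61) | 23(12.17) | | |
| Summer | 30(9.97) | 11(9.82) | 19(10.05) | | |
| Autumn | 64(21.26) | 28(25.00) | 36(19.05) | | |
| Winter | 108(35.88) | 51(45.54) | 57(30.16) | | |
| Unrelated to season | 63(20.93) | 9(8.04) | 54(28.57) | | |
| Season of aggravation | | | | 8.476ᵃ | 0.111 |
| Spring | 5(1.66) | 4(3.57) | 1(0.53) | | |
| Summer | 4(1.33) | 2(1.79) | 2(1.06) | | |
| Autumn | 63(20.93) | 18(16.07) | 45(23.81) | | |
| Winter | 44(14.62) | 21(18.75) | 23(12.17) | | |
| Change of season | 7(2.33) | 3(2.68) | 4(2.12) | | |
| Unrelated to season | 178(59.14) | 64(57.14) | 114(60.32) | | |
| Past medical history | | | | 13.501ᵃ | 0.015 |
| None | 212(70.43) | 85(75.89) | 127(67.20) | | |
| Diabetes | 9(2.99) | 1(0.89) | 8(4.23) | | |
| Hypertension | 30(9.97) | 4(3.57) | 26(13.76) | | |
| Hepatobiliary disease | 15(4.98) | 8(7.14) | 7(3.70) | | |
| Gastrointestinal disease | 8(2.66) | 4(3.57) | 4(2.12) | | |
| Other | 27(8.97) | 10(8.93) | 17(8.99) | | |
| TCM syndrome | | | | 31.254ᵃ | <0.001 |
| Blood-heat syndrome | 175(58.14) | 53(47.32) | 122(64.55) | | |
| Blood-stasis syndrome | 64(21.26) | 18(16.07) | 46(24.34) | | |
| Blood-dryness syndrome | 45(14.95) | 33(29.46) | 12(6.35) | | |
| Blood-deficiency syndrome | 7(2.33) | 4(3.57) | 3(1.59) | | |
| Other | 10(3.32) | 4(3.57) | 6(3.17) | | |
| TCM constitution | | | | 31.748ᵃ | <0.001 |
| Balanced | 65(21.59) | 38(33.93) | 27(14.29) | | |
| Yin deficiency | 34(11.30) | 12(10.71) | 22(11.64) | | |
| Yang deficiency | 28(9.30) | 11(9.82) | 17(8.99) | | |
| Damp-heat | 23(7.64) | 12(10.71) | 11(5.82) | | |
| Phlegm-dampness | 21(6.98) | 6(5.36) | 15(7.94) | | |
| Qi deficiency | 14(4.65) | 4(3.57) | 10(5.29) | | |
| Biased (unspecified) | 13(4.32) | 7(6.25) | 6(3.17) | | |
| Blood stasis | 12(3.99) | 4(3.57) | 8(4.23) | | |
| Qi stagnation | 4(1.33) | 2(1.79) | 2(1.06) | | |
| Other | 87(28.90) | 16(14.29) | 71(37.57) | | |
| Baseline physician's global assessment | | | | 5.187ᵇ | <0.001 |
| Almost clear | 20(6.64) | 19(16.96) | 1(0.53) | | |
| Mild | 113(37.54) | 48(42.86) | 65(34.39) | | |
| Moderate | 137(45.51) | 40(35.71) | 97(51.32) | | |
| Severe | 30(9.97) | 4(3.57) | 26(13.76) | | |
| Very severe | 1(0.33) | 1(0.89) | 0 | | |

Note: ᵃ χ² value; ᵇ Z value.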
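To make the χ² statistics in Table 1 concrete, the following minimal sketch computes a Pearson χ² statistic for a contingency table in plain Python and applies it to the sex-by-severity counts. The source does not state which software or which variant of the test was used. The uncorrected Pearson form is assumed here because it agrees with the reported value for this row, and the function name `pearson_chi2` is illustrative.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction) and degrees of freedom.

    table is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Counts from Table 1: rows = male, female; columns = PASI <10, PASI >=10.
sex_by_severity = [[60, 136], [52, 53]]
chi2_sex, dof_sex = pearson_chi2(sex_by_severity)
print(round(chi2_sex, 3), dof_sex)  # agrees with the 10.466 reported for sex
```

The same function accepts larger tables such as the TCM syndrome rows. Several of those rows contain very small counts, for which the χ² approximation is less reliable.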
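Table 2. Classification performance of the random forest model for mild and severe PV

| Severity | Precision | Recall | F1 score | Support |
|---|---|---|---|---|
| Mild | 0.60 | 0.56 | 0.58 | 43 |
| Severe | 0.77 | 0.79 | 0.78 | 78 |
| Accuracy | | | 0.71 | 121 |
| Macro average | 0.68 | 0.68 | 0.68 | 121 |
| Weighted average | 0.71 | 0.71 | 0.71 | 121 |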
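Table 3. Classification performance of the Bayesian model for mild and severe PV

| Severity | Precision | Recall | F1 score | Support |
|---|---|---|---|---|
| Mild | 0.67 | 0.84 | 0.74 | 43 |
| Severe | 0.90 | 0.77 | 0.83 | 78 |
| Accuracy | | | 0.79 | 121 |
| Macro average | 0.78 | 0.80 | 0.78 | 121 |
| Weighted average | 0.81 | 0.79 | 0.80 | 121 |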
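The precision, recall, and F1 values in Tables 2 and 3 all derive from a 2×2 confusion matrix on the 121 test cases. The source does not report those matrices. The counts below are an assumption: one set of integers that reproduces the rounded values in both tables, inferred from the reported recall and support figures. The sketch shows how each metric follows from such counts; the function name `per_class_metrics` is illustrative.

```python
def per_class_metrics(cm, names):
    """cm[i][j] = number of cases whose true class is i and predicted class is j."""
    total = sum(sum(row) for row in cm)
    result = {}
    for i, name in enumerate(names):
        true_pos = cm[i][i]
        support = sum(cm[i])                    # all cases truly in class i
        predicted = sum(row[i] for row in cm)   # all cases predicted as class i
        precision = true_pos / predicted
        recall = true_pos / support
        f1 = 2 * precision * recall / (precision + recall)
        result[name] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": support}
    result["accuracy"] = sum(cm[i][i] for i in range(len(cm))) / total
    return result


names = ["mild", "severe"]
# Assumed counts (not reported in the source), rows = true class, columns = predicted.
rf_cm = [[24, 19], [16, 62]]   # consistent with Table 2 (random forest)
nb_cm = [[36, 7], [18, 60]]    # consistent with Table 3 (Bayesian model)

rf = per_class_metrics(rf_cm, names)
nb = per_class_metrics(nb_cm, names)
for label, rep in (("random forest", rf), ("Bayesian", nb)):
    print(label, {k: round(v, 2) for k, v in rep["severe"].items()},
          round(rep["accuracy"], 2))
```

Under these assumed counts, the Bayesian model's higher recall for mild cases (36 of 43 versus 24 of 43) accounts for most of the accuracy gap between the two tables.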
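Table 4. Bayesian probability distribution model of clinical characteristics for mild and severe PV

| Feature | Value | Feature probability, flag_1 (P=0.628) | Feature probability, flag_0 (P=0.372) |
|---|---|---|---|
| TCM syndrome | 1 | 0.631 578 9 | 0.522 388 1 |
| TCM syndrome | 2 | 0.228 070 2 | 0.119 403 0 |
| TCM syndrome | 3 | 0.078 947 4 | 0.298 507 5 |
| TCM syndrome | 4 | 0.017 543 9 | 0.029 850 7 |
| TCM syndrome | 5 | 0.043 859 6 | 0.029 850 7 |
| Onset season | 1 | 0.114 035 1 | 0.104 477 6 |
| Onset season | 2 | 0.114 035 1 | 0.104 477 6 |
| Onset season | 3 | 0.201 754 4 | 0.223 880 6 |
| Onset season | 4 | 0.315 789 5 | 0.447 761 2 |
| Onset season | 9 | 0.254 386 0 | 0.119 403 0 |
| PDW | [0, 0.12] | 0.350 877 2 | 0.164 179 1 |
| PDW | (0.12, 0.13] | 0.447 368 4 | 0.671 641 8 |
| PDW | (0.13, 1.00] | 0.201 754 4 | 0.164 179 1 |
| Baso | [0, 0.20] | 0.421 052 6 | 0.134 328 4 |
| Baso | (0.20, 0.30] | 0.412 280 7 | 0.626 865 7 |
| Baso | (0.30, 1.00] | 0.166 666 7 | 0.238 806 0 |
| DLQI | [0, 0.38] | 0.210 526 3 | 0.462 686 6 |
| DLQI | (0.38, 0.48] | 0.421 052 6 | 0.253 731 3 |
| DLQI | (0.48, 1] | 0.368 421 1 | 0.283 582 1 |

Note: flag_1 is the severe group and flag_0 is the mild group. TCM syndrome: 1, blood-heat syndrome; 2, blood-stasis syndrome; 3, blood-dryness syndrome; 4, blood-deficiency syndrome; 5, other. Onset season: 1, spring; 2, summer; 3, autumn; 4, winter; 9, unrelated to season. The intervals in the table are expressed as percentages of the global value of each index; for example, (0.38, 0.48] denotes values >38% and ≤48%.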
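Table 4 contains everything a naive Bayes classifier needs to score a patient: the two class priors and, for each feature value, its probability within each class. The sketch below multiplies these together and normalizes to obtain the posterior probability of the severe group. It assumes the five features are treated as conditionally independent and that the tabulated probabilities are used as printed, with no further smoothing. The source does not spell out these implementation details, and the names `TABLE_4` and `posterior_severe` and the two example patients are hypothetical.

```python
# Priors and conditional probabilities copied from Table 4.
# Each tuple is (P(value | flag_1, severe), P(value | flag_0, mild)).
PRIOR_SEVERE, PRIOR_MILD = 0.628, 0.372
TABLE_4 = {
    "tcm_syndrome": {1: (0.6315789, 0.5223881), 2: (0.2280702, 0.1194030),
                     3: (0.0789474, 0.2985075), 4: (0.0175439, 0.0298507),
                     5: (0.0438596, 0.0298507)},
    "onset_season": {1: (0.1140351, 0.1044776), 2: (0.1140351, 0.1044776),
                     3: (0.2017544, 0.2238806), 4: (0.3157895, 0.4477612),
                     9: (0.2543860, 0.1194030)},
    "pdw": {"[0, 0.12]": (0.3508772, 0.1641791),
            "(0.12, 0.13]": (0.4473684, 0.6716418),
            "(0.13, 1.00]": (0.2017544, 0.1641791)},
    "baso": {"[0, 0.20]": (0.4210526, 0.1343284),
             "(0.20, 0.30]": (0.4122807, 0.6268657),
             "(0.30, 1.00]": (0.1666667, 0.2388060)},
    "dlqi": {"[0, 0.38]": (0.2105263, 0.4626866),
             "(0.38, 0.48]": (0.4210526, 0.2537313),
             "(0.48, 1]": (0.3684211, 0.2835821)},
}


def posterior_severe(patient):
    """Posterior P(severe | features) under the naive independence assumption."""
    severe, mild = PRIOR_SEVERE, PRIOR_MILD
    for feature, value in patient.items():
        p_given_severe, p_given_mild = TABLE_4[feature][value]
        severe *= p_given_severe
        mild *= p_given_mild
    return severe / (severe + mild)


# Hypothetical patient A: blood-dryness syndrome, winter onset, mid PDW and Baso, low DLQI.
patient_a = {"tcm_syndrome": 3, "onset_season": 4, "pdw": "(0.12, 0.13]",
             "baso": "(0.20, 0.30]", "dlqi": "[0, 0.38]"}
# Hypothetical patient B: blood-stasis syndrome, onset unrelated to season, low PDW and Baso.
patient_b = {"tcm_syndrome": 2, "onset_season": 9, "pdw": "[0, 0.12]",
             "baso": "[0, 0.20]", "dlqi": "(0.38, 0.48]"}
p_a = posterior_severe(patient_a)
p_b = posterior_severe(patient_b)
print(round(p_a, 3), round(p_b, 3))
```

With these inputs, patient A scores well below 0.5 (classified as mild) and patient B well above it (classified as severe), which illustrates how the explicit probability table lets a reader trace each feature's contribution to the prediction.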