Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance.

Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality in a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. Compared with baseline machine learning models, models that used fewer features, selected by discriminative score and by filtering laboratory features with clinical input, were less complex. We also found only a small difference in performance on the mortality prediction task between baseline ML models and models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to improve performance over the baseline. These findings indicate that the use of filtered features may reduce model complexity with little impact on performance.
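The feature-filtering idea summarized above can be illustrated with a minimal sketch. The abstract does not define the discriminative score, so the score used here (absolute gap between class means divided by the overall standard deviation) is an assumption chosen only for illustration, not the study's method. The function names, the feature names `lab_a`, `lab_b`, `lab_c`, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def discriminative_score(values, labels):
    """Hypothetical score (an assumption, not the paper's definition):
    absolute gap between class means divided by the overall spread."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values)
    if not pos or not neg or spread == 0:
        return 0.0
    return abs(mean(pos) - mean(neg)) / spread


def select_features(rows, labels, names, top_k, clinically_approved=None):
    """Rank features by score and return the top_k names.
    If clinically_approved is given, only features in that set are considered."""
    scored = []
    for j, name in enumerate(names):
        if clinically_approved is not None and name not in clinically_approved:
            continue
        column = [row[j] for row in rows]
        scored.append((discriminative_score(column, labels), name))
    scored.sort(reverse=True)  # highest score first
    return [name for _, name in scored[:top_k]]


# Invented toy data: three hypothetical laboratory features, four patients.
names = ["lab_a", "lab_b", "lab_c"]
rows = [
    [0.9, 140, 10],
    [0.8, 139, 9],
    [0.2, 141, 2],
    [0.1, 139, 1],
]
labels = [1, 1, 0, 0]  # 1 = died, 0 = survived (toy labels)

# Score alone versus score combined with a hypothetical expert-approved list.
by_score_only = select_features(rows, labels, names, top_k=2)
with_clinical_filter = select_features(
    rows, labels, names, top_k=2, clinically_approved={"lab_a", "lab_b"}
)
print(by_score_only)
print(with_clinical_filter)
```

In this toy setup the score-only ranking can admit a feature that an expert would exclude, while the filtered ranking cannot, which mirrors the abstract's point that discriminative score alone does not replace clinical input.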

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.1371/journal.pone.0231300
PID pmc:PMC7179831
PID pmid:32324754
URL https://academic.microsoft.com/#/detail/3018911758
URL https://www.ncbi.nlm.nih.gov/pubmed/32324754
URL https://plos.figshare.com/collections/Feature_engineering_with_clinical_expert_knowledge_A_case_study_assessment_of_machine_learning_model_complexity_and_performance/4950270
URL https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0231300&type=printable
URL https://doi.org/10.1371/journal.pone.0231300
URL https://dx.plos.org/10.1371/journal.pone.0231300
URL https://doaj.org/toc/1932-6203
URL https://EconPapers.repec.org/RePEc:plo:pone00:0231300
URL http://dx.doi.org/10.1371/journal.pone.0231300
URL http://europepmc.org/articles/PMC7179831
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author Kenneth D. Roe, 0000-0002-2619-8911
Author Vibhu Jawa, 0000-0003-4540-8344
Author Xiaohan Zhang
Author Christopher G. Chute, 0000-0001-5437-2545
Author Jeremy A. Epstein, 0000-0003-4435-7178
Author Jordan Matelsky
Author Ilya Shpitser
Author Casey Overby Taylor
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From PubMed Central; ORCID; UnpayWall; DOAJ-Articles; Crossref; Microsoft Academic Graph
Hosted By Europe PubMed Central; PLoS ONE
Journal PLoS ONE
Publication Date 2020-04-01
Publisher Public Library of Science (PLoS)
Additional Info
Field Value
Language English
Resource Type Article
keyword Q
keyword R
keyword General Biochemistry, Genetics and Molecular Biology
system:type publication
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::fc88a81c053e94d81925eeb520194ffa
Author jsonws_user
Last Updated 27 December 2020, 00:26 (CET)
Created 27 December 2020, 00:26 (CET)