Machine learning approach to the classification of hepatitis B surface antigen seroclearance in hepatitis B virus

Abstract

This study used an integrated machine learning (ML) classification technique to classify patients with or without seroclearance of hepatitis B surface antigen (HBsAg) using single nucleotide polymorphism (SNP) data. Bayesian optimization was employed to tune the hyperparameters of the random forest (RF) and support vector machine (SVM) models. Incorporating RF as a feature selection step before the SVM classifier yielded higher performance metrics than either baseline model alone, with 80% accuracy, 79% precision, 80% sensitivity, and an area under the curve (AUC) of 0.8. These results demonstrate that integrating ML models led to a more suitable analysis of SNP profiles for disease risk prognosis.
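
The RF-then-SVM arrangement described above can be sketched roughly as follows. This is an illustrative sketch only, not the study's implementation: it assumes scikit-learn, uses hypothetical synthetic genotypes coded 0/1/2 with an invented label rule, and fixes the hyperparameters by hand rather than tuning them with Bayesian optimization as the study did. All sizes, thresholds, and variable names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical data: 300 patients x 50 SNPs, genotypes coded 0/1/2.
n_patients, n_snps = 300, 50
X = rng.integers(0, 3, size=(n_patients, n_snps))

# Assumed label rule (not from the study): three SNPs carry signal, plus noise.
signal = X[:, 0] + X[:, 1] - X[:, 2] + rng.normal(0.0, 0.5, n_patients)
y = (signal > np.median(signal)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: RF importances select SNPs; step 2: SVM classifies on the subset.
# Hyperparameters are fixed here; the study tuned them by Bayesian optimization.
pipeline = Pipeline([
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=0),
        threshold="median",  # keep SNPs at or above the median importance
    )),
    ("svm", SVC(kernel="rbf", C=1.0, gamma="scale")),
])
pipeline.fit(X_train, y_train)

y_pred = pipeline.predict(X_test)
y_score = pipeline.decision_function(X_test)  # continuous scores for the AUC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "sensitivity": recall_score(y_test, y_pred),  # recall of the positive class
    "auc": roc_auc_score(y_test, y_score),
}
n_selected = int(pipeline.named_steps["select"].get_support().sum())

print(f"SNPs kept by RF selection: {n_selected} of {n_snps}")
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Placing the selector inside the pipeline means the RF importances are computed on the training split only, so the held-out metrics are not influenced by feature selection on test data.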
