From Frustration to Innovation: A Personal Quest in Healthcare Tech with EHRs

Enhanced Classification of Patients with Hepatitis with Machine Learning

Angelika Lin
6 min read · Jul 25, 2023

Preface

Back in early July, my job-hunting journey took me on a rollercoaster ride, leaving me feeling both challenged and determined. But every obstacle presents an opportunity, and I decided to embrace this one as a pivotal stepping stone in my career, propelling me toward new growth.

Background

Recently, as I acquired a comprehensive hepatitis dataset, I also received worrying news about a friend who is taking Accutane: their bloodwork revealed concerning numbers related to liver health. Accutane is known for its potential impact on the liver, which underscored for me the importance of monitoring liver health and its implications for patient care.

This dual experience sparked my curiosity, prompting me to delve deeper into the role of Electronic Health Records (EHRs) in liver health tracking.

With Accutane’s liver-related concerns in mind, I became even more aware of how EHRs can play a crucial role in providing real-time insights to healthcare providers. Through these records, providers can better monitor liver health, identify any adverse effects promptly, and adapt treatment plans to ensure patient safety and well-being.

Now, allow me to unveil the fruits of this journey — powerful prediction models poised to unlock the true potential of EHRs!

Data

Let’s look at our data, which contains 540 healthy patients and 75 patients diagnosed with hepatitis. I used the Seaborn library to make the plots prettier.
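To get a feel for that class balance, here is a minimal sketch of how the target could be binarized and counted. The "Category" column name follows the UCI HCV CSV, and the inline DataFrame is a stand-in for the real file:

```python
import pandas as pd
import seaborn as sns

# Stand-in for pd.read_csv("hcvdat0.csv") -- in the UCI HCV data,
# "Category" encodes the diagnosis as strings like "0=Blood Donor"
df = pd.DataFrame({
    "Category": ["0=Blood Donor"] * 540 + ["1=Hepatitis"] * 75,
})

# Binarize: anything that is not a blood donor counts as a positive case
df["target"] = (~df["Category"].str.startswith("0")).astype(int)

print(df["target"].value_counts())
# A Seaborn countplot makes the class imbalance obvious at a glance
ax = sns.countplot(x="target", data=df)
```

With roughly 7 healthy patients per hepatitis case, that imbalance is worth keeping in mind when reading the accuracy numbers later on.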

Side notes:
ALT and AST blood values are the main components of routine Liver Function Tests (LFTs) and a primary way to identify whether a patient’s disorder is of hepatic or cholestatic origin. Interpreting them is not straightforward, though, since they can be influenced by a wide range of non-hepatic factors. Because they are not liver-specific, they may also help detect metabolic disturbances in Alzheimer’s disease, where they are consistently associated with cognitive performance.
Additionally, the AST/ALT ratio, known as the De Ritis ratio, can be suggestive of cirrhosis, though it is not diagnostic. (Origin)
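For illustration, the De Ritis ratio is a one-liner in pandas (toy lab values below, not real patient data):

```python
import pandas as pd

# Toy AST/ALT values -- in practice these come from the LFT panel
labs = pd.DataFrame({"AST": [22.0, 110.0], "ALT": [25.0, 40.0]})

# De Ritis ratio: AST divided by ALT; values well above 1 (often > 2)
# are sometimes read as suggestive of cirrhosis, but never diagnostic
labs["de_ritis"] = labs["AST"] / labs["ALT"]
print(labs["de_ritis"].round(2).tolist())  # [0.88, 2.75]
```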

Patients with Hepatitis
A glance at the data distribution
  • Normally distributed: Age, CHE, CHOL, PROT (slightly left-skewed), CREA
  • Right-skewed: ALT, AST, BIL, GGT
A glance at the correlations
  • Our target correlates positively with ALP, ALT, AST, BIL, GGT, and CREA
  • It correlates negatively with ALB, CHE, CHOL, and PROT

Model Training

In the original paper, the researchers used Random Forest to identify the most diagnostic traits of hepatitis. I’ll be adding CatBoost and XGBoost to form an ensemble, moving toward higher complexity with an eye on adoption in clinical practice!

Import Libraries

Import your models, metrics, and everything else you need. I’ve imported the GridSearch-related libraries for later use.

from sklearn.metrics import accuracy_score, precision_score
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import GridSearchCV

import catboost as cb
from catboost import CatBoostClassifier, Pool

import xgboost as xgb
from xgboost import XGBClassifier

from sklearn.ensemble import RandomForestClassifier

import numpy as np
import pandas as pd

CatBoost, XGBoost, and Random Forest

Side note — XGBoost’s Softmax:
Simply put, the formula extends logistic regression into multi-class classification.
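A tiny numeric check of that claim: with only two classes, softmax collapses to the familiar sigmoid of logistic regression.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

# With two classes, softmax reduces to the logistic (sigmoid) function:
# softmax([0, z])[1] == 1 / (1 + exp(-z))
z = 1.5
two_class = softmax([0.0, z])[1]
sigmoid = 1.0 / (1.0 + np.exp(-z))
print(np.isclose(two_class, sigmoid))  # True
```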

Initial results

Initial Model Performance Evaluation

CatBoost performed slightly better than XGBoost, sitting at 0.955862 accuracy with a mean score of 0.953028. Random Forest is just… there. Well, it serves the purpose of benchmarking!

Hyperparameter Tuning | GridSearch

I performed GridSearch, which is arguably the most basic hyperparameter tuning method, but hey, just wait for the results.

Initially, GridSearch threw a CatBoostError:
WHAT? WHY? I yelped.

CatBoostError: only one of the parameters iterations, n_estimators, num_boost_round, num_trees should be initialized.

After some digging, I realized the error comes from the CatBoost library itself: those four parameters are aliases, and only one of them may be set when creating a CatBoostClassifier, whether in the constructor or in the search grid.

That’s right. Tune your stuff properly. Here’s how:

Here’s what we got from running the GridSearch successfully:

Best parameters for CatBoost: {'iterations': 90, 'learning_rate': 0.04, 'max_depth': 5}
Accuracy for CatBoost: 0.9915254237288136
Training Random Forest...
Best parameters for Random Forest: {'bootstrap': True, 'max_depth': 110, 'max_features': 3, 'min_samples_leaf': 3, 'min_samples_split': 8, 'n_estimators': 300}
Accuracy for Random Forest: 0.9830508474576272
Training XGBoost...
Best parameters for XGBoost: {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 200}
Accuracy for XGBoost: 1.0

Amazing, wonderful. Do you feel bonita?

Re-train models with GridSearch Results

After grabbing the best hyperparameters, let’s apply them and train our models all over again.

# Ensemble with GridSearch results
xgb_model2 = xgb.XGBClassifier(learning_rate=0.01,
                               max_depth=3,
                               n_estimators=200)

cb_model2 = cb.CatBoostClassifier(max_depth=5,
                                  iterations=90,
                                  learning_rate=0.04)

rf_model2 = RandomForestClassifier(bootstrap=True,
                                   max_depth=90,  # GridSearch result was 110
                                   max_features=3,
                                   min_samples_leaf=3,
                                   min_samples_split=8,
                                   n_estimators=300)

models2 = {
    'XGBoost': xgb_model2,
    'CatBoost': cb_model2,
    'RandomForest': rf_model2
}
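The retraining loop itself can be sketched like this, using StratifiedKFold so each fold keeps the ~12% hepatitis rate intact. A synthetic dataset stands in for the real features here, and Random Forest stands in for any model in the models2 dict:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the hepatitis features (replace with the real X, y)
X, y = make_classification(n_samples=615, weights=[0.88], random_state=42)

# StratifiedKFold preserves the class ratio in every fold, which matters
# when only ~75 of the patients are positive cases
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
model = RandomForestClassifier(n_estimators=300, random_state=42)

fold_scores = []
for train_idx, test_idx in skf.split(X, y):
    model.fit(X[train_idx], y[train_idx])
    fold_scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

print(np.round(fold_scores, 4), np.mean(fold_scores))
```

Looping the same splits over all three models in models2 gives the per-fold comparison shown below.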
Results with hyperparameter tuning!

Wow ain’t that a great improvement, I thought, squeaking with joy.

Result Summary

  • XGBoost consistently performs well across most folds, achieving high accuracy ranging from 0.985075 to 1.000000.
  • CatBoost also exhibits excellent performance, achieving a perfect accuracy of 1.000000 in all folds except for one (fold 6), where it scored 0.985075.
  • RandomForest demonstrates strong performance as well, with accuracy scores ranging from 0.955882 to 1.000000 across different folds.

Overall, all three machine learning models (XGBoost, CatBoost, and RandomForest) demonstrate impressive performance, making them potential candidates for the classification task at hand. However, XGBoost and CatBoost stand out as the top-performing models, consistently achieving high accuracy across most folds.

Consistency

The models’ consistent performance across different folds suggests that they are robust and generalizable. This stability is crucial in real-world applications where the models need to perform well on unseen data.

Potential Overfitting

While the models exhibit high accuracy, it’s essential to investigate whether overfitting is present. Overfitting occurs when a model performs exceptionally well on the training data but fails to generalize to new data. Thorough cross-validation and hyperparameter tuning can help mitigate overfitting concerns.

Best Performing Model

Both CatBoost and XGBoost models show strong performance, making them better candidates for the hepatitis prediction task than Random Forest. However, it might be worth exploring additional evaluation metrics or conducting statistical significance tests to determine if one model significantly outperforms the other.
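One such test is a paired t-test on the per-fold accuracies, which is appropriate here because both models are scored on the same folds. The scores below are illustrative placeholders, not the actual CV results:

```python
from scipy import stats

# Illustrative per-fold accuracies -- substitute the real CV scores
xgb_scores = [1.0, 0.9851, 1.0, 1.0, 0.9851, 1.0, 1.0]
cat_scores = [1.0, 1.0, 1.0, 1.0, 1.0, 0.9851, 1.0]

# Paired t-test on matched folds; a large p-value means no evidence
# that one model significantly outperforms the other
t_stat, p_value = stats.ttest_rel(xgb_scores, cat_scores)
print(round(p_value, 3))
```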

Final Thoughts | Potential for Real-World Application

The successful development of accurate prediction models for hepatitis demonstrates the power of data science in healthcare. The potential to diagnose and manage diseases with the help of machine learning models can significantly impact patient care and healthcare decision-making.

Overall, the model results showcase the effectiveness of CatBoost and XGBoost in predicting hepatitis-related outcomes. The next steps would involve further analysis, fine-tuning, and considering the models’ integration into real-world healthcare systems to unleash their full potential and make a positive impact in the medical domain.

Further Exploration

A potential strategy is to cross-validate with other hepatitis datasets from around the globe. Additional rounds of hyperparameter tuning (using other methods such as Random Search or Bayesian optimization), model architecture adjustments, and feature engineering might yield even more accurate predictions.

Author’s murmur

Well, that’s it. Here you go. Wasn’t that fun?

The rollercoaster ride I’ve been through, and its subsequent aftermath, served as a testament to my enduring character. You can call me stubborn, but I know myself well — I persevere. It just feels right to make a positive impact on the healthcare industry.

Having worked in a neurosurgical clinic throughout my teenage years, alongside a family of doctors, I find innovations in the healthcare realm — from patient care (including postoperative care!) to streamlining the medical billing process — beyond intriguing.

Fortunately, the landscape continues to expand — and I eagerly look forward to joining teams that share my interests and aspirations!

Hence, are you hiring? Shoot me an email or let’s connect on LinkedIn.

Acknowledgements

Creators: Ralf Lichtinghagen, Frank Klawonn, Georg Hoffmann
Donor: Ralf Lichtinghagen: Institute of Clinical Chemistry; Medical University Hannover (MHH); Hannover, Germany; lichtinghagen.ralf ‘@’ mh-hannover.de
Donor: Frank Klawonn; Helmholtz Centre for Infection Research; Braunschweig, Germany; frank.klawonn ‘@’ helmholtz-hzi.de
Donor: Georg Hoffmann; Trillium GmbH; Grafrath, Germany; georg.hoffmann ‘@’ trillium.de

Relevant Papers
Lichtinghagen R et al. J Hepatol 2013; 59: 236–42
Hoffmann G et al. Using machine learning techniques to generate laboratory diagnostic pathways — a case study. J Lab Precis Med 2018; 3: 58–67

HCV data. (2020). UCI Machine Learning Repository. https://doi.org/10.24432/C5D612.
