Date of Award


Document Type

Open Access

Degree Name

Bachelor of Science



First Advisor

Professor Jue Wang


Keywords

kidney stones, modeling, logistic regression


Kidney stone disease has become more prevalent over the years, leading to high treatment costs and associated health risks. In this study, we apply machine learning methods to a large medical database to extract features and construct models for diagnosing kidney stone disease.

Data on 46,250 patients and 58,976 hospital admissions were extracted and analyzed, including patients’ demographic information, diagnoses, vital signs, and laboratory measurements of the blood and urine. We compared kidney stone (KDS) patients with four groups: patients with abdominal and back pain (ABP); patients diagnosed with nephritis, nephrosis, renal sclerosis, chronic kidney disease, or acute and unspecified renal failure (NCA); patients diagnosed with urinary tract infections and other diseases of the kidneys and the uterus (OKU); and patients with other conditions (OTH). For each comparison, we built logistic regression models and random forest models and identified the one with the best predictive performance.

For KDS vs. ABP, a logistic regression model using five variables (age, mean respiratory rate, blood chloride, blood creatinine, and blood CO2 levels) from the patients’ first lab results gave the best prediction accuracy, 0.699. This model maximized sensitivity, with a value of 0.726. For KDS vs. NCA, a logistic regression using the Elixhauser score and blood urea nitrogen (BUN) values from the first lab results of patients’ first admissions produced the best outcome, with an accuracy of 0.883 and a maximized specificity of 0.898. For KDS vs. OKU, a logistic regression using the estimated glomerular filtration rate (EGFR) calculated from the average lab values gave the best outcome, with an accuracy of 0.852 and a maximized specificity of 0.922. Finally, a logistic regression using age, EGFR, BUN, blood creatinine, and blood CO2 gave the best outcome for KDS vs. OTH, with an accuracy of 0.894 and a maximized specificity of 0.903. This research provides models that the medical field could potentially apply to kidney stone patients. It also offers a stepping stone for researchers who want to build kidney stone models for a different patient population.
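The accuracy, sensitivity, and specificity values quoted above are standard summaries of a binary classifier's confusion matrix. The following is a minimal sketch of how such metrics are computed, not the code used in this study. It assumes KDS is treated as the positive class (label 1) and the comparison group as the negative class (label 0); the names `ConfusionCounts` and `count_outcomes` and the example labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier; the positive class is assumed to be KDS."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired true/predicted labels."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fn, fp, tn)


def accuracy(c):
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def sensitivity(c):
    """Fraction of positive cases correctly identified (true positive rate)."""
    return c.tp / (c.tp + c.fn)  # assumes at least one positive case


def specificity(c):
    """Fraction of negative cases correctly identified (true negative rate)."""
    return c.tn / (c.tn + c.fp)  # assumes at least one negative case


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    counts = count_outcomes(y_true, y_pred)
    print(accuracy(counts), sensitivity(counts), specificity(counts))
```

A model chosen to maximize sensitivity favors catching true kidney stone cases, while one chosen to maximize specificity favors correctly ruling out patients in the comparison group.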



Rights Statement

In Copyright - Educational Use Permitted.