Online ISSN: 2515-8260

A VARIABLE SELECTION IN ORDERED LOGISTIC REGRESSION MODEL USING DECISION TREE ANALYSIS FOR THE CLASSIFICATION: A CASE STUDY OF HYPERTENSION MODELING

Wan Muhamad Amir W Ahmad1*, Hazik Bin Shahzad1, Mohamad Nasarudin Adnan1, Farah Muna Mohamad Ghazali1, Noraini Mohamad1, Norhayati Yusop1, Nor Farid Mohd Noor2, Nor Azlida Aleng3, Mohamad Shafiq Mohd Ibrahim4

Abstract

Background and Objective: Hypertension is a public health issue in which the force of blood against the vessel walls is persistently elevated. According to the WHO, one in four men and one in five women have hypertension. It affects twenty to thirty percent of the adult population and more than five to eight percent of pregnancies worldwide, and it is frequently curable when detected and treated early. Recognizing the significance of statistical modeling in hypertension, this study aims to develop a method that stakeholders can use to predict and manage hypertension cases. The study uses two approaches, decision tree analysis and ordinal regression, which are combined in R syntax with some modification and extension.
Materials and Methods: In this paper, we developed a decision tree analysis in R syntax with an embedded prediction classification step. The accuracy of the predicted classification indicates whether the classification analysis is successful. The method is illustrated with hypertension data consisting of one thousand observations. Before further testing, each pre-selected variable is evaluated for clinical relevance and significance. Four selected variables are then tested using the decision tree: blood pressure, glucose, height, and triglycerides. The resulting classification is used as input for the ordinal regression modeling.
Result: The level of hypertension can be determined by systolic blood pressure, glucose, and triglycerides, in agreement with the most recent published research. These variables are chosen as input for the ordered logistic regression, and the goodness of fit is assessed using the developed syntax. The significance level is set at 0.05.
Conclusion: The proposed method yields excellent results with a very high level of forecasting precision. The approach provides an accurate evaluation of the fit of the final model. The strong performance of the model supports improved outcomes and effective management decision-making.
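
The two-stage idea described in the abstract (screen candidate predictors with a tree-style split criterion, then pass the retained predictors to an ordered logistic model) can be illustrated with a minimal sketch. This is not the authors' R syntax: it is written in Python, the function names (`gini`, `best_split_gain`, `ordered_logit_probs`) are hypothetical, the toy data are invented for illustration, and the coefficient and cutpoints are assumed values rather than estimates from the study. The tree stage is reduced to a single Gini-based split per predictor, and the regression stage only evaluates the proportional-odds form P(Y <= j | x) = 1 / (1 + exp(-(theta_j - x'beta))) without fitting it.

```python
import math


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_gain(values, labels):
    """Largest Gini decrease achievable with one threshold split on a predictor."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # so that both child nodes are non-empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def ordered_logit_probs(x, beta, cutpoints):
    """Category probabilities under a proportional-odds (ordered logit) model."""
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities P(Y <= j | x); the last category closes at 1.
    cum = [1.0 / (1.0 + math.exp(-(t - eta))) for t in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Invented toy data: three ordered outcome levels coded 0 < 1 < 2.
levels = [0, 0, 1, 1, 2, 2]
sbp = [110, 118, 132, 138, 155, 162]      # hypothetical systolic readings
height = [170, 158, 165, 172, 160, 168]   # hypothetical heights

gains = {
    "sbp": best_split_gain(sbp, levels),
    "height": best_split_gain(height, levels),
}
print(gains)

# Assumed (not estimated) coefficient and cutpoints for one standardized predictor.
probs = ordered_logit_probs([0.0], [1.0], [-1.0, 1.0])
print(probs)
```

In this toy data the systolic pressure column separates the outcome levels better than height, so it receives the larger gain, which mirrors the kind of screening the abstract describes. A full analysis would grow a complete tree and estimate the ordinal model from data instead of assuming its parameters.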
