A proposed solution
The first step in the 4D approach is the collection of data, which refers to an ongoing process of capturing, reviewing, and interpreting the information gathered from multiple sources.
In the context of healthcare, this means the accumulation of medical information artifacts from hospitals, clinical assessments, immunization cards, etc., eventually contributing to a holistic view of patients.
The key to efficient patient engagement and satisfaction lies in the quality of the results produced. For this reason, collected data must go through a series of cleansing and preprocessing steps to remove unnecessary details. This stage is also a good opportunity to validate the uniformity of data gathered from multiple sources.
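As a minimal sketch of what such cleansing might look like, the following uses pandas to deduplicate records and normalize missing values; the column names and sentinel conventions are hypothetical and will vary by source system.

```python
import numpy as np
import pandas as pd

# Hypothetical raw extract; real column names vary by source system.
records = pd.DataFrame({
    "patient_id": [101, 101, 102, 103],
    "systolic_bp": [120.0, 120.0, None, 145.0],
    "weight_kg": [70.0, 70.0, 82.5, -1.0],  # -1.0 used as a "missing" sentinel
})

# Drop exact duplicates that appear when the same visit is exported twice.
records = records.drop_duplicates()

# Normalize sentinel values to proper missing markers, then impute.
records["weight_kg"] = records["weight_kg"].replace(-1.0, np.nan)
records["systolic_bp"] = records["systolic_bp"].fillna(records["systolic_bp"].median())

print(records)
```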
Some of the modern and conventional data sources that could be used to build such a robust database are listed below.
Electronic Health Record: An EHR is a system used and maintained by healthcare organizations to collect and store patients’ medical information. It contains demographics, clinical diagnoses, symptoms, medications, and laboratory and biomedical imaging data.
Medical insurers: Public and private medical insurers collect a wide range of data to evaluate coverage, track health services, and manage payments. Such data includes patient-specific information such as blood reports, treatment details, vaccinations, and hospitalizations.
Patient-generated data: This type of information greatly differs from that captured in clinical settings since patients are responsible for recording it. Increased usage of wearable devices and health apps has generated petabytes of data belonging to a large number of patients.
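To make the idea of a holistic patient view concrete, here is a minimal sketch that outer-joins hypothetical extracts from the three sources above on a shared patient identifier; real schemas and record-linkage logic will be considerably messier.

```python
import pandas as pd

# Hypothetical extracts from three sources; schemas are illustrative only.
ehr = pd.DataFrame({"patient_id": [1, 2], "diagnosis": ["hypertension", "none"]})
insurer = pd.DataFrame({"patient_id": [1, 2], "hospitalizations": [2, 0]})
wearable = pd.DataFrame({"patient_id": [1, 2], "avg_daily_steps": [4200, 9800]})

# Outer-join on a shared patient identifier to build one holistic record
# per patient; in practice, record linkage is rarely this clean.
holistic = (
    ehr.merge(insurer, on="patient_id", how="outer")
       .merge(wearable, on="patient_id", how="outer")
)
print(holistic)
```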
Once data is collected, it must be passed through an AI model capable of producing a likelihood matrix. The quality and quantity of the information are important since they directly affect how well the model performs.
As a first step in the model's development, you can choose from various machine learning algorithms that meet the desired objective. Some of the more popular ones increasingly finding use in healthcare are described below.
After selecting an algorithm, we must train the model using the prepared datasets. A critical consideration at this stage is model accuracy. While there are no internationally accepted standards, a model with an accuracy of 90% or more is generally considered good.
Such a figure can be achieved by incrementally feeding training datasets to the model and comparing its results against corresponding validation datasets; a sketch of this workflow follows the algorithm list below. Depending on the algorithm chosen, supervised or unsupervised learning techniques can be used to prepare the trained model.
Naïve Bayes: Naïve Bayes is an algorithm based on conditional probability. In the context of healthcare, it has been reported to perform extremely well and can be used for the mining, characterization, and classification of large medical records.
Support Vector Machine: SVM is one of the most widely used algorithms in healthcare because of its ability to use seemingly simple clinical inputs to predict highly complex diseases. Research applying this algorithm to the prevention of dementia and diabetes also shows encouraging results.
Random Forest: Random Forest leverages the power of multiple decision trees to make recommendations. It can be used for classifying and predicting the probability of an event, such as disease risk management, which can further assist doctors in making critical decisions.
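As a minimal sketch of the training workflow described above, the following trains a Random Forest (one of the algorithms just listed) on a synthetic stand-in for a prepared patient dataset, checks accuracy on a held-out validation set, and prints the per-class probabilities that form the likelihood matrix mentioned earlier; all data here is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a prepared patient dataset (X: features, y: disease label).
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Validation accuracy approximates the ~90% target discussed above.
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))

# predict_proba returns one row per patient with a probability per class --
# the "likelihood matrix" referred to earlier.
print(model.predict_proba(X_val[:3]))
```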
Evaluation of the machine learning model is a process that can gauge how the solution will behave in a real-life scenario. To do this, we measure the performance of a trained model on a new dataset and compare its predicted results with actual ones.
The test data must be chosen carefully to avoid overfitting (the model fits the training dataset too well but fails to generalize to real situations) and underfitting (the model is too simple to capture the underlying trend in the data).
Careful examination of encountered errors is also an important step in this direction. To analyze those, erroneously predicted data must be investigated to find potential problems in the model.
For example, patients with cardiovascular issues might be confused with those having renal problems since the two share some common pathological features. Such misclassifications can often be reduced by switching to a more complex model or adding more features to the existing one.
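A minimal sketch of such error analysis, assuming a scikit-learn classifier and synthetic data: the misclassified samples are pulled out so their feature values can be inspected for shared patterns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Pull out the erroneously predicted samples for manual investigation --
# e.g. cardiovascular patients misclassified as renal patients.
y_pred = model.predict(X_test)
errors = np.where(y_pred != y_test)[0]
print(f"{len(errors)} misclassified samples out of {len(y_test)}")
print(X_test[errors][:3])  # examine their feature values for common patterns
```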
To quantify the quality of a model's prediction, any of the following metrics can be used.
Sensitivity: In the healthcare industry, sensitivity measures the proportion of actual positive cases predicted correctly as positive (also called true positives). In other words, a person who has heart disease must also be predicted by the model as having it.
Specificity: The counterpart of sensitivity, specificity is defined as the proportion of actual negatives that are also predicted as negative (also called true negatives). Simply put, a person who doesn't have heart disease must also be predicted by the model as not having it.
Precision: Focused on the correctness of positive predictions, precision is defined as the ratio of true positives to the total of true positives and false positives. The more false positives the model produces, the lower this metric will be.
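All three metrics can be computed directly from a confusion matrix; the sketch below uses hypothetical heart-disease labels (1 = has disease, 0 = does not).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical heart-disease labels: 1 = has disease, 0 = does not.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # true positives / all actual positives
specificity = tn / (tn + fp)   # true negatives / all actual negatives
precision   = tp / (tp + fp)   # true positives / all predicted positives

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} precision={precision:.2f}")
```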
Now that we have created a reliable and accurate model that can make predictions, it won't do much good for anyone unless it is deployed. To decide on the deployment mode, you need to gauge whether the model should be integrated into an existing application or whether a new one must be built.
You must also assess the frequency of predictions that would need to be made, the number of applications that will need to access this model, the latency requirements of all those applications, and the technical and analytical maturity of your team.
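One possible deployment mode, sketched below under the assumption that the trained model was serialized with joblib, is to wrap it in a small Flask service that returns the likelihood matrix for each request; the file name and route are illustrative.

```python
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
# Assumes the trained model was saved earlier with joblib.dump(model, "model.joblib").
model = joblib.load("model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON of the form {"features": [[...], ...]} -- one row per patient.
    features = request.get_json()["features"]
    proba = model.predict_proba(features).tolist()  # the likelihood matrix
    return jsonify({"likelihoods": proba})

if __name__ == "__main__":
    app.run(port=5000)
```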
Once the model is deployed, it is a good idea to establish a feedback loop so that users can report misclassified data and help improve the model's accuracy even further. Consequently, the model must be retrained regularly, and an updated version redeployed using an automated, repeatable, and reliable procedure; a minimal sketch of such a retraining step follows.
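The sketch below assumes scikit-learn and a joblib-serialized model artifact; the function and file names are hypothetical.

```python
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def retrain(X_old, y_old, X_feedback, y_feedback, version):
    # Fold user-reported corrections from the feedback loop into the training set.
    X = np.vstack([X_old, X_feedback])
    y = np.concatenate([y_old, y_feedback])
    model = RandomForestClassifier(random_state=0).fit(X, y)
    # Version the artifact so the serving layer can roll forward (or back) safely.
    joblib.dump(model, f"model_v{version}.joblib")
    return model
```

In production, this would typically run on a schedule inside an automated pipeline rather than by hand. The potential target segments for using such an application are listed below.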
Medical professionals: Medical professionals can benefit from such an application by automating the diagnosis of multiple patients, anticipating the impact of new drugs, monitoring data from consumers' wearable devices, and suggesting controls for observed risk factors.
Patients: Patients can self-record their medical history manually or through their IoT-enabled devices. The AI model then evaluates these attributes, and the user is informed of the result via a likelihood matrix. A prevention program tailored to users' needs can also be offered.
Public health officials: Government officials can use this data-driven approach in predicting future public health threats, proactively assessing the impact of control measures, and strengthening efforts toward developing integrated surveillance systems, diagnostics, and preventive interventions.