The good, the bad, and the questionable.
The concept of machines doing human tasks has been debated for decades, perhaps even centuries. In the first half of the 20th century, factory workers feared that mechanization would displace them. Decades later, we see that automation has arguably bettered our lives and made society more productive. The same fear has been expressed regarding artificial intelligence (AI) in general and its use in health care in particular.
Artificial intelligence in its contemporary application traces its origins to the mid-1930s. Anyone who has seen The Imitation Game may recognize the agony of Alan Turing as he struggled to crack, and ultimately cracked, enemy-coded messages. His Turing machines are widely regarded as forerunners of today’s computers. What is the chore of computers but to perform complex tasks that would be burdensome for humans? How do they accomplish these tasks? The answer is that they employ algorithms that funnel data into an answer. What is an algorithm? Simply put, it is a set of rules a computer follows to digest large amounts of data and reach a conclusion (similar to a diagnosis).
Taking a step back, think of the rules for simple addition or long division. In elementary school, we all learned how to do these operations by hand. Now these and much more complex calculations can be done with our handheld devices using algorithms.
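To make that definition concrete, the long division we learned by hand is itself an algorithm: a fixed sequence of steps that turns the input numbers into an answer. The following is a minimal sketch in Python, purely for illustration; it is the schoolbook procedure, not how any particular device implements division.

```python
def long_division(dividend: int, divisor: int) -> tuple[int, int]:
    """Schoolbook long division, returning (quotient, remainder)."""
    if divisor == 0:
        raise ValueError("division by zero")
    quotient = 0
    remainder = 0
    for digit in str(dividend):                  # work one digit at a time
        remainder = remainder * 10 + int(digit)  # "bring down" the next digit
        quotient = quotient * 10 + remainder // divisor
        remainder = remainder % divisor
    return quotient, remainder

print(long_division(1938, 17))  # (114, 0)
```

The machine follows these rules faithfully every time; that is all an algorithm is.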
We can find applications of algorithms all around us. The spellchecker is a useful example when our fingers hit the wrong keys. But what happens when we intend something that the spell-checking algorithm disagrees with? It must be manually corrected (more on bad algorithms later). There are numerous examples of algorithms in modern automobiles. In fact, it has been suggested that there are potential carryovers from the development of autonomous vehicles to physician assistance in health care.1 Think of IBM Watson making the diagnosis.2
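At its core, a spellchecker is just such a rule set: compare the typed word against a word list and suggest the closest matches. The toy sketch below (Python; the word list and matching method are my own illustrative choices, not those of any commercial product) also shows the limitation noted above: a perfectly good word the algorithm does not know gets no suggestion and must be handled by hand.

```python
import difflib

# A toy dictionary; a real spellchecker uses a far larger word list
# and more sophisticated ranking. Purely illustrative.
DICTIONARY = ["retina", "macula", "fundus", "glaucoma", "edema", "algorithm"]

def suggest(word: str, n: int = 3) -> list[str]:
    """Return up to n dictionary words closest to the typed word."""
    return difflib.get_close_matches(word.lower(), DICTIONARY, n=n, cutoff=0.6)

print(suggest("algoritm"))  # ['algorithm'] -- a helpful correction
print(suggest("drusen"))    # [] -- a valid word the algorithm simply doesn't know
```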
As algorithms evolve and are refined, their accuracy improves. Mistakes need to be identified and corrected to prevent false-positive results. One personal example of a bad algorithm is my Apple Watch. Although it alerts for serious incidents such as falls, it has also alerted when, while wearing it, I shake sunscreen down to the end of the tube (Figure 1).
Let’s look at a case example. A 22-year-old man presents with a 2-day history of a unilateral floater. His personal and family medical histories are noncontributory. His ophthalmic history is uncomplicated, and he does not wear a refractive correction. He takes no medications and does not use any illicit drugs. Visual acuity is 20/20 in each eye without refractive correction. The anterior segment is unremarkable by slit lamp examination, and intraocular pressures are within the statistically normal range for each eye by applanation. Dilated fundus examination reveals the clinical findings recorded by color fundus photography (Figure 2). The list of differentials may include posterior vitreous detachment (not myopic), posttraumatic incidents (denies), retinal tear, and inflammatory or infectious etiologies. These findings open a new realm of questioning.
Further investigation reveals that the family had just adopted a new kitten and that the patient had a scratch on the back of his left hand. Combining the history with the retinal (granuloma) and optic disc (swelling) components, a diagnosis of neuroretinitis is made. This is a paradigm wherein feature extraction (clinical findings) is combined with classification (differential diagnoses) to produce a diagnostic conclusion. In an AI protocol, this would be at the level known as classic machine learning.3 If the fundus image were analyzed pixel by pixel, the observed clinical findings were interpreted along with the history, and a diagnostic conclusion were reached, it would be at the level known as deep learning.3 This level of clinical analysis is on the horizon and has been explored most widely in diabetic retinopathy.
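In code, classic machine learning is exactly that two-step pipeline: hand-extracted features go in, and a trained classifier maps them to a label. The sketch below is a deliberately toy illustration in Python with scikit-learn; the features, labels, and training rows are invented for this example and have no clinical validity.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical hand-extracted features (the "feature extraction" step):
# [optic_disc_swelling, retinal_granuloma, recent_animal_scratch], each 0 or 1.
X = [
    [1, 1, 1],  # disc swelling + granuloma + kitten scratch
    [1, 0, 0],  # disc swelling alone
    [0, 0, 0],  # unremarkable examination
]
y = ["neuroretinitis", "other optic neuropathy", "no disease"]

# The "classification" step: a decision tree trained on the toy data.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict([[1, 1, 1]]))  # ['neuroretinitis']
```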
The next phase for automated analysis of imaging is known as wayfinding AI.4 For example, when a patient presents with symptoms and signs suggesting central serous chorioretinopathy, optical coherence tomography (OCT) would be ordered in addition to standard color fundus photography. A wayfinding algorithm would analyze a volumetric OCT scan layer by layer to assess for subtle irregularities that may be overlooked when manually interpreting the cross-sectional line scans.
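Such a wayfinding routine could be as simple as sweeping through the volume one B-scan at a time and flagging slices that deviate from the rest for human review. The following Python sketch is my own simplified illustration; the array layout, the statistic, and the threshold are assumptions, not a description of any commercial OCT platform.

```python
import numpy as np

def flag_suspicious_bscans(volume: np.ndarray, z_threshold: float = 2.5) -> list[int]:
    """Sweep an OCT stack slice by slice and flag outlying B-scans.

    `volume` is assumed to have shape (n_bscans, height, width). A slice whose
    mean signal deviates from the volume-wide mean of slice means by more than
    `z_threshold` standard deviations is flagged for human review.
    """
    slice_means = volume.reshape(volume.shape[0], -1).mean(axis=1)
    z_scores = (slice_means - slice_means.mean()) / (slice_means.std() + 1e-9)
    return [i for i, z in enumerate(z_scores) if abs(z) > z_threshold]

# Example: 64 small synthetic B-scans with one artificially brightened slice.
rng = np.random.default_rng(0)
volume = rng.normal(size=(64, 96, 128))
volume[37] += 1.0  # simulate a focal abnormality on one slice
print(flag_suspicious_bscans(volume))  # [37]
```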
A level of algorithm beyond machine and deep learning is the convolutional neural network, which is based on redundancies, that is, recurring patterns in the data. Think about what comes up when you search a topic online: the algorithm remembers previous searches and suggests new or additional items of potential interest. This happens to us every day.
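For readers curious what such a network looks like in practice, the sketch below defines a miniature convolutional neural network in Python with PyTorch. It is purely illustrative; the layer sizes, the five output classes, and the name TinyFundusCNN are choices made for this example. The essential idea is that the same small filters slide across the entire image, so a recurring local feature (a microaneurysm, an exudate) is detected wherever it appears.

```python
import torch
from torch import nn

class TinyFundusCNN(nn.Module):
    """A minimal, illustrative convolutional network for image classification."""

    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = TinyFundusCNN()
logits = model(torch.randn(1, 3, 224, 224))  # one dummy RGB image
print(logits.shape)                          # torch.Size([1, 5])
```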
Diabetic retinopathy (DR) has been the most studied among retinal disorders. There are several contributing reasons. DR is among the leading causes of vision loss; in fact, according to the Centers for Disease Control and Prevention, it is the leading cause of blindness among those of working age.5 Additionally, the prevalence of DR makes it a fertile area for research. If we take a step back and think of the classification scheme generated from the findings of the Early Treatment Diabetic Retinopathy Study (ETDRS), a few standard photographs formed the basis for classifying the stages of nonproliferative diabetic retinopathy and distinguishing them from the proliferative stages.6,7 We all became familiar with the 4-2-1 rule.6
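The 4-2-1 rule is a reminder of how mechanical such staging criteria are, and therefore how amenable they are to encoding. A minimal sketch follows; the quadrant counts are the inputs, and the published ETDRS definitions remain the authority for how each finding is actually graded.

```python
def meets_421_rule(hemorrhage_quadrants: int,
                   venous_beading_quadrants: int,
                   irma_quadrants: int) -> bool:
    """Encode the 4-2-1 mnemonic for severe nonproliferative DR (illustrative).

    Severe NPDR if any of the following is present:
      - severe hemorrhages/microaneurysms in all 4 quadrants, or
      - venous beading in 2 or more quadrants, or
      - prominent IRMA in 1 or more quadrant.
    """
    return (hemorrhage_quadrants >= 4
            or venous_beading_quadrants >= 2
            or irma_quadrants >= 1)

print(meets_421_rule(4, 0, 0))  # True
print(meets_421_rule(2, 1, 0))  # False
```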
The Diabetic Retinopathy Severity Scale (DRSS) evolved from the ETDRS scheme and allows a more exact and precise specification of the level of retinopathy.8 This algorithm uses a continuous scale of levels from 10 to 90 with a yes/no decision-making tree. Although its staging system is similar to that of the ETDRS, it is much more detailed regarding such items as the number, location, and significance of the vasculopathic changes of DR. The importance of such specificity was emphasized when the scale was used to demonstrate improvement in both fundus appearance (level score) and visual acuity in the RISE and RIDE trials.9
With this specificity and continuous scaling, such a classification scheme could form the basis for automated image analysis. The next plateau is for automated AI to surpass human cortical decision-making and deliver a perfect diagnosis at each encounter, which is a tall task. Convolutional neural networks have this capability, and Scientific American has declared that the paradigm shift to AI has become irreversible.10 Availability of a system that could accurately stage nonproliferative DR without clinician input would be invaluable from the standpoints of convenience, patient care, and consistency. In Figure 3, the patient’s nonproliferative DR worsened over 13.5 months from moderate to moderate-to-severe DR (ETDRS), or from level 43 to level 53 (DRSS), a more exact specification.
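Because the DRSS is an ordered scale, the output of an automated grader can be tracked numerically from visit to visit. The sketch below uses an illustrative subset of the ordered levels (the published scale remains the authority for the full list and definitions) to express the change in Figure 3 as a step count.

```python
# Illustrative subset of ordered DRSS levels; not the complete published scale.
DRSS_LEVELS = [10, 20, 35, 43, 47, 53, 61, 65, 71, 75, 81, 85]

def step_change(baseline_level: int, followup_level: int) -> int:
    """Number of DRSS steps between two graded visits
    (positive = worsening, negative = improvement)."""
    return DRSS_LEVELS.index(followup_level) - DRSS_LEVELS.index(baseline_level)

print(step_change(43, 53))  # 2 -- a two-step worsening over 13.5 months
```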
Just as new terminology has specified center-involving diabetic macular edema to replace the clinically significant macular edema designation from the ETDRS,11,12 teleophthalmology continues to evolve. We can expect that patients will self-image their fundi, perhaps with OCT angiography, and that the images will be transmitted to a reading center or an eye care provider for interpretation and decision-making. The advance of technology and the disruption of the pandemic have intersected to offer interesting innovations.13,14 These forces will drive convenient, safe, effective, and equitably applied eye care for preserving visual function among patients with DR.15,16
The global pandemic has hastened the use of virtual visits. For example, at the beginning of 2020, the Mayo Clinic in Jacksonville, Florida, set a strategic benchmark of having 30% of eligible visits conducted online by 2030. By the beginning of 2022, 60% of eligible visits were being conducted online (Klaas J. Wierenga, MD, personal communication, March 8, 2022).
Screening and virtual ophthalmic evaluations have been deployed in both the public space17 and the clinical space. Although prototypes may have had a rocky start, patients at greatest risk for disease can now be separated from those who are distinctly disease free.18 Interestingly, in this algorithm, eyes with macular edema were grouped with those at greatest risk for vision loss or in need of referral for surgical intervention. One of my local big box pharmacies offers diabetic eye examinations for those who qualify (Figure 4). The inclusion criteria for fundus imaging include exceeding a threshold for random blood glucose or HbA1c, or being treated for type 2 diabetes.
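Those inclusion criteria are themselves a small algorithm. Below is a sketch of the triage logic; the numeric cutoffs are placeholders of my own choosing, since the program's actual thresholds are not specified here.

```python
def eligible_for_fundus_imaging(random_glucose_mg_dl: float,
                                hba1c_percent: float,
                                treated_for_type2_diabetes: bool,
                                glucose_cutoff: float = 200.0,      # placeholder cutoff
                                hba1c_cutoff: float = 6.5) -> bool:  # placeholder cutoff
    """Triage sketch: offer fundus imaging if any criterion is met."""
    return (random_glucose_mg_dl > glucose_cutoff
            or hba1c_percent > hba1c_cutoff
            or treated_for_type2_diabetes)

print(eligible_for_fundus_imaging(148, 7.1, False))  # True (HbA1c above the placeholder cutoff)
```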
AI is all around us, and it is here to stay in health care. It has made inroads in the ophthalmic space, and those inroads will continue to expand. AI will be transformative for medicine by allowing automated analysis of telemetrically gathered data. This will be of great advantage in all clinical settings and will eventually lead to the no-touch patient examination. Bad algorithms will be weeded out, and nearly seamless diagnostic conclusions will result.