AI and the transformation of the medical world

Medical imaging is the process of capturing the structure of an internal organ or tissue. These images assist medical staff with the diagnosis, treatment, and monitoring of patients, and can also help avoid unnecessary invasive procedures.

The global AI healthcare market is expected to grow from 4.9 billion USD in 2020 to 45.2 billion USD by 2026. This rapid growth rate can be explained by the many advantages AI has to offer.

One of the main advantages is AI’s ability to process large amounts of data faster than a human can. For example, a pathologist may need up to 30 hours to examine a 10 GB tissue slice, whereas AI can process the same amount of information in seconds. In addition, in tedious tasks of this kind, AI can be more accurate and less prone to errors. These are just some of the reasons AI should be used as a first screening and diagnostic tool.

AI platforms can also detect abnormalities at an early stage, which in turn enables early treatment and recovery. One growing use of AI is the development of non-invasive radiometric biomarkers. Such biomarkers enable us to measure and quantify organs and lesions automatically and compare them to an existing database of normal values. This allows physicians to accurately and reproducibly monitor the progression of a condition or the response to a given treatment and to avoid, in some cases, the need for an invasive procedure like a biopsy.
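The comparison against a normal database can be sketched in a few lines. This is only an illustration under stated assumptions: the organ volumes are made-up numbers, the function names are hypothetical, and a z-score with a cutoff of 2 is one simple way to express "how far from normal", not necessarily what any clinical product uses.

```python
from statistics import mean, stdev

def z_score(measurement, normal_values):
    """Return how many standard deviations a measurement lies from the normal mean."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation of the normal group
    return (measurement - mu) / sigma

def flag_abnormal(measurement, normal_values, threshold=2.0):
    """Flag a measurement whose |z| exceeds the threshold (an assumed cutoff)."""
    return abs(z_score(measurement, normal_values)) > threshold

# Hypothetical normal database: organ volumes in millilitres (illustrative only).
normal_volumes_ml = [48.0, 50.0, 52.0, 49.0, 51.0]

print(flag_abnormal(51.0, normal_volumes_ml))  # within the normal range
print(flag_abnormal(56.0, normal_volumes_ml))  # far above the normal range
```

Because the measurement is automatic and the reference values are fixed, repeating the comparison at each follow-up scan gives a reproducible number to track over time.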

Another great advantage is the ability to incorporate AI at early stages of image acquisition. Among other things, this makes it possible to reduce the radiation dose needed to acquire a high-resolution CT or to shorten the duration of an MRI scan. This improves patient welfare and reduces healthcare costs.

AI applications

In recent years there has been tremendous work in this field, mainly focusing on cardiology, ophthalmology, neurology, and cancer detection.

In 2016 Google showed higher success in identifying diabetic retinopathy (DR) than a group of 7-8 U.S. board-certified ophthalmologists. Diabetic retinopathy is the fastest growing cause of blindness, with almost half a billion diabetic patients at risk. Left untreated, it can lead to irreversible blindness. In their research, Google succeeded in training a deep convolutional neural network (DCNN) to detect referable diabetic retinopathy (RDR), defined as moderate or worse DR.

Above: Examples of retinal fundus photographs that are taken to screen for DR. The image on the left is of a healthy retina (A), whereas the image on the right is a retina with referable diabetic retinopathy (B) due to a number of hemorrhages (red spots) present.

Cancer detection and monitoring is another important application of AI in medical imaging. A pathological report is crucial for an accurate diagnosis and the successful treatment of cancer. The process of examining thousands of 10-megapixel (MP) photos is both time-consuming and prone to errors. Liu and Gadepalli addressed this issue by training a CNN based on Inception V3 to detect tumors at the lesion level. After training several models at multiple scales (similar to the way pathologists examine a tissue), the models were able to either match or exceed the performance of a pathologist.

Above: Left: A patch from an H&E-stained slide. The tumor cells are a lighter purple than the surrounding cells. A variety of artifacts are visible: the dark continuous region in the top left quadrant is an air bubble, and the white parallel streaks in the tumor and adjacent tissue are cutting artifacts. Furthermore, the tissue is hemorrhagic, necrotic and poorly processed, leading to color alterations to the typical pink and purple of an H&E slide. Right: the corresponding predicted heatmap that accurately identifies the tumor cells while ignoring the various artifacts, including lymphocytes and the cutting artifacts running through the tumor tissue.

AI challenges

With such great achievements, can AI solve any problem at hand? What are the pitfalls?

The main challenge AI faces, especially in the healthcare domain, is the amount of data available for training and testing. Privacy and legal constraints make it difficult to share data. Moreover, annotating data is a laborious task that often requires a specialist. The natural imbalance of the available data is another issue. In some cases, most of the data comes from healthy subjects and does not include the rare conditions that need to be detected or monitored. In other cases the situation is reversed, and the data mostly comes from ill subjects.
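To make the imbalance concrete, here is a minimal sketch of one common remedy that the article does not discuss: weighting each class inversely to its frequency so that rare cases count more during training. The label names and the 90/10 split are invented for illustration.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical dataset: 90 scans of healthy subjects, 10 scans with a lesion.
labels = ["healthy"] * 90 + ["lesion"] * 10
weights = inverse_frequency_weights(labels)

print(weights)  # the rare "lesion" class receives the larger weight
```

With these weights, a misclassified lesion scan would contribute nine times as much to a weighted loss as a misclassified healthy scan, which offsets the 9:1 ratio in the data.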

This challenge has been addressed in several ways. The first is to use unsupervised deep learning methods, which allow training with unlabeled data and do not require specialist annotation. The second is to generate additional synthetic data with generative adversarial networks (GANs). A separate effort aims to improve the legal and regulatory framework for patient data so that a shared, anonymized database can be established. The most common approach today, however, is transfer learning, which exploits the knowledge already held by a previously trained model. Training essentially fine-tunes the previously trained model on the medical task at hand using only a small amount of data. It has been shown that previously trained models retain low-level information that is also required in the medical imaging task.
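The idea behind transfer learning can be sketched without any deep learning framework. In this toy example, which rests entirely on assumptions, `pretrained_features` is a hypothetical stand-in for a frozen pretrained network (it computes two fixed low-level features), and only a small logistic-regression head is trained on four tiny made-up images. This is the simplest variant, where the earlier layers stay frozen; real fine-tuning may also update them.

```python
import math

def pretrained_features(image):
    """Stand-in for a frozen pretrained backbone (hypothetical).

    Returns two fixed low-level features: mean intensity and mean absolute
    horizontal gradient. Nothing here is updated during training.
    """
    pixels = [p for row in image for p in row]
    grads = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    return [sum(pixels) / len(pixels), sum(grads) / len(grads)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(images, labels, epochs=500, lr=0.5):
    """Train only a small logistic-regression head on top of the frozen features."""
    feats = [pretrained_features(img) for img in images]
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y  # gradient of log loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(image, w, b):
    x = pretrained_features(image)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5

# Four tiny made-up "scans": label 0 = smooth tissue, label 1 = textured tissue.
train_images = [
    [[0.2, 0.2], [0.2, 0.2]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.1, 0.9], [0.8, 0.2]],
]
train_labels = [0, 0, 1, 1]
w, b = train_head(train_images, train_labels)

print(predict([[0.3, 0.3], [0.3, 0.3]], w, b))  # unseen smooth image
print(predict([[0.0, 1.0], [0.0, 1.0]], w, b))  # unseen textured image
```

Only three numbers (two weights and a bias) are learned here, which is why a handful of examples is enough; the frozen features do the rest of the work.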

Looking for a four-leaf clover

In the past decade there has been rapid growth in the number of published papers related to the use of AI in medical imaging, with hundreds published each year. With so much knowledge and so many models to choose from, which approach should we consider for our task (classification, regression, segmentation, image enhancement)? Which model should we choose (FCN, DCNN, RNN, GAN)?

Above: Some of the deep networks commonly used today in the medical imaging domain and their applications.


In one of Surgical Theater's projects, the task was to identify and separate vertebrae in a CT scan for later diagnostic purposes. What approach best fits this task? Usually the choice depends on the nature of the data and the task at hand. In this case, a 3D segmentation model is the natural choice since it can exploit the 3D information that resides in the CT scan. The model needs to decide for each voxel (the 3D equivalent of a pixel, spanning all slices of the scan) whether it is part of the background or part of the foreground. 3D U-Net was chosen as the best-fit architecture due to its multi-scale learning ability.
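The per-voxel decision can be illustrated with a small sketch. It assumes the segmentation network has already produced a foreground probability for every voxel; the 2x2x2 volume, the 0.5 threshold, and the reference mask are all made up. The Dice overlap used to compare the result with a reference is a common segmentation metric, although the article does not say which metric the project used.

```python
def threshold_volume(prob_volume, threshold=0.5):
    """Label every voxel as foreground (1) or background (0)."""
    return [[[1 if p >= threshold else 0 for p in row] for row in slice_]
            for slice_ in prob_volume]

def dice(mask_a, mask_b):
    """Dice overlap between two binary volumes of identical shape."""
    flat_a = [v for slice_ in mask_a for row in slice_ for v in row]
    flat_b = [v for slice_ in mask_b for row in slice_ for v in row]
    intersection = sum(a * b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical network output: 2 slices x 2 rows x 2 columns of probabilities.
probabilities = [
    [[0.9, 0.2], [0.7, 0.1]],
    [[0.8, 0.4], [0.3, 0.6]],
]
# Hypothetical specialist annotation for the same volume.
reference = [
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
]

predicted = threshold_volume(probabilities)
print(predicted)
print(dice(predicted, reference))
```

Here the prediction marks four voxels as foreground and the reference marks three, all of which overlap, so the Dice score is 2 * 3 / (4 + 3), or 6/7.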

Above: An example of our segmentation results

On a different project, with Cathworks, the goal was to identify and track a specific vessel in a coronary angiogram. This ability can later help physicians treat a variety of cardiovascular conditions.

To track the vessel, a regression approach was taken. To incorporate both global and local image information, a multi-stage cascade model with a coarse-to-fine scheme was found to work best. The first network detects coarse points on the vessel and is followed by a localized high-resolution network, which adjusts those points according to the vessel’s local neighborhood in the higher-resolution image. The right combination of network architectures depends on the specific problem and the data available. When the data itself varies widely, one simple network might not do the trick. In that case, one can consider training several different networks, one for each section of the data. When run time is crucial, it is recommended to run the networks in parallel.
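The coarse-to-fine scheme can be sketched as follows. To keep the example self-contained, both networks are replaced by a plain search for the brightest pixel, which is an assumption made purely for illustration: the coarse stage searches a downsampled image, and the fine stage searches only a small window around the coarse estimate at full resolution. The image, the downsampling factor, and the window radius are all invented.

```python
def argmax_2d(image, rows, cols):
    """Return the (row, col) of the largest value within the given ranges."""
    best = None
    for r in rows:
        for c in cols:
            if best is None or image[r][c] > image[best[0]][best[1]]:
                best = (r, c)
    return best

def downsample(image, factor):
    """Average-pool the image by an integer factor."""
    n_rows, n_cols = len(image) // factor, len(image[0]) // factor
    return [[sum(image[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(n_cols)] for r in range(n_rows)]

def coarse_to_fine(image, factor=4, radius=3):
    """Locate a point coarsely at low resolution, then refine it locally."""
    low_res = downsample(image, factor)
    cr, cc = argmax_2d(low_res, range(len(low_res)), range(len(low_res[0])))
    # Map the coarse estimate back to the centre of its full-resolution block.
    r0, c0 = cr * factor + factor // 2, cc * factor + factor // 2
    rows = range(max(0, r0 - radius), min(len(image), r0 + radius + 1))
    cols = range(max(0, c0 - radius), min(len(image[0]), c0 + radius + 1))
    return (r0, c0), argmax_2d(image, rows, cols)

# Made-up 16x16 image with a small bright blob centred at row 9, column 6.
image = [[0.0] * 16 for _ in range(16)]
image[9][6] = 1.0
for r, c in [(8, 6), (10, 6), (9, 5), (9, 7)]:
    image[r][c] = 0.5

coarse_point, fine_point = coarse_to_fine(image)
print(coarse_point, fine_point)  # (10, 6) (9, 6)
```

The coarse stage sees the whole image cheaply but can only answer to within one block; the fine stage is precise but only looks at a small neighborhood. Combining them gives global context and local accuracy.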

4D Medical Imaging

Some medical data is 4D, combining spatial and temporal information. Examples include guided video fluoroscopy and cardiovascular recordings of a beating heart.

In such cases, what would be the optimal approach to pursue? Would a plain DCNN be enough? When additional information is present, it is recommended to integrate it into the AI model so that the model can learn as much as possible automatically. For example, as part of the efforts to track the beating cardiovascular vessels, it was found that an additional LSTM layer on top of the regression model was beneficial. The regression CNN model extracts features from every frame separately, and the LSTM (Long Short-Term Memory) learns the additional dependency between the frames. This resembles how a human can more easily tell that a person is hopping rather than running from a sequence of images (a video) than from a single frame at a time. The LSTM layers capture this temporal connection.
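A minimal sketch of this arrangement, under heavy assumptions: the per-frame CNN is replaced by a hypothetical `frame_feature` that just averages pixel intensities, the LSTM has a single hidden unit, and its weights are hand-picked rather than trained. The point is only to show that the output for the last frame depends on the frames that came before it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (hidden size 1)."""
    pre = {name: w_x * x + w_h * h_prev + bias
           for name, (w_x, w_h, bias) in params.items()}
    i = sigmoid(pre["i"])    # input gate
    f = sigmoid(pre["f"])    # forget gate
    o = sigmoid(pre["o"])    # output gate
    g = math.tanh(pre["g"])  # candidate cell value
    c = f * c_prev + i * g   # cell state carries information across frames
    h = o * math.tanh(c)
    return h, c

def frame_feature(frame):
    """Stand-in for the per-frame CNN (hypothetical): mean pixel intensity."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)

def run_sequence(frames, params):
    """Extract a feature from each frame, then pass the features through the LSTM."""
    h, c = 0.0, 0.0
    outputs = []
    for frame in frames:
        h, c = lstm_step(frame_feature(frame), h, c, params)
        outputs.append(h)
    return outputs

# Hand-picked (untrained) weights: (input weight, recurrent weight, bias) per gate.
params = {"i": (1.0, 0.0, 0.0), "f": (0.0, 0.0, 2.0),
          "o": (0.0, 0.0, 2.0), "g": (1.0, 0.0, 0.0)}

# Two made-up 1x1-pixel videos that end with the same frame but differ in history.
video_a = [[[0.0]], [[0.0]], [[1.0]]]
video_b = [[[1.0]], [[1.0]], [[1.0]]]

out_a = run_sequence(video_a, params)
out_b = run_sequence(video_b, params)
print(out_a[-1], out_b[-1])  # same last frame, different outputs
```

A model that looked at each frame in isolation would give the same answer for the final frame of both videos. The cell state is what lets the LSTM tell them apart.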

Above: Coarse and fine results of the regression LSTM network in a coronary angiogram

AI has become a steadily growing part of current and future healthcare platforms, and it is here to stay. There have been great efforts worldwide to establish large databases of medical images, in particular images that relate to the COVID-19 pandemic. Hopefully AI will soon contribute to the joint effort to overcome this global crisis.

To learn more about how Vision Elements can help you with your computer vision and AI projects, visit our website.

Hila Blecher Segev is a Computer Vision and AI Associate at Vision Elements.

