RoboDoc: using Machine Learning to detect brain tumors

Jose
Published in Analytics Vidhya
11 min read · Sep 5, 2020

AI and medicine applications

It is no news that AI has started to permeate virtually all aspects of modern life, often in unexpected ways. We all know about Netflix’s recommendation algorithm or the data mining Facebook performs on users’ preference data (one of many reasons I only visit my Facebook account about once a year). That’s all very exciting, but in my opinion the real value of Machine Learning (ML) and AI lies in finding applications that solve problems in specific domains. In many ways, I conceive of ML much as I conceive of computation and informatics in general: as a tool that is only as valuable as the problems it solves.

With that in mind, and always looking for opportunities to start an ML/Analytics project, a few weeks back I jumped at an enticing idea from one of my friends, who happens to be a medical doctor doing his residency in Radiology. My friend, whom I’ll call Doc from now on, is aware of my interest in AI and ML and asked me whether I knew of a way to develop an algorithm that would help him and his colleagues detect brain abnormalities in CAT scans and X-ray images of brain structures.

I obviously jumped head-first at the challenge, because it offered a number of interesting opportunities to expand my skills in ML and even a little bit of software engineering. First of all, it was the chance to implement a new algorithm, but it would also be the first time I’d be working with an image dataset, so I had to research extensively how to work with images and how to create a Machine Learning model from this data type. Within the span of a couple of weeks I was able to create a model prototype that uses a K-Nearest Neighbors (KNN) algorithm to classify images into basically two categories: Healthy and Tumor. The Healthy category corresponds to images that don’t feature tumors, and the Tumor category corresponds to images in which there is a presumptive malignant tumor in the brain.

Healthy brain image example (Source: https://radiopaedia.org/articles/neuroradiology-interpretation-curriculum?lang=us)
Tumor image example (Source: https://radiopaedia.org/articles/intracranial-tumours-summary?lang=us)

How does it work?

The premise of KNN is very simple. In broad terms, the idea is to classify a query into a defined category by comparing it to elements that are known to belong to that category. Depending on how close our query is to the known elements, we can decide whether it in fact belongs to the category we’re looking for.

KNN can be used with both categorical and numerical features, but it tends to work very well with numerical features because we can then reduce the comparison between the query and our known elements to a geometric distance calculation, and that is exactly what we’re going to do to figure out whether a brain scan shows a tumor. How exactly? Well, this is where the research part of this project came in: to figure out how image and object detection work, I studied computer vision algorithms and ultimately came across the Scale-Invariant Feature Transform (SIFT) algorithm.
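To make the distance idea concrete, here is a toy sketch (with made-up two-dimensional feature vectors, not the actual image data) of classifying a query by its nearest labeled neighbor:

```python
import math

def euclidean(a, b):
    # Straight-line distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_label(query, examples):
    # examples: list of (feature_vector, label); return the label
    # of the single closest known element (1-nearest neighbor)
    return min(examples, key=lambda ex: euclidean(query, ex[0]))[1]

training = [((0.0, 0.0), "healthy"), ((5.0, 5.0), "tumor")]
print(nearest_label((4.0, 4.5), training))  # -> tumor
```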

A thorough explanation of how SIFT works is beyond the scope of this article; however, there are tons of online resources to learn what’s under the hood. In fact, you can download the original paper and follow it (it isn’t particularly hard to understand), and in the references of this article I include the main sources that helped me build this application.

What’s important for this project is that SIFT and KNN are practically made to work with each other. Why? Because SIFT happens to extract for us exactly the data we need to apply a KNN algorithm, in the form of keypoints and their descriptors. In essence, SIFT extracts, from any given image, a set of ‘points of interest’ that show where sudden changes occur within the image’s pixels (things like corners, edges, changes of texture and color, etc.), and the beauty of it is that these keypoints are independent of the scale and rotation of the image (hence the Scale-Invariant part). It then generates a set of descriptors which numerically describe the magnitude and direction of change between the pixels at each identified keypoint. That’s important because it basically gives us a numerical model of the keypoints themselves.

I figured that tumors are basically anomalies in a brain scan (even someone without neuro-radiology training, such as myself, can look at an image and at least guess that a spot shouldn’t be there), and that a healthy scan and one with a tumor would have fundamentally different keypoints, whereas two images with tumors would likely share similar anomalies. So to detect a tumor in a brain scan this way, we need to:

  1. Create a training set with images both with and without tumors
  2. Identify keypoints and find descriptors for these images
  3. Identify keypoints and descriptors for a query image
  4. Match the query’s descriptors against the whole training set; the closest match then tells us which category our image falls into (Healthy or Tumor)

The result will then look something like this:

Image match, scan with tumor.

With that understood, let’s dive into the specifics.

The Dataset

Coming by medical data can be challenging, because numerous laws are in place to protect the privacy and medical history of patients. Nevertheless, I was lucky enough to find a public dataset on Kaggle featuring images of healthy and tumorous brains:

Alternatively, one can use radiopaedia.org and the OASIS project datasets to find further examples. The problem is, as usual, finding a set that is big enough to build a model that can actually predict with a degree of reliability. The Kaggle dataset is not particularly big, featuring under 150 images across both categories, but for the purposes of a binary classification it should be sufficient.

Developing the solution

The way I conceived the solution was as a Python application with a tkinter GUI. Many machine learning projects are shared and deployed as Jupyter notebooks, which are practical in that they become something akin to an interactive report, easily updated with new metrics and observations as the model changes.

While the Jupyter approach is perfectly fine, I wanted to build something that could actually be used by my friend and his colleagues, who are not programmers or analysts in any capacity and would benefit from being able to interact with the model easily; to do that, building a GUI was clearly a necessity. This ties back to my opening remarks, in which I laid out my belief that ML and AI are at their best when they can be brought to the people who can make the best use of them, in this case doctors and radiologists. To that end, I designed a simple GUI that can run the KNN model. As the title of this article may imply, I called my program RoboDoc V0.0.

RoboDoc, main screen

The functionality is simple: we load a query image (browsing the computer’s files), then we train the model (which here means generating a training and a test set; more on that in a minute), and then we simply match our image against the trained set and obtain a prediction.

For this model, training involves selecting training and test sets that maximize the model’s precision and recall, so that we can have an adequate level of confidence in the prediction. To do this, I randomly split the overall dataset into training and test sets, following the usual 70/30 rule (70% training, 30% test). Afterwards, the program matches each entry in the test set against the training set and computes the precision and recall rates.
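The split and the metrics can be sketched in a few lines. This mirrors the procedure described above rather than RoboDoc’s exact code, and treats “tumor” as the positive class:

```python
import random

def split_70_30(items, seed=None):
    # items: list of (image, label) pairs; shuffle, then cut at 70%
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def precision_recall(predictions, truths, positive="tumor"):
    # Precision: of the scans we flagged as tumors, how many really were?
    # Recall: of the real tumors, how many did we flag?
    tp = sum(p == positive and t == positive for p, t in zip(predictions, truths))
    fp = sum(p == positive and t != positive for p, t in zip(predictions, truths))
    fn = sum(p != positive and t == positive for p, t in zip(predictions, truths))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```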

After a number of initial runs, I noticed the model’s precision could range widely, between 60% and 90%, and each subsequent run could have different recall and precision rates. This is due to the way the training and test sets are distributed in each run: since they’re randomly generated, and the program doesn’t distinguish the instances it selects for the training and test sets, we can end up with a training set that is skewed towards one category, lowering the model’s precision. To get around this, I defined a target precision and made the program retrain the model repeatedly until that target is achieved. After several runs, I concluded that a precision of 83% can be achieved with relative ease, so that was the baseline target to exceed; the model’s precision ranges between 85% and 90% after training. There is a big problem with this approach, however: it greatly increases the model’s run time, because the precision does not converge to the target I set; the program just randomly tries different splits until it hits a combination that exceeds the target. Below you can see a printout of the precision and recall values for several runs.

Depending on the shuffle, training can finish in one run or take several more tries (sometimes going on for over 10 runs). This is clearly not optimal, so there’s room for improvement.

Finally, we just ask the program to match the query against our training set. The program returns the image from the training set that is the closest match to the query, along with that match’s category.

Best Match based on keypoint distance

We can load another image and ask for a prediction as well, with the same training.

This second image is part of the training set for this run, so naturally we get a perfect match. This serves to demonstrate that the program correctly identifies images and finds their best match; notice how the prediction changed from ‘tumor’ to ‘healthy’. While these results are preliminary, they serve as a proof of concept.

Discussion

As a proof of concept, RoboDoc’s results indicate that a KNN approach can work as a methodology for image recognition and feature extraction. It is interesting to note that this approach can achieve a level of precision comparable to that of a neural network implementation, judging by the notebooks submitted by Kaggle participants for this dataset (which fall within the 80%-95% precision range). A neural network implementation is certainly a direction worth exploring, and it could be added as an additional feature of RoboDoc.

One aspect largely overlooked in RoboDoc’s first implementation is the pre-processing of the source images prior to the keypoint and descriptor matching process. RoboDoc’s pre-processing is limited to resizing all images to the same, arbitrary size (600x600 pixels). While this guarantees that keypoint matching will work, it is decidedly insufficient to create a normalized image set in which all images are directly comparable and in which the features we want to compare (the tumor structures) are enhanced and normalized in each image. A feature-enhancing step would also improve the query match result, providing a closer match to the tumor structure in the query.

Currently, RoboDoc performs a crude binary classification in which it is able to distinguish an image with a tumor structure from a healthy scan. However, the specificity of the match is still severely lacking: even if the image is correctly classified as “tumor” or “healthy”, the best-matched image and the query don’t tend to visually align with each other. I believe this is due to several factors, most notably the absence of proper image pre-processing. In addition, Doc asked whether the program could identify specific tumors, lesions, and other abnormalities within a brain scan, and the answer is, well, sure it could.

If we extend the concept of the KNN binary classification to a multi-category classification, then the program would, in theory, be able to identify all manner of structures and brain pathologies, provided we have sufficient training data for each labeled category. This means dramatically expanding the training set, and it also means the model’s training time would increase drastically, so engineering solutions would have to be found to optimize run time.
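Conceptually, little changes in the multi-category case. Here is a toy sketch of k-NN majority voting over labeled feature vectors (the category names are purely illustrative; in practice the “distance” would come from SIFT descriptor matching as before):

```python
import math
from collections import Counter

def knn_predict(query, examples, k=3):
    # examples: list of (feature_vector, label), any number of categories.
    # Take the k nearest neighbors and let them vote on the label.
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

examples = [
    ((0.0, 0.0), "healthy"),
    ((0.0, 1.0), "healthy"),
    ((5.0, 5.0), "glioma"),      # illustrative category
    ((5.0, 6.0), "glioma"),
    ((9.0, 9.0), "meningioma"),  # illustrative category
]
print(knn_predict((5.0, 5.5), examples, k=3))  # -> glioma
```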

Final Thoughts

RoboDoc offers exciting possibilities, and I believe the program has a lot of potential for both improvement and expansion. Among the next steps of development I’d like to explore are:

  1. Expand the training dataset to be able to classify further brain pathologies
  2. Implement a neural network and compare its performance against the KNN algorithm (training and running times would be included in that evaluation)
  3. Implement further image pre-processing techniques to improve both the matching result and its specificity
  4. Finally, develop a web application that deploys the model and makes it usable by medical professionals

References

SIFT implementation

OpenCV

Tkinter development

Meier B., Python GUI Programming Cookbook, Second Edition, Packt Publishing, 2017

KNN Algorithm and Implementation

Kelleher J., Mac Namee B., D’Arcy A., Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015
