How Brain Data Can Become a Biomarker for Depression 🧠

Using CNNs to analyze EEG data for accurate depression diagnosis.

8 min read · Jan 31, 2022

There isn’t a definitive, quantitative way to diagnose many mental health disorders. As a result, misdiagnosis of depression is a lot more prevalent than many people think. In fact, one study showed that 65.9% of major depressive disorder (MDD) cases were misdiagnosed.

More than half of the diagnoses were incorrect.

These people could have been given unnecessary medication, improper treatments and a lot more, all of which could have been avoided if there were a definitive method of diagnosis.

Many diseases with high diagnostic accuracy have established biomarkers (like blood pressure, genetic tests and heart rate) that can clearly indicate whether a person has a specific disease or not.

Since depression is psychological, finding a biomarker seemed out of the question. But we can actually derive data right from the brain.

Brain biomarkers.

Being Sad Isn’t All MDD Is.

Everyone feels sad at times, and for the most part, we can snap out of it. It’s not so simple for people with major depressive disorder (MDD).

Extended periods of unexplained sadness can be an indicator of major depressive disorder. It affects how someone feels, thinks and behaves, and can lead to many mental and physical problems.

It’s a real disorder that goes beyond just feeling sad. MDD takes hold of a person’s mind and traps them in a pit of misery that seems to be impossible to get out of.

It seems impossible to get out, but it’s not. Although MDD can’t be completely cured with medicine, receiving treatment earlier helps immensely. And this is only possible if an accurate diagnosis is received earlier.

Early recognition and treatment are crucial, as duration of untreated depression correlates with worse outcomes.
- Nature

Right now, diagnosis basically comes down to a checklist of symptoms and a look at family history. That’s probably why so many people self-diagnose using random quizzes online: it’s hard to accurately pinpoint the disorder since blood tests or other typical lab testing don’t really help.

Well, if it’s a neurological disorder, shouldn’t there be some way to just map what’s going on in someone’s brain and see if things are looking pretty dark?

Yeah, there is: an individual’s EEG image data can actually serve as a biomarker to diagnose MDD with the help of CNNs.

Getting Together the Brain (EEG) Data

The belief that mental illnesses aren’t real or that people are just “dramatizing emotions” can be easily proven false with actual science behind the causes.

The most prevalent neurological difference between depressed people and those without depression is the concentration of receptors in the synapse and quantity of neurotransmitter release. They are both much lower for people with MDD.

Now that we have some sort of quantitative difference, neuroimaging tools can measure it. EEGs (electroencephalograms) have proven to be the ideal tool for deep learning (DL) applications in the field of brain-computer interfaces (BCIs), since they are non-invasive and accurate.

This method of diagnosis has been applied to a number of other diseases, including Alzheimer’s and Parkinson’s, and has proven to be effective, although the symptoms of those diseases are also much more easily recognizable.

Let’s break down this tech-heavy diagnosis process. 🔨

Collecting the Data — Acquisition & Processing

You know those headsets you see in the news sometimes where someone is controlling a robotic arm with their brain? Well, that’s an EEG (not 🥚).

The EEG has a few key structural aspects: the neuro-headset, electrodes (the different pads placed on the user’s head) and mesh matrix (represents placement of electrodes on an electrode cap).

Brainwaves are recorded from each of these electrodes and then amplified so that the data can be processed by a computer. The result looks like the squiggly lines we are used to seeing.

Mapping It

The next step is taking this time series and converting it to the grayscale format we need to derive MDD patterns from. 🔍

The geometry (angle, spacing, length) of the EEG signal represents different brain regions.
These are matched with the correct electrode placements.
The electrode measurements are converted to 2D image data (grayscale image).
This data is then stored as a topographic map that represents activity at a specific frequency band at one time slice.
This is repeated for each significant frequency band.
Finally, the maps are fed into a convolutional neural network (CNN) and a grayscale image representing a raw EEG data sample is produced.
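To make the mapping steps a bit more concrete, here is a minimal sketch of going from per-electrode signals to one grayscale topographic map for a single frequency band. Everything specific in it is an assumption for illustration: the four electrode positions, the 128 Hz sampling rate, the 8–13 Hz alpha band, and the inverse-distance interpolation (the steps above don’t say which interpolation is used). The helper names `band_power` and `topographic_map` are made up too.

```python
import numpy as np

def band_power(signal, fs, low_hz, high_hz):
    """Mean spectral power of one electrode's signal inside [low_hz, high_hz)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    mask = (freqs >= low_hz) & (freqs < high_hz)
    return power[mask].mean()

def topographic_map(positions, values, size=32, eps=1e-6):
    """Interpolate per-electrode values onto a size x size grayscale image.

    positions: (n, 2) electrode coordinates in [0, 1] x [0, 1] (hypothetical layout)
    values:    (n,) one number per electrode, e.g. band power
    """
    axis = np.linspace(0.0, 1.0, size)
    gx, gy = np.meshgrid(axis, axis)                    # pixel coordinates
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (size*size, 2)
    # Distance from every pixel to every electrode.
    dist = np.linalg.norm(grid[:, None, :] - positions[None, :, :], axis=2)
    weights = 1.0 / (dist ** 2 + eps)                   # closer electrodes count more
    interp = (weights * values).sum(axis=1) / weights.sum(axis=1)
    # Rescale to 0..255 so the map can be stored as a grayscale image.
    span = interp.max() - interp.min()
    scaled = (interp - interp.min()) / (span if span > 0 else 1.0)
    return (scaled * 255).astype(np.uint8).reshape(size, size)

# Toy data: 4 electrodes, 2 seconds at 128 Hz, each carrying a 10 Hz wave
fs = 128
t = np.arange(2 * fs) / fs
amplitudes = np.array([1.0, 2.0, 3.0, 4.0])
signals = amplitudes[:, None] * np.sin(2 * np.pi * 10 * t)
positions = np.array([[0.2, 0.2], [0.8, 0.2], [0.2, 0.8], [0.8, 0.8]])

alpha = np.array([band_power(s, fs, 8, 13) for s in signals])
image = topographic_map(positions, alpha)   # one map for one band and time slice
```

Repeating the last two lines with different band edges would give one map per frequency band. A real pipeline would likely use a proper electrode montage and an EEG library rather than this hand-rolled version.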

Raw EEG data for someone without (left) and with (right) MDD. (Source)

TL;DR: We matched the squiggly lines to their electrodes, made topographic maps and then a CNN generated grayscale images for the data.

Interpreting the Grayscale Images

Rather than using a processed biomarker in which data patterns are distinctly laid out, DL provides another perspective that focuses on the structure of raw data.

But first, a quick crash course on DL.

Deep learning is a subset of machine learning, which is a subset of artificial intelligence.

It mimics the way in which humans accumulate certain types of knowledge, so its primary application is collecting, analyzing and interpreting large amounts of data.

Perfect for considering each and every pixel on a grayscale image, isn’t it?

Traditional machine learning algorithms typically work in a linear fashion. Deep learning, on the other hand, is a hierarchy of increasing complexity and abstraction. Each new level of the hierarchy is a new layer.

To put it more simply, imagine one of your first words was “bird.” You probably learned what a bird was by recognizing characteristics like flying, feathers and beaks by observing repeated instances of your parents pointing out a bird outside or Dora the Explorer giving you a high five after you guessed where it was. Through this, you conceptualized what a bird was.

Unknowingly, you developed a hierarchy with increasing complexity starting from if the “thing” is flying all the way to even recognizing different colours to pick out a blue jay specifically. 🐦

This is pretty much exactly what deep learning does. Every algorithm in the hierarchy applies a nonlinear transformation to its input (image of bird, dog, pillow etc.) and uses what it learns to generate a statistical model as output. The algorithms keep learning and run through iterations until the output is accurate (the DL graph looks like the dataset’s graph).
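Here is a rough sketch of that hierarchy as code: a stack of layers, each applying a nonlinear transformation (ReLU here) to the output of the one before it. The layer sizes and random weights are placeholders I picked for illustration, and nothing is trained. It only shows how data flows through the levels.

```python
import numpy as np

def relu(x):
    """Nonlinearity: keep positive values, zero out the rest."""
    return np.maximum(0.0, x)

def forward(x, layers):
    """Pass x through each (weights, bias) layer and keep every intermediate output."""
    activations = [x]
    for weights, bias in layers:
        x = relu(weights @ x + bias)
        activations.append(x)
    return activations

rng = np.random.default_rng(0)
sizes = [16, 8, 4, 2]   # hypothetical: 16 pixel inputs narrowing down to 2 outputs
layers = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

pixels = rng.random(16)          # stand-in for a tiny flattened grayscale image
acts = forward(pixels, layers)   # acts[0] is the input, acts[-1] the final output
```

Training would then adjust the weights over many iterations so the final output matches the labels. Real networks also usually swap the last ReLU for something like a softmax.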

TL;DR: Deep learning has a hierarchy with increasingly complex layers that contribute to an end model which can then be validated and trained to be accurate.

Which algorithm exactly?

For this application, the ideal algorithm is a convolutional neural network (CNN) because it can take in an image as input, identify important aspects/patterns in the image and differentiate one image from another.

Basically, it’s great for image classification, which is what we need to analyze the grayscale image data.
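To show what “identifying patterns on the image” means at the lowest level, here is a small sketch of the convolution operation a CNN layer is built on. The 6x6 toy image and the hand-picked vertical-edge kernel are assumptions for illustration. A trained CNN learns its own kernels instead of having them written by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel over the image and sum the elementwise products.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 grayscale image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

response = conv2d(image, kernel)
print(response[0])   # [0. 3. 3. 0.]: strongest where the edge sits
```

The response is zero over the flat regions and peaks around the boundary, which is the kind of local pattern later layers combine into bigger features.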

Let’s look into a specific CNN structure that would work well for this application.

ResNet-50

The ResNet-50 framework is a deep network (50 layers) that functions as a feature extraction architecture for classification, regression and semantic segmentation (assigning a label to every pixel in the image).

It works well because residual networks in general have an advantage in combatting the vanishing gradient problem and reducing error in the deeper layers. This is done through identity mapping: a deeper model can be built by copying the layers of a shallower model and making the added layers identity mappings. So, the training error of the deeper model should be no higher than that of the shallower one.

For this use case, Q(x) is considered the mapping and consists of a few stacked layers with x referring to the first input set of these layers. The resulting residual function is R(x) = Q(x) - x.
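In code, the relationship between Q(x), R(x) and x could look roughly like the sketch below. The two-layer residual branch, the 3-element input and the weight values are all assumptions for illustration, not the actual ResNet-50 layers. Real ResNet blocks use convolutions and batch normalization, plus a ReLU after the addition.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_branch(x, w1, w2):
    """R(x): the few stacked layers that only have to learn the residual."""
    return w2 @ relu(w1 @ x)

def residual_block(x, w1, w2):
    """Q(x) = R(x) + x: the skip connection adds the input back in."""
    return residual_branch(x, w1, w2) + x

x = np.array([1.0, -2.0, 3.0])

# With all-zero weights R(x) = 0, so the block reduces to the identity mapping.
zeros = np.zeros((3, 3))
identity_out = residual_block(x, zeros, zeros)

# With non-zero weights the block outputs x plus a learned correction.
rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
q = residual_block(x, w1, w2)
```

The zero-weight case is the interesting one: a block that has learned nothing still passes its input through unchanged, so adding it to a network shouldn’t hurt.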

In plain networks, this problem persists since, without identity connections, information about the original state of the image is lost. The input to the next layer no longer contains the exact original data. Instead, it has been altered by the preceding layers and the ReLU function.
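A toy calculation can show why a skip connection helps gradients survive a deep stack. This is a scalar caricature, not ResNet itself: it assumes every layer has the same made-up local derivative (0.05) and simply multiplies the per-layer terms from the chain rule.

```python
def gradient_through_stack(local_derivative, depth, skip):
    """Chain-rule product of per-layer derivatives for a scalar toy network.

    Plain layer:    h -> f(h)      contributes f'(h)
    Residual layer: h -> h + f(h)  contributes 1 + f'(h)
    """
    per_layer = (1.0 + local_derivative) if skip else local_derivative
    return per_layer ** depth

plain = gradient_through_stack(0.05, depth=50, skip=False)
residual = gradient_through_stack(0.05, depth=50, skip=True)
print(f"plain: {plain:.3e}, residual: {residual:.3e}")
```

In the plain stack the gradient shrinks toward zero as depth grows, while the “+1” from the identity path keeps it from collapsing. Real networks are messier, but the same identity term is what carries the signal back to the early layers.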

Overall, ResNet-50 promises quicker training, increased learning rate and much more.

Side note:
• The gradient is essentially a derivative, computed in this case as a finite difference.
• Other ResNet structures exist (101, 152, etc.). However, 50 layers works best when the data is easily distinguishable, and it is, if you look back at the grayscale images.
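For the first side note, here is a tiny sketch of what a finite-difference gradient looks like next to the exact derivative. The quadratic `loss` is a made-up stand-in, not the loss of a real network. In practice, deep learning frameworks typically get gradients through backpropagation, and finite differences are mostly used as a sanity check.

```python
def loss(w):
    """Toy loss with a known derivative: d/dw (w - 3)^2 = 2 * (w - 3)."""
    return (w - 3.0) ** 2

def finite_difference(f, w, h=1e-5):
    """Central-difference estimate of df/dw."""
    return (f(w + h) - f(w - h)) / (2.0 * h)

w = 1.0
approx = finite_difference(loss, w)
exact = 2.0 * (w - 3.0)   # -4.0
print(approx, exact)
```

The two values agree to within floating-point error, which is the whole idea: nudge the weight a little in each direction and see how much the loss changes.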

Next steps would include training, testing and validating the model. The implementation of the technology is relatively simple; however, its accuracy and execution are not yet up to speed.
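As a starting point for that step, here is a minimal sketch of splitting a dataset into training, validation and test sets. Splitting by subject (so one person’s recordings never land in two sets) is my assumption about a sensible setup, not something the studies above prescribe, and the 70/15/15 ratio and the `split_subjects` name are placeholders.

```python
import random

def split_subjects(subject_ids, train=0.7, val=0.15, seed=0):
    """Shuffle subject IDs and split them into train/validation/test lists."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)   # seeded so the split is reproducible
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * val)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# Hypothetical dataset with 20 subjects numbered 0..19
train_ids, val_ids, test_ids = split_subjects(range(20))
```

The model would be fit on the training subjects and tuned on the validation subjects, with the test subjects held back until the very end to measure accuracy.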

We can get there though.

Accurate MDD Diagnosis Can Save Lives

Imagine someone gets diagnosed with depression even if they don’t have it. They could accidentally overdose and become reliant on a drug they didn’t need.

Now, imagine someone with depression wasn’t diagnosed with it properly. Their depression could get worse to a point where they can’t recover.

But you don’t really have to imagine it, because it is a reality.

In fact, as many as 30–50% of women diagnosed with depression were misdiagnosed. This and diagnostic delays can cause unnecessary deaths.

To prevent this, a more definitive way to diagnose depression where a doctor can’t unintentionally inject bias or overlook patient history is crucial.

Subjectivity isn’t good enough. CNNs and EEG data will ensure objectivity.

— —

Let’s Stay in Touch 💬

I am working towards applying this concept so look out for my next article where I will replicate a CNN to classify EEG image data for MDD diagnosis.

If you want to be notified or join my headquarters, subscribe to my newsletter where I talk about key lessons, post opportunities, knowledge bombs, progress updates and mental health! Or, follow me on Twitter/Medium.

I can’t wait for you to read what’s in store! 📑
