// 15 May, 2025

Actually, anecdotes are data

By John Kelly, EVP, Technology at Envelop Risk

When the data and the anecdotes disagree, the anecdotes are usually right.

// Jeff Bezos

An operating room spectator

I watched with a mixture of fascination and horror as the neurosurgeon cut into the skull. She removed a small piece of it and directed her team as they placed an electrode grid directly on the brain’s surface. I stood in the back, out of the way. I was the new AI and machine learning Ph.D. student, tasked with translating the electrodes’ raw neural signals into external control signals.

I had always wanted to work in a medical field, but being up close and personal with scalpels, needles, and internal organs wasn’t something I could stomach. My passion was analyzing the cold, hard numbers and creating new technology. But apparently, or so I was told, understanding where those numbers came from mattered, and I needed to appreciate everything involved in collecting them.

Yes, numbers do lie

I bundled up and made my escape, relieved as I rode my bike through the Pittsburgh winter away from bloody living tissues and back to the familiarity of silicon circuits. I laughed to myself with confidence. If the signal is there, the algorithms will find it! I opened the data files like a kid tearing into presents on Christmas morning, poring over thirty-two channels of raw neural activity.

I very much preferred biking through the snow over standing in the back of an operating room watching brain surgery

The neural signals were collected while the patient attempted to perform relatively mundane tasks on a computer screen, such as moving a ball to hit a target or controlling a character in a simple video game. The goal of our research was to develop systems that would enable people with spinal cord injury or severe neuromuscular disorders to communicate and directly control prosthetic limbs.

Mostly, the signals looked like noise – endless squiggly lines marching across the screen. But there were patterns, and it was my job to find them. There it is! A low-frequency spike appeared across multiple channels. It seemed too good to be true – it corresponded almost perfectly with when a target had appeared on the patient’s screen.

Our main BCI control interface

I quickly trained the control algorithms to recognize that signal. The results were great! I hadn’t needed to watch someone’s head get cut open or consider the electrode grid’s location on the brain. Numbers don’t lie! It was that breakthrough moment every researcher dreams about.

With a wry smile, I went to share my findings with others in the lab. Almost immediately, instinctively, some without even looking at the data, they explained to me that those signals were from muscular activity – a tense forehead, jaw movement, or eye blinks.

My face turned red as I fled the room. Our goal was to provide direct neural control of assistive devices, decoupling the control from other movements and helping people who lack muscular control. What had seemed like perfect signals were immediately dismissed by domain experts. It was obvious to anyone who had seen the phenomenon even a single time.

Assistive devices were the goal, but the study participants really enjoyed mind controlling Mario

Data needs context

As a data scientist, I often repeat that “anecdotes aren’t data.” We obsess over not overfitting our algorithms to single data points or making overly broad, generalized assumptions. No one could predict a random person’s 100-meter dash time based on data from the Olympic finals, and neither could any algorithm.

But anecdotal observations do matter. They need to be part of a broader, more representative dataset and interpreted by those who understand the processes that generated them.

Domain expertise and knowledge of edge cases play a crucial role in applying AI to real-world solutions. Modern large language models (LLMs) can produce incredible results and enhance capabilities across many domains. But they still need guidance and validation. These models rely solely on the data they’ve seen, lack understanding of the mechanisms that produced the data, and cannot seek additional clarity or context.

Sometimes, as I did when first analyzing the neural data, these models very confidently give very wrong answers. All they’re capable of is making predictions based on what they’ve seen before, and the more they venture outside that historical data, the more the results become wild (but sometimes fascinating) extrapolations.
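
A toy version of that extrapolation problem can be sketched in a few lines of Python. This is only an illustration under assumed conditions, not a claim about how any particular LLM behaves: a degree-5 polynomial (an arbitrary choice) is fitted to a sine wave on a narrow range and then asked about a point far outside it. The function and variable names are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

# "Historical data": a sine wave observed only on the range [0, 3].
x_seen = np.linspace(0.0, 3.0, 200)
y_seen = np.sin(x_seen)

# Fit a degree-5 polynomial (assumed model, chosen only for illustration).
model = Polynomial.fit(x_seen, y_seen, deg=5)

# Inside the observed range the model tracks the truth closely.
in_range_error = np.max(np.abs(model(x_seen) - y_seen))

# Far outside the observed range the same model is confidently wrong.
x_unseen = 10.0
out_of_range_error = abs(model(x_unseen) - np.sin(x_unseen))

print(f"max error on seen range: {in_range_error:.6f}")
print(f"error at x = {x_unseen}: {out_of_range_error:.2f}")
```

The model has no way to know the true signal stays between -1 and 1. Someone who understands where the data came from does.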

The challenge is the purpose

I sat alone back in my small, windowless office, consoling myself with a greasy lunch that only someone still in their 20s could eat without repercussions. My algorithms had betrayed me. My confidence in the technology was shattered.

“But wait,” I thought. “If it were easy then it would have already been done.” I hadn’t pursued advanced research to do what was already possible. This setback was indeed a breakthrough moment – it was when I realized my challenge and purpose.

With the rest of my curly fries forgotten and my milkshake left to melt, I went back to work. If muscular signals were overpowering the neural activity, I needed to remove them. Instead of focusing on the control algorithm, I focused on the data, isolating the useful information buried within the noise. The algorithms were tools for me to skillfully apply, not the other way around.
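
To make the idea concrete, here is a minimal sketch of one common, simple approach: regressing a known artifact reference out of every channel with least squares. It is not necessarily what we used in the lab. It assumes a separate reference recording of the muscular activity is available (for example, from a sensor near the eyes or jaw), and all of the data below is synthetic. The function name `regress_out_artifact` and every parameter are hypothetical.

```python
import numpy as np

def regress_out_artifact(signals, reference):
    """Remove the part of each channel that is linearly explained by a reference.

    signals:   array of shape (n_channels, n_samples)
    reference: array of shape (n_samples,), an assumed artifact recording
    Returns mean-centered signals with the reference component subtracted.
    """
    signals = np.asarray(signals, dtype=float)
    ref = np.asarray(reference, dtype=float)
    ref = ref - ref.mean()
    centered = signals - signals.mean(axis=1, keepdims=True)
    # Least-squares weight per channel: w = <x, ref> / <ref, ref>
    weights = centered @ ref / (ref @ ref)
    return centered - np.outer(weights, ref)

# Synthetic example: 32 channels of weak "neural" activity plus a large blink-like artifact.
rng = np.random.default_rng(0)
n_channels, n_samples = 32, 2000
t = np.arange(n_samples) / 1000.0  # assumed 1 kHz sampling rate
neural = 0.5 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal((n_channels, n_samples))

artifact = np.zeros(n_samples)
artifact[500:600] = 20.0  # hypothetical muscular burst, far larger than the neural signal
mixing = rng.uniform(0.5, 1.5, size=n_channels)  # each channel picks up a different amount

recorded = neural + np.outer(mixing, artifact)
cleaned = regress_out_artifact(recorded, artifact)

neural_centered = neural - neural.mean(axis=1, keepdims=True)
print("largest error before cleaning:", np.abs(recorded - neural).max())
print("largest error after cleaning: ", np.abs(cleaned - neural_centered).max())
```

The sketch only works because someone knew to record the artifact source in the first place, which is exactly the kind of knowledge that comes from being in the room where the data is collected.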

Various algorithms for removing noise from neural signals

Eventually we developed a system that enabled a man with quadriplegia to reach out with a prosthetic limb and hold his girlfriend’s hand, the first time he had done so in years. It remains one of the most rewarding experiences of my career, and a constant reminder of the value of anecdotal data, domain expertise, and multi-disciplinary teams.


John Kelly is the original architect and developer of Envelop Risk’s core technology, CyberTooth. After spending a few years in the UK building the team, he now lives back in the US with his family. John is also an internationally recognized ultramarathon runner, one of only three people to complete the Barkley Marathons more than once, and the holder of speed records on many well-known routes, including the Pennine Way. The thoughts and views in these posts are his own reflections from experiences as an accomplished athlete and entrepreneur, and do not necessarily reflect the views of others at Envelop Risk.