With Enlitic, a veteran data scientist plans to fight disease using deep learning

It was about a year ago when I realized, perhaps for the first time, just how prevalent deep learning was about to become.

It was at the KDD 2013 conference in Chicago, and then-Kaggle president and chief scientist Jeremy Howard was discussing how when teams using deep learning entered the company’s predictive-modeling competitions, they were winning at an impressive rate. If someone can figure out a way to make these techniques mainstream, he said, it could be a game-changer.

Howard left Kaggle shortly thereafter and began looking, he says now, for a way to apply his knowledge of machine learning to a field where he could really make a difference, possibly even change the world (he also said as much in December in reply to a question on Quora about his status).

Simultaneously, deep learning was making its way out of research labs and into the hands of a lot more people. There are companies, open source projects, and no shortage of online tutorials that try to explain how it works and actually provide simplified tools for building these models. Nearly every week, it seems, a new paper is published highlighting some new advance in the field, some around building the models themselves and others around specific applications such as sentiment analysis, language understanding or computer vision. Google, Baidu, Yahoo and Facebook have gone on a buying spree of deep learning talent.

Jeremy Howard (left) at Structure: Data 2012 (c) Pinar Ozger /

And now Howard has found his calling as the founder and CEO of a company called Enlitic. He thinks that training deep learning systems on medical images and other patient records could revolutionize the way doctors diagnose and then treat complicated diseases. And although he’s only been actively pursuing this vision for a few months, “it’s kind of the culmination of 20 years” of seeking the ideal avenue to apply his knowledge of data analysis.

“This is exactly the right time and exactly the right place,” Howard said. “I’ve never seen anything like the results of recent advances in deep learning.”

The path to data-driven medicine

He compares the shift to that of moving from basing banner ad placement on studies or surveys about the IP address of a user to today’s advanced segmentation algorithms that analyze thousands of variables. Right now, a doctor will make a diagnosis based largely on how a patient’s story fits into a system the doctor understands. In a data-based approach, however, a doctor could start making diagnoses based more on patterns hidden among all their records.

The way that would work with deep learning, Howard explained, might be to take medical images, lab tests, doctors’ notes and personal information, and use them all to train a deep neural network. Done in a supervised manner, where the computer is able to associate a set of records with a known diagnosis and outcome, it should be able to learn the features most strongly associated with given diseases and then help classify any new cases accordingly.


Alternatively, one could imagine training the system in a semi-supervised or unsupervised manner for the purpose of trying to solve some medical mystery. Medical researchers, or doctors, could feed a bunch of troubling, possibly related cases into a deep learning model and see what it learns about them. An easy example might be that a certain feature of a set of tumors, hard for the human eye to see, stands out to the model, or that seemingly distinct words tend to be used in the same way.

Whatever the models might discover or predict, Howard isn’t suggesting they’ll do away with a doctor’s judgment. Rather, artificially intelligent computers could provide strong, unbiased second opinions, or perhaps lead a doctor down a path of investigation she otherwise wouldn’t have considered.

“It wouldn’t replace the current way of doing medicine,” he said, “but I could imagine it being just as important.”

Indeed, a lot of people are betting on data and machine learning becoming important parts of the medical field. There are IBM’s health-care-focused efforts with Watson, which Howard calls “a kindred spirit,” although it’s focused more on learning what’s in textbooks than on analyzing patient data. There are startups such as Lumiata analyzing medical literature to help nurses take over some diagnostic responsibilities from doctors. And then there are projects similar to what Enlitic wants to do, but often focused on specific conditions: work out of Stanford on breast cancer and out of the University of Washington on cardiovascular disease come to mind.

How deep learning has improved accuracy in the general-purpose ImageNet competition. Source: Enlitic

The biggest strength of deep learning: accuracy and speed

But Howard is confident that deep learning will be the biggest and best thing to happen to medicine, largely because it allows doctors to bring together so many different types of data across so many different illnesses and because its techniques have proven so effective. For example, although deep neural networks get smarter the more data they analyze, they’re already proving remarkably accurate in diagnosing breast cancer using just a handful of images. Howard said convolutional neural networks (arguably the most popular of deep learning models) are achieving near-human-level performance in competitions to detect pre-cancerous cells in breast tissue, sometimes on as few as 80 training images (other models might take thousands).

He said 80 images is a “shockingly little” amount of training data, but not entirely surprising when you consider the quality of medical images. Unlike the random pictures of objects used to train many computer vision systems, medical images don’t suffer from shadows, funny angles or lack of focus. “They tend to be a lot more expressive and concise,” Howard said.

Considering that some researchers, especially in the cancer field, cite a dearth of quality data as an impediment to major breakthroughs, models that can make accurate predictions from relatively little data, if need be, look even more promising.

And although it’s early in the game — Enlitic only has five employees, is still wrapping up its first round of venture capital and still has a long R&D path ahead of it — Howard is convinced he’s onto something. Deep learning is “definitely the biggest technology breakthrough I’ve seen, ever,” he said. “I’m planning to dedicate the rest of my life to this.”