Microsoft researchers claim in a recently published paper that they have developed the first computer system capable of outperforming humans on a popular benchmark. While it’s estimated that humans can classify images in the ImageNet dataset with an error rate of 5.1 percent, Microsoft’s team said its deep-learning-based system achieved an error rate of only 4.94 percent.
Their paper was published less than a month after Baidu published a paper touting its record-setting system, which it claimed achieved an error rate of 5.98 percent using a homemade supercomputing architecture. The best performance in the actual ImageNet competition so far belongs to a team of Google researchers, who in the 2014 built a deep learning system with a 6.66 percent error rate.
“To our knowledge, our result is the first published instance of surpassing humans on this visual recognition challenge,” the paper states. “On the negative side, our algorithm still makes mistakes in cases that are not difficult for humans, especially for those requiring context understanding or high-level knowledge…
“While our algorithm produces a superior result on this particular dataset, this does not indicate that machine vision outperforms human vision on object recognition in general . . . Nevertheless, we believe our results show the tremendous potential of machine algorithms to match human-level performance for many visual recognition tasks.”
One of the Microsoft researchers, Jian Sun, explains the difference in plainer English in a Microsoft blog post: “Humans have no trouble distinguishing between a sheep and a cow. But computers are not perfect with these simple tasks. However, when it comes to distinguishing between different breeds of sheep, this is where computers outperform humans. The computer can be trained to look at the detail, texture, shape and context of the image and see distinctions that can’t be observed by humans.”
If you’re interested in learning how deep learning works, why it’s such a hot area right now and how it’s being applied commercially, think about attending our Structure Data conference, which takes place March 18 and 19 in New York. Speakers include deep learning and machine learning experts from Facebook, Yahoo, Microsoft, Spotify, Hampton Creek, Stanford and NASA, as well as startups Blue River Technology, Enlitic, MetaMind and TeraDeep.
We’ll dive even deeper into artificial intelligence at our Structure Intelligence conference (Sept. 22 and 23 in San Francisco), where early confirmed speakers come from Baidu, Microsoft, Numenta and NASA.