
This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence.
What makes us humans so good at making sense of visual data? That’s a question that has preoccupied artificial intelligence and computer vision scientists for decades. Efforts at reproducing the capabilities of human vision have so far yielded results that are commendable but still leave much to be desired.
Our current artificial intelligence algorithms can detect objects in images with remarkable accuracy, but only after they’ve seen many examples (thousands or maybe millions) and only if the new images are not too different from what they’ve seen before.
There is a range of efforts aimed at solving the shallowness and brittleness of deep learning, the main AI algorithm used in computer vision today. But sometimes, finding the right solution is predicated on asking the right questions and formulating the problem in the right way. And at present, there’s a lot of confusion surrounding what really needs to be done to fix computer vision algorithms.
In a paper published last month, scientists at Massachusetts Institute of Technology and University of California, Los Angeles, argue that the key to making AI systems that can reason about visual data like humans is to address the “dark matter” of computer vision, the things that are not visible in pixels.
Titled, “Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense,” the paper delves into five key elements that are missing from current approaches to computer vision. Adding these five components will enable us to move from “big data for small tasks” AI to “small data for big tasks,” the authors argue.
Today’s AI: Big data for small tasks
“Recent progress in deep learning is essentially based on a ‘big data for small tasks’ paradigm, under which massive amounts of data are used to train a classifier for a single narrow task,” write the AI researchers from MIT and UCLA.
Most recent advances in artificial intelligence rely on deep neural networks, machine learning algorithms that roughly mimic the pattern-matching capabilities of human and animal brains. Deep neural networks are like layers of complex mathematical functions stacked on top of each other. To perform their functions, DNNs go through a “training” process, where they are fed many examples (e.g. images) and their corresponding outcome (e.g. the object the images contain). The DNN adjusts the weights of its functions to represent the common patterns found across objects of common classes.
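The training process described above can be illustrated with a deliberately tiny sketch. It is not a deep network: it is a single artificial neuron (logistic regression) trained by gradient descent, and the two input "features", the toy data, the learning rate, and the function names are all assumptions made for illustration. The point is only to show what "adjusting the weights" from labeled examples looks like.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the neuron's estimated probability that the label is 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def train(examples, labels, learning_rate=0.5, epochs=2000):
    """Fit one neuron with stochastic gradient descent on the log-loss."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(examples, labels):
            # Prediction error; this is the gradient of the log-loss
            # with respect to the neuron's pre-activation value.
            error = predict(weights, bias, features) - label
            # Nudge each weight against the gradient.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical toy data: two made-up features per "image",
# label 1 = cat, label 0 = not a cat.
examples = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]

weights, bias = train(examples, labels)
cat_score = predict(weights, bias, [0.85, 0.85])
other_score = predict(weights, bias, [0.15, 0.15])
```

After training, inputs that resemble the positive examples score above 0.5 and inputs that resemble the negative examples score below it. A real DNN stacks many such units in layers and needs far more data, which is the "big data for small tasks" pattern the authors describe.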
In general, the more layers a deep neural network has and the more quality data it is trained on, the better it can extract and detect common patterns in data. For instance, to train a neural network that can detect cats with accuracy, you must provide it with many different pictures of cats, from different angles, against different backgrounds, and under different lighting conditions. That’s a lot of cat pictures.
Although DNNs have proven to be very successful and are a key component of many computer vision applications today, they do not see the world as humans do.
In fact, deep neural networks have existed for decades. The reason they have risen to popularity in recent years is the availability of huge data sets (e.g. ImageNet with 14 million labeled images) and more powerful processors. This has allowed AI scientists to create and train bigger neural networks in short timespans. But at their core, neural networks are still statistical engines that search for visible patterns in pixels. That is only part of what the human vision system does.
“The inference and reasoning abilities of current computer vision systems are narrow and highly specialized, require large sets of labeled training data designed for special tasks, and lack a general understanding of common facts (facts that are obvious to average humans),” the authors of “Dark, Beyond Deep” write.
The scientists also point out that human vision is not the memorization of pixel patterns. We use a single vision system to perform thousands of tasks, as opposed to AI systems that are tailored for one model, one task.
How can we achieve human-level computer vision? Some researchers believe that by continuing to invest in larger deep learning models, we’ll eventually be able to develop AI systems that match the efficiency of human vision.
The authors of “Dark, Beyond Deep,” however, underline that breakthroughs in computer vision are not tied to better recognizing the things that are visible in images. Instead, we need AI systems that can understand and reason about the “dark matter” of visual data, the things that are not present in images and videos.
“By reasoning about the unobservable factors beyond visible pixels, we could approximate humanlike common sense, using limited data to achieve generalizations across a variety of tasks,” the MIT and UCLA scientists write.
These dark components are functionality, intuitive physics, intent, causality, and utility (FPICU). Solving the FPICU problem will enable us to move from “big data for small tasks” AI systems that can only answer “what and where” questions to “small data for big tasks” AI systems that can also discuss the “why, how, and what if” questions of images and videos.
Intuitive physics
Our understanding of how the world operates at the physical level is one of the key components of our visual system. From infancy, we explore the world, much of it through observation. We learn about things such as gravity, object persistence, and dimensionality, and we later use these concepts to reason about visual scenes.
“The ability to perceive, predict, and therefore appropriately interact with objects in the physical world relies on rapid physical inference about the environment,” the authors of “Dark, Beyond Deep” write.
With a quick glance at a scene, we can understand which objects support or hang from others. We can tell with decent accuracy whether an object will tolerate the weight of another or if a stack of objects is likely to topple. We can reason not only about rigid objects but also about the properties of liquids and sand. For instance, if you see an upended ketchup bottle, you’ll probably know that it has been positioned to harness gravity for easy dispensing.
While physical relationships are, for the most part, visible in images, understanding them without having a model of intuitive physics would be nearly impossible. For instance, whether you know anything about playing pool or not, you can quickly reason about which ball is causing other balls to move in the following scene because of your general knowledge of the physical world. You would also be able to understand the same scene from a different angle, or any other pool table scene.
What needs to change in current AI systems? “To construct humanlike commonsense knowledge, a computational model for intuitive physics that can support the performance of any task that involves physics, not just one narrow task, must be explicitly represented in an agent’s environmental understanding,” the authors write.
This goes against the current end-to-end paradigm in AI, where neural networks are given video sequences or images and their corresponding descriptions and expected to embed those physical properties into their weights.
Recent work shows that AI systems that have incorporated physics engines are much better at reasoning about relations between objects than pure neural network–based systems.
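To make the idea of an explicit physics model concrete, here is a minimal sketch of the "will this stack topple?" judgment mentioned earlier. It is not the method used in the paper or in any particular physics engine. It assumes a one-dimensional stack of rigid rectangular blocks and applies a simplified static rule: at every level, the combined center of mass of the blocks above must sit over the contact area between the two touching blocks. The `Block` class and `is_stable` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x: float      # horizontal position of the block's center
    width: float  # horizontal extent of the block
    mass: float

    @property
    def left(self):
        return self.x - self.width / 2

    @property
    def right(self):
        return self.x + self.width / 2


def is_stable(stack):
    """Check a stack (ordered bottom to top) with a simplified static rule.

    For each supporting block, the center of mass of everything resting
    on it must lie within the contact patch it shares with the block
    directly above.
    """
    for i in range(len(stack) - 1):
        support, resting = stack[i], stack[i + 1]
        # Horizontal overlap between the two touching blocks.
        contact_left = max(support.left, resting.left)
        contact_right = min(support.right, resting.right)
        if contact_left > contact_right:
            return False  # the blocks do not touch at all
        above = stack[i + 1:]
        total_mass = sum(b.mass for b in above)
        center_of_mass = sum(b.x * b.mass for b in above) / total_mass
        if not (contact_left <= center_of_mass <= contact_right):
            return False
    return True


# Each block is shifted slightly to the right of the one below it.
gentle_lean = [Block(0.0, 2.0, 1.0), Block(0.5, 2.0, 1.0), Block(1.0, 2.0, 1.0)]
# The same blocks with a much larger shift per level.
steep_lean = [Block(0.0, 2.0, 1.0), Block(0.9, 2.0, 1.0), Block(1.8, 2.0, 1.0)]

gentle_ok = is_stable(gentle_lean)
steep_ok = is_stable(steep_lean)
```

The rule is written once and applies to any stack, whatever the blocks look like. That is the contrast with an end-to-end network, which would have to infer the same regularity from many labeled images of standing and falling stacks.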
Causality
Causality is the ultimate missing piece of today’s artificial intelligence algorithms and the foundation of all FPICU components. Does the rooster’s crow cause the sun to rise, or does the sunrise prompt the rooster to crow? Does the rising temperature raise the mercury level in a thermometer? Does flipping the switch turn on the lights or vice versa?
We can see things happening at the same time and make assumptions about whether one causes the other or if there are no causal relations between them. Machine learning algorithms, on the other hand, can track correlations between different variables but can’t reason about causality. This is because causal events are not always visible, and they require an understanding of the world.
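The gap between correlation and causation can be shown with a small simulation of the thermometer example. This is a sketch under stated assumptions, not anything from the paper: the world is modeled so that temperature drives the mercury level (the factor 2.0 and the noise level are arbitrary), and `simulate`, `do_temperature`, and `do_mercury` are hypothetical names. Passing a `do_` argument forces a variable to a fixed value, which stands in for an intervention.

```python
import random


def simulate(n, seed, do_temperature=None, do_mercury=None):
    """Sample (temperature, mercury) pairs from a toy causal model.

    Assumed causal direction: temperature -> mercury level.
    A `do_` argument overrides the variable's natural mechanism.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        if do_temperature is None:
            temperature = rng.uniform(0.0, 40.0)
        else:
            temperature = do_temperature
        if do_mercury is None:
            mercury = 2.0 * temperature + rng.gauss(0.0, 0.5)
        else:
            mercury = do_mercury
        samples.append((temperature, mercury))
    return samples


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Passive observation: the two variables move together.
observed = simulate(2000, seed=0)
correlation = pearson([t for t, _ in observed], [m for _, m in observed])

# Intervene on the cause: the effect follows.
heated = simulate(2000, seed=1, do_temperature=30.0)
mercury_when_heated = mean([m for _, m in heated])

# Intervene on the effect: the cause does not move.
forced = simulate(2000, seed=2, do_mercury=80.0)
temperature_when_forced = mean([t for t, _ in forced])
```

The observed correlation is nearly perfect and is the same whichever variable you call the cause. Only the interventions reveal the asymmetry: setting the temperature shifts the mercury level, while forcing the mercury level leaves the average temperature where it was. A model trained only on the observational samples has no way to tell these two cases apart.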
Causality enables us not only to reason about what’s happening in a scene but also about counterfactuals, “what if” scenarios that have not taken place. “Observers recruit their counterfactual reasoning capacity to interpret visual events. In other words, interpretation is not based only on what is observed, but also on what would have happened but did not,” the AI researchers write.
Why is this important? So far, success in AI systems has been largely tied to providing more and more data to make up for the lack of causal reasoning. This is especially true in reinforcement learning, in which AI agents are unleashed to explore environments through trial and error. Tech giants such as Google use their sheer computational power and vast financial resources to brute-force their AI systems through millions of scenarios in hopes of capturing all possible combinations. This approach has largely been successful in areas such as board and video games.
As the authors of “Dark, Beyond Deep” note, however, reinforcement learning programs don’t capture causal relationships, which limits their capability to transfer their functionality to other problems. For instance, an AI that can play StarCraft 2 at championship level will be dumbfounded if it is given Warcraft 3 or an earlier version of StarCraft. It won’t even be able to generalize its skills beyond the maps and race it has been trained on, unless it goes through thousands of years of extra gameplay in the new settings.
“One approach to solving this challenge is to learn a causal encoding of the environment, because causal knowledge inherently encodes a transferable representation of the world,” the authors write. “Assuming the dynamics of the world are constant, causal relationships will remain true regardless of observational changes to the environment.”
Functionality
If you want to sit and can’t find a chair, you’ll look for a flat and solid surface that can support your weight. If you want to drive a nail in a wall and can’t find a hammer, you’ll look for a solid and heavy object that has a graspable part. If you want to carry water, you’ll look for a container. If you want to climb a wall, you’ll look for objects or protrusions that can act as handles.
Our vision system is largely task-driven. We reflect on our environment and the objects we see in terms of the functions they can perform. We can classify objects based on their functionalities.
Again, this is missing from today’s AI. Deep learning algorithms can find spatial consistency in images of the same object. But what happens when they have to deal with a class of objects that is very varied?

Since we look at objects in terms of functionality, we will immediately know that the above objects are all chairs, albeit very weird ones. But for a deep neural network that has been trained on images of conventional chairs, they will be confusing masses of pixels that will probably end up being classified as something else.
“Reasoning across such large intraclass variance is extremely difficult to capture and describe for modern computer vision and AI systems. Without a consistent visual pattern, properly identifying tools for a given task is a long-tail visual recognition problem,” the authors note.
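One way to picture functionality-driven recognition is to ask what an object affords rather than what it looks like. The sketch below is purely illustrative and is not taken from the paper: the `SceneObject` fields, the height range, and the default body weight are all assumed values, and in a real system these properties would themselves have to be inferred from the image.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    has_flat_surface: bool
    surface_height_cm: float
    max_load_kg: float


def can_sit_on(obj, person_weight_kg=80.0):
    """Judge "chair-ness" by function: a flat surface at a usable
    height that can carry the person's weight (assumed thresholds)."""
    return (
        obj.has_flat_surface
        and 30.0 <= obj.surface_height_cm <= 60.0
        and obj.max_load_kg >= person_weight_kg
    )


# Hypothetical objects, none of which looks like a conventional chair.
candidates = [
    SceneObject("tree stump", True, 45.0, 300.0),
    SceneObject("glass vase", False, 40.0, 5.0),
    SceneObject("kitchen counter", True, 95.0, 200.0),
]

sittable = [obj.name for obj in candidates if can_sit_on(obj)]
```

Here a tree stump passes the test even though it shares no visual pattern with an office chair, which is the kind of generalization across large intraclass variance that the authors say pixel-level pattern matching struggles with.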
Intent
“The perception and comprehension of intent enable humans to better understand and predict the behavior of other agents and engage with others in cooperative activities with shared goals,” write the AI researchers from MIT and UCLA.
Inferring intents and goals plays a very important part in our understanding of visual scenes. Intent prediction enables us to generalize our understanding of scenes and reason about novel situations without the need for prior examples.
We tend to anthropomorphize animate objects, even when they’re not human: we empathize with them subconsciously to understand their goals. This allows us to reason about their courses of action. And we do not even need rich visual cues to reason about intent. Sometimes, an eye gaze, a body posture, or a motion trajectory is enough for us to make inferences about goals and intentions.
Take the following video, which is an old psychology experiment. Can you tell what is happening? Most participants in the experiment were quick to establish social relationships between the simple geometric shapes and give them roles such as bully, victim, etc.
This is something that can’t be fully extracted from pixel patterns and needs complementary knowledge about social relations and intent.
Utility
Finally, the authors discuss the tendency of rational agents to make decisions that maximize their expected utility.
“Every possible action or state within a given model can be described with a single, uniform value. This value, usually referred to as utility, describes the usefulness of that action within the given context,” the AI researchers write.
For instance, when searching for a place to sit, we try to find the most comfortable chair. Many AI systems incorporate utility functions, such as scoring more points in a game or optimizing resource usage. But without incorporating the other components of FPICU, the use of utility functions remains very limited.
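The notion of a rational agent maximizing expected utility can be written down in a few lines. In this sketch each action leads to a set of possible outcomes, each with a probability and a utility, and the agent picks the action whose probability-weighted utility is highest. The seat names, probabilities, and utility values are invented for illustration.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(probability * utility for probability, utility in outcomes)


def best_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical choices when looking for a place to sit.
# Each entry is a list of (probability, utility) pairs.
seats = {
    "armchair": [(0.9, 8.0), (0.1, 2.0)],       # usually comfortable
    "stool": [(1.0, 5.0)],                      # reliably mediocre
    "wobbly crate": [(0.5, 6.0), (0.5, -4.0)],  # might tip over
}

choice = best_action(seats)
armchair_value = expected_utility(seats["armchair"])
```

The arithmetic is the easy part. The numbers fed into it depend on the other components: estimating that the crate might tip over requires intuitive physics, and knowing that it can serve as a seat at all requires reasoning about functionality.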
“[T]hese cognitive abilities have shown potential to be, in turn, the building blocks of cognitive AI, and should therefore be the foundation of future efforts in constructing this cognitive architecture,” write the authors of “Dark, Beyond Deep.”
This, of course, is easier said than done. There are numerous efforts to codify some of the components mentioned in the paper, and the authors mention some of the promising work that is being conducted in the field. But so far, advances have been incremental and the community is largely divided on which approach will work best.
The authors of “Dark, Beyond Deep” believe hybrid AI systems that incorporate both neural networks and classic intelligence algorithms have the best chance of achieving FPICU-capable AI systems.
“Experiments show that the current neural network-based models do not acquire mathematical reasoning abilities after learning, whereas classic search-based algorithms equipped with an additional perception module achieve a sharp performance gain with fewer search steps.”
Original post: https://bdtechtalks.com/2020/06/01/artificial-intelligence-computer-vision-fpicu/