This AI Can Create ‘Videos Of The Future’
There are numerous devices that can capture moments on camera, but what would happen if they could also record situations that are about to happen?
Researchers at MIT CSAIL have developed a deep-learning algorithm that can create videos showing “what you expect to happen in the future”. In other words, it is a system that predicts the future: given a single image, the artificial intelligence generates a short video of that scene’s immediate future.
The idea is that, after being trained, the computer can anticipate what is going to happen right after seeing a scene, provided it is something reasonably predictable: a dish about to fall, a train arriving at a station, a wave rolling onto a beach, and so on.
The researchers trained the artificial intelligence system on two million videos, amounting to more than a year of footage, so that the computer learned what usually occurs after a given scene. The team says that future versions could be used for everything from improved security tactics to safer self-driving cars, since it would become easier to predict accidents. According to CSAIL PhD student and first author Carl Vondrick, the algorithm can also help machines recognize people’s activities without expensive human annotations.
“These videos show us what computers think can happen in a scene,” Vondrick said. “If you can predict the future, you must have understood something about the present.”
This work focuses on processing the entire scene at once, with the algorithm generating as many as 32 frames from scratch per second.
“Building up a scene frame-by-frame is like a big game of ‘Telephone,’ which means that the message falls apart by the time you go around the whole room. By instead trying to predict all frames simultaneously, it is as if you’re talking to everyone in the room at once,” said Vondrick. The system pits two neural networks against each other: a generator that produces videos and a discriminator that tries to tell generated clips from real ones. Over time, the generator learns to deceive the discriminator.
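To make the “all frames at once” idea concrete, here is a minimal toy sketch in NumPy. It is not the researchers’ actual model (which uses deep convolutional networks); the tiny resolution, the linear generator, and all variable names here are illustrative assumptions. The point is the shape of the computation: the generator maps one noise vector to a whole 32-frame clip in a single forward pass, and the discriminator scores the whole clip at once.

```python
import numpy as np

rng = np.random.default_rng(0)
FRAMES, H, W = 32, 8, 8  # toy resolution; purely illustrative

def generator(z, weights):
    # One forward pass produces ALL frames simultaneously as a
    # (FRAMES, H, W) array, rather than looping frame-by-frame.
    return np.tanh(weights @ z).reshape(FRAMES, H, W)

def discriminator(video, weights):
    # Scores the whole clip with a value in (0, 1): the estimated
    # probability that the clip is real rather than generated.
    return 1.0 / (1.0 + np.exp(-(weights @ video.ravel())))

z_dim = 16
g_w = rng.normal(scale=0.1, size=(FRAMES * H * W, z_dim))  # generator weights
d_w = rng.normal(scale=0.1, size=FRAMES * H * W)           # discriminator weights

z = rng.normal(size=z_dim)       # random noise vector
fake = generator(z, g_w)
print(fake.shape)                # (32, 8, 8): the whole clip in one shot
print(discriminator(fake, d_w))  # a score strictly between 0 and 1
```

In adversarial training, the discriminator’s weights would be updated to push this score down on generated clips, while the generator’s weights would be updated to push it back up; this toy sketch omits the training loop itself.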
The work will be presented at the Conference on Neural Information Processing Systems (NIPS), held next week in Barcelona.