Imagine a concert. A massive party. Or even a demonstration.
Hundreds, maybe thousands of people are taking pictures and video with their smartphones. Each is capturing a little slice of reality. Each is getting the best shot possible from their angle, of what's visible from their vantage point.
What if you could auto-create a movie from all of them, instantly?
That’s exactly what IMINT is working on. IMINT is a Swedish company you’ve never heard of whose Vidhance tech uses deep learning to clarify, auto-correct, enhance, and otherwise improve video in over 250 million smartphones globally. Now the company, which has primarily been a B2B software provider, is working on a consumer-facing software release.
“I can be at some event with my wife and kids, and when I get home, it’s like I really want to combine my videos because someone else got a better angle at something,” CTO Johan Svensson told me recently on the TechFirst podcast. “But I never get to the point where I actually upload all this to my computer and I start my editing software and do that. So … we figured out that this could actually be done in a more automatic way — basically, completely automated now with … AI.”
Video processing algorithms can find the region of interest in videos, IMINT says, and in the future, they’ll automatically zoom in and intelligently focus on what they’re seeing: a goal being scored at a soccer game, the birthday cake candles being blown out, or the moment the beat drops in the club.
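IMINT hasn't published how Vidhance finds that region of interest, but the basic idea can be illustrated with a toy sketch (all names and thresholds here are invented, not the company's algorithm): diff two consecutive grayscale frames and draw a bounding box around whatever moved.

```python
def region_of_interest(prev_frame, frame, threshold=30):
    """Bounding box (top, left, bottom, right) of pixels that changed
    between two grayscale frames (lists of pixel rows), or None if the
    scene is static. A toy stand-in for far richer detection."""
    top = left = bottom = right = None
    for y, (prev_row, row) in enumerate(zip(prev_frame, frame)):
        for x, (p, c) in enumerate(zip(prev_row, row)):
            if abs(c - p) > threshold:
                top = y if top is None else top
                bottom = y + 1
                left = x if left is None else min(left, x)
                right = x + 1 if right is None else max(right, x + 1)
    if top is None:
        return None
    return (top, left, bottom, right)

# Two synthetic 8x8 frames: a bright "subject" enters the scene.
prev = [[0] * 8 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(2, 5):
    for x in range(3, 7):
        curr[y][x] = 255  # the moment worth zooming in on

print(region_of_interest(prev, curr))  # (2, 3, 5, 7)
```

Once you have a box like that, "intelligently zooming in" is just cropping and scaling toward it over successive frames — the hard part, which this sketch skips entirely, is deciding that the motion is a goal or a candle being blown out rather than noise.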
“The next wave of video creation will be a community effort, intelligently blending video from multiple smartphones into one finished product,” IMINT says. “Imagine a wedding with 200 guests and nearly as many smartphones all recording the event … An individual, group, or family can record the same event and easily generate a polished-looking video with multiple angles, effects, transitions, and more. Eventually, an even larger crowd — friends and strangers — will be able to simultaneously share smartphone video from a concert, party, rally, or other gathering and have it stitched together to tell visual stories no one has been able to tell before without considerable effort.”
Automated solutions are useful for creating memories after the fact, because who really has the time to sit down and make movies or slideshows out of the 950 photos and videos they took on their last vacation?
But this is different.
This is a cloud-based service that takes video from a group of smartphones; checks metadata like GPS position, sensor resolution, actual resolution, motion, and the direction the phones are pointing (thanks to the magnetometer in your smartphone); correlates common objects found in the videos; and auto-creates a collaborative movie of the event highlights.
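IMINT hasn't detailed how that correlation works, but a rough, hypothetical sketch of the first two steps — deciding that two clips cover the same event via GPS proximity and time overlap, then picking the "best" stream at any given moment — might look like this (every name and threshold here is an assumption for illustration):

```python
from dataclasses import dataclass

@dataclass
class Clip:
    phone_id: str
    lat: float        # GPS position
    lon: float
    heading: float    # magnetometer reading, degrees
    start: float      # recording start, unix seconds
    end: float        # recording end, unix seconds
    resolution: int   # vertical pixels, e.g. 1080 or 2160

def same_event(a, b, max_dist_deg=0.001):
    """Heuristic: two clips belong to the same event if they overlap in
    time and were shot within roughly 100 meters of each other."""
    close = (abs(a.lat - b.lat) < max_dist_deg
             and abs(a.lon - b.lon) < max_dist_deg)
    overlap = a.start < b.end and b.start < a.end
    return close and overlap

def best_clip_per_moment(clips, t):
    """Of the clips rolling at time t, pick the highest-resolution one —
    a crude proxy for 'best angle'."""
    rolling = [c for c in clips if c.start <= t <= c.end]
    return max(rolling, key=lambda c: c.resolution, default=None)

# Two phones at the same concert, one shooting 1080p, one 4K.
clips = [
    Clip("a", 59.86, 17.64, 90.0, start=0, end=60, resolution=1080),
    Clip("b", 59.86, 17.64, 180.0, start=30, end=90, resolution=2160),
]
print(best_clip_per_moment(clips, 45).phone_id)  # "b": 4K beats 1080p
```

The real system would obviously weigh far more than resolution — stability, lighting, what's actually in frame — and that's where the object correlation and highlight detection come in.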
“We will use the same algorithms to kind of find the peak moments and highlights of these videos, and then just compile it into … better selection than we’ve seen so far,” Svensson says. “We can actually handle parallel streams in a way more similar to like a professional production crew.”
One thing that will be incredibly helpful?
This approach consumes a lot of data: uploading video taxes most 4G and slower cellular networks, although WiFi can be fine. But 5G could make it almost seamless, with video uploading in the background without any particular fuss or degradation of quality.
IMINT already does business with “most of the world’s smartphone manufacturers,” Svensson says, and those phones should be able to participate out of the box. Other phones will have to download an app first.
Privacy is clearly an issue with technology like this.
One thing that comes to mind with the recent Black Lives Matter marches, as well as some of the incidents with police, is how many phones always seem to be out and recording. As a European company, IMINT is very aware of the privacy implications, and only phones that are connected via the company's cloud-based application can upload and share video. What happens with it after that, of course, is anyone's guess.
At least IMINT is not planning to spring the tool fully formed on the world. It'll release the software iteratively and be responsive to what people do with it.
“So we’re going to plant a tool out there and we’re going to monitor how it’s actually used, and then we’re going to strengthen those areas,” Svensson says.
Ultimately, we might be able to surf through global happenings and parties that we missed in automagically created movies that capture the essence of the event, without any extra work by actual participants. That might even be possible live, with the opportunity to skip from camera to camera until you find the most interesting view.
The future, they say, will be televised.
Perhaps by all of us, collaboratively.