Current AIs are terrible at understanding a first-person point of view… Facebook’s next-gen AI aims to change all of that.
Contrary to popular opinion, AI software has no sense of ‘self’ and is unlikely to develop one any time in the near future. However…
Facebook have just announced that they are hoping to build an AI system capable of taking an ‘egocentric’ view of the world.
The plan is for future AI programs to take a more ‘human’ view of the world by learning from first-person imagery and video, shot so that the camera’s view sits firmly in, well, the centre.
As simple as that may sound, Facebook hope it will unlock untapped potential to enhance AR (augmented reality) tools alongside wearable tech… For instance, your glasses could scan a room and find your car keys if you can’t see them.
Facebook are working towards exactly that through Ego4D, a long-term project run in conjunction with thirteen universities and labs across nine countries, which they are funding without any outside help. So far Project Ego4D has collated over 2,200 hours of first-person video from over 700 participants just going about their daily lives.
Project Ego4D is built around five benchmarks for developing the next gen of AI:
- Episodic memory: What happened when? (e.g., “Where did I leave my keys?”)
- Forecasting: What am I likely to do next? (e.g., “Wait, you’ve already got your keys”)
- Hand and object manipulation: What am I doing? (e.g., “Putting the keys in the car”)
- Audio-visual diarisation: Who said what when? (e.g., “What was the main topic during class?”)
- Social interaction: Who is interacting with whom? (e.g., “Help me better hear the person talking to me at this noisy restaurant”)
All of the above play right into the rumours that Facebook, owners of Oculus VR, are planning on launching their own branded smart glasses soon.
“We expect that to the extent companies use this dataset and benchmark to develop commercial applications, they will develop safeguards for such applications. For example, before AR glasses can enhance someone’s voice, there could be a protocol in place that they follow to ask someone else’s glasses for permission, or they could limit the range of the device so it can only pick up sounds from the people with whom I am already having a conversation or who are in my immediate vicinity.” – Facebook Spokesperson