An OpenAI spinoff has built an AI model that helps robots learn tasks like humans
The new model, called RFM-1, was trained on years of data
collected from Covariant’s small fleet of item-picking robots that customers
like Crate & Barrel and Bonprix use in warehouses around the world, as well
as words and videos from the internet. In the coming months, the model will be released
to Covariant customers. The company hopes the system will become more capable
and efficient as it’s deployed in the real world.
So what can it do? In a demonstration I attended last week,
Covariant cofounders Peter Chen and Pieter Abbeel showed me how users can
prompt the model using five different types of input: text, images, video,
robot instructions, and measurements.
For example, show it an image of a bin filled with sports
equipment, and tell it to pick up the pack of tennis balls. The robot can then
grab the item, generate an image of what the bin will look like after the
tennis balls are gone, or create a video showing a bird’s-eye view of how the
robot will look doing the task.
If the model predicts it won’t be able to properly grasp the
item, it might even type back, “I can’t get a good grip. Do you have any tips?”
A response could advise it to use a specific number of the suction cups on its
arms to give it a better grasp: eight versus six, for example.
This represents a leap forward, Chen told me, in robots that
can adapt to their environment using training data rather than the complex,
task-specific code that powered the previous generation of industrial robots.
It’s also a step toward worksites where managers can issue instructions in human
language without concern for the limitations of human labour. (“Pack 600
meal-prep kits for red pepper pasta using the following recipe. Take no
breaks!”)
Lerrel Pinto, a researcher who runs the general-purpose
robotics and AI lab at New York University and has no ties to Covariant, says
that even though roboticists have built basic multimodal robots before and used
them in lab settings, deploying one at scale that’s able to communicate in this
many modes marks an impressive feat for the company.
To outpace its competitors, Covariant will have to get its
hands on enough data for the robot to become useful in the wild, Pinto told me.
Warehouse floors and loading docks are where it will be put to the test,
constantly interacting with new instructions, people, objects, and
environments.
“The groups which are going to train good models are going
to be the ones that have either access to already large amounts of robot data
or capabilities to generate those data,” he says.
Covariant says the model has a “human-like” ability to
reason, but it has its limitations. During the demonstration, in which I could
see a live feed of a Covariant robot as well as a chat window to communicate
with it, Chen invited me to prompt the model with anything I wanted. When I
asked the robot to “return the banana to Tote Two,” it struggled with retracing
its steps, leading it to pick up a sponge, then an apple, then a host of other
items before it finally accomplished the banana task.
“It doesn’t understand the new concept,” Chen said by way of
explanation, “but it’s a good example—it might not work well yet in the places
where you don’t have good training data.”
The company’s new model embodies a paradigm shift rippling
through the robotics world. Rather than teaching a robot how the world works
manually, through instructions like physics equations and code, researchers are
teaching it in the same way humans learn: through millions of
observations.
The result “really can act as a very effective flexible
brain to solve arbitrary robot tasks,” Chen said.
The playing field of companies using AI to power more nimble
robotic systems is likely to grow crowded this year. Earlier this month, the
humanoid-robotics startup Figure AI announced it would be partnering with
OpenAI and raised $675 million from tech giants like Nvidia and Microsoft. Marc
Raibert, the founder of Boston Dynamics, recently started an initiative to better integrate AI
into robotics.
This means that advancements in machine learning will likely
start translating to advancements in robotics. However, some issues remain
unresolved. If large language models continue to be trained on millions of
words without compensating the authors of those words, perhaps it will be
expected that robotics models will also be trained on videos without paying
their creators. And if language models hallucinate and perpetuate biases, what
equivalents will surface in robotics?
In the meantime, Covariant will push forward, keen on having
RFM-1 continually learn and refine. Eventually, the researchers aim to have the
robot train on videos that the model itself creates—the type of meta-learning
that not only makes my head spin but also sparks concern about
what will happen if errors made by the model compound themselves. But with such
a hunger for more training data, researchers see it almost as inevitable.
“Training on that will be a reality,” Abbeel says. “If we
talk again a half year from now, that’s what we’ll be talking about.”
By James O'Donnell