GEN-1 Foundation Model Boosts Robot Reliability and Speed

Source: Generalist
  • Generalist has launched GEN-1, a multimodal model trained on over 500,000 hours of physical interaction data to produce real-time robot actions.
  • The model reaches up to 99% success rates, completes tasks about three times faster, and requires roughly one hour of robot data per task.

Generalist introduced GEN-1, a multimodal robot learning model that controls robot actions in real time. It was trained on over 500,000 hours of physical interaction data and includes updates across pretraining, post-training, reinforcement learning, and inference. The model can learn a new task from about one hour of robot data. It reaches up to 99% success rates on some tasks, compared to 64% for earlier models, and completes them faster.

Source: YouTube / Generalist

In a blog post, Generalist presents results from repeated task runs that demonstrate consistency, speed, and recovery. In these runs, the model performs the same task repeatedly without intervention: folding t-shirts 86 times in a row, servicing robot vacuums over 200 times, packing blocks more than 1,800 times, folding boxes over 200 times, and packing phones over 100 times. Success rates reach about 99%, versus 64% for earlier models. The model also completes tasks faster, folding a box in about 12 seconds compared to roughly 34 seconds before, and packing a phone into a case in 15.5 seconds. When something goes wrong during a run, the model adapts by regrasping, repositioning, or switching strategies to continue the task.
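
For a rough sense of scale, the reported figures can be turned into ratios. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted above. The function names are illustrative, the inputs are the article's approximate values ("about 12 seconds", "roughly 34 seconds", "about 99%"), and it assumes the 99% and 64% success rates describe comparable tasks, which the article does not confirm.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new time is than the old one."""
    return old_seconds / new_seconds


def failure_reduction(old_success: float, new_success: float) -> float:
    """Return the ratio of the old failure rate to the new failure rate."""
    return (1.0 - old_success) / (1.0 - new_success)


# Approximate figures quoted in the article (not exact measurements).
box_speedup = speedup(old_seconds=34.0, new_seconds=12.0)
fail_ratio = failure_reduction(old_success=0.64, new_success=0.99)

print(f"Box folding speedup: {box_speedup:.2f}x")
print(f"Failure-rate reduction: {fail_ratio:.0f}x")
```

On these inputs, the box-folding time works out to a little under three times faster, in line with the "about three times faster" summary, and moving from 64% to 99% success corresponds to roughly 36 times fewer failed attempts.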

GEN-1 does not reach the same performance on every task, but Generalist says it is focused on improving results by scaling data, compute, and system design. The company expects future versions to handle a broader range of tasks and require less task-specific data. Early access to GEN-1 is now available to partners.


🌀 Tom’s Take:

Robots have long been able to perform tasks in controlled demos, but not sustain that performance over time. GEN-1 shows runs where tasks are completed hundreds or thousands of times in a row, at speed, without stopping. That level of repeatability is what makes real-world use possible.


Source: Generalist AI