The physical world isn't ready for physical AI
What if physical AI could learn?
Welcome back to Reasoned by Nikhil Pahwa, a guide to how AI is changing the world around us.
Today’s newsletter is about the unique challenges of Physical AI (in homes, in factories), based on a panel at SuperAI.
Flying back from SuperAI last year, I wrote down a replacement thesis: first digital creation gets replaced, then digital actions, then manufacturing, then physical actions. Physical actions came last because they’re the hardest, and a panel at this year’s SuperAI spent most of its time explaining why.
I’ve written about how agents are designed to route around barriers: hit a broken API, try another method... keep trying till you exhaust tries/tokens. Unlike digital agents, Physical AI runs into barriers that can’t (or shouldn’t) be routed around: I just visualised a humanoid breaking down a door it can’t unlock.
At a panel on Physical AI, Yanliang Zhang (Chief Scientist, Western Robot), Michael Spranger (Senior Executive Director, Sony Research) and Alan Ng (Quikbot Technologies) kept returning to one point: the hardware for facility management, food industry robotics and computer vision is in place, the models are powerful, and the demand is real (McKinsey and Morgan Stanley put the physical AI market at $25-30 trillion by 2050). What is lacking is the trust infrastructure, along with answers to questions about data sovereignty. Some notes from the panel:
1. The liability chain is broken
Ng said he builds what he calls “trust infrastructure” for the built environment for clients. His framing of the problem:
“I have a facility management company, which then submits the contract to a managing agent, which obviously finally offers the service to me. So take a look at this relationship. The one that actually should be liable for the physical AI autonomous system is really not the one at the end it does going to be sued. So there’s a break in the trust factor.”
“…way before physical AI truly deploy in mass scale, I think the trust infrastructure, the legality part of it, the accountability by insurance or by the government need to be resolved.”
The gap between responsibility, accountability and liability is not new to the digital world, and neither the insurance market nor standard contracts have a structure for it. The risks can be higher in the physical world, so the gap that Ng is describing is itself a market opportunity: someone has to build the liability and insurance layer for physical AI before any robot deploys at scale.
2. The physical world was never designed to be networked
“Some buildings are 35 years old. Some buildings are 106 years old. Some buildings are 15 years old. And so the elevator was installed 30 years ago. So how do we service them? So there’s a lot of siloed issue.”
Ng made the point that a robot facing a 30-year-old lift controller on a proprietary protocol will not be able to work around the problem, or around different environments and different states of repair (or fragility).
The built environment, Ng said, is “a sector that has many siloed technologies.” Changi and Hong Kong airports can deploy autonomous vehicles because their infrastructure was built for it; most buildings were not. This is the same issue as in enterprise software, but with the added complexity of a physical environment.
3. Training data can’t cross borders
Zhang highlighted that the third leg of the golden triangle is missing: while hardware is converging and models are now powerful, data has its own “sovereignty” issues:
“Because of the regulations, the data we collect in Singapore might not be able to use in the (United) States. The data you collect from China, (it) is unlikely you can use (it) in your country. Because this kind of challenge force(s) us to re-collect data for the same problem. However, for the conventional automation business, you develop everything in your factory, tomorrow you can sell to the world. But today, cannot. Data is an asset. Data is national security. This is a challenge we are facing today.”
Data thus becomes the moat. At SuperAI last year, Persona AI’s Nicolaus Radford argued the bottleneck in humanoids is software rather than hardware: industrial robots earned their training data through decades of factory-floor deployment, and there’s no equivalent corpus for the messy real world. In software a model release resets everyone’s advantage overnight, but in physical AI the data is the moat precisely because it can’t move.
4. Not all parts of the process are the same
“If you go to the fast food, when they prepare the hamburger, the last step is wrap it. Wrap it. And if we can solve this problem tomorrow, we can deploy robot across maybe over 10,000 fast food restaurants, (but) who just wraps the hamburger?”
This is the weakest-link problem: fast food chains will probably not be able to automate burger-making in its entirety, because they will probably not be able to hire people just to do the last, hardest part: wrapping the burger. When it comes to factory operations, companies will have to allocate some tasks to humanoids, and then expand the scope as the technology improves. End-to-end automation of physical tasks seems unlikely at present.
5. The capture of training data has ethical failures
Spranger highlighted some of the ethical issues with data capture.
First, consent: Sony published a Nature cover paper last December on building computer vision datasets correctly. As Spranger put it: “people who appear in datasets, they have the right to consent, but they also have a right to revocation. So how do you manage that?” The physical presence of robots complicates this further.
“…If you think about sensors that are deployed in mobile phones, right? So today we have a choice. That data doesn’t automatically get uploaded for AI providers to train their models on. But of course, if you have a humanoid robot, the performance of the sense and the vision is directly tied to the data that it gets. And so you would actually, as a developer, you would like to use that data. But that is very sensitive data. I mean, that robot might be in your kitchen or in your bedroom or in your house, right? And so how do you deal with that? We do have technical means to deal with that, right?”
It’s the conflict I wrote about in “What robots need that the internet never had”. Shift X launched in New York offering free home cleaning in exchange for recording it; Pronto ran a version of this in India and landed in controversy.
As I’ve said previously, in the AI age, even if you’re paying, you’re still the product.
Second, bias:
“Of course, we want these datasets to work for people across the world, right? Not just in a particular geography, but they have to be bias-free,” he said. “And so this sort of provenance and dataset connection, it’s actually surprisingly still a core issue in some of the core fields of AI. I kid you not, some of the computer vision models that have been in use in the past, even in the last year, they’ve been shown to include child sexual abuse material, and nobody noticed. And I think no company on Earth wants anything to do with that or on their servers or otherwise.”
You can’t fix this in post-training; it’s in the foundation the models are built on, and fixing it means rebuilding the dataset, which almost nobody is paid to do.
6. The jobs nobody wants
“Ladies and gentlemen, some of the most essential services in your life and my life, is really undesirable jobs.”
Ng’s argument is that physical AI fills the jobs educated workforces refuse rather than the jobs they compete for. He points to healthcare, where “the people that’s pouring away waste and cleaning up all kinds of mundane and dirty and dull jobs, these are jobs that need to be attacked immediately.” On luggage handling, he said, “Nobody want to load your luggage into a plane these days. That’s the reason why when you fly some European brands, you saw your luggage being dumped because people hate that job.” As he put it: “Even if we have jobs, we have people, people just don’t want to do that job. And we need that.” He calls it “a structural deficiency.”
I don’t think this necessarily applies everywhere, because there will always be people who just want to do their job and get back to their lives, instead of seeking “self-actualisation”. While there are jobs that nobody wants to do, like cleaning or fixing sewage pipes or maybe picking up garbage, there are also jobs that people just don’t want to do at that price. That said, in some parts of the world, demographics are squeezing the timeline. Spranger explained it:
“One of the large car manufacturers in the world being listed on the Nikkei. I think the stat is something like 60% of their workforce is 50 years old. And so in Japan, people retire at 60 years. So half the company is gone in 10 years.”
Immigration is always a solution, but much of the world is insular, if not racist: they’d rather have robots than people who are “not like us”.
But to link this back to an earlier point, the demographic clock won’t slow down waiting for trust infrastructure to catch up.
7. $100 a month for a household robot
“I think three years we should see robot ready to work in your household range between $8,000 to $50,000... You probably don’t even need to pay that because it will be based on a service level every month, a hundred bucks and you get one. No different from what we are paying for a maid today.”
Zhang was more aggressive: “Last year, we predict 10 years. This year, we say in the next five years, robot will go to your home no matter you like or not.”
This will probably only happen if liability is solved: I’m not sure that data, ethics and privacy problems will be top of mind for regulators, or for those willing to trade privacy for convenience.
8. Robots will need character
“When we look at robots today, they don’t really have character. I think in the future, and this is where entertainment obviously comes in, and maybe my view from a Sony perspective, I think we’ll see much more robots that have a particular character that we can relate to. I think a household robot is great if it does dishes, right? But it is also something that can create joy, entertainment, fun, excitement if it’s kind of built correctly around IP.”
Zhang suggested that one should go and watch “Star Wars” one more time. I doubt we’ll see C-3PO soon (is a Roomba half an R2-D2 already?), but this does link back to last year’s panel on Physical AI, where the point was made about a robot being a platform that you can download skills into.
What no one spoke about
One capability that I’m seeing emerge in digital agents, especially Hermes, is self-learning. The current Physical AI models, and the approaches to them, appear almost static: you download a firmware and data update and get more capabilities.
However, what would actually create lock-in for a particular hardware-software combination of a physical agent is not just that it understands how to navigate an environment, but that it knows how to navigate your environment, and then some: your preferences, patterns, quirks, where you keep certain things, how you prefer things to be. Why aren’t we talking about this in the case of Physical AI?
My friend Nithin KD increasingly speaks of his OpenClaw Davidson as his buddy, and I’ll leave you with a couple of his tweets:
If you found this useful, share it with someone. If you want to learn vibe coding (it’s fun, trust me), sign up.




