Robotics is one of those things the business funny papers regularly wonder about; consumer robotics seems like a revolutionary trillion-dollar market which is perpetually 20 years away, more or less like nuclear fusion.
I had contemplated fiddling with robotics in hopes of building something that would do a useful science-fictiony thing, like go fetch me a beer from the refrigerator. It seemed like a nice way of fucking around with math and the machine shop, and ending up with something cool and useful to fiddle with. To do this, my beer-fetching robot would have to navigate my potentially cluttered apartment to the refrigerator, open the door, look for the arbitrarily shaped/sized beer bottle amidst the ketchup bottles, jars of herring, broccoli and other such irrelevant objects, move things out of the way, grasp the bottle and return to me. After conversing with a world-renowned expert in autonomous vehicles (a subset of robotics), I was informed that this isn't really possible. All the actions I described above are open problems. Sure, you could do some ridiculous workaround that makes it look like autonomous behavior. I could also train a monkey or a dog to do the same thing, or get up and get the damn beer myself.
There really aren't any lists of open problems in robotics, I am assuming because it would be a depressingly long litany. I figured I would assemble one; one which I assume will be gratuitously incomplete and occasionally wrong, but which makes up for all that by actually existing. Like my list of open problems in physics and astronomy, I could very well be wrong about some of these, or behind the times, since my expertise consists of google and 5-10 year old conversations with a cool dude between deadlifts, but it seems worth doing.
- Motion planning is an actual area of research, with its own journals, schools of thought, experts and sets of open problems. Things like "how do I get my robot from point A to point B without falling into a canyon, getting stuck, or failing to deal with obstacles generally" are not solved problems. Even things like a model of where the robot is with respect to its surroundings: totally an open problem. How to know where your manipulator is in space, and how to get it somewhere else: open problem. Obviously beer-fetching robots need to do all kinds of motion planning. Any potential solution will be ad hoc and useless for the general case of, say, fetching a screw from a bin in the machine shop.
- Multiaxis singularities: this one blew my mind. Imagine you have a robot arm bolted to the ground. You want to teach the stupid thing to paint a car or something. There are actual singularities possible in the equations of motion, configurations where the arm momentarily loses a degree of freedom; and it is more or less an underconstrained problem. I guess there are workarounds for this at this point, but they all have different tradeoffs. It's as open a problem as motion planning on a macro scale.
- Simultaneous Localization and Mapping, SLAM for short. When you enter a room, your brain knows exactly where your body is, and makes a map of the surroundings. Robots have a hard time with this. There are any number of solutions to the problem, but ultimately the most useful one is to make a really good map in advance. Having a vague map, a topological map, or some kind of prior on the environment: these are all completely different problems which seem like they should have a common solution, but don't. While there are solutions to some of these problems available, they're not general, and definitely not turn-key to the point where there's a SLAM module you can buy for your robot. I could program my beer robot to know all about my room, but there are always going to be new obstacles (a pair of shoes, a book) which aren't in its model. It needs SLAM to deal.
- Lost robot problem. Related: if I wake up and my friends have moved my bed to another room, we'll all have a laugh. Most robots won't know what to do if they lose track of their location. A robot needs a strategy to deal with this, and the strategies are not general. It's extremely likely I'll turn on my beer robot in different positions and locations in the room, and it will have to deal with that. Now imagine I put it somewhere else in the apartment building.
- Object manipulation and haptic feedback. Hugely not done yet. The human hand is an amazing thing, and robot manipulators are nowhere near being able to manipulate with haptic feedback or even simply manipulate real world objects based on visual recognition. Even something like picking up a stationary object with a simple graspable plane is a huge unsolved problem people publish on all the time. My beer robot could have a special manipulator designed to grasp a specific kind of beer bottle, or a lot of models of shapes of beer bottles, but if I ask the same robot to fetch me a carrot or a jar of mayo, I’m shit out of luck.
- Depth estimation. A sort of subset of object manipulation; you'd figure a robot with binocular vision, or even simply the ability to poke at an object and see it move, would find this pretty simple to do. It's very much an open problem. Depth estimation is a problem for my beer-fetching robot even if the beer is in the same place in the refrigerator every time (the robot won't be, depending on its trajectory).
- Position estimation of moving objects. If you can't tell how far away an object is, you're sure going to have a hard time estimating what a moving object is doing. Lt. Data ain't gonna be playing baseball any time soon. If my beer robot had a human-looking bottle opener, it would need a technology like this.
- Affordance discovery: how to predict what an object will do when you interact with it. In my example, the robot would need a model for how objects are likely to behave when it moves them aside while searching my refrigerator for a beer bottle.
- Scene understanding: this one should be obvious. We're just at the point where image recognition is useful: I drove an Audi on the autobahn which could detect and somewhat adhere to the lines on the highway. I'm pretty sure it eventually would have detected the truck stopped in the middle of the road in front of me, but despite the fact that this requires only a trivial "you're going to turn into road pizza" if(object_in_front) {apply_brake} level of understanding, it showed no evidence of being capable of this much reasoning. Totally open problem. I'll point out that the humble housefly has no problem understanding the concept of "shit in front of you; avoid," making robots and Audi brains vastly inferior to the housefly. Even putting the obvious problem aside: imagine your robot is tasked with getting me a beer out of the refrigerator and there is a bottle of ketchup obscuring the beer. The robot will be unable to deal, even with a 3-d model of the concepts of beer bottle and ketchup bottle, which would be absurdly complex to program the robot with.
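To make the motion planning point concrete: on a toy occupancy grid with a perfect map, path planning is something you can knock out in a few lines. Here's a minimal sketch of my own (not anybody's production planner) using breadth-first search; everything that makes real motion planning hard (the map is wrong, the robot isn't a point, the world moves) is assumed away:

```python
from collections import deque

def plan_path(grid, start, goal):
    """Breadth-first search on a 2-D occupancy grid: 0 = free, 1 = obstacle.
    Returns a shortest list of (row, col) cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}          # doubles as the visited set
    while frontier:
        cell = frontier.popleft()
        if cell == goal:               # walk the parent links back to start
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None

# A toy "apartment": the robot has to route around a wall with one doorway.
apartment = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = plan_path(apartment, (0, 0), (2, 0))
```

The gap between this and a real planner is the whole open problem: real robots get the grid wrong, have kinematic constraints, and share the room with things that move.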
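Similarly for depth estimation: the geometry of binocular vision is trivial; depth is focal length times baseline over disparity. A sketch with made-up camera numbers follows; the actual open problem, figuring out which pixel in the left image corresponds to which pixel in the right image, is the part this toy function simply assumes has been handed to it:

```python
def stereo_depth(focal_px, baseline_m, x_left_px, x_right_px):
    """Depth from binocular disparity for rectified cameras: Z = f * B / d.
    Assumes a perfect pixel correspondence, which is the hard part."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("object at infinity, or the correspondence is wrong")
    return focal_px * baseline_m / disparity

# 700 px focal length, 6 cm baseline, 20 px disparity -> 2.1 m away
z = stereo_depth(700.0, 0.06, 340.0, 320.0)
```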
(Several of the above problems, illustrated.)
There’s something called the Moravec paradox which I’ve mentioned in the past.
“it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility”
Robotics embodies the Moravec paradox. There’s a sort of corollary to this that people who work in the tiny field of “actual AI” (as opposed to ML ding dongs who got above their station) used to know about. This was before the marketing departments of google and other frauds made objective thought about this impossible. The idea is that intelligence and consciousness arose spontaneously out of biological motion control systems.
I think the idea comes from Roger Sperry, but whatever; it used to be widely known and at least somewhat accepted. Those biological motion control systems exist even on a microscopic level; even unicellular creatures like the paramecium, or primitive animals without real nervous systems like the hydra, are capable of solving problems that we can't solve even in the general case with the latest NVIDIA supercomputer. While robotics is a noble calling and roboticists solve devilishly hard problems, animal behavior ought to give a big old hint that they're not doing it right.
Guys like Rodney Brooks seemed to accept this and built various robots that would learn how to walk using primitive hardware and feedback-oriented ideas rather than programmed ideas. There was even a name for this: "Nouvelle AI." No idea what happened to those ideas; I suppose they were too hard to make progress on, though the early results were impressive looking. Now Dr. Brooks has a blog where he opines that hilarious things like flying cars and "real soon now" autonomous vehicles are right around the corner.
I’ll go out on a limb and say I think current year Rodney Brooks is wrong about autonomous vehicles, but I think 80s Rodney Brooks was probably on the right path. Maybe it was too hard to go down the correct path: that’s often the way. We all know emergent systems are super important in all manner of phenomena, but we have no mathematics or models to deal with them. So we end up with useless horse shit like GPT-3.
It's probably the case that, at minimum, a genuine "AI" would need to have a physical form and be capable of interacting with its environment. Many of the problems listed above are NP-hard in their general algorithmic formulations. To me, this implies that crap involving computers such as we use is wrong. We approximately solve NP-hard problems in other ways all the time; you can do it with soap bubbles, which relax into minimal surfaces, but the design of the "computer" is vastly different from the von Neumann machine: it's an analog machine where we don't care about infinite accuracy.
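As a concrete example of "approximately solving NP-hard problems": the traveling salesman problem is NP-hard, but a dumb greedy heuristic (sketched below with made-up coordinates) produces a usable, if non-optimal, tour in polynomial time. That's roughly what the soap bubble is doing: relaxing into a good-enough low-energy answer rather than grinding out the provably optimal one.

```python
import math

def greedy_tour(points):
    """Nearest-neighbor heuristic for the (NP-hard) traveling salesman
    problem: no optimality guarantee, but a serviceable tour in O(n^2)."""
    unvisited = list(range(1, len(points)))
    tour = [0]                                 # always start at point 0
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Hypothetical waypoints the beer robot might visit.
stops = [(0, 0), (5, 0), (5, 5), (0, 5), (1, 1)]
order = greedy_tour(stops)
```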
You can see some of this in various proposed neuromorphic computing models: it's abundantly obvious that nothing like stochastic gradient descent or contrastive divergence is happening in biological neurons. Spiking models like the liquid state machine are closer to how a primitive nervous system works, and they're fairly difficult to simulate on von Neumann hardware (some NPC is about to burble "Church-Turing thesis" at me: don't). I think it likely that many open problems in robotics could be solved using something more like a simulacrum of a simple nervous system than by writing python code in ROS.
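For the curious, here's roughly what a spiking neuron looks like in simulation. This is a bare-bones leaky integrate-and-fire unit with toy parameters of my own choosing, the kind of element liquid state machines are built from; note the output is a spike train rather than a float, which is part of why it maps awkwardly onto von Neumann hardware:

```python
def lif_spikes(input_current, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire neuron: the membrane potential leaks toward
    zero, accumulates input, and emits a spike (then resets) on crossing
    the threshold. Returns a binary spike train."""
    v, spikes = 0.0, []
    for i in input_current:
        v = leak * v + i         # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)
            v = 0.0              # reset after the spike
        else:
            spikes.append(0)
    return spikes

# Constant drive: the neuron fires periodically instead of outputting a number.
train = lif_spikes([0.4] * 10)
```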
But really, all I know about robotics is that it’s pretty difficult.