Demystifying machine-learning systems | MIT News | Massachusetts ...

DISCLAIMER: This blog is fully automated, unmonitored, and does not reflect the views of Garett MacGowan. The ML model may produce content that is offensive to some readers.

This blog post was generated with a (potentially) real article title as the only prompt. A link to the original article is below.

Generated: 2/21/2022
Demystifying machine-learning systems | MIT News | Massachusetts ...

(Phys.org)—A new MIT study of a machine-learning algorithm used to classify images suggests that a technique called reinforcement learning (RL) may be a more natural way for a robot to learn from data than the more common, traditional approach known as supervised learning (SL). Used correctly, RL can learn tasks automatically from its own experience, whereas SL requires that a human first provide the algorithm with example data of known outcomes before the robot starts to learn, and then provide specific guidance as the robot practices on its own.
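
As a rough illustration (my sketch, not code from the study), the difference can be reduced to a few lines of Python: supervised learning corrects a model toward known labels, while reinforcement learning strengthens whichever actions happen to earn reward.

```python
import random

# Illustrative sketch only (not the study's code): the two paradigms
# reduced to toy form.

# Supervised learning: a human supplies (input, known outcome) pairs,
# and the model is corrected toward each known label.
def train_supervised(examples, w=0.0, lr=0.1):
    for x, y_true in examples:
        y_pred = w * x
        w += lr * (y_true - y_pred) * x   # nudge toward the label
    return w

# Reinforcement learning: no labels. The agent acts, observes a reward,
# and strengthens whichever actions happened to pay off.
def train_reinforcement(reward_of, episodes=100, lr=0.1, explore=0.2):
    prefs = {"left": 0.0, "right": 0.0}   # hypothetical action values
    for _ in range(episodes):
        if random.random() < explore:
            action = random.choice(list(prefs))   # try something new
        else:
            action = max(prefs, key=prefs.get)    # exploit what works
        prefs[action] += lr * reward_of(action)   # reinforce
    return prefs

print(train_supervised([(1, 2), (2, 4), (3, 6)]))  # w moves toward 2
print(train_reinforcement(lambda a: 1.0 if a == "right" else 0.0))
```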

By taking an approach that is more directly applicable to robotics, the researchers hope RL may one day be used to train a robot to perform simple everyday tasks, like pouring a glass of water or cooking a meal, without guidance.

"For decades people have tried to develop robots that can make these types of things for you, to help you out, with these big, expensive and highly trained machines," says Peter Stone, a research scientist at MIT's Computer Science and Artificial Intelligence Laboratory and the paper's lead author. "But we can't rely on these systems until we start to get them to take some responsibility [for completing these tasks on] their own—for example, pouring a glass of water or fixing a meal."

To demonstrate the method, the researchers trained a robot to play a game. In the game, the robot had to move from a start position to a target, guided only by images from its camera; it was not told where the target was. To learn to figure out where the target might be, a human trainer would show it a sequence of images and tell it to follow a specific "policy," the directions used to reach the goal from the start position.

Then the researchers programmed the robot so that it would "reinforce" reaching the goal through what's called "policy transfer": instead of following only the steps needed to get there, the algorithm found the route to the target by analyzing several different versions of its goal-directed policies, which serve as the algorithm's "memory" for deciding what actions to take and which actions are likely to achieve its goal.
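
The paper's exact "policy transfer" procedure isn't spelled out here, but the underlying reinforcement idea can be illustrated with a textbook tabular Q-learning loop on a toy corridor (a generic stand-in, not the researchers' system): the agent is never told where the goal is, and repeated trials strengthen the actions that lead to it.

```python
import random

# Toy stand-in for the navigation game: tabular Q-learning on a short
# corridor. The agent only receives a reward upon reaching the goal,
# and repeated trials reinforce the actions that got it there.
N_STATES, GOAL, ACTIONS = 10, 9, (-1, +1)   # states 0..9, move left/right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2           # learning rate, discount, exploration

for episode in range(200):
    s = 0                                    # start position
    while s != GOAL:
        # Explore sometimes; otherwise take the best-valued action so far.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Reinforce: pull Q toward the reward plus the next state's value.
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next
```

Early episodes wander in the wrong direction before the value estimates settle, which mirrors the error-filled starting policies described next.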

Over many rounds of experimentation, the team found that the robot learned to find the best solution most quickly, but only after starting out with policies that initially produced errors, that is, policies whose steps took it in a different direction from the correct one.

The researchers say this approach is a natural fit for robotics because the algorithm can adapt to new information, but it may not be suitable for all applications in which SL is the norm, such as teaching an elderly person to navigate a simple task like grocery shopping. "We wanted it to be as natural as possible," says co-author Anirban Roy, an associate professor of computer science and engineering who works with a team of researchers from Columbia University, Carnegie Mellon University, the National Institute of Informatics and other institutions.

Instead of giving the robot directions, he says, the algorithm takes what it already knows, sees what it might do with that knowledge, and from there figures out the best course of action.

"These algorithms, they work really well for solving problems, but they can be very difficult to design," says Michael J. Dourado, an assistant professor of computer science at Rochester Institute of Technology, who was not involved in the research. "This gives a very strong alternative approach to learning from mistakes."

In the future, the researchers say, robots could use the technique to automatically learn common activities they might perform on their own. For example, in one experiment described in the paper, the robot looked at a simple cooking task, preparing a meal in the kitchen, and saw that its camera had recorded several different positions in which it thought it might have dropped food. As it learned how those positions could lead to a poorly cooked result, the robot's recipe improved. It was trained much as a teacher would teach a child, but without the teacher's benefit of seeing the child's errors.

"If the robot can look at all these images and learn from them," Dourado says, "that can be very helpful, especially in the context of robots where you're really trying to design them to be more self-sufficient. It's really a key insight."

Machine learning

Much of what we now recognize as artificial intelligence, from speech recognition to image classification, is a "supervised learning" process, in which labeled information is presented to the machine in a form it can learn from.

In contrast, "unsupervised learning" involves a robot, having no knowledge that there is a correct outcome to reach, making up its own method to achieve the goal: the robot decides, without guidance or the expectation of a reward, where to move next, what to do next, and so on.

Unsupervised learning has been used to solve problems in robotics and other fields, and for the most part has worked for things like navigation. But researchers have said there are several obstacles to making it generally useful: such approaches typically need very large datasets, and even when they manage to get good results, they sometimes require high-performance computing to speed up the process.

Another major drawback is that they are generally useful only for simple tasks. "They're great at learning low-level behavior, but they don't learn high-level behavior" that requires a sequence of high-level decisions, "which are really what a robot is going to have to learn to do," Roy says.

For an unsupervised algorithm, the robot must find its way by watching what happens when it performs some simple action, but the process it uses to get somewhere may not be repeatable if the action is taken in a different way, takes a different amount of time to perform, and so on. So it might learn the right answer to "How do you eat a sandwich?" but not to "What color is a blueberry?" or "What is the weather like today?"

Supervised learning, in contrast, presents information in a form the robot can learn from an outside source: for example, a teacher demonstrates an action, and the robot is told that if that action is performed, the person receiving it will be happy. The robot then learns how to perform that action for itself.
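
In code, that teacher-driven setup looks like simple behavior cloning (a generic sketch with hypothetical state and action names; the article doesn't specify an implementation): the robot records which action the teacher approved for each situation and replays the mapping.

```python
# Behavior-cloning sketch with hypothetical state and action names:
# learn a state -> approved-action mapping from teacher demonstrations.
demonstrations = [
    ("cup_on_table", "pick_up"),
    ("cup_in_hand", "pour"),
    ("cup_empty", "set_down"),
]

policy = {}
for state, approved_action in demonstrations:   # the "supervision": a
    policy[state] = approved_action             # teacher labels each state

def act(state):
    return policy.get(state, "ask_teacher")     # fall back when unsure

print(act("cup_in_hand"))   # -> pour
```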

Supervised learning, despite its shortcomings, has been a major force behind computers performing functions that were never explicitly programmed, like recognizing images or analyzing the human genome.

Although machines might eventually develop the capacity to teach themselves a wide variety of tasks, the research presented in this study suggests that unsupervised learning might be the more natural way for a robot to learn from data.

"For years, we've been trying to get a robot to do something very simple," says G. Peter Hagelund, a professor in the computer science and engineering department at the University at Buffalo. "If that's the thing you are trying to do, it is really interesting to see that unsupervised learning—the idea that it just has to do it on its own—was able to learn from the images, given that it had no additional information to work with."

Hagelund was not involved in the work but says the "machine learning" used in this experiment can in some ways be considered an "unsupervised learning" process, in that the robot does not need to be told in advance how to do the task. The robot knows it must reach a goal, but it is not told how.

"The idea of self-supervision is that you need to figure out what you want to do after you've seen it, and then you can just go and do it," Hagelund says. "That's similar to how animals learn through experience."

In the work, Roy says, what the robot learned was that it could consider multiple possible routes to any given destination and find one that led toward the goal. It learned to navigate toward and away from other possible goals, all of them having the same potential to serve the robot's purpose: finding the goal.

Roy says the research team is already working on a follow-up study that would test the technique in a more challenging environment. Their approach involves having the robot take images of the scenes it encounters during a task and build a "memory" representation of each image containing information about the objects in it and where they were, information that could then be used to learn a goal-directed task.
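
One plausible reading of that "memory" (an assumption on my part; the post gives no implementation details) is a per-image record of detected objects and their positions that later, goal-directed behavior can query:

```python
# Hypothetical scene-memory sketch: each observed image is reduced to the
# objects it contains and their positions, which later tasks can query.
scene_memory = []

def remember(image_id, detections):
    """detections: list of (object_name, (x, y)) pairs from some detector."""
    scene_memory.append({"image": image_id, "objects": dict(detections)})

def locate(object_name):
    """Return every remembered position of an object, newest first."""
    return [(m["image"], m["objects"][object_name])
            for m in reversed(scene_memory) if object_name in m["objects"]]

remember("frame_001", [("cup", (120, 80)), ("pan", (40, 200))])
remember("frame_002", [("cup", (150, 85))])
print(locate("cup"))   # -> [('frame_002', (150, 85)), ('frame_001', (120, 80))]
```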

"You still need to program [a robot], but this is in a way a more natural programming approach," Roy says. "We can let [it] learn the representation in an automatic fashion."

This new technique, Roy explains, might also be useful to teach robots to work with other people, with a goal of improving their social skills.

"I tend to think this represents a major shift in the field," Hagelund says. "We used to think that robot learning had to be programmed from the outside in, from first principles, and only the robot could figure out the best way to do things. It turns out the robot can come up with its own rules for the robot to play."

Hagelund says the work has implications if robots are to become increasingly integrated into daily life. He envisions a world in which the robot learns by trial and error, solving problems on its own, and is able to help solve problems other humans face on a daily basis.

"You can envision the robot starting to learn a bit about the things it sees, and perhaps it is learning to drive a car," Hagelund says.
Garett MacGowan

© Copyright 2023 Garett MacGowan