Award-Winning Researcher Trains Robots to Make Educated Guesses

IEEE Spectrum

Yen-Ling Kuo always wanted to understand how things worked. When she was growing up in Taiwan, reading the story of Michael Faraday in elementary school piqued her curiosity about the natural world. Around that time, she was introduced to Logo, an educational programming language whose turtle cursor helps children learn basic coding through hands-on experimentation.

It was Kuo’s introduction to programming logic.

Yen-Ling Kuo
Employer: University of Virginia in Charlottesville
Title: Assistant professor of computer science
Member grade: Member
Alma maters: National Taiwan University; MIT

In high school she realized how capable computers were: She could write programs that completed tasks on their own.

“Once I discovered how powerful computers could be,” she says, “I knew I wanted to focus on using them to solve real-world problems.”

Kuo, an IEEE member, never lost her interest in the “how” behind processes and tools. Her curiosity, combined with a stint working at a Silicon Valley company, led her to focus on innovations that live at the intersection of cognitive and computer sciences.

Kuo, now an assistant professor of computer science at the University of Virginia in Charlottesville, last year received the IEEE Robotics and Automation Society’s inaugural Outstanding Women in Robotics and Automation Early Career Contribution Award. The award is part of the IEEE-RAS Women in Engineering’s Outstanding Women in Robotics and Automation (WiRA) Paper Awards, which promote excellence and recognize the impact that female researchers have on robotics and automation fields at different stages in their academic careers.

Kuo’s winning paper, “Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation,” demonstrates a novel method to help robots better identify and estimate uncertainty when faced with scenarios on which they’ve not been trained. The method reduces the amount of human supervision, improves a robot’s rate of successful task completion, and opens up a path to introduce more complex models with bigger data demands into interactive robot learning.

She says her research will help people working in the robotics and automation fields more efficiently collect the data needed for effective model training.

Silicon Valley’s impact

Kuo earned bachelor’s and master’s degrees in computer science at the National Taiwan University, in Taipei, in 2009 and 2012. As she was nearing completion of her master’s degree, she did what many computer science graduates do: She pursued a summer internship at a tech company.

She spent the summer of 2011 at Google’s campus in Kirkland, Wash., working on the company’s comparison ads project.

When her internship ended, she joined the MIT Media Lab as a visiting student, working on the Open Mind Common Sense project with Henry Lieberman.

As she was considering pursuing a Ph.D., a call from Google changed her plans. The company offered her a full-time role as a software engineer.

“I viewed the job offer as a positive development,” she says. “I believe it can never hurt your future research career to get some real-world experience under your belt.”

She was hired in 2012 and helped build techniques that incorporate computer vision and natural language processing to improve the customer shopping search experience. She led the company’s Shop the Look initiative, a predecessor to Google’s current AI-powered shopping experience. The project connected social media content with search results, something the company had struggled to do in the past.

Kuo and her team were tasked with building a connection between the natural language people use to describe an item and an image that matches the searcher’s intent. It was at a time when neural networks, the deep learning models that power Google products, were gaining momentum at the company. Integrating neural network tools into her work was a requirement, which raised questions for Kuo.

“I was applying the neural network tools,” she says. “But I didn’t have 100 percent certainty about how they actually worked.”

She considered how she could become more knowledgeable about deep learning models. It was a full-circle moment. She decided that after nearly four years at Google, it was time to earn a Ph.D. in computer science. She returned to MIT in 2016.

The question that changed everything

Boris Katz, one of Kuo’s Ph.D. advisors, is a principal research scientist and the head of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)’s InfoLab. He also led the creation of the START Natural Language System, the world’s first Web-based question-answering system.

When the two met, Katz asked Kuo why she wanted to pursue a doctoral degree. She explained her interest in understanding how neural networks work and in using that knowledge to connect the physical world with human language.

He suggested she attend a summer course at MIT’s Center for Brains, Minds, and Machines, a research initiative that ran from 2013 through 2025. CBMM’s objective was to bring together computer scientists, cognitive scientists, and neuroscientists to understand how human intelligence works. The goal was to use the resulting insights to establish an engineering practice to build artificial intelligence systems.

For Kuo, it was a chance to better understand human intelligence and identify ways it could be replicated in machines.

“It was an opportunity for me to interact with other scientists and gain insight into how people learn, understand, and figure things out in the world,” she says. “I saw it as a very useful and inspiring way to incorporate those ideas into my own research work.”

During her Ph.D. studies, she was a research assistant at CSAIL. The experience helped shape her doctoral research, which focused on building AI systems that apply past learning to new situations. She developed machine learning models to support those efforts, including models for language understanding and social interaction.

She completed her Ph.D. in computer science in 2022 with a minor in cognitive science.

After graduation, she continued her work and collaboration at CSAIL, particularly on projects that involved the “theory of mind” concept.

Theory of mind spurs innovation

Theory of mind isn’t new, having originated with primatologists studying chimpanzees in the late 1970s. It refers to the ability to recognize that others have their own thoughts, beliefs, and perspectives, a skill that allows humans to infer someone’s mental state and predict their behavior without verbal communication.

“It’s like when college roommates are moving into their dorm. They may not talk too much, but they work together naturally to coordinate their activities and accomplish goals,” Kuo says. “They can infer and mentally interpret each other’s behaviors and signals to make decisions and complete tasks without words.”

She brought her theory of mind research to the University of Virginia when she joined as an assistant professor in 2023.

Kuo conducts her research in UVA Engineering’s multidisciplinary cyberphysical Link Lab. Her broad focus is on developing computational models that help robots interpret both direct data and silent signals, from language and movements to a person’s gaze. If successful, the work could give robots the same sort of physical and theory of mind reasoning capabilities that power physical and social interactions among humans.

“There are no computational frameworks yet available that will translate this kind of understanding into a robot efficiently,” she says.

She adds that the process to get there begins with improving how robots learn to perform tasks.

The evolution of robot learning

Historically, one way robots learned was to mimic humans. A researcher would manually guide a robot through a task, like cutting an apple, and it would repeat the movements. The robot was successful until the environment changed, such as when its hand was in a different position or the apple was at a different angle. The robot was then faced with a situation for which it hadn’t been trained. Without any data available to help it correct course, the robot would start making small errors that eventually led to a full system crash.

This diagram shows how the robotic gripper’s visual perception and tactile sensing prevent a potato chip from breaking. Credit: Xuhui Kang, Yen-Ling Kuo, et al.

To solve the problem, researchers developed the dataset aggregation (DAgger) method. As a robot performed a task, a researcher was on standby to provide real-time corrections during unexpected scenarios. The correction data was continuously added to the robot’s model, teaching it how to recover from mistakes.
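The loop described above can be sketched in a few lines of Python. This is a toy illustration, not the code used by any research team. The one-dimensional state, the `expert_action` function standing in for the human supervisor, and the linear learner are all assumptions made to keep the example small.

```python
import random

def expert_action(state: float) -> float:
    """Hypothetical expert: push the state toward zero, with a capped effort."""
    return -max(-1.0, min(1.0, state))

def fit_linear_policy(dataset):
    """Least-squares fit of action = gain * state (no intercept)."""
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0

def rollout(gain: float, start: float, steps: int = 10):
    """Run the learner's own policy and return the states it visits."""
    states, s = [], start
    for _ in range(steps):
        states.append(s)
        s = s + gain * s  # apply the learner's action
    return states

def dagger(iterations: int = 5, seed: int = 0):
    rng = random.Random(seed)
    # Initial demonstrations cover only a narrow range of states.
    dataset = [(s, expert_action(s)) for s in (0.9, 1.0, 1.1)]
    gain = fit_linear_policy(dataset)
    for _ in range(iterations):
        start = rng.uniform(-5.0, 5.0)
        # The learner acts; the expert labels the states the learner reached.
        for s in rollout(gain, start):
            dataset.append((s, expert_action(s)))
        # Aggregate the corrections and retrain on everything seen so far.
        gain = fit_linear_policy(dataset)
    return gain, len(dataset)

gain, n = dagger()
print(f"learned gain {gain:.3f} from {n} labeled states")
```

The key design point is that the corrections are collected on states the learner itself visits, not only on the states the expert would have visited.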

To reduce the human monitoring effort, robot-gated DAgger was created to enable bots to query humans when the machines became uncertain.

The most popular way to make the query decision is to train multiple models, known as an ensemble, and compare their outputs when determining a course of action. If the models all agree, the robot proceeds. If they don’t agree, the robot is considered likely to get stuck, so it asks for help.

Although the multiple model approach was widely adopted, it has limitations. Practically speaking, as models become more complex, it is hard or impossible to train multiple copies. A more fundamental issue is that disagreement among models doesn’t always imply uncertainty; it could just mean there are different ways to accomplish a task.
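A minimal sketch of the multiple-model check and its blind spot follows. The function names, the use of standard deviation as the disagreement measure, and the threshold value are illustrative assumptions. Real systems may measure disagreement differently.

```python
from statistics import pstdev

def ensemble_disagreement(predictions):
    """Spread of the actions proposed by the ensemble members."""
    return pstdev(predictions)

def should_query_human(predictions, threshold=0.1):
    """Ask for help when the members disagree by more than `threshold`."""
    return ensemble_disagreement(predictions) > threshold

# Members propose nearly the same steering command: the robot proceeds.
agreeing = [0.50, 0.51, 0.49]

# Members split between two equally valid choices, such as passing an
# obstacle on the left (-1.0) or on the right (+1.0).
two_valid_ways = [-1.0, 1.0, -1.0, 1.0]

print(should_query_human(agreeing))
print(should_query_human(two_valid_ways))
```

In the second case every member may be confident in a perfectly good action, yet the check still raises a request for help. That is the false alarm the paragraph above describes.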

The Diff-DAgger solution

That is the gap Kuo’s research team closed with Diff-DAgger. The approach builds on diffusion policy, a technique that helps robots account for the different ways a task can be performed.

The new method repurposes diffusion loss, the signal a robot uses to improve its model during training, as a real-time confidence check. During task execution, the robot computes the signal and compares it against values from its training data using a statistical test. The signal spikes when the robot faces an unfamiliar situation and is uncertain how to proceed. The signal stays low when the robot’s current action is close to what it learned before.

The spike represents the robot’s ability to self-diagnose and predict an imminent failure. Human intervention is triggered only when the signal spikes. No spike means the robot can be left to complete its decision-making process on its own.
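The confidence check can be pictured with a short sketch. The article does not say which statistical test the team uses, so the quantile threshold below is an assumption chosen for simplicity. The loss values are made-up numbers standing in for the output of a trained diffusion policy.

```python
import math

def empirical_quantile(values, q):
    """Return the q-quantile (0 < q <= 1) using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def needs_human(current_loss, training_losses, q=0.95):
    """Flag a step whose loss exceeds the q-quantile of the training losses."""
    return current_loss > empirical_quantile(training_losses, q)

# Hypothetical diffusion-loss values recorded on the training data.
training_losses = [0.010, 0.012, 0.011, 0.015, 0.009,
                   0.013, 0.014, 0.010, 0.012, 0.011]

print(needs_human(0.011, training_losses))  # familiar situation, low loss
print(needs_human(0.080, training_losses))  # unfamiliar situation, loss spike
```

Only one model is needed for this check, which is what separates it from the multiple-model approach.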

Kuo’s team achieved significant results: Failure prediction rates improved by 39 percent, task completion rates rose by 20 percent, and tasks were completed nearly eight times faster.

Her research at UVA gained attention from the National Science Foundation, which honored her last year with a CAREER Award, the foundation’s flagship grant for early-career researchers. The five-year US $665,000 grant supports her research on building computational models for human-robot interactions through theory of mind reasoning.

She also received the Toyota Research Institute’s Young Faculty Researcher Award to teach cars to reason about interactions on the road and with the driver.

As service robots and self-driving vehicles become more available, such work is likely to make interactions between humans and robots more intuitive and useful.

Kuo ultimately wants to build more robust robots that are able to integrate into a social space with humans by engaging with us through grounded interactions, she says.

The impact of IEEE

Like many IEEE members, Kuo was introduced to the organization as a student. In 2018 she submitted her first paper, “Deep Sequential Models for Sampling-Based Planning,” to the IEEE/Robotics Society of Japan International Conference on Intelligent Robots and Systems while pursuing her Ph.D. at MIT. Her IEEE involvement grew alongside her professional career.

“It was a natural segue to transition from student to a full IEEE member,” she says. Today she is an active volunteer with the IEEE Robotics and Automation Society, a reviewer for submitted papers, and a presenter and panelist at conferences.

She says one of the best parts of attending conferences is having the opportunity to engage with students. She also enjoys participating as a panelist at luncheons, she says, because it gives her one-on-one time with student attendees. She can share her knowledge and offer insights as they prepare to embark on their careers.

Her goal in the coming years, she says, is to broaden her involvement with IEEE initiatives and branch out to other technical committees. Sharing knowledge and learning from others is essential to anyone’s career growth, she says, and “IEEE offers a great opportunity for both.” ...
