Daumé Part of Team Receiving ACL Test of Time Award
A University of Maryland expert in natural language processing has been recognized by the Association for Computational Linguistics (ACL) for a paper he co-authored 10 years ago that uses computer vision to produce descriptions of images.
Hal Daumé III, a professor of computer science with joint appointments in the Language Science Center and the University of Maryland Institute for Advanced Computer Studies, was part of a team honored with the ACL Test of Time Award.
This prestigious honor recognizes up to four papers each year for their long-lasting impact on the field of natural language processing and computational linguistics.
“Midge: Generating Image Descriptions From Computer Vision Detections” was published in 2012 and has been cited more than 400 times.
The paper introduces a novel generation system—nicknamed “Midge”—that composes human-like descriptions of images from computer vision detections.
Midge generates a well-formed description of an image by filtering out unlikely attribute detections and placing objects into ordered syntactic structures known as syntactic trees. To train the system on how people describe images, the researchers used a data set of 700,000 Flickr images with associated descriptions.
The research team’s results showed that Midge outperformed state-of-the-art systems, automatically generating some of the most natural image descriptions to date. Years later, image captioning has become a prestigious subfield, due in part to the success of this paper.
“We did this work just as ‘language and vision’ research was blossoming, and this team with a broad mix of expertise really allowed us to push the envelope in what was possible at the time,” says Daumé.
He adds that, given today’s incredible advances in language and vision technology, in both image captioning and image generation, it is humbling to see how far the field has evolved in the past decade.
“Although some of the techniques and ideas we employed in the Midge paper may, through today's eyes, seem a bit dated, I hope that they can inspire some new directions in longer-form text generation today,” he says.
Daumé and the other co-authors were formally recognized with the ACL award during a May 25 awards ceremony at the 60th Annual Meeting of ACL (ACL 2022), held this year in Dublin, Ireland.
—Story by Melissa Brachfeld
Margaret Mitchell, a computer scientist at HuggingFace who works on algorithmic bias and fairness in machine learning, was the lead author on the paper. Other authors were Jesse Dodge, a research scientist at the Allen Institute for AI; Amit Goyal, a senior applied scientist at Amazon; Kota Yamaguchi, a research scientist at CyberAgent Inc.; Karl Stratos, an assistant professor of computer science at Rutgers University; Xufeng Han, who received his doctorate in computer science from the University of North Carolina at Chapel Hill; Alyssa Mensch, an associate technical staff member in the Artificial Intelligence Technology and Systems Group at MIT Lincoln Laboratory; Alex Berg, an associate professor of computer science at UC Irvine; and Tamara L. Berg, an associate professor of computer science at the University of North Carolina at Chapel Hill.