To most of us, communication feels like a single task. In reality, it’s not: a machine trying to replicate dialog needs to be good at many tasks, such as answering questions, completing sentences and even making small talk. Research in each of these areas is commonly done independently, to the detriment of anyone trying to put the pieces together to create a conversational AI. The Facebook AI Research (FAIR) lab’s open source ParlAI serves as a home for dialog research, addressing this deficiency by making it easy to train models to complete multiple tasks with an assortment of commonly used datasets.
Pulling a dataset into a workflow with ParlAI is as easy as running a command from the command line. This gives researchers quick access to benchmarking datasets like SQuAD, the bAbI tasks and WebQuestions. This isn’t to say that the AI research community was unable to do this work before, but FAIR is trying to incentivize teams to regularly bring more datasets into their work. ParlAI also connects to Amazon Mechanical Turk so researchers can collect new data seamlessly.
Jason Weston, a researcher at FAIR, told me in an interview that some of the inspiration for ParlAI came from watching researchers make progress on the WebQuestions dataset, only to see the work largely ignored once it became clear that it was overly specialized and not applicable to other tasks.
One of the challenges of following AI research is that it’s really hard to take papers at face value. In nearly every paper, researchers claim to have achieved a new state of the art after benchmarking their fancy model against commonly used tests.
The problem is that so many factors can lead to a given outcome that achievements really only have value if they can be reproduced. ParlAI takes some of the work out of reproducing research, with the aim of instilling healthier habits in the AI community. The FAIR team hopes to add its own leaderboard in the future to help make sense of progress in the ecosystem.
ParlAI is similar in form to other training and testing solutions like OpenAI’s Gym and DeepMind’s Lab. But while Gym and Lab are optimized for reinforcement learning, ParlAI is focused squarely on dialog. Some of the supervised learning that underpins work in the dialog space is less sexy than trendy reinforcement learning, but it’s incredibly fundamental to the field of machine learning.
FAIR plans to use ParlAI internally for its own research. Facebook’s work in dialog underpins many of its services, the most obvious one being “M,” its human + AI-powered assistant. Weston tells me that a service like M might eventually be able to learn from talking to people and receiving feedback, much like how babies and young children learn.
But the only way to get there is to break down artificial silos and combine research to solve large-scale problems. You can find ParlAI on GitHub, where the FAIR team will be maintaining it for the foreseeable future.