AI researchers have made significant advances in building models that generate text mimicking natural language. State-of-the-art systems perform so well that it is sometimes hard to distinguish their output from text written by a person. An essential next step is to make these models generate text that is not only fluent but also grounded in real-world knowledge.
KILT (Knowledge Intensive Language Tasks) helps AI researchers and enthusiasts build models that can better leverage real-world information to accomplish a broad range of tasks. Uniting 11 widely used public data sets, KILT represents five different tasks:
- Fact checking
- Open-domain question answering
- Slot filling
- Entity linking
- Dialog generation
KILT is thus the first benchmark to aggregate data sets representing such a wide variety of knowledge-intensive tasks. The 11 data sets are united in a single format and grounded in a single preprocessed collection of the entire Wikipedia corpus. This matters because preprocessing a large corpus is time-consuming and can have a large effect on a model's downstream performance. Mapping all data sets to a single corpus makes research more convenient and enables a fairer, more accurate comparison across models.
Mapping all the data sets to the same corpus and using a unified format also makes it much easier to explore transfer learning and multitask learning approaches.
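To make the idea of a unified format concrete, here is a minimal sketch of how examples from two different tasks might be mapped to one shared record type. The class names, field names, helper functions, and the `page-001` identifier are all hypothetical and chosen for illustration; they are not necessarily the schema KILT actually uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Provenance:
    # Hypothetical pointer into the shared Wikipedia snapshot.
    page_id: str
    paragraph: int


@dataclass
class Output:
    answer: str
    provenance: List[Provenance] = field(default_factory=list)


@dataclass
class Record:
    task: str    # e.g. "qa" or "fact_checking"
    input: str   # the question, claim, or dialog history
    outputs: List[Output]


def from_qa(question: str, answer: str, page_id: str, paragraph: int) -> Record:
    """Map a (hypothetical) question-answering example to the shared schema."""
    return Record("qa", question, [Output(answer, [Provenance(page_id, paragraph)])])


def from_fact_check(claim: str, label: str, page_id: str, paragraph: int) -> Record:
    """Map a (hypothetical) fact-checking example to the same schema."""
    return Record("fact_checking", claim, [Output(label, [Provenance(page_id, paragraph)])])


# Because both tasks share one schema and one corpus, a multitask batch
# is simply a list of records. The examples below are toy data.
batch = [
    from_qa("Who wrote Hamlet?", "William Shakespeare", "page-001", 0),
    from_fact_check("Hamlet was written by Shakespeare.", "SUPPORTS", "page-001", 0),
]

# Every record points into the same knowledge source.
pages = {p.page_id for r in batch for o in r.outputs for p in o.provenance}
```

With a layout along these lines, a single training loop can consume records from any task without task-specific parsing, which is what makes transfer and multitask experiments cheaper to set up.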
Aligning all the data sets in KILT with a single knowledge source, a recent Wikipedia snapshot, can help catalyze research into unified, task-agnostic architectures for knowledge-intensive tasks. It also makes it easier to experiment with different task-specific solutions.
We evaluate how models perform on knowledge-intensive tasks by considering both the output itself and the specific information used to produce it. To that end, the KILT benchmark includes provenance information: a mapping to the knowledge that can solve each task. For several tasks, we make the provenance annotation more comprehensive through an annotation campaign. Together, the provenance and the output make it possible to assess both a model's accuracy and its ability to justify a prediction.
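One way to picture scoring output and provenance together is the simplified sketch below. It assumes each example carries a predicted answer, a ranked list of predicted page identifiers, and gold answers and pages. It uses exact match for the answer and R-precision for the provenance, and counts a prediction as "justified" only when both are fully correct. The dictionary keys and the `justified_accuracy` name are assumptions made for illustration, not necessarily the benchmark's official metrics.

```python
from typing import Dict, List, Set


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    return normalize(prediction) in {normalize(g) for g in gold_answers}


def r_precision(predicted_pages: List[str], gold_pages: Set[str]) -> float:
    """Fraction of gold pages found among the top-R predictions, R = len(gold_pages)."""
    r = len(gold_pages)
    if r == 0:
        return 0.0
    return len(set(predicted_pages[:r]) & gold_pages) / r


def score(examples: List[Dict]) -> Dict[str, float]:
    n = len(examples)
    em = [exact_match(e["prediction"], e["gold_answers"]) for e in examples]
    rp = [r_precision(e["predicted_pages"], e["gold_pages"]) for e in examples]
    # A prediction is credited as justified only if the answer is right
    # AND all of the gold provenance was retrieved.
    justified = [m and p == 1.0 for m, p in zip(em, rp)]
    return {
        "accuracy": sum(em) / n,
        "r_precision": sum(rp) / n,
        "justified_accuracy": sum(justified) / n,
    }


# Toy examples: both answers are right, but only the first is supported
# by the correct page.
examples = [
    {"prediction": "Paris", "gold_answers": ["paris"],
     "predicted_pages": ["p1", "p2"], "gold_pages": {"p1"}},
    {"prediction": "London", "gold_answers": ["London"],
     "predicted_pages": ["p9"], "gold_pages": {"p3"}},
]
result = score(examples)
```

Separating the three numbers shows whether a model that answers correctly is also pointing at the right evidence, or is merely right without being able to justify it.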
To conclude, natural language processing models are now used in real-world AI applications. KILT facilitates the research needed to improve these models and, ultimately, to build machines with in-depth knowledge of the real world.