The wikifier was evaluated mainly on Wikipedia articles, by stripping out their links and trying to restore them automatically. It was also tested on news stories from the AQUAINT corpus, to see whether it would work as well "in the wild" as it did on Wikipedia. The stories were wikified automatically and then inspected by human evaluators. We share the news stories here as an additional dataset that others can use to evaluate their own systems.
You can download the stories as a directory of XML files. These files contain only links that were created manually, or created automatically and then checked manually. In other words, every link in them has been manually verified, so they can be used as ground truth.
You can also browse the stories below. These pages show both good and bad links, to illustrate how the system behaved.
- Blue links were created automatically, and identified as correct by at least two of three evaluators.
- Red links were created automatically, but identified as incorrect by at least two of three evaluators.
- Green links were created manually by the evaluators, because the automatic system failed to create them. Dark green indicates that only one evaluator felt the link was missing, while light green indicates that all five identified it.
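The colour rules above can be sketched in a few lines of Python. This is only an illustration of the voting rule, not the code used to build these pages: the `Link` class, its field names, and the helper functions are hypothetical, and the shades of green are left out because the rule for intermediate vote counts is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Link:
    # All field names are hypothetical, chosen for this sketch only.
    anchor: str             # the linked text
    automatic: bool         # True if the wikifier created the link
    votes_correct: int = 0  # evaluators (out of three) who judged it correct


def link_color(link: Link) -> str:
    """Map a link to the colour used on the browsable pages."""
    if not link.automatic:
        return "green"  # added by the evaluators themselves
    # Automatic links are accepted when at least two of three evaluators agree.
    return "blue" if link.votes_correct >= 2 else "red"


def ground_truth(links: list[Link]) -> list[Link]:
    """Keep only manually verified links, as in the downloadable files."""
    return [link for link in links if link_color(link) != "red"]
```

For example, an automatic link approved by only one evaluator would be shown in red on the browsable page and dropped from the ground-truth set, while a link added by an evaluator would be kept.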