Why would you want to do this?

If you have ever browsed Wikipedia (and who hasn't?), then you have probably experienced the feeling of being happily lost, browsing from one interesting topic to the next and encountering information that you would never have searched for. Wikipedia's articles are peppered with hundreds of millions of links, which explain the topics being discussed and create an environment where serendipitous encounters with information are commonplace.

The work described here aims to bring the same explanatory links—and the accessibility and serendipity they provide—to all documents.

Hasn't this been done already?

You may have seen other wikifier systems, like this one from the University of North Texas. Our system is fairly different. It doesn't just use Wikipedia as a source of information to link to, but also as training data for how best to do it. In other words, it has been trained to make the same linking decisions as the people who edit Wikipedia. Try them both out and see the difference for yourself.

Where can I read more?

This newspaper article provides a nice non-technical overview of the project. You can also get the full implementation and evaluation details from this conference paper.

Can I make this part of my own system?

Yes! In fact, there are a few options:

The wikify web service is machine-readable, so feel free to call it from within your code. It can be made to return results directly (without the surrounding web interface) by appending &wrapInXml=false to the URL (like this). You can also make it return XML (with additional information) by appending &xml (like this).

There are several other parameters available for specifying the format of the document, how many links to create, etc. Check here for details.

You can also host the service yourself, or work directly with the source code. It has been incorporated into the Wikipedia Miner Toolkit and released as open source under the GNU General Public License.
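
If you go with the web service option, calling it from your own code might look roughly like the Python sketch below. Only the &wrapInXml=false and &xml switches come from the description above. The service address, the `source` parameter name, the helper function names, and the UTF-8 response encoding are all assumptions made for illustration, so check them against the parameter details before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: substitute the real address of the wikify service.
SERVICE_URL = "http://example.org/services/wikify"


def build_wikify_url(text, raw=False, xml=False, base_url=SERVICE_URL):
    """Build a request URL for the wikify service.

    `source` is an assumed name for the parameter that carries the document.
    `wrapInXml=false` and `xml` are the switches described in the text above.
    """
    url = base_url + "?" + urlencode({"source": text})
    if raw:
        # Return results directly, without the surrounding web interface.
        url += "&wrapInXml=false"
    if xml:
        # Return XML with additional information.
        url += "&xml"
    return url


def wikify(text, **options):
    """Fetch the service response as a string (needs network access).

    Assumes the response is UTF-8 encoded.
    """
    with urlopen(build_wikify_url(text, **options)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds and prints the URL; no request is sent.
    print(build_wikify_url("Hello world", raw=True))
```

Keeping URL construction separate from the network call makes it easy to inspect the request before sending it, and to add the other parameters once you have confirmed their names.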