More than 9 million broken links on Wikipedia are now rescued

Internet Archive Blogs:

For more than 5 years, the Internet Archive has been archiving nearly every URL referenced across close to 300 Wikipedia sites as soon as those links are added or changed, at a rate of about 20 million URLs per week.

And for the past 3 years, we have been running a software robot called IABot on 22 Wikipedia language editions, looking for broken links (URLs that return a ‘404’, or ‘Page Not Found’, error). When broken links are discovered, IABot searches the Wayback Machine and other web archives for saved snapshots with which to replace them. Restoring links ensures Wikipedia remains accurate and verifiable and thus meets one of Wikipedia’s three core content policies: ‘Verifiability’.
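To make the check-and-replace loop concrete, here is a minimal Python sketch (not IABot’s actual code, which is a separate open-source project): probe a link, and if it appears dead, ask the Wayback Machine’s public availability API for the closest snapshot. The example URL is a placeholder.

```python
from typing import Optional

import requests

WAYBACK_API = "https://archive.org/wayback/available"

def is_broken(url: str) -> bool:
    """Return True if the URL looks dead (e.g. HTTP 404)."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=10)
        return resp.status_code == 404
    except requests.RequestException:
        # For this sketch, treat connection failures as broken too.
        return True

def find_archive(url: str) -> Optional[str]:
    """Ask the Wayback Machine for its closest snapshot of the URL."""
    resp = requests.get(WAYBACK_API, params={"url": url}, timeout=10)
    snapshot = resp.json().get("archived_snapshots", {}).get("closest")
    if snapshot and snapshot.get("available"):
        return snapshot["url"]
    return None

# Placeholder URL; a real bot would pull these from wiki page citations.
for url in ["http://example.com/some-dead-page"]:
    if is_broken(url):
        archive = find_archive(url)
        if archive:
            print(f"replace {url} -> {archive}")
```

The real bot does much more (rate limiting, editor-configurable behavior, consulting archives beyond the Wayback Machine, and actually editing the wiki markup), but the core idea is this simple detect-then-substitute loop.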

It’s an interesting problem when you think about it. Old web pages are so often littered with broken links, pointing to pages that have been taken down or to sites that no longer exist at all. Wikipedia has an enormous amount of work to do to keep dead links to a minimum across its pages, so it’s cool to see how they’re going about fixing them.