How To Download Wikipedia For Offline Use

December 6, 2014, by Ken Jorgustin


If you would like to have your own 15-million-page offline copy of the entire Wikipedia (the online encyclopedia), here’s how to do it:


 
Should the internet ever become unavailable to you, or if you would simply like a stand-alone portable encyclopedia for reference on your computer or on a portable USB thumb drive, it is possible (and free) to download your own copy of the latest Wikipedia.

It’s not for the faint of heart, as it requires a very large download from the internet. If you have high-speed internet and no data cap, it can be done in several hours at best, and probably longer (or even longer than that 😉 ).

I recently did this with a DSL connection (I’m rural) which tops out at 7 Mbps, and it took most of the business day to complete the download.

 
The following method requires a downloaded copy of the Wikipedia database and a free stand-alone program, WikiTaxi, which lets you read, search, and browse the Wikipedia database offline. WikiTaxi does not require ‘installation’ on your computer. Because it is a stand-alone program, it can conveniently be kept on a large capacity USB thumb drive which you can take anywhere…

Also, this method does not download the associated Wikipedia images (which would be unimaginably HUGE), although all else (~15 million pages) and internal links remain intact.

 
So there are two downloads. One for WikiTaxi (a small download) and one for the Wikipedia data (a very large download).

 
The WikiTaxi download link is located at the upper left corner of their page.

The Wikipedia download page is here.
However, the only file that you need to download from that page is this one:
enwiki-latest-pages-articles.xml.bz2 (approximately 11 GB)

 
If you are going to download all this to a USB thumb drive (probably the best thing to do for convenient portability), then be sure that the drive capacity is sufficient. A commonly available capacity these days (for a USB thumb drive) is 32GB.

If you do use a thumb drive, I would download the large Wikipedia database file directly to it (otherwise, on your PC, I would simply create a folder named Wikipedia and download everything there).
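
Because a download this size can run for hours, it may be worth being able to pick up where an interrupted one left off. The following is only a minimal sketch of a resumable download in Python, not something taken from the original procedure. The URL is an assumption (the usual location of the latest English dump on the Wikimedia dumps server), and the function names are hypothetical.

```python
import os
import shutil
import urllib.request

# Assumed URL: the usual location of the latest English dump (not given in the article).
DUMP_URL = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"


def build_resume_request(url, dest_path):
    """Return (request, offset), asking the server to skip bytes already on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Standard HTTP Range header: send everything from byte `offset` onward.
        request.add_header("Range", f"bytes={offset}-")
    return request, offset


def download(url, dest_path):
    """Download url to dest_path, appending to a partial file when the server allows it."""
    request, offset = build_resume_request(url, dest_path)
    # Note: if the file on disk is already complete, the server may answer
    # with an error (416) instead of data.
    with urllib.request.urlopen(request) as response:
        # Status 206 means the Range header was honored; otherwise start over.
        mode = "ab" if offset > 0 and response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            shutil.copyfileobj(response, out, length=1024 * 1024)


# Example call (commented out because it starts an ~11 GB download):
# download(DUMP_URL, "enwiki-latest-pages-articles.xml.bz2")
```

Run it again after an interruption and it continues from the bytes already saved, provided the server supports range requests. A web browser or download manager with a resume feature does the same job.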

Download time: If you have a 15 Mbps internet connection, in theory, it would take you a bit less than 2 hours to download. However, your speed may vary, and the Wikipedia server’s speed may vary too (I noticed it slowing down at times).
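
The arithmetic behind that estimate can be sketched in a few lines of Python. This assumes the roughly 11 GB file size mentioned above, decimal units (1 GB = 1000 MB), and a connection running at full speed the whole time, which real downloads rarely manage. The function name is just for illustration.

```python
def download_hours(size_gb, speed_mbps):
    """Ideal download time in hours for size_gb gigabytes at speed_mbps megabits per second."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = size_megabits / speed_mbps
    return seconds / 3600


for speed in (7, 15):
    print(f"{speed} Mbps: about {download_hours(11, speed):.1f} hours")
# prints:
# 7 Mbps: about 3.5 hours
# 15 Mbps: about 1.6 hours
```

These are best-case figures, so treat them as a lower bound on the actual wait.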

 
Okay, assuming you’ve now downloaded the correct file from Wikipedia and also the WikiTaxi file (which you need to ‘unzip’ since it’s compressed), then here’s what to do next:

Note: For convenience, I downloaded the Wikipedia database file to the same folder as the WikiTaxi files – all on a USB thumb drive.
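
Before starting a long import, you may want a quick check that the downloaded file at least opens as bz2 and looks like a Wikipedia dump. Here is a minimal sketch using Python's standard bz2 module. The function name is hypothetical, and it only reads the beginning of the file, so it will not catch a download that was cut off partway through.

```python
import bz2
from itertools import islice


def peek_lines(path, count=5):
    """Return the first `count` lines of a bz2-compressed text file."""
    with bz2.open(path, "rt", encoding="utf-8") as dump:
        return [line.rstrip("\n") for line in islice(dump, count)]


# Example (assumes the dump is in the current folder):
# for line in peek_lines("enwiki-latest-pages-articles.xml.bz2"):
#     print(line)
```

For a valid dump, the first line should be the opening mediawiki XML tag. An error here suggests the file is damaged or is not the one you meant to download.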

 
The following step has to be done one time (import the Wikipedia database into WikiTaxi).

Run ‘WikiTaxiImporter.exe’. Enter or Browse for the Wikipedia database file (enwiki-latest-pages-articles.xml.bz2) and click ‘Import Now’. This will convert the Wikipedia database into a WikiTaxi database. The conversion process takes a little while…

Note: Regarding storage space requirements, after importing the database the newly created WikiTaxi database will be nearly 18GB. If you had originally downloaded the Wikipedia database to the same USB thumb drive, both together will require that you have at least 30GB of space.
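
If you would rather check the free space before you start, here is a minimal Python sketch. It assumes the approximate sizes quoted above (11 GB for the download, 18 GB for the imported database). The function name and the example drive letter are placeholders.

```python
import shutil

DUMP_GB = 11      # approximate size of the compressed Wikipedia download
IMPORTED_GB = 18  # approximate size of the WikiTaxi database after import


def has_room(path, needed_gb):
    """Return (enough, free_gb) for the drive that contains `path`."""
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal gigabytes
    return free_gb >= needed_gb, free_gb


# Example: "E:\\" is a placeholder for wherever your thumb drive is mounted.
# enough, free_gb = has_room("E:\\", DUMP_GB + IMPORTED_GB)
# print(f"{free_gb:.1f} GB free, enough for both files: {enough}")
```

Free space is not the only limit. A drive formatted as FAT32 cannot hold a single file larger than 4 GB, so for files this size the drive would need a file system such as exFAT or NTFS.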

Here’s a convenient 32GB USB thumb drive (which I own).

Now that you’ve imported the database into WikiTaxi, you don’t need to keep the file (enwiki-latest-pages-articles.xml.bz2) anymore (although I kept it since it was such a big download – just in case…).

 
Now that you’re finished, to run offline Wikipedia simply click on ‘WikiTaxi.exe’, which brings up its interface. Enter any search term at the top, and the matching articles will be displayed.

 
Note: Wikipedia is an open source editable encyclopedia. While some people like to pick on the fact that a small portion may contain errors due to open-source human input (deliberate or otherwise), these errors are often quickly corrected by others. I have found the information contained within to be extremely helpful over the years, and I’ve rarely found issues that I felt were outrageous or downright incorrect. It’s a tremendous resource. The ability to carry around a 15 million page encyclopedia resource on a USB thumb drive would have been unimaginable not that long ago (I’m showing my age 😉 ).

For the preparedness-minded, it’s nice to have a tangible resource asset like this at your fingertips – which does not require an internet infrastructure (although you’ll still need a PC and electricity).

Since I recently went through this procedure, I thought I would share it with the rest of you in case any of you out there are interested.