Installing and Configuring Stata

It has admittedly been a while since I last used Stata. Nonetheless, at some point I used it extensively, and I definitely satisfy the 10,000 hours criterion for mastery. In the process, I had to install Stata on a number of Linux-based systems, and the same small issues came up again and again. So I decided to write a guide for my future self, as well as for anyone else who might be interested.

In what follows, I assume that you need to install Stata 13 or newer on a Linux system to which you only have command-line access, most likely over ssh. I also assume that you have the installation CD and can read it on a Windows or OS X machine, provided that machine has a CD drive, which is not that common these days. You will need sudo privileges on the Linux host to install Stata for all users, but not if you only want to install it for yourself.

Start by copying the entire CD contents into a folder on your local machine. Then rsync this folder to the Linux host, e.g. to /home/username/stataInstall. Create the folder into which you want to install Stata. The default is /usr/local/stataXX, where XX is the version number, and you will almost certainly need sudo rights to create it. Navigate to this directory using cd and start the installation via [sudo] /home/username/stataInstall/install (sudo is optional, depending on the installation location).
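The steps in this paragraph can be sketched as a pair of shell functions. Treat this as a rough outline rather than a tested installer: the local folder name, the ssh login and the variable and function names are all placeholders I made up for illustration, and XX still has to be replaced with your version number.

```shell
#!/usr/bin/env bash
# All values below are placeholders (assumptions), adjust them to your setup.
MEDIA_DIR="$HOME/stata-media"            # local copy of the CD contents
REMOTE="username@linuxhost"              # ssh login on the Linux host
STAGE_DIR="/home/username/stataInstall"  # where the media lands on the host
INSTALL_DIR="/usr/local/stataXX"         # XX = your Stata version number

# Run on the local machine: push the copied CD contents to the Linux host.
stage_media() {
    rsync -av "$MEDIA_DIR/" "$REMOTE:$STAGE_DIR/"
}

# Run on the Linux host: create the target directory, cd into it and
# start the interactive installer. Drop sudo for a per-user location.
run_installer() {
    sudo mkdir -p "$INSTALL_DIR" &&
    cd "$INSTALL_DIR" &&
    sudo "$STAGE_DIR/install"
}
```

Nothing runs until you call stage_media (locally) or run_installer (on the host), so the file is safe to source and edit first.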

You will have to answer a few questions, which are mostly self-explanatory. I assume that by now nobody will want a 32-bit installation, since a cap of 4GB for a data set can be restrictive these days. Once the installation concludes, run ./stinit from the installation directory to activate Stata with your key. After that, you should be able to invoke Stata via /usr/local/stataXX/stata-mp.

Even though the installation process is finished, there are a few more steps you should take. First, you should get the updates via the update all command from within Stata. Note that if you installed Stata into /usr/local/stataXX, the update needs write access to that directory, so you will have to start Stata with sudo /usr/local/stataXX/stata-mp if you want the update to succeed.

Second, you will want to add the Stata folder to PATH, so that you can invoke the stata-mp command from anywhere. In bash this is done via export PATH=$PATH:/usr/local/stataXX, which you may want to add to your .bashrc file. Note that there is no trailing colon: an empty PATH entry stands for the current directory. (If you have sudo rights and want Stata to be in PATH for all users, then you need to edit the /etc/profile file instead.) If, like me, you prefer zsh to bash, the same command works, but it belongs in your .zshrc file; the older two-step form PATH=$PATH:/usr/local/stataXX; export PATH is equivalent.
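For the startup file, a slightly more defensive version of that line might look as follows. It is only a sketch: STATA_DIR is a name I introduced for illustration, and XX is still a stand-in for your version number. The case guard keeps PATH from growing every time the file is sourced.

```shell
# Snippet for ~/.bashrc or ~/.zshrc.
# STATA_DIR is a hypothetical variable name; XX = your Stata version number.
STATA_DIR="/usr/local/stataXX"

# Append the Stata directory only if it is not already on PATH.
# No trailing colon after the directory.
case ":$PATH:" in
    *":$STATA_DIR:"*) ;;                      # already present, do nothing
    *) export PATH="$PATH:$STATA_DIR" ;;
esac
```

After opening a new shell, command -v stata-mp should print the full path to the executable.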

Third, and this is where most people trip up, you need to make sure that Stata has ample disk space for its temp files. Since Stata is limited to having a single dataset in memory at any point, many of its commands have to dump temporary files to disk. On many Linux hosts the default path for storing temp files is /tmp, which usually lives on the root filesystem, i.e. on the primary physical/logical hard drive, where space is often limited. (To see an overview of the file systems and their free space, use the df -h command.) There is even a Stata FAQ on how to change the location, but, in a nutshell, all you need is to define the environment variable TMPDIR in your .bashrc file and make sure it points to a directory on a disk with lots of space. The FAQ also tells you how to test whether the change took effect.
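A minimal sketch of the TMPDIR setup, assuming you already know which disk has room. The function name set_stata_tmpdir and the example path in the comment are my own inventions, not something Stata provides.

```shell
# Snippet for ~/.bashrc. set_stata_tmpdir is a hypothetical helper name.
# Usage: set_stata_tmpdir /path/on/a/large/disk
set_stata_tmpdir() {
    # Create the directory if needed; leave TMPDIR untouched on failure.
    mkdir -p "$1" || return 1
    export TMPDIR="$1"
    # Show which filesystem the directory lives on and how much space is free.
    df -h "$TMPDIR"
}

# Example call (the path is an assumption, pick one that exists on your host):
# set_stata_tmpdir /bigdisk/username/stata-tmp
```

Keep in mind that TMPDIR is not specific to Stata, so other programs started from the same shell will put their temp files there as well.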

This pretty much sums up the process. As a bonus, it is possible to archive the entire contents of /usr/local/stataXX once the installation has concluded and the updates have been applied. This archive can then be copied to another machine to avoid the interactive installation. You will still have to set up the environment variables as described above, but this can be done via a shell script. And, of course, make sure that you do not exceed the number of concurrent users allowed by your Stata license.
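The archive trick could look roughly like this with tar. The function names and the archive file name are placeholders of mine, and you will need sudo around the calls if the directories involved are not writable by your user.

```shell
# Hypothetical helpers for cloning a finished installation.

# pack_stata INSTALL_DIR ARCHIVE
# Archives the directory under its own name, e.g. stataXX/...
pack_stata() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# unpack_stata ARCHIVE PARENT_DIR
# Recreates the directory inside PARENT_DIR, e.g. /usr/local
unpack_stata() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example (paths are assumptions):
# pack_stata /usr/local/stataXX "$HOME/stataXX.tar.gz"    # on the first host
# unpack_stata "$HOME/stataXX.tar.gz" /usr/local          # on the second host
```

Packing with -C and a relative name keeps absolute paths out of the archive, so it can be unpacked under a different parent directory if needed.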