Data tools for Stata

Stata’s versatile data format allows NEPS Scientific Use Files (SUFs) to carry additional information alongside the data themselves. The Research Data Center (RDC) provides a package of additional Stata programs (“ado files”) that present this information to the user as clearly as possible.

Multi-lingual data sets

The Stata files from the RDC offer multi-lingual variable labels and value labels (currently: German and English). Stata users can easily switch between these languages using Stata's label language command.

. label language
Language for variable and value labels

Available languages:

Currently set is: . label language de
To select different language: . label language <name>
(more output omitted)
. label language en

Data signatures

When a data set is published, a signature is generated for it and saved. Stata users can run Stata’s datasignature confirm command to check whether the data set has been modified since its publication.

. datasignature confirm
(data unchanged since 05aug2013 15:52)
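
In a do-file, the same check can serve as a guard that stops the analysis when the signature cannot be confirmed. The following is a minimal sketch that uses only Stata's built-in datasignature command; it assumes a NEPS data set is already in memory, and the wording of the error message is purely illustrative.

```stata
* Assumes a NEPS SUF data set is already loaded (e.g. via use or nepsuse).
* datasignature confirm returns a nonzero code if the data have changed
* or if no signature is stored; capture keeps the do-file in control.
capture noisily datasignature confirm
local rc = _rc
if `rc' != 0 {
    display as error "data signature not confirmed (return code `rc')"
    exit `rc'
}
```

Storing _rc in a local macro first keeps the return code available for both the message and the exit statement.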


As mentioned above, the LIfBi RDC distributes its Stata programs as a Stata package.

Installation and updating

The package can be easily installed through Stata’s built-in installation mechanism:

net install nepstools, from(
checking nepstools consistency and verifying not already installed...
installing into ...
installation complete.

The package is revised regularly, so you should check for updates regularly as well:

adoupdate nepstools , update

If you do not have access to Stata’s regular package installation mechanism, the package can be downloaded manually as a ZIP file.

We highly recommend reading this page carefully first, along with the Stata help file for each of the provided programs (“help <program>”).

nepsuse – simplified use of NEPS-SUF datasets

This program simplifies opening datasets from NEPS SUFs by automatically assembling the file name from its parts (Starting Cohort, version number, access level). All parameters can be passed as options or set as global macros within Stata.

The Stata code

. use "/path/to/file/SC6_pTarget_D_6-0-1.dta"
. * ... any analysis commands ...*
. use "/path/to/file/SC6_Methods_D_6-0-1.dta" , clear
. * ... more analysis commands ...*

can be shortened to read:

. global NEPSuse_cohort SC6
. global NEPSuse_level D
. global NEPSuse_version 6.0.1
. nepsuse "pTarget"
. * ... any analysis commands ...*
. nepsuse "Methods" , clear
. * ... more analysis commands ...*

Details on usage can be found in the Stata help file shipped with the package.
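
To make the idea of assembling the file name from its parts concrete, here is a rough sketch in plain Stata macro code. It is not the nepsuse source code; the local macro names (cohort, level, version, path, vtag, file) are hypothetical, and the naming pattern is inferred only from the example file names on this page.

```stata
* Illustration only: this is not how nepsuse is actually implemented.
local cohort  "SC6"
local level   "D"
local version "6.0.1"
local path    "/path/to/file"    // placeholder directory from the example above

* The example file names use hyphens where the version number uses dots
local vtag = subinstr("`version'", ".", "-", .)

* Assemble cohort, data set name, access level and version into one file name
local file "`path'/`cohort'_pTarget_`level'_`vtag'.dta"
display "`file'"
```

With the values above, the assembled name matches the first use command in the long form of the example.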

nepsmiss – Recoding missing values

This program automatically recodes all of the numeric missing values from the NEPS SUFs (-97, -98, etc.) into Stata’s “Extended Missings” (.a, .b, etc.). In contrast to Stata’s built-in commands mvencode and mvdecode, value labels are correctly recoded and a standard list of missing values has already been predefined.

. nepsmiss t731454
Recoded 9450 values in total

This process can be reversed with the reverse option:

. nepsmiss t731454 , reverse
Recoded 9450 values in total

We generally recommend running nepsmiss on all variables (“nepsmiss _all”) once data preparation is finished and before analyzing the working data set.
Example data set before using nepsmiss:

ID_t     wave  t731454
8010851  2     -97
8012254  1     -54
8002388  2     -98
8012254  2     5
8002388  1     1

Example data set after using nepsmiss:

ID_t     wave  t731454
8010851  2     .b
8012254  1     .c
8002388  2     .a
8012254  2     5
8002388  1     1
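
For comparison, the recoding of the example data can be approximated with Stata's built-in mvdecode. The sketch below rebuilds the five example rows and applies only the three mappings visible in the tables above (-97 to .b, -98 to .a, -54 to .c); it does not reproduce the full predefined list in nepsmiss, and, as noted above, mvdecode does not recode value labels.

```stata
* Rebuild the example rows shown above
clear
input long ID_t byte wave int t731454
8010851 2 -97
8012254 1 -54
8002388 2 -98
8012254 2   5
8002388 1   1
end

* Mapping taken from the example tables: -97 -> .b, -98 -> .a, -54 -> .c
mvdecode t731454, mv(-97=.b \ -98=.a \ -54=.c)
list, noobs
```

Every mapping has to be typed by hand here, which is the effort nepsmiss is meant to save.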

infoquery – Display additional metadata

NEPS SUFs contain additional metadata, which is stored as characteristics in the Stata data sets. The program infoquery displays these attributes directly in an interactive Stata session.

. label language en
. infoquery t514001
query result for variable t514001:




[ITEMBAT] I would like to begin by asking you a few questions about how
satisfied you are with various aspects of your life. Please answer using a
scale of 0 to 10 “0” means that you are totally and utterly dissatisfied;
“10” means that you are entirely satisfied. You can indicate your opinion
using the numbers ‘1’ to ‘9’.

How satisfied are you currently with your life in general?
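
Because the metadata are ordinary Stata characteristics, they can also be inspected with built-in commands. The sketch below assumes a NEPS data set containing t514001 is in memory; NEPS_instname is the characteristic name that appears in the charren example on this page, and other characteristic names are not assumed here.

```stata
* Assumes a NEPS SUF data set containing t514001 is in memory.
* List every characteristic stored for the variable
char list t514001[]

* Display a single characteristic by name
display `"`t514001[NEPS_instname]'"'
```

The compound double quotes keep display from failing if a characteristic's text itself contains quotation marks.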

charren – Rename variables to survey names

As part of the data preparation for the NEPS SUFs, the variable names used in the survey instruments are replaced with SUF variable names. However, some users, especially those who have become accustomed to the original names, may want to reverse this step. The program charren makes it easy to switch between the two naming schemes by using the metadata saved in the Stata data sets.

. charren t514001 , to(NEPS_instname) verbose
Info: will rename t514001 to zufrie1

Note that a reverse search is also possible: even if the user only knows the instrument name, the program will find the respective SUF variable and rename it correctly:

. charren zufrie1 , to(NEPS_instname) verbose
zufrie1 is not a variable name in current dataset; searching for
zufrie1 in specified search space
Info: will rename t514001 to zufrie1
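
The forward renaming step can be pictured with built-in commands alone. This is a simplified sketch, not the charren implementation: it assumes t514001 is in memory and carries the characteristic NEPS_instname, and the local macro name inst is hypothetical. It does not cover the reverse search shown above.

```stata
* Assumes t514001 is in memory and carries the characteristic NEPS_instname.
* Read the instrument name from the variable's characteristics
local inst : char t514001[NEPS_instname]

* Rename only if an instrument name is actually stored
if "`inst'" != "" {
    rename t514001 `inst'
}
else {
    display as text "no instrument name stored for t514001"
}
```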