Regional Data

The data of the National Educational Panel Study contain the following locations (please use the NEPSplorer for more information about the variables):

Label                                          Starting Cohort  Dataset         Variable
Place of birth                                 3, 4, 5, 6       pTarget         t700101
Residence                                      3, 4, 5, 6       pTarget         t751001
                                               1, 2, 3, 4       pParent         p751001
History of residence                           6                spResidence     th21111
Secondary residence                            3, 4, 6          pTarget         t751011
Place of work                                  3, 4, 5, 6       spEmp           ts23237
Institution (panel frame)                      2, 3, 4          CohortProfile   tx80109
Institution (episodes)                         3, 4, 5, 6       spSchool        ts11202
                                               1, 2, 3, 4       spParentSchool  p723030
                                               2                pParent         pb11610
                                               3, 4             pTarget         tx44401
                                               5                pTarget         tg15207
University entrance qualification acquisition  5                spSchool        tg2232b
Place of measure                               3, 4, 5          spVocPrep       ts13105
Place of vocational training                   3, 4, 5, 6       spVocTrain      ts15207
Educator: place of study                       3, 4             pEducator       e537110
Educator: place of Staatsexamen                3, 4             pEducator       e537170


All of these locations are collected during the interview and reported by the respondent. Because we surveyed the place name, the smallest regional unit available is the town or city. Smaller regional entities are available only within the scope of the Microm or infas geodata (see below).

The place name is recoded into the municipal key (amtlicher Gemeindeschlüssel, AGS, 8 digits, as of 12/31/2013) during Scientific Use File preparation. For data protection reasons, the full key is not made available to the community. Researchers have access to the following derived regional entities across the three access modes:

Starting Cohort  Download       RemoteNEPS                On-Site
SC1              Federal State  Federal State             Federal State
                                                          Administrative Region
                                                          Administrative District
SC2-SC4          --             Federal State             Federal State
                                                          Administrative Region
                                                          Administrative District
SC5              --             Federal State             Federal State
                                Administrative Region*    Administrative Region
                                Administrative District*  Administrative District
SC6              Federal State  Federal State             Federal State
                                Administrative Region     Administrative Region
                                Administrative District   Administrative District

* exception: place of higher education institution


For all analyses with regional data, please observe the conditions of the Data Use Agreement (see § 2 sentence 5 and § 5), in particular the handling of federal state variables.

Matching of Regional Data

If you would like to link your own or self-compiled regional data (e.g., from official statistics) to the NEPS data, you can do so using the regional entities available above. Please take the reference date of the key variable (12/31/2013) into account, because the key is subject to change during local government reforms. To use your own data inside RemoteNEPS, you must first import it into our system (see here how). To use data On-Site, please get in touch with a staff member of the RDC.

If you want to link data using a key that is not available in a specific access mode (e.g., districts in SC2-SC4 inside RemoteNEPS, or municipalities in any Starting Cohort), this is also possible. In that case, the RDC will handle the matching of the data, so you do not need access to the specific key variable.

To ensure a simple and fast provision of the matched data, please prepare your data as follows:

  1. Create a dataset in Stata format (alternatively: csv).
  2. The first variable of the dataset should contain the municipal key (AGS) or parts of it (e.g., the district code). Please choose a numeric data type (do not use a string variable; leading zeros can be ignored). Again, keep in mind that the municipal key is time-variant and may be affected by territorial reforms. Currently, we use the municipal key as of 12/31/2013 (in older SUF releases it is based on the status as of 2006).
  3. The format of the subsequent variables can be chosen as required (even string variables).
  4. Use at most 8 characters for the variable names; use no umlauts or special characters.
  5. Please make sure that variable and value names contain sufficient information.
  6. If you would like to work inside RemoteNEPS (this does not apply to On-Site access): the attributes in the data file must not identify the regional unit uniquely, and neither may any combination of their values. Please be aware that regional data that identify municipalities or districts, even without the regional key, will not be matched. If you are unsure whether such uniqueness is given, use, for example, the Stata command duplicates report varname1 varname2 ...; it shows whether the variable combination varname1 varname2 is a unique identifier and therefore a key (the command lists the frequencies of the variable combinations; make sure that no combination occurs only once). If you cannot reduce your data to satisfy this condition, you are invited to work On-Site instead, where we place no such restrictions on the data.
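
Steps 2 and 6 above can be sketched in Stata as follows. This is only a minimal illustration on made-up values: the variable names ags_str, ags and n_combo as well as all data rows are assumptions and not part of the NEPS data. The sketch converts an 8-digit AGS held as a string into a numeric district code (the first five digits of the key) and then checks whether any attribute combination occurs only once.

```stata
clear
* Hypothetical input: 8-digit AGS stored as a string, plus two made-up attributes
input str8 ags_str str1 type byte status
"01001000" "A" 0
"01002000" "A" 0
"01003000" "B" 1
"16077001" "B" 1
end

* Step 2: convert to a numeric type (leading zeros are dropped) and keep
* the first five digits of the 8-digit key, i.e. the district code
destring ags_str, generate(ags)
generate long district = floor(ags / 1000)
drop ags_str ags
order district

* The file is expected to hold one row per district
isid district

* Step 6: show how often each attribute combination occurs; a combination
* that occurs only once would identify its district
duplicates report type status
bysort type status: generate n_combo = _N
count if n_combo == 1
```

In this toy file every combination of type and status occurs twice, so the final count is zero. A non-zero count would mean that the attributes have to be coarsened before the file can be matched for use inside RemoteNEPS.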

Your data file should then look like this (using fictional attributes type and status):

district type status
1001 A 0
1002 B 0
1003 A 1
... ... ...
16077 C 0


Please email this dataset to the RDC, including the following information (if you prefer to exchange the data by other means, please contact us directly):

  • Your username (nu..) and the number of your Data Use Agreement (DUA).
  • The Starting Cohort(s) you are interested in.
  • Which places you want the matching to be done on (see table above).

The RDC will then review the received data. Please be aware that we will not match any data if we consider data protection regulations to be violated (even if the above requirements are fulfilled). In that case, we will reach out to you to find a solution.

The result contains the IDs of the respondents and your attributes; the regional key is removed from the file. As a consequence, the number of rows increases compared with your dataset (depending on how many respondents are assigned to the same district). The example above might then look like this:

ID_t wave type status
402301 1 A 0
402301 2 A 0
402301 3 B 0
402302 1 B 0
402302 2 B 0
402303 1 A 1
402303 2 C 0
... ... ... ...
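
Why the number of rows increases can be illustrated with a small Stata sketch. It is not the RDC's actual procedure, only an assumed equivalent on made-up data: the respondent file with a district variable is hypothetical (researchers do not have access to this key), and the values are chosen to reproduce the example tables above.

```stata
clear
* Made-up attribute file as delivered by the researcher
input long district str1 type byte status
1001 "A" 0
1002 "B" 0
1003 "A" 1
16077 "C" 0
end
tempfile attributes
save `attributes'

* Made-up respondent file holding the (normally inaccessible) district key
clear
input long ID_t byte wave long district
402301 1 1001
402301 2 1001
402301 3 1002
402302 1 1002
402302 2 1002
402303 1 1003
402303 2 16077
end

* Many person-wave rows can point to the same district, hence m:1
merge m:1 district using `attributes', keep(master match) nogenerate

* The regional key is removed before the result is handed over
drop district
sort ID_t wave
list, sepby(ID_t)
```

Four district rows turn into seven person-wave rows here, because each attribute row is repeated once for every respondent and wave assigned to that district.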


This dataset is provided to you in a project folder inside our RemoteNEPS or On-Site system. You can then use the respondents’ ID to merge your data with our data, e.g., using the CohortProfile dataset:

	. use CohortProfile.dta
	. merge 1:1 ID_t wave using "your_datafile.dta"

Please note in the example that one person can be assigned to different places in different waves. You therefore need the variable combination ID_t wave as a unique identifier (this becomes even more complex when merging with episode data).
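
A slightly more defensive version of this merge might look as follows. It is a sketch on made-up stand-ins for both files (the tempfile name matched and all data rows are assumptions), so that it runs without access to NEPS data. With real data, the two input blocks would be replaced by use commands for the matched file and for CohortProfile.dta.

```stata
clear
* Made-up stand-in for the matched file returned by the RDC
input long ID_t byte wave str1 type byte status
402301 1 "A" 0
402301 2 "A" 0
402302 1 "B" 0
end
* Confirm that ID_t and wave jointly identify each row
isid ID_t wave
tempfile matched
save `matched'

* Made-up stand-in for CohortProfile (one row per person and wave)
clear
input long ID_t byte wave
402301 1
402301 2
402302 1
402302 2
end
isid ID_t wave

* One-to-one merge on the person-wave identifier
merge 1:1 ID_t wave using `matched'

* _merge == 1 marks person-waves without matched regional attributes
tabulate _merge
```

The isid checks stop with an error if ID_t and wave do not identify the rows uniquely, which is the situation in which merge 1:1 would fail as well.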


Microdata (Microm and infas geodata)

The NEPS Scientific Use Files already contain some small-scale regional indicators from the companies Microm and infas geodata. To find out more about those datasets, please consult the respective documentation (see Microm here, infas geodata here). These data files are available On-Site only.

In contrast to the places above, the source of the regional coding here was the real postal address of the respondents. The regional indicators are therefore available on scales more detailed than the municipality (the smallest entity is the house level). Please note that the real identity of the small-scale entity is unknown to us, so these indicators cannot be used to merge external data.

In addition, the Microm data contain an identifier for the regional level. With it, you can detect which respondents reside in the same region. See the above-mentioned documentation for details.
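
How such an identifier could be used is sketched below in Stata. The variable name region_id is a placeholder (the actual name is given in the Microm documentation), and the data rows are made up.

```stata
clear
* Made-up data; region_id is a hypothetical stand-in for the Microm
* regional identifier
input long ID_t long region_id
402301 17
402302 17
402303 42
end

* Number of respondents sharing each region
bysort region_id: generate n_same = _N

* Flag respondents with at least one other respondent in the same region
generate byte shared = n_same > 1
list, sepby(region_id)
```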