===================================================
**
** NEPS STARTING COHORT 4 - RELEASE NOTES a.k.a CHANGE LOG
** changes and updates for release NEPS SC4 14.0.0
** (doi:10.5157/NEPS:SC4:14.0.0)
**
===================================================


* Known Issues *
	- a Data Manual is not yet published; please have a look at the manual of Starting Cohort 3 instead
	- two FAIR power achievement variables have a valid range of negative values (fag9000l, fag900ls); the original 
		values were divided by 100 to avoid conflicts with the defined NEPS missing values; i.e. an original value 
		of -300 is found as -3.00 in the dataset

CohortProfile:
	- variable 'Current type of school (reconstructed)' [t723080_g1] for wave 6 currently does not correctly distinguish
		between branches in schools with several educational programs; this leads to the school type of students in
		such schools being erroneously classified as 'School with several educational programs: Unclear' [value 8]

pTarget & pTargetCATI:
	- CAUTION: the variables t421020 t428000 t428010 t428030 t428040 t428120 t428130 t428140 t428150 t428170 t428180 
		t428190 t428210 t428300 t42825a t42825b t42825c t428050 that exist in both datasets MUST NOT be simply merged 
		into one dataset, as different scales are used in the two datasets; simply merging them into one dataset 
		without prior processing would lead to serious artifacts 	

	*** Solutions to fix this problem ***

	- in the following examples we modify the data from pTargetCATI, but modifying the data from pTarget instead is equally possible
	- it should be noted that when merging the data, the value labels of the using dataset are overwritten with the value 
		labels of the master dataset
	- the master dataset is the dataset you open (use masterdataset.dta); the using dataset is the dataset you merge with 
		the master dataset (merge 1:1 ID_t wave using usingdataset.dta)
	- this is especially important when using option 2; the modified data must be the using dataset as long as you don't 
		modify the value labels as well
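	- the label behaviour described above can be checked on toy data; the following is a minimal sketch only: the variables 
		x and y and the value label 'scale' are made up for illustration and are not part of the SUF

	* -------------------------BEGIN Stata-------------------------------
	// toy "using" dataset: value label 'scale' defined as 1=low, 2=high
	clear
	set obs 2
	generate ID_t = _n
	generate wave = 1
	generate x = _n
	label define scale 1 "low" 2 "high"
	label values x scale
	tempfile using_tmp
	save "`using_tmp'"

	// toy "master" dataset: same label name, but reversed wording
	clear
	set obs 2
	generate ID_t = _n
	generate wave = 1
	generate y = _n
	label define scale 1 "high" 2 "low"
	label values y scale

	// after merging, the master's definition of 'scale' is kept and is also applied to x from the using dataset
	merge 1:1 ID_t wave using "`using_tmp'", nogenerate
	label list scale
	* ---------------------------END Stata--------------------------------

	- in this sketch the values of x are unchanged after the merge, but they are displayed with the master's labels; this 
		is the reason why recoded variables and their labels have to be kept consistent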

	* Option 1: Renaming * 
	(easy but does not unify variables)

	use "SC4_pTargetCATI_D_14-0-0.dta", clear

	// using a simple loop to rename variables suffixed with _cati
	foreach var of varlist 	t421020 t428000 t428010 t428030 t428040 t428120 t428130 ///
		t428140 t428150 t428170 t428180 t428190 t428210 t428300 t42825a t42825b t42825c t428050 {
		rename `var' `var'_cati
	}
	merge 1:1 ID_t wave using "SC4_pTarget_D_14-0-0.dta" // merging data without any further ado

	// unifying variables manually, e.g. t428000: replacing values of t428000 with values of t428000_cati
	// if valid information is only available for t428000_cati
	replace t428000 = t428000_cati if ((t428000 < 0 | missing(t428000)) & (t428000_cati >=0 & !missing(t428000_cati)))   
	
	// a loop would look like this - it makes editing the data a lot faster (t421020 can't be unified/recoded properly)
	foreach var of varlist t428000 t428010 t428030 t428040 t428120 t428130 ///
		t428140 t428150 t428170 t428180 t428190 t428210 t428300 t42825a t42825b t42825c t428050 {
		replace `var' = `var'_cati if ((`var' < 0 | missing(`var')) & (`var'_cati >= 0 & !missing(`var'_cati)))
	}

	* Option 2: Recoding values before merging *
	- variables are recoded according to their number of categories and saved as a tempfile; then pTarget is opened and 
		the altered CATI data from the tempfile are merged to it
	- the 'recode' command recodes only the values, not the value labels; therefore it is more convenient to open the 
		non-modified dataset and merge the modified dataset than modifying the data and value labels manually

	use "SC4_pTargetCATI_D_14-0-0.dta", clear
	rename t421020 t421020_cati //t421020 can't be recoded properly, must be renamed
	foreach var of varlist 	t428000 t428010 t428030 t428040 t428120 t428130 ///
		t428140 t428150 t428170 t428180 t428190 t428210 t428300 t42825a t42825b t42825c t428050 {
		levelsof `var' if `var' >= 0 & !missing(`var'), local(levels)
		local ncat : word count `levels'  // number of observed valid categories, stored before any recode is run
		if `ncat' == 4 recode `var' (1=4)(2=3)(3=2)(4=1)  // var with four categories recoded this way
		if `ncat' == 5 recode `var' (1=5)(2=4)(4=2)(5=1)  // var with five categories recoded this way
	}
	tempfile cati_tmp  // save data as temporary dataset
	save "`cati_tmp'", replace
	use "SC4_pTarget_D_14-0-0.dta", clear
	merge 1:1 ID_t wave using "`cati_tmp'"  


====================================================
* Changes introduced to NEPS:SC4 by version 14.0.0 *
====================================================

Basics
	- GENERAL NOTE: This dataset is available to get a first idea of the sample; the data of this data set should *not* be 
		used for substantive analyses!
	
CohortProfile
	- specification of the education type variables tx28301, tx28302, tx28303, tx28304 using information from the spell data 
		of the biography dataset
	
FurtherEducation
	- more information on courses from the spell datasets spCourses, spFurtherEdu1 and spVocTrain has been integrated in a 
		consolidated format in the dataset FurtherEducation

ParentMethods
 	- the value labels of px80216 'Interview: relationship of respondent to previous respondent' were incorrect in version 
		13.0.0 of the Scientific Use File; this problem has now been fixed with the new version 14.0.0

pTargetCATI
	- COVID-19 RELATED VARIABLES IN WAVES 13 AND 14: due to different time references in the question texts, two separate 
		corona modules A and B were implemented in the current Scientific Use File => module A for "temporary dropouts" in 
		wave 13 (respondents who did *not* participate in wave 13) and module B for standard respondents (respondents who 
		took part in wave 13); this distinction is reflected in the dataset by two variable versions => variables originating 
		from the "temporary dropouts" (module A) are identified by the suffix _v1, whereas variables with information from 
		the standard respondents (module B) do not have a suffix; please check the SUF instruments for details
		>> list of renamed COVID-19 variables: 

		varname in  ||  varname in        ||  varname in 
		SUF 13-0-0  ||  SUF 14-0-0        ||  SUF 14-0-0
		            ||  Corona module A   ||  Corona module B
		============||====================||======================     
		th18000     ||     th18000_v1     ||     th18000
		th18001     ||     th18001_v1     ||     th18001
		th18002     ||     th18002_v1     ||     th18002
		th18010     ||     th18010_v1     ||     th18010
		th18011     ||     th18011_v1     ||     th18011
		th18012     ||     th18012_v1     ||     th18012
		th18013     ||     th18013_v1     ||     th18013
		th18014     ||     th18014_v1     ||     th18014
		th18015     ||     th18015_v1     ||     th18015
		th18016     ||     th18016_v1     ||     th18016
		th18017     ||     th18017_v1     ||     th18017
		tm00055     ||     th18020_v1     ||     th18020
		tm00056_O   ||     th18021_v1O    ||     th18021_O
		tm00063     ||     th18030_v1     ||     th18030
		tm00090     ||     th18040_v1     ||     th18040
		tm00091     ||     th18041_v1     ||     th18041
		tm00092     ||     th18042_v1     ||     th18042
		tm00093     ||     th18043_v1     ||     th18043
		tm00094     ||     th18044_v1     ||     th18044
		tm00095     ||     th18045_v1     ||     th18045
		tm00096     ||     th18046_v1     ||     th18046
		th18047_O   ||     th18047_v1O    ||     th18047_O	
		th18048     ||     th18048_v1     ||     th18048
		tm00114     ||     th18050_v1     ||     th18050
		tm00115_O   ||     th18051_v1O    ||     th18051_O
		th18060     ||     th18060_v1     ||     th18060
		th18061     ||     th18061_v1     ||     th18061
		th18062     ||     th18062_v1     ||     th18062
		th18063     ||     th18063_v1     ||     th18063
		th18064     ||     th18064_v1     ||     th18064
		th18065     ||     th18065_v1     ||     th18065
		th18066     ||     th18066_v1     ||     th18066
		th18070     ||     th18070_v1     ||       <NA>
		th18071     ||     th18071_v1     ||     th18071
		th18080     ||     th18080_v1     ||     th18080
		th18081     ||     th18081_v1     ||     th18081
		th18082     ||     th18082_v1     ||     th18082
		th18083     ||     th18083_v1     ||     th18083
		th18090     ||     th18090_v1     ||     th18090
		th18100     ||     th18100_v1     ||     th18100
		th18101     ||     th18101_v1     ||     th18101
		th18102     ||     th18102_v1     ||     th18102
		th18103     ||     th18103_v1     ||     th18103
		th18110     ||     th18110_v1     ||     th18110
		th18120     ||     th18120_v1     ||     th18120
		th18121     ||     th18121_v1     ||     th18121
		th18122     ||     th18122_v1     ||     th18122
		th18123     ||     th18123_v1     ||     th18123
		th18130     ||     th18130_v1     ||     th18130
		th18131     ||     th18131_v1     ||     th18131
		th18140     ||     th18140_v1     ||     th18140
		th18141     ||     th18141_v1     ||     th18141
		th18142     ||     th18142_v1     ||     th18142
		th18143     ||     th18143_v1     ||     th18143

	- NEW LINKING VARIABLE: the additionally generated variable th14599_g2 for matching main employment episodes with 
		spEmp has been newly included in the dataset; th14599_g2 contains values from splink for this purpose, but not 
		for each wave; the old variable th15029 has been removed from the dataset, as its information became obsolete 
		due to new variable th14599_g2
		>> Stata code for performing the matching:

		use "SC4_spEmp_D_14-0-0.dta", clear
		keep if subspell == 0
		tempfile spEmp_tmp
		save "`spEmp_tmp'", replace
		use "SC4_pTargetCATI_D_14-0-0.dta", clear
		rename th14599_g2 splink
		merge m:1 ID_t splink using "`spEmp_tmp'"
	 
	- the former variable t430080 has been split into t430080_v1 [Perceived personal discrimination: apprenticeship 
		refusal] and t430080 [Perceived personal discrimination: job rejection]
	- the variables t32401b, t32501b and t32304b on social capital have been renamed to t32408b, t32508b and t32308b
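	- as these three social capital variables already changed their names in an earlier release (see the notes for version 
		12.0.0), syntax files that are meant to run on more than one release may rename them conditionally; a minimal 
		sketch on toy data, in which the generated variables merely stand in for the real pTargetCATI data:

	* -------------------------BEGIN Stata-------------------------------
	// toy stand-in for pTargetCATI carrying the version 13.0.0 names; replace by the real dataset
	clear
	set obs 3
	generate ID_t = _n
	generate t32401b = 1
	generate t32501b = 2
	generate t32304b = 3

	// rename to the version 14.0.0 names only if the old name is present
	foreach pair in "t32401b t32408b" "t32501b t32508b" "t32304b t32308b" {
		local old : word 1 of `pair'
		local new : word 2 of `pair'
		capture confirm variable `old', exact
		if _rc == 0 rename `old' `new'
	}
	describe, simple
	* ---------------------------END Stata--------------------------------

	- if the data already carry the new names, the loop leaves them untouched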

xTargetCompetencies
	- the value label of variable scg9061s_c has been corrected
	- for numerous variables, the labels in English and German have been adjusted
	- the following variables were renamed in version 14: 

		varname in v13      ||   varname in v14
		====================||========================
		sca51010_sc4a14_c   ||   sca51020_sc4a14_c
		rsci0001_c          ||   rsci0001_sc4g9_c
		rsci0002_c          ||   rsci0002_sc4g9_c
		rsci0003_c          ||   rsci0003_sc4g9_c
		rsci0004_c          ||   rsci0004_sc4g9_c
		rsci0005_c          ||   rsci0005_sc4g9_c
		rsci0006_c          ||   rsci0006_sc4g9_c
		rsci0007_c          ||   rsci0007_sc4g9_c
		rsci0008_c          ||   rsci0008_sc4g9_c
		rsci0009_c          ||   rsci0009_sc4g9_c
		rsci0010_c          ||   rsci0010_sc4g9_c
		rsci0011_c          ||   rsci0011_sc4g9_c
		rsci0012_c          ||   rsci0012_sc4g9_c
		rsci0013_c          ||   rsci0013_sc4g9_c
		rsci0014_c          ||   rsci0014_sc4g9_c
		rsci0015_c          ||   rsci0015_sc4g9_c
		rsci0016_c          ||   rsci0016_sc4g9_c
		rsci0017_c          ||   rsci0017_sc4g9_c
		rsci0018_c          ||   rsci0018_sc4g9_c
		rsci0019_c          ||   rsci0019_sc4g9_c
		rsci0020_c          ||   rsci0020_sc4g9_c
		rsci0021_c          ||   rsci0021_sc4g9_c
		rsci0022_c          ||   rsci0022_sc4g9_c
		rsci0023_c          ||   rsci0023_sc4g9_c
		rsci0024_c          ||   rsci0024_sc4g9_c
		rsci0025_c          ||   rsci0025_sc4g9_c
		rsci0026_c          ||   rsci0026_sc4g9_c
		rsci0027_c          ||   rsci0027_sc4g9_c
		rsci0028_c          ||   rsci0028_sc4g9_c
		rsci0029_c          ||   rsci0029_sc4g9_c
		rsci0030_c          ||   rsci0030_sc4g9_c
		rsci0031_c          ||   rsci0031_sc4g9_c
		rsci0032_c          ||   rsci0032_sc4g9_c
		rsci0033_c          ||   rsci0033_sc4g9_c
		rsci0034_c          ||   rsci0034_sc4g9_c
		rsci0035_c          ||   rsci0035_sc4g9_c
		rsci0036_c          ||   rsci0036_sc4g9_c
		rsci0037_c          ||   rsci0037_sc4g9_c
		rsci0038_c          ||   rsci0038_sc4g9_c
		rsci0039_c          ||   rsci0039_sc4g9_c
		rsci0040_c          ||   rsci0040_sc4g9_c
		rsci0041_c          ||   rsci0041_sc4g9_c
		rsci0042_c          ||   rsci0042_sc4g9_c
		rsci0043_c          ||   rsci0043_sc4g9_c
		rsci0044_c          ||   rsci0044_sc4g9_c
		rsci0045_c          ||   rsci0045_sc4g9_c
		rsci0046_c          ||   rsci0046_sc4g9_c
		rsci0047_c          ||   rsci0047_sc4g9_c
		rsci0048_c          ||   rsci0048_sc4g9_c
		rsci0049_c          ||   rsci0049_sc4g9_c
		rsci0050_c          ||   rsci0050_sc4g9_c
		rsci0051_c          ||   rsci0051_sc4g9_c
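
	- syntax files written for version 13 may restore the old rsci names by removing the infix; the following is a sketch 
		on toy data only (the two generated variables stand in for the real items; sca51020_sc4a14_c is not covered 
		by the loop and would have to be renamed separately):

		* -------------------------BEGIN Stata-------------------------------
		// toy stand-in for xTargetCompetencies with two of the version 14 names; replace by the real dataset
		clear
		set obs 2
		generate ID_t = _n
		generate rsci0001_sc4g9_c = 0
		generate rsci0002_sc4g9_c = 1

		// strip the infix "_sc4g9" to obtain the version 13 names
		foreach var of varlist rsci*_sc4g9_c {
			local oldname = subinstr("`var'", "_sc4g9_c", "_c", 1)
			rename `var' `oldname'
		}
		describe, simple
		* ---------------------------END Stata--------------------------------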

	- the following variables were deleted in version 14 as they were duplicates:
		stg12nh01_c 
		stg12nh02_c 
		stg12nh03_c 
		stg12nh04_c 
		stg12nh05_c 
		stg12eg01_c
		stg12eg02_c 
		stg12eg03_c 
		stg12eg04_c 
		stg12eg05_c 
		stg12eg06_c 
		stg12eg07_c
		stg12mt01_c 
		stg12mt02_c 
		stg12mt03_c 
		stg12mt04_c 
		stg12mt05_c 
		stg12cmt06_c
		stg12cw01_c 
		stg12cw02_c 
		stg12cw03_c 
		stg12cw04_c 
		stg12cw05_c 
		stg12cw06_c
		stg12cw07_c 
		stg12pd01_c 
		stg12pd02_c 
		stg12cpd03_c 
		stg12pd04_c 
		stg12pd05_c
		stg12pd06_c 
		stg12pd07_c

xTargetSpecialNeedsCompetencies
	- for numerous variables, the labels in English and German have been adjusted
	- the following variables were renamed in version 14: 

		varname in v13  ||  varname in v14
		================||================
		mpa9re01_sc6    ||  mpg9re01_sc6
		mpa9re01_sc5    ||  mpg9re01_sc5
		mpa9re02_sc6    ||  mpg9re02_sc6
		mpa9re02_sc5    ||  mpg9re02_sc5
		mpa9re03_sc6    ||  mpg9re03_sc6
		mpa9re03_sc5    ||  mpg9re03_sc5
		mpa9re04_sc6    ||  mpg9re04_sc6
		mpa9re04_sc5    ||  mpg9re04_sc5
		mpa9re05_sc6    ||  mpg9re05_sc6
		mpa9re05_sc5    ||  mpg9re05_sc5
		mpa9re_sc6      ||  mpg9re_sc6
		mpa9re_sc5      ||  mpg9re_sc5

TargetMethods & CohortProfile:
 	- information about the linkage with administrative data from the IAB in the variables tx80130, tx80131, tx80132 (TargetMethods) 
		and tx80533 (CohortProfile) has been updated; it now refers to the data version 'NEPS-SC4-ADIAB 7521 v1' 


===================================================
* Changes introduced to NEPS:SC4 by version 13.0.0 *
===================================================

spVocTrain:
	- the data error concerning the variable ts15207_g4O "Municipality of training facility (district)" and the variables 
		derived from it, ts15207_g3O "Municipality of training facility (administrative district)", ts15207_g2R 
		"Municipality of training facility (federal state)" and ts15207_g1 "Municipality of training facility (west/east)",
		in the survey waves 9, 10, 11, and 12 has been fixed with the current release
	- due to an erroneous assignment within the survey, incorrect values were written into the variable ts15207_g4O 
		"Municipality of training facility (district)" for episodes of a college or university degree 
		(ts15201==7, 8, 9, 10*), a doctorate (ts15201==15), or a habilitation (ts15201==16)
	- episodes of other types of vocational training (ts15201==1 to 6, 11 to 14, 17) did not show any problem in the 
		mentioned variables

All episode data:
	- the rules for harmonizing episode data have been slightly modified (see Data Manual of Starting Cohort 3); as a result, the 
		values in the harmonized data row (subspell=0) have changed for some cases
	- all missing values '-29 = value from the last sub-episode' in spell data files have been replaced by the respective value; 
		the data now also contains information that was not asked directly from the respondent but was necessary for the 
		interview and filtering control; these values thus represent the last known value and can be used to track the filtering

Biography:
	- episodes with missing information in both the start and the end date variables are no longer included in this dataset

FurtherEducation:
	- this generated dataset has been newly added to the Scientific Use File with this release; it provides information on the 
		respondents' participation in further education measures from the datasets 'spFurtherEdu1' and 'spCourses'
	
TargetMethods & CohortProfile:
 	- information about the linkage with administrative data from the IAB in the variables tx80130, tx80131, tx80132 (TargetMethods) 
		and tx80533 (CohortProfile) has been updated; it now refers to the data version 'NEPS-SC4-ADIAB 7520 v1' 


===================================================
* Changes introduced to NEPS:SC4 by version 12.0.0 *
===================================================

All episode data:
	- date information on endings (*12y, *12m, *12c) in harmonized episodes (subspell==0) was reset to the last 
		non-revoked information; revoked subspells are identified by spms==-20

CohortProfile:
	- variables tx8601y/m ('date of interview') were renamed to tx8600y/m to ensure consistency across the starting 
		cohorts

pTargetCATI:
	- variable t428060 ('Part of German society') was renamed to t428020
	- variable t516101 ('Politics complicated') was renamed to t516106
	- variable t32408b ('Social capital - info voc. train - number of persons') was renamed to t32401b
	- variable t32508b ('Social capital - effort voc. train - number of persons') was renamed to t32501b
	- variable t32308b ('Social capital - help with application - number of persons') was renamed to t32304b
	- variables inty/m are now located in CohortProfile as tx8600y/m
	- variables th21310 and th21311 were dropped from the Scientific Use File
	- for variable t428050 ('Sense of Belonging: People in Germany'), an incorrect coding (1 instead of 5) was corrected 
		for N=54 cases
   
pTargetCORONA:
	- variable t514008 ('Satisfaction with study/training/school') was renamed to t514010
	- variable t527102 ('Sport frequency (corona period)') was renamed to t527104

spPartner:
	- variable ts30027_R ('Country of residence Partner (abroad) (LAT)') was renamed to ts30027_g1R

Weights:
	- new variable naming scheme introduced, from w_t12357891011 (old) to w_t1to11 (new)
	- meaning of variable w_t12 changed from longitudinal weight for waves 1 and 2 (former SUF releases) to cross-sectional 
		weight for wave 12 (current SUF release)


===================================================
* Changes introduced to NEPS:SC4 by version 11.0.0 *
===================================================

spPartner, spFurtherEdu1, spFurtherEdu2:
	- three new datasets with spell information on partnership history and further education courses have been incorporated 
		into the Scientific Use File for the first time 

pTargetCorona:
	- a new dataset with information from an additional CAWI survey (May 2020) on Corona related topics has been incorporated in 
		this SUF release

pTargetCATI:
	- the variables t66800l_g1, t66800m_g1, t66800n_g1, t66800q_g1 and t66800r_g1 were newly generated to represent the measurement 
		of the Big Five in wave 10 with 21 instead of 11 items; a total of 10 items of the previous Big Five instrument from 
		the waves 3 and 5 are part of the new measurement; only the variable t66800k is not included in the new instrument, so 
		that the original index variable t66800b_g1 is not filled for wave 10
	
	- versioning of items: 
		former variable t515020 renamed to t515030_v1
		former variable t515021 renamed to t515031_v1
		former variable t515022 renamed to t515032_v1
		former variable t515023 renamed to t515033_v1
		former variable t515024 renamed to t515034_v1
		former variable t515025 renamed to t515035_v1
		former variable t515026 renamed to t515036_v1

	- the variables t51503* refer to occupations selected to be primary; the variables t51503*_v1 refer to current occupations
		(current in this specific wave)

	- the coding/polarity of the categories in variable t428050 has changed since wave 10 because the field instrument was processed 
		in this way:
		- in SUF releases before version 11.0.0 the coding was: 1="very strongly"; 2="strongly"; 3="average"; 4="hardly"; 5="not at all"
		- in SUF releases since version 11.0.0 the coding is: 1="not at all"; 2="hardly"; 3="average"; 4="strongly"; 5="very strongly"
		
xTargetSpecialNeedsCompetencies:
	- data on metacognition from wave 2 (mpa9re*) and wave 10 (mpa10re*) were added


===================================================
* Changes introduced to NEPS:SC4 by version 10.0.0 *
===================================================

EditionBackups:
	- this new dataset has been incorporated into the Scientific Use File for the first time; it contains raw values 
		before data edition


===================================================
* Changes introduced to NEPS:SC4 by version 9.1.1 *
===================================================

CohortProfile:
	- The education type variables (tx28301-tx28304) have been corrected and updated

pTargetCATI:
	- variables 'First combination of subjects applied for - subject #' [tf40231, tf40232, tf40233] had not been encoded 
		appropriately in versions 9.0.0 and 9.1.0; this has been fixed, integrating a total of five different variants
		for encoding fields of study according to the Destatis (German Federal Statistical Office) [_g1, _g2]
		and ISCED-97 fields of education [_g3, _g4, _g5] classifications

	- variables 'Second combination of subjects applied for - subject #' [tf40235, tf40236, tf40237] had not been encoded 
		appropriately in versions 9.0.0 and 9.1.0; this has been fixed, integrating a total of five different variants
		for encoding fields of study according to the Destatis (German Federal Statistical Office) [_g1, _g2]
		and ISCED-97 fields of education [_g3, _g4, _g5] classifications

pTargetCAWI:
	- variable 'Reference subject learning environment' [t242400] had not been encoded appropriately in versions
		9.0.0 and 9.1.0; this has been fixed, integrating a total of five different variants for encoding fields of
		study according to the Destatis (German Federal Statistical Office) [_g1, _g2] and ISCED-97 fields of
		education [_g3, _g4, _g5] classifications

spVocTrain:
	- variables 'Subject of studies, doctorate, habilitation 1' [ts15404], 'Subject 2' [ts15405], and 'Subject 3' [ts15406]
		had not been encoded appropriately in versions 9.0.0 and 9.1.0; this has been fixed, integrating a total of
		five different variants for encoding fields of study according to the Destatis (German Federal Statistical Office)
		[_g1, _g2] and ISCED-97 fields of education [_g3, _g4, _g5] classifications



===================================================
* Changes introduced to NEPS:SC4 by version 9.1.0 *
===================================================

CohortProfile:
	- An additional dummy variable (tx28304) has been generated to identify persons who are employed or in a professional measure
		tx28301 -- Education type - Pupil
		tx28302 -- Education type - Apprentice
		tx28303 -- Education type - Student
		tx28304 -- Education type - employee/in professional measure
		BUT: Release 9.0.0 contains only the variables tx28301, tx28302 and tx28303 for wave 9, whereas release 9.1.0 contains the same
		variables only for wave 8, plus tx28304 for wave 9. This will be corrected in 9.1.1. In the interim, the variables of the two
		releases can be combined.
	- Panel frame variable (tx80230) has been corrected (error occurred only in version 9.0.0)


Spell-files:
	- variable 'dauertan' has been modified. In previous releases this variable had been recoded. As this recoding made the
		reconstruction of the selection of filters almost impossible in many cases, it has been revoked.

spEmp:
	- two new variables have been implemented to reconstruct the selection of filters
		tf23912 -- under 18
		tf23913 -- under 21 and no completed vocational training

xTargetCompetencies:
	- new English items have been implemented and the old items have been removed
	- one new ICT item has been implemented and new WLEs have been calculated
	

===================================================
* Changes introduced to NEPS:SC4 by version 9.0.0 *
===================================================

General:
	- data from waves 8 and 9 have been incorporated into this release

xTargetSpecialNeedsCompetencies:
	- starting with this release, data from competency assessments in special needs schools will be disseminated;
		thus, the new dataset file 'xTargetSpecialNeedsCompetencies' has been introduced;
		with this release, it offers competency data from assessments in wave 1 for domain "DGCF" only,
		but will be consecutively expanded in future major releases

Biography:
	- in version 7.0.0, the generated dataset Biography erroneously did not contain all 3,291 school episodes that
		had been retrospectively reported via a separate module solely in CATI interviews of wave 4;
		this has been fixed; in version 7.0.0, the episodes from version 6.0.0's Biography dataset
		can be inserted instead, for instance by using the following Stata syntax (file paths have to be adjusted):
		* -------------------------BEGIN Stata-------------------------------
		local biography_700 "Z:/SUF/Download/SC4/SC4_D_7-0-0/Stata14/SC4_Biography_D_7-0-0.dta"
		local biography_600 "Z:/SUF/Download/SC4/SC4_D_6-0-0/Stata14/SC4_Biography_D_6-0-0.dta"
		local spschool_700 "Z:/SUF/Download/SC4/SC4_D_7-0-0/Stata14/SC4_spSchool_D_7-0-0.dta"
		tempfile insertepisodes
		// identify the episodes that exist in spSchool 7.0.0 but are missing from Biography 7.0.0
		use `"`biography_700'"' , clear
		preserve
		keep if sptype==22
		generate subspell=0
		merge 1:1 ID_t splink subspell using `"`spschool_700'"' , keep(using) nogenerate
		drop if subspell!=0
		keep if spms==-21
		assert `c(N)'==3291
		keep ID_t splink
		isid ID_t splink
		merge 1:1 ID_t splink using `"`biography_600'"' , keep(match) nogenerate
		save `"`insertepisodes'"'
		restore
		append using `"`insertepisodes'"'
		sort ID_t splink
		* ---------------------------END Stata--------------------------------
		note that this workaround will not remove data edition gap episodes that may have been automatically inserted
		into the Biography dataset as replacements for these episodes


===================================================
* Changes introduced to NEPS:SC4 by version 7.0.0 *
===================================================

General:
	- data from wave 7 have been incorporated into this release

CohortProfile:

	- in version 6.0.0, the CohortProfile variable "Individual tracking: Type of school (PAPI)" [tx80232] suffered from a coding error; this has been fixed;
		in version 6.0.0, the fix can be applied manually using the following Stata syntax:
		. recode tx80232 (8=10) (7=9) (6=8) (5=7) (4=6) (3=5) (2=4) (1=3)

pTarget:
	- the concept of reflecting migrational background in NEPS SUFs has been improved in order to also represent migrants in 3.75th generation;
		thus, the older variables on migrational background [t400500_g1,t400500_g2,t400500_g3] in the pTarget dataset have been renamed using
		the "v1" suffix [t400500_g1v1,t400500_g2v1,t400500_g3v1], and the new ones have been introduced
	- changed variable name of Item t531020 to t513020 for consistency reasons

pTargetCATI:
	- changed variable name of item t32401b to t32408b for consistency reasons

pParent:
	- the concept of reflecting migrational background in NEPS SUFs has been improved in order to also represent migrants in 3.75th generation;
		thus, the older variables on migrational background [p400500_g1,p400500_g2,p400500_g3] in the pParent dataset have been renamed using
		the "v1" suffix [p400500_g1v1,p400500_g2v1,p400500_g3v1], and the new ones have been introduced

spSibling:
	- starting with this release, ISCED, CASMIN and years of education were derived for siblings when possible [p732313_g1,p732313_g2,p732313_g3]

xTargetCompetencies:
	- wave 7 data contain competency test data from school leavers for the first time
	- enrichment of former wave competency data with additional WLEs and Sumscores for several domains


===================================================
* Changes introduced to NEPS:SC4 by version 6.0.0 *
===================================================

General:
	- starting with this release, all NEPS Scientific Use Files will ship with an additional, unicode-enabled Stata data set version;
		this version is only readable in Stata version 14 or newer, and is placed in the subdirectory "Stata14"
	- translations for all meta data (variable and value labels, question texts, etc.) have been revised and completed
	- meta data for all variables have been revised and updated where appropriate
	- additional waves 5 and 6 have been incorporated into the data

CohortProfile:
	- in version 4.0.0, variable "State of participation/attrition" [tx80220] erroneously did not reflect the attrition of cases that had
		finally dropped out; these cases were categorized as "temporary drop out" instead; this has been fixed

pCourseClass:
	- linkage recommendation variable [ex20100] had been erroneously calculated across waves, not inside each wave, in version 4.0.0;
		this has been fixed

pCourseGerman:
	- linkage recommendation variable [ex20100] had been erroneously calculated across waves, not inside each wave, in version 4.0.0;
		this has been fixed

pCourseMath:
	- linkage recommendation variable [ex20100] had been erroneously calculated across waves, not inside each wave, in version 4.0.0;
		this has been fixed

pEducator:
	- due to a merging error, version 4.0.0 of the dataset did not contain information about educators from wave 1; this has been fixed

pInstitution:
	- version 4.0.0 contained wrong value and variable labels for variable [h228002], which erroneously equaled the labels of variable [h228000];
		the correct variable label is: "School: number of schools of the same type in the vicinity"; this has been fixed

pTarget:
	- the filename of the dataset "pTargetPAPI" has been changed to "pTarget", as it now contains additional information from the CAWI interview in wave 5
	- please refer to the field and methods report online for details on the wave 5 CAWI survey
	- variables concerning expected income [t513011-t513020] in the "pTargetPAPI" data set in version 4.0.0 have been erroneously renamed to match variable names
		from questions in newer waves, but with different wording; this has been fixed
	- PAPI variables containing the result of automated multi-answer coding suffered from an error in the recoding routine for version 4.0.0 and
		were erroneously only filled with missing values; this has been fixed
		variables affected by this issue are:
		"Mother tongue"					[t41000a] in the "pTargetPAPI" data set
		"Second language"				[t410010] in the "pTargetPAPI" data set
		"Nationality"					[t40115a] in the "pTargetPAPI" data set
		"Mother: Mother tongue"				[t41010a] in the "pTargetPAPI" data set
		"Father: Mother tongue"				[t41012a] in the "pTargetPAPI" data set

	- a set of PAPI variables suffered from a meta data rename error in version 4.0.0 and contained erroneous labels; this has been fixed
		variables affected by this issue are (complete set of variables including unaffected variables):
		"Items at home: desk" 				[t34006a] in the "pTargetPAPI" data set
		"Items at home: room" 				[t34006b] in the "pTargetPAPI" data set
		"Items at home: education software" 		[t34006c] in the "pTargetPAPI" data set
		"Items at home: classic literature" 		[t34006d] in the "pTargetPAPI" data set
		"Items at home: poetry books" 			[t34006e] in the "pTargetPAPI" data set
		"Items at home: art (paintings)" 		[t34006f] in the "pTargetPAPI" data set
		"Items at home: books for homework" 		[t34006g] in the "pTargetPAPI" data set
		"Items at home: dictionary" 			[t34006h] in the "pTargetPAPI" data set

pTargetCATI:
	- when variable "Global self-esteem" [t66003a_g1] in the pTargetCATI data set was generated for version 4.0.0, variable "Global self-esteem: competence" [t66003d] had erroneously been ignored;
		this has been fixed; t66003a_g1 can be re-generated in version 4.0.0 data using the following Stata syntax:
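		* -------------------------BEGIN Stata-------------------------------
		* run the whole block in one go, as it relies on local macros and temporary variables
		* name of the variable to re-generate; the faulty version shipped with 4.0.0 is dropped first
		local target_variable t66003a_g1
		capture drop `target_variable'
		* recode NEPS missing codes to Stata missing values (requires the NEPS ado nepsmiss)
		nepsmiss t66003a t66003b t66003c t66003d t66003e t66003f t66003g t66003h t66003i t66003j
		tempvar t66003b_r t66003e_r t66003f_r t66003h_r t66003i_r rowmissings
		* reverse-code items b, e, f, h and i
		recode t66003b (1=5) (2=4) (3=3) (4=2) (5=1), generate(`t66003b_r')
		recode t66003e (1=5) (2=4) (3=3) (4=2) (5=1), generate(`t66003e_r')
		recode t66003f (1=5) (2=4) (3=3) (4=2) (5=1), generate(`t66003f_r')
		recode t66003h (1=5) (2=4) (3=3) (4=2) (5=1), generate(`t66003h_r')
		recode t66003i (1=5) (2=4) (3=3) (4=2) (5=1), generate(`t66003i_r')
		* count the missing items per row
		egen `rowmissings'=rowmiss(t66003a `t66003b_r' t66003c t66003d ///
		`t66003e_r' `t66003f_r' t66003g `t66003h_r' `t66003i_r' t66003j)
		* sum score over all ten items, only for wave 3 rows without missing items
		egen `target_variable'=rowtotal(t66003a `t66003b_r' t66003c t66003d ///
		`t66003e_r' `t66003f_r' t66003g `t66003h_r' `t66003i_r' t66003j) if `rowmissings'==0 & wave==3
		* code -54 for all rows outside wave 3
		replace `target_variable'=-54 if wave!=3
		label variable `target_variable' "Global self-esteem"
		* code -55 for wave 3 rows where no sum score could be computed
		replace `target_variable'=-55 if missing(`target_variable')
		* ---------------------------END Stata--------------------------------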

	- SUF version SC4-4.0.0 contained the variable [tf11153]; 	it has been renamed to [tf11154] in SUF version SC4-6.0.0
	- SUF version SC4-4.0.0 contained the variable [t31035a_v1];	it has been renamed to [t31035a] in SUF version SC4-6.0.0
	- SUF version SC4-4.0.0 contained the variable [t31035a]; 	it has been renamed to [t31035b] in SUF version SC4-6.0.0
	- SUF version SC4-4.0.0 contained the variable [t31135a_v1]; 	it has been renamed to [t31135a] in SUF version SC4-6.0.0
	- SUF version SC4-4.0.0 contained the variable [t31135a]; 	it has been renamed to [t31135b] in SUF version SC4-6.0.0
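
	- syntax written for version 6.0.0 can be applied to version 4.0.0 data after renaming the variables accordingly; the following is only a minimal sketch,
		assuming the version 4.0.0 pTargetCATI data are already in memory and still carry the old names; the order of the commands matters, because
		[t31035a] and [t31135a] must be renamed first before the former "_v1" variables can take over these names
		* -------------------------BEGIN Stata-------------------------------
		rename tf11153 tf11154
		* free the names t31035a and t31135a first ...
		rename t31035a t31035b
		rename t31135a t31135b
		* ... then move the former _v1 variables to these names
		rename t31035a_v1 t31035a
		rename t31135a_v1 t31135a
		* ---------------------------END Stata--------------------------------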

pParent:
	- as wave 5 data makes this a panel dataset, the filename has changed from "xParent" to "pParent"

xTargetCompetencies:
	- competency data for wave 3 test "English as a foreign language" could not be delivered in due time for version 4.0.0 by the
		responsible data editors; integration into the "xTargetCompetencies" dataset followed with the version 6.0.0 release;
		this also slightly affects "State of participation/attrition" [tx80220] if some respondents participated in testing,
		but not in the survey

Biography:
	- additional spells of type "data edition gap" have been inserted to fill gaps between
		(a) the eighth birthday and the first reported episode and
		(b) the most recently reported episode and the most recent interview date

===================================================
* Changes introduced to NEPS:SC4 by version 4.0.0 *
===================================================

General:
	- SPSS data sets now ship with the same "VARIABLE ATTRIBUTES" as Stata data sets' "characteristics"
	- metadata for all datasets has been revised and updated where appropriate
	- variables now ship with a characteristic 'NEPS_instname' attached in Stata datasets, reporting the variable name used in the survey
	- wave 3 and 4 data have been fully integrated into the datasets
	- several bugfixes and enhancements have been integrated into this new release, influencing various variables;
		only the most important ones are listed in this change log
	- data from the first two surveys of persons leaving the school system for vocational education have been added (waves 3 and 4)
	- data from the third survey in the school system have been added (wave 3)

CohortProfile:
	- a variable indicating the current type and track of school per class has been generated (t723080_g1)

pTargetPAPI:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xTarget" to "pTargetPAPI"
	- variables indicating migrational background (t400500_g1 through _g3) have been added

spParentSchool:
	- as there are several sources for biography episodes from wave 3 onward, those originating
		from the parent interview are prefixed by "Parent", changing the filename from "spSchool" to "spParentSchool"
	- the interview process in parent interviews does not guarantee unique ids for parents;
		thus, the identifier in this dataset is no longer "ID_p", but the target person's "ID_t"
	
spParentGap:
	- as there are several sources for biography episodes from wave 3 onward, those originating
		from the parent interview are prefixed by "Parent", changing the filename from "spGap" to "spParentGap"
	- the interview process in parent interviews does not guarantee unique ids for parents;
		thus, the identifier in this dataset is no longer "ID_p", but the target person's "ID_t"

RepWeights:
	- replication weights have been transferred to a separate dataset "RepWeights"

pEducator:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xEducator" to "pEducator"

pInstitution:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xInstitution" to "pInstitution"

pCourseClass:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xCourseClass" to "pCourseClass"

pCourseGerman:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xCourseGerman" to "pCourseGerman"

pCourseMath:
	- as wave 3 data makes this a panel dataset, the filename has changed from "xCourseMath" to "pCourseMath"


===================================================
* Changes introduced to NEPS:SC4 by version 1.1.0 *
===================================================

General:
	- metadata for all datasets has been revised and updated where appropriate
	
CohortProfile:
	- variable 'Gender (plausible)' [tx29001] has been generated, reflecting all reported information from surveys of target persons in waves 1 and 2,
		the parental interview and the student lists of waves 1 and 2 (in this logical sequence, read: information given by target persons in wave 1 supersedes
		information given by target persons in wave 2, which supersedes information from the parental interview and so on)
	- variable 'Year of birth (plausible)' [tx2900y] has been generated, reflecting all reported information from surveys of target persons in waves 1 and 2,
		the parental interview and the student lists of waves 1 and 2 (in this logical sequence, read: information given by target persons in wave 1 supersedes
		information given by target persons in wave 2, which supersedes information from the parental interview and so on)
	- variable 'Month of birth (plausible)' [tx2900m] has been generated, reflecting all reported information from surveys of target persons in waves 1 and 2,
		the parental interview and the student lists of waves 1 and 2 (in this logical sequence, read: information given by target persons in wave 1 supersedes
		information given by target persons in wave 2, which supersedes information from the parental interview and so on)
	- variable 'Data available: competency test subject' [tx80522] was erroneous and has been corrected,
		which also influences variable 'State of participation/attrition' [tx80220]
	- variables weight_design_std and weight_design have been removed (moved to Weights, see below)
	
xTarget:
	- variables 'Month of birth' [t70004m], 'Year of birth' [t70004y] and 'Gender child' [t700031] are now stored as wide variables (suffixes _w1 and _w2),
		reflecting that the corresponding questions were asked in both waves 1 and 2
	- variable 'Gender (plausible)' [tx29001] has been added from CohortProfile (see detailed description there)
	- variable 'Year of birth (plausible)' [tx2900y] has been added from CohortProfile (see detailed description there)
	- variable 'Month of birth (plausible)' [tx2900m] has been added from CohortProfile (see detailed description there)
	- added string version of 'Idealistic/Realistic vocational aspirations: Preferred choice of career': [t31060a_O] [t31160a_O]
	- variable [DELIMITER1] has been generated, separating blocks of variables originating from basic questions and wave 1 questions
	- variable [DELIMITER2] has been generated, separating blocks of variables originating from wave 1 questions and wave 2 questions
	- ISEI-08: updated transcoding scheme implemented [tf00260_g14], [tf0029b_g14], [tf0021b_g14], [t31060a_g14], [t31160a_g14], [t731422_g14], [t731472_g14], [tf00070_g14], [tf0013b_g14]
	- slightly modified general transcoding scheme with minor consequences for all derived variables [*_g1 to *_g16]
	- recoding of language variables ([t41000a_*], [t41010a_*], [t41012a_*], [t410010_*]) has been revised and updated if necessary
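
	- for analyses that need one row per person and wave, the wide variables [t70004m], [t70004y] and [t700031] described above can be brought into long format;
		the following is only a minimal sketch, assuming the xTarget data are in memory, that "ID_t" uniquely identifies each row and that the
		variables carry exactly the suffixes _w1 and _w2; the variable name "wave" for the new wave indicator is chosen here for illustration
		* -------------------------BEGIN Stata-------------------------------
		* keep only the identifier and the wide variables
		keep ID_t t70004m_w1 t70004m_w2 t70004y_w1 t70004y_w2 t700031_w1 t700031_w2
		* one row per person and wave; "wave" takes the values 1 and 2
		reshape long t70004m_w t70004y_w t700031_w, i(ID_t) j(wave)
		* remove the leftover "_w" stub from the variable names
		rename (t70004m_w t70004y_w t700031_w) (t70004m t70004y t700031)
		* ---------------------------END Stata--------------------------------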

xParent:
	- ISEI-08: updated transcoding scheme implemented [p296402_g14], [p731904_g14], [p731954_g14]
	- slightly modified general transcoding scheme with minor consequences for all derived variables [*_g1 to *_g16]

xEducator:
	- ISEI-08: updated transcoding scheme implemented [e537061_g14], [e537062_g14]
	- slightly modified general transcoding scheme with minor consequences for all derived variables [*_g1 to *_g16]

xTargetCompetencies: 
	- new variables for procedural metacognition [*_sc6] and L1-Targetpopulation [*_sc7] have been added
	- 146 empty cases have been removed
	- the datafile has been revised and missing values have been tagged more precisely

Weights:
	- a new datafile for weights has been generated; all weights previously stored in CohortProfile can now be found here

xTargetMicrom, xInstitutionMicrom:
	- new regional data (microm data) have been added