The Economic Value of Breaking Bad: Misbehavior, Schooling and the Labor Market∗

Nicholas W. Papageorge†
Department of Economics, Johns Hopkins University

Victor Ronda‡
Department of Economics, Johns Hopkins University

Yu Zheng§
Department of Economics and Finance, City University of Hong Kong

April 4, 2015

Abstract: Prevailing research argues that childhood misbehavior in the classroom is bad for schooling and, presumably, bad overall. In contrast, we argue that childhood misbehavior captures underlying non-cognitive skills that are potentially valuable in the labor market. We follow work from psychology and summarize observed classroom misbehavior as two underlying latent factors. Next, we estimate a model of education decisions and labor market outcomes, allowing the impact of each of these two factors to vary by outcome. We show the first evidence that one of the factors driving childhood misbehavior, discussed in psychological literature as externalizing behavior (and linked, for example, to aggression), does indeed reduce educational attainment, but also increases earnings. This finding highlights a broader point: non-cognitive skills are not well summarized as a one-dimensional object that is either good or bad per se. Using the estimated model, we assess competing pedagogical policies. We find that policies aimed at eliminating externalizing behavior increase schooling attainment, but also reduce earnings. In comparison, policies that decrease the schooling penalty of externalizing behavior increase both schooling and earnings.

Keywords: Labor, Education, Non-Cognitive Skills
JEL Classification: J10 J20 I20

∗ We gratefully acknowledge helpful comments from: Seth Gershenson, Barton Hamilton, Hans von Kippersluis, Patrick McAlvanah, Robert Moffitt, Albert Park and Richard Spady along with the seminar participants at the City University of Hong Kong, Tinbergen Institute, the Brookings Institution and Georgetown University. The usual caveats apply.
† [Corresponding author] [email protected]. ‡ [email protected]. § [email protected]

1 Introduction

A burgeoning literature in economics has established that non-cognitive skills drive a wide variety of socially relevant behaviors and outcomes (Heckman and Rubinstein, 2001). These skills, which encompass character or personality, are sometimes called soft skills, character skills or social skills and have been shown to influence labor supply, earnings, health, education and partnership. Nevertheless, a critical feature of non-cognitive skills has received scant attention in previous literature. Though few would dispute that better health and stronger cognition improve outcomes on most any conceivable dimension, it is generally not meaningful to think of non-cognitive skill as a one-dimensional object that is either good or bad per se. For example, some non-cognitive skills are productive in one employment sector and counterproductive in another. Still others seem to capture preferences for activities in which they simultaneously lower productivity. Moreover, these skills are relatively mutable, at least until about age 30, which suggests that interventions during childhood could be designed to modify them.1 If such interventions are designed around the role that non-cognitive skills play in economic outcomes, a clear understanding of their impacts, including sector-specific differences in their returns, is of enormous policy relevance.

In this paper, we examine a widely studied pair of non-cognitive skills, both of which are identified from teachers’ measurements of misbehavior or maladjustment among schoolchildren. The skills in question are known as externalizing behavior and internalizing behavior.2 Externalizing behavior is often linked to antisocial behavior, rule-breaking and conduct disorders. Examples of associated behaviors include displays of aggression, delinquency and hyperactivity (Aizer, 2008). Internalizing behavior, on the other hand, is linked to anxiety and depression (Duncan and Magnuson, 2011; Duncan and Dunifon, 2012).
Associated behaviors include shyness, unassertiveness, fearfulness and a tendency to withdraw from social situations.3 Intuitively, these two non-cognitive skills capture different psychological and behavioral tendencies, which can potentially lead to different impacts on socio-economic outcomes.

1 See Carneiro and Heckman (2003), Cunha and Heckman (2008) and Borghans et al. (2008).
2 Nomenclature describing non-cognitive skills is not yet settled. Following previous work, we use “externalizing behavior” and “internalizing behavior” to describe the latent factors that are identified using childhood misbehavior. Together, these factors are described as “non-cognitive skills”.
3 Several studies have examined the relationship of these two behaviors to better-known measures like the “Big 5” personality traits. There is some evidence that externalizing behaviors are related to conscientiousness, agreeableness, and openness to new experience, while internalizing behaviors are mostly related to neuroticism (Ehrler, Evans, and McGhee, 1999; Almlund et al., 2011).

Using a longitudinal data set from Britain, the National Child Development Study (NCDS), we estimate a model of economic decisions and outcomes, including schooling, partnership, fertility, employment, work experience, wages and hours. In the empirical implementation, economic decisions and outcomes are approximated as a series of linear-in-parameters equations, where schooling and early life-cycle outcomes are treated as endogenous regressors in equations for later life-cycle outcomes. For example, schooling outcomes affect the partnership decision and both schooling and partnership affect labor supply and earnings. Correlation across equations is modeled as unobserved heterogeneity in the form of three latent factors capturing cognition, externalizing behavior and internalizing behavior. The distribution of these underlying factors and the factor loadings that map them to observed measurements (i.e. test scores and teacher reports of classroom behavior) are estimated jointly with all schooling and economic outcome parameters via simulated maximum likelihood. We also estimate the model separately for men and women to accommodate gender differences in the magnitude, direction and impact of childhood misbehavior on economic outcomes. Allowing latent factors to have different impacts across outcomes, we find that not only are the two non-cognitive skills quantitatively important at a scale comparable to cognition, but that they also have distinct impacts on each outcome. While internalizing behavior appears to be detrimental to almost all outcomes, externalizing behavior is revealed to have mixed effects. In particular, externalizing behavior reduces educational attainment, but also carries an earnings premium later on.
The positive effect of externalizing behavior on earnings is significant after controlling for internalizing behavior, cognition, schooling and other decisions made earlier on in the life-cycle, such as partnership and fertility.4 In other words, we find that a penchant for breaking bad can actually signal something quite good.5 Our findings suggest that interventions designed to curb or eliminate childhood misbehaviors may be ill-conceived. A subset of children who misbehave may be expressing non-cognitive skills that signal high earning potential. If so, then efforts to reduce misbehavior can have costly unintended consequences for misbehaving individuals themselves and for the economy as a whole. For example, the Perry Preschool Program in Michigan decreased the development of externalizing behaviors (Heckman, Pinto, and Savelyev, 2013).6 We show that a policy that lowers externalizing behavior has the potentially adverse side effect of reducing earnings.

4 As we will highlight in Section 4, identifying benefits of some non-cognitive skills underlying misbehavior would not be possible if we used a single, aggregated measure of misbehavior as has typically been done in previous empirical work (for example Segal (2013)). In fact, using the NCDS data set, we can replicate the general result in earlier work that a single-dimensional non-cognitive skill capturing childhood misbehavior is bad for schooling and, once we control for schooling, for earnings as well.
5 According to www.urbandictionary.com the definition of the term breaking bad is to “challenge conventions” or to “defy authority”. Breaking Bad is also the title of an American television show in which the protagonist is an unsuccessful chemist who reveals a striking talent for producing illicit drugs. The show offers an extreme example of how certain skills or behaviors may lead to low productivity in one sector and high productivity in another.
6 Heckman and Kautz (2013) discuss the possibility of policies to boost certain non-cognitive skills and Segal (2008) also studies the malleability of skills related to childhood misbehavior in school.

Our results also speak to problems in how schooling and certification signal labor market potential. This point goes back to Heckman and Rubinstein (2001), who assess the General Educational Development (GED) program, a test designed to separate out bright high school dropouts from other dropouts. Ironically, it selects dropouts with relatively high cognition, but with low levels of non-cognitive skill. This means that taking the GED is a “mixed signal” and individuals with a GED earn less than other high school dropouts with similar cognitive abilities. Their finding suggests that educational attainment or certification is a potentially flawed signal of a future worker’s productivity. An informative signal should be designed to accurately reflect all skills that are productive in the labor market. In our context, if externalizing behavior enhances earnings capacity, then at the very least the education process should not penalize it.

To explore how schooling affects earnings through its impact on externalizing behavior, we use our model to examine two conceptually different policies, the Behavioral Policy and the Schooling Policy. A Behavioral Policy modifies the non-cognitive skill directly. It is motivated by results on interventions like the Perry Preschool Program showing that externalizing behavior can be substantially changed among young children (Heckman, Pinto, and Savelyev, 2013). A Schooling Policy aims to alter how non-cognitive skills underlying childhood misbehavior affect educational attainment. In making this distinction, we draw on pedagogical research that discusses “control-oriented” teaching methods, which are designed to reduce externalizing behaviors, versus “relationship-oriented” methods, which are designed to strengthen the learning environment for externalizing children.7 Using our model, we assess both a Behavioral Policy, which lowers externalizing behavior, and a Schooling Policy, which reduces the schooling penalty for externalizing behavior. Results show that, while both policies can increase educational attainment, the Behavioral Policy reduces earnings relative to the baseline while the Schooling Policy enhances earnings. For men, the gain from the Schooling Policy comes from increased hours and wages, while for women it comes exclusively from increased hours.8

7 For an overview of pedagogical techniques that foster a caring and positive student-teacher relationship, in particular, in dealing with student misbehavior, see Hamre and Pianta (2006). A simple example illustrates the difference in the two approaches. Young students who often initiate conversations with teachers at inopportune times could be punished for interrupting a lesson under the Behavioral Policy or could instead be given a “raincheck” and invited to initiate a discussion at an appropriate time under the Schooling Policy. The effectiveness of such practices is demonstrated by a randomized controlled trial of the My Teaching Partner-Secondary program (MTP-S), in which a web-mediated program on improving teacher-student in-class interaction has produced reliable gains in student achievement (Allen et al., 2011).
8 As we will show when presenting results, the underlying distribution of externalizing behavior is markedly different for boys versus girls. For boys, both the mean and the variance are higher. This finding is consistent with Bertrand and Pan (2013), who study gender differences in misbehavior. The fact that there is a stronger tendency to misbehave among boys is considered troubling enough that the title of their paper begins with the phrase “The Trouble with Boys”. In contrast, the point we want to make is that the “trouble” with boys may actually be something quite valuable in the workplace and that the real trouble might be in the way boys are schooled.

Our research contributes to three separate literatures. The first incorporates non-cognitive skills into models of rational decision-making to explain economic behavior. Much of this work can be traced to Heckman and Rubinstein (2001). Building on this work, economists have studied how non-cognitive skills relate to a host of outcomes, including marriage (Lundberg, 2012, 2011), education (Barón and Cobb-Clark, 2010; Savelyev, 2010; Gensowski, Heckman, and Savelyev, 2011; Heckman and LaFontaine, 2010; Heckman, Pinto, and Savelyev, 2013), health (Heckman, 2012) and labor market choices and outcomes (Heckman, Stixrud, and Urzua, 2006; Wichert and Pohlmeier, 2010; Heineck, 2010; Störmer and Fahr, 2013; Fortin, 2008; Urzua, 2008).9 While the aforementioned papers maintain that non-cognitive skills are adequately summarized as a one-dimensional object, a few exceptions in the literature have explored how non-cognitive skills have different impacts in different sectors and for different groups. Lundberg (2012) shows that the labor market returns to personality traits vary both by tenure and by educational group. Hamilton, Pande, and Papageorge (2014) show that certain personality traits capture a preference for self versus paid employment and simultaneously carry an earnings penalty in self employment. More similar to our paper, Cattan (2011), using NLSY data, finds that gender differences in cognition and self-confidence contribute to a significant portion of the gender wage gap, especially at early career stages. In contrast, our focus is on measures of childhood maladjustment, which we find have negative impacts on schooling, but mixed effects on other labor market choices and outcomes.

9 Excellent summaries of the state of this line of research are found in Borghans et al. (2008) and Almlund et al. (2011). The techniques used in this literature draw upon Goldberger (1972) and Jöreskog and Goldberger (1975).

We also contribute to a literature linking childhood characteristics and behaviors to long-term outcomes such as education, employment and earnings. Currie (2001) and Currie (2009) show that early childhood health disparities can affect future health and labor market outcomes through a variety of mechanisms, including performance at school. These links suggest that human capital investments during childhood can have huge payoffs in adulthood (Heckman and Masterov, 2007; Doyle et al., 2009; Cunha, Heckman, and Schennach, 2010). In light of these links, it is not surprising that researchers have considered the impact of childhood non-cognitive skills, typically identified using measures of misbehavior, on labor market outcomes. Segal (2013) shows that misbehavior during the eighth grade can have a negative impact on future earnings even after controlling for schooling attainment. Similarly, Sciulli (2012) shows that adult employment outcomes are negatively related to childhood maladjustment. Among studies that use the same dataset we use, Carneiro, Crawford, and Goodman (2007) consider how non-cognitive skills measured in childhood impact a variety of outcomes in later stages of life, while Gregg and Machin (2000) and Weiss (2010) focus in particular on their impact on wages.10 These studies view misbehavior as reflecting a single factor and make policy recommendations accordingly. We show evidence that this type of simplification obscures the positive impact of externalizing behavior on outcomes, which has not been recognized in previous work, and argue for a more nuanced design of interventions.

10 There is some work in psychology and sociology literature that uses the NCDS data to examine selection into occupations. Jackson (2006) shows that having low levels of internalizing behaviors is an important predictor of managerial occupations. Further examples include Farmer (1993) and Farmer (1995), who show that boys who display high levels of externalizing behavior leave school earlier, obtain fewer qualifications, and begin their careers in lower social class positions. However, these studies do not control for internalizing behavior and hence suffer from omitted variables bias. We discuss this bias when presenting preliminary results.

Finally, our work contributes to research in economics studying externalizing behavior. For example, Heckman, Pinto, and Savelyev (2013) show that an early childhood intervention (the Perry Preschool Program) raised earnings and that about 20% of this rise is attributable to a reduction in externalizing behavior. In contrast, we find that, for a 1958 British cohort, externalizing behavior raises earnings. To explore this difference, we consider a sub-sample of the British cohort that is selected to mimic the financially disadvantaged group studied in Heckman, Pinto, and Savelyev (2013). Among these individuals we find that externalizing behavior carries no earnings premium and may even carry a small penalty. That is, differences in the population under study matter for findings. This is reminiscent of the argument made by Lundberg (2012) that the payoff to non-cognitive skills is context-dependent and may vary by socioeconomic status. It also echoes our results on gender differences in the economic impacts of externalizing behavior and, more broadly, suggests that policies that shape behaviors may have different impacts on different groups. We return to this point in the conclusion where we discuss possible extensions to this research.

The rest of the paper is organized as follows. In Section 2, we discuss externalizing and internalizing behavior. In Section 3, we describe the data set. In Section 4, we present preliminary findings. In Section 5, we describe the econometric framework and the estimation procedure, along with parameter estimates. In Section 6, we show results from the structural model. In particular, we explain how skills related to childhood misbehavior affect sociodemographic and economic outcomes at various stages of the life-cycle. In Section 7, we use the estimated model to assess the effects of the Behavioral Policy versus the Schooling Policy. Section 8 concludes.

2 Externalizing Behavior and Internalizing Behavior

In our empirical analysis, we model unobserved heterogeneity as three underlying factors: externalizing behavior and internalizing behavior, both of which are non-cognitive skills, and cognition. We rely on teachers’ measures of misbehavior and maladjustment to construct the non-cognitive skills. When a child in the sample is 11 years old, the child’s teacher is asked a series of questions regarding the child’s behavior in school. From the teacher’s responses, ten BSAG maladjustment syndrome scores are constructed, where BSAG stands for the Bristol Social Adjustment Guide.11 For a general survey on the use of BSAG maladjustment measures, see Shepherd (2013). Ghodsian (1977) was the first to show that the BSAG maladjustment syndrome variables could be described by two independent latent factors, each representing a single non-cognitive skill. This is done with principal components analysis. Intuitively, this type of analysis is used to determine the correct number of independent underlying factors that generate a set of observed variables, where the variables are seen as imperfect measures of the underlying factors. We replicate the analysis using our data in Appendix A and confirm earlier findings that two independent latent factors adequately describe the BSAG variables. The first factor corresponds to anxious, aggressive, outwardly expressed or externalizing behavior and includes maladjustment syndromes like hostility towards adults and restlessness. The second factor corresponds to withdrawn, inhibited or internalizing behavior and includes maladjustment syndromes like depression. The two factors have been studied extensively by psychologists researching child development and, of late, by some economists (Blanden, Gregg, and Macmillan, 2006; Aizer, 2008; Agan, 2011).12

Table 1 lists each of the three factors used in our analysis along with the observed measures we use to identify it.
The first two are the externalizing and internalizing behaviors, listed along with their associated measures from among the ten BSAG maladjustment measures. The third factor is cognition, which is measured using several test scores.13

11 These maladjustment syndromes were first developed by Stott (1958) and have been used to assess the psychological development of children ever since. The assessments have been externally validated in the sense that they have been found to be significantly positively correlated with a range of other measurements of social maladjustment from teachers, professional observers, parents and peers (Achenbach, McConaughy, and Howell, 1987).
12 Both Aizer (2008) and Agan (2011) study the impact of externalizing behavior on anti-social or criminal behavior.
13 For a general survey on the use of externalizing and internalizing behaviors, see Duncan and Magnuson (2011) and Duncan and Dunifon (2012).

Table 1: Latent Factors and their Measurements

Latent Skill              Measures

Externalizing Behavior    Hostility Towards Adults
                          Hostility Towards Children
                          Anxiety for Acceptance by Adults
                          Anxiety for Acceptance by Children
                          Restlessness
                          Inconsequential Behavior
                          Writing Off of Adults and Adult Standards

Internalizing Behavior    Depression
                          Withdrawal
                          Unforthcomingness
                          Writing Off of Adults and Adult Standards

Cognition                 Reading Comprehension Test Score
                          Mathematics Test Score
                          Non Verbal Score on General Ability Test
                          Verbal Score on General Ability Test
                          Copying Designs Test Score

This table lists the three latent factors used in the empirical analysis (externalizing behavior, internalizing behavior and cognition) and the observed measures used to identify them. Measures for externalizing and internalizing behaviors are the BSAG maladjustment variables, derived from teacher measurements of misbehavior. For cognition, a series of aptitude test scores are used as measures. See Appendix A for further details.

While we find the fact that two underlying factors explain the BSAG variables fairly unassailable, the mapping between the two factors and the BSAG measurements is a more delicate issue. For some measurements such as hostility towards adults and children and inconsequential behavior, both common sense and factor analysis point in the same direction. Namely, all aforementioned measurements represent outwardly expressed behaviors and are strongly related to the first factor in the factor analysis (see Appendix A). This is also the case for depression, unforthcomingness and withdrawal, all of which represent inwardly expressed behaviors and are strongly related to the second factor in the factor analysis. It is less clear for other measurements, as is the case with writing off adults and standards, which could represent an outwardly or inwardly expressed behavior and is statistically related to both groupings. In this case, we follow previous work and allow the measured behavior to be related to both groupings (Ghodsian, 1977; Shepherd, 2013). We explain this process in greater detail in Appendix A.14
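
The factor-count logic described in this section can be illustrated with a small simulation. The sketch below generates ten hypothetical measures from two independent latent factors, with one measure loading on both, and inspects the eigenvalues of the correlation matrix. The loadings, noise scale, sample size and the eigenvalue-greater-than-one cutoff are all assumptions made for illustration; the analysis in Appendix A may use a different criterion, and no NCDS data are used here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical sample size

# Two independent latent factors (stand-ins for externalizing / internalizing).
theta = rng.standard_normal((n, 2))

# Hypothetical loadings for ten measures: six load on factor 1 only,
# three on factor 2 only, and one measure loads on both factors.
loadings = np.zeros((10, 2))
loadings[:6, 0] = 0.8
loadings[6:9, 1] = 0.8
loadings[9] = [0.5, 0.5]

# Observed measures = factor signal + idiosyncratic noise.
measures = theta @ loadings.T + 0.6 * rng.standard_normal((n, 10))

# Principal components of the correlation matrix: eigenvalues, largest first.
corr = np.corrcoef(measures, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]

# Assumed retention rule: keep components with eigenvalue above one.
n_factors = int((eigvals > 1.0).sum())
explained = eigvals / eigvals.sum()

print("eigenvalues:", np.round(eigvals, 2))
print("components retained:", n_factors)
print("variance share of retained components:", round(explained[:n_factors].sum(), 2))
```

In this simulated setting the two leading eigenvalues stand well apart from the rest, which is the pattern that would lead one to retain two factors.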

3 Data

In this section, we describe the data set used in our analysis. Next, we present summary statistics on education, labor market outcomes and measures of childhood maladjustment.

3.1 The National Child Development Study

The NCDS is an ongoing longitudinal survey that began by following the universe of individuals born in the same week in 1958 in Great Britain. The data set contains information on physical and educational development, wages, employment, family life, well-being, social participation and attitudes. The NCDS is particularly well-suited for our study since it measures misbehavior in school for a large sample of children and then follows these children into adulthood. Therefore, the data set allows us to relate measures of misbehavior in elementary school to adult decisions such as partnership and fertility along with labor market outcomes. To date, there have been eight surveys to trace all the members of the cohort still living in Great Britain. Surveys occurred when subjects were born and at ages 7 (1965), 11, 16, 23, 33, 42 and 50 (2008). For estimation, we focus on information gathered at birth and the first five sweeps, covering ages 7-33.

The NCDS initially contained information on 18,555 births. At the second wave, 15,356 of the original sample of babies remained as respondents (7,899 males and 7,457 females). By the fifth survey, at age 33, the interview sample included 11,407 individuals. In constructing our analysis sample, we keep respondents with valid information on educational attainment at age 33 and cognition and behavioral measures at age 11, which leaves us with 9,047 individuals. We also drop individuals with missing information on parents’ education and family characteristics at early ages, relationship status, fertility decisions and employment status at age 33. Further, we drop individuals with missing information on their employment history or who are reported as employed but have missing information on earnings. The resulting analysis sample has complete information on 5,999 individuals, of whom 2,889 are men and 3,110 are women.

14 We also performed the analysis under specifications where ambiguous measures are assigned to different factors. Results remain largely unchanged.

3.2 Summary Statistics

Summary statistics for the analysis sample are found in Table 2.15 In the United Kingdom, students’ progression beyond age 16 is based on a series of national examinations. At age 16, students in our sample have the option to leave school without any qualifications, to study and obtain a Certificate of Secondary Education (CSE), or to study and obtain the Ordinary Levels (O-Levels). The CSEs are administered for the less academically prepared students, while the O-Levels are more academically demanding.16 For those students who decide to stay in school at age 16, another set of examinations is available, the Advanced Levels (A-Levels). Students who are successful in their A-Levels are able to continue to higher education and obtain either a higher education diploma (after two years of study) or a bachelor’s degree (after three years of study). At the postgraduate level, students can obtain a higher degree: Master of Philosophy (MPhil) or Doctor of Philosophy (PhD). Given this system, individuals in our sample can sort into six mutually exclusive schooling levels: no formal education, CSE, O-Levels, A-Levels, higher education (including diploma and the bachelor’s) or higher degree (including MPhil and PhD).

According to Table 2, educational attainment for women in the analysis sample is far lower than it is for men. For example, roughly half the men obtain O-Level qualifications or less, whereas nearly two-thirds of women do. This disparity reflects the generation from which the sample is drawn as it is a 1958 cohort. In addition, about twice as many men as women reach A-Levels and stop schooling thereafter (19% versus 10%). In other words, men are more likely to take exams that would permit matriculation at a university and then fail to or choose not to actually go to a university.
Moving to other adult outcomes, although women and men are equally likely to have a partner, women are 20% more likely to have children, which suggests that the women in the sample have children at a younger age than the men. Together, these gender differences suggest that our empirical model should account for gender differences in the impact of observables and latent behaviors on schooling decisions.

The education disparity between men and women is reflected in hourly wages and weekly earnings differences. Conditional on working, hourly wages for men average about 9.59 pounds versus 6.26 pounds for women. However, education only offers a partial explanation for the wage and earnings disparity. From the same table, by the time they reach age 33, women have accumulated around 25% fewer months of experience than men. Indeed, we show in Figure 1 that, at each education level, men have higher wages, work longer hours and earn more. Nevertheless, there is some evidence that returns to schooling are proportionally higher for women than men (Panel 1(d) of Figure 1). Women with a higher degree earn 3 times as much as women with no formal education. For men, this ratio is 1.75. This difference may reflect actual differences in returns as well as gender differences in selection into schooling based on cognitive or non-cognitive skill, which our model will account for.

In Table 3, we present averages by gender for each BSAG maladjustment measure. The BSAG measures range from 0 to 15, with higher numbers indicating a higher prevalence of a particular maladjustment syndrome. The means are usually low due to a clustering around zero and fairly low values in general. Nonetheless, there are significant differences across gender. Girls tend to misbehave with lower frequencies than boys. Specifically, boys present higher average scores on all BSAG measures except for “anxiety for acceptance by adults”. For “inconsequential behavior” and “anxiety for acceptance by children”, the average for boys is roughly double what it is for girls. These findings are consistent with those documented by Duncan and Magnuson (2011), Duncan and Dunifon (2012) and Bertrand and Pan (2013). Gender differences in misbehavior are well-documented and a common topic of study in several fields, including developmental psychology and economics. Nonetheless, in contrast to previous work on the topic, we address the possibility that some measures of maladjustment may reflect skills with relatively high returns in the labor market.

15 Summary statistics for all individuals observed at age 11 are found in Tables S1 and S2 in Appendix A. We refer to this larger sample as the “full sample”. We compare the summary statistics of the analysis sample to that of the full sample to see if attrition is selective on observables. The main difference between the two samples is that individuals who do not attrit are more likely to attain higher education. To make sure that our main results are not affected by attrition, we redo the reduced-form analysis with the full sample and obtain similar results. We also re-estimate a simplified version of the structural model, excluding some adult outcomes such as fertility and partnership, so that we can use a larger sample. Main results do not change appreciably. Results from these robustness tests are available from the authors.
16 CSEs and O-Levels were replaced by the General Certificates of Secondary Education (GCSE) in 1986.

4 Preliminary Evidence on Misbehavior and Earnings

In this section, we present preliminary evidence that misbehavior is bad for schooling, but not necessarily bad for earnings. Specifically, we show that once we consider externalizing and internalizing behaviors separately, externalizing behavior has a positive impact on earnings. This finding holds under a variety of model specifications, which we discuss below.17 We end this section with a discussion of shortcomings of our preliminary analysis, which we use to motivate the structural model we specify and estimate in Section 5.

17 Results for some specifications are in Appendix B.


4.1 Reduced-Form Analysis on Misbehavior and Earnings

We first regress log weekly earnings at age 33 onto cognition, classroom misbehavior and dummy variables for whether the individual is female and lives in London (Column [1] in Table 4). We compute a measure for misbehavior by summing up the scores for the externalizing and internalizing behaviors. Whereas cognition raises earnings, childhood misbehavior, when aggregated, does in fact appear to lower earnings, in line with Segal (2013). Further, the coefficients indicate that females in the sample earn less than males and that those living in London earn more. Next, we follow literature from developmental psychology and view childhood misbehavior as reflecting two distinct factors. This means we construct measures of each factor for each individual given their reported misbehavior and test scores. In particular, we start from Table 1, which associates each skill with a list of relevant measures from the data. For each skill, we apply the regression scoring method (Thomson, 1951). The regression scoring method generates regression (or scoring) coefficients that are obtained by multiplying the matrix of factor loadings by the inverse of the measurement correlation matrix.18 Next, we use the resulting coefficients as weights to compute an average “score” for each individual and for each factor.19 Next, we include computed scores as additional regressors to predict log weekly earnings at age 33. Considering the two factors one at a time, we find that externalizing behavior appears to have no impact on earnings, whereas internalizing behavior significantly lowers earnings (Columns [2] and [3] of Table 4, respectively). Nevertheless, once we control for cognition, externalizing and internalizing behaviors together, externalizing behavior has a positive impact on earnings (Column [4]). That is, externalizing behavior captures a non-cognitive skill that carries an earnings premium. 
Moreover, this result is reinforced when we control for schooling attainment (Column [5]) and when we control for schooling along with fertility, partnership and experience (Column [6]). Running the regression for males and females separately, we find the positive impact of externalizing behavior on earnings to be more pronounced for females than for males (see Columns [7] and [8] for males and females, respectively).

In Table 5, we show that both externalizing and internalizing behaviors have a negative impact on schooling. From a multinomial logistic regression where the outcome variable is one of the six schooling levels, cognition has a positive impact on schooling, while both externalizing and internalizing behaviors lead individuals to sort into lower levels of education. Some previous work relies on this finding to argue that the total impact of misbehavior on earnings is negative since misbehavior lowers schooling, which in turn implies lower earnings. This line of argument misses the direct positive impact of externalizing behavior on earnings shown in Table 4.20

The results in Tables 4 and 5 are robust to a number of different specifications. Details on these additional specifications are found in Appendix B. However, we list key findings here. As a first robustness check we control for selection into education using the Lee (1983) and Dubin and McFadden (1984) methods for selection bias correction. We find that formally controlling for selection into schooling does not alter our results in any important way. If anything, the positive relationship between externalizing behaviors and earnings becomes stronger after we control for selection into schooling. We also explore the sensitivity of our results to the regression scoring method used to construct the unobserved factors. An alternative and simpler approach is to sum up the measurements associated with each latent factor. The results, which can be seen in Appendix B, are strikingly similar to the results in Tables 4 and 5, which suggests that our results are robust to the construction of the unobserved skills.21

Another relevant concern with the results in this section is that we do not allow for non-linearities or interactions in the impact of unobserved factors. For this reason, we estimate the relationship between externalizing behaviors and earnings nonparametrically and show that the impact of externalizing on earnings is basically linear.22 As a last robustness check, we re-do the analysis in Table 4 allowing for a quadratic term and interactions among the factors. We find no evidence that interactions or non-linearities are relevant.

18 Other popular methods include Bartlett factor scores and Anderson-Rubin scores. The regression scoring method is similar to how we will treat observed measures of misbehavior in our structural model. The difference is that the regression scoring method assigns a score for each behavior to each individual in a separate first step. In our structural estimation, the mapping between underlying factors and observed measurements will be jointly estimated with all other model parameters.

19 The scoring coefficients to the measures associated with each skill are discussed in Appendix A.

20 Similarly, in Agan (2011), crime is regressed on externalizing behavior and schooling, but the effect of externalizing behavior on schooling is not modeled. In that type of framework, it is difficult to ascertain the impact of policies affecting how externalizing behavior modifies the schooling decision.

21 The only difference is that standard errors are slightly larger when we sum up the observed measurements. This is not surprising as summing reported behaviors to capture underlying factors is not designed to eliminate measurement error of the underlying factors. In contrast, the regression scoring method explicitly treats reports and scores as mis-measurements of the underlying factors.

22 We also run the earnings regression separately for individuals with externalizing behaviors more than one standard deviation above the mean and for those below that level, in order to test for the possibility that our results are driven by the few individuals with externalizing behaviors in the tails. We do not find any difference in the returns to externalizing for the two groups.
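The regression scoring method used in this subsection (scoring coefficients obtained from the factor loadings and the inverse of the measurement correlation matrix, then used as weights on each individual's measurements) can be illustrated with a short sketch. The example is a minimal illustration in plain Python, assuming a single factor with three unit-variance measurements; the loadings, the sample size and the helper names (`solve`, `corr`, `scoring_coefficients`) are hypothetical and are not taken from the data or from Appendix A.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            w = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= w * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def corr(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def scoring_coefficients(R, loadings):
    """Regression-method weights: inverse correlation matrix times loadings."""
    return solve(R, loadings)

# Hypothetical one-factor example with three measurements (assumed values).
random.seed(0)
loadings = [0.8, 0.7, 0.6]
n = 4000
f = [random.gauss(0, 1) for _ in range(n)]          # simulated latent factor
# Each measurement has unit variance by construction.
Z = [[l * fi + random.gauss(0, (1 - l * l) ** 0.5) for l in loadings] for fi in f]

cols = list(zip(*Z))
R = [[corr(cols[a], cols[b]) for b in range(3)] for a in range(3)]
w = scoring_coefficients(R, loadings)                # scoring coefficients
scores = [sum(wk * zk for wk, zk in zip(w, row)) for row in Z]
```

In this simulated setting the measurement with the largest loading and smallest error variance receives the largest weight, and the resulting score tracks the simulated factor closely but not perfectly, which is the sense in which scores remain mis-measurements of the underlying factor.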


4.2 Shortcomings of Preliminary Analysis

In summary, our preliminary results suggest that misbehavior is generally bad for schooling, but not necessarily bad for labor market outcomes. Moreover, robustness checks detailed in Appendix B confirm that results are not driven by selection into schooling, by individuals with either very high or very low levels of externalizing behavior or by the manner in which we construct the latent factors. However, an important shortcoming of our preliminary analysis is that it is not well-equipped for an analysis of the different mechanisms through which externalizing and internalizing behaviors affect schooling decisions and adult outcomes, including labor market outcomes for men and women.

To illustrate this point, one obvious explanation for the link between externalizing behavior and earnings is occupational sorting. To explore this avenue, we first run a multinomial probit regression linking non-cognitive skills to one of three occupational groups (unskilled, skilled and professional/managerial) controlling for gender, educational attainment, number of children by age 33 and partnership status at 33. As is shown in Panel [1] of Table 6, externalizing behavior does not appreciably affect occupational outcome. We further regress earnings on the non-cognitive skills, controlling in addition for experience and employment status at age 33, by occupational group. Panel [2] in the same table reports the result. There is no evidence for differences in returns to externalizing for different occupational groups and this finding holds for females and males separately. Therefore, for our sample, we conclude that once we control for education, occupational sorting based on externalizing behavior is not likely to drive the correlation between externalizing behavior, schooling and earnings.23 We therefore leave the occupation choice out of our main analysis. 
However, further preliminary analysis shows that non-cognitive skills do appear to affect partnership and fertility along with labor market measures, including accumulated work experience, employment, hours worked and wages. Therefore, we envision an empirical model that permits joint estimation of equations describing these decisions and outcomes. We now turn to the specification of the structural model.

5 Model and Inference

In this section, we specify and estimate a structural model of schooling decisions and adult outcomes, including partnership, fertility, labor supply, hours worked and wages. We start by describing the timing of different decisions and outcomes. This informs which variables are included as endogenous regressors in each outcome equation. Second, we specify the system of equations used to approximate decisions and outcomes in the empirical implementation of the model. We also show how correlation across equations is modeled as unobserved heterogeneity in the form of three latent factors capturing cognition, externalizing behavior and internalizing behavior. Third, we discuss how simulated maximum likelihood is used to jointly estimate the parameters in each equation, the distributions of latent factors and parameters that govern how latent factors affect observable childhood measurements like test scores and teacher reports of misbehavior. We end this section with a discussion of parameter estimates.

23 We do, however, find that self-employment is predicted by externalizing behavior and so agents in the model may choose no employment, self-employment or paid employment.

5.1 Conceptual Model and Timing

We model individual decisions as a five-stage problem. In Stage 1, agents learn their endowment of non-cognitive and cognitive skills. In Stage 2, agents choose schooling levels by comparing the pecuniary and non-pecuniary returns to each possible schooling level to the costs associated with attaining each level. These pecuniary returns include discounted earnings given optimal future employment and hours decisions, while the non-pecuniary returns include, for example, the impact of schooling on the likelihood of finding a partner or having children. After completing their education, in Stage 3, agents choose whether or not to have a partner and have children, again taking into account the impact of these choices on future utility. Stage 4 encompasses early years in the labor market, which lead to variation in accumulated human capital measured as months of on-the-job experience reported at age 33. At Stage 5, when agents are 33 years old, they make employment and hours decisions, conditional on the three latent skills and the wage offer they expect to get. The wage offer is itself modeled as a function of an individual’s previous work history, education, non-cognitive skills and cognition.24

There are three key differences between estimates of the structural model we present here and the preliminary estimates discussed in Section 4. First, estimates here are from an empirical model that explicitly accounts for selection into schooling levels. Second, the structural model accounts for other economic outcomes that are affected by the latent skills and that also have an impact on earnings such as partnership, fertility, work experience, employment, hours worked and wages. Third, we jointly estimate the mapping between BSAG scores and underlying factors, where scores are treated as mis-measurements of the factors.

24 For reasons we explained in Section 4, we do not model occupational sorting.


5.2 Implementation of the Model

There are three latent skills affecting education decisions and labor market outcomes: externalizing behavior, internalizing behavior and cognition. For each latent skill, we have a set of imperfect measurements that are used to identify the corresponding skill (see Table 1). We denote the $k$-th measurement of skill $j \in \{1, 2, 3\}$ for individual $i$ with gender $n \in \{0, 1\}$ as $m_{ijkn}$, where $n = 1$ denotes male and $n = 0$ denotes female. $m_{ijkn}$ is specified as:

$$m_{ijkn} = m_{jk} + \alpha_{jkn} f_{ij} + \varepsilon_{ijkn} \tag{1}$$

where $m_{jk}$ is the measurement mean for the whole sample and is not allowed to vary between genders, $f_{ij}$ is the value of the latent skill $j$ for individual $i$, $\alpha_{jkn}$ is the factor loading of the latent skill $j$ onto the $k$-th measurement of that skill and is allowed to vary by gender, and $\varepsilon_{ijkn}$ is an error term capturing mis-measurement. The latent factors $f_{ij}$ are drawn from normal distributions, the parameters of which can vary by gender $n$:

$$f_{ij} \sim N(\mu_{jn}, \sigma_{jn}), \quad n \in \{0, 1\} \tag{2}$$
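To make the measurement system in equations (1) and (2) concrete, the following minimal sketch simulates several noisy measurements of a single latent skill for one gender. All numerical values are hypothetical, the function names are illustrative, and the sketch assumes that the second argument of the normal distribution in equation (2) is a standard deviation and that measurement errors are normal.

```python
import random

def simulate_measurements(n, mean, alpha, mu, sigma, noise_sd, seed=0):
    """Draw K measurements of one latent skill for each of n individuals.

    mean, alpha, noise_sd : length-K lists (measurement means, factor
                            loadings, measurement-error standard deviations)
    mu, sigma             : mean and standard deviation of the latent skill
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        f = rng.gauss(mu, sigma)                    # latent skill, equation (2)
        data.append([m + a * f + rng.gauss(0, s)    # measurement, equation (1)
                     for m, a, s in zip(mean, alpha, noise_sd)])
    return data

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

alpha = [1.0, 0.6, 0.4]          # hypothetical loadings
sigma = 1.5                      # hypothetical s.d. of the latent skill
data = simulate_measurements(20000, [2.0, 1.0, 0.5], alpha, 0.0, sigma,
                             [1.0, 1.0, 1.0])
m1, m2, m3 = zip(*data)

# With independent errors, two measurements of the same skill co-vary
# only through the skill: cov(m1, m2) = alpha_1 * alpha_2 * sigma^2.
implied_cov = alpha[0] * alpha[1] * sigma ** 2
sample_cov = cov(m1, m2)
```

The comparison of `sample_cov` with `implied_cov` shows why several measurements per skill are informative about the loadings: under these assumptions the covariances between measurements depend only on the loadings and the variance of the latent skill.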

Further, the model assumes that $\mathrm{cov}(f_{ij}, \varepsilon_{ijkn}) = 0 \;\forall k$ (latent skills are independent of measurement errors), $\mathrm{cov}(f_{ij}, f_{ij'}) = 0$ for $j \neq j'$ (latent skills are independent random variables) and $\mathrm{cov}(m_{ijkn}, f_{ij'}) = 0$ for $j \neq j'$, $\forall k$ (latent skill $j$ does not affect the measurement of latent skill $j'$).25

25 Our results are robust to allowance for more flexible distributional assumptions on the measurement error for each latent factor. In particular, we have permitted mixed normal distributions with two components and obtain qualitatively similar results.

We approximate the decisions and outcomes for Stages 2-5 as a series of linear-in-parameters regressions, each of which includes the three latent factors. Recall from our discussion of the timing of the model that these factors are taken as given after Stage 1 and therefore included as explanatory variables. Moreover, certain outcomes that occur in earlier stages are included as regressors in choices and outcomes that occur in later stages. One example is schooling, which is determined at Stage 2, but is also included as an explanatory variable in equations explaining partnership and fertility along with experience, employment, hours and wages, all measured at age 33. To simplify exposition, in Table 7 we list which variables are included as regressors in each equation. Note that the model is implemented separately by gender, i.e. all parameters are allowed to differ across genders. For simplicity we suppress the gender subscript $n$ in the equations below.

It is important to understand how the model accounts for selection into schooling and employment groups. It is evident that individuals with higher expected wage offers are more likely to self-select into the employed group and that individuals with higher expected wages from a higher degree will self-select into higher education groups. Our model controls for self-selection through the three unobserved latent skills. The logic is that these latent skills capture the unobserved heterogeneity that jointly affects schooling and labor market outcomes, including having a partner, work experience, labor supply and earnings. The key identifying assumption is that once we have controlled for latent cognitive and non-cognitive skills, the remaining unobservables are uncorrelated across equations.

We approximate the schooling problem by a linear-in-parameters model with an alternative-specific Extreme Value Type I disturbance, so that the probability that agent $i$ chooses education level $s \in \{1, ..., 5\}$ (with $s = 0$ or no formal education as the baseline) is given by:

$$P_i(s) = \Lambda_S\left(X_{i,S}\beta_s + \sum_{j=1}^{3} \alpha_{js} f_{ij}\right) \tag{3}$$

where $\Lambda_S(\cdot) = \dfrac{\exp(\cdot)}{1 + \sum_{l=1}^{5} \exp\left(X_{i,S}\beta_l + \sum_{j=1}^{3} \alpha_{jl} f_{ij}\right)}$ and $X_{i,S}$ is the vector of observable characteristics that affect the schooling decision. $\beta_s$ is the vector of returns associated with $X_{i,S}$ for schooling level $s$, $\alpha_{js}$ is the return to latent skill $j$ for schooling level $s$, both of which are relative to the baseline outcome of no formal education. $X_{i,S}$ contains a number of variables that are excluded from other equations, including the number of children in the household at age 11, whether the mother studied beyond the minimum schooling age and whether the father studied beyond the minimum schooling age along with average class size and average preparation of students in the same class as the subject at age 11. These exclusion restrictions help us to separately identify the direct impact of unobserved skills on labor market outcomes from their indirect impact through their effect on schooling choices.
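The choice probabilities in equation (3) can be computed as in the sketch below. This is an illustrative implementation of a multinomial logit with the index of the baseline category (no formal education) normalized to zero; the function name `schooling_probs` and all coefficient values are hypothetical and do not correspond to the estimates in Table S4.

```python
import math

def schooling_probs(x, betas, alphas, f):
    """Choice probabilities for schooling levels 0..5 as in equation (3).

    x      : list of observables X_{i,S}
    betas  : betas[s-1] is the coefficient list for level s = 1..5
    alphas : alphas[s-1] is the list of three skill loadings for level s
    f      : list of the three latent skills f_{i1}, f_{i2}, f_{i3}
    The baseline level s = 0 has its index normalized to zero.
    """
    v = [sum(b * xi for b, xi in zip(bs, x)) + sum(a * fj for a, fj in zip(al, f))
         for bs, al in zip(betas, alphas)]
    vmax = max(0.0, max(v))                  # subtract the max to avoid overflow
    expv = [math.exp(-vmax)] + [math.exp(vi - vmax) for vi in v]
    total = sum(expv)
    return [e / total for e in expv]

# Hypothetical inputs: a constant as the only observable, and loadings in
# which the first skill lowers the index of every non-baseline level.
x = [1.0]
betas = [[0.2], [0.1], [0.0], [-0.1], [-0.3]]
alphas = [[-0.1, 0.0, 0.3], [-0.2, 0.0, 0.5], [-0.3, 0.0, 0.7],
          [-0.4, 0.0, 0.9], [-0.5, 0.0, 1.1]]
probs_low = schooling_probs(x, betas, alphas, [-1.0, 0.0, 0.0])
probs_high = schooling_probs(x, betas, alphas, [1.0, 0.0, 0.0])
```

With these made-up loadings, raising the first skill shifts probability mass toward the baseline category, which is the mechanical sense in which a negative loading lowers schooling attainment in this specification.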

The fertility decision, measured as the number of children, $nc_i$, is approximated by:

$$\log(E(nc_i)) = X_{i,NC}\beta_{NC} + \sum_{s=0}^{5} \gamma_{s,NC} 1_i[s] + \sum_{j=1}^{3} \alpha_{j,NC} f_{ij} \tag{4}$$

where $X_{i,NC}$ is a vector of individual characteristics and coefficients $\beta_{NC}$ map these to fertility. $\gamma_{s,NC}$ highlights that we include schooling dummy variables to capture how individuals with different levels of education $s$ might have distinct preferences over the number of children. $1_i[s]$ is an indicator taking the value 1 if individual $i$ has education level $s$ and is equal to 0 otherwise. Similarly, $\alpha_{j,NC}$ captures how preference over the number of children varies among individuals with different levels of latent skills. We assume fertility follows a Poisson distribution. For this reason, equation (4) models the log of the expected number of children.
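As an illustration of how equation (4) translates into a likelihood contribution, the sketch below evaluates the Poisson log-probability of an observed number of children given the linear index. The function name and every numerical value are hypothetical and chosen only for illustration.

```python
import math

def poisson_log_pmf(n_children, index):
    """Log-probability of n_children when log E(nc_i) equals index."""
    lam = math.exp(index)                       # expected number of children
    return n_children * index - lam - math.lgamma(n_children + 1)

# Hypothetical components of the index in equation (4).
x_beta = 0.3                       # X_{i,NC} beta_NC
gamma_s = -0.2                     # coefficient on the individual's schooling dummy
alpha = [0.05, -0.10, 0.02]        # loadings on the three latent skills
f = [0.5, -1.0, 0.0]               # latent skills of the individual
index = x_beta + gamma_s + sum(a * fj for a, fj in zip(alpha, f))

probs = [math.exp(poisson_log_pmf(n, index)) for n in range(60)]
mean_children = sum(n * p for n, p in enumerate(probs))
```

The implied mean number of children equals the exponential of the index, which is why the coefficients in a specification of this form are read as proportional shifts in expected fertility.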

The partnership decision is approximated by another linear-in-parameters model with an alternative-specific Extreme Value Type I disturbance:

$$P_i(pa = 1) = \Lambda\left(X_{i,PA}\beta_{PA} + \sum_{s=0}^{5} \gamma_{s,PA} 1_i[s] + \sum_{j=1}^{3} \alpha_{j,PA} f_{ij}\right) \tag{5}$$

where $\Lambda(\cdot) = \dfrac{\exp(\cdot)}{1 + \exp(\cdot)}$ and $X_{i,PA}$ are observables with coefficients $\beta_{PA}$. Again $\gamma_{s,PA}$ and $\alpha_{j,PA}$ have the usual interpretation.

Early labor market outcomes are captured by the number of months employed by age 33, denoted $exp_i$:

$$exp_i = X_{i,EXP}\beta_{EXP} + \sum_{s=0}^{5} \gamma_{s,EXP} 1_i[s] + \sum_{j=1}^{3} \alpha_{j,EXP} f_{ij} + \varepsilon_{i,EXP}. \tag{6}$$

Here, $X_{i,EXP}$ are observables that affect the number of months employed and include earlier outcomes such as having a partner and fertility. $\varepsilon_{i,EXP}$ is a normally distributed idiosyncratic disturbance.

At age 33, individuals choose whether or not to be employed, receive a wage offer and then decide on the number of working hours. The employment decision, $E_i \in \{0, 1, 2\}$, is such that $E_i = 1$ means employed and $E_i = 2$ means self-employed. It is approximated by a linear-in-parameters specification with each alternative subject to an Extreme Value Type I disturbance:

$$P_i(e) = \Lambda_E\left(X_{i,E}\beta_e + \sum_{s=0}^{5} \gamma_{se} 1_i[s] + \sum_{j=1}^{3} \alpha_{je} f_{ij}\right) \tag{7}$$

where $\Lambda_E(\cdot) = \dfrac{\exp(\cdot)}{1 + \sum_{l=1}^{2} \exp\left(X_{i,E}\beta_l + \sum_{s=0}^{5} \gamma_{sl} 1_i[s] + \sum_{j=1}^{3} \alpha_{jl} f_{ij}\right)}$ and $X_{i,E}$ also includes partnership and fertility. Log-hourly-wages for individual $i$, denoted $y_i$, are modeled with a linear specification and a normally distributed disturbance:

$$y_i = X_{i,Y}\beta_Y + \sum_{s=0}^{5} \gamma_{s,Y} 1_i[s] + \sum_{j=1}^{3} \alpha_{j,Y} f_{ij} + \varepsilon_{i,Y}. \tag{8}$$

Here, $X_{i,Y}$ is a vector of observables that include partnership, fertility and months of experience. Finally, the logged weekly working hours decision is modeled in a similar fashion as:

$$h_i = X_{i,H}\beta_H + \sum_{s=0}^{5} \gamma_{s,H} 1_i[s] + \sum_{j=1}^{3} \alpha_{j,H} f_{ij} + \varepsilon_{i,H}. \tag{9}$$

where $\beta_H$ captures how partnership, fertility and experience (included in the vector of observables $X_{i,H}$) might affect the number of hours worked in a usual week. We summarize the parameters to be estimated into a vector denoted $\Phi$:

$$\Phi = (\beta, \gamma, \alpha, \Xi) \tag{10}$$

where $\beta$ denotes the set of coefficients of the vectors of observables absent the schooling level in equations (3)-(9), $\gamma$ is the set of coefficients governing returns to schooling, $\alpha$ is the set of returns to latent skills and $\Xi$ are coefficients of the measurement system described in equations (1) and (2). Now, we explain how $\Phi$ is estimated via simulated maximum likelihood.

5.3 Estimation

We estimate the model parameters described in the previous section via simulated maximum likelihood. There are three main steps to the estimation procedure. First, at each set of parameter value suggestions, indexed by $g$ and denoted $\Phi^{(g)}$, and for each individual $i$, we simulate cognition and non-cognitive skills, the probability of schooling, probability of employment, probability of having a partner, number of children, experience, hours and wages $K$ times, where $K$ represents the number of draws of unobservables for each individual. Second, we compute each individual’s average likelihood contribution, where the average is taken over the $K$ draws. Third, we take the log of each individual’s average likelihood contribution and sum over individuals, which yields the value of the simulated log likelihood function, the negative of which is then minimized. The model is estimated separately for males and females.

The simulation procedure begins as follows. We draw a block matrix of size $K \times I \times J$ from a standard normal distribution. Here $J$ is the number of latent skills, i.e. 3, $I$ is the number of individuals and $K$ is the number of draws per individual. Next, at each parameter suggestion $\Phi^{(g)}$ and for each individual $i$ and draw $k$, we simulate vectors of latent skills $f_{ijk}^{(g)}$, $j \in \{1, ..., J\}$. Given our assumptions on the shocks and using the parametric specification described in Section 5.2, we compute the density functions corresponding to each decision or outcome. That is, we compute the probabilities of individual $i$ choosing schooling level $s$, $P_{ik}^{(g)}(s)$, of choosing employment type $e$, $P_{ik}^{(g)}(e)$, and of having a partner, $P_{ik}^{(g)}(pa)$, given parameter suggestion $(g)$ and draw $k$. Similarly, we compute the probability of observing wages $y_{ik}^{(g)}$ as $f^{Y}(y_{ik}^{(g)})$, hours worked $h_{ik}^{(g)}$ as $f^{H}(h_{ik}^{(g)})$, number of children $nc_{ik}^{(g)}$ as $f^{NC}(nc_{ik}^{(g)})$ and experience level $exp_{ik}^{(g)}$ as $f^{EXP}(exp_{ik}^{(g)})$ for individual $i$, parameter suggestion $(g)$ and draw $k$. Last, we compute $f^{M}(m_{ik}^{(g)})$, the density function for all the classroom behavior measurements for individual $i$, draw $k$ and parameter suggestion $(g)$. The corresponding likelihood function for individual $i$ and parameter suggestion $(g)$ is:

$$L_i^{(g)} = \frac{1}{K} \sum_{k=1}^{K} \left[ f^{M}\left(m_{ik}^{(g)}\right) \times f^{NC}\left(nc_{ik}^{(g)}\right) \times f^{EXP}\left(exp_{ik}^{(g)}\right) \times f^{H}\left(h_{ik}^{(g)}\right)^{1(e_i^* = 1)} \times f^{Y}\left(y_{ik}^{(g)}\right)^{1(e_i^* = 1)} \times \prod_{s=0}^{5} P_{ik}^{(g)}(s)^{1[s = s_i^*]} \times \prod_{pa=0}^{1} P_{ik}^{(g)}(pa)^{1[pa = pa_i^*]} \times \prod_{e=0}^{2} P_{ik}^{(g)}(e)^{1(e = e_i^*)} \right] \tag{11}$$

where $s_i^*$ represents the observed schooling choice, $e_i^*$ the observed employment state and $pa_i^*$ the observed marital status by age 33 of individual $i$ in the data. Once we have constructed $L_i^{(g)}$ for each individual $i$, we take the log of each individual’s contribution and sum over individuals to obtain the log-likelihood:

$$l^{(g)} = \sum_{i=1}^{I} \log\left(L_i^{(g)}\right) \tag{12}$$

Using both simplex and gradient methods, we evaluate $l^{(g)}$ at different values in the parameter space, indexing these suggestions by $(g)$, and continue until a maximum is found.
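The structure of equations (11) and (12) can be illustrated with a deliberately small toy model. The sketch below is not the estimation code for the full system: it assumes a single latent factor, one continuous measurement and one continuous outcome per individual, normal errors, and hypothetical parameter values. It only shows the three steps of the procedure: simulate the factor K times per individual, average the likelihood contribution over draws, then take logs and sum over individuals.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def simulated_loglik(params, data, draws):
    """Simulated log-likelihood for a toy one-factor model.

    params : (alpha_m, alpha_y, sd_m, sd_y), a hypothetical parameter vector
    data   : list of (m_i, y_i) pairs, one measurement and one outcome
    draws  : draws[i][k] is the k-th standard normal draw for individual i
    """
    alpha_m, alpha_y, sd_m, sd_y = params
    total = 0.0
    for (m, y), z in zip(data, draws):
        # Average the likelihood contribution over the K draws of the factor.
        L_i = sum(normal_pdf(m, alpha_m * f, sd_m) * normal_pdf(y, alpha_y * f, sd_y)
                  for f in z) / len(z)
        total += math.log(L_i)        # log of the average, summed over individuals
    return total

rng = random.Random(1)
I, K = 300, 200
true_params = (1.0, 0.5, 1.0, 1.0)   # assumed values used to simulate the data
data = []
for _ in range(I):
    f = rng.gauss(0, 1)
    data.append((true_params[0] * f + rng.gauss(0, 1),
                 true_params[1] * f + rng.gauss(0, 1)))

# The draws are generated once and reused at every parameter suggestion.
draws = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(I)]

ll_true = simulated_loglik(true_params, data, draws)
ll_null = simulated_loglik((0.0, 0.0, 1.0, 1.0), data, draws)
```

Holding the simulation draws fixed across parameter suggestions keeps the simulated objective a deterministic function of the parameters, so that an optimizer comparing `ll_true` with `ll_null` or any other candidate is not chasing simulation noise.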

5.4 Parameter Estimates

We now discuss the parameter estimates along with the simulated gender-specific distributions of the three latent skills. It is important to bear in mind that, in the empirical implementation, later economic outcomes are modeled as functions of latent skills, schooling decisions and other earlier outcomes, which are themselves functions of latent skills. Therefore, the total or cumulative effect of skills on an outcome like partnership or wages includes the direct impact along with the indirect effects that run through earlier decisions and outcomes. Here, we discuss parameter estimates for each equation, which solely capture the direct impacts of skills on each choice or outcome. In Section 6, we use the estimated model to simulate the cumulative effects of latent skills on each economic outcome.

We plot the gender-specific distributions of the latent skills in Figure 2, with Panel (a) devoted to cognition, Panel (b) internalizing behaviors and Panel (c) externalizing behaviors. Whereas the distributions for cognition and internalizing are strikingly similar for males and females, the distribution of externalizing behavior differs by gender. In particular, the mean and variance are higher for boys than for girls. This means that boys, on average, score more highly on externalizing and that there is more heterogeneity among boys than girls.26

26 Estimates of the measurement system for latent skills are found in Table S3 in Appendix A. Corrected for measurement errors, the results on factor loadings highlight two key points. First, inconsequential behavior and hostility towards adults are the measurements that load most heavily onto externalizing behavior, whereas unforthcomingness and depression have the heaviest loadings on internalizing behavior. This reinforces the view that childhood misbehavior captures two very different non-cognitive skills. Second, “Writing Off of Adults and Adult Standards” figures strongly in both non-cognitive skills, suggesting that it is a behavioral tendency common to both externalizing and internalizing schoolchildren.

Next, we ask how these latent skills affect schooling and labor market outcomes. Parameter estimates from each of the equations in the structural model are found in Tables S4-S10 in Appendix A. Beginning with schooling (Table S4), both internalizing and externalizing behavior lower schooling attainment, which is consistent with the preliminary results discussed in Section 4. For males, externalizing behavior has a stronger negative effect. For females both behaviors play an important role. Note that externalizing females have an especially difficult time finishing a higher degree. Cognition, mothers’ education and fathers’ education all raise schooling, whereas the number of children in the household at age 11 lowers it.

The direct impacts of the latent skills and schooling on partnership and fertility by age 33 are presented in Tables S5 and S6. For males, externalizing behavior has a positive, albeit sometimes insignificant, direct effect on partnership and fertility, while internalizing behavior has a stronger and negative effect on the two outcomes. On the other hand, educational attainment strongly increases the chance of partnership but has a small negative impact on fertility. These results suggest that the latent skills affect the partnership and fertility outcome for males mostly through the schooling decision and internalizing behavior. In contrast, for females, both externalizing and internalizing behaviors directly lower the probability of partnership considerably, while schooling does not seem to matter. As for fertility, latent skills have negligible direct effects, while schooling strongly reduces the number of children by age 33. As we will show in the following section, both the direct and indirect effects of latent skills have far-reaching implications for the cumulative effects of latent skills on female labor market outcomes.

Next, consider months of work experience by age 33 (Table S7). For males, high externalizing and internalizing behaviors have moderately negative effects on experience, whereas for females, higher internalizing and low cognition directly lower work experience. For both men and women, the impact of schooling on months of employment is not monotonic, which reflects how lower schooling levels permit early entry into the labor market. Partnership in general predicts more experience. For women, in addition, fertility predicts much lower experience: each child subtracts nearly 2 years of experience for a woman conditioning on schooling and latent skills. This is not surprising given that the cohort was born in 1958 and so had children in an era when women were primarily responsible for childcare.

Finally, we examine the impact of latent skills and earlier life-cycle outcomes on labor market outcomes at age 33 such as employment status, hours and wages. Starting from employment at age 33 (Table S8), we find that externalizing males and internalizing females are less likely to be employed. Interestingly, externalizing females are more likely to be employed. These direct effects from the non-cognitive skills, though modest, are bigger than the positive direct effect from cognition. The impact of schooling is mixed, with mid-level educational attainment predicting higher employment than the lowest and highest levels. Across the board, partnership predicts higher (and children predict lower) employment.

In terms of hourly wages (Table S9), educational attainment has a large and positive impact. Cognition by itself increases wages for both genders by a small amount and internalizing behavior carries a moderate penalty for male workers.27 Turning to weekly hours (Table S10), we find that externalizing behavior has a significant and sizable direct impact on hours for both male and female workers, and especially for females. The only other significant predictor of hours is fertility for female workers, which is negative.

In summary, parameter estimates show that, for male workers, externalizing behavior lowers educational attainment, thereby leading to lower wages. At the same time, it raises hours worked conditional on schooling outcomes. Therefore, the total effect of externalizing behavior on earnings depends on the relative strengths of these countervailing dynamics. For female workers, externalizing behavior implies a further earnings penalty through increased fertility, which also lowers hours worked. In the next section, we use the estimated model to quantify the relative magnitudes of these cumulative effects.
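The distinction between direct and cumulative effects can be illustrated with a stylized two-equation linear system, in which a skill affects an outcome directly and also through schooling. The sketch is a simplification with a single continuous schooling variable; the function name and the coefficients are hypothetical and are not estimates from Tables S4-S10.

```python
def effects(direct, skill_on_schooling, schooling_on_outcome):
    """Direct, indirect and total effects in a two-equation linear system.

    schooling = skill_on_schooling * f
    outcome   = direct * f + schooling_on_outcome * schooling
    """
    indirect = skill_on_schooling * schooling_on_outcome
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}

# Hypothetical coefficients chosen only to show countervailing channels:
# the skill raises the outcome directly but lowers schooling, and
# schooling raises the outcome.
e = effects(direct=0.05, skill_on_schooling=-0.2, schooling_on_outcome=0.1)
```

In this made-up example the total effect stays positive but is smaller than the direct effect, because the indirect channel through schooling works in the opposite direction. The full model has several such channels and nonlinear equations, which is why the cumulative effects are obtained by simulation rather than by multiplying coefficients.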

6 Cumulative Effects of Externalizing Behavior

In the previous section, we reported estimated coefficients for each equation. These coefficients capture the direct effects of each skill, e.g., the direct impact of externalizing behavior on fertility. In this section, we are interested in the cumulative effects of externalizing behavior on economic decisions and outcomes, including earnings. Specifically, we are interested in the relative magnitudes of the various pathways through which externalizing behavior affects individuals. For example, externalizing behavior affects schooling and wages directly and schooling has an additional impact on wages. Hence, the total effect of externalizing behavior on wages includes the direct effect along with the impact that works through the ef27

Figures S1 through S2 present evidence on the goodness-of-fit for log weekly wages and educational attainment for both males and females. The model does well in fitting the wage distribution for both genders, though it performs slightly worse for males. The model mean earnings prediction is higher than the mean earnings in the data, but just slightly so. The reason is that the model under-predicts the right tail, so it compensates with a higher mean value. This problem is less severe for females, where the density on the right tail is smaller.

21

fect of externalizing on schooling. Understanding the relative magnitudes of these pathways is necessary to properly assess policies designed around non-cognitive skills. Finally, note that we focus on externalizing behavior in this section since estimated coefficients suggest that this skill illustrates how non-cognitive skills can have mixed effects.28 To understand the cumulative effect, we simulate average decisions and outcomes for a sample of individuals, varying the level of externalizing behavior. We endow this simulation sample with the same exogenous variables as those observed in the estimation sample, holding the internalizing and cognition skills fixed at the sample mean. To compute relative magnitudes of the various direct and indirect effects of externalizing behavior, we conduct a series of decompositions. These decompositions amount to a series of counterfactual policy simulations where different subsets of the pathways through which externalizing behavior affects outcomes are effectively shut down. For example, we might hold everything but schooling fixed to compute the magnitude of the effect of externalizing behavior that works through its impact on educational attainment. Finally, we note that all simulations and decompositions are done for males and females separately. Figure 3 illustrates the decomposition of the total effect of externalizing on earnings into its direct effect, its effect from schooling, its effect from fertility and its effect from everything else (e.g. partnership, experience and employment). The left-hand-side y-axis shows weekly earnings in levels and the right-hand-side y-axis shows percent changes in earnings. First, the direct effect on earnings for both males and females is positive. 
Next, we add the effect of schooling, which means that we allow varying levels of externalizing behavior to affect schooling decisions and allow schooling decisions to affect all other outcomes, including fertility, partnership, labor supply, and so on. Notice that adding the schooling effect attenuates the positive impact of externalizing on earnings. However, the relationship is still positive. This result is in line with our reduced form estimates, which suggest that externalizing behavior increases earnings even when we control for schooling. Next, we permit externalizing behavior to affect fertility. This does not affect the relationship for males, but does further attenuate the impact of externalizing on female earnings. High-externalizing females tend to have more children and each child lowers earnings. Finally, we plot the cumulative effect, which includes the partnership decision, labor supply and experience. Again, the relationship between earnings and externalizing behavior is largely unchanged. In summary, high-externalizing males and females earn more, despite the negative impact on schooling and fertility (for females).

To quantify these direct and indirect effects, we calculate the percentage change in the average labor market outcomes for individuals in our sample from when they are high-externalizing (75th percentile of the sample) to when they are low-externalizing (25th percentile of the sample). The results for males and females are found in the top and bottom panel of Table 8 respectively. The first lines in the two panels give the total percentage difference in the likelihood of being employed and in the average hours, wages and earnings between a sample of high- and low-externalizing individuals. These are the cumulative effects of externalizing on earnings where all channels are operative. High-externalizing males are 3.1% less likely to be employed, work 2.8% more hours and earn 0.6% higher wages, while high-externalizing females are 4.7% more likely to be employed, work 6.9% more hours and earn 3.7% lower wages. For both genders, the positive cumulative effects mask even bigger direct effects from externalizing on earnings. For men, the direct effect is 5.2% (versus a cumulative effect of 3.5%). For women, the direct effect is as high as 9.3% (versus a cumulative effect of 2.9%). Allowing externalizing to affect schooling outcomes substantially attenuates the direct impact of externalizing on earnings: The percentage difference in earnings between high- and low-externalizing males drops to 3.4% and that between high- and low-externalizing females falls to 6.8%. For females, activating the fertility channel further decreases the percentage difference in earnings to 4%. These results are in line with Figure 3.

Comparing the decomposition results across genders, two main differences stand out. First, while adding fertility and partnership to the schooling channel only increases the earnings premium of externalizing slightly from 3.4% to 3.6% for men, it substantially reduces the earnings premium from 6.8% to 3.6% for women.

28 It would be straightforward to use the model to assess the cumulative impacts of all latent skills.
To understand the direction of the changes, in Figure 4, we plot the cumulative effect from externalizing on partnership and fertility for males and females. Note that while externalizing raises the probability of partnership for men, it lowers the probability of partnership for women. Recall from Tables S7 to S10 that partnership predicts more experience, higher likelihood of employment, higher wages and longer hours for both genders. Hence, activating the partnership channel increases the earnings premium of externalizing for men and decreases it for women. Even though the cumulative effect of externalizing on fertility is positive for both genders, higher fertility substantially lowers employment, wages and hours for women while having virtually no effect on labor market outcomes for men (Figure 5). Therefore, activating the fertility channel has little effect on the earnings premium for men but has a significant negative effect for women.

The second main difference lies in the way selection into employment works. Note that when we add the effect of externalizing through selection into employment, average earnings (which are conditional on employment) rise for men but drop for women. This reflects the change in the composition of employed workers. Specifically, high externalizing men are less likely to work so increasing externalizing drives some relatively low earners out of employment, thereby raising average earnings conditional on employment. The opposite is true for women. High-externalizing women are more likely to work (see parameter estimates in Table S8). As such, the sample of high externalizing women includes relatively low earners who select into employment, driving down average earnings. However, in neither case does this compositional change drive our main result. Even in the absence of it, high-externalizing men earn 3% more than low-externalizing men and high-externalizing females earn 3.3% more than their low externalizing counterparts.
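The sequential decomposition described above can be illustrated with a small sketch. It is not the estimated model: the three linear equations, every coefficient in `COEF`, the standard normal distribution of externalizing scores, and the helper names `simulate_earnings` and `premium` are assumptions made here for illustration. The sketch switches channels on one at a time and reports the percentage earnings difference between the 75th and 25th percentiles of externalizing, holding any inactive channel at a reference level.

```python
import math
import random
import statistics

# Hypothetical coefficients, chosen only for illustration.
# None of them are estimates from the paper.
COEF = {
    "school_const": 2.0, "school_ext": -0.30,                    # schooling equation
    "fert_const": 1.5, "fert_ext": 0.20, "fert_school": -0.10,   # fertility equation
    "earn_const": 5.0, "earn_ext": 0.08,                         # log weekly earnings
    "earn_school": 0.10, "earn_fert": -0.05,
}


def simulate_earnings(ext, active, ext_ref):
    """Weekly earnings at externalizing level `ext` with only `active` channels on.

    A channel that is shut down is evaluated at the reference level `ext_ref`,
    so externalizing cannot move the outcome through that channel.
    """
    ext_school = ext if "schooling" in active else ext_ref
    schooling = COEF["school_const"] + COEF["school_ext"] * ext_school
    ext_fert = ext if "fertility" in active else ext_ref
    # Schooling feeds into fertility, so opening the schooling channel
    # also lets schooling affect later outcomes.
    fertility = (COEF["fert_const"] + COEF["fert_ext"] * ext_fert
                 + COEF["fert_school"] * schooling)
    ext_direct = ext if "direct" in active else ext_ref
    log_earnings = (COEF["earn_const"] + COEF["earn_ext"] * ext_direct
                    + COEF["earn_school"] * schooling
                    + COEF["earn_fert"] * fertility)
    return math.exp(log_earnings)


def premium(active, ext_low, ext_high, ext_ref):
    """Percent difference in earnings, high- versus low-externalizing."""
    high = simulate_earnings(ext_high, active, ext_ref)
    low = simulate_earnings(ext_low, active, ext_ref)
    return 100.0 * (high / low - 1.0)


# Simulated externalizing scores (the standard normal shape is an assumption).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
q25, _median, q75 = statistics.quantiles(sample, n=4)
ext_ref = statistics.mean(sample)

steps = [
    ("direct", {"direct"}),
    ("direct + schooling", {"direct", "schooling"}),
    ("direct + schooling + fertility", {"direct", "schooling", "fertility"}),
]
decomposition = {label: premium(active, q25, q75, ext_ref) for label, active in steps}

for label, value in decomposition.items():
    print(f"{label}: {value:.2f}% earnings premium")
```

With these assumed signs, each added channel shrinks the premium without reversing it, which mirrors the qualitative pattern reported for females. The magnitudes carry no empirical content.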

7 Policies Affecting Childhood Misbehavior

In this section, we discuss the effects from two kinds of policies on schooling and earnings, the Behavioral Policy and the Schooling Policy. Conceptually, assessing these two policies amounts to addressing a longstanding debate in the pedagogical literature on teaching methods to address childhood misbehavior. The Behavioral Policy is designed to capture authoritative strategies that rely on control-based and punitive interventions to suppress externalizing behavior.29 In contrast, the Schooling Policy is designed to assess outcomes under a counterfactual regime where externalizing behavior is not penalized at school. This policy is related to an early idea developed by Kounin (1970) that effective classroom management should not be reactive but proactive in the sense that it should be established through engaging students in well-prepared and well-run activities so as to prevent disruptive behaviors from occurring. More recently, schooling policies reflect pedagogical movements that advocate warm and supportive student-teacher relationships especially for students with behavioral problems. The aim is to maintain students’ interest in their academic and social pursuits, which should lead to better grades and more positive peer relationships (Hamre and Pianta, 2006; Allen et al., 2011).

Before using the estimated model to assess the Behavioral Policy versus the Schooling Policy, we discuss theoretical predictions. To do so, we formalize the conceptual model discussed in Section 5.1. Suppose an agent makes the schooling decision $s$, taking as given his cognition $c$ and externalizing behavior $e$. To save on notation, we suppress variables describing internalizing behavior.30 The benefit from schooling is modeled as the sum of a pecuniary benefit $Y(s, c, e)$ and a non-pecuniary benefit $B(s, c, e)$. The pecuniary benefit is the expected present value of lifetime earnings upon finishing school. It assumes optimal employment and labor supply over the life-cycle. The non-pecuniary benefit captures the value of partnership and the utility or return from having children. It likewise assumes optimal future decisions regarding partnership and fertility. Both pecuniary and non-pecuniary benefits depend on the level of schooling, cognition and externalizing behavior. Assume $Y(s, c, e)$ is increasing in $s$ and both $Y(s, c, e)$ and $B(s, c, e)$ are strictly concave in $s$.31 The cost of schooling, $C(s, c, e)$, satisfies standard assumptions: it is strictly increasing and convex in $s$ and satisfies the Inada conditions. Note that the cost includes the monetary and psychological costs of obtaining $s$ in school, given skills. Any potential “penalty” on schooling that occurs on the labor market is absorbed in the non-pecuniary benefit of schooling. This conceptual distinction is important for our policy experiments. We can write down this agent’s schooling problem as follows:

\[
\max_{s \geq 0} \; V(s, c, e) = Y(s, c, e) + B(s, c, e) - C(s, c, e).
\]

The first order condition for the optimum states:

\[
\frac{\partial Y(s^*, c, e)}{\partial s} + \frac{\partial B(s^*, c, e)}{\partial s} = \frac{\partial C(s^*, c, e)}{\partial s}. \tag{13}
\]

The optimal schooling choice is therefore defined implicitly by the FOC above as a function of $c$ and $e$. In this model, the Behavioral Policy is equivalent to reducing the “endowment” of an agent from $(c, e)$ to $(c, \underline{e})$, for some minimum level $\underline{e}$. Let the schooling and earnings outcome for an agent who is endowed with $(c, e)$ under the Behavioral Policy be denoted $\tilde{s}(c, e)$ and $\tilde{y}(c, e) \equiv Y(\tilde{s}, c, \underline{e})$, respectively. On the other hand, the Schooling Policy amounts to defining the cost of schooling as $\hat{C}(s, c) = C(s, c, \underline{e})$. Let the schooling and earnings outcome for agent $(c, e)$ under the Schooling Policy be denoted $\hat{s}(c, e)$ and $\hat{y}(c, e)$. Finally, suppose that earnings $Y(s, c, e)$ are increasing in $c$ as well as $e$, which is supported by our empirical results on the direct effects of cognition and externalizing on earnings.

29 We should point out that these types of strategies are rarely recommended by education experts. However, empirical research shows that they are widely practiced (Jack et al., 1996). Recall the adage, “Do not smile until Christmas”.

30 The fact that cognition and externalizing both raise earnings, but that cognition lowers the cost of school while externalizing raises it leads to the possibility of selection on cognition. The idea is that if externalizing increases the marginal cost of schooling and cognition lowers it, then for a given level of schooling, higher-externalizing individuals will have relied on higher cognition to compensate for the higher marginal cost of schooling due to externalizing. Therefore, if a manager faces two job candidates with the same schooling attainment, then the one who struggled through school due to higher externalizing is expected to be smarter. In results available in Appendix C, we develop this point both theoretically and empirically. Empirically, we show that higher externalizing individuals at a given schooling level have higher cognition.

31 We allow the possibility that $B(s, c, e)$ is negative (in which case it becomes effectively a cost) or decreases in $s$, since we do not have a strong prior about it and key results are not contingent upon it.
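As a purely illustrative special case, the schooling problem can be solved in closed form under simple functional forms. The forms below are assumptions made here and are not the specification estimated in the paper: $Y(s,c,e) = \theta_c c + \theta_e e + r\ln(1+s)$, $B(s,c,e) = b\ln(1+s)$ and $C(s,c,e) = \tfrac{1}{2}\kappa(c,e)s^2$, where $\kappa(c,e) = \delta + \phi e - \psi c > 0$, all of $r, b, \theta_c, \theta_e, \phi, \psi$ are positive, and $\underline{e}$ denotes the minimum level of externalizing. Under these assumptions the first order condition and the two policy counterfactuals reduce to:

```latex
\begin{align*}
\frac{r+b}{1+s^*} &= \kappa(c,e)\, s^*
  \quad\Longrightarrow\quad
  s^*(c,e) = \frac{-1 + \sqrt{1 + 4(r+b)/\kappa(c,e)}}{2}, \\[4pt]
\hat{s}(c,e) &= \frac{-1 + \sqrt{1 + 4(r+b)/\kappa(c,\underline{e})}}{2},
  \qquad
  \tilde{s}(c,e) = s^*(c,\underline{e}) = \hat{s}(c,e), \\[4pt]
\hat{y}(c,e) - \tilde{y}(c,e)
  &= Y(\hat{s},c,e) - Y(\hat{s},c,\underline{e})
   = \theta_e\,(e - \underline{e}) \;\geq\; 0 .
\end{align*}
```

Because $\kappa$ is increasing in $e$ when $\phi > 0$, schooling $s^*$ is decreasing in $e$, so $\hat{s}(c,e) \geq s^*(c,e)$. In this special case the marginal benefit of schooling does not depend on $e$, which is why both policies deliver the same schooling level and differ only through the direct earnings term $\theta_e (e - \underline{e})$.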


We first establish that the Schooling Policy increases educational attainment as well as earnings relative to those in the baseline as long as the marginal cost of schooling is increasing in externalizing behavior (Proposition 1). Next we compare the outcomes from the Schooling Policy to those from the Behavioral Policy (Proposition 2). We establish conditions under which the Schooling Policy delivers higher schooling and earnings relative to those under the Behavioral Policy. We will show later that this is indeed the relevant case based on the policy simulations from our empirical model.

Proposition 1. Suppose $\frac{\partial^2 C}{\partial s \partial e} > 0$. Let the schooling (or earnings) outcome in the baseline be denoted $s^*(c, e)$ (or $y^*(c, e)$). Let the schooling (or earnings) outcome under the Schooling Policy be denoted $\hat{s}(c, e)$ (or $\hat{y}(c, e)$). Then we have,

\[
\hat{s}(c, e) \geq s^*(c, e) \text{ for all } (c, e), \quad \text{and} \quad \hat{y}(c, e) \geq y^*(c, e) \text{ for all } (c, e),
\]

where the equalities are obtained if and only if $e = \underline{e}$.

Proof. Consider the following slightly modified first order condition of the baseline model:

\[
\frac{\partial Y(s, c, e)}{\partial s} + \frac{\partial B(s, c, e)}{\partial s} = \frac{\partial C(s, c, \hat{e})}{\partial s}.
\]

Let the implied optimal schooling be $s(c, e, \hat{e})$. Take derivative of $s$ with respect to $\hat{e}$:

\[
\frac{\partial s(c, e, \hat{e})}{\partial \hat{e}} = - \frac{\partial^2 C / \partial s \partial e}{\partial^2 C / \partial s^2} < 0, \tag{14}
\]

under the stated assumptions. Note that under the Schooling Policy $\hat{s}(c, e) = s(c, e, \underline{e})$ and in the baseline $s^*(c, e) = s(c, e, e)$. Then (14) implies that $\hat{s}(c, e) \geq s^*(c, e)$ with equality obtained when $e = \underline{e}$. Since $Y(s, c, e)$ is increasing in $s$, $\hat{y} = Y(\hat{s}, c, e) \geq y^* = Y(s^*, c, e)$.

We have thus shown that under reasonable assumptions on the cross partial of the cost function, the Schooling Policy promotes education attainment and improves earnings relative to the baseline. The following Proposition 2 compares the outcome of schooling and earnings under the Schooling Policy and the Behavioral Policy.

Proposition 2. Suppose $\frac{\partial^2 Y}{\partial s \partial e} + \frac{\partial^2 B}{\partial s \partial e} \geq 0$. Let the schooling (or earnings) outcome under the Behavioral Policy be denoted $\tilde{s}(c, e)$ (or $\tilde{y}(c, e)$). Then we have,

\[
\hat{s}(c, e) \geq \tilde{s}(c, e) \text{ for all } (c, e), \quad \text{and} \quad \hat{y}(c, e) \geq \tilde{y}(c, e) \text{ for all } (c, e),
\]

where the equalities are obtained if and only if $e = \underline{e}$.

Proof. Consider the first order condition for the schooling choice under the Schooling Policy:

\[
\frac{\partial Y(\hat{s}, c, e)}{\partial s} + \frac{\partial B(\hat{s}, c, e)}{\partial s} = \frac{\partial \hat{C}(\hat{s}, c)}{\partial s}.
\]

Differentiating both sides of the above with respect to $e$ and rearranging:

\[
\frac{\partial \hat{s}(c, e)}{\partial e} = - \frac{\dfrac{\partial^2 Y}{\partial s \partial e} + \dfrac{\partial^2 B}{\partial s \partial e}}{\dfrac{\partial^2 Y}{\partial s^2} + \dfrac{\partial^2 B}{\partial s^2}} \geq 0, \tag{15}
\]

under the stated assumptions. Under the Behavioral Policy, $\tilde{s}(c, e) = \hat{s}(c, \underline{e}) \leq \hat{s}(c, e)$ for all $(c, e)$, due to (15). The equality holds only if $e = \underline{e}$. The result on earnings follows since earnings are increasing in both schooling and externalizing.

We implement the Behavioral and Schooling Policies using the estimated model in two counterfactual simulations. Specifically, we simulate the decisions and outcomes for the 2,889 males and 3,110 females in our sample. For each individual, we fix both internalizing behavior and cognition at the median level in the sample and take 1,000 independent draws of externalizing behavior. Next, we simulate the model of schooling, partnership, fertility, experience, wage and hours for the male and female samples separately. We report average outcomes on the first rows of the three panels in Table 9 for the male, the female and the combined samples separately. These are the outcomes under the “Baseline”. To conduct the “Schooling Policy” we shut down the influence of externalizing behavior on schooling choices and retain the direct effects from this skill on all labor market outcomes. Under the “Behavioral Policy” we decrease the externalizing of each individual to a level so as to replicate the schooling gain under the “Schooling Policy”.

For men, both policies raise educational attainment relative to the baseline. They increase the percentage of men sorting into the high-education group (an A-Level degree or higher) from a baseline level of 47.9% to 61.3%. However, the effects of the policies on earnings are quite different. While the Behavioral Policy reduces earnings from a baseline level of 428.22 pounds to 390.00 pounds, the Schooling Policy raises earnings to 449.20 pounds. Contrasting the outcomes under the Behavioral Policy with those under the Schooling Policy, we recognize two sources of gains from preserving externalizing behavior. Externalizing male workers work more hours (i.e. 44.75 under the Schooling Policy versus 41.59 under the Behavioral Policy) and they are more productive as reflected by their hourly wages (i.e. 10.05 under the Schooling Policy versus 9.39 under the Behavioral Policy).

For women, the story is slightly different. It is still true that both policies raise educational attainment and the Schooling Policy improves the earnings outcome relative to the baseline while the Behavioral Policy lowers earnings. But the improvement in earnings under the Schooling Policy for women comes exclusively from working longer hours. To be specific, the percentage of women with high education increases from a baseline level of 31.9% to 38.8% under either policy. While the Behavioral Policy reduces earnings from the baseline level of 179.44 to 167.88 pounds, the Schooling Policy raises earnings to 196.77 pounds. Under the Schooling Policy, female workers work longer hours (29.93 versus 24.32 under the Behavioral Policy) but at a lower wage rate (6.31 versus 6.64 under Behavioral Policy). This is a result of how the externalizing works its way through earlier life-cycle outcomes to affect wages and hours later on. Externalizing behavior tends to decrease partnership and increase fertility, significantly lowering early career experience for women. Therefore, under the Schooling Policy, where these effects are at play, female workers receive lower wages at age 33. On the other hand, externalizing behavior directly raises hours so strongly that it dominates the negative effects from lower partnership and increased fertility on hours, and the Schooling Policy achieves more hours so much so that it more than compensates for the decrease in wages.
In a similar experiment, we explore heterogeneity in the effects of the two policies. We vary the level of cognition for all individuals in our sample from 0 to 100 at 20 equal steps. For each level of cognition, we compute average simulated earnings and plot them against cognition in Figure 6 for males and females separately. This is the solid line under “Baseline”. To conduct the Behavioral Policy experiment, we reduce externalizing for all individuals by two standard deviations and simulate the model. Earnings under the Behavioral Policy are then given by the dotted line in the same figure. To conduct the Schooling Policy experiment, we assign to each individual the same schooling level achieved under the Behavioral Policy and retain the direct effects from skills on all labor market outcomes. Essentially, the Schooling Policy replicates the schooling outcome that the Behavioral Policy achieves without changing the underlying factors themselves. The resulting earnings under the Schooling Policy are plotted with a dashed line.

For men, the policies have similar impacts for all students regardless of their level of cognition. For women, the impacts of the policies vary by cognition. The Behavioral Policy has a strong negative effect on earnings for the less cognitively endowed women but has a negligible impact for those with stronger cognition. The effect of the Schooling Policy is the opposite. It strongly increases earnings for the highly cognitively endowed women but has a negligible effect for the less able women. These results follow from non-linearities in the impact of externalizing on schooling for women. High externalizing women face a high cost of obtaining a higher degree (Table S4). Moreover, women with a higher degree are especially productive in the labor market (Table S9). Once we remove the cost of schooling for the externalizing women, more women obtain a higher degree and face an increase in productivity in the labor market. For low-cognition women, removing the schooling penalty for externalizing does not result in much gain in educational attainment, and therefore leads to little improvement in earnings.
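The logic of the two counterfactuals can be sketched with a toy version of the schooling problem. Everything in the sketch is an assumption made for illustration: the functional forms (earnings `THETA_C*c + THETA_E*e + R_PEC*log(1+s)`, a quadratic schooling cost with slope `DELTA + PHI*e - PSI*c`), all parameter values, the uniform distribution of externalizing, and the reading of "20 equal steps" as a grid from 0 to 100 in steps of 5. Under these forms the first order condition has a closed-form solution, which `optimal_schooling` uses. The Schooling Policy removes externalizing from the cost of schooling only, while the Behavioral Policy lowers externalizing itself to `E_MIN`.

```python
import math
import random

# Hypothetical parameters chosen only for illustration; not estimates from the paper.
R_PEC = 1.0                       # pecuniary return to log(1 + s)
B_NONPEC = 0.2                    # non-pecuniary return to log(1 + s)
THETA_C = 0.5                     # direct effect of cognition on earnings
THETA_E = 0.3                     # direct effect of externalizing on earnings
DELTA, PHI, PSI = 1.0, 0.5, 0.5   # marginal cost slope: DELTA + PHI*e - PSI*c
E_MIN = 0.0                       # minimum externalizing level


def optimal_schooling(c, e_cost):
    """Schooling that solves (R_PEC + B_NONPEC) / (1 + s) = kappa * s."""
    kappa = DELTA + PHI * e_cost - PSI * c
    return (-1.0 + math.sqrt(1.0 + 4.0 * (R_PEC + B_NONPEC) / kappa)) / 2.0


def earnings(s, c, e):
    """Earnings (arbitrary units) given schooling, cognition and externalizing."""
    return THETA_C * c + THETA_E * e + R_PEC * math.log(1.0 + s)


def simulate(c, draws):
    """Average (schooling, earnings) under each regime for cognition level c."""
    totals = {name: [0.0, 0.0] for name in ("baseline", "behavioral", "schooling")}
    for e in draws:
        s_base = optimal_schooling(c, e)        # externalizing raises the cost
        s_policy = optimal_schooling(c, E_MIN)  # cost evaluated at E_MIN
        outcomes = {
            "baseline": (s_base, earnings(s_base, c, e)),
            # Behavioral Policy: externalizing itself is lowered to E_MIN.
            "behavioral": (s_policy, earnings(s_policy, c, E_MIN)),
            # Schooling Policy: same schooling, direct effect of e retained.
            "schooling": (s_policy, earnings(s_policy, c, e)),
        }
        for name, (s, y) in outcomes.items():
            totals[name][0] += s
            totals[name][1] += y
    n = len(draws)
    return {name: (s_sum / n, y_sum / n) for name, (s_sum, y_sum) in totals.items()}


rng = random.Random(0)
draws = [rng.uniform(E_MIN, 2.0) for _ in range(1000)]  # 1,000 draws of externalizing

# Cognition grid on a 0-100 scale, rescaled to [0, 1] inside the toy model.
results_by_cognition = {pct: simulate(pct / 100.0, draws) for pct in range(0, 101, 5)}

for name, (s_avg, y_avg) in results_by_cognition[50].items():
    print(f"{name}: schooling={s_avg:.3f}, earnings={y_avg:.3f}")
```

In this toy setting the Behavioral Policy matches the schooling level of the Schooling Policy by construction, so any earnings gap between the two comes from the direct return to externalizing that the Behavioral Policy removes.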

8 Conclusion

Non-cognitive skill consists of multiple dimensions and each of these dimensions can have mixed effects on an array of economically important outcomes. In this paper, we show evidence that externalizing behavior, capturing aggression or hostility, despite its detrimental impact on schooling, leads to higher earnings. In light of this finding, we consider two policies, one of which (the Behavioral Policy) aims to eliminate externalizing behavior. We show that an alternative policy (the Schooling Policy), designed to attenuate the negative impact of externalizing behaviors on educational attainment, leads to higher earnings for both men and women. This finding illustrates the danger of designing policies that ignore how non-cognitive skills have mixed effects.

We also emphasize heterogeneity in the impact of non-cognitive skills by gender. Further research could extend our analysis, exploring heterogeneity across socioeconomic groups. Such heterogeneity likely explains why our preliminary findings differ from some previous work suggesting that externalizing behavior can reduce earnings (Heckman, Pinto, and Savelyev, 2013). We briefly explore such differences by considering a sub-sample of our analysis sample that is selected to be similar to the sample of economically disadvantaged families considered in Heckman, Pinto, and Savelyev (2013).32 We then regress earnings on the measures of cognition and the two non-cognitive skills, for children with and without financial difficulties separately. For the former group, externalizing behavior has a negative though often insignificant impact. For the latter group, it has a significantly positive impact. Results from these regressions are presented in Table 10.33 A broader implication from these results is that the payoffs to different non-cognitive skills are context-dependent, as is argued in Lundberg (2012). Future research could explore how policy should be designed in light of this type of heterogeneity.34

Another important direction would be to consider other economically important outcomes that are driven by non-cognitive skills and that also affect earnings, such as criminal activity. For example, in a supplementary analysis found in Appendix D, we find a strong relationship between externalizing behavior and police involvement.35 Similar logic applies here: There is a potential role for policies that encourage educational attainment among high-externalizing individuals, which could lower criminal activity without eliminating the labor market benefits of externalizing behavior.

32 More specifically, we construct a sub-sample of children from households that self-reported having financial difficulties when the child was between 7 and 16 years old. To that end, we create a variable indicating whether an individual appeared to be living in a household experiencing poverty in 1965 or if a member of the household self-reported having financial difficulties in the 12 months prior to being observed in either 1969 or 1974. Summary statistics for this sub-sample are found in Appendix E.

33 When allowing for interactions between externalizing behavior and having financial difficulties, we find that the coefficient of the interaction term is significantly negative. These results are available in Appendix D.

34 In Appendix D, we also discuss an alternative potential source of differences in findings: negative effects of externalizing on earnings in the right tail of the distribution coupled with low-income groups occupying the right tail. We rule out this explanation for our sub-sample.

35 Externalizing behavior strongly predicts future contact with the police. However, we also find that police involvement does not appear to greatly affect an individual’s labor market prospects, at least not enough to overwhelm the benefit of externalizing behavior. That is, controlling for police involvement, externalizing behavior still has a positive average impact on earnings.

References

Achenbach, Thomas M., Stephanie H. McConaughy, and Catherine T. Howell. 1987. “Child/Adolescent Behavioral and Emotional Problems: Implications of Cross-Information Correlations for Situational Specificity.” Psychological Bulletin 101 (2):213–232. Agan, Amanda Y. 2011. “Non-Cognitive Skills and Crime.” Mimeo, Princeton University Department of Economics and the Industrial Relations Section. Aizer, Anna. 2008. “Neighborhood Violence and Urban Youth.” NBER Working Paper. Allen, Joseph P., Robert C. Pianta, Anne Gregory, Amori Yee Mikami, and Janetta Lun. 2011. “An Interaction-Based Approach to Enhancing Secondary School Instruction and Student Achievement.” Science 333. Almlund, Mathilde, Angela Lee Duckworth, James Heckman, and Tim Kautz. 2011. “Personality Psychology and Economics.” Handbook of the Economics of Education 4 (1). Barón, Juan D and Deborah A Cobb-Clark. 2010. “Are Young People’s Educational Outcomes Linked to their Sense of Control?” IZA Working Paper. Bertrand, Marianne and Jessica Pan. 2013. “The Trouble with Boys: Social Influences and the Gender Gap in Disruptive Behavior.” American Economic Journal: Applied Economics 5 (1):32–64.


Blanden, Jo, Paul Gregg, and Lindsey Macmillan. 2006. “Accounting for Intergenerational Income Persistence: Non-Cognitive Skills, Ability and Education.” CEE Discussion Papers, Centre for the Economics of Education, LSE No. 0073. Borghans, Lex, Angela Lee Duckworth, James J Heckman, and Bas Ter Weel. 2008. “The Economics and Psychology of Personality Traits.” Journal of Human Resources 43 (4):972–1059. Carneiro, P. and J. J. Heckman. 2003. “Human Capital Policy.” In Inequality in America: What Role for Human Capital Policies?, edited by J. J. Heckman, A. B. Krueger, and B. M. Friedman. MIT Press, pp. 77–240. Carneiro, Pedro, Claire Crawford, and Alissa Goodman. 2007. “The Impact of Early Cognitive and NonCognitive Skills on Later Outcomes.” CEE Discussion Papers, Centre for the Economics of Education, LSE No. 0092. Cattan, Sarah. 2011. “The Role of Workers’ Traits in Explaining the Early Career Gender Wage Gap.” Mimeo, University of Chicago. Cunha, F. and J. J. Heckman. 2008. “Formulating, Identifying and Estimating the Technology of Cognitive and Non-Cognitive Skill Formation.” Journal of Human Resources 43:738–782. Cunha, Flavio, James J Heckman, and Susanne M Schennach. 2010. “Estimating the Technology of Cognitive and Noncognitive skill formation.” Econometrica 78 (3):883–931. Currie, J. 2009. “Healthy, Wealthy, and Wise: Socioeconomic Status, Poor Health in Childhood, and Human Capital Development.” Journal of Economic Literature 47 (1):87–122. Currie, Janet. 2001. “Early Childhood Education Programs.” The Journal of Economic Perspectives 15 (2):213–238. Doyle, Orla, Colm P Harmon, James J Heckman, and Richard E Tremblay. 2009. “Investing in Early Human Development: Timing and Economic Efficiency.” Economics & Human Biology 7 (1):1–6. Dubin, Jeffrey A and Daniel L McFadden. 1984. “An econometric Analysis of Residential Electric Appliance Holdings and Consumption.” Econometrica :345–362. Duncan, Greg J and Rachel Dunifon. 2012. 
“Introduction to ‘Soft-Skills’ and Long-Run Labor Market Success.” Research in Labor Economics 35:309–312. Duncan, Greg J and Katherine Magnuson. 2011. “The Nature and Impact of Early Achievement Skills, Attention Skills, and Behavior Problems.” In Whither Opportunity? Rising Inequality, Schools, and Children’s Life Chances. Russell Sage Foundation, 47–69. Ehrler, David J, J Gary Evans, and Ron L McGhee. 1999. “Extending Big-Five Theory into Childhood: A Preliminary Investigation into the Relationship Between Big-Five Personality Traits and Behavior Problems in Children.” Psychology in the Schools 36 (6):451–458. Farmer, Elizabeth MZ. 1993. “Externalizing Behavior in the Life Course The Transition From School to Work.” Journal of Emotional and Behavioral Disorders 1 (3):179–188. ———. 1995. “Extremity of Externalizing Behavior and Young Adult Outcomes.” Journal of Child Psychology and Psychiatry 36 (4):617–632. Fortin, Nicole M. 2008. “The Gender Wage Gap among Young Adults in the United States The Importance of Money versus People.” Journal of Human Resources 43 (4):884–918.


Gensowski, Miriam, James Heckman, and Peter Savelyev. 2011. “The Effects of Education, Personality, and IQ on Earnings of High-Ability Men.” Mimeo, University of Chicago. Ghodsian, M. 1977. “Children’s Behaviour and the BSAG: Some Theoretical and Statistical Considerations.” British Journal of Social and Clinical Psychology 16 (1):23–28. Goldberger, Arthur S. 1972. “Structural Equation Methods in the Social Sciences.” Econometrica :979–1001. Gregg, Paul and Stephen Machin. 2000. “Child Development and Success or Failure in the Youth Labor Market.” In Youth Employment and Joblessness in Advanced Countries. University of Chicago Press, 247–288. Hamilton, Barton H, Nidhi Pande, and Nicholas W Papageorge. 2014. “The Right Stuff? Personality and Entrepreneurship.” Mimeo, Johns Hopkins University. Hamre, Bridget K. and Robert C. Pianta. 2006. “Student-Teacher Relationships.” In Children’s Needs III: Development, Prevention and Intervention, edited by G. Bear and Kathleen M. Minke, chap. 5. National Association of School Psychologists, Bethesda, MD, 49–60. Heckman, James, Rodrigo Pinto, and Peter Savelyev. 2013. “Understanding the Mechanisms through Which an Influential Early Childhood Program Boosted Adult Outcomes.” American Economic Review 103 (6):2052–86. Heckman, James J. 2012. “The Developmental Origins of Health.” Health Economics 21 (1):24–29. Heckman, James J and Tim Kautz. 2013. “Fostering and Measuring Skills: Interventions that Improve Character and Cognition.” NBER working paper. Heckman, James J and Paul A LaFontaine. 2010. “The American High School Graduation Rate: Trends and Levels.” The Review of Economics and Statistics 92 (2):244–262. Heckman, James J and Dimitriy V Masterov. 2007. “The Productivity Argument for Investing in Young Children.” Applied Economic Perspectives and Policy 29 (3):446–493. Heckman, James J and Yona Rubinstein. 2001. 
“The Importance of Noncognitive Skills: Lessons from the GED Testing Program.” The American Economic Review 91 (2):145–149. Heckman, James J, Jora Stixrud, and Sergio Urzua. 2006. “The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior.” The Journal of Labor Economics 24 (3):411. Heineck, Guido. 2010. “Does It Pay to Be Nice-Personality and Earnings in the United Kingdom.” Industrial and Labor Relations Review 64:1020–1047. Jack, Susan L., Richard E. Shores, R. Kenton Denny, Philip L. Gunter, Terry DeBriere, and Paris DePaepe. 1996. “An analysis of the Relationship of Teachers’ Reported Use of Classroom Management Strategies on Types of Classroom Interactions.” Journal of Behavioral Education 6 (1):67–87. Jackson, Michelle. 2006. “Personality Traits and Occupational Attainment.” European Sociological Review 22 (2):187–199. Jöreskog, Karl G and Arthur S Goldberger. 1975. “Estimation of a Model with Multiple Indicators and Multiple Causes of a Single Latent Variable.” Journal of the American Statistical Association 70 (351a):631–639. Kounin, J. S. 1970. Discipline and Group Management in Classrooms. New York: Holt, Rinehart and Winston. Lee, Lung-Fei. 1983. “Generalized Econometric Models with Selectivity.” Econometrica :507–512.


Lundberg, Shelly. 2011. “Psychology and Family Economics.” Perspektiven der Wirtschaftspolitik 12 (s1):66–81. ———. 2012. “Personality and Marital Surplus.” IZA Journal of Labor Economics 1 (1):1–21. Savelyev, Peter A. 2010. “Conscientiousness, Education, and Longevity of High-Ability Individuals.” Mimeo, University of Chicago, Department of Economics. Sciulli, Dario. 2012. “Child Social Maladjustment and Adult Employment Dynamics.” Mimeo, University of Chieti-Pescara. Segal, Carmit. 2008. “Classroom Behavior.” Journal of Human Resources 43 (4):783–814. ———. 2013. “Misbehavior, Education, and Labor Market Outcomes.” Journal of the European Economic Association 11 (4):743–779. Shepherd, Peter. 2013. “Bristol Social Adjustment Guides at 7 and 11 years.” Centre for Longitudinal Studies. Störmer, Susi and René Fahr. 2013. “Individual Determinants of Work Attendance: Evidence on the Role of Personality.” Applied Economics 45 (19):2863–2875. Stott, Denis Herbert. 1958. The Social Adjustment of Children: Manual to the Bristol Social Adjustment Guides. University of London Press. Thomson, G. H. 1951. The Factorial Analysis of Human Ability. University of London Press, London. Urzua, S. 2008. “Racial Labor Market Gaps The Role of Abilities and Schooling Choices.” Journal of Human Resources 43 (4):919–971. Weiss, Christoph T. 2010. “The Effects of Cognitive and Noncognitive Abilities on Wages.” Mimeo, European University Institute. Wichert, Laura and Winfried Pohlmeier. 2010. “Female Labor Force Participation and the Big Five.” ZEW-Centre for European Economic Research Discussion Paper.


9 Tables and Figures

Table 2: Summary Statistics

                        Males              Females            Both
No Formal Education     0.095 (0.29)       0.13 (0.33)        0.11 (0.31)
CSE                     0.11 (0.31)        0.14 (0.35)        0.13 (0.33)
O-Level                 0.30 (0.46)        0.38 (0.49)        0.34 (0.47)
A-Level                 0.19 (0.39)        0.10 (0.30)        0.15 (0.35)
Higher Education        0.16 (0.37)        0.14 (0.35)        0.15 (0.36)
Higher Degree           0.15 (0.35)        0.10 (0.30)        0.12 (0.33)
Hourly Wage             9.59 (20.22)       6.26 (11.46)       8.07 (16.87)
Weekly Hours Worked     44.07 (9.26)       27.68 (12.78)      36.58 (13.71)
Weekly Earnings         414.94 (873.75)    181.80 (291.85)    308.45 (683.41)
Experience              164.24 (45.76)     123.06 (53.28)     142.89 (53.88)
Self Employed           0.20 (0.40)        0.11 (0.32)        0.16 (0.37)
In Paid Work            0.92 (0.27)        0.65 (0.48)        0.78 (0.41)
Has a Partner           0.88 (0.32)        0.87 (0.34)        0.88 (0.33)
Number of Children      1.37 (1.15)        1.65 (1.13)        1.51 (1.15)
Observations            2889               3110               5999

Summary statistics for the analysis sample of 5,999 individuals. Statistics are reported separately for males (Column [1]), for females (Column [2]) and for both genders (Column [3]). Standard deviations are in parentheses. For education categories, employment and partnership, entries are proportions (percentages divided by 100). Experience is measured in months, and wages and weekly earnings are in 1992 British pounds.

[Figure 1. Panels: (a) Wages by schooling (Average Hourly Wages against Schooling Level); (b) Hours by schooling (Average Weekly Hours against Schooling Level); (c) Earnings by schooling (Average Weekly Earnings against Schooling Level); (d) Normalized Earnings by Schooling (Normalized Weekly Earnings against Schooling Level). Series: Males, Females.]

Figure 1: Gender Differences in Labor Market Outcomes: Figure 1(a) compares hourly wages by schooling level and gender, Figure 1(b) compares weekly hours worked by schooling level and gender, and Figures 1(c) and 1(d) compare weekly earnings and normalized weekly earnings by schooling level and gender.

Table 3: Summary Statistics - BSAG Variables

                                            Males          Females        Both
Hostility Towards Adults                    0.87 (1.86)    0.60 (1.56)    0.73 (1.72)
Hostility Towards Children                  0.26 (0.78)    0.20 (0.63)    0.23 (0.71)
Anxiety for Acceptance by Adults            0.47 (1.08)    0.53 (1.18)    0.50 (1.13)
Anxiety for Acceptance by Children          0.40 (0.92)    0.18 (0.56)    0.29 (0.77)
Restlessness                                0.23 (0.57)    0.15 (0.46)    0.19 (0.51)
Inconsequential Behavior                    1.62 (2.14)    0.85 (1.42)    1.22 (1.84)
Depression                                  1.05 (1.52)    0.79 (1.36)    0.92 (1.44)
Withdrawal                                  0.37 (0.88)    0.25 (0.66)    0.30 (0.77)
Unforthcomingness                           1.51 (2.00)    1.43 (2.08)    1.47 (2.04)
Writing Off of Adults and Adult Standards   1.07 (1.77)    0.68 (1.32)    0.87 (1.56)
Observations                                2889           3110           5999

Summary statistics for maladjustment syndrome scores for our sample of 5,999 individuals observed at age 11. Measures constructed using teachers’ reports of misbehavior or misconduct in school. Statistics are reported separately for males (Column [1]), for females (Column [2]) and for both genders (Column [3]). For each maladjustment syndrome, a child receives a score, which is an integer between 0 and 15, with 15 indicating a persistent display of behavior described by the maladjustment syndrome. In the table, entries are averages for each syndrome for the analysis sample.

Table 4: Reduced Form: Log Weekly Earnings

Variable              [1]         [2]         [3]         [4]         [5]         [6]         [7]         [8]
Cognition             .224***     .234***     .220***     .225***     .092***     .088***     .108***     .073***
Misbehavior           -.023*      .           .           .           .           .           .           .
Externalizing         .           .007        .           .029**      .036***     .049***     .038***     .050**
Internalizing         .           .           -.040***    -.051***    -.039***    -.039***    -.046***    -.019
CSE                   .           .           .           .           .074*       .004        .004        .021
O Level               .           .           .           .           .222***     .090**      .085*       .093*
A Level               .           .           .           .           .343***     .214***     .113**      .284***
Higher Education      .           .           .           .           .557***     .412***     .252***     .512***
Higher Degree         .           .           .           .           .685***     .677***     .397***     .815***
Has a Partner         .           .           .           .           .           .170***     .162***     .121***
Number of Children    .           .           .           .           .           -.141***    -.0003      -.287***
Experience            .           .           .           .           .           .004***     .002***     .003***
Female                -1.030***   -1.023***   -1.032***   -1.027***   -.975***    -.814***    .           .
London                .238***     .239***     .238***     .239***     .229***     .215***     .259***     .170***
Const.                5.630***    5.628***    5.630***    5.629***    5.318***    4.850***    5.084***    4.338***
Obs.                  4691        4691        4691        4691        4691        4691        2280        2411

This table contains parameter estimates from OLS regressions used to link non-cognitive skills to earnings. We regress log earnings of workers on a set of observable variables along with proxies for unobserved skills. To construct proxies for unobserved skills, we apply principal components factor analysis to all the variables used to measure that skill. Models [1]-[6] include all individuals and a gender dummy, Model [7] includes only males and Model [8] only females.
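
The proxy-then-regress procedure described in the note to Table 4 can be illustrated with a minimal sketch. It assumes synthetic data, a single latent skill with three noisy measures, and a bivariate regression in place of the full set of controls. The function names and all numbers are hypothetical and are not taken from the paper, and power iteration stands in for whatever routine the authors used to extract the first principal component.

```python
import math
import random


def standardize(col):
    """Rescale a list of numbers to mean 0 and (population) variance 1."""
    m = sum(col) / len(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd for v in col]


def first_component_scores(columns, iterations=100):
    """Return (weights, standardized scores) for the first principal component."""
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    # Correlation matrix of the standardized measures.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[j] * z[j][i] for j in range(k)) for i in range(n)]
    return w, standardize(scores)


# Synthetic data (assumption): one latent skill, three noisy measures of it,
# and log earnings that depend on the skill with a true coefficient of 0.2.
random.seed(0)
n = 2000
skill = [random.gauss(0, 1) for _ in range(n)]
measures = [[load * s + random.gauss(0, 0.5) for s in skill]
            for load in (0.8, 0.7, 0.6)]
log_earnings = [5.0 + 0.2 * s + random.gauss(0, 0.5) for s in skill]

weights, proxy = first_component_scores(measures)

# Bivariate OLS slope of log earnings on the proxy. The proxy has unit
# variance, so the slope equals the sample covariance.
mean_y = sum(log_earnings) / n
slope = sum(p * (y - mean_y) for p, y in zip(proxy, log_earnings)) / n
print(round(slope, 3))
```

Because the proxy measures the latent skill with error, the estimated slope in this sketch is expected to fall somewhat below the true coefficient of 0.2 used to generate the data.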

Table 5: Reduced Form: Schooling on Behaviors

                 CSE          O-Level      A-Level      Higher Ed.   Higher Deg.
[1]
Cognition        0.851***     1.702***     2.232***     2.259***     3.729***
Misbehavior      -0.188***    -0.267***    -0.321***    -0.388***    -0.624***
[2]
Cognition        0.859***     1.726***     2.263***     2.289***     3.764***
Externalizing    -0.195***    -0.226***    -0.257***    -0.333***    -0.550***
[3]
Cognition        0.906***     1.743***     2.270***     2.307***     3.791***
Internalizing    -0.0592      -0.195***    -0.271***    -0.296***    -0.437***
[4]
Cognition        0.858***     1.701***     2.229***     2.256***     3.728***
Externalizing    -0.205***    -0.177***    -0.173***    -0.250***    -0.420***
Internalizing    0.0126       -0.131**     -0.209***    -0.204***    -0.280***
[5]
Cognition        0.865***     1.519***     1.945***     2.180***     3.513***
Externalizing    -0.146**     -0.237***    -0.231***    -0.353***    -0.407***
Internalizing    -0.0183      -0.106       -0.171**     -0.184**     -0.252**
[6]
Cognition        0.845***     1.856***     2.661***     2.295***     3.977***
Externalizing    -0.297***    -0.0896      -0.0784      -0.0551      -0.541**
Internalizing    0.0481       -0.169**     -0.307**     -0.246**     -0.310**

This table contains parameter estimates from multinomial logistic regressions linking non-cognitive skills to schooling. We regress education on a set of observable variables along with proxies for unobserved skills. The construction of the proxies, which relies on principal components analysis, is detailed in Appendix A. Models [1]-[4] include all individuals and a gender dummy. Models [5] and [6] present estimates for analyses run separately for males and females.

Table 6: Reduced Form: Selection into Occupations

[1] Selection into Occupations
Occupation:      Unskilled    Skilled      Professional
Cognition                     0.193***     0.364***
Externalizing                 0.043        0.011
Internalizing                 -0.104***    -0.172***

[2] Occupations and Earnings
Occupation:      Unskilled    Skilled      Professional
Cognition        -.052        .071***      .116***
Externalizing    .047*        .045**       .050**
Internalizing    -.005        -.024        -.075***
Obs.             867          1996         1649

This table contains parameter estimates from multinomial probit regressions linking noncognitive skills to occupations. At age 33, individuals are categorized into three occupational groups: unskilled, skilled and professional/managerial. We report the summary statistics from the sub-sample with occupational information in Appendix E. We regress occupations on a set of observable variables along with proxies for unobserved skills. Models [1] and [2] include all individuals and a gender dummy. Controls include educational attainment, number of children by age 33 and partnership status at 33.

Table 7: Regressors included in each equation.

Regressors Externalizing Internalizing Cognition Class Size Class Prep. # Children in HH Mother’s Education Father’s Education London Schooling Partner Fertility Experience

School. x x x x x x x x

Partner x x x

Equations Fertil. Exper. Employ. x x x x x x x x x

x x x x x

x x x x x

x x x x

x x x x

Hours x x x

Wage x x x

x x x x x

x x x x x

For each equation in the model, we indicate with an x which regressors are included as right-hand-side (explanatory) variables. Non-cognitive skills, class size, class preparation and number of children in the household are measured at age 11. Schooling, partnership, fertility and experience are outcomes that are included as regressors in later-outcome equations.

[Figure 2. Panels: (a) Cognition; (b) Internalizing; (c) Externalizing. Series: Males, Females.]

Figure 2: Structural Results: Simulated distribution of the latent skills by gender. Figure 2(a) compares the distribution of Cognition for males and females. Figure 2(b) compares the distribution of the Internalizing Behavior for males and females. Figure 2(c) compares the distribution of the Externalizing Behavior for males and females.

[Figure 3. Panels: (a) Males; (b) Females. Axes: Earnings; % Change in Earnings. Series: Direct Effect; Add: Schooling; Add: Fertility; Cumulative Effect.]

Figure 3: Decomposition of the Cumulative Effect: Simulated weekly earnings at age 33 conditional on employment for different levels of externalizing behavior under different mechanisms. The Cumulative Effect allows externalizing to affect all endogenous outcomes. The Direct Effect only allows externalizing to affect Earnings directly. Add: Schooling allows externalizing to affect Earnings through its effect on Schooling in addition to the direct effect. Add: Fertility allows externalizing to affect Earnings through its effect on Fertility in addition to the effect on Schooling and the direct effect. Externalizing is set at the 50th percentile for all other endogenous outcomes. For this reason, all four lines intersect at the 50th percentile.
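
The logic of switching channels on one at a time can be illustrated with a toy sketch. It assumes a two-equation linear model in which externalizing lowers schooling and raises log earnings directly; every coefficient, the zero median, and all function names are hypothetical and are not the paper's estimates.

```python
MEDIAN_EXT = 0.0  # assumed median level of externalizing


def schooling(ext):
    """Toy schooling equation: externalizing lowers schooling (assumed)."""
    return 3.0 - 0.3 * ext


def log_earnings(ext, school):
    """Toy earnings equation: direct premium plus a return to schooling (assumed)."""
    return 5.0 + 0.05 * ext + 0.10 * school


def direct_effect(ext):
    # Schooling is held at the level implied by median externalizing,
    # so externalizing moves earnings only through the direct term.
    return log_earnings(ext, schooling(MEDIAN_EXT))


def add_schooling(ext):
    # Schooling is allowed to respond to externalizing as well.
    return log_earnings(ext, schooling(ext))


for ext in (-1.0, 0.0, 1.0):
    print(ext, round(direct_effect(ext), 3), round(add_schooling(ext), 3))
```

In this toy model the two lines coincide at the median by construction, and the line that adds the schooling channel is flatter than the direct-effect line because the schooling penalty offsets part of the direct premium.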

Table 8: Decomposition

Males
% Change in:         Employment   Hours     Wages     Earnings
Cumulative Effect    -3.055       2.830     0.631     3.492
Direct Effect        -2.840       2.623     2.521     5.210
Add: Schooling       -3.212       2.850     0.571     3.447
Add: Fertility       -3.414       2.901     0.597     3.526
Add: Partnership     -3.055       2.916     0.706     3.649
Add: Experience      -3.055       2.862     0.162     3.035
Add: Employment      -3.055       2.830     0.631     3.492

Females
% Change in:         Employment   Hours     Wages     Earnings
Cumulative Effect    4.710        6.860     -3.659    2.914
Direct Effect        7.448        9.695     -0.361    9.299
Add: Schooling       6.723        9.244     -2.007    6.804
Add: Fertility       5.086        7.139     -2.773    4.029
Add: Partnership     4.710        7.012     -3.069    3.581
Add: Experience      4.710        6.897     -3.239    3.288
Add: Employment      4.710        6.860     -3.659    2.914

Decomposition of the Effect of Externalizing Behavior: Percent differences in labor market outcomes for high-externalizing (75th percentile) versus low-externalizing (25th percentile) males and females and for different combinations of direct and indirect effects. The Cumulative Effect includes the impact of externalizing on variables endogenous to labor market outcomes. The Direct Effect only includes the impact of externalizing on labor market outcomes, holding endogenous variables fixed at sample averages. To understand the table, notice that the third row (“Add: Schooling”) computes the difference in average earnings where externalizing behavior has a direct effect on labor market outcomes and also on schooling. Schooling is allowed to affect all other outcomes such as fertility, partnership and the labor market outcomes.
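
As a rough consistency check on Table 8, the following sketch assumes that weekly earnings behave approximately like hourly wages times weekly hours, so that the percent changes in hours and wages should compound to roughly the percent change in earnings. This multiplicative relationship is an assumption made here for illustration, since averages of products need not equal products of averages; the helper name `compose` is hypothetical.

```python
def compose(pct_hours, pct_wages):
    """Percent change in a product, given percent changes in its two factors."""
    return ((1 + pct_hours / 100) * (1 + pct_wages / 100) - 1) * 100


# Cumulative Effect rows of Table 8: (hours, wages, reported earnings).
table8_cumulative = {
    "Males": (2.830, 0.631, 3.492),
    "Females": (6.860, -3.659, 2.914),
}

# Gap between the compounded figure and the reported earnings change.
gaps = {
    group: compose(hours, wages) - earnings
    for group, (hours, wages, earnings) in table8_cumulative.items()
}

for group, gap in gaps.items():
    print(group, round(gap, 3))
```

Under this assumption the compounded changes land within a few hundredths of a percentage point of the reported earnings changes for both genders.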

[Figure 4. Panels: (a) Externalizing and Partnership - Males; (b) Externalizing and Partnership - Females; (c) Externalizing and Fertility - Males; (d) Externalizing and Fertility - Females. Axes: Probability of Having a Partner (panels a, b) and Number of Children (panels c, d) against Externalizing. Series: High Cognition, Low Cognition.]

Figure 4: Cumulative effects of externalizing behavior: Figures 4(a) and 4(b) plot the simulated probability of having a partner by level of the externalizing behavior for males and females respectively. Figures 4(c) and 4(d) plot the expected number of children at age 33 by level of the externalizing behavior for males and females respectively.

[Figure 5. Panels: (a) Fertility and Employment; (b) Fertility and Hours; (c) Fertility and Wages; (d) Fertility and Earnings. Axes: Normalized Labor Force Participation, Normalized Weekly Hours Worked, Normalized Hourly Wages and Normalized Weekly Earnings, each against Number of Children. Series: Males, Females.]

Figure 5: Impact of Fertility on Labor Market Outcomes by Gender: Figure 5(a) compares the simulated labor force participation for different numbers of children across genders. Figure 5(b) compares the simulated hours worked for different numbers of children across genders. Figure 5(c) compares the simulated hourly wage rate for different numbers of children across genders. Figure 5(d) compares the simulated weekly earnings for different numbers of children across genders. We normalize each labor market outcome for individuals with no children to 1 for each gender to facilitate the comparison across genders.
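
The normalization used in Figure 5 amounts to dividing each simulated outcome by its value for individuals with no children, separately by gender. A minimal sketch follows; the hours values are made up purely for illustration and are not the paper's simulated outcomes, and the function name is hypothetical.

```python
def normalize_to_baseline(outcome_by_children):
    """Divide each value by the value for individuals with no children."""
    base = outcome_by_children[0]
    return {children: value / base
            for children, value in outcome_by_children.items()}


# Hypothetical weekly hours by number of children (illustrative only).
hours_males = {0: 44.0, 1: 45.1, 2: 45.3, 3: 45.0}
hours_females = {0: 36.0, 1: 27.0, 2: 23.4, 3: 21.6}

norm_males = normalize_to_baseline(hours_males)
norm_females = normalize_to_baseline(hours_females)

for children in sorted(hours_males):
    print(children, round(norm_males[children], 3), round(norm_females[children], 3))
```

After normalization both series start at 1, so the plotted lines show proportional changes relative to childless individuals of the same gender rather than differences in levels between genders.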

Table 9: Structural Results: Policy Counterfactuals

Males               High Edu.   Exper.     Empl.    Hours     Wages     Earnings
Baseline            0.479       166.801    0.916    45.020    9.521     428.216
Behavioral Policy   0.613       178.126    0.971    41.592    9.388     390.000
Schooling Policy    0.613       163.140    0.924    44.745    10.047    449.204

Females             High Edu.   Exper.     Empl.    Hours     Wages     Earnings
Baseline            0.319       123.966    0.657    29.411    5.901     179.441
Behavioral Policy   0.388       128.635    0.570    24.321    6.642     167.878
Schooling Policy    0.388       120.389    0.669    29.932    6.306     196.773

All                 High Edu.   Exper.     Empl.    Hours     Wages     Earnings
Baseline            0.399       145.383    0.787    37.216    7.711     303.828
Behavioral Policy   0.500       153.381    0.770    32.956    8.015     278.939
Schooling Policy    0.500       141.765    0.797    37.339    8.177     322.988

Estimated average outcomes for different policy counterfactuals, for the sample of males, of females and of both. Under the Schooling Policy we do not allow for externalizing to affect the schooling decision. Under the Behavioral Policy we reduce the level of externalizing for all individuals in order to match the schooling level under the Schooling Policy.
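
To make the comparison in Table 9 concrete, the sketch below converts the reported average weekly earnings into percent changes relative to the baseline. The earnings figures are copied from Table 9; the helper name `pct_change` and the dictionary layout are choices made here for illustration.

```python
# Average weekly earnings from Table 9:
# (baseline, behavioral policy, schooling policy).
earnings = {
    "Males": (428.216, 390.000, 449.204),
    "Females": (179.441, 167.878, 196.773),
    "All": (303.828, 278.939, 322.988),
}


def pct_change(new, base):
    """Percent change of `new` relative to `base`."""
    return (new / base - 1) * 100


# For each group: (change under behavioral policy, change under schooling policy).
changes = {
    group: (pct_change(behavioral, base), pct_change(school, base))
    for group, (base, behavioral, school) in earnings.items()
}

for group, (behavioral, school) in changes.items():
    print(f"{group}: behavioral {behavioral:+.2f}%, schooling {school:+.2f}%")
```

For males, for instance, this gives roughly a 8.9 percent fall in earnings under the Behavioral Policy and roughly a 4.9 percent rise under the Schooling Policy, which matches the direction of the findings summarized in the abstract.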

[Figure 6. Panels: (a) Policy Counterfactuals - Males; (b) Policy Counterfactuals - Females. Axes: Earnings against Cognition. Series: Baseline, Behavioral Policy, School Policy.]

Figure 6: Policy Counterfactuals: Simulated weekly earnings for different percentiles of Cognition under different policy experiments. Under the Behavioral Policy, we reduce the level of externalizing for all individuals by 2 standard deviations. Under the Schooling Policy, we do not change the level of externalizing in the population but assign to each individual the same schooling level as predicted under the Behavioral Policy.

Table 10: Reduced Form: Financial Difficulties and Earnings

Variable         [1]         [2]         [3]         [4]         [5]         [6]         [7]         [8]
Cognition        .220***     .087***     .104***     .057**      .191***     .075**      .063*       .091
Externalizing    .041***     .061***     .041***     .088***     -.036       -.003       -.009       .012
Internalizing    -.061***    -.062***    -.056***    -.068***    -.037       -.030       -.024       -.037
Female           -1.011***   -.836***    .           .           -1.128***   -.870***    .           .
Obs.             4544        3809        2099        1710        866         708         359         349
Controls         (N)         (Y)         (Y)         (Y)         (N)         (Y)         (Y)         (Y)
Financial Dif.   (N)         (N)         (N)         (N)         (Y)         (Y)         (Y)         (Y)

This table contains parameter estimates from OLS regressions used to link non-cognitive skills to earnings. We regress log earnings of workers on a set of observable variables along with proxies for unobserved skills. Results are separated for students who do not experience financial difficulty during development (Models [1]-[4]) and those who do (Models [5]-[8]). Models [3] and [7] include only males and [4] and [8] only females. Controls include educational attainment, number of children by age 33, partnership status at 33, experience and employment at age 33. To construct proxies for unobserved skills, we apply principal components factor analysis to all the variables used to measure that skill. Further details and results on the sample used for the above analysis are in Appendix D.
