Assortative preferences in choice of major

Abstract
The primary objective of this study is to examine the contribution of available information constrained by parents’ fields of study to the observed assortative preferences in their children’s choice of major. Comparable to panel models, we define within-family transmission functions with 1-to-2 matches (1 for each parent). Using the confidential major file of the 2011 National Household Survey from Canada, the results show that children’s choice of field of study exhibits significant assortative preferences isolated from ability sorting and unobserved differences across majors and other family characteristics. With some caution, we attribute this persisting assortative tendency to the information asymmetry across alternative majors that is built on parents’ educational backgrounds within families.


Introduction
Evidence shows that expectations about earnings, employment opportunities, marriage options, job-family balance, enjoyment of course work, social status of available jobs, and own ability to successfully complete the study associated with each major are fundamental factors in the choice of field of study. The evidence also shows that there is a substantial error in beliefs (subjective expectations) about the population values of these determinants (Stinebrickner and Stinebrickner, 2014). When students are provided correct information, they update their beliefs and their choice of field of study (Wiswall and Zafar, 2015; Arcidiacono et al., 2012).
According to the 2013 National Graduate Survey, close to 60% of university graduates in Canada report that their parents' recommendations played a very important role in their choice of major. This is not surprising because the information and its value from different sources become more dispersed and questionable. Altonji et al. (2015), for instance, documented that Princeton pushes students to consider departments with fewer students. Some postsecondary institutions prefer a distribution of students across majors in such a way that it correlates with the distribution of faculty members in those majors. Departments in high (low) demand make their own field of study less (more) attractive when counseling students in their choice of major. As complex education choices are made under uncertainty about the achievement of choice-specific outcomes and personal preferences and abilities, parents become the least costly and most trustworthy channels of information, especially in Canada, where switching majors is not costless. 1,2 Yet, a significant assortativity (a child predictably becomes a teacher because it is his father's and/or mother's job) could also suggest systemic biases in decision-making, especially when the information about the achievement of the future major-specific outcomes is bounded by parents' fields of study. 3 What then is the parents' role in the belief formation? To understand that information is not distributed symmetrically across majors with the same value and volume, imagine that both parents are accountants and working in the finance industry. The cost of obtaining the same level of information about other majors, say, biochemistry, is obvious. This brings us to the question of how the field-of-study homogamy (FSH) and whether the parents work in related occupations affect the magnitude of information asymmetry.
The following 2 empirical questions need to be answered to assess the role of information asymmetry in the choice of major more formally: How can we quantify the resemblance of fields of study between parents and children beyond a binary proposition that reflects the assortative tendency, an association that exposes the attraction of each child to their own parents' majors? How can we identify the role of information asymmetry in this assortative tendency, after removing the other factors that are not observed by the researcher, such as implicit randomness, ability sorting, and individual tastes, embedded in the resemblance of majors between each parent and child? We will try to answer both questions in this paper.
1 As expected, studies (e.g., Hoxby and Avery, 2013) show that less well-educated parents with no specialization would not be good transmitters of information.
2 Although the system is different from one where the major is chosen at entry into the university through a centralized test or using a threshold grade point average (GPA) required for each major, students in Canada are usually accepted to universities at three main faculty levels: Arts, Science, and Business/Commerce. Each of these requires different courses to be completed in high school with competitive GPAs at grades 11 and 12. Therefore, although the majors are decided after the second year, roughly after completing 14-16 core courses within each faculty, switching majors across faculties imposes a significant cost on students and parents.
3 In addition to information asymmetry, parents could also impose their preferences on their child's educational attainment by their willingness to use financial transfers to "distort" their child's choice toward (or against) a specific field of study. Zafar (2012) investigates this issue in his recent paper titled, "Double majors: one for me, one for the parents?"
This study's primary objective is to investigate the role of information asymmetry in children's attraction to their parents' field of study reflected by assortative tendencies in child-parent matches. We apply conventional intergenerational transmission functions that relate the children's assortative preferences to FSH and whether parents work in their trained jobs within Canadian families. We use the confidential major file of the 2011 National Household Survey (NHS) so that the size of the data and the availability of different levels of aggregation in the Classification of Instructional Programs (CIP) allow us to develop 3 indicators: the degree of children's attraction to their parents' field of study (field of study attraction or FSA), the degree of FSH, and the degree of relatedness between each parent's field of study and occupation (field-of-study relatedness or FOR). To identify the role of information asymmetry in assortative patterns, we define quasi-likelihood transmission functions, where the response variables take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of FSH and FOR. Similar to the difficulties in identifying the role of expected earnings in college major choice, the challenge here is also to control for selection into each major. To tackle this problem, we define within-family transmission functions based on an assortative matching model with 1-to-2 matches (1 for each parent), inspired by Diamond and Agarwal (2016). Comparable to panel models, this allows us to reduce unobserved heterogeneity so that the results provide new and more direct evidence about the intergenerational association of field of study due to information asymmetry reflected in assortative tendencies, which is, to the best of our knowledge, the first of its kind in the literature.
The first part of our results shows that children's choice of field of study exhibits significant assortative preferences. This finding is a new contribution that reports intergenerational skill transfers as opposed to educational mobility. We also find that the assortative tendency is the highest between fathers and sons relative to all other pairs, namely, father-daughter, mother-son, and mother-daughter. This evidence becomes even stronger when we use more disaggregated CIP codes and control for educational degrees. A significant skill sorting in mating is also revealed by the FSH measures, which also indicate gender differences in the attractiveness of each major in marriage. This finding is consistent with the evidence that the gain from the marriage could be different for each spouse (Choo and Siow, 2006) and with the evidence of a substantial degree of gender heterogeneity in the preferences for each major (Wiswall and Zafar, 2015). These findings, on significant intergenerational skill transfers and greater assortative mating for skills, are in line with the concerns about the possible progressive skill stratifications and earning inequalities in societies.
In the second part, the estimation results show that higher assortativity in each child-parent combination is strongly associated with greater homogamy and field-of-study relatedness in parents' jobs. The empirical approach that we apply here aims to identify the role of information boundaries in this relationship. Our findings indicate that asymmetric information is a significant contributor to children's assortative tendencies in their choice of major.
The remainder of the paper is organized as follows: Section 2 introduces the data, homogamy, assortative preferences, and occupational relatedness; Section 3 introduces the conceptual background that links the subjective expectations in choice of major to information entropy; the empirical framework is explained in Section 4; the estimation results are reported in Section 5; and we provide the concluding remarks in Section 6.

Data
This study uses the confidential major file of the 2011 NHS. We restricted the data to include only non-Aboriginal native-born individuals living in 10 provinces. We also dropped non-degree-holder parents (i.e., those with no education or an education degree that does not grant a major) and those whose field of study contains fewer than 10 workers. After these restrictions, we obtained about 2.3 million observations. The 2011 NHS enables the classification of individuals' major field of study in which the highest postsecondary certificate, diploma, or degree was granted. Statistics Canada classifies the major fields of study by using the CIP, which includes
(2016) has developed a new method to adjust the data to recover the outcomes of "missing" independent children. However, their educational outcome is measured in terms of years of schooling. We have also applied the inverse probability weights method to our subsample to address the possible selectivity problem. The results on FSA calculations do not change significantly.
and gender is just one aspect of the matter. It is possible, for instance, that those individuals who live at home are more likely to attend their closest higher education institution and thus their program choice set would be limited by a regional concentration in a particular industry, in which the parents work. Although we address this problem by controlling for unobserved regional heterogeneity, another issue would be that children who have a good relationship with their parents are more likely to stay at home and may therefore hold their parents as likely role models and follow in their footsteps. If this is the case, using our sample would lead to an upward bias in the parent-child assortativeness of field of study. 6 More descriptive information about the data and our samples is provided in the following sections, which explain FSH, FSA, and FOR.

FSH measures
Assortative mating has long been documented by demographers using nonparametric log-linear models based on contingency tables of ethnicity, education, religion, and other attributes (Schwartz, 2013). Following Becker's (1973, 1974) theory on marriage markets, economists have also investigated assortative mating in relation to match gains and returns to marriage. Chiappori and Salanie (2016), for instance, show that educational homogamy of posterity is likely to be reinforced by increases in the human capital of parents, who are matched homogamously themselves. Bicakova and Jurajda (2016) are the first to analyze mating by field of study for European countries.
Unlike joint or conditional probabilities that define the likelihood of a match, we choose the following identity that recognizes the randomness inherent in the matching process and specifies to what extent the match is driven by assortative mating on the field of study and to what extent it reflects the marginal distributions of each major:
$$\Pr(F=f, M=m) = \frac{\Pr(F=f, M=m)}{\Pr(F=f)\,\Pr(M=m)} \, \Pr(F=f)\,\Pr(M=m) \qquad (1)$$
where F and M are indicators of fields of study for the female and male mates, respectively, in matching couples. As recognized in the literature, the observed matches in a marriage market are jointly determined by the preferences of both partners. For example, Choo and Siow (2006) argue that the observed marriage patterns positively depend on the gross gains to marriage in which the individual returns could be different for each spouse, reflecting a spousal "appreciation" or "attraction" of each field of study in mating.
With the number of matches, $m_{ij}$, where i and j reflect the husband's and wife's major in each row and column in the resulting contingency table, the FSH matrix can be calculated by Equation (2):
$$FSH_{ij} = \frac{m_{ij}/T}{(m_{i}/T)\,(m_{j}/T)} \qquad (2)$$
where $m_i$ and $m_j$ represent the row and the column totals, respectively, and T is the total number of pairs. While the FSH matrix reveals the assortativeness between, e.g., a male accountant and a female historian in mating, it would be quite possible that a male accountant's attraction to a female historian would be different from her attraction to a male accountant. In order to reflect a differential "appreciation" of each field of study for each spouse in mating, we use a simple horizontal (vertical) normalization for each row (column) of the FSH matrix between 0 and 1.
6 Gratefully, this point has been brought to our attention by one of our referees. Another point raised by the referee is that the financial crisis before the 2011 Census is likely to have an impact on choices of field of study. Issues like oversensitivity to uncertainty and a higher level of risk aversion may have strong effects on the choice set of majors for the cohort of students that we investigate in our study.
The results are reported in Table 1.
For example, the match between a male accountant (Business) and a female historian (Humanities) is ranked at 0.30 in terms of its assortativity, among all possible matches available for a male accountant with other different major holders. The same match is ranked at 0.53 among those available for a female historian reported in the bottom section of the table.
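The calculation behind these rankings can be sketched as follows. This is a minimal illustration, not the authors' code: the 3-by-3 counts are hypothetical, the cell ratio is read as the observed share of matches divided by the share expected under independence, and the normalization is assumed to divide each row (or column) by its largest entry, which the text does not specify.

```python
def fsh_matrix(counts):
    """Observed-to-expected ratio m_ij * T / (m_i * m_j) for every cell."""
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [counts[i][j] * total / (row_totals[i] * col_totals[j])
         for j in range(len(col_totals))]
        for i in range(len(row_totals))
    ]


def scale_rows(matrix):
    """Divide each row by its largest entry (assumed normalization)."""
    return [[value / max(row) for value in row] for row in matrix]


def scale_cols(matrix):
    """Divide each column by its largest entry (assumed normalization)."""
    transposed = [list(col) for col in zip(*matrix)]
    return [list(row) for row in zip(*scale_rows(transposed))]


# Hypothetical counts: rows = husband's major, columns = wife's major.
counts = [
    [50, 10, 5],
    [12, 40, 8],
    [6, 9, 30],
]
fsh = fsh_matrix(counts)
by_husband = scale_rows(fsh)  # ranks wives' majors for each husband's major
by_wife = scale_cols(fsh)     # ranks husbands' majors for each wife's major
```

Because rows and columns are scaled separately, the same cell can receive a different rank in `by_husband` and `by_wife`, which is the asymmetry the two sections of the table are meant to capture.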
These indices simply order each partner's appeal by his/her field of study and do not impose cardinal restrictions. It is obvious from the diagonal of both sections of the table that the evidence supports a strong FSH. Although not reported here, FSH becomes even stronger when we use 41 CIP codes. Comparing the matrices calculated based on 12 and 41 major CIP codes, the results indicate a slightly increasing FSH as we use more detailed CIP codes. 7
[Table 1: normalized FSH matrix, with rows by husband's major; the top section is normalized by husband's major. Notes, as far as recoverable: Mathematics, computer and information sciences, (8) Architecture, engineering, and related technologies, (9) Agriculture, and natural resources and conservation, (10) Health and related fields, and (11) Personal, protective, and transportation services.]

Quantifying assortative preferences: FSA
The FSA index compares the field of study of each parent to that of each child in a family and calculates the degree of attraction between the 2 based on the probability distributions. We create 4 contingency tables using the restricted subsample explained earlier. Each table reports the number of field-of-study matches between sons and fathers, daughters and fathers, sons and mothers, and daughters and mothers. Similar to Equation (1), to identify the observed matching patterns, we choose the following identity that reflects the differences between observed and expected frequencies under independence:
$$\Pr(P=p, K=k) = \frac{\Pr(P=p, K=k)}{\Pr(P=p)\,\Pr(K=k)} \, \Pr(P=p)\,\Pr(K=k) \qquad (3)$$
where P and K are indicators of fields of study for parents and children in matching, respectively. When it is normalized for each parental field of study between 0 and 1, the resulting measures imply the attraction of children to their parents' majors evaluated by the observed distribution of all possible matches between parents and children. The number of different matching possibilities between the parent and the child comes from the fact that it is the child who faces many different alternatives before making a decision on a major. The assortativity exposed by FSA reflects only the child's preferences as they are defined over children, not over parents in matches. While we use 12, 41, and 137 major groups of CIP, in the 4 match tables, we report only the sons' match calculated with 12 major CIP codes in Table 2. The higher values of FSA on the diagonal indicate that the most likely matches happen between the same fields of study. In each row, for any given major that the parent holds, the normalized FSA indicates the son's attraction to all other majors relative to the most likely match. The premise of this measure is that the child's attraction to each parent's major could be different even if the parents have the same field of study.
Intuitively, the same major could be more (or less) attractive for the son, e.g., if it is held by his father, which may reflect not only the differences between maternal and paternal influence but also gender differences in occupational distributions. While dissimilarities in each cell between the upper and lower parts of the table may expose this fact, the presence of a strong assortativity indicates that parents' field of study is a fundamental factor in children's choice of field of study.
Although we refrain from using more space to interpret the results here, one may wish to compare the extent of field-of-study attraction between parents and children across the 4 match tables. We use an index that computes the ratio of 2 diagonal shares of a match matrix as follows:
$$\frac{\sum_{i} \Pr(P=i, K=i)}{\sum_{i} \Pr(P=i)\,\Pr(K=i)} \qquad (4)$$
which is the sum of the joint probabilities on the diagonal relative to the sum of the products of their marginal probabilities. Hence, it provides the ratio of the actual share of matches with the same field of study (on the diagonal) to the share of matches that one would expect under the random matching assumption. 8 The indices calculated for the 41 major CIP codes are as follows: 119.56 for Father-Son, 31.28 for Mother-Son, 60.02 for Father-Daughter, and 48.88 for Mother-Daughter. These sharp differences, tested by 95% bootstrapped confidence intervals, indicate that a randomly picked father-son pair with the same field of study is about twice as likely as would be predicted under random matching. Moreover, a very low index for mother-son pairs suggests that the overall attraction of sons to their mother's major is slightly higher than what would be predicted if sons randomly pick their majors. Although these observations are very informative, they would not provide answers that explain the underlying reasons. In Section 3, we will attempt to confront this challenge.
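As a hedged illustration of the diagonal-share index described above, the sketch below computes the ratio of the observed diagonal share to the diagonal share expected under random matching. The two small tables are hypothetical, and the function name `diagonal_ratio` is ours; the scale on which the paper reports its indices is not reproduced here.

```python
def diagonal_ratio(counts):
    """Observed diagonal share divided by the diagonal share expected
    when parent and child majors are matched at random."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    observed = sum(counts[i][i] for i in range(n)) / total
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    return observed / expected


# Hypothetical parent-by-child tables (rows = parent's major).
random_matching = [[8, 12], [12, 18]]   # counts proportional to the margins
assortative = [[30, 10], [10, 50]]      # extra mass on the diagonal

ratio_random = diagonal_ratio(random_matching)
ratio_assortative = diagonal_ratio(assortative)
```

A value near 1 means same-field matches occur about as often as chance would imply; values above 1 indicate attraction to the parent's field.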

Field-of-study occupation relatedness: FOR
The evidence shows that when people do not work in their trained jobs, the value of their field of study diminishes (Aydede and Dar, 2016; Robst, 2007). A recent study by Lemieux (2014) finds that this wage penalty varies by field of study, ranging from 16% for engineers to 5.7% for degree holders in the Humanities. The quality of parents' occupational match would also contribute to the formation of subjective expectations about the major-specific outcomes. An accountant working as a chef, for instance, would be a less-reliable channel of information on the prospects of an accounting major than one who works as a certified public accountant.
[Table notes: See the notes to Table 1 for the full description of majors.]
To measure FOR beyond a binary proposition, related or not, we use the following continuous index suggested by Aydede and Dar (2016):
$$FOR_{of} = \frac{L_{of}/L_{f}}{L_{oT}/L_{T}} \qquad (5)$$
where L is the number of workers, o is the occupation, f is the field of study, and T denotes the whole workforce. Given the large sample at our disposal, we use the frequency distribution of 41 fields of study across 40 occupations classified according to the National Occupational Classification (NOC-2011), which gives us 1,640 cells to calculate FOR. For each of the 41 fields of study, when we normalize FOR between 0 and 1 by using the highest FOR as numeraire, the resulting index, NFOR, reveals the ranking of each occupation for each major based on the native-born workers' distribution. To provide a descriptive summary for FOR, we classify the NFOR in 2 class intervals (1-0.8 and 0.8-0.0) and report the distribution of spouses across these classes and 11 major fields of study in Table 3. If, for any given field of study, we consider the occupations with NFOR between 1.0 and 0.8 as relatively better-matching occupations, we see that 32% of husbands work in related occupations, with the same ratio slightly lower for wives. As expected, the ratio varies across majors from 10% for wives in humanities to 57% for husbands in education.
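A minimal sketch of the FOR and NFOR calculation follows. It assumes that FOR compares the share of a field's workers found in an occupation with that occupation's share of the whole workforce; this reading, the worker counts, and the function names are illustrative assumptions rather than the authors' implementation.

```python
def for_index(workers):
    """workers[f][o] = number of workers with field f in occupation o.
    Returns (share of field f in occupation o) / (share of occupation o
    in the whole workforce) for every cell."""
    total = sum(sum(row) for row in workers)
    field_totals = [sum(row) for row in workers]
    occ_totals = [sum(col) for col in zip(*workers)]
    return [
        [(workers[f][o] / field_totals[f]) / (occ_totals[o] / total)
         for o in range(len(occ_totals))]
        for f in range(len(field_totals))
    ]


def nfor_index(for_matrix):
    """Normalize each field's row by its highest FOR (the numeraire)."""
    return [[value / max(row) for value in row] for row in for_matrix]


def related_share(workers, nfor, threshold=0.8):
    """Share of each field's workers in occupations with NFOR >= threshold."""
    shares = []
    for f, row in enumerate(workers):
        related = sum(n for o, n in enumerate(row) if nfor[f][o] >= threshold)
        shares.append(related / sum(row))
    return shares


# Hypothetical counts: 2 fields (rows) across 3 occupations (columns).
workers = [
    [60, 5, 35],   # e.g., accounting graduates
    [10, 70, 20],  # e.g., biochemistry graduates
]
for_matrix = for_index(workers)
nfor = nfor_index(for_matrix)
shares = related_share(workers, nfor)
```

With these toy counts, only the best-matching occupation in each field falls in the 1-0.8 class, so `shares` reports the fraction of each field's workers employed there.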
Finally, to see the relationship between parents' education-job relatedness and children's attraction to their parents' field of study, we summarize the FSA for each child-parent pair by the parents' occupational relatedness in Table 4. In the first row of the table, both father (F) and mother (M) work in occupations that are related to their majors. This relatedness is reflected with a binary variable, NFORC, which is 1 if NFOR is between 1 and 0.2; and 0, otherwise. Although this classification is arbitrary, it seems that, in all parent-child pairs, a higher FSA is associated with a greater NFOR. More interestingly, the highest average FSA in each column is observed when the matching parent works in a related job irrespective of the other parent's occupational relatedness.
For example, in the first column of Table 4, the average FSA is much higher (0.466 and 0.469) when the father's NFOR is 1 and not affected by the mother's field-of-study relatedness.
This observation recurring in each column implies that the FSA calculated for each child-parent pair is strongly related to the matching parent's occupational relatedness but not to that of the other parent. If this positive relationship is statistically meaningful, which we investigate in the following sections, it also implies that FSA indices properly retrieve parental differences in assortativity.

Conceptual background
Although theoretical work incorporates the uncertainty in schooling decisions, earlier empirical studies assume that individuals are rational and use the achieved (observed) outcomes to infer decision rules. The recent literature shows that this is not a valid assumption and the difference between beliefs on choice-specific outcomes and their true population values is not trivial. A few recent studies (Wiswall and Zafar, 2015; Zafar, 2012, 2013; Arcidiacono et al., 2012; Stinebrickner and Stinebrickner, 2014) address this identification problem by directly eliciting subjective beliefs from a sample of university students. While the evidence in these studies reveals that subjective expectations on major-specific outcomes greatly vary across individuals, there is a lack of evidence as to why beliefs are so dispersed around the true population values.
In this study, we want to understand the role of parents' educational background in the process of expectation formation by looking at the assortative preferences that result from asymmetric information. The main driver of child i's attraction to major m revealed in his/her choice is the expected lifetime utility from the vector of future outcomes (Z) of a specific human capital endowment with the subjective joint probability distribution, G(Z|m,t), at time t, defined as follows:
$$EU_{i}(m, t) = \int U_{i}(Z)\, dG(Z \mid m, t) \qquad (6)$$
The concept of information entropy in computer science, introduced by Shannon (1948) and used in economics by Sims (2003), argues that people have limited information-processing capacity, which alters the information for each individual and thus differentiates their behaviors. The "fundamental problem of communication" is for the "receiver" (user of the information) to be able to identify what data were generated by the "source", based on the signal it receives through the (potentially noisy) "channel". Sims's "noisy information model" provides a convenient framework in our context because the information flow is modeled by the discrepancy in probability distributions of the same event at the source and the receiver.
In our context, parents serve as communication channels, not as the source of data, in transmitting publicly available information on choice-specific outcomes to their child, the "receiver". The parents' capacity (the level of complexity in their communication and the amount of time for them to convey the data) will be determined and bounded by their own majors. To understand the differences in this capacity and related entropy, one can imagine a biochemist father obtaining, carrying, and sustaining the information on possible outcomes of choosing nuclear physics or accounting as opposed to a father who is a nuclear physicist or an accountant.
The channel capacity, which reflects the information entropy on major m defined by the Kullback-Leibler ($D_{KL}$) divergence (the difference between the subjective and objective probability distributions of the future outcomes, Z), can be expressed as follows when the father's field of study (FOS) is set to m: where superscripts F and M denote father and mother, respectively. Equation (7) implies that when $FOS^F = m$, the expected level of information received by the child on major m is equal to an index number, g, the father's level of information-processing capacity on his own major plus how compatible the mother's major is with the father's major ($FSM^M$), and the degree of relatedness between the parents' fields of study and their occupations (FOR). The key element in this expression is $FSM^M$, which reflects the degree of relatedness (normalized between 0 and 1) between the fields of study of the parents. Suppose that the mother's major is the least-related major to her husband's major ($FSM^M = 0$). It implies that she is not a "high capacity" channel for the information on major m but becomes one on her own major. Hence, a higher degree of field-of-study resemblance between parents makes them more efficient channels (less noisy) for more reliable information on major m by decreasing information entropy. However, a greater homogamy also means that parents become less efficient channels for other majors, with rising relative entropy. Therefore, the level of FSH defines the level of information asymmetry in a family. 9
9 It could be argued that parents are not the only channels in accessing the information on majors. We assume that the information obtained from all other channels (child's peers, counsellors in his/her school, his/her close relatives, and the parents of his/her best friends) that a child would receive would be filtered through parents. This assumption is in line with the evidence that parental approval is the most important factor in the choice of major (Zafar, 2012). However, this assumption is not required in our empirical setting, as will be evident later.
This example becomes less intuitive when we compare 2 cases where the father is an accountant in both cases but the mother is a biochemist in the first case and a historian in the second. How different would the parents be in terms of channeling reliable information on accounting? Although the values of $FSM^M$ would be different, it appears that these 2 cases should be the same in terms of available information on accounting, especially relative to the case where both parents are accountants. However, one has to think that the information entropy on a major in a family will be determined by not only the fact that it is the major of one of the parents, but also how much the major (accounting) is appreciated, shared, understood, and discussed within the family, which is collectively reflected in $FSM^M$. 10 The details of the conceptual framework summarized here can be found in the Appendix.
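To make the notion of relative entropy concrete, the sketch below computes a Kullback-Leibler divergence between an objective outcome distribution and two subjective ones. All probabilities are hypothetical, and the direction of the divergence (objective relative to subjective) is an assumption, since the text describes it only as the difference between the two distributions.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions on the same outcomes.
    Terms with p_k = 0 contribute nothing, by the usual convention."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


# Hypothetical distribution over 3 outcome bands for one major.
objective = [0.2, 0.5, 0.3]

# Beliefs conveyed by a parent trained in that major (low-noise channel)
# versus a parent trained in an unrelated major (noisy channel).
belief_related_parent = [0.25, 0.45, 0.30]
belief_unrelated_parent = [0.60, 0.30, 0.10]

entropy_related = kl_divergence(objective, belief_related_parent)
entropy_unrelated = kl_divergence(objective, belief_unrelated_parent)
```

In this toy case the divergence is much smaller for the related parent, which is the sense in which a parent is described as a less noisy channel for information on his or her own major.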

Empirical framework
The key challenge in understanding the potential contribution of information asymmetry to the observed assortative patterns is to control for other characteristics that are not observed by the researcher but aggregated in FSA. To address this issue, we use a conventional intergenerational transmission framework, wherein we define quasi-likelihood functions with the response variables that take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of the spousal "appreciation" of each partner's major and field-of-study relatedness. Intergenerational transmission refers to a process that outlines the transfer of individual characteristics, including abilities, preferences, and outcomes, from parents to their children, which we choose as our empirical framework. Although it cannot answer whether more educated parents have more educated children because of their education, the intergenerational elasticity of schooling is a fundamental metric that has been used to measure the mobility across generations. 12 Inspired by this literature, we propose a different identification strategy and start with 4 reduced-form matching functions that use the child's assortative preferences aggregated in FSA as an outcome of transmission, a process that is built on available information based on the parents' educational background.
10 It is true that more and better information on a major would not necessarily make it more attractive.
11 Holmlund et al. (2011) investigate the findings of a large number of studies to answer the following question: do more educated parents have more educated children because of their education? They show that the evidence is inconsistent across the other strategies (twins, adoptions, and Instrumental Variables models) and they could also encounter problems in obtaining bias-free estimates of causal intergenerational coefficients.
where superscripts M, F, S, and D denote mother, father, son, and daughter, respectively. With the normalized FSH (NFSH) and FOR (NFOR), these equations reflect the idea that a child's assortative tendencies observed in his/her choice of major are related to the FSH and the degree of relatedness between each parent's field of study and occupation within a family. 13 As long as a higher homogamy (and occupational match) suggests a greater limitation in available information on alternative majors, the coefficients of NFSH (NFOR) capture the underlying field-of-study transmission that relates the children's assortative preferences to the level of information asymmetry. The variable $NFSH^M$ in Equation (9), for instance, is bounded between 0 and 1. It reflects a perfect homogamy as it approaches 1. Intuitively, the $a_1$ coefficient reveals how much the son's preference for his father's major will be affected by the extent to which his mother's field of study becomes comparable. This reminds us of the earlier example: how much the son's aspiration for his father's major, accounting, will be affected if his mother was a biochemist instead of an accountant. Similarly, a positive and significant coefficient of NFOR validates the transmission as the parents would be more reliable transmitters of information when they work in their trained jobs. Hence, the presence of intergenerational transmission requires that the coefficients of NFSH and NFOR in those 4 equations should be positive, with dissimilarities reflecting the difference between maternal and paternal influences.
Yet, the identification of transmission due to information asymmetry across alternative majors requires controlling for ability sorting and unobserved heterogeneity. Defining each child's FSA separately for each parent provides an opportunity to create a setting similar to panel models. Since we observe 2 matches for each child, when we take the difference between them, the dependent variables in these matching functions better reflect the assortative tendency because the omitted heterogeneity across children is differenced out from the equations as shown below.
Corak (2001, 2017).
13 Given the parent's major, the FSA reflects the child's decision on a major that maximizes his/her expected utility. The theoretical foundation of this decision-making process is well-defined in the literature (Altonji et al., 2015). For now, we omit other child, parent, and family-specific attributes in Equations (9)
A similar identification method is also recognized and applied by Diamond and Agarwal (2016) by using the repeated measurements made available when each agent on one side of the market is matched to at least 2 agents on the other side. The intuition is that the same value of the unobservable characteristic of an agent determines multiple matches of that agent and can be differenced out in a measurement error model (Hu and Schennach, 2008). 14 Unlike in other matching markets, this is particularly effective in our case because the assortativity revealed by FSA reflects only the child's preferences defined over children, not over parents in matches.
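The logic of differencing across the 2 matches of the same child can be illustrated with a small simulation. This is a toy linear setting under stated assumptions, not the paper's fractional quasi-likelihood model: the coefficients `a1` and `b1`, the single covariate `x`, and the unobserved child trait `c` (deliberately correlated with `x`) are all hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(0)
a1, b1 = 0.30, 0.10  # hypothetical effects in the father and mother equations

x, y_father, y_mother = [], [], []
for _ in range(2000):
    xi = rng.random()                   # family-level covariate
    ci = 0.5 * xi + rng.gauss(0, 0.1)   # unobserved child trait, correlated with x
    x.append(xi)
    y_father.append(a1 * xi + ci)       # child's attraction to father's major
    y_mother.append(b1 * xi + ci)       # same child's attraction to mother's major

# Level regression: the child trait contaminates the estimate of a1.
level_slope = ols_slope(x, y_father)

# Within-child difference: the common trait cancels, leaving b1 - a1.
difference = [m - f for m, f in zip(y_mother, y_father)]
diff_slope = ols_slope(x, difference)
```

The level regression overstates `a1` because the omitted trait moves with `x`, whereas the differenced regression recovers `b1 - a1`. The cancellation relies on the trait entering both equations identically, which mirrors the assumption the within-family specification requires.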
These equations with within-parents differencing suggest that the difference between $FSA_{M,S}$ and $FSA_{F,S}$, for instance, should be smaller when $NFSH_M$ decreases, holding other covariates constant. Intuitively, if the mother married to an accountant holds a degree in biochemistry, $NFSH_M$ approaches its lower limit. 15 As the mother becomes another channel of information on an alternative major, viz., biochemistry, the family's information boundaries expand. Compared with the case in which the mother is an accountant, this increase in the level of available information in turn reduces the son's bias toward his father's major, i.e., accounting. Therefore, $FSA_{F,S}$ (the son's attraction to his father's major) should be smaller when $NFSH_M$ (the resemblance of the mother's major to her husband's, measured by spousal differences in the appeal of their majors) is lower. Hence, the differences between $w_1$ and $w_2$, as well as between $s_1$ and $s_2$, will provide information about the difference in transmission between fathers and mothers. However, the value (and the volume) of the available information provided by the homogamy measures in the family depends on whether the parents work in related occupations. This can be better understood if we change the accountant-biochemist example to one where the father works as a chartered accountant while the biochemist mother works as a branch manager in a bank, which diminishes the value of the mother's information on biochemistry. Since the parents would be a better channel of information conditional on the quality of their occupational match, an increasing $NFOR_M$ in Equation (13)

As outlined earlier, in addition to the level of information asymmetry built on the parents' fields of study, children's assortative preferences could also reflect ability sorting.
The suggested within-family specifications can address this identification problem conditional on the assumption that the effects of unobserved parental traits in Equations (9) and (10), as well as in Equations (11) and (12), are the same, so that they cancel out in the differenced equations. In these differenced equations, $w_3 = (\beta_3 - a_3)$, $w_4 = (\beta_4 - a_4)$, $w_5 = (\beta_5 - a_5)$, and $w_6 = (\beta_6 - a_6)$. When estimated by ordinary least squares (OLS), identification of $w_2$ ($w_1$) requires either that $NFSH_M$ ($NFSH_F$) is independent of unobserved parental traits or that $w_3$, $w_4$, $w_5$, and $w_6$ are 0, as shown below.
The coefficients of the interaction terms will reveal the differences in the sons' assortative preferences in STEM families. With the within-family specification, 2 factors will shrink the bias on these coefficients: first, the differential effects of unobservables, $w_8 = (\beta_3 - a_3)$ and $w_{10} = (\beta_5 - a_5)$ in Equation (17), will be smaller in size than their levels in specifications (9)-(12); and second, $cov(STEM \times NFSH_M, h_F)$ and $cov(STEM \times NFSH_M, h_M)$ will be close to 0 for a subsample as specified by Equation (17). The definition of the bias in the estimate of $w_5$, for instance, can be expressed as follows:

Hence, the size and the significance of the coefficients $w_5$ and $w_6$ will reveal whether $cov(NFSH_M, h_F)$ and $cov(NFSH_M, h_M)$ can reasonably be assumed to be 0. The next section will provide the results.

16 While we could observe a high NFSH for engineers and historians, they would have different mathematical skill endowments.

Without within-family differencing
We start with the 4 equations from Equation (9) to Equation (12). To reduce the unobserved heterogeneity across families, we expand the equations by controlling for household income, provincial fixed effects, first spoken official language, household size, and whether the family resides in an urban or rural area. We also control for homogamy in terms of parents' highest educational degree. 17 After these additions, Table 5 reports 2 sets of estimation results for the selected variables. 18 The first 4 columns report the estimation results that include NFSH for each parent without accounting for parents' occupational relatedness. In the last 4 columns, we control for FSH as a binary variable (1 if both parents have the same field of study, and 0 otherwise) and add FOR, for both father and mother, as a binary variable, FORC, which is equal to 1 if the normalized FOR is <0.2 and 0 otherwise. The first 4 specifications use larger subsamples because they exclude FOR, which can be identified only if the person's occupation is known.
The results reported in Table 5 are informative as they reflect the maternal and paternal differences in children's assortative preferences in choosing majors. The robust and positive NFSH coefficients provide evidence for the existence of what we call intergenerational transmission of field of study. As outlined before, the results reflect the combination of ability sorting, differences in parenting skills, and unobserved heterogeneity in individual and family characteristics, in addition to the limited information accessibility constrained by the parents' fields of study. The first 2 columns show that the son's attraction to his parents' majors is strongly related to the FSH, measured by spousal "appreciation" of each parent's major. A comparison of the coefficients (0.10 and 0.05) indicates that the paternal influence is more dominant in educational transmission for sons. A similar gap is not observed for daughters reported in the third and fourth columns. The robust NFSH coefficients still suggest that daughters will also be attracted to their parents' field of study, yet mothers have more influence on daughters.
In the last 4 specifications, we distinguish the parents who have the same field of study and control for their occupational match. The results are consistent with those of the first 4 specifications. The effect of having homogamous parents on the son's attraction to his father's major (0.055) is much higher than the effect on his attraction to his mother's field of study (0.002). Again, the same significant but smaller difference can be observed for daughters. The second channel to identify the transmission is the relatedness of parents' field of study to their occupation, which is controlled for by FORC in the last 4 estimations. The results confirm a strong and positive relationship between the parents' occupational match and the children's attraction to their parents' majors. The parental difference in this effect is also noticeable and in line with the earlier findings with FSH: the paternal effect is greater than the maternal influence for sons, while the same difference is less pronounced for daughters. When it comes to other factors, a higher homogamy in terms of educational degree (EDH) is positively and significantly associated with FSA. Similarly, a higher household income has a positive effect on FSA. Among the other variables not reported in Table 5, only the urban-rural distinction in households' location is significant. Children from families in larger cities exhibit higher FSA. 19

19 To test the robustness of the results in Table 5, we also used different levels of the CIP and occupation classifications available in the 2011 NHS. The results are not sensitive to using larger or smaller dimensions of match tables.

Table 5. Observations, columns (1)-(8): 23,748; 23,748; 22,254; 22,254; 21,101; 20,016; 20,195; 19,159

Notes: (1) Dependent variables are indicated in each column's heading.
(2) Standard errors reported under the coefficients are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels.
(3) EDH reflects education-degree homogamy and is a continuous variable normalized between 0 and 1. HH Income is the annual disposable income for the household. Other variables that are not reported in the table control for household size, first spoken official language, whether the family is in a rural area, and provincial fixed effects.
(4) We also ran the regressions with and without the parental age variables. The results are insensitive to the inclusion of parental age variables.
(5) When we control for field-of-study fixed effects, the results do not change significantly.
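For readers unfamiliar with the two-way clustering mentioned in note (2), the following is a minimal NumPy sketch of the usual construction: the variance clustered on the first dimension, plus the variance clustered on the second, minus the variance clustered on their intersection. It is not the authors' code; the function names are hypothetical, no small-sample corrections are applied, and the toy data at the end are invented purely for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients, residuals, and (X'X)^{-1}."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    return beta, y - X @ beta, bread

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)'."""
    groups = np.asarray(groups)
    scores = X * resid[:, None]          # row i is x_i * u_i
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        s = scores[groups == g].sum(axis=0)
        meat += np.outer(s, s)
    return meat

def two_way_cluster_cov(X, y, g1, g2):
    """Two-way clustered covariance: V(g1) + V(g2) - V(g1 and g2 jointly)."""
    beta, resid, bread = ols_fit(X, y)
    # Label each distinct (g1, g2) pair to form the intersection clusters.
    seen = {}
    both = np.array([seen.setdefault((a, b), len(seen)) for a, b in zip(g1, g2)])
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, both))
    return beta, bread @ meat @ bread

# Toy data (invented): 6 observations, 3 "households", 2 "individual" groups.
X = np.column_stack([np.ones(6), np.arange(6.0)])
y = np.array([0.1, 1.2, 1.9, 3.3, 3.8, 5.2])
household = np.array([0, 0, 1, 1, 2, 2])
individual = np.array([0, 1, 0, 1, 0, 1])
beta_hat, cov_hat = two_way_cluster_cov(X, y, household, individual)
```

One known caveat of this construction is that the resulting matrix is not guaranteed to be positive semidefinite in finite samples, so applied work typically relies on an established implementation.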

Within-family differencing
We address the identification problem with within-family differencing, as described in the previous section, and report the results in Table 6.
The first 2 columns show the estimation results of Equation (13) with the same dependent variable, the difference in the son's attraction to his parents' majors. The first column reports the estimation results of the restricted version of Equation (13). The estimation results for daughters based on Equation (14) are reported in the last 2 columns. The restricted specifications in the first and the third columns use a new categorical variable, NFORC, which reflects the difference in NFOR in 3 categories. The base category refers to the case in which both parents have the same field-of-study relatedness: either both work in related jobs (i.e., NFOR is between 0.2 and 1.0 for both parents) or both work in unrelated jobs (i.e., NFOR is between 0.0 and 0.2 for both parents). The second category indicates that the father works in a matching occupation while the mother does not. The third category specifies the opposite situation. Hence, the effect of parental differences in field-of-study relatedness can be captured by the last 2 categories. 20,21

20 We define the base category with two opposite cases, either both parents work in related jobs or both work in unrelated jobs, because we want to estimate the effect of field-of-study relatedness for each parent. Given that the dependent variable is the difference in the child's attraction to each parent's major, this effect can only be captured when the parents' FOR is different.

21 The idea here is to identify the effect of FOR on the children's attraction to their parents' major when the parents work in an unrelated occupation. Therefore, we tried to find the lowest cutoff point that realistically classifies a person's job as completely unrelated to his/her training. We also applied higher thresholds up to 0.4. The results are still robust. This is mostly because the distribution of FOR is convex. Hence, increasing the threshold from 0.2 to 0.4 had a minor effect because relatively few people fall in the 0.2-0.4 bin.
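The three NFORC categories described above can be sketched as a small classification function. The function and label names are hypothetical, and treating a value of exactly 0.2 as "related" is an assumption made here for illustration (the text only states that FORC equals 1 when the normalized FOR is below 0.2).

```python
def is_related(nfor, cutoff=0.2):
    """True if a normalized FOR value counts as a related occupation.

    Assumption: values at or above the cutoff are treated as related.
    """
    if not 0.0 <= nfor <= 1.0:
        raise ValueError("normalized FOR must lie in [0, 1]")
    return nfor >= cutoff

def nforc_category(nfor_father, nfor_mother, cutoff=0.2):
    """Classify a family into one of the 3 NFORC categories."""
    father = is_related(nfor_father, cutoff)
    mother = is_related(nfor_mother, cutoff)
    if father == mother:
        # Both related or both unrelated: no parental difference to exploit.
        return "base"
    return "father_only" if father else "mother_only"

# Hypothetical families, using the 0.2 cutoff from the text.
examples = [
    nforc_category(0.9, 0.8),   # both related
    nforc_category(0.1, 0.05),  # both unrelated
    nforc_category(0.9, 0.1),   # only the father works in a matching job
    nforc_category(0.1, 0.9),   # only the mother works in a matching job
]
```

Passing a different `cutoff` (for example 0.4) corresponds to the robustness check with higher thresholds described in footnote 21.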
Notes: (1) Dependent variables are indicated in each column's heading.
(2) Standard errors reported under the coefficients are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels.

Since an increase in $NFSH_M$ has a positive impact on $FSA_{F,S}$, it reduces the distance between $FSA_{M,S}$ and $FSA_{F,S}$. When the mother's major becomes similar to the father's, $NFSH_M$ rises. Because higher homogamy implies a tighter constraint in the family in terms of available information on other majors, the son's bias toward his father's field of study rises. This is confirmed by the negative sign of the $NFSH_M$ coefficient. Equally, when $NFSH_F$ rises, the similarity between the parents' majors becomes higher. Constrained by less information being available regarding other majors, the son's attraction toward his mother's field of study increases. This is verified by the positive sign of the $NFSH_F$ coefficient: because a rise in $NFSH_F$ increases $FSA_{M,S}$, the distance between $FSA_{M,S}$ and $FSA_{F,S}$ becomes larger. The difference between these effects (0.101 and 0.162) again suggests that paternal influence is noticeably greater than maternal influence for sons. The same comparison for daughters in both specifications of Equation (14) does not offer the same evidence, which is also consistent with the relatively weaker effects for daughters reported in Table 5.
The existence of intergenerational transmission is also verified by the effect of the parents' field-of-study relatedness. In the first column, when evaluated against the base, the first non-base category (fathers work in their trained job but mothers do not) has a negative effect on $FSA_{M,S} - FSA_{F,S}$. Similarly, a significant positive effect is observed for the second non-base category, wherein the mother works in her trained job but the father does not. These results are also confirmed with the unrestricted specification reported in the second column. Now, using FORC, if the mother's major is not a good fit for her occupation, the negative coefficient

Within-family differencing among families with STEM majors
With within-family differencing as specified by Equations (13) and (14), the other factors, such as the effects of siblings, neighborhoods, and peers, either observed or unobserved, are differenced out in the estimations. Hence, the results deliver better evidence about the role of information constraint in children's assortative preferences. However, as outlined earlier, the success of this identification strategy is conditional on the extent to which the FSH is driven by the ability sorting in parents' marriage.
One way to address this problem is to use a subsample that includes only those families in which both parents hold at least a university degree in one of the STEM majors so that the difference in terms of their ability endowments would not be significant. Table 7 reports the estimations of the same specifications shown in the second and the last columns of Table 6 with STEM variables as expressed by Equation (17).
When one of the parents holds a degree in one of the non-STEM majors (or less than a bachelor's degree), the coefficients of $NFSH_M$ and $NFSH_F$ (-0.079 and 0.161) are almost identical to those reported in Table 6  homogamy and their assortative preferences in their choice of major.

Limitations
The total elasticity of the assortative preferences with respect to parental homogamy can be expressed for sons as the sum of the absolute values of the $NFSH_M$ and $NFSH_F$ coefficients, which is 0.24 (-0.079 and 0.161) from Table 7. This measure suggests an important role of information asymmetry in children's choice of major to the extent that the FSH reflects the level of constraint on the available information when children choose a major. It should be noted that the results reported here are conditional on several assumptions. First, although our sample, children living with their parents, is representative of the whole sample, there could still be a selection problem whereby children living with their parents may have different behavioral predispositions that affect their assortative preferences.

Notes: (1) Dependent variables are indicated in each column's heading.
(2) Standard errors are adjusted by using the two-way clustering method (Cameron et al., 2011) at the individual and household levels. Coef. = coefficient; Std. err. = standard error.
Second, our underlying model is static and uses data in which most children have completed their majors. The evidence in the literature is very clear that students update their beliefs in their first years of study and switch majors if the cost is bearable. We believe that using data on completed majors leads to a downward bias in our estimations.
Third, the constraint on available information in a family measured by the FSH would not necessarily imply a positive bias in children's choice toward their parents' majors. Although it is less likely, 2 accountant parents would not necessarily be in favor of their child taking up their major and may steer their children away from it. This possibility would also create a downward bias in our estimations.
Finally, as is very common in most empirical studies in the field of education economics, our attempt to remove a possible ability bias from our estimations has its own limits. We think that specifications that use within-family differencing and a proxy that groups families with similar ability endowments substantially shrink the bias. Still, within-family differencing may have some other complications in our estimations. For example, a son's attraction to both parents' majors may not be homogeneously comparable, when each parent's attraction to their own field of study strongly reflects their gender preferences. There is an extensive literature on gender differences in field-of-study preferences (Zafar, 2013). Our hypothesis in this study implies that, when his mother's field of study becomes distinct from his father's major, the son's attraction to his father's major will be affected negatively. This is because the field-of-study diversity in the family will expand the information boundaries and consequently his choice set on majors. This may not be true, i.e., he will not be less attracted to his father's major, if his mother's choice of field of study is strongly gender based. Our expectation is that a possible bias due to this issue would lead to underestimation of the true effect of information constraints.
With all these caveats, we still believe that the transmission coefficients provide very valuable information on the intergenerational field-of-study elasticity, which is the first in the literature, to the best of our knowledge.

Concluding remarks
The potential spillover effect of education is a fundamental public policy matter because it may lead to progressive skill stratifications and dispersed income distributions in every generation if ability sorting in mating and across generations is substantial. Most studies use years of schooling as the educational outcome for children, treating education as unidimensional. Yet, educational decisions are no longer just about the quantity but about the specialization to be pursued as well. This study quantifies assortative mating by estimating FSH and intergenerational transmission of skills by measuring assortative preferences in the choice of major. As uncertainty increases with the complexity of educational choices, misinformed decisions made by students in choosing their field of study or by administrators in allocating their limited resources across disciplines would curtail social and economic progress. This study's primary objective is to investigate children's attraction to their parents' field of study, reflected by assortative tendencies in child-parent matches as an outcome of information asymmetry.
To identify the role of information asymmetry in assortative patterns in each field-of-study match between parents and children, we define quasi-likelihood transmission functions, wherein the response variables take on fractional values of FSA between each child (son/daughter) and parent (father/mother) as a function of the spousal "appreciation" of each partner's field of study. We use the confidential major file of the 2011 NHS so that the size of the data and the availability of different levels of aggregation in the CIP allow us to develop 3 indicators: the degree of children's attraction to their parents' field of study (FSA), the degree of FSH, and the degree of relatedness between each parent's field of study and occupation (FOR).
Comparable to panel models, we define within-family transmission functions with 1-to-2 matches (1 for each parent). The results show that children's choice of field of study exhibits significant assortative preferences isolated from ability sorting and unobserved differences across majors and other family characteristics. We also find that the assortative tendency is the highest between fathers and sons relative to all other pairs, namely, father-daughter, mother-son, and mother-daughter. This evidence becomes even stronger when we use more disaggregated CIP codes and control for the educational degrees. With some caution, we attribute this persisting assortative tendency to the information asymmetry across alternative majors built on by parents' educational backgrounds within families.
Since $f_i$ represents the "true" (joint, marginal, or conditional) distribution of the data, differences in $D_{KL}$ across individuals can only be explained by varying $g_i$, different approximations of $f_i$ for any given value, $z$.
This example illustrates a more general point: the optimal prediction of $f_i$ with $g_i$ that minimizes the information loss depends on the information-processing capacity of each channel. Therefore, there must exist 1 optimal channel among the many options for each $f_i$ that maximizes the mutual information measured by $D_{KL}$. Yet, when these options are not readily and equally available, the optimization problem leads to suboptimal choices that fit the channel capacity. Sims (2003) suggests that the nature of the "noise" that quantifies the information-flow constraint in each channel does not need to be exogenous as in physics. Instead, the available information capacity delivers a model for the "noise" in many applications in economics.
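To make the information-loss comparison concrete, the sketch below computes the Kullback-Leibler divergence between a "true" discrete distribution and two approximations of it. The distributions are invented for illustration: `f_true` plays the role of $f_i$, while `g_informed` and `g_uninformed` are two hypothetical channels' approximations $g_i$.

```python
import math

def kl_divergence(f, g):
    """D_KL(f || g) in nats for two discrete distributions on the same support."""
    if len(f) != len(g):
        raise ValueError("f and g must have the same length")
    total = 0.0
    for p, q in zip(f, g):
        if p == 0:
            continue            # terms with p = 0 contribute nothing
        if q == 0:
            return math.inf     # g rules out an outcome that f allows
        total += p * math.log(p / q)
    return total

# Hypothetical distribution of a major-specific outcome over 3 states.
f_true = [0.5, 0.3, 0.2]
g_informed = [0.45, 0.35, 0.20]   # channel with good information on this major
g_uninformed = [1 / 3, 1 / 3, 1 / 3]  # channel with no specific information

loss_informed = kl_divergence(f_true, g_informed)
loss_uninformed = kl_divergence(f_true, g_uninformed)
```

Under these made-up numbers the better-informed channel produces the smaller divergence, which is the sense in which differences in $D_{KL}$ across individuals trace back to differences in $g_i$.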
In our context, parents serve as communication channels, not as the source of data, in transmitting publicly available information on choice-specific outcomes to their child, the "receiver". The information on choice-specific outcomes is not generated by the parents. These outcomes are random variables, and the information on them is publicly available. The parents' capacity can also be defined by their ability to reach out to available sources. Sims (2003) calls this concept "individuals' information-processing constraints".

i and may affect their capacity on major m, are included in vector s. For example, even if both parents are teachers (major k), the parents' friends (or close relatives) who are dentists (major m) would reinforce the parents' capacity and reduce the entropy in transmitting the information on major m. Although modeling information flows is a complex task, one simple way to approximate the above expression is to define it as conditional on one parent's major.
Thus, assuming linearity in f(.), the level of information-processing constraint can be expressed as Equation (7) in the text.