The Early Career Dynamics of Informality and Underemployment: Evidence from the Arab Republic of Egypt

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


Policy Research Working Paper 10499
This paper studies the early career dynamics of employment and formality using data from the Arab Republic of Egypt. With 14 rounds of Egypt's labor force surveys, several measures of informality and underemployment are constructed to examine how the labor market conditions faced by young men when they exit school shape their future employment trajectories. Employment outcomes at different levels of potential experience are linked to cross-cohort, cross-regional and cross-schooling level variation in labor market conditions at graduation to achieve identification. The results show that cohorts of Egyptian men who enter a labor market in which employment rates are high (low) are only better (worse) off for a few years. These fast mean reversals stand in contrast to the typical finding from rich countries that scarring effects persist through the first decade of a worker's career.
This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at cjoubert@worldbank.org.

Introduction
Informality and under-employment are pervasive and associated with stagnant productivity in low- and middle-income countries (Busso et al. (2012)). Informality rates have not declined significantly in most countries despite a general trend towards reducing payroll tax wedges in formal employment, and even in the presence of high economic growth (La Porta and Shleifer (2014)). One reason for persistent informality is that the formal private sector has struggled to absorb bulging cohorts of youth exiting school, a problem that is likely to persist given that the median population age is lower than the typical age of labor market entry in many cases.
This paper studies the extent to which the labor market opportunities faced by young men entering the labor market shape their access to formal employment and broader employment trajectories. It uses data from 14 rounds of the Arab Republic of Egypt's labor force survey that record employment outcomes for individuals in successive graduation cohorts. The dataset contains information on multiple measures and proxies for formal vs. informal employment and under-employment, such as working without social security coverage, being employed full-time, working without a contract, working for a wage, working for the public sector, working in agriculture, and the size of the employing establishment (number of employees). We exploit variation across cohorts, regions, and schooling levels in the labor market conditions faced by an individual upon graduating to achieve identification, following Schwandt and von Wachter (2019).
Middle-Eastern and North African countries ("MENA region"), and Egypt in particular, exemplify the "youth bulge" phenomenon. [1] Public employment historically absorbed labor market slack, creating distortions in the supply of skills by encouraging credentialism and overeducation. [2] Youth labor markets are thus characterized by high levels of informality, unemployment, or underemployment while queuing for public jobs (Assaad and Krafft 2015). In this context, the opportunities of the successive cohorts entering the labor market can vary substantially depending on shocks affecting private but also public labor demand as governments' fiscal spaces are often directly connected to commodity prices through price subsidies (Ghoneim 2015).
[1] See Assaad and Roudi-Fahimi (2007), Saliola (2022), and projections by Assaad (2022).
[2] El-Kogali and Krafft (2019).
Studies on the effect of graduating in a recession in developed countries (Kahn (2010a), Oreopoulos et al. (2012), Altonji et al. (2016)) suggest that labor market conditions immediately following graduation can have long-lasting effects on the future earnings and employment prospects of young workers. These studies exploit fluctuations in the labor market conditions faced at labor market entry by successive cohorts of graduates for identification. This paper draws on this identification strategy to pin down the impact of shocks to labor market opportunities at the time of labor market entry on the subsequent outcome trajectories of young Egyptian males. It contributes to the understanding of career dynamics and life trajectories in the MENA region: the literature has documented declining probabilities of transitioning from school to "good" jobs (Assaad et al. (2019), Assaad and Krafft (2021)) and analyzed the determinants of unemployment dynamics using survival analysis models (Assaad et al. (2016)), but the causal effect of initial labor market conditions on subsequent employment and life outcomes in the region has not been measured.
We also contribute to the broader literature on informality dynamics. Cross-sectional characteristics of informal employment are well-documented: it is associated with low wages, low productivity, smaller establishments, self-employment, and lack of social protection coverage. A strand of the literature considers macro-dynamics, e.g. how informality co-moves with the business cycle (Loayza and Rigolini (2006), Bosch and Maloney (2008)), but the micro-dynamics of informality are the object of much less research. Existing studies have described short-term transitions in informality status and informal wages (e.g. Bosch and Maloney (2010), Pratap and Quintin (2006), Bargain and Kwenda (2010)). Few papers focus on career or life-cycle dynamics beyond basic descriptions (see Gatti et al. (2014) for MENA countries including Egypt), even though these dynamics may determine the investment in and accumulation of skills, or the design of social protection programs (Joubert (2015)).
Life cycle dynamics can also shed light on the nature of informality. A simple model in which individuals randomly search for a rationed number of highly-valued formal jobs would predict that rates of formality increase with age as individuals repeatedly attempt to access good jobs and never leave them. In such a model, the age-formality gradient reflects the arrival rate of formal jobs, i.e. the degree of rationing. In contrast, other theories of informal employment that emphasize comparative advantage or nonpecuniary benefits do not structurally produce life cycle dynamics (Clark et al. (2017)).
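To make the rationing logic concrete, consider a stylized version of this search model. Assume, for illustration only, that time is discrete, that every worker starts in informal employment, and that in each period an informal worker receives a formal job offer with a constant probability $\lambda$ (a symbol introduced here, not taken from the original text). Because formal jobs are never left, the share $F(a)$ of a cohort in formal employment after $a$ periods and its period-to-period change would be:

```latex
\begin{align}
F(a) &= 1 - (1-\lambda)^{a}, \\
F(a+1) - F(a) &= \lambda\,(1-\lambda)^{a}.
\end{align}
```

Under these assumptions the formality share rises monotonically with age, and the slope at labor market entry, $F(1) - F(0) = \lambda$, equals the arrival rate. This is one way to read the statement that the age-formality gradient reflects the degree of rationing.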
We find that cohorts of Egyptian men who exit school when labor market conditions are bad do not exhibit the same persistent scarring effects identified in rich countries (e.g. Kahn (2010b), Oreopoulos et al. (2012)). Instead, they are worse off for at most a few years and only in terms of the quantity but not the quality of employment. More puzzling, good initial conditions in fact are associated with negative outcomes after 5 or 6 years of potential experience. For example, those who faced a better labor market when they exited school (as measured by high employment rates) are less likely to be employed, to have social security coverage, to have a full time job, to have a contract, and to be wage workers, and they work fewer hours, especially towards the end of their first decade of labor market experience.
We hypothesize that these results reflect that cyclical variation in labor market conditions dominates scarring effects. Cohorts that enter a bad labor market end up profiting from eventual economic recoveries, and are not strongly penalized by having a bad start. The negative correlation between initial employment rates and eventual employment quality could also be consistent with a model of queuing in which some cohorts forgo initial employment opportunities in order to secure rationed quality jobs down the road (and vice versa). We conclude that data sets spanning longer periods are needed to confirm either hypothesis.

Measures of informality
Three measures of informal employment are most often used in the literature (see Wahba and Assaad (2017)). One is the type of contract (officially written or not), another is whether the individual has social security coverage or not, and the last one is the size of the employing firm or establishment (firms with fewer than 5 employees are often assumed to be informal). In addition to these measures of informality, we also consider other aspects of employment quality to describe post-graduation employment trajectories: wages, full-time employment, and whether the individual is a wage worker, is employed in the public sector, or is employed in the agriculture sector. To complete the picture, we also consider the quantity of employment, namely whether an individual is employed and the number of hours worked per week.
We use Labor Force Surveys (LFS) between 2006 and 2019 to construct the employment outcomes. [3] Labor Force Surveys were first conducted in Egypt in 1957 by the Central Agency for Public Mobilization and Statistics (CAPMAS), on a quarterly, biannual, and annual basis. In 2006, CAPMAS improved the questionnaire by re-ordering existing questions and creating new ones to better capture labor market conditions. [4] The Economic Research Forum (ERF) cleaned and harmonized the data as part of a project that started in 2009, in which major efforts were devoted to facilitating micro data access and use in Arab countries (OAMDIA (2018b)).
In our main model we use a labor market indicator (the regional rate of employment) that is constructed directly from the 2006 through 2019 LFS survey rounds. Therefore we must restrict the estimation sample to individuals who exited the schooling system during those years. The indicators used in our second model (youth employment rate, GDP growth, oil and wheat prices) are only available at the national level but cover 1990 onwards. In that case we can include all individuals who graduated after 1990. Besides restricting the sample to cohorts defined by the year of exiting the schooling system, we also restrict it to males because we want to abstract from the broader set of economic mechanisms that must be invoked to explain female employment trajectories.
Footnote (fragment): We check the robustness of our results to excluding these two rounds.

Early career dynamics of informality and underemployment
Our analysis of informality and under-employment organizes the data by graduation cohort, defined by the year of exit from the schooling system. We also use the term labor market entry or labor force entry, abstracting for simplicity from the possibility that young men may exit school and not look for work. The LFS do not record an individual's year of graduation; therefore we impute it based on the typical age of graduation for the schooling level attained by each individual and their age at the time of the survey. Specifically, we use 22 as the graduation age of a tertiary-educated individual, 18 for a secondary-educated individual, and 12 for a primary-educated individual. As discussed in section 3, while this procedure implies some loss of efficiency, it is usually applied in the literature (e.g. Kahn (2010a)), even when actual graduation ages are available, to remove a source of endogeneity whereby individuals time their entry in the labor market based on labor market conditions. We construct potential experience as the difference between the age at the time of the survey and the imputed age at the time of graduation.
Figure 1 plots the evolution of employment outcomes describing young men's situations in the labor market and access to higher quality jobs. These outcomes improve overall as time passes after graduation. Access to formal employment increases, as captured by the share of employment with contracts, with social security coverage, or full-time employment. The share of those who work in the agriculture sector (usually a low-paid sector) decreases.
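The imputation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the record layout, and the decision to return `None` for respondents younger than the typical graduation age are assumptions made here, and only the three graduation ages stated in the text are included.

```python
# Typical graduation ages by schooling level, as stated in the text.
GRADUATION_AGE = {"tertiary": 22, "secondary": 18, "primary": 12}


def impute_cohort(age, schooling, survey_year):
    """Return (imputed graduation year, potential experience) for one respondent."""
    grad_age = GRADUATION_AGE[schooling]
    # Potential experience: age at survey minus imputed age at graduation.
    potential_experience = age - grad_age
    if potential_experience < 0:
        # Assumption: respondents younger than the typical graduation age are
        # skipped; the text does not say how such cases are handled.
        return None
    graduation_year = survey_year - potential_experience
    return graduation_year, potential_experience


# Hypothetical respondents, for illustration only.
respondents = [
    {"age": 25, "schooling": "tertiary", "survey_year": 2012},
    {"age": 24, "schooling": "secondary", "survey_year": 2012},
    {"age": 20, "schooling": "primary", "survey_year": 2019},
]
for r in respondents:
    print(r, impute_cohort(r["age"], r["schooling"], r["survey_year"]))
```

Because the graduation year depends only on age and schooling level, two respondents with the same age and schooling in the same survey round always land in the same cohort, which is what removes the endogenous timing of labor market entry.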

Labor market conditions at graduation
Regional indicators of labor market conditions at labor market entry are also extracted from Egypt's Labor Force Survey using rounds 2006 through 2019. The 2006-2019 LFS rounds are therefore used both to describe the labor market outcomes of individuals in the years following their graduation (as described in subsection 2.1) and to assess the labor market situation at the time of graduation in the regional model (described in section 3).
At the national level, we use the following economic condition indicators at the time of graduation: GDP growth, oil prices, wheat prices and the youth employment rate. Table 3 shows the average of each indicator over the years 1990-2019. The table shows a high (34.76%) average unemployment rate for males aged between 15 and 24. Figure [...] (2015) and Assaad 2019). Better-off individuals can afford to remain unemployed rather than accepting a job that does not match their level of skills, a phenomenon known as "queuing". Therefore, we estimate our model using the overall employment rate, which includes unemployed individuals. We also produced results (available upon request) using other indicators of labor market opportunities, as suggested by Assaad (2019): hours worked, the share of workers with social security coverage, the share of workers with a contract (officially written or not), and the share of workers with full-time contracts.
To compute each regional labor market conditions indicator, we consider individuals aged 15 and older who are out of the schooling system. The indicator is obtained by averaging the corresponding individual measurement among observations in the same year, governorate of residence and education category. Since we do not have information on the governorate of residence at the time of graduation, we impute it using the governorate of residence at the time of the interview. Our data span 14 years and 29 governorates, and we construct four schooling categories, for a total of 1,475 combinations (excluding cells with fewer than 50 observations). Table 2 describes the variation in each labor market conditions indicator across the 1,475 data cells we have constructed.
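The cell construction described above amounts to a grouped average with a minimum cell size. The sketch below illustrates it for the employment rate; the field names, the sample records, and the use of an unweighted mean are assumptions made here (the text does not state whether survey weights are applied), and the input is assumed to be already restricted to individuals aged 15 and older who are out of school.

```python
from collections import defaultdict

MIN_CELL_SIZE = 50  # cells with fewer observations are excluded, as in the text


def cell_indicator(records, min_cell_size=MIN_CELL_SIZE):
    """Average a 0/1 outcome within year x governorate x schooling cells."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = (rec["year"], rec["governorate"], rec["schooling"])
        sums[key] += rec["employed"]
        counts[key] += 1
    # Keep only cells that meet the minimum number of observations.
    return {
        key: sums[key] / counts[key]
        for key in counts
        if counts[key] >= min_cell_size
    }


# Hypothetical data: one cell with 60 observations, one with only 10.
records = [
    {"year": 2010, "governorate": "Cairo", "schooling": "secondary",
     "employed": 1 if i < 45 else 0}
    for i in range(60)
] + [
    {"year": 2010, "governorate": "Giza", "schooling": "secondary", "employed": 1}
    for _ in range(10)
]
rates = cell_indicator(records)
print(rates)
```

In this example only the first cell survives the size filter, with an employment rate of 45/60. The same function could be reused for the other indicators mentioned in the text by swapping the averaged field.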

Empirical Strategy
The goal of our empirical analysis is to relate an employment outcome of interest $Y_{it}$ experienced by individual $i$ in year $t$ to the labor market conditions $LMC_{0i}$ they faced upon first entering the labor market. This individual model is aggregated at the region $r$ × schooling-level $s$ × graduation cohort $c$ × potential experience $e$ level, because we do not use independent variables that vary at a more disaggregated level. Variation in the outcome of interest is purged of average differences between regions, educational attainments, years, and potential experience using corresponding fixed effects. Cohort effects are not identified separately from year and potential experience, so only a linear cohort trend is included. This yields the regression model in equation 1, where $\bar{Y}_{scer}$ is one of the outcomes of interest (e.g. log of monthly wages or being employed in a full-time contract), $LMC_{scr0}$ is a measure of labor market conditions faced by a given cohort and schooling level in a given region when entering the labor market, and calendar time $t$ is a deterministic function of cohort $c$ and potential experience $e$. The main coefficient, $\beta_e$, is allowed to vary by potential experience, to measure how the whole early career trajectory is affected by initial labor market conditions.
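A specification consistent with the verbal description of equation 1 can be written as follows. This is a reconstruction under stated assumptions, not the authors' displayed equation: the fixed-effect symbols $\gamma_r$ (region), $\delta_s$ (schooling level), $\theta_t$ (calendar year) and $\phi_e$ (potential experience), the cohort trend coefficient $\lambda$, the error term $\varepsilon_{scer}$, and the identity $t = c + e$ are notation introduced here for illustration.

```latex
\begin{equation}
\bar{Y}_{scer} = \beta_e \, LMC_{scr0} + \gamma_r + \delta_s + \theta_t + \phi_e + \lambda\, c + \varepsilon_{scer},
\qquad t = c + e .
\end{equation}
```

In this form there is one coefficient $\beta_e$ for each level of potential experience, so the sequence of estimates traces out how the association with entry conditions evolves over the early career.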

Identification and interpretation
The inclusion of region, educational attainment, year, and potential experience fixed effects implies that we are measuring deviations from the average career path, net of average differences across regions and schooling levels, and net of average annual variation. Therefore, we interpret the main remaining variation in $LMC_{scr0}$ as arising from regional shocks to labor demand away from the country's average situation. The identifying assumption is that these shocks are uncorrelated with the characteristics of the corresponding cohorts graduating in each region.
The literature has identified two main threats to identification for this empirical strategy. [5] First, schooling decisions (and thus the timing of labor market entry) can respond to labor market conditions: studies have shown that college graduates delay graduation when the labor market is bad. Those who decide to enter the labor market in a crisis could be different from those who delay their entry, generating selection bias. To deal with this issue, predicted labor market entry based on age and schooling level will be used instead of actual labor market entry as an alternative specification. Second, some of the measures of labor market conditions we use as independent variables (e.g. employment and unemployment rates) are aggregates of the labor supply decisions of young workers upon graduation. This creates a mechanical correlation between employment outcomes and our measure of labor market conditions at 0 years of potential experience. In those cases, the subsequent career profile (potential experience > 0) is still interpretable as capturing the persistence of initial employment conditions on future career outcomes.
[5] See a discussion in Oreopoulos et al. (2012).
Notice also that we do not include the current labor market conditions in the regressions (only labor market conditions at the time an individual graduated); therefore, the regression coefficients should be interpreted as the effect of graduating in a recession, say, on the employment trajectory given how labor market conditions tend to evolve after recessions (rather than the pure effect of one year of recession at graduation followed by normal conditions) (Oreopoulos et al. (2012)).
We first use as our independent variable the aggregate employment rates that vary at the governorate × year level ("regional model"). Because this measure is calculated using LFS data spanning 2006-2019, it is only available for cohorts that graduated in those years.
To complement these results we consider other indicators of labor market conditions that are available from 1990 onwards but vary only by year ("national model"). In the national model, the identification exploits differences in the career profiles exhibited by successive cohorts. It cannot be ruled out that a spurious correlation arises where differences between cohorts in, say, social norms, happen to be correlated with the evolution of labor market conditions.

Results
We present results from estimating equation 1 on the sample described in section 2 in tables 4 through 9 and appendix tables A1 through A6. The tables report the coefficients that multiply the interaction between measures of labor market conditions at labor market entry and years of potential experience. Coefficients on the other control variables, such as the sets of dummies, are available from the authors upon request. Our main set of results exploits regional variation in employment rates ("regional model") between 2006 and 2019. We assess their robustness by considering a broader set of measures of macroeconomic conditions over a longer time period that vary only at the national level ("national model").
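The interaction terms behind the reported coefficients can be illustrated with a short sketch. For each cell, the entry-level labor market measure is placed in the column that matches the cell's years of potential experience, so that a regression on these columns yields one coefficient per experience level. The function name and the example values are hypothetical; the top-coding of potential experience at 10 follows the paper's own footnote.

```python
MAX_EXPERIENCE = 10  # potential experience is top coded at 10 in the paper


def interaction_row(lmc_at_entry, potential_experience, max_exp=MAX_EXPERIENCE):
    """Return the regressors LMC x 1[e = k] for k = 0..max_exp for one cell."""
    e = min(potential_experience, max_exp)  # apply the top code
    return [lmc_at_entry if k == e else 0.0 for k in range(max_exp + 1)]


# Hypothetical cell: employment rate of 0.62 at entry, observed 3 years later.
row = interaction_row(0.62, 3)
print(row)

# A cell observed 12 years after entry falls into the top-coded column.
print(interaction_row(0.62, 12))
```

Each row has exactly one non-zero entry, so the coefficient on column k is identified only from cells observed at k years of potential experience. This is why estimates at higher experience levels rest on fewer cohorts.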

Regional model
Tables 4 and 5 present results using regional employment rates as our independent variable.
Table notes: (iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).
The results tables show the coefficient estimates associated with the interaction between initial labor market conditions and years of potential experience from 0 to 10. [6] The [...] More puzzling, the coefficients become significantly negative after year 6 in the career profile. These estimates imply that individuals who graduate in a bad labor market (i.e. characterized by low employment rates) end up exhibiting higher employment rates 6 years into their careers than those who started in a better labor market. As noted in Oreopoulos et al. (2012), these coefficients capture the effects of the initial employment conditions augmented by the regular evolution of employment conditions faced afterwards. Therefore, the first possible interpretation of these results is that cyclical variation in labor market conditions eventually dominates scarring effects. Cohorts that enter a bad labor market end up profiting from eventual economic recoveries (and vice-versa), and are not strongly penalized by having a bad start. Indeed, the youth employment rates and GDP growth cycles depicted in Figure 2 exhibit peak-to-trough durations of around 6 years.
[6] Potential experience is top coded at 10. Only individuals with 15 years of potential labor market experience are included in the sample.
Note also that, due to the structure of our data, the coefficients on higher years of potential experience rely on fewer cohorts for identification. Therefore, our results may reflect the specific circumstances of cohorts that graduated between 2006 and 2010. These cohorts contribute the most to the coefficients for higher numbers of years of potential experience. Indeed, our national model (see section 4.2), which leverages information from many more cohorts (graduation years 1996-2019), does not exhibit the same consistently negative coefficients at higher years of potential experience.
Conditional on being employed, we now examine the fraction of those employed in "good" jobs, as measured by the outcomes listed earlier (tables 4 and 5). In the first few years following labor market entry, if the prevailing employment rate was high, individuals are more likely to have a full-time job, to work more hours, and to be wage workers, but also more likely to be employed in agriculture. The corresponding coefficients on our classic measures of informality such as the existence of a contract or social security coverage are also positive but not significant. [7] Starting in the 6th year of potential experience, these variables exhibit the same sign reversal described earlier when discussing employment rates. Those who faced higher employment rates when they exited school are less likely to have social security coverage, be employed full time, have a contract, be wage workers, and, marginally, be employed in the public sector. These patterns suggest that scarring effects in Egypt are short-lived in terms of job quality and are dominated by the cyclicality of labor market conditions.
It is worth mentioning that these patterns in employment quality could also be generated by a model of queuing in which applying for a good job restricts contemporaneous employment opportunities. Fluctuating initial employment opportunities would affect the opportunity cost of queuing and lead some cohorts to queue for quality jobs less than others. These cohorts would then exhibit both higher initial employment rates and lower employment quality down the road. However, since the same patterns apply to the quantity as well as the quality of employment, we favor the cyclicality hypothesis outlined above.

National model
In this section we assess the robustness of our main results using a set of additional proxies for initial labor market conditions that span a longer time frame but are only available at the national level. These include the youth unemployment rate, the GDP growth rate, and the logarithms of oil and wheat prices. Each of tables 6 through 9 shows regression estimates for a representative selection of outcomes, each column corresponding to one of the four labor market condition proxies. Results for other outcomes are relegated to the appendix for legibility. In the national model, we do not exploit differences in labor market conditions across regions, but instead rely on differences across the years in which successive cohorts enter the labor market. [8] The advantage of these national-level independent variables is that they are available from 1990 onwards. This allows us to include cohorts of workers that entered the labor market before 2006 in our sample.
The patterns in employment rates (table 6), and hours worked (table 7) are consistent with the results from the regional model. To the extent that coefficients are significant, they are positive for the first few career years, and negative afterwards (or the reverse when considering commodity prices).
Measures of employment quality such as social security coverage, full-time employment, having a contract, and public employment are for the most part negatively correlated with graduating in a period of high youth employment and GDP growth (table 8 and appendix tables A1, A2, and A3). These results are also consistent with those from the regional model for the second half of the career profiles. The two models diverge in that the regional model produced positive correlations for the first few years of work prior to the sign switching. Working for a wage also exhibits the same negative correlations in both the regional and national models for most of the potential experience levels (table A4). However, the national and regional results are at odds for the first two years and the last three years of potential experience. The size of the establishment, conditional on working, does not exhibit strong significant patterns in the national model (table A5). Working in agriculture (table A6) also does not exhibit strong correlations with either youth employment or GDP growth. However, it is negatively correlated with initially high oil and wheat prices for the first five years of potential experience and positively correlated thereafter.
The results on log-monthly wages (table 9), which are only defined for wage workers, are also broadly consistent with the overall pattern found in the regional-level model, in which good initial economic conditions are associated with positive outcomes early in the career, and negative outcomes down the road. This is reflected in the negative and significant coefficients attached to the interaction of potential experience levels above 5 and GDP growth and youth employment. Correspondingly, the coefficients for log-oil prices and log-wheat prices (which proxy bad labor market conditions) are negative.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).
(v) *, ** and *** refer to significance at the 10, 5 and 1% levels respectively.
Notes: (i) Hours worked per week for employed individuals are regressed in each column on country-level economic condition indicators at the year of graduation: youth employment rate (15-24 years old), oil price, wheat price and GDP growth; interacted with potential years of experience 0 to 10. Potential experience is captured by the difference between the individuals' age at the year of the survey and their age at the year of graduation.
(ii) Governorate and potential experience fixed effects are included in each regression. Controls for educational attainment are also included.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).

Conclusion
We find that cohorts of Egyptian men who exit school when labor market conditions are bad do not exhibit the same persistent scarring effects identified in rich countries (e.g. Kahn (2010b), Oreopoulos et al. (2012)). Instead, they are worse off for at most a few years and only in terms of the quantity but not the quality of employment. More puzzling, good initial conditions are in fact associated with negative outcomes after 5 or 6 years of potential experience. For example, those who faced a better labor market when they exited school (as measured by high employment rates) are less likely to be employed, to have social security coverage, to have a full time job, to have a contract, to be wage workers, and they work fewer hours, especially towards the end of their first decade of labor market experience.
We hypothesize that these results reflect that cyclical variation in labor market conditions dominates scarring effects. Cohorts that enter a bad labor market end up profiting from eventual economic recoveries, and are not strongly penalized by having a bad start. The negative correlation between initial employment rates and eventual employment quality could also be consistent with a model of queuing in which some cohorts forgo initial employment opportunities in order to secure rationed quality jobs down the road (and vice versa). Data sets spanning longer periods would be required in order to confirm our results on employment outcomes after year 6 and discriminate between these different hypotheses.

[...] Effects of Entering the Labor Market in a Recession in Large Cross-Sectional Data Sets," Journal of Labor Economics, January 2019, 37 (S1), S161-S198. Publisher: The University of Chicago Press.
(v) *, ** and *** refer to significance at the 10, 5 and 1% levels respectively.
Notes: (i) The dummy variable of being employed in the public sector or not is regressed in each column on country-level economic condition indicators at the year of graduation: youth employment rate (15-24 years old), oil price, wheat price and GDP growth; interacted with potential years of experience 0 to 10. Potential experience is captured by the difference between the individuals' age at the year of the survey and their age at the year of graduation.
(ii) Governorate and potential experience fixed effects are included in each regression. Controls for educational attainment are also included.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).
(v) *, ** and *** refer to significance at the 10, 5 and 1% levels respectively.
Notes: (i) The dummy variable of being a wage worker or not (self-employed or non-employed) is regressed in each column on country-level economic condition indicators at the year of graduation: youth employment rate (15-24 years old), oil price, wheat price and GDP growth; interacted with potential years of experience 0 to 10. Potential experience is captured by the difference between the individuals' age at the year of the survey and their age at the year of graduation.
(ii) Governorate and potential experience fixed effects are included in each regression. Controls for educational attainment are also included.
(iii) Sample of males who graduated between 1990-2019. Data are constructed from LFS 2006-2019, and the economic conditions indicators are extracted from World Bank data.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).
(v) *, ** and *** refer to significance at the 10, 5 and 1% levels respectively.
Notes: (i) The number of workers in the establishment of the individual's main job is regressed in each column on country-level economic condition indicators at the year of graduation: youth employment rate (15-24 years old), oil price, wheat price and GDP growth; interacted with potential years of experience 0 to 10. Potential experience is captured by the difference between the individuals' age at the year of the survey and their age at the year of graduation.
(ii) Governorate and potential experience fixed effects are included in each regression. Controls for educational attainment are also included.
(iii) Sample of males who graduated between 1990-2019. Data are constructed from LFS 2006-2019, and the economic conditions indicators are extracted from World Bank data.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).
(v) *, ** and *** refer to significance at the 10, 5 and 1% levels respectively.
Notes: (i) The dummy variable of being employed in the agricultural sector or not is regressed in each column on country-level economic condition indicators at the year of graduation: youth employment rate (15-24 years old), oil price, wheat price and GDP growth; interacted with potential years of experience 0 to 10. Potential experience is captured by the difference between the individuals' age at the year of the survey and their age at the year of graduation.
(ii) Governorate and potential experience fixed effects are included in each regression. Controls for educational attainment are also included.
(iii) Sample of males who graduated between 1990-2019. Data are constructed from LFS 2006-2019, and the economic conditions indicators are extracted from World Bank data.
(iv) Cluster-robust standard errors in parentheses (clustering on cohort and governorate of residence).