Promotions and Earnings – Gender or Merit? Evidence from Longitudinal Personnel Data

This study examines the determinants of promotions, performance evaluations and earnings using unique longitudinal data from the personnel records of a large university. The study focuses on the role of gender in remuneration using, first, information on the complexity ratings of job tasks to define promotions on job ladders and, second, information on objective individual productivity. The study finds that individual research productivity was an important determinant of promotions and earnings. The results indicate that gender has no effect on the probability of being promoted, conditional on productivity, nor does it play a role in the performance evaluation of employees. Furthermore, the results suggest that contemporaneous productivity measures provide a usable proxy for the past productivity of a worker.


Introduction
A significant shortcoming of the existing literature is the inability to control for actual individual performance differences: observed gender differences in average earnings and promotion rates do not necessarily arise from discriminatory behavior but may simply reflect gender differences in worker output. Consequently, estimates of gender gaps in earnings, promotions, or both, may be biased by the omission of performance variables or by the use of biased proxy variables for performance, most notably, subjective performance evaluation ratings. Imperfect information about the hierarchy of jobs presents another difficulty for the analysis of promotion outcomes, as an ambiguous ranking of job titles within an organizational hierarchy complicates the identification of promoted workers. This problem is particularly evident in multi-organizational studies, as the range of job titles and their hierarchy can vary widely across organizations.
We present new evidence on the relative contributions of worker output and gender to promotions and earnings using longitudinal personnel data from a large Finnish university. The data set contains unique information on worker-specific productivity, employee performance evaluations and detailed job task complexity ratings, allowing us to analyze the role of gender in earnings determination, performance evaluations and promotion decisions within well-defined job ladders while accounting for differences in actual individual productivity.

Motivation and Related Literature
To assess whether gender plays a role in remuneration and promotion decisions, it is essential to compare the (average) pay and promotion rates of identical men and women who are performing equally well. Observed pay and promotion differences between men and women of similar merit and qualifications are conventionally interpreted as evidence of discriminatory behavior by employers, but they can also be ascribed to other factors, such as employee differences in negotiation skills and willingness to ask for pay increases and promotions (e.g., Booth 2009).
In the absence of data on individual output, studies on pay differences and promotion decisions conventionally use human capital-related proxy variables (such as tenure and education level) to control for potential productivity differences among workers. Additionally, some studies have used supervisors' performance evaluation scores of employees to proxy for actual productivity (e.g., Bartel 1995; Flabbi and Ichino 2001; Pekkarinen and Vartiainen 2006; Pema and Mehay 2010). The problem with this approach is that performance evaluations may be biased measures of actual productivity (Waldman and Avolio 1986; Prendergast and Topel 1993), most notably because supervisors tend to give more lenient and compressed evaluation ratings when they know that the ratings are used for administrative purposes (Jawahar and Williams 1997; Moers 2005). Moreover, the use of subjective performance evaluations to account for individual productivity differences is particularly problematic in the analysis of gender biases in earnings and promotions, as gender may be a significant determinant of performance evaluation scores (Bartol 1999; Castilla 2012).
In addition to the lack of worker productivity data, promotion studies are further complicated by the problem of defining the hierarchy of jobs: due to the wide variety of job titles, it can be difficult to identify which job changes within organizations should be regarded as promotions. To define promotions on job ladders, studies have typically deduced the job hierarchy from combined information on job titles, job descriptions and transitions between job titles (e.g., Baker et al. 1994; Dohmen et al. 2004). Alternatively, some studies have determined promotions using questionnaire information on self-reported job changes at the same employer (e.g., Francesconi 2001; Booth et al. 2003). The latter approach is potentially problematic because, as noted by Pergamit and Veum (1999), promotions reported by employees are not always actual promotions but mere formal upgrades of the current position that do not involve changes in job duties.
One particular labor market of highly skilled workers, namely, the academic labor market, provides an ideal setting for an analysis of career outcomes for two reasons. First, academia has a well-defined hierarchy of jobs, thereby facilitating the identification of promoted workers. Second, data on academic employees frequently include detailed individual performance measures, such as research productivity and teaching merit (e.g., Toutkoushian 1998, 1999; Monks and Robinson 2000). Previous empirical evidence suggests that academia is not an exception in regard to gendered remuneration: results from various countries, including the US (Toutkoushian 1998), the UK (Blackaby et al. 2005), Canada (Warman et al. 2010) and Japan (Takahashi and Takahashi 2011), indicate that female academics earn less than male colleagues of comparable merit and productivity. Furthermore, gender pay inequality is evidently increased by gender-biased promotion procedures, as men in academia are more likely to be promoted than are women, even when conditioning on differences in individual qualifications and academic productivity (Ward 2001; Ginther and Hayes 2003). However, previous findings also illustrate that gender gaps in career outcomes are partly attributable to productivity differences, as the results show that the observed gender pay gap decreases when differences in academic achievements are considered (e.g., Barbezat 1991; Ransom and Megdal 1993).
The first contribution of our analysis is the use of detailed information on the complexity of job tasks to determine the hierarchy of jobs, allowing us to assess the roles of gender and productivity in promotions along well-defined job ladders. In contrast to some closely related promotion studies, including those by Pekkarinen and Vartiainen (2006), Van Herpen et al. (2006) and Kunze and Miller (2014), we use information on actual output rather than subjective performance evaluations to control for individual productivity differences. Furthermore, we contribute to the literature on employee performance appraisal by testing whether the gender gap in performance evaluations (Bartol 1999; Castilla 2012) is sensitive to the inclusion of variables measuring worker productivity. Finally, our results provide additional evidence of gender pay differences in formalized pay systems that partly tie compensation to worker performance. Such pay systems may reduce gender inequality in compensation for two reasons. First, the explicit guidelines of formalized wage systems may restrict supervisor discretion in pay and promotion decisions, leaving less room for gender discrimination (e.g., Elvira and Graham 2002). Second, performance-related compensation ought to limit the pay differences between male and female workers with similar outputs. However, the (indirect) empirical research on whether this is in fact the case is inconclusive: some findings suggest that the gender pay gap is smaller when workers are paid on the basis of output rather than on the time they spent working (Jirjahn and Stephan 2004; Petersen et al. 2007), while others indicate that the gap is more pronounced in pay-for-performance wage systems (De la Rica et al. 2010; Kangasniemi and Kauhanen 2013).
Our analysis employs a longitudinal data set drawn from the personnel records of a single university (the University of Jyvaskyla). In recent decades, a growing body of empirical literature has utilized personnel data from single firms and universities to analyze the determinants of different career outcomes (e.g., Baker et al. 1994; Flabbi and Ichino 2001; Ransom and Oaxaca 2005; Haeck and Verboven 2012; Kelchtermans and Veugelers 2013; Dohmen et al. 2014). Although results based on a single organization should be interpreted with some caution, there are several advantages of using such data to study earnings and promotion decisions. First, the data from personnel records are typically highly accurate and contain detailed information not available in customary survey and administrative data sets, including worker-specific productivity measures and a well-defined hierarchy of job titles. Second, personnel data allow us to analyze earnings and promotion decisions within an internal labor market with homogeneous personnel policies and uniform criteria for remuneration and career advancement. Third, as opposed to a multi-organizational study, we can ignore the effects of unobserved organization heterogeneity with respect to earnings. This is particularly important in analyses of gender pay gaps, as the available evidence shows that these gaps can vary considerably across organizations (e.g., Heinze and Wolf 2010).

Institutional Background
Finnish universities and the pay system for academic employees
The university system in Finland consists of ten multidisciplinary universities, two universities of technology, one university of the arts and one independent business school. The universities are administered by the state, and the majority of their funding comes from the state budget and other public sources. In recent years, and following international patterns (Vincent-Lancrin 2009), the allocation of public funding has become more closely tied to university-specific output; the level of state funding is mainly based on universities' teaching loads (the number of graduates and course credits) and research achievements (the number and quality of publications) and, to a lesser extent, on university-specific and strategic factors. Furthermore, project-based research funding from external sources has become an increasingly important component of university budgets over the past decades.
In 2014, there were 17,653 researchers and university instructors in Finnish universities (AFIEE 2014). Compared to other EU-27 countries, women are well represented in Finnish academia, with the share of women exceeding that of other countries at every level of the academic hierarchy (Figure 1); in 2012, 52% of the faculty was female in academic ranks typically held by recent PhD graduates (grade C) and more senior researchers (grade B). However, female researchers also seem to be underrepresented in top academic positions in Finland: the share of female professors (grade A), although high compared to other nations, was only 24%.
The university analyzed in this paper, the University of Jyvaskyla, is the sixth largest university in Finland based on student enrollment. The university includes seven faculties, each with a number of schools and disciplines: 1) education, 2) humanities, 3) information technology, 4) mathematics and science, 5) social sciences, 6) sport and health sciences and 7) business and economics. As illustrated in Table 1, the student and personnel characteristics of this university are comparable to those of other Finnish multidisciplinary universities. A distinguishing feature of the University of Jyvaskyla is the high representation of women in top academic ranks: the share of female professors (37%) exceeds the average of other universities (31%), partly reflecting differences in the disciplinary composition of Finnish universities.
Academic earnings are set by a collective bargaining agreement, which applies to all university employees. The pay system is uniform across all universities and relates remuneration to the complexity of job tasks and personal performance by decomposing monthly earnings into two main components, namely, a task-specific component and a performance component (see Table 10 in the Appendix). The task-specific component is based on job complexity (measured on 11 levels) and determines the minimum earnings level. The performance component is proportional to the task-specific component, varying from 0 to 46% depending on the employee's performance level (of 9 different levels). Additionally, employees can earn bonuses for supplementary assignments, such as administrative duties. In 2014, the average shares of the task-specific, performance and bonus components among full-time faculty members of Finnish universities were 79%, 19% and 2%, respectively, of total monthly earnings (AFIEE 2014).
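The pay decomposition described above can be summarized in one line. The notation is introduced here only for illustration and does not appear in the collective agreement: $T_j$ denotes the task-specific component at job complexity level $j$, $p_k$ the performance share attached to performance level $k$, and $B$ any bonus for supplementary assignments.

```latex
\[
E_{\text{monthly}} = T_j\,(1 + p_k) + B,
\qquad j \in \{1,\dots,11\},\quad k \in \{1,\dots,9\},\quad 0 \le p_k \le 0.46 .
\]
```

Under this reading, and with $B = 0$, an employee at the highest performance level earns $1.46\,T_j$, so the performance component can account for at most $0.46/1.46 \approx 32\%$ of monthly earnings. As a rough ratio of averages, the 2014 shares of 79% and 19% would correspond to a performance share of about $0.19/0.79 \approx 0.24$ of the task-specific component.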
Job complexity ladder, employee evaluations and promotions
When appointed to a university, a new employee typically starts a fixed term of employment lasting up to 5 years. After holding a temporary research or teaching position, the employee may be considered for a permanent appointment (an employment contract of indefinite duration), subject to satisfactory job performance. At the time of recruitment, the employee is assigned to one of 11 job complexity levels, with higher complexity levels being associated with a wider variety of academic duties, more complex job tasks and greater responsibility. There is a built-in relationship between the job complexity ladder and the hierarchy of occupations, as illustrated in Table 2: early career researchers, such as PhD students and teaching assistants, typically work at complexity levels 1-4, lecturers and researchers with more seniority at levels 5-7 and full professors at levels 8-11. Two details from Table 2 should be emphasized. First, each occupation has its own job complexity ladder. For example, within the rank of full professor, there exists a four-step ladder with job complexity levels ranging from 8 to 11. Second, job complexity levels overlap occupations; for example, senior researchers with the longest tenures may reach job complexity level 8, which is the typical starting level for newly hired full professors.
Job complexity and performance levels are evaluated independently in an assessment meeting between a supervisor and an employee. The assessment meeting is typically held once every two years, but the employee is entitled to request a reassessment in the event of significant changes in his or her job duties. The job complexity level is assessed based on a job description, which includes all the essential duties and responsibilities of the employee.
The assessment of personal performance is based on three different criteria: (1) teaching merit, (2) research achievements and (3) societal engagement and contributions to the university community. Each of these criteria is rated on a nine-point scale ranging from "very low" to "excellent" based on a performance evaluation of the assigned tasks and duties. The overall performance rating is obtained as a weighted sum of the ratings on the three criteria, weighted by the share of working time devoted to each activity. After the job complexity and performance evaluations are agreed upon by the employee and the supervisor, the central university administration appraises the performance evaluations to ensure that performance is assessed consistently across employees in the same discipline, occupation and job complexity level.
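The time-weighted aggregation of the three criteria can be illustrated with a short sketch. The function name, the criterion labels and the example numbers are hypothetical; the text does not specify how the weighted sum is mapped onto the nine performance levels, so the sketch returns the unrounded score.

```python
def overall_performance(ratings, time_shares):
    """Time-weighted sum of criterion ratings on the nine-point scale.

    ratings and time_shares are dicts keyed by criterion name (labels are
    illustrative). Shares are normalised so that they sum to one.
    """
    if set(ratings) != set(time_shares):
        raise ValueError("ratings and time_shares must cover the same criteria")
    total_share = sum(time_shares.values())
    if total_share <= 0:
        raise ValueError("time shares must sum to a positive number")
    # Weight each rating by the fraction of working time spent on that activity.
    return sum(ratings[c] * time_shares[c] / total_share for c in ratings)


# Hypothetical employee: ratings and working-time shares are made up.
example_ratings = {"teaching": 6, "research": 8, "societal": 5}
example_shares = {"teaching": 0.3, "research": 0.6, "societal": 0.1}
example_score = overall_performance(example_ratings, example_shares)
```

For this hypothetical employee the score is 6(0.3) + 8(0.6) + 5(0.1) = 7.1, so a research-heavy workload pulls the overall rating toward the research rating.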
A promotion on a job complexity ladder (i.e., an increase in an employee's job complexity level) is always associated with an increase in the variety, complexity and responsibility of the employee's job duties. Moreover, because the job complexity level essentially determines the minimum earnings level of an employee (see Table 10 in the Appendix), a promotion on the complexity ladder is always accompanied by a pay increase. At the lowest job complexity levels, 1-4, promotion is typically the result of progress in PhD studies and increased teaching responsibilities. At higher complexity levels, 5-11, promotion involves a diversification of academic tasks (e.g., research, teaching, administrative duties, thesis supervision) and more responsibility and job complexity (e.g., heavier teaching loads, teaching more advanced courses, managing research projects, serving as the vice-head or head of a department).
An employee can be promoted on a job complexity ladder in three different ways. First, the job complexity level may be increased during an assessment meeting with a supervisor, which is organized biennially without the need for an employee request. Second, the employee can request a reassessment of job complexity if he or she is unsatisfied with the current assessment (e.g., due to notable changes in his or her job duties and responsibilities after the previous assessment meeting). Third, the employee can apply for and be appointed to a new occupation higher on the job complexity ladder.
Table note: Authors' own calculations from the personnel data used in the following analysis.

Determinants of Promotions and Earnings
Data and empirical approach
The data employed are drawn from the personnel records of a Finnish university for the 2006-2012 period. The panel includes all full-time faculty members, with 8894 observations on 2583 individuals. The data set contains the following information for each individual: personal id number, observation year, monthly earnings, gender, age, tenure, highest degree, department, occupation (academic rank), job complexity level, personal performance level and annual number of variously classified publications. Our data differ from those of earlier studies in two important ways. First, the data are well balanced by gender, with a proportion of women of approximately 48%. Second, the panel structure of the data allows us to track individuals over time; with few exceptions (Binder et al. 2010; Bratsberg et al. 2010; Haeck and Verboven 2012), the majority of the previous research on academic pay gaps has relied on cross-sectional data.
Table 3 summarizes the personnel data, showing that the average monthly earnings of female researchers were approximately 12% lower than those of their male colleagues. The mean values of background characteristics indicate that, on average, female faculty members were younger, had shorter tenures, were less likely to hold a doctoral degree and worked at lower job complexity and performance levels than male faculty. The distributions of employees by occupation show that women were significantly more likely to work as university instructors and less likely to work as full professors than men. Furthermore, compared to men, women published a lower number of peer-reviewed international articles. A more detailed examination of the data implies that the lower publication activity of female researchers is partly explained by the concentration of women in disciplines (departments) with lower average output of international articles.
In the following analysis, we will examine gender differences in research productivity more thoroughly by estimating a set of research output models.
The last panel of Table 3 reports the yearly promotion rates by gender. The reported promotion rates, defined as the fraction of employees whose job complexity level increased in consecutive years, reveal that a higher fraction of women than men were promoted and that a major portion of promotions occurred at lower rungs of the job complexity ladder. The promotion rate for all employees was 12.6%. Approximately one-fourth (24.5%) of all promotions were accompanied by a change of occupation. The majority of promotions (78%) consisted of shifts to the next level on the job complexity ladder, with only 22% increasing by two job complexity levels.
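The promotion-rate definition can be made concrete with a minimal sketch. It assumes records of the form (person id, year, job complexity level), counts only employees observed in two consecutive years and averages the yearly rates, as described in the notes to Table 3. The function names and the toy records are hypothetical and are not drawn from the personnel data.

```python
from collections import defaultdict


def yearly_promotion_rates(records):
    """Share of employees whose complexity level rose between year-1 and year.

    records: iterable of (person_id, year, job_complexity_level) tuples.
    Only employees observed in both year-1 and year enter the denominator.
    """
    level = {(pid, year): lvl for pid, year, lvl in records}
    promoted = defaultdict(int)
    at_risk = defaultdict(int)
    for (pid, year), lvl in level.items():
        previous = level.get((pid, year - 1))
        if previous is None:
            continue  # not employed (or not observed) in the previous year
        at_risk[year] += 1
        if lvl > previous:
            promoted[year] += 1
    return {year: promoted[year] / at_risk[year] for year in sorted(at_risk)}


def average_promotion_rate(records):
    """Unweighted average of the yearly promotion rates."""
    rates = yearly_promotion_rates(records)
    if not rates:
        raise ValueError("no employee is observed in two consecutive years")
    return sum(rates.values()) / len(rates)


# Toy panel (made-up numbers): three employees, 2010-2012.
toy_records = [
    (1, 2010, 3), (1, 2011, 4), (1, 2012, 4),
    (2, 2010, 5), (2, 2011, 5), (2, 2012, 7),
    (3, 2011, 2), (3, 2012, 2),
]
toy_rates = yearly_promotion_rates(toy_records)
```

In the toy panel, one of two employees is promoted in 2011 and one of three in 2012, so the averaged rate is (1/2 + 1/3) / 2 = 5/12. Employee 2 moves up two levels at once, which counts as a single promotion.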
Table 3 notes: Promotion = increase in job complexity level in consecutive years. Reported promotion rates are averages of yearly promotion rates. Rates are based on employees who worked for (at least) two consecutive years. *Statistically significant at the .10 level, **at the .05 level, ***at the .01 level.
The joint distribution of employees' job complexity and performance levels in Fig. 2 illustrates that a higher proportion of men than women were working at the highest job complexity levels; in 2012, 23% of men were working at complexity levels 7-11, compared to 14% of women. Furthermore, at the top of the job complexity ladder, men were likely to have higher performance levels than women: among those working at the highest job complexity levels (9-11), 40% of men and 23% of women attained the highest performance levels (8-9). The joint distribution also shows that job complexity and performance levels were positively related, indicating that the performance level was higher for those higher up the job complexity ladder (in 2012, the correlation coefficient between job complexity and performance level was 0.59).
These observations raise two key questions that we address in this study: (1) Does the segregation of women at lower job complexity levels result from gender bias in promotion decisions and/or in entry-level job complexity levels? That is, do female employees encounter barriers to reaching higher levels on the job hierarchy? (2) Are the higher performance evaluations of men determined by actual gender differences in worker productivity or do they reflect undervaluation of female researchers' academic achievements?
Using our longitudinal personnel data, we first evaluate the role of gender in promotion decisions by running a linear probability model on whether the employee's job complexity level increased between two periods. Second, we estimate ordered probit models of job complexity and performance levels to analyze gender bias in assignment to different job ladders and in employee performance evaluations. Third, we assess the robustness of gender gaps in total earnings by estimating a set of earnings equations using standard OLS regressions. Finally, we conduct additional analyses to determine (1) whether gender differences in the production of peer-reviewed articles exist, (2) whether a gender difference in the probability of working as a full professor exists and (3) whether a gender pay gap exists within the full professor rank. In all estimated models, we control for an appropriate set of individual qualifications, job characteristics and research productivity variables.

Fig. 2 Job Complexity and Performance Levels in 2012 (categories shown include job complexity levels 7-8 and 9-11 and performance levels 1-3, 4-5, 6-7 and 8-9)
We include age and job tenure to control for the employee's previous work experience: age acts as a proxy for potential total work experience, and job tenure measures the time that has passed since an employee became employed at the university. To account for the effects of education level, we include dummy variables for the highest degree completed. To allow for possible career outcome differences between academic disciplines, we employ dummy variables for departments as proxy variables. The discipline controls are particularly important for the earnings equations because disciplines may differ significantly with respect to outside wage offers, and academic earnings may be inversely related to the proportion of women in the discipline (Bellas 1997; Umbach 2006).
In the earnings equations, we include a dummy variable for administrative duties for two reasons: to account for the additional compensation received for performing these duties and the time spent on these duties. Remuneration for administrative tasks yields an additional source of gender bias in earnings if men are more likely to be assigned to administrative positions. To assess potential gender bias in the assignment of administrative tasks, we regressed a dummy variable for these duties on a gender dummy and set of individual background variables (see Table 11 in the Appendix); the gender coefficients were consistently close to zero and statistically insignificant, implying that gender did not play a role in the assignment of administrative duties.
To evaluate the role of worker-specific productivity in earnings and promotions, we include three research productivity variables in our models: the numbers of peer-reviewed international articles, peer-reviewed national articles and other publications (e.g., books, book chapters, working papers). The distinction among publication types is essential, as some academic disciplines primarily focus on international publications, whereas others also emphasize the importance of national publications; moreover, academic disciplines differ with respect to how they weight journal articles relative to other publications (Räty and Bondas 2008).
As our data only include the annual publication records of employees working at this university (i.e., we have no information on past research achievements or merit beyond this university), we use the contemporaneous publication count as a proxy variable for employees' past research productivity. Hence, we implicitly assume that individual research productivity is relatively stable over time. Because this assumption may fail to hold in practice, we also use the cumulative publication count in the previous periods to assess the effects of past productivity on earnings and promotions. Furthermore, because earnings and promotion decisions may depend not only on absolute worker output but also on relative output, we also employ relative publications in our analysis; these are calculated by dividing the publication count for worker i during a given period by the average number of publications in worker i's discipline (department) during that period.
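The relative publication measure can be sketched as follows. The function name, field names and toy rows are hypothetical. The sketch also makes two assumptions that the text does not state: the department average includes worker i's own publications, and the ratio is left undefined (None) when nobody in the department published in that period.

```python
from collections import defaultdict


def relative_publications(rows):
    """Publication count divided by the department-period average.

    rows: list of dicts with keys 'id', 'department', 'year', 'pubs'.
    Returns {(id, year): relative output}, or None where the department
    average is zero (an assumption; the source does not cover this case).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        key = (r["department"], r["year"])
        totals[key] += r["pubs"]
        counts[key] += 1
    result = {}
    for r in rows:
        key = (r["department"], r["year"])
        mean = totals[key] / counts[key]  # average includes worker i (assumption)
        result[(r["id"], r["year"])] = r["pubs"] / mean if mean > 0 else None
    return result


# Toy data (made-up numbers): two departments observed in 2012.
toy_rows = [
    {"id": 1, "department": "A", "year": 2012, "pubs": 4},
    {"id": 2, "department": "A", "year": 2012, "pubs": 2},
    {"id": 3, "department": "A", "year": 2012, "pubs": 0},
    {"id": 4, "department": "B", "year": 2012, "pubs": 1},
    {"id": 5, "department": "B", "year": 2012, "pubs": 3},
]
toy_relative = relative_publications(toy_rows)
```

Both toy departments average two publications, so worker 1 has a relative output of 2.0 and worker 4 has 0.5. A value above one marks a worker who publishes more than the department average, whatever the discipline's publishing norms.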
In the absence of information on the quality of individual research output, we cannot directly analyze whether higher quality research was rewarded with higher earnings and/or promotions. However, the distinction among publication types provides an indirect way to assess the role of research quality in earnings and promotions: international peer-reviewed articles are likely to carry more weight in performance evaluations than other publications. Hence, we expect to observe larger positive coefficients on these articles than on other publication variables in the estimated earnings and promotion equations.

Determinants of promotions
To analyze the determinants of promotions, we estimate the following linear probability model:
\[ \text{Promoted}_{ijdt} = \alpha + \beta\,\text{Female}_i + \gamma'\,\text{Publications}_{it} + \delta'\,X_{i,t-2} + \pi_{j,t-2} + \theta_d + \varepsilon_{ijdt} \tag{1} \]
where the dependent variable $\text{Promoted}_{ijdt}$ is a dummy variable that equals one if employee $i$'s job complexity level increased from the previous period. This is an appropriate definition of a promotion because, as described above, higher complexity levels entail more demanding job tasks and greater responsibilities. Because employees typically have a promotion opportunity once every two years, the dependent variable of the base specifications is equal to one if an employee was promoted between year $t-2$ and year $t$ and zero otherwise. The dummy variable $\text{Female}_i$ equals one if employee $i$ is female. $\text{Publications}_{it}$ is a vector of three publication variables (peer-reviewed international articles, national articles and other publications) that indicate the sum of employee $i$'s publications over the two previous years. $X_{i,t-2}$ is a vector of two-year-lagged control variables, including age, tenure, dummy variables for education levels and a dummy variable indicating whether an employee's education level changed from the previous period (i.e., between $t-2$ and $t$), and $\theta_d$ are department dummies. Furthermore, the control variables include the job complexity level in the previous period, $\pi_{j,t-2}$, to account for the fact that there are a limited number of job complexity levels and that promotion probabilities may differ across job complexity levels (see Table 3). Finally, $\alpha$ is a constant term and $\varepsilon_{ijdt}$ is an error term.
Table 4 presents the main results of the linear probability models of promotions within job complexity levels. The estimated gender coefficients in columns 1 and 2 indicate that the probability of promotion was lower for women only before controlling for research productivity differences.
The estimates in the next columns show that this conclusion holds after controlling for the background characteristics of a worker (column 3), after using relative publications instead of publication counts (column 4) and after redefining the dependent variable to account for year-to-year promotions (column 5). Hence, the results suggest that female researchers were as likely to be promoted as their similarly productive and qualified male colleagues. Furthermore, consistent with previous studies (Ward 2001; Ginther and Hayes 2003), the results in Table 4 suggest that higher research productivity was associated with a higher probability of being promoted, with the coefficient estimates in columns 2, 3 and 5 indicating that the promotion probability increased with the publication count. These estimates imply that national articles carried more weight in promotion decisions than did international articles. This finding is explained by the sensitivity of the results to the extensive publication records of a few researchers; for example, excluding the top 5% of observations for all peer-reviewed articles more than doubled the coefficient estimate for international articles (to 0.033) in the model in column 3, while the other publication coefficients remained nearly unchanged. Hence, the number of internationally published, peer-reviewed articles seems to have been the primary research output measure in promotion decisions. This conclusion is confirmed by the research productivity estimates presented in column 4, which suggest that both absolute and relative research output were factors in promotion decisions: the relative number of peer-reviewed international articles was positively and significantly related to promotion prospects, whereas the relative output of peer-reviewed national articles was not a significant factor in promotion decisions.
Notes to Table 4: Cluster-robust standard errors in parentheses (clustered at the worker level). All models include a constant term. Worker characteristics include age, tenure, dummy variables for education levels and a dummy variable indicating whether an employee's education level changed from the previous period. When the models in the last three columns were estimated with department dummy variables as explanatory variables, the coefficient estimates were virtually unchanged. Using probit or logit models instead of the linear probability model produced qualitatively similar results. Full results are available upon request. *Statistically significant at the .10 level; **at the .05 level; ***at the .01 level.
The results in Table 4 strongly indicate that, conditional on individual research output, gender was not a determinant of promotions. However, a gender-biased job hierarchy may result not only from gender differences in promotion probabilities but also from gender differences in the assignment of employees to job levels upon hiring. To examine the role of gender in the determination of job levels more carefully, we estimate an ordered probit model of the job complexity level. In other words, we estimate a latent variable model of the following form:
\[ \text{Job complexity level}^{*}_{idt} = \beta\,\text{Female}_{i} + \gamma'\,\text{Publications}_{it} + \delta' X_{it} + \theta_{d} + \varepsilon_{idt} \qquad (2) \]
where $\text{Job complexity level}^{*}_{idt}$ is the latent unobserved variable underlying the observed job complexity level, which takes values in {1, 2, 3, …, 11}, and $\varepsilon_{idt}$ is an error term. The dummy variable $\text{Female}_{i}$ equals one if employee $i$ is a female; $\text{Publications}_{it}$ is a vector of contemporaneous publication counts of peer-reviewed international and national articles and other publications; $X_{it}$ is a vector of control variables, including age and tenure (as well as their squared terms) and dummy variables for the education level; and $\theta_{d}$ are department dummies. The parameters of the model (β, γ, δ, θ) are estimated by maximum likelihood estimation. Table 5 presents the coefficients of the ordered probit models of the job complexity level.
According to the results, there is weak or no evidence that gender plays a role in the assignment of employees to a level on the job hierarchy: the estimated gender coefficient is statistically significant when contemporaneous publication counts are used to account for research productivity differences (column 1) but statistically insignificant when past publications are used to measure worker output (columns 2-3). The coefficient estimates on the publication variables indicate that more productive (both in absolute and relative terms) faculty members were more likely to work at higher levels of the job complexity ladder.
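To make the mechanics of the ordered probit model concrete, the following minimal sketch shows how a linear index and a set of cutpoints translate into probabilities over the 11 job complexity levels. The cutpoint values and the index values are hypothetical placeholders chosen for illustration; they are not estimates from Table 5.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index: float, cutpoints: list[float]) -> list[float]:
    """Return P(level = k) for k = 1, ..., len(cutpoints) + 1.

    index     : value of the linear index (gender, publications, controls)
    cutpoints : strictly increasing thresholds on the latent scale
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(level <= k), padded with 0 and 1.
    cdf = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def expected_level(probs: list[float]) -> float:
    """Probability-weighted average level (levels are numbered from 1)."""
    return sum(k * p for k, p in enumerate(probs, start=1))


# Hypothetical cutpoints: 10 thresholds define 11 levels.
CUTPOINTS = [-2.25 + 0.5 * k for k in range(10)]

baseline = ordered_probit_probs(0.0, CUTPOINTS)
more_productive = ordered_probit_probs(0.5, CUTPOINTS)  # higher index
print(round(expected_level(baseline), 3), round(expected_level(more_productive), 3))
```

A positive coefficient raises the index and shifts probability mass toward higher levels, which is how the positive publication coefficients should be read.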

Determinants of performance evaluations
To examine the determinants of performance evaluations, we employ an ordered probit model similar to that in eq. (2), with the individual performance level now used as the dependent variable. Columns 1-3 in Table 6 present the main results of the ordered probit analysis. All reported models condition on a set of worker background characteristics, research productivity and job complexity. Controlling for the job complexity level is important because performance levels are positively related to job complexity levels (see Figure 2); however, excluding job complexity from the models does not alter the conclusions of the analysis presented here. The estimates in column 1 are conditioned on contemporaneous publication counts to control for individual research output. The estimated gender coefficient is negative and statistically significant at the 10% level. The significant gender difference disappears when past publication output is used to control for research productivity in columns 2 and 3, suggesting that male and female employees with similar qualifications and research outputs received similar performance evaluations. Hence, in contrast to some previous studies (Bartol 1999; Castilla 2012), our results suggest that gender plays a negligible or no role in employee performance evaluations. The estimated coefficients on the publication variables indicate a positive relationship between actual output and assessed performance, implying that better-performing employees, whether measured in absolute or in relative terms, are likely to receive better performance evaluations.
Furthermore, to examine whether gender differences in changes in performance evaluations exist, we estimate a linear probability model that parallels that in eq. (1), with the dependent variable now being a dummy variable that equals one if the employee's performance level increased between year t − 2 and year t and zero otherwise. To account for the fact that performance level changes among employees who moved up or down the job complexity ladder were likely less related to job performance than to other considerations (e.g., promotions were often associated with a decrease in individual performance), the estimated model is based on a sample of employees whose job complexity remained unchanged from the previous period. The results of the linear probability models of performance level increments are presented in columns 4 and 5 of Table 6. The nonsignificant coefficient estimates on the female variable indicate that the probability of being upgraded to a higher performance level did not depend on gender. The coefficients of the publication variables show that employees who produced more peer-reviewed articles were more likely to be upgraded to a higher performance level, while output of other publications had no effect on the probability of being upgraded.
Notes to Table 6: The first three columns report the coefficients of the ordered probit models. Cluster-robust standard errors in parentheses (clustered at the worker level). The dependent dummy variable of the linear probability models equals one if the employee's performance level increased between t − 2 and t and zero otherwise. Worker characteristics include age, age², tenure, tenure² and dummy variables for education levels. Both linear probability models include a constant term. The full results are available upon request. *Statistically significant at the .10 level; **at the .05 level; ***at the .01 level.
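The construction of the dependent variable for the linear probability models of performance level increments can be sketched as follows. The record layout and field names (`worker`, `year`, `perf_level`, `complexity`) are hypothetical, and the sketch assumes that the "previous period" used for the sample restriction is year t − 2; the text does not state this explicitly.

```python
def performance_upgrades(records, lag=2):
    """Build the upgrade dummy for worker-year observations.

    records : list of dicts with hypothetical keys
              'worker', 'year', 'perf_level', 'complexity'
    lag     : distance in years between the compared observations
    """
    by_key = {(r["worker"], r["year"]): r for r in records}
    rows = []
    for (worker, year), current in sorted(by_key.items()):
        previous = by_key.get((worker, year - lag))
        if previous is None:
            continue  # no observation 'lag' years earlier
        if previous["complexity"] != current["complexity"]:
            continue  # moved on the job ladder: excluded from the sample
        upgraded = int(current["perf_level"] > previous["perf_level"])
        rows.append({"worker": worker, "year": year, "upgraded": upgraded})
    return rows


# Made-up example records, for illustration only.
example = [
    {"worker": "a", "year": 2006, "perf_level": 4, "complexity": 7},
    {"worker": "a", "year": 2008, "perf_level": 5, "complexity": 7},
    {"worker": "b", "year": 2006, "perf_level": 6, "complexity": 8},
    {"worker": "b", "year": 2008, "perf_level": 5, "complexity": 9},
    {"worker": "c", "year": 2006, "perf_level": 3, "complexity": 5},
    {"worker": "c", "year": 2008, "perf_level": 3, "complexity": 5},
]
upgrade_rows = performance_upgrades(example)
```

In this example, worker b is dropped because the job complexity level changed, which mirrors the sample restriction described above.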

Earnings differentials
In order to assess the magnitude of the gender pay gap, we estimate the following earnings equation using standard OLS regression:
\[ \ln(\text{Earnings}_{ijdt}) = \alpha + \beta\,\text{Female}_{i} + \gamma'\,\text{Publications}_{it} + \delta' X_{it} + \theta_{d} + \lambda_{t} + \varepsilon_{ijdt} \qquad (3) \]
where the dependent variable is the logarithm of monthly earnings (in euros) for employee $i$ at job complexity level $j$ in department $d$ in year $t$; $\text{Female}_{i}$ equals one if an employee is a female; $\text{Publications}_{it}$ is a vector of contemporaneous publication counts of peer-reviewed international and national articles and other publications; $X_{it}$ is a vector of control variables, including age and tenure (as well as their squared terms), a dummy variable for administrative duties and dummy variables for the education level; $\theta_{d}$ are department dummies and $\lambda_{t}$ are year dummies; $\alpha$ is a constant term and $\varepsilon_{ijdt}$ is an error term. If there exists a gender gap in earnings, we would expect to observe a statistically significant nonzero value for the coefficient of the Female variable, β. Table 7 summarizes the main results of the earnings equations. According to the gender coefficient in column 1, female researchers earned approximately 11% less than their male co-workers. The estimates in columns 2 and 3 illustrate that the male premium in earnings is mainly explained by differences in research productivity and individual background characteristics: after adjusting for these differences, a gender gap of approximately 2% remains. Adding controls for occupations in column 4 significantly improves the fit of the model and yields a statistically insignificant gender gap of approximately 1%, suggesting no gender pay gaps within occupations. If the assignment of employees to different occupational levels depended on gender, then the inclusion of occupation dummies would bias the estimated gender earnings gap downward; however, as the analysis above suggests, gender was not a significant determinant of position on the job hierarchy in this particular organization.
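Because the dependent variable is in logarithms, the gender coefficient is measured in log points. A minimal sketch of the conversion to a percentage difference, 100 × (exp(β) − 1), is given below. The coefficient values in the example are hypothetical and are not taken from Table 7.

```python
import math


def log_gap_to_percent(coef: float) -> float:
    """Convert a dummy-variable coefficient from a log-earnings
    equation into an exact percentage difference in earnings."""
    return 100.0 * math.expm1(coef)


# Hypothetical coefficients, for illustration only.
for beta in (-0.11, -0.02, -0.01):
    print(f"beta = {beta:+.2f} -> {log_gap_to_percent(beta):+.2f}% difference")
```

For coefficients close to zero the log-point value and the percentage difference nearly coincide, which is why small gaps are commonly read directly off the coefficient.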
Using past research output instead of contemporaneous output to control for research productivity in column 5 produces a statistically insignificant gender earnings gap of approximately 1%. The earnings equations reported in the table only control for publication counts and, hence, do not account for differences in relative research productivity. However, reestimating the models in columns 3-5 using relative publication variables instead of publication counts leads to similar results regarding the gender pay gap: the gender coefficient is close to zero and typically statistically insignificant.
The previous results provide strong evidence that the observed gender gap in average earnings is mainly attributable to worker differences in background characteristics and research productivity. To further examine whether there were gender differences in pay changes, we replaced the dependent variable of the earnings eq. (3), the logarithm of earnings, with the difference in logarithmic earnings between two consecutive years and used one-year lagged publication and background variables instead of contemporaneous variables as regressors. The results of the estimated pay-change equation are reported in column 6. The results indicate that gender was not a determinant of year-to-year earnings changes. The coefficient estimates on the publication variables show that earnings changes were positively related to research output, implying that increments in earnings were higher for more productive workers.
Data limitations prevent us from directly assessing the robustness of our results to the inclusion of productivity measures other than research output. One important measure might be the amount of time devoted to teaching. In the absence of teaching data, we evaluated the sensitivity of the gender and publication coefficients to the omission of teaching load variables by reestimating the model in column 3 after excluding the most teaching-intensive occupations, namely, university instructors and lecturers. The resulting coefficient estimates on the gender and publication variables were virtually unaffected, implying that the main results are not altered by the omission of teaching load variables.
Notes to Table 7: Earnings = monthly earnings in euros. Cluster-robust standard errors in parentheses (clustered at the worker level). All models include a constant term. Worker characteristics include age, age², tenure, tenure², a dummy variable for administrative duties and dummy variables for education levels. (The linear probability model in the last column also includes a dummy variable indicating whether an employee's education level changed from the previous period as a control variable.) The full results are available upon request. *Statistically significant at the .10 level; **at the .05 level; ***at the .01 level.

Additional Findings
Gender differences in research productivity
The above results suggest that gender gaps in promotion rates and earnings partly reflect gender differences in research productivity. The lower research productivity of female academics is widely acknowledged in the literature (Schneider 1998; Xie and Shauman 1998). Other empirical studies suggest that the gender gap in research output cannot be fully explained by differences in researcher characteristics (e.g., experience and academic rank) or by other factors, such as the concentration of female researchers in academic disciplines with less publishing (e.g., Toutkoushian and Bellas 1999; Hesli and Lee 2011).
To analyze gender differences in research productivity in more detail, we estimate the following equation using OLS, Poisson and negative binomial regression models:
\[ \text{Articles}_{ijdt} = \alpha + \beta\,\text{Female}_{i} + \gamma\,\text{Other publications}_{it} + \delta' X_{it} + \pi_{jt} + \theta_{d} + \varepsilon_{ijdt} \qquad (4) \]
where the dependent variable is the annual number of peer-reviewed articles (both international and national) of employee $i$ at job complexity level $j$ in department $d$ in year $t$; $\text{Female}_{i}$ equals one if an employee is a female; $\text{Other publications}_{it}$ is the annual number of other (non-refereed) publications; $X_{it}$ is a vector of control variables, including age and tenure (as well as their squared terms), a dummy variable for administrative duties and a dummy variable for a doctoral degree; $\pi_{jt}$ are job complexity level dummies and $\theta_{d}$ are department dummies; $\alpha$ is a constant term and $\varepsilon_{ijdt}$ is an error term.
The regression results of eq. (4) are presented in Table 8. The gender coefficient from the OLS regression in column 1 indicates that female researchers produced, on average, approximately one fewer article than their male colleagues. The results in column 2 suggest that this gender gap in research productivity is mainly attributable to differences in worker characteristics, and the gender coefficient is no longer statistically significant when these differences are accounted for. However, because the OLS regression assumes a continuous dependent variable and is therefore not appropriate for the analysis of a count dependent variable, we also estimated research output using methods designed for count data, namely, Poisson regression (column 3) and negative binomial regression (column 4). The gender coefficients from these preferred regressions indicate that female researchers produced statistically significantly fewer peer-reviewed articles than male researchers with similar background characteristics.
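The interpretation of a gender coefficient in a Poisson regression can be illustrated with a deliberately simplified sketch: a model with only a constant and a female dummy, fitted by Newton-Raphson. In this special case the maximum likelihood estimate of the dummy coefficient equals the log of the ratio of the group means, so exp(β) is the ratio of expected article counts. The article counts below are made up for illustration, and the sketch omits the control variables and the negative binomial extension used in Table 8.

```python
import math


def fit_poisson_dummy(y, x, max_iter=50, tol=1e-12):
    """Fit E[y | x] = exp(b0 + b1 * x) by Newton-Raphson,
    where x is a 0/1 dummy. Returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the overall mean
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Negative Hessian (information matrix), a 2 x 2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system by hand.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Made-up annual article counts, for illustration only.
articles_men = [0, 1, 2, 3, 4]
articles_women = [0, 0, 1, 1, 3]
y = articles_men + articles_women
x = [0] * len(articles_men) + [1] * len(articles_women)

b0, b1 = fit_poisson_dummy(y, x)
rate_ratio = math.exp(b1)  # expected articles of women relative to men
```

A negative coefficient therefore corresponds to a rate ratio below one. The negative binomial model keeps this interpretation but relaxes the Poisson assumption that the conditional variance equals the conditional mean.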
The existing literature proposes several potential explanations for the gender gap in research output. First, female researchers' research output might be adversely affected by childbearing and heavier engagement in childcare and other household responsibilities (e.g., Stack 2004). Second, female faculty members may devote more of their working time to activities other than research (Toutkoushian and Bellas 1999; Link et al. 2008), possibly due to their stronger preferences for or motivation to engage in non-research activities (e.g., Bentley and Kyvik 2013). Third, insufficient resources and weaker research networks may diminish the publication output of female researchers, especially in male-dominated disciplines if researchers tend to co-author with colleagues of the same sex (McDowell and Smith 1992) and if research productivity increases with coauthorship (Hollis 2001). Fourth, journal editors and reviewers may discriminate against female authors, leading to higher rejection rates for female-authored manuscripts (Ferber and Teiman 1980). However, the empirical support for these explanations is ambiguous, as studies have shown that (1) female researchers with dependent children have research productivity similar to that of male researchers (e.g., Sax et al. 2002), that (2) additional time spent on other activities, most notably teaching, does not have a negative effect on research output (Shin and Cummings 2010) and gender is a weak predictor of research time amongst university faculty (Bentley and Kyvik 2013) and that (3) gender does not play a role in the article review process (Abrevaya and Hamermesh 2012).
Notes to Table 8: Cluster-robust standard errors in parentheses (clustered at the worker level). All models include a constant term. In the Poisson and negative binomial regressions, only statistically significant worker and job characteristics were included in the models (i.e., age² was excluded). *Statistically significant at the .10 level; **at the .05 level; ***at the .01 level.
The estimated coefficients of other covariates also reveal some interesting relationships. First, research output was lower for older workers and increased with tenure but at a diminishing rate. Second, the coefficients imply that time spent on administrative duties had a negative effect on research productivity. Third, those who had higher outputs of other publications (e.g., non-refereed book chapters, discussion papers) produced more peer-reviewed articles.

Female professors and the gender pay equity of professors
The results above show little or no evidence of gender bias in the assignment of faculty members to different levels on the job complexity ladder. However, a potential obstacle to career advancement was identified for female researchers: women may be less likely to achieve the full professor rank than men. As reported in Table 3, approximately 19% of men were working as full professors compared with 9% of women. To assess whether this difference was attributable to differences in worker characteristics, the first column of Table 9 reports the gender difference in the likelihood of holding a full professor position among employees working at job complexity levels 6-11. The female coefficient suggests that, conditional on worker and job characteristics, female employees were 6% less likely to work as professors than equally qualified males. Although women were underrepresented in professor positions, women who had achieved professorships earned equal pay for equal work: the results of the earnings equation in the second column of the table imply no gender pay gap within the professor rank.

Conclusions
This study employs personnel data to evaluate the role of gender in internal promotion, employee performance evaluation and earnings determination. Using detailed information on the complexity rating of job tasks to identify promotions along the job hierarchy, we show that male and female researchers were equally likely to be promoted, conditional on individual research productivity. The findings demonstrate that worker-specific productivity differences may be a primary reason for gendered promotion rates. An analysis of the determinants of employee performance evaluations reveals that gender played a negligible or no role in evaluation decisions. The observed male premium in earnings was mainly attributable to individual differences in research productivity and background characteristics: adjusting for these differences reduced the gender earnings gap from approximately 11% to approximately 1-2%. Moreover, once the full set of controls was included, the gender coefficient was no longer statistically significant. Additionally, the results demonstrate that female researchers had lower research output than their male colleagues, even after conditioning on a set of worker characteristics, including age, tenure and academic discipline. Finally, the results suggest that female and male professors were paid equally, although female employees were less likely to work as full professors than equally qualified men.
The results indicate that higher research productivity was related to higher probabilities of being promoted to or working on the highest job ladders (academic ranks). The findings also confirm that more productive researchers received more favorable performance evaluations than others with similar background characteristics, implying that the available worker output information was effectively employed in the assessment of employee performance and was therefore likely to reduce the subjectivity of the evaluation process and result in more objective performance evaluations. Our analysis employed publication counts to measure individual productivity. Other productivity measures, such as the quality of research and teaching, can also contribute to pay and promotion decisions. Earlier studies illustrate that publication quality, as measured by the number of citations (Moore et al. 1998;Bratsberg et al. 2010) or by the number of articles in top-tier journals (e.g., Hilmer and Hilmer 2005), is positively related to academic salaries. Our findings also provide some evidence that the quality of research matters for career advancement decisions: peer-reviewed international articles carried more weight in pay and promotion decisions than peer-reviewed national articles or other publications.
Employees may also receive rewards for their teaching load and skill, in terms of higher earnings and promotion probabilities. However, given (1) the theoretical arguments for why incentives for research productivity may have increased in universities (Remler and Pema 2009) and (2) the empirical findings suggesting that universities have become more inclined to make hiring, promotion and remuneration decisions largely based on research without regard to other achievements (Laband and Tollison 2003;Remler and Pema 2009), teaching may play a constantly diminishing role in various career decisions. In fact, the findings of several recent studies suggest that heavier teaching loads are penalized with lower earnings (e.g., Graves et al. 2002;Umbach 2006;Binder et al. 2012). Teaching might be less relevant to pay and promotion decisions at the university analyzed in this paper for several reasons. First, teaching loads are typically uniform within occupations, and the results are robust to the exclusion of occupations with more variable teaching loads. Second, the assessment of teaching skill is difficult, especially because student evaluations of instructors are not collected. Finally, the university's funding is closely tied to the number of research publications, giving supervisors strong incentives to emphasize publications in pay and promotion decisions.
Two conclusions can be drawn from our findings about the role of worker productivity in career outcomes. First, both absolute and relative individual output may be important factors in determining promotions, earnings and performance evaluations. Second, using contemporaneous and past productivity measures yielded qualitatively very similar results for the effects of worker output on earnings and performance evaluations, suggesting that information on employees' concurrent productivity provides a valid proxy for their past productivity.
Description of the variables.

Variable name: Description
Monthly earnings: Monthly earnings in euros.
Age: Age in full years. Used as a proxy variable for potential total work experience.
Tenure: Measures the number of years of service at the university. For employees missing this information, tenure measures the length of time since the latest labor contract was negotiated; the variable will therefore underestimate actual job tenure for some employees. Furthermore, in some cases, tenure is likely to be an overestimate of actual work experience because it is measured in full years after a specified reference date and possible career breaks are not accounted for.
Education (highest degree): Three options: master's degree (or lower), licentiate's degree, doctoral degree. Approximately 13% of the worker-year observations lack information on education level. We imputed these missing values with the most common education level of the employees working in the same occupation. However, the reported results were essentially unchanged when individuals with missing education information were excluded from the analysis.
Job complexity level: 11 different job complexity levels.
Number of publications: Publications are divided into three categories: 1) peer-reviewed international articles, 2) peer-reviewed national articles, 3) all other publications (e.g., book chapters, discussion papers).
Departments: 27 departments.
Administrative duties: = 1 if a worker had concurrent administrative duties (i.e., earned a wage bonus for administrative duties), = 0 otherwise.

Appendix table: ordered probit models of Table 5 (job complexity level) and of Table 6 (individual performance level). Notes: Research productivity variables measure concurrent numbers of publications (international peer-reviewed articles, national peer-reviewed articles and other publications). Full results are available upon request. *Statistically significant at the .10 level; **at the .05 level; ***at the .01 level.
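The imputation rule for missing education levels described in the variable list (the most common education level among employees in the same occupation) can be sketched as follows. The field names and the example records are hypothetical, and the handling of ties (the first level encountered wins) is an assumption of this sketch, as the text does not specify it.

```python
from collections import Counter


def impute_education(rows):
    """Fill missing 'education' values (None) with the modal education
    level observed within the same 'occupation'."""
    modes = {}
    for occupation in {r["occupation"] for r in rows}:
        observed = [
            r["education"]
            for r in rows
            if r["occupation"] == occupation and r["education"] is not None
        ]
        if observed:  # leave the occupation unimputed if nothing is observed
            modes[occupation] = Counter(observed).most_common(1)[0][0]
    return [
        {
            **r,
            "education": r["education"]
            if r["education"] is not None
            else modes.get(r["occupation"]),
        }
        for r in rows
    ]


# Made-up worker-year observations, for illustration only.
sample = [
    {"occupation": "lecturer", "education": "doctoral"},
    {"occupation": "lecturer", "education": "doctoral"},
    {"occupation": "lecturer", "education": "licentiate"},
    {"occupation": "lecturer", "education": None},
    {"occupation": "assistant", "education": "master"},
    {"occupation": "assistant", "education": None},
]
imputed = impute_education(sample)
```

Rerunning the analysis on the rows with observed education only, as the authors report doing, is a simple check on whether the imputation drives the results.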