In summary
Inequality in education leads to great potential handicaps for the kids who suffer an academic gap throughout their time in class. Reliable evaluations are a must to measure students’ progress (or lack thereof) and drive in-class teacher practice and system-wide education policies. Teachers have their curriculum-related tools for individual students’ follow-up. National authorities establish country-wide testing at critical turns in their education system to measure relative local performance and allow results-based progress throughout the system. International organizations run large tests across a large array of countries allowing their members to benchmark their respective education systems against each other and draw lessons for policies.
Using one of these large-scale testing exercises, UNICEF has recently published a Report Card that includes a ranking of countries based on a measure of inequality in education. It seeks to compare the achievement gap of the most academically delayed students across a large number of countries. Although the research is rich and well supported, in this post I argue that the specific metric retained for the ranking is not well suited to guiding the actions of policy makers. I present a simple measure of the academic gap that draws on the statistically sound characteristics of the large-scale test itself. This measure appears quite coherent with other approaches to this important issue of inequality in education.
Academic gap: How deep do societies let their kids sink academically?
Equity is one of the major concerns in education policy design. Recently, the UNICEF Office of Research – Innocenti published a very interesting report, its 13th Report Card, entitled Fairness for Children – A league table of inequality in child well-being in rich countries. Although it is quality research work, I take issue with the metric applied to measure and rank inequality in education (League Table 2 in the UNICEF report). In this post, I would like to explain my reservations about this measure and propose what I think is a more appropriate indicator for measuring inequality in student academic performance, one that leads to a more direct, policy-relevant application.
What is the objective of measuring inequality in student academic performance?
In all walks of life, evaluation is key to measuring relative standing and progress. In education, where schooling builds up students' academic and social competencies through a progressive construction of self, evaluation is an important process through which each individual student's standing and progress are measured against academically and socially validated benchmarks. Let's focus here on the academic aspect. Evaluation activities collect important information about the performance of students and, when properly aggregated and weighted to the student population, about the performance of classes, schools and national education systems as well. By applying simple but appropriate statistical tools, one develops an assortment of measures that tell a lot about the distribution of student performance within a class or within a national education system, allowing one to judge the efficiency and equity of policies and practices.
Key questions can be addressed, in search of answers to drive educational policies and practices to achieve greater levels of equity in academic success:
- How many students lag in academic performance,
- By how much do they underperform,
- Against which recognized “standard performance” are they evaluated, and ultimately
- Who are they – or what are their characteristics (when analysing survey results rather than a nominative student population) to allow the development of identification strategies.
What tools do we have to measure inequality in student academic performance?
Inequality of student performance comes up through evaluation of students:
- Teachers in the classroom measure their students' progress through a variety of exercises appropriate to the curriculum they are responsible for imparting. Doing so, they identify the difficulties their students face, adjust their practices and develop strategies to support them.
- In Ministries of Education, national curriculums are set and national (often high-stakes) exams are established at key stages of schooling. Doing so, national education policy designers review the overall performance of schools in the system and may develop strategies to support the ones identified as struggling to meet set objectives.
- International organizations set up student evaluation projects, using the most appropriate sampling strategies, to allow their member countries to benchmark the performance of their respective education systems against the best performing ones.
The Programme for International Student Assessment (PISA) is one of the major international evaluation projects of this kind. It is orchestrated by the Organisation for Economic Co-operation and Development (OECD), with evaluation data collected every three years since 2000. In contrast with in-classroom evaluation, national evaluation systems and other international evaluation projects (such as the ones co-ordinated by the International Association for the Evaluation of Educational Achievement), PISA adopts a “literacy” approach to its evaluation: “PISA is unique because it develops tests which are not directly linked to the school curriculum. The tests are designed to assess to what extent students at the end of compulsory education [when they are 15 years old], can apply their knowledge to real-life situations and be equipped for full participation in society. The information collected through background questionnaires also provides context which can help analysts interpret the results.” The UNICEF Report Card and my critique both use the evaluation data collected in 2012. PISA evaluations cover three key subjects (reading, mathematics and science), with a more specific focus on one of the subjects in each year of assessment (mathematics in 2012).
Of particular relevance for the following discussion is the “standard” established in each subject in PISA. The scaling process and the transformations applied to the data were chosen so that the mean and standard deviation of the PISA 2000 scores in reading are 500 and 100 respectively, for the equally weighted 27 OECD countries that participated in PISA 2000 with acceptable response rates. Following the same principle, but taking into account the fuller development of the mathematics evaluation as the main domain in PISA 2003, and of the science evaluation as the main domain in PISA 2006, the distributions for these subjects were rebased to a mean of 500 and a standard deviation of 100 in those respective years. Successive PISA assessments all conform to the bases thus defined (more detail can be found in the PISA 2012 Technical Report).
In my mind, the score of 500 becomes an important reference for measuring academic performance. By construction, it is a stable mark for comparisons over time, as new cycles of PISA are run for the three subjects, and an anchor for comparing performance across countries. In addition, when the 500 benchmark was set, the global OECD distribution of individual scores was calibrated so that the standard deviation around the mean (the 500 mark) equals 100 score points. This means that about two-thirds of the student population considered for the calculations (in 2000 for reading, in 2003 for mathematics and in 2006 for science) performed between 400 and 600. All these are important benchmarks for a measure of inequality. A further interesting number comes from comparing the average scores of same-age students attending school in two different grades (measured on the reading scale): 39 score points. This reflects the approximate gain an average student can expect to make over an additional school year, in an education system reflecting average characteristics across a large number of OECD countries. Putting the last two numbers together (100 score points divided by 39 points per school year), a student obtaining a score of 400 would have an approximate learning deficit of two and a half years of schooling vis-à-vis one with an average score of 500.
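The conversion from score points to years of schooling can be written out as a small sketch. The function name and constants below are my own illustrative choices, not part of any PISA tooling, and the 39-point figure is the reading-scale estimate quoted above, so the result should be read as a rough order of magnitude only.

```python
# Rough conversion of a PISA score gap into approximate years of schooling.
# Assumption: 39 score points per school year (reading scale, as quoted above).
OECD_BENCHMARK = 500.0
POINTS_PER_SCHOOL_YEAR = 39.0


def gap_in_school_years(score, benchmark=OECD_BENCHMARK,
                        points_per_year=POINTS_PER_SCHOOL_YEAR):
    """Return the approximate number of school years separating score from benchmark."""
    return (benchmark - score) / points_per_year


# A student scoring 400, one standard deviation below the 500 mark.
years_behind_at_400 = gap_in_school_years(400)
print(round(years_behind_at_400, 2))  # 2.56
```

A student at the benchmark itself gets a gap of zero, and scores above 500 yield negative values, i.e. a lead rather than a deficit.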
How does UNICEF Report Card do it?
To establish its “League Table” on inequality in education, UNICEF uses a measure of the achievement gap that aims to present “how far low-achieving students are allowed to fall behind the ‘average’ child in reading, maths and science literacy at the age of 15”. With data from PISA 2012, it looks at the difference in performance between two points in the distribution of student scores, averaged over the three subjects: the 10th percentile (the highest score achieved by the lowest-achieving 10% of students) and the median (the score that splits the student population into two equal halves, one achieving above and one below it). It is a relative measure within each national context. Through a standardization process (technically, using z-scores), an indicator is produced for each country, then measured relative to the unweighted OECD average. The resulting standardized scores are either positive or negative. They are positive when the national gap is smaller than the OECD average: the larger the positive difference from the OECD score, the smaller the “bottom-end achievement gap”, as UNICEF calls it. They are negative when the national gap is larger than the OECD average: the larger the negative difference from the OECD score, the larger the “bottom-end achievement gap”. The cross-country comparison then rests on how distant the lowest performers are from the median performer, nationally, relative to the OECD average. These calculations are briefly presented in the main UNICEF report and elaborated in greater detail in an Innocenti Working Paper. The UNICEF Report Card’s Achievement Gap is reported in column 4 of Table 1 below, and the associated ranking of countries in column 5.
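To make the mechanics of such a relative measure concrete, here is a minimal sketch of a standardized median-to-10th-percentile gap. It is my own simplified reading of the description above, not UNICEF's actual code: the three countries and their scores are hypothetical, a single score stands in for the three-subject average, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def bottom_end_gap(median, p10):
    """Score points separating the 10th percentile from the national median."""
    return median - p10


def standardized_gaps(gaps):
    """z-score each gap against the unweighted mean of all gaps.

    The sign is flipped so that a smaller-than-average gap gives a positive score.
    Assumption: population standard deviation; the source may use another variant.
    """
    average = mean(gaps.values())
    spread = pstdev(gaps.values())
    return {country: (average - gap) / spread for country, gap in gaps.items()}


# Hypothetical (median, 10th percentile) pairs, for illustration only.
scores = {"A": (500, 400), "B": (450, 370), "C": (520, 400)}
gaps = {country: bottom_end_gap(m, p10) for country, (m, p10) in scores.items()}
z_scores = standardized_gaps(gaps)
best_first = sorted(z_scores, key=z_scores.get, reverse=True)
print(best_first)  # ['B', 'A', 'C']
```

In this toy example country B comes out on top although its 10th percentile (370) is the lowest of the three, simply because its median is low as well. That is the property of a purely national, relative measure that the rest of this post questions.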
The UNICEF standardized scores extend from highs of 1.92 in Chile and 1.59 in Estonia (Mexico and Turkey are excluded from the UNICEF ranking because their high scores rest on reduced participation in school at age 15, which directly affects the calculated score) to lows of -1.96 in Israel, -1.39 in Belgium and -1.36 in France. In analysing the results, UNICEF pays most attention to the countries that lie furthest from the OECD average, on both the positive and the negative side. The two highest-ranking countries, Chile and Estonia, with the best performance on the measured achievement gap, nevertheless appear very different in terms of the performance of their lowest-performing students: the 10th percentile in Estonia achieves a three-subject average score of 423, close to the mean score attained by Chilean students (436). In other words, the lowest-performing Estonian students perform almost as well as the average student in Chile. At the other end of the ranking, UNICEF comments that two high-income countries, Belgium and France, show very large achievement gaps. But the scores of their 10th percentiles are 373 and 363 respectively, significantly higher than the 335 score of the “best performer”, Chile. This is why I take issue with the metric used for the ranking.
Along with this measure of the achievement gap, the UNICEF Report Card presents an additional measure, the “share of children below proficiency level 2 in all three subjects”, reported in Table 1 column 7. PISA maps test scores against six levels of achievement, each capturing milestones in each of the three subjects. Proficiency at level 2 is considered the baseline required to participate fully in modern society, implying that serious deficiencies affect those students who do not pass this mark. In itself, it is also an indicator of inequality in education. Admittedly, it is a different measure from the proposed achievement gap, but it is somewhat disturbing that a ranking along this line of inequality measurement has no decent correlation with the ranking based on the achievement gap. This, in fact, comes as no surprise. The critical difference between the two metrics is that the share of children below proficiency level 2 refers to international milestones (the way the proficiency levels are delimited in score points), while the achievement gap refers solely to a comparison of scores within a national context.
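One way to check how closely two rankings agree is a rank correlation. The sketch below uses Spearman's coefficient, which is my choice for illustration and not necessarily the statistic behind the statement above. The rankings are toy values, not the figures from Table 1.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation between two rankings of the same countries.

    Assumes each ranking assigns the ranks 1..n without ties.
    """
    countries = list(rank_a)
    n = len(countries)
    if n < 2 or set(countries) != set(rank_b):
        raise ValueError("rankings must cover the same countries (at least two)")
    squared_diffs = sum((rank_a[c] - rank_b[c]) ** 2 for c in countries)
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Toy rankings for four hypothetical countries.
rank_by_gap = {"A": 1, "B": 2, "C": 3, "D": 4}
rank_by_share_below_level_2 = {"A": 4, "B": 3, "C": 2, "D": 1}

print(spearman_rho(rank_by_gap, rank_by_gap))                  # 1.0
print(spearman_rho(rank_by_gap, rank_by_share_below_level_2))  # -1.0
```

Values near 1 mean the two rankings order countries almost identically, values near 0 mean there is no systematic relationship, and values near -1 mean the orderings are reversed.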
An alternative (more appropriate) measure of inequality in education
The alternative approach I propose as a measure of inequality in student academic performance (which I consider more appropriate) starts precisely from this last point. My argument is that, with PISA, we have the privilege of benchmarking various aspects of national education systems' performance against internationally defined landmarks, backed by top-notch statistical design. The establishment of proficiency levels is one aspect of this approach. Using the “standard” of 500, whose design has been described earlier, is another. My proposal for an alternative measure of inequality in student performance, also a “bottom-end” measure, is “simply” to size up the average gap between the nationally measured 10th percentile and the internationally set “standard” average performance of 500.
Also using results from PISA 2012, I calculated the average score of the 10th percentile across the three subjects, then subtracted that average from 500, the benchmark OECD average for each of the three subjects. The result is what I call the Academic gap, i.e. the number of score points that separate the 10th percentile from the OECD “standard” average. This is reported in Table 1 column 3 and in Chart 1. Expressed in simple score points, this Academic gap is easy to understand and to present to policy makers. It is also quite appropriate for measuring progress over time (we may look at this in another post). It is used here to rank countries as well. For the same reason given above when comparing the two measures offered in the UNICEF Report Card, the ranking based on this gap metric does not correlate with the one in the UNICEF Report Card, but it does correlate with the share of children below proficiency level 2 in all three subjects. This gap measure removes the performance of other national students from the calculation, and thus concentrates on giving an appropriate perspective on how deep societies let their academically challenged kids sink, against an international benchmark.
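The calculation just described can be sketched in a few lines. The function name is my own, and the only inputs used are the four three-subject 10th-percentile averages quoted earlier in this post (Estonia 423, Belgium 373, France 363, Chile 335). They are passed as already averaged values because the subject-level figures are not reproduced here.

```python
from statistics import mean

OECD_BENCHMARK = 500.0  # the PISA "standard" average described earlier


def academic_gap(p10_scores, benchmark=OECD_BENCHMARK):
    """Score points separating the average 10th-percentile score from the benchmark.

    p10_scores: 10th-percentile scores, normally one per subject
    (reading, mathematics, science).
    """
    return benchmark - mean(p10_scores)


# Three-subject averages of the 10th percentile, as quoted earlier in the post.
p10_three_subject_average = {"Estonia": 423, "Belgium": 373, "France": 363, "Chile": 335}

gaps = {country: academic_gap([p10]) for country, p10 in p10_three_subject_average.items()}
ranking = sorted(gaps, key=gaps.get)  # smallest gap first

print(gaps)     # {'Estonia': 77.0, 'Belgium': 127.0, 'France': 137.0, 'Chile': 165.0}
print(ranking)  # ['Estonia', 'Belgium', 'France', 'Chile']
```

On these four countries alone, Estonia stays at the top while Chile moves below Belgium and France, because the comparison is made against the same international benchmark for everyone.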
Based on data from PISA 2012.
Note: In the table, I have retained 40 countries, all 34 OECD countries plus OECD accession or partner countries. The UNICEF analysis included a few more European countries and none of the non-European partner countries. The rankings have been adjusted to reflect the countries presented in the table.
The two rankings differ considerably, and no systematic pattern can be recognized in the differences. The media jumped to quick conclusions from the UNICEF report: I read reports published in France (Le Monde, Le Figaro, Le Nouvel Observateur, Libération) and Canada (Toronto Star, La Presse). Would the reporting have been different with a different approach to measuring the achievement gap? Here are a few analytical points that stand out from the ranking of inequality in education based on the Academic gap:
- Considering that the difference between two consecutive proficiency levels in PISA is about 70 score points (averaged across all three subjects), students performing below the 10th percentile mark are at least one level behind those performing at the OECD mean. The gap is not considerable for the first countries in the ranking, where it stays within one and a half levels, about 105 points. These are the first eight countries among the 40: Korea, Estonia, Japan, Finland, Poland, Canada, Ireland and Switzerland. At the other end of the ranking, the gap is more than two proficiency levels (above 140 score points) in 12 countries. This last group includes five European countries: Iceland, Luxembourg, Sweden, Greece and Slovak Republic. It is difficult to draw a bold line in the ranking, but these 12 countries present the most severe academic gap. They are the ones that let some of their children sink deep, and in larger numbers as well, since there the share of children below proficiency level 2 in all three subjects reaches, then exceeds, 15%.
- One could imagine that a high mean score for a country does not preclude very low scores among its lowest performing students. At least when considering the students at the 10th percentile, this is not the case: high performing countries (as measured by their mean score for all three subjects together, provided in Table 1 column 6) do not have a long tail in their distribution on the low performance side. The ten countries with the smallest gap are also the ten countries with the highest overall performance. Among countries performing, in all three subjects aggregated, at or above the OECD standard of 500, France appears as the one with the longest tail on the low performance side, i.e. with the highest relative deficit for its lowest performing student group.
- In all of the first 15 countries in the academic gap ranking, some among the 10% weakest students get a score placing them at proficiency level 2 in at least one of the PISA subjects, hence at the minimum proficiency required to participate fully in modern society. In Table 1, these are the countries where the share of children below proficiency level 2 in all three subjects is below 10%.
- Some groups of countries are often recognized as hanging together in various aspects of economic or social dimensions. One such group is the Nordic countries, often exemplified as countries with high equality: in terms of inequality in academic achievement, there is no such group holding together – Finland is 4th, Denmark 13th, Norway 22nd, Iceland 29th and Sweden 32nd.
- Estonia keeps a similarly high rank in both the UNICEF ranking and this version of the ranking. This is not the case for Chile (36th).
Based on data from PISA 2012.
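The reading of the Academic gap in terms of proficiency levels, used in the first analytical point above, can be sketched as follows. The band labels and function name are my own. The thresholds are the 105 and 140 score points mentioned above (one and a half and two levels of about 70 points each), and the example gaps are simply 500 minus the 10th-percentile averages quoted earlier for Estonia (423), France (363) and Chile (335).

```python
POINTS_PER_LEVEL = 70  # approximate width of a PISA proficiency level, as stated above


def gap_in_levels(gap_points):
    """Express a gap in score points as a number of proficiency levels."""
    return gap_points / POINTS_PER_LEVEL


def gap_band(gap_points):
    """Place a gap into one of the three bands discussed in the text."""
    if gap_points <= 1.5 * POINTS_PER_LEVEL:    # up to about 105 points
        return "up to 1.5 levels"
    if gap_points <= 2 * POINTS_PER_LEVEL:      # up to about 140 points
        return "between 1.5 and 2 levels"
    return "more than 2 levels"


for country, gap in {"Estonia": 77, "France": 137, "Chile": 165}.items():
    print(country, round(gap_in_levels(gap), 1), gap_band(gap))
# Estonia 1.1 up to 1.5 levels
# France 2.0 between 1.5 and 2 levels
# Chile 2.4 more than 2 levels
```

Because the 70-point level width is itself an average across subjects, gaps that fall close to a threshold (France in this example) should be interpreted with some caution.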
The UNICEF Report Card #13, and the Working Paper that focuses on educational inequality (Education for All? Measuring inequality of educational outcomes among 15-year-olds across 39 industrialized nations), provide excellent research and analytical insights into the inequality faced by young people across countries. What I have found problematic is the metric selected to rank countries on inequality in education. It appears difficult to interpret and does not give a clear message to policy makers unless one delves into the extended research. The Academic gap, i.e. the average gap between the 10th percentile and the OECD “standard” average performance, appears as a simple, yet more effective, tool to benchmark how deep societies let their kids sink academically.
According to the metric I propose, it is clear that some countries let their children sink rather deep. This is not just a metaphor. The PISA assessment is carried out with 15-year-old students, near the end of compulsory schooling in most countries, an end that is always determined by age rather than by acquired level of knowledge. Advancement through school is built by an accumulation of knowledge and of academic and social experiences from an early age, and the outcome at age 15 is rarely a sudden revelation. The academic gap, too, is often “built” on the experiences of previous years.
Non solum data – Data sine monito oculo nihil sunt.
References
Bruckauf, Z. and Y. Chzhen (2016). Education for All? Measuring inequality of educational outcomes among 15-year-olds across 39 industrialized nations, Innocenti Working Paper No.2016-08, UNICEF Office of Research, Florence. https://www.unicef-irc.org/publications/843/
OECD (2013), PISA 2012 Results: Excellence Through Equity: Giving Every Student the Chance to Succeed (Volume II), PISA, OECD Publishing. http://dx.doi.org/10.1787/9789264201132-en
OECD (2014), PISA 2012 Technical Report, PISA, OECD Publishing. http://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
UNICEF Office of Research (2016). ‘Fairness for Children: A league table of inequality in child well-being in rich countries’, Innocenti Report Card 13, UNICEF Office of Research – Innocenti, Florence. https://www.unicef-irc.org/publications/830/
Patrice