Sunshine State TESOL Journal






Volume 6, Number 1
Spring 2007


High-stakes Tests, English Language Learners, and Linguistic Modification

 

Jamal Abedi
University of California, Davis
Davis, California

 

Abstract

English language learners are faced with a major challenge of learning and being tested in a language in which they are not quite proficient. Current research on the assessment of ELL students clearly suggests that assessments developed for native speakers of English may not provide reliable and valid outcomes for ELL students. This paper summarizes results of studies that discuss major limitations in the existing assessment tools for ELL students. The paper also provides research-based recommendations on how to increase authenticity of existing assessment tools for ELL students.

 

 

 

Rationale

Current legislation mandates the inclusion of all students, including English language learners (ELLs), in national and state assessments. For example, the most recent reauthorization of the Elementary and Secondary Education Act of 1965, known as the No Child Left Behind Act (NCLB; Public Law No. 107-110, 115 Stat. 1425, 2002), requires schools, districts, and states receiving Title I support to report Adequate Yearly Progress (AYP) for all students and to report separately for four major subgroup categories, including ELL students. They must also report ELL student progress in learning English as required by the NCLB Title III accountability provisions. ELL students, among others, must reach the 100 percent proficiency target by the year 2014.

While members of the education community, particularly those involved in the instruction and assessment of ELLs, see this as a positive move toward a high-quality education system for these students, they are concerned about major consequences of this requirement. In this paper, I will elaborate on some of the challenges faced by ELLs in reaching the NCLB goal of 100 percent proficiency by the target year of 2014.

Current research on the assessment and accommodation of ELLs strongly suggests that assessment instruments developed for native speakers of English may not provide reliable and valid results for ELLs because of the strong impact of language factors on their assessment outcomes. For example, it would be extremely difficult to interpret the low performance of an ELL on a mathematics test with a complex linguistic structure (Abedi, 2006). As a result, we would have to ask: Is the low performance due to a lack of content knowledge in mathematics, difficulty in understanding the question, or a combination of both?

Thus, we are faced with a difficult dilemma regarding the inclusion of ELL students in national and state assessment and accountability systems. If they are included in the assessment and accountability systems, they face major assessment issues, and if they are not included, they will be dropped from the national accountability picture. More specifically, if ELLs are included, they can be placed at a disadvantage because

  • assessment outcomes may not be valid due to the impact of their limited English proficiency on content knowledge performance;
  • invalid test results affect decisions regarding their promotion or graduation;
  • they may be incorrectly placed into special educational programs where they receive inappropriate instruction;
  • ELLs may not have received the same curriculum as non-ELLs and are tested on content for which they have not received instruction; and
  • assessment tools in large-scale assessments, usually constructed for native speakers of English, may be biased against these students.

However, if ELLs are not included in the assessments,

  • their quality of instruction may be affected due to the powerful impact of assessment on instruction;
  • they will be dropped from the accountability picture;
  • institutions will not be held responsible for their performance in school;
  • ELLs will not be included in state or federal policy decisions; and
  • ELLs’ academic progress, skills, and needs may not be appropriately assessed.

In this paper, I present a summary of results from current studies on the impact of linguistic factors on the assessment of ELLs. These results clearly demonstrate how the performance outcomes of ELL students are confounded with their limitations in English proficiency. I will then elaborate on the challenges facing ELLs, their teachers, and their families.

 

A Summary of Research on ELL Student Assessment



Over a decade ago, researchers at the UCLA National Center for Research on Evaluation, Standards, and Student Testing (CRESST) started a series of studies to examine assessment issues for ELLs. The results of these studies, along with the findings of other national studies, have demonstrated that the unnecessary linguistic complexity of content-based assessments (e.g., mathematics and science) is a likely source of measurement error that differentially affects the reliability and validity of assessments for the ELL subgroup (Abedi & Lord, 2001). To control for the impact of language factors on the assessment of ELLs in content-based areas, the concept of linguistic modification was introduced, in which linguistic complexity unrelated to the content of test items is reduced or modified.

Results of analyses of extant data from the National Assessment of Educational Progress (NAEP) suggested that ELLs had difficulty with linguistically complex test items. Studies also found that ELLs exhibited a substantially higher number of omitted or not-reached test items (Abedi, Lord, & Plummer, 1997). Results of these studies led to the formation of the linguistic modification approach. Several linguistic features were identified that slow down the reader, making misinterpretation more likely, and add to the reader’s cognitive load, thus interfering with concurrent tasks. A subsequent study (Abedi & Lord, 2001) examined the effects of the linguistic modification approach with 1,031 eighth grade students in Southern California. In this study, NAEP mathematics items were modified to reduce the complexity of sentence structures and to replace potentially unfamiliar vocabulary with more familiar words. The results showed significant improvements in the scores of ELL students and of non-ELLs in low and average-level mathematics classes, but no significant changes in the scores of other non-ELLs. Among the linguistic features that appeared to contribute to the differences were low-frequency vocabulary and passive voice verb constructions.

Abedi, Lord, and Hofstetter (1998) further examined the impact of language modification on the mathematics performance of English learners and non-English learners in a sample of 1,394 eighth graders in schools with high enrollments of Spanish speakers. Results showed that modification of the language of items contributed to improved performance on 49 percent of the items; the ELLs generally scored higher on shorter and less linguistically complex problem statements. Another study (Abedi, Lord, Hofstetter, & Baker, 2000), with a sample of 946 eighth graders, found that among four different accommodation strategies for ELLs, only the linguistically modified English form narrowed the score gap between English learners and other students. Other studies also found that linguistic modifications of assessments help ELLs to validly show what they know as well as what they are able to do (Abedi, Courtney, & Leon, 2003; Kiplinger, Haug, & Abedi, 2000; Maihoff, 2002; Rivera & Stansfield, 2001). Findings from these studies helped to identify assessment issues specific to ELL students.

The results of analyses of existing data from several locations nationwide show a substantial gap in reliability (internal consistency) and validity (concurrent validity) between ELLs and non-ELLs on test items with a substantial language demand. This gap in the reliability and validity coefficients narrows as the level of language demand of assessments decreases. For example, the results of analyses at one of the data sites showed that the reliability coefficients (alpha) for English-only students range from .898 for mathematics to .805 for science and social science. For ELLs, however, alpha coefficients differ considerably across the content areas. In mathematics, where language factors might not have much influence on performance, the alpha coefficient for ELLs (.802) was slightly lower than the alpha for English-only students (.898). In language, science, and social science, however, the gap in alpha between English-only students and ELLs was large. Averaged over the language, science, and social science results, alpha was .808 for English-only students as compared to .603 for ELL students. Thus, language factors introduce another source of measurement error into ELLs’ test results, one that might not have much impact on native or fluent speakers of English (for a more detailed description, see Abedi, 2006; Abedi, Leon, & Mirocha, 2003). However, when we reduced the level of linguistic complexity in science and social science tests, the gap in the reliability coefficient was reduced substantially.
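For readers who want to see how an internal-consistency coefficient of the kind reported above is obtained, the following is a minimal sketch of Cronbach's alpha in Python. The function name and the small score matrix are hypothetical illustrations; they are not data or code from the studies cited.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of student rows, each a list of item scores."""
    k = len(scores[0])  # number of items on the test
    # Sample variance of each item across students
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the students' total scores
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical right/wrong (1/0) scores for five students on four items
example_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))  # prints 0.8
```

Alpha rises when item scores vary together across students. Any source of error that makes item responses less consistent, such as language demand unrelated to the content, would be expected to lower it.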

 

Teacher-based Research on the Impact of Linguistic Modification of Test Items on ELL Student Performance



The concept of language modification in assessments can be applied to the instruction of ELL students as well. For example, ELLs can also benefit from instruction with simple linguistic structure. Therefore, familiarity with the linguistic modification approach would be beneficial to teachers of ELLs. Based on this premise, a group of teachers of ELLs (grades 8 through 10) at the Los Angeles County Office of Education (LACOE) were invited to participate in a study in which they performed linguistic modification of mathematics and reading tests and administered a linguistically modified test to students during one 50- to 60-minute class period in the 2003 to 2004 school year (Abedi, Koency, Courtney, Leon, & Tiwana, 2006). In a one-day training session, researchers from the UCLA Center for the Study of Evaluation discussed the concept of linguistic modification of assessments with the LACOE teachers and presented them with summaries of research indicating the effectiveness of this approach. Teachers were then provided with instruction and a rubric to rate a set of released test items from the California High School Exit Exam (CAHSEE). They applied linguistic modification guidelines to the CAHSEE released items, and linguistically modified mathematics and language arts exams were created. The modifications were approved by content experts, who found no change in the construct of each modified exam item. Written protocol scripts and testing timers were provided to the participating teachers. There were 26 items on the mathematics exam, covering number power, algebra, and geometry standards, and 16 items on the language arts exam, consisting of reading passages and comprehension questions.

The total sample of students from the classes taught by the participating teachers was 698 and included students from grades 8 through 10. Of these, 316 students took the mathematics exam (111 ELL and 205 non-ELL students) and 382 took the language arts exam (274 ELL and 108 non-ELL students).

In studying whether reducing the linguistic complexity of high school exit exam items improves student performance, my colleagues and I found that both ELLs and non-ELLs benefited from the linguistic modifications made to the exam items. This result is similar to that of a recent CRESST study with grade 8 students from a similar population of low-performing students (Abedi, Courtney, Leon, Kao, & Azzam, in press). Teachers who participated in the experiment indicated that the experience of modifying the linguistic complexity of items was quite valuable and that they learned some techniques to apply in their classrooms. Furthermore, the findings of this study suggest that not only ELLs but also native speakers of English at the lower tail of the academic performance distribution may benefit from assessments with simple linguistic structure, and that such modification may not alter the construct being assessed.

 

What is Language Modification of Test Items?

 

Indices of language complexity include word frequency and familiarity, word length, and sentence length. Other linguistic features that cause difficulty include passive voice constructions, comparative structures, prepositional phrases, sentence and discourse structure, subordinate clauses, conditional clauses, relative clauses, concrete versus abstract or impersonal presentations, and negation (for a more detailed description of language modification, see Abedi & Lord, 2001; Abedi, Lord, & Plummer, 1997).

Concerns have been raised, however, about the concept and application of language modification of text used in the assessment and instruction of ELL students. For example, to be successful academically, ELL students must be proficient in academic language, which is not necessarily the same as conversational fluency. Academic language includes less frequent vocabulary and the ability to interpret and produce complex written language. Students should be able to understand complex linguistic structures related to content areas such as science, social sciences, and mathematics. Reducing the complexity of the language required to perform such complex tasks may not be productive (Bielenberg & Wong Fillmore, 2004; Celedon-Pattichis, 2003).

I understand the importance of academic language proficiency in content-based instruction and assessment. Reducing the level of complexity of academic content may change the construct being taught and assessed. However, I distinguish between complex linguistic structures that are related to the content of assessment and instruction and the unnecessary linguistic complexity of text in both assessment and instruction. To illustrate my point, I use a few released test items to show how unnecessary linguistic complexity may hinder students’ ability to provide a valid picture of what they know and can do. I first present these items in their original form and then propose some linguistic revisions that help facilitate students’ understanding of the text. In these revisions, different linguistic features that contributed to the complexity of assessment questions were modified.

 

Example 1.

Original

A certain reference file contains approximately six billion facts. About how many millions is that?

(a) 6,000,000

(b) 600,000

(c) 60,000

(d) 6,000

(e) 600

Revised

Mack’s company sold six billion hamburgers. How many millions is that?

(a) 6,000,000

(b) 600,000

(c) 60,000

(d) 6,000

(e) 600

In this example, potentially unfamiliar, low-frequency lexical terms (certain, reference, file) were replaced with more familiar, higher frequency terms (company, hamburger).

 

Example 2.

Original

Raymond must buy enough paper to print 28 copies of a report that contains 64 sheets of paper. Paper is only available in packages of 500 sheets. How many whole packages of paper will he need to buy to do the printing?

Revised

Raymond has to buy paper to print 28 copies of a report. He needs 64 sheets of paper for each report. There are 500 sheets of paper in each package. How many whole packages of paper must Raymond buy?

In this example, the original version presents information in complex sentences that include relative clauses, whereas the revised item presents the same information in separate, simple sentences.

 

Example 3.

Original

If Y represents the number of newspapers that Lee delivers each day, which of the following represents the total number of newspapers that Lee delivers in 5 days?

(a) 5 + Y

(b) 5 x Y

(c) Y + 5

(d) (Y + Y) x 5

Revised

Lee delivers Y newspapers each day. How many newspapers does he deliver in 5 days?

(a) 5 + Y

(b) 5 x Y

(c) Y + 5

(d) (Y + Y) x 5

In the original question, the conditional clause could be hard to understand even for native speakers of English. Separate sentences, rather than subordinate “if” clauses, may be easier for some students to understand.
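Two of the surface indices named earlier, sentence length and word length, can be computed directly from an item's text. The sketch below applies them to the original and revised versions of Example 3. It is only a rough illustration under simple assumptions: the function name and the tokenizing rules are hypothetical, it is not the rubric used in the studies cited, and it does not detect features such as passive voice or conditional clauses.

```python
import re

def surface_indices(text):
    """Return sentence count, mean sentence length (in words),
    and mean word length (in characters) for a test item."""
    # Split on sentence-final punctuation and drop empty pieces
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    # Treat runs of letters or digits as words
    words = re.findall(r"[A-Za-z0-9]+", text)
    return {
        "sentences": len(sentences),
        "mean_sentence_length": len(words) / len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }

original = ("If Y represents the number of newspapers that Lee delivers "
            "each day, which of the following represents the total number "
            "of newspapers that Lee delivers in 5 days?")
revised = ("Lee delivers Y newspapers each day. "
           "How many newspapers does he deliver in 5 days?")

for label, item in (("original", original), ("revised", revised)):
    print(label, surface_indices(item))
```

On these two versions, the original is a single 28-word sentence, while the revision splits the same information into two sentences averaging 7.5 words each.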

 

Why Language Modification of Assessments is Needed for ELLs

 

Research has demonstrated that ELLs may have content knowledge but may lack the language facility to express the content knowledge they have mastered (see, for example, Abedi, 2006). Linguistic complexity of assessment, as a construct-irrelevant source of variance, may have a negative impact on the validity of assessment outcomes; therefore, reducing the impact of such a source should help to provide a more valid assessment outcome for ELLs.

However, for some content areas such as reading, the target construct is language; therefore, the concept of linguistic complexity may not apply, since simplifying the language of test items may change the construct being measured. In other content areas such as mathematics, science, and social sciences, unnecessary language demand may introduce a bias into the assessment that could jeopardize its validity. Results of analyses of data from many locations nationwide show that the higher the level of language demand of test items, the larger the performance gap between ELLs and non-ELLs (see, for example, Abedi, Leon, & Mirocha, 2003). Data in Table 1 illustrate this trend. For grade 10 students, there is a substantial performance gap of 14.0 score points (38.0 - 24.0) between the mean scores of ELLs and non-ELLs in reading. This gap was reduced to 9.7 (42.6 - 32.9) in science and further reduced to 2.8 (39.6 - 36.8) in linguistically modified mathematics. Similarly, for grade 11 students, the performance gap between ELLs and non-ELLs in reading was 15.9 score points, which was reduced to 11.2 score points in science and to approximately 0 in mathematics. The results of these analyses clearly suggest a strong impact of language demand on test performance, with the gap narrowing as language demand decreases from reading to science to mathematics.


Table 1
Means & Standard Deviations for Students in Grades 10 and 11 in a Large School District

                   Reading          Science          Mathematics
                   M       SD       M       SD       M       SD
Grade 10
  ELL only         24.0    16.4     32.9    15.3     36.8    16.0
  Non-ELL/SWD      38.0    16.0     42.6    17.2     39.6    16.9
  All students     36.0    16.9     41.3    17.5     38.5    17.0
Grade 11
  ELL only         22.5    16.1     28.4    14.4     45.5    18.2
  Non-ELL/SWD      38.4    18.3     39.6    18.8     45.2    21.1
  All students     36.2    19.0     38.2    18.9     44.0    21.2


Note. ELL = English language learners. SWD = students with disabilities.
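
The performance gaps discussed in the text are differences between the Non-ELL/SWD and ELL-only means in Table 1. The short sketch below recomputes them from the table values, as one way of checking the arithmetic. The variable and function names are hypothetical.

```python
# Mean scores copied from Table 1 as (ELL only, Non-ELL/SWD)
table1_means = {
    "Grade 10": {"Reading": (24.0, 38.0), "Science": (32.9, 42.6), "Mathematics": (36.8, 39.6)},
    "Grade 11": {"Reading": (22.5, 38.4), "Science": (28.4, 39.6), "Mathematics": (45.5, 45.2)},
}

def performance_gaps(means):
    """Non-ELL mean minus ELL mean for each grade and subject, to one decimal."""
    return {
        grade: {subject: round(non_ell - ell, 1)
                for subject, (ell, non_ell) in subjects.items()}
        for grade, subjects in means.items()
    }

gaps = performance_gaps(table1_means)
for grade, by_subject in gaps.items():
    print(grade, by_subject)
```

In both grades the gap is largest in reading, smaller in science, and smallest in mathematics, which is the ordering the argument relies on.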

 

 

In summary, for some content areas such as reading, the target construct is language; therefore, the concept of linguistic complexity may not apply since simplifying the language of test items may change the construct being measured. However, in other content areas such as mathematics, science and social sciences, unnecessary language demand may introduce a bias into the assessment that could jeopardize the validity of assessment.

 

How Do Students React to Simple-Structure Test Items?

 

In a study using released mathematics items from the National Assessment of Educational Progress (NAEP), Abedi, Lord, and Plummer (1997) modified the NAEP test items to reduce the level of unnecessary linguistic complexity. Two mathematics content experts independently compared the original and revised items to ensure that the language related to the mathematics content was not changed. Minor issues were identified by the experts and were addressed. Two test booklets were then prepared, one containing the original mathematics items and the other containing the revised items. In an interview study, a group of 38 grade 8 students, including both ELLs and non-ELLs, were asked to compare the two sets of items and tell the interviewer which set they would prefer to use. Over 80 percent of the respondents (both ELLs and non-ELLs) indicated that they would prefer the revised version of the items and said that they understood those items more clearly. Below are some comments from these students:

 

  • “Well, it makes more sense.”
  • “It explains it better.”
  • “Because that one’s more confusing.”
  • “It seems simpler. You get a clear idea of what they want you to do.”
  • “It’s easier to read, and it gets to the point, so you won’t have to waste time.”
  • “I might have a faster time completing that one ’cause there’s less reading.”
  • “Less reading; then I might be able to get to the other one in time to finish both of them.”
  • “’Cause it’s, like, a little bit less writing.”
  • “This one uses words like ‘sector’ and ‘approximation,’ and this one uses words that I can relate to.”
  • “It doesn’t sound as technical.”
  • “I can’t read that word.”
  • “Because it’s shorter and doesn’t have, like, complicated words.”

 

Once again, it is quite clear from the results of the data analyses and from student input and feedback that language modification of test items helps create a better and more valid assessment tool, not only for ELLs but also for their low-performing native English-speaking counterparts. Linguistic modification of assessment may be used as a form of accommodation for ELLs. While such accommodation may also help low-performing native speakers of English, it does not alter the construct being measured (Abedi, Hofstetter, & Lord, 2004).

 

Consequences of Assessment Issues for ELL Students, Their Teachers and Their Families

 

Technical issues in the assessment of ELL students may not only have a big effect on the performance of these students but also have major consequences for their teachers and families. ELLs are faced with dual challenges. They need to learn the necessary content knowledge, but they also need to become proficient enough in English to be able to learn that content. Similarly, teachers of these students have to deal with such challenges. In addition to teaching the content, they must be aware of their students’ language limitations and try to help them with these language limitations without sacrificing instructional programs for their non-ELL students.

ELLs’ families are also faced with the challenge of helping their children with complex academic programs. Regardless of how smart ELLs may be, without fully understanding the language of instruction, they may not be able to learn from the teacher’s instruction and perform well on required assignments and assessments. While I have discussed the serious impact that unnecessary linguistic complexity may have on assessment outcomes for ELLs, there are many other factors that could influence the validity of assessments for these students. It is therefore imperative to examine all issues and factors that influence assessment outcomes for these students.

 

Conclusions

 

ELLs are faced with a challenging task in their academic lives. They have to work much harder than their native English-speaking peers to learn content knowledge in a language with which they may not feel quite comfortable. In addition, they have to respond to test items that require a substantial command of English, as seen in non-modified test prompts. Because of ELLs’ possible language limitations, they have less opportunity to learn than their native English-speaking peers, and they have difficulty understanding the wording of test items, which impedes an accurate reflection of their content knowledge.

To help ELLs overcome the problems they face in instruction and assessment, one must first clearly understand the factors affecting their academic performance. While ELLs come from different family and cultural backgrounds, they all need assistance with language. Schools and districts around the nation provide accommodations for these students to help them understand instructional materials and participate in content-based assessment. Unfortunately, many of these accommodations are adopted from those used for students with disabilities and, therefore, are not relevant for ELLs (see, for example, Abedi, Hofstetter, & Lord, 2004). Studies have suggested that the most appropriate assistance for these students is help with their language needs.

Among the different forms of accommodation and assistance provided to these students, language modification of assessments and instructional materials has been shown to be promising. As indicated in this paper, my colleagues and I have been able to modify test items linguistically while minimally changing the construct being measured. Students have found these test items easy to understand and respond to, and the modifications have not sacrificed our understanding of their proficiency in subject areas, particularly mathematics.

This paper may help educators, students, and their families to understand some of the major challenges these students face, and it provides research-based recommendations to resolve these issues. The results of data analyses and research summaries presented in this paper show how language factors affect the performance of ELL students. They demonstrate that the higher the level of language demand, the larger the performance gap between ELLs and non-ELLs. The paper also highlights an important form of assistance we may provide these students so that they may overcome the linguistic obstacles facing them in school.

I urge teachers, other school officials, and families to gain a better understanding of the nature of the problems that these students are struggling with and to provide appropriate solutions. I hope that the examination of specific linguistic modifications presented here contributes to these solutions.


References

 

Abedi, J. (2006). Language issues in item development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. New Jersey: Lawrence Erlbaum Associates.

Abedi, J., Koency, G., Courtney, M., Leon, S., & Tiwana, R. (2006). The modification of math and reading test language for English language learners. Los Angeles: CRESST/University of California, Los Angeles and the Los Angeles County Office of Education.

Abedi, J., Leon, S., & Mirocha, J. (2003). Impact of students’ language background on content-based data: Analyses of extant data (CSE Tech. Rep. No. 603). Los Angeles: University of California, Center for the Study of Evaluation/National Center for Research on Evaluation, Standards, and Student Testing.

Abedi, J., Courtney, M., & Leon, S. (2003). Effectiveness and validity of accommodations for English language learners in large-scale assessments (CSE Tech. Rep. No. 608). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing.

Abedi, J. & Lord, C. (2001). The language factor in mathematics tests. Applied Measurement in Education, 14(3), 219-234.

Abedi, J., Lord, C., & Hofstetter, C. (1998). Impact of selected background variables on students’ NAEP math performance (CSE Tech. Rep. No. 478). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing.

Abedi, J., Lord, C., Hofstetter, C., & Baker, E. (2000). Impact of accommodation strategies on English language learners’ test performance. Educational Measurement: Issues and Practice, 19 (3), 16-26.

Abedi, J., Lord, C., & Plummer, J. (1997). Language background as a variable in NAEP mathematics performance (CSE Tech. Rep. No. 429). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing.

Bielenberg, B., & Wong Fillmore, L. (2004). ELLs and high stakes testing: Enabling students to make the grade. Educational Leadership, 62(4), 45-49.

Celedon-Pattichis, S. (2003). Constructing meaning: Think-aloud protocols of ELLs on English and Spanish word problems. Education for Urban Minorities, 2(2), 74-90.

Hakuta, K., & Beatty, A. (Eds.). (2000). Testing English-language learners in U.S. schools. Washington, DC: National Academy Press.

Kindler, A. L. (2002). Survey of the states’ limited English proficient students & available educational programs and services, 2000-2001 Summary Report. Washington, DC: National Clearinghouse for English Language Acquisition and Language Instruction Educational Programs.

Kiplinger, V. L., Haug, C. A., & Abedi, J. (2000, April). Measuring math – not reading – on a math assessment: A language accommodations study of English language learners and other special populations. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Maihoff, N. A. (2002, June). Using Delaware data in making decisions regarding the education of LEP students. Paper presented at the Council of Chief State School Officers 32nd Annual National Conference on Large-Scale Assessment, Palm Desert, CA.

Rivera, C., & Stansfield, C. W. (2001, April). The effects of linguistic simplification of science test items on performance of limited English proficient and monolingual English-speaking students. Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.

Author Bio

Jamal Abedi is a Professor at the School of Education of the University of California, Davis and a research partner at the National Center for Research on Evaluation, Standards, and Student Testing (CRESST). His research interests include psychometrics and test and scale development, focusing on the validity of assessments and accommodations for English language learners (ELLs), and research on the opportunity to learn for ELLs. Dr. Abedi has developed a culture-free instrument for measuring creativity, which has been translated into a number of languages and administered in several countries.

 


 







 

 







Sunshine State TESOL Journal
ISSN 1934-7030
Copyright rests with authors