Sunshine
State TESOL
Journal
Volume 6, Number 1
Spring 2007
High-stakes Tests, English
Language Learners, and
Linguistic
Modification
Jamal Abedi
University
of California,
Davis
Davis, California
Abstract
English language
learners are faced with a major challenge of learning and being tested
in a
language in which they are not quite proficient. Current research on
the
assessment of ELL students clearly suggests that assessments developed
for
native speakers of English may not provide reliable and valid outcomes
for ELL
students. This paper summarizes results of studies that discuss major
limitations in the existing assessment tools for ELL students. The
paper also
provides research-based recommendations on how to increase authenticity
of
existing assessment tools for ELL students.
Rationale
Current legislation
mandates the inclusion of all
students into national and state assessments including English language
learners (ELLs). For example, the most
recent
reauthorization of the Elementary and Secondary Education Act of 1965 known
as the No Child Left Behind Act (NCLB; Public Law No.107-110, 115 Stat. 1425,
2002) requires
schools, districts, and states receiving Title I support to report
Adequate
Yearly Progress (AYP) for all students and separately report four major
subgroup categories including ELL students. They must also report ELL
student
progress in learning English as required by the NCLB Title III
accountability
provisions. ELL students, among others, must reach the 100 percent
proficiency
target by the year 2014.
While members of the education
community,
particularly those involved in the instruction and assessment of ELLs
see this
as a positive move toward a high quality education system for these
students,
they are concerned about major consequences of this requirement. In
this paper,
I will elaborate on some of the challenges faced by ELLs in reaching
the NCLB
goal of 100 percent proficiency by the target year of 2014.
Current research on the assessment
and
accommodations of ELLs strongly suggests that the assessment
instruments
developed for native speakers of English may not provide reliable and
valid
results for ELLs due to the strong impact of language factors on their
assessment outcomes. For example, it would be extremely difficult to
interpret
low performance of an ELL on a mathematics test with a complex
linguistic
structure (Abedi, 2006). As a result, we would have to ask: Is the low
performance due to lack of content knowledge in mathematics, difficulty
in
understanding the question, or a combination of both?
Thus, we are faced
with a difficult dilemma regarding the inclusion of ELL students in
national
and state assessment and accountability systems. If they are
included in the assessments and accountability systems, they
are faced with major assessment issues, and if they are not included
they will
be dropped out of the national accountability picture. More
specifically, if
ELLs are included, they can be placed at a disadvantage because
- assessment
outcomes may not be valid due to the impact of their limited English
proficiency on content knowledge performance;
- invalid test
results affect decisions regarding their promotion or graduation;
- they may be
incorrectly placed into special educational programs where they receive
inappropriate instruction;
- ELLs may not
have received the same curriculum as non-ELLs and are tested on content
for which they have not received instruction; and
- assessment
tools in large-scale assessments, usually constructed for native
speakers of English, may be biased against these students.
However, if ELLs are not included in the assessments,
- their quality
of instruction may be affected due to the powerful impact of assessment
on instruction;
- they will be
dropped out of the accountability picture;
- institutions
will not be held responsible for their performance in school;
- ELLs will not
be included in state or federal policy decisions; and
- ELLs’ academic
progress, skills, and needs may not be appropriately assessed.
In this
paper I present a summary of results of
some current studies on the impact of linguistic factors on the
assessment of
ELLs that clearly demonstrate how performance outcomes of ELL students
are
confounded with their limitations in English proficiency. I will then
elaborate
on the challenges facing ELLs, their teachers and their families.
A Summary of Research on ELL Student Assessment
Over a decade ago,
researchers at the UCLA
National Center
for Research on Evaluation, Standards, and Student Testing (CRESST) started a
series of
studies to examine assessment issues for ELLs. The results of these
studies
along with the findings of other national studies have demonstrated
that the
unnecessary linguistic complexity of content-based assessments (e.g.,
mathematics and science) is a likely source of measurement error
differentially
affecting the reliability and validity of assessments for the ELL
subgroup
(Abedi & Lord, 2001). To control for the impact of language factors
on the
assessment of ELLs in content-based areas, the concept of linguistic
modification was introduced, in which linguistic
complexity unrelated to the content in test items is reduced or
modified.
Results of
analyses of extant data from the National Assessment of Educational
Progress
(NAEP) suggested that ELLs had difficulty with the linguistically
complex test
items. Studies also found that ELLs exhibited a substantially higher
number of
omitted or not-reached test items (Abedi, Lord, & Plummer, 1997).
Results
of these studies led to the formation of the linguistic
modification approach. Several linguistic features were
identified that slow down the reader, making misinterpretation more
likely, and
add to the reader’s cognitive load, thus interfering with concurrent
tasks. A
subsequent study (Abedi & Lord, 2001) examined the effects of the
linguistic modification approach with 1,031 eighth grade students in Southern California. In this study, NAEP
mathematics
items were modified to reduce the complexity of sentence structures and
to
replace potentially unfamiliar vocabulary with more familiar words. The
results
showed significant improvements in the scores of ELL students and
non-ELLs in
low and average-level mathematics classes, but
no significant changes to scores among other non-ELLs.
Among the
linguistic features that appeared to contribute to the differences were
low-frequency
vocabulary and passive voice verb constructions.
Abedi, Lord, and
Hofstetter (1998) further examined the impact of language modification
on the
mathematics performance of English learners and non-English learners on
a
sample of 1,394 eighth graders in schools with high enrollments of
Spanish
speakers. Results showed that modification of the language of items
contributed
to improved performance on 49 percent of the items; the ELLs generally
scored
higher on shorter and less linguistically complex problem statements.
Another
study (Abedi, Lord, Hofstetter, & Baker, 2000) on a sample of 946
eighth
graders found that among four different accommodation strategies for
ELLs, only
the linguistically modified English form narrowed the score gap between
English
learners and other students. Other studies also found linguistic
modifications
of assessments help ELLs to validly show what they know as well as what
they
are able to do (Abedi, Courtney, & Leon, 2003; Maihoff, 2002;
Kiplinger,
Haug, & Abedi, 2000; Rivera & Stansfield, 2001). Findings from
these
studies helped to identify assessment issues specific to ELL students.
The
results of analyses of existing data from several locations
nationwide show a substantial gap in reliability (internal consistency)
and
validity (concurrent validity) between ELLs and non-ELLs on test items
with a
substantial language demand. This gap in the reliability and validity
coefficients reduces as the level of language demand of assessments
decreases.
For example, the results of analyses in one of the data sites showed
that the
reliability coefficients (alpha) for English-only students range from
.898 for
mathematics to .805 for science and social science. For ELLs, however,
alpha
coefficients differ considerably across the content areas. In
mathematics, where
language factors might not have much influence on performance, the
alpha
coefficient for ELLs (.802) was slightly lower than the alpha for
English-only
students (.898). In language, science, and social science, however, the
gap on
alpha between English-only and ELLs was large. Averaging over language,
science, and social science results, the average alpha for English-only
was
.808 as compared to an average alpha of .603 for ELL students. Thus,
language
factors introduce another source of measurement error for ELLs’ test
results
which might not have much impact on the native/fluent speakers of
English (for
a more detailed description, see Abedi, 2006; Abedi, Leon,
and Mirocha, 2003). However, when we reduced the level of linguistic
complexity
in science and social science tests, the gap in the reliability
coefficient was
reduced substantially.
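The internal-consistency coefficient (alpha) reported above can be illustrated with a minimal sketch. The function below follows the standard formula for Cronbach's alpha; the two small score matrices are hypothetical and are not data from the studies cited. They only show how inconsistent item responses, such as those that language-related measurement error might produce, lower the coefficient.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of student rows, one score per item."""
    n_items = len(scores[0])
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) responses of five students to four items.
# In the first set, students who answer a harder item also answer the easier ones.
consistent = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# In the second set, responses show no such pattern (illustrative noise only).
noisy = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]

alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
print(alpha_consistent, alpha_noisy)
```

With only a handful of students and items the values themselves mean little; the point of the sketch is the direction of the difference between the two coefficients.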
Teacher-based Research on the Impact of Linguistic
Modification of Test Items on ELL Student Performance
The concept of language
modification in
assessments can be applied to instruction of ELL students as well. For
example,
ELLs can also benefit from instruction with simple linguistic
structure.
Therefore, familiarity with the linguistic modification approach would
be
beneficial to teachers of ELLs. Based on this premise, a group of
teachers of
ELLs (grades 8 through 10) at the Los Angeles County Office of
Education
(LACOE) were invited to participate in a study to perform linguistic
modification of mathematics and reading tests and administer a
linguistically
modified test with students during one 50- to 60-minute class period in
the
2003 to 2004 school year (Abedi, Koency, Courtney, Leon, & Tiwana,
2006).
In a one-day training session, researchers from the UCLA Center
for the Study of Evaluation discussed the concept of linguistic
modification of
assessments with the LACOE teachers and presented them summaries of
research
indicating the effectiveness of this approach. Teachers were then
provided with
instruction and a rubric to rate a set of released test items from the
California High School Exit Exam (CAHSEE). They applied linguistic
modification
guidelines to the CAHSEE released items. Linguistically modified
mathematics
and language arts exams were created. The modifications were approved
by
content experts who found no change in the construct of each modified
exam
item. Written protocol scripts and testing timers were provided to the
participating teachers. There were 26 items on the
mathematics exam,
covering number power, algebra, and geometry standards, as well as 16
items on
the language arts exam, consisting of reading passages and
comprehension
questions.
The total sample of students from
the
classes taught by the participating teachers was 698 and included
students from
grades 8 through 10 with 316 students taking the mathematics exam. Of
these,
there were 111 ELL and 205 non-ELL students. The number of students who
took
the language arts exams was 382, which included 274 ELL and 108 non-ELL
students.
In studying whether high school
exit exam
testing with language modification of items for less linguistic
complexity
improves the performance of non-ELL students, my colleagues and I found
that
both ELLs and non-ELLs benefited from the linguistic modifications made
to the
exam items. This result is similar to a recent CRESST study with grade
8
students from a similar population of low-performing students (Abedi, Courtney, Leon,
Kao, and Azzam, in press). Teachers who participated in the experiment
indicated that the experience in modifying linguistic complexity of
items was
quite valuable and that they learned some techniques to apply in their
classrooms. Furthermore, findings of this study suggest that not only
ELLs, but
native speakers of English at the lower tail of the academic
performance
distribution may also benefit from assessment with simple linguistic
structure
while it may not alter the construct being assessed.
What
is Language Modification of Test
Items?
Indices of
language complexity include
word frequency and familiarity, word length, and sentence length. Other
linguistic features causing difficulty include passive voice
constructions,
comparative structures, prepositional phrases, sentence and discourse
structure, subordinate clauses, conditional clauses, relative clauses,
concrete
versus abstract or impersonal presentations, and negation (for a more
detailed
description of language modification see, Abedi & Lord, 2001;
Abedi, Lord
& Plummer, 1997).
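Some of the surface indices listed above (word length and sentence length) can be approximated with a few lines of code. The sketch below is illustrative only: the tokenizing rules and the function name `complexity_indices` are my assumptions and are not the rating rubric used in the studies cited. The two item texts are the original and revised versions of the newspaper item shown in Example 3 of this paper.

```python
import re


def complexity_indices(text):
    """Rough surface indices of language complexity for one test item."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    # Treat runs of letters, digits, and apostrophes as words.
    words = re.findall(r"[A-Za-z0-9']+", text)
    return {
        "sentences": len(sentences),
        "words": len(words),
        "words_per_sentence": len(words) / len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }


original_item = (
    "If Y represents the number of newspapers that Lee delivers each day, "
    "which of the following represents the total number of newspapers "
    "that Lee delivers in 5 days?"
)
revised_item = (
    "Lee delivers Y newspapers each day. "
    "How many newspapers does he deliver in 5 days?"
)

original_stats = complexity_indices(original_item)
revised_stats = complexity_indices(revised_item)
print(original_stats)
print(revised_stats)
```

Counts of this kind capture only length-based indices; features such as passive voice, conditional clauses, or word familiarity would require linguistic judgment or additional resources.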
There are,
however, issues raised on the
concept and application of language modification of text used in the
assessment
and instruction of ELL students. For example, to be successful
academically,
ELL students must be proficient in academic language which is not
necessarily
the same as conversational fluency. Academic language includes less
frequent
vocabulary and ability to interpret and produce complex written
language.
Students should be able to understand complex linguistic structure
related to
content areas such as science, social sciences and mathematics.
Reducing the
complexity of language required to perform such complex tasks may not
be
productive (Bielenberg & Wong Fillmore, 2004; Celedon-Pattichis,
2003).
I
understand the importance of academic
language proficiency in content-based instruction and assessment.
Reducing the
level of complexity of academic content may change the construct being
taught
and being assessed. However, I distinguish between complex linguistic
structures related to the content of assessment and instruction as
opposed to
the unnecessary linguistic complexity of text in both assessment and
instruction. To illustrate this point, I use a few released test items
to show
how unnecessary linguistic complexity may hinder students’ ability to
provide a
valid picture of what they know and can do. I first present these items
in
their original form and then propose some linguistic revisions that
help
facilitate students’ understanding of the text. In these revisions,
different
linguistic features that contributed to the complexity of assessment
questions
were modified.
Example 1.
Original
A certain
reference file contains approximately six billion facts. About how many
millions is that?
(a) 6,000,000
(b) 600,000
(c) 60,000
(d) 6,000
(e) 600
Revised
Mack’s company
sold six billion hamburgers. How many millions is that?
(a) 6,000,000
(b) 600,000
(c) 60,000
(d) 6,000
(e) 600
In this example, potentially
unfamiliar,
low-frequency lexical terms (certain, reference, file) were replaced
with more
familiar, higher frequency terms (company, hamburger).
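The mathematics required by Example 1 is the same in both versions, which a short worked check makes explicit. This is a minimal sketch; the variable names are mine, and the answer choices are those printed with the item.

```python
# Six billion expressed in millions: the computation is identical
# for the original (reference file) and revised (hamburgers) wording.
six_billion = 6_000_000_000
one_million = 1_000_000
millions = six_billion // one_million

# Answer choices as printed with the item.
choices = {"a": 6_000_000, "b": 600_000, "c": 60_000, "d": 6_000, "e": 600}
answer = next(letter for letter, value in choices.items() if value == millions)
print(millions, answer)
```

Because the keyed answer does not depend on the wording, the revision changes only the language load of the item, not the content being measured.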
Example 2.
Original
Raymond must buy
enough paper to print 28 copies of a report that contains 64 sheets of
paper.
Paper is only available in packages of 500 sheets. How many whole
packages of
paper will he need to buy to do the printing?
Revised
Raymond
has to buy
paper to print 28 copies of a report. He needs 64 sheets of paper for
each
report. There are 500 sheets of paper in each package. How many whole
packages
of paper must Raymond buy?
In
this example, the original version contains information in a complex
sentence
including relative clauses, whereas the revised item contains the same
information in separate, simple sentences.
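Both versions of Example 2 call for the same two-step computation, sketched below. The variable names are my own; the quantities come directly from the item.

```python
import math

copies = 28             # copies of the report
sheets_per_copy = 64    # sheets of paper in each report
sheets_per_package = 500

# Step 1: total sheets needed.
total_sheets = copies * sheets_per_copy
# Step 2: whole packages, rounding up because partial packages are not sold.
packages = math.ceil(total_sheets / sheets_per_package)
print(total_sheets, packages)
```

The rounding-up step is part of the mathematics construct and is preserved in the revision; only the sentence structure carrying that information is simplified.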
Example 3.
Original
If Y represents
the number of newspapers that Lee delivers each day, which of the
following
represents the total number of newspapers that Lee delivers in 5 days?
(a) 5 + Y
(b) 5 x Y
(c) Y + 5
(d) (Y + Y) x 5
Revised
Lee delivers Y
newspapers each day. How many newspapers does he deliver in 5 days?
(a) 5 + Y
(b) 5 x Y
(c) Y + 5
(d) (Y + Y) x 5
In
the original question, the conditional clause could be hard to
understand even
for native speakers of English. Separate sentences, rather than
subordinate “if” clauses, may be easier for some
students to understand.
Why Language Modification of Assessments is Needed for ELLs
Research
has demonstrated that ELLs may have content knowledge but may lack
language
facilities to express the content knowledge they master (see, for
example,
Abedi, 2006). Linguistic complexity of assessment, as a construct-irrelevant
source, may have a negative impact on the validity of
assessment
outcomes; therefore, reducing the impact of such a source should help
to
provide a more valid assessment outcome for ELLs.
However,
for some content areas such as reading, the target construct is
language;
therefore, the concept of linguistic complexity may not apply since
simplifying
the language of test items may change the construct being measured. In
other
content areas such as math, science and social sciences, unnecessary
language
demand may introduce a bias into the assessment that could jeopardize
the
validity of assessment. As such, results of analyses of data from many
locations nationwide show that the higher the level of language demand
of test
items the larger the performance-gap between ELLs and non-ELLs (see,
for
example, Abedi, Leon, & Mirocha, 2003). Data in Table 1 illustrate
this
trend. For grade 10 students there is a substantial performance-gap of
14
(38.0-24.0) score points between the mean score of ELLs and non-ELLs in
reading. This performance-gap was reduced to 9.7 (42.6-32.9) in science
and
further reduced to 2.8 (39.6-36.8) in linguistically modified mathematics.
Similarly, for
grade 11 students the performance-gap between ELLs and non-ELLs in
reading was
15.9 score points which was reduced to 11.2 score points in science and
further
reduced to 0 in mathematics. The results of these analyses clearly
suggest the
high level of impact of language demand on test items particularly
mathematics.
Table 1
Means & Standard Deviations for Students in Grades 10 and 11 in a Large School District
| | Reading M | Reading SD | Science M | Science SD | Mathematics M | Mathematics SD |
| Grade 10 | | | | | | |
| ELL only | 24.0 | 16.4 | 32.9 | 15.3 | 36.8 | 16.0 |
| Non-ELL/SWD | 38.0 | 16.0 | 42.6 | 17.2 | 39.6 | 16.9 |
| All students | 36.0 | 16.9 | 41.3 | 17.5 | 38.5 | 17.0 |
| Grade 11 | | | | | | |
| ELL only | 22.5 | 16.1 | 28.4 | 14.4 | 45.5 | 18.2 |
| Non-ELL/SWD | 38.4 | 18.3 | 39.6 | 18.8 | 45.2 | 21.1 |
| All students | 36.2 | 19.0 | 38.2 | 18.9 | 44.0 | 21.2 |
Note. ELL = English
language
learners. SWD = students with disabilities.
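The performance gaps discussed in the text can be recomputed directly from the means in Table 1. The sketch below is only a convenience for checking the arithmetic; the dictionary layout and names are my own, and the gap is defined here as the non-ELL/SWD mean minus the ELL-only mean.

```python
# Mean scores from Table 1, keyed by grade and then by group.
means = {
    "Grade 10": {
        "ELL only": {"Reading": 24.0, "Science": 32.9, "Mathematics": 36.8},
        "Non-ELL/SWD": {"Reading": 38.0, "Science": 42.6, "Mathematics": 39.6},
    },
    "Grade 11": {
        "ELL only": {"Reading": 22.5, "Science": 28.4, "Mathematics": 45.5},
        "Non-ELL/SWD": {"Reading": 38.4, "Science": 39.6, "Mathematics": 45.2},
    },
}

# Gap = non-ELL/SWD mean minus ELL-only mean, rounded to one decimal place.
gaps = {
    grade: {
        subject: round(groups["Non-ELL/SWD"][subject] - groups["ELL only"][subject], 1)
        for subject in groups["ELL only"]
    }
    for grade, groups in means.items()
}

for grade, by_subject in gaps.items():
    print(grade, by_subject)
```

Computed this way, the gap narrows from reading to science to mathematics in both grades, which is the trend the table is meant to illustrate.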
In summary, for
some content areas such as reading, the target construct is language;
therefore, the concept of linguistic complexity may not apply since
simplifying
the language of test items may change the construct being measured.
However, in
other content areas such as mathematics, science and social sciences,
unnecessary language demand may introduce a bias into the assessment
that could
jeopardize the validity of assessment.
How Do Students React to Simple-Structure Test
Items?
In
a study using released mathematics items for the National Assessment of
Educational Progress (NAEP) assessment, Abedi, Lord & Plummer
(1997)
modified the NAEP test items to reduce the level of unnecessary
linguistic
complexity. Two mathematics content experts independently compared the
original
versus revised items to ensure that the language related to the
mathematics
content was not changed. Minor issues were identified by the experts
and were
addressed. Two test booklets were then prepared, one containing the
original
mathematics items and the other containing the revised items. In an
interview
study, a group of 38 grade 8 students including both ELLs and non-ELLs
were
asked to compare the two sets of items and share their views with the
interviewer on which set of items they would prefer to use. Over 80
percent of
the respondents (both ELLs and non-ELLs) indicated that they would
prefer the
revised version of the items and expressed that they understood those
items
more clearly. Below are some comments from these students:
- “Well, it
makes more sense.”
- “It explains
it better.”
- “Because that
one’s more confusing.”
- “It seems
simpler. You get a clear idea of what they want you to do.”
- “It’s easier
to read, and it gets to the point, so you won’t have to waste time.”
- “I might have
a faster time completing that one ’cause there’s less reading.”
- “Less reading;
then I might be able to get to the other one in time to finish both of
them.”
- “’Cause it’s,
like, a little bit less writing.”
- “This one uses
words like ‘sector’ and ‘approximation,’ and this one uses words that I
can relate to.”
- “It doesn’t
sound as technical.”
- “I can’t read
that word.”
- “Because it’s
shorter and doesn’t have, like, complicated words.”
Once again, it is
quite clear from the results of the analyses of data and from the input
and
feedback that language modification of test items helps create a better
and
more valid assessment tool not only for ELLs but also for their low
performing
native English speaking counterparts. Linguistic modification of
assessment may
be used as a form of accommodation for ELLs. While such accommodation
may also
help low-performing native speakers of English, it does not alter the
construct
being measured (Abedi, Hofstetter & Lord, 2004).
Consequences of Assessment Issues for ELL
Students, Their Teachers and
Their Families
Technical issues
in the assessment of ELL students may not only have a big effect on the
performance of these students but also have major consequences for
their
teachers and families. ELLs are faced with dual challenges. They need
to learn
the necessary content knowledge, but they also need to become
proficient enough
in English to be able to learn that content. Similarly, teachers of
these
students have to deal with such challenges. In addition to teaching the
content, they must be aware of their students’ language limitations and
try to
help them with these language limitations without sacrificing
instructional
programs for their non-ELL students.
ELLs’ families are
also faced with the challenge of helping their children with complex
academic
programs. Regardless of how smart ELLs may be, without fully
understanding the
language of instruction, they may not be able to learn from the
teacher’s
instruction and perform well on required assignments and assessments.
While I have discussed the serious
impact that unnecessary linguistic complexity may have on the
assessment
outcome for ELLs, there are many other factors that could influence
validity of
assessments for these students. It is therefore imperative to examine
all
issues and factors that influence assessment outcome for these students.
Conclusions
ELLs
are faced with a challenging task in their academic lives. They have to
work
much harder than their native English-speaking peers to learn the
content
knowledge in a language with which they may not feel quite comfortable.
In
addition, they have to respond to test items that require a
substantially
proficient command of English, as observable in non-modified test
prompts. Due
to ELLs’ possible language limitations, they have less opportunity to
learn
than their native English-speaking peers, and they have difficulty
understanding the wording of test items—thus impeding an accurate
reflection of
their content knowledge.
To help ELLs
overcome problems they are facing in their instruction and assessment,
one must
first clearly understand the factors affecting their academic
performance.
While ELLs come from different family and cultural backgrounds, they
all need
assistance with language. Schools and districts around the nation
provide
accommodations for these students to help them to understand
instructional
materials and participate in content-based assessment. Unfortunately,
many of
these accommodations are adopted from those used for students with
disabilities
and, therefore, are not relevant for these students (see, for example,
Abedi,
Hofstetter & Lord, 2004). Studies have suggested that helping these
students with their language needs provides the most appropriate
assistance to
these students.
Among different
forms of accommodations and assistance provided to these students,
language
modification of assessments and instructional materials has been shown to be
promising. As indicated in this paper, my colleagues and I have been
able to
modify test items linguistically while minimally changing the construct
being
measured, and students have found these test items easy to understand
and
respond to without sacrificing our understanding of their proficiency
in
subject areas, particularly mathematics.
This paper may
help educators, students, and their families to understand some of the
major
challenges these students are faced with and provides research-based
recommendations to resolve these issues. The results of data analyses
and
research summaries presented in this paper show how language factors
affect the
performance of ELL students. It was demonstrated that the higher the
level of
language demand, the larger the performance-gap between ELLs and
non-ELLs. The
paper also highlighted an important form of assistance we may provide
these
students so that they may overcome linguistic obstacles facing them in
school.
I urge teachers,
other school officials, and families to gain a better understanding of
the
nature of problems that students are struggling with and provide
appropriate
solutions. Our delving into specific linguistic modification hopefully
contributes to these solutions.
References
Abedi, J. (2006). Language issues in
item development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook
of test development. New Jersey: Lawrence Erlbaum
Associates, Publishers.
Abedi,
J., Koency, G., Courtney, M., Leon, S. & Tiwana, R. (2006). The modification of math and reading test
language for English language learners. Los Angeles: CRESST/
University of
California, Los Angeles and the Los Angeles County Office of Education.
Abedi, J., Leon, S., & Mirocha, J.
(2003). Impact of students’ language background on
content-based data: Analyses
of extant data (CSE Tech.
Rep. No. 603). Los
Angeles: University of California: Center for the Study of
Evaluation/National
Center for Research on Evaluation, Standards, and Student Testing.
Abedi,
J., Courtney, M., & Leon, S. (2003). Effectiveness
and validity of accommodations for English language learners in
large-scale
assessments (CSE Tech. Rep. No. 608). Los Angeles: University of
California, National Center for Research on Evaluation, Standards, and
Student
Testing.
Abedi,
J. & Lord, C. (2001). The language factor in mathematics tests. Applied Measurement in Education, 14(3),
219-234.
Abedi,
J., Lord, C., & Hofstetter, C. (1998). Impact
of selected background variables on students’ NAEP math performance
(CSE
Tech. Rep. No. 478). Los Angeles: University of California, National
Center for
Research on Evaluation, Standards, and Student Testing.
Abedi, J., Lord, C., Hofstetter, C., &
Baker, E. (2000). Impact of accommodation strategies on English
language
learners’ test performance. Educational
Measurement: Issues and Practice, 19 (3),
16-26.
Abedi,
J., Lord, C., & Plummer, J. (1997). Language
background as a variable in NAEP mathematics performance (CSE
Tech. Rep.
No. 429). Los Angeles: University of California, National Center for
Research
on Evaluation, Standards, and Student Testing.
Bielenberg,
B., & Wong Fillmore, L. (2004). ELLs and high stakes testing:
Enabling
students to make the grade. Educational
Leadership, 62 (4), 45-49.
Celedon-Pattichis,
S. (2003). Constructing meaning: Think-aloud protocols of ELLs on
English and
Spanish word problems. Education for
Urban Minorities, 2 (2), 74-90.
Hakuta,
K., & Beatty, A. (Eds.). (2000). Testing
English-language learners in U.S. schools. Washington, DC: National
Academy
Press.
Kindler,
A. L. (2002). Survey of the states’
limited English proficient students & available educational
programs and
services, 2000-2001 Summary Report. Washington, DC: National
Clearinghouse
for English Language Acquisition and Language Instruction Educational
Programs.
Kiplinger, V. L., Haug, C. A., & Abedi, J. (2000,
April). Measuring math – not reading – on a math
assessment: A
language accommodations study of English language
learners and
other special populations. Presented at
the annual meeting of the
American
Educational Research Association, New Orleans, LA.
Maihoff,
N. A. (2002, June). Using Delaware data in making
decisions regarding the education of LEP students. Paper presented
at the
Council of Chief State School Officers 32nd Annual National Conference
on
Large-Scale Assessment, Palm
Desert, CA.
Rivera,
C., & Stansfield, C. W. (2001, April). The
effects of linguistic simplification of science test items on
performance of
limited English proficient and monolingual English-speaking students.
Paper
presented at the annual meeting of the American Educational Research
Association, Seattle,
WA.
Author Bio
Jamal Abedi
is a Professor at the School of Education of
University of California, Davis and a research partner at the National
Center
for Research on Evaluation, Standards, and Student Testing (CRESST).
His
research interests include studies in the area of psychometrics and
test and
scale development focusing on the validity of assessment and
accommodation for
English language learners (ELL) and research on the opportunity to
learn for
ELLs. Dr.
Abedi has developed a culture-free instrument for measuring creativity,
which
has been translated into a number of languages and administered in
several
countries.