RemembeR

" hidup sekali, hiduplah yang berarti"

About Me

My photo
Hello, new scholars, and welcome to my blog. I come from the city of Reog (Ponorogo). This blog is all about knowledge, and I hope it will be useful for all of you.

Thursday, 7 January 2010

listening

1.3 Which skills can be assessed?

Skill: Listening
Assessment by computer: Computer can assess a limited range of different types of responses to test comprehension.
Assessment of electronic output by human being: Listening tests can be presented on a computer, students' answers can be stored electronically and assessed by a teacher. Self-assessment and peer assessment are also possible.

Skill: Speaking
Assessment by computer: Very limited as yet. Automatic Speech Recognition (ASR) software is developing rapidly but it is still too unreliable to be used in accurate testing.
Assessment of electronic output by human being: Students can record their own voices on a computer for assessment by a teacher. Self-assessment and peer assessment are also possible.

Skill: Reading
Assessment by computer: Computer can assess a limited range of different types of responses to test comprehension.
Assessment of electronic output by human being: Reading tests can be presented on a computer, students' answers can be stored electronically and assessed by a teacher. Self-assessment and peer assessment are also possible.

Skill: Writing
Assessment by computer: Very limited as yet, but spellchecking, grammar checking and style checking are possible, and some progress is being made in the development of programs that can assess continuous text.
Assessment of electronic output by human being: Students' answers can be stored electronically and assessed by a teacher. Self-assessment and peer assessment are also possible.


Listening

At a basic level it is simple to assess listening comprehension in much the same way as it is possible to assess reading comprehension, e.g. with multiple-choice, drag-and-drop and fill-in-the-blank tests. If well designed, this form of assessment works effectively and instant feedback can be offered to the student, which has a beneficial effect on learning. The main ways of assessing listening skills can be summarised as follows:
Multiple-choice, drag-and-drop and fill-in-the-blank tests can use single words or very short sentences, but these types of tests cannot easily assess more open-ended aspects such as the ability to infer, and in multiple-choice tests students can get the answers right by guesswork (see the short sketch after this list).
Completely open-ended answers cannot be assessed. Single-word answers or answers consisting of very short sentences can be assessed to a limited extent.
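To make this concrete, here is a minimal sketch of how an automatically scored quiz with instant feedback might work. The items, answers and scoring rule are invented for illustration (loosely based on the 'Golden Boys' text further down this page) and are not taken from any particular testing package.

# Minimal sketch of an auto-scored quiz with instant feedback (illustrative only).
QUIZ = [
    {"prompt": "Where did the family go every August? (a) to the mountains (b) to the same beach (c) to a big city",
     "answer": "b"},
    {"prompt": "Mr Morelli said 'Good ______!' to the parents.",
     "answer": "morning"},
]

def run_quiz(responses):
    """Score each response against the stored answer and print instant feedback."""
    correct = 0
    for item, response in zip(QUIZ, responses):
        if response.strip().lower() == item["answer"]:
            correct += 1
            print("Correct:", item["prompt"])
        else:
            print("Wrong:", item["prompt"], "- expected:", item["answer"])
    return 100.0 * correct / len(QUIZ)

if __name__ == "__main__":
    # In a real quiz the responses would come from the learner; here they are hard-coded.
    print("Score:", run_quiz(["b", "Morning"]), "%")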

Despite these limitations, assessment of listening comprehension by computer can be of great value to students, offering a form of comprehensible input (Krashen 1985). Moreover, computer-based listening comprehension can combine sound with text, still images, video, animation and on-screen interactivity, thereby creating a much richer environment than is otherwise possible: see Module 2.2, Introduction to multimedia CALL, and Module 3.2, CALL software design and implementation. A measure of student control, allowing ease of navigation and options to retry, move to a different section, or attempt different tasks or roles, is vital to ensure active participation. Good equipment is also vital: headphones, fast network/Internet access and/or networked CD-ROMs.

Speaking

Limited assessment of speaking skills is possible. Self-assessment and peer assessment can be managed if facilities are available (e.g. a microphone and headphones) to allow students to record themselves and listen to the playback. A number of multimedia CD-ROMs have this feature: see Module 2.2. The Encounters series of CD-ROMs, for example, allows students to take part in a role-play by recording their own voices - and re-recording them until they are satisfied with the results - and then saving the whole role-play on disk, with their own voices slotted into the appropriate positions in the role-play, for assessment by a teacher: http://www.camsoftpartners.co.uk/encounters.htm. Assessing speaking skills solely by computer, using Automatic Speech Recognition (ASR), is a very complex task, and research in this area is developing rapidly. ASR can be motivating for students working independently, but computers are still not completely reliable as assessors. For further information on ASR see:
Section 3.4.7, Module 2.2, headed CD-ROMs incorporating Automatic Speech Recognition (ASR)
Section 4, Module 3.5, headed Speech technologies

See the software for automated testing of spoken English produced by Versant.
http://www.ict4lt.org/en/en_mod4-1.htm
Golden Boys Reported Speech
1. Class : X/2
2. Competency Standard : Responding to the meaning and rhetorical structure of written essay texts accurately, fluently and acceptably in the context of daily life and in order to access knowledge, in texts of the narrative type.
3. Indicator : Identifying the meaning of words in the text entitled 'Golden Boys'.
Identifying the meaning of sentences in the text entitled 'Golden Boys'.
Identifying the reported speech sentences in the text 'Golden Boys'.
4. Time : 45 minutes
5. Focus Skill : Reading
6. Remarks : If an LCD projector and an internet connection are available, the teacher can use them to display and highlight the sentence forms used in the text under discussion (the text can be downloaded in advance).
Guiding questions that lead students towards an understanding of 'Reported Speech' can be shown on the screen.
7. Background Knowledge : simple present tense, simple past tense

Teaching Steps :
High Tech Option :
Students observe the grammar used in the text, namely 'Reported Speech', in the text available at the following link:

http://www.britishcouncil.org/learningenglish-central-stories-golden-boys.htm

The teacher gives the instruction: "Read the text and pay attention to the form of the sentences in bold!"
Students answer the teacher's questions about the reported speech sentence forms in the text entitled 'Golden Boys'.

Guiding questions asked by the teacher:
What did Mr Morelli say to the writer's parents?
When Richard and Philip were going to cook, what did they ask everyone?
What did the writer's parents say to him when they compared him with Philip and Richard?
What kind of sentences are they?

Teacher’s answers:
Mr. Morelli said "Good morning!" to the writer's parents.
They laughed and said to everyone, "Forget your cheese sandwiches! Come and have some hamburgers or barbecue chicken with us! We're going to cook!"
“Why can’t you be more like Richard and Philip?” they said to me. “Look at them! They make friends with everyone! They are polite, good boys! You just sit here reading books and doing nothing!”
Reported Speech
Students listen to an explanation of 'Reported Speech' taken from the following links (downloaded in advance from):
http://www.englisch-hilfen.de/en/grammar/reported.htm
http://www.englisch-hilfen.de/en/grammar/reported_aufforderung.htm
http://www.englisch-hilfen.de/en/grammar/reported_frage.htm
Students do exercises based on the explanation above. The exercises have been downloaded from:
http://www.englisch-hilfen.de/en/exercises/reported_speech/commands.htm
http://www.englisch-hilfen.de/en/exercises/reported_speech/questions.htm
http://www.englisch-hilfen.de/en/exercises/reported_speech/questions2.htm

(The answer key can be found in the Appendix.)



APPENDIX
READING TEXT

Pay attention to the sentences in bold!

THE GOLDEN BOYS

Every August. Every August for twelve years. Every August for twelve years we went to the same small town on holiday. Every August for twelve years we went to the same beach. Every August for twelve years my parents rented the same small house in the same small town near the same beach, so every morning of every August for twelve years I woke up and walked down to the same beach and sat under the same umbrella or on the same towel in front of the same sea.

There was a small café on the beach where we sat every day, and every day Mr. Morelli in the café said “Good morning!” to my parents, and then always patted me on the head like a dog. Every day we walked down to our red and white umbrella, every day my father sat on his deckchair and read the newspaper then went to sleep, every day my mother went for a swim in the sea and then went to sleep. Every lunch time we ate the same cheese sandwiches which my mother made, and then every afternoon we went up to the café and ate an ice cream while my parents talked to Mr Morelli about the weather. Every summer for twelve years I sat there and read books and sometimes played volleyball with some of the other boys and girls who were there, but I never made any friends.

It was so boring.

Every August for twelve years the same family sat next to us. They were called the Hamiltons. We had a red and white umbrella, they had a green one. Every morning my parents said “Good morning!” to Mr and Mrs Hamilton, and Mr and Mrs Hamilton said “Good morning!” to my parents. Sometimes they talked about the weather.

Mr and Mrs Hamilton had two sons. Richard was the same age as me, and his brother Philip was two years older than me. Richard and Philip were both taller than me. Richard and Philip were very friendly, and both very handsome. They were much friendlier and more handsome than me. They made friends with everyone, and organised the games of volleyball on the beach or swimming races in the sea with the other children. They always won the games of volleyball and the swimming races. My parents liked Richard and Philip a lot. “Why can’t you be more like Richard and Philip?” they said to me. “Look at them! They make friends with everyone! They are polite, good boys! You just sit here reading books and doing nothing!”

I, of course, hated them.

Richard and Philip, Richard and Philip, Richard and Philip – it was all I ever heard from my parents every August for twelve years. Richard and Philip were perfect. Everything about them was better than anything about me. Even their green beach umbrella was better than our red and white one.

I was sixteen years old the last summer we went there. Perfect Richard and perfect Philip came to the beach one day and said that they were going to have a barbecue at lunch time. They were going to cook for everyone! “Forget your cheese sandwiches”, they laughed, “Come and have some hamburgers or barbecue chicken with us! We’re going to cook!”

My parents, of course, thought this was wonderful. “Look at how good Richard and Philip are! They’re going to do a barbecue and they’ve invited everybody! You couldn’t organise a barbecue!”

Every summer for twelve years, on the other side of my family, sat Mrs Moffat. Mrs Moffat was a very large woman who came to the same beach every summer for twelve years on her own. Nobody knew if she had a husband or a family, but my parents said that she was very rich. Mrs Moffat always came to the beach wearing a large hat, a pair of sunglasses and a gold necklace. She always carried a big bag with her. She never went swimming, but sat under her umbrella reading magazines until lunchtime when she went home.

Richard and Philip, of course, also invited Mrs Moffat to their barbecue.

Richard and Philip’s barbecue was, of course, a great success. About twenty people came and Richard and Philip cooked lots of hamburgers and chicken and made a big salad and brought big pieces of watermelon and everyone laughed and joked and told Mr and Mrs Hamilton how wonderful their sons were. I ate one hamburger and didn’t talk to anybody. After a while, I left, and made sure that nobody saw me leave.

Mrs. Moffat ate three plates of chicken and two hamburgers. After that she said she was very tired and was going to go and have a sleep. She walked over to her umbrella and sat down on her deckchair and went to sleep. When she woke up later, everybody on the beach was surprised to hear her screaming and shouting.

“My bag!!!! My bag!!!” she shouted. “It’s gone!!! It’s GONE!!!” Everybody on the beach ran over to Mrs. Moffat to see what the problem was. “Someone has taken my bag!!!” she screamed, “Someone has stolen my bag!!!”

“Impossible!” said everybody else. “This is a very safe, friendly beach! There are no thieves here!” But it was true. Mrs. Moffat’s big bag wasn’t there anymore.

Nobody had seen any strangers on the beach during the barbecue, so they thought that Mrs Moffat had perhaps taken her bag somewhere and forgotten it. Mr Morelli from the café organised a search of the beach. Everybody looked everywhere for Mrs Moffat’s big bag.

Eventually, they found it. My father saw it hidden in the sand under a deckchair. A green deckchair. Richard and Philip’s deckchair. My father took it and gave it back to Mrs. Moffat. Everybody looked at Richard and Philip. Richard and Philip, the golden boys, stood there looking surprised. Of course, they didn’t know what to say.

Mrs. Moffat looked in her bag. She started screaming again. Her purse with her money in it wasn’t in the big bag. “My purse!” she shouted, “My purse has gone! Those boys have stolen it! They organized a barbecue so they could steal my purse!”

Everybody tried to explain to Mrs. Moffat that this couldn’t possibly be true, but Mrs. Moffat called the police. The police arrived and asked golden Richard and golden Philip lots of questions. Richard and Philip couldn’t answer the questions. Eventually, they all got into a police car and drove away to the police station.

I sat there, pretending to read my book and trying to hide a big, fat purse under the sand on the beach.

That was the last summer we went to the beach. My parents never talked about Richard and Philip again. (THE END)



Answer key for exercise (a):

http://www.englisch-hilfen.de/en/exercises/reported_speech/commands.htm
to clean the blue bike
to write a text message
to help Peter’s sister
to wash my hands
to open the window



Answer key for exercise (b):

http://www.englisch-hilfen.de/en/exercises/reported_speech/questions.htm
if I wanted to dance
when I came
if John had arrived
where Maria parked her car
if I watched the latest film



Answer key for exercise (c):

http://www.englisch-hilfen.de/en/exercises/reported_speech/questions2.htm
if the boys were reading the book
who had given me the laptop
if Tim was leaving on Friday
if it would rain today
where Jennifer played football



Produced by: Kidam and Wilin

Edited by: Rum Hera Ria, Gumawang Jati, Itje Chodidjah

Have Something Done Game
1. Class : XI/1
2. Competency Standard : Expressing meaning in short functional texts (e.g. banners, posters, pamphlets, etc.), both formal and informal, using written language accurately, fluently and acceptably in context.
3. Indicator : Students can arrange adjectives into correct adjective phrases through a game
4. Time : 20 minutes
5. Focus Skill : Writing
6. Remarks : The teacher and students use a computer lab and an internet connection for the learning activities
7. Background Knowledge : Causative verbs

Teaching Steps :
HIGH TECH OPTION :

Students work individually.

Go to: http://www.britishcouncil.org/learnenglish-central-grammar-grammar-games-archive.htm

Click the letter "H".

Click "Have something done".

Choose the words in the box that fit each of the sentences below it.

Drag those words onto the matching sentences.

When you have finished, click "Submit".

Check the answers by clicking "your answers".



Produced by: KIDAM

Edited by: Rum Hera Ria
“Bringing Avril Lavigne into Classroom”

Written by: Herri Mulyono

Abstract

Research on the humanistic approach has given us a clear understanding that human psychological factors must also be brought into classroom teaching and learning in order to sustain improvement in language learning; if they are not, the teaching and learning process is unlikely ever to succeed. In this view, teaching and learning treat the learner as a "whole person": the teacher deals not only with the cognitive side of his or her students, but also with emotion, or what I would call "affective interference." I agree with Harmer's argument that "the decision to use humanistic-style activities will depend on how comfortable teachers and students are working with real lives and feelings" (Harmer, 2007: 91).

This article briefly describes a teaching and learning process carried out under the principles of the humanistic approach, particularly Suggestopedia. I take as a sample my teaching and learning activities with a grade XI science class at a public senior high school (SMAN 92 Jakarta) with a total of 36 students. Half of the students were already very familiar with me, having joined my previous grade X class, while the others were new to me and seemed to need time to adapt. The classroom used for the lessons had a common setting: the whiteboard was placed on the front wall of the class, near the teacher's desk. The students' desks were arranged in eight rows with six students in each, and the students sat in pairs. The teaching and learning process was carried out in the classroom context of SMAN 92 Jakarta without making any changes.

In addition, a justification of the practice described above is offered in order to show the correspondence between the actual practice of Suggestopedia and its fundamental principles. Each step of my teaching and learning practice is examined to see whether it is tailored to the teaching and learning principles of Suggestopedia. In the final section, a conclusion is drawn about the general feasibility of practising Suggestopedia in the Indonesian classroom context.

To read the full article, please visit my website: myenglish01.wordpress.com/2008/11/07/%E2...-classroom%E2%80%9D/
http://h2te.depdiknas.go.id/index.php/2-sma/113-have-something-done-game.html
Annual Review of Applied Linguistics (2005) 25, 228–242. Printed in the USA.
Copyright © 2005 Cambridge University Press 0267-1905/05 $12.00
12. TRENDS IN COMPUTER-BASED SECOND LANGUAGE ASSESSMENT
Joan Jamieson
In the last 20 years, several authors have described the possible changes that
computers may effect in language testing. Since ARAL’s last review of general
language testing trends (Clapham, 2000), authors in the Cambridge Language
Assessment Series have offered various visions of how computer technology could
alter the testing of second language skills. This chapter reflects these perspectives as
it charts the paths recently taken in the field. Initial steps were made in the
conversion of existing item types and constructs already known from paper-and-pencil
testing into formats suitable for computer delivery. This conversion was
closely followed by the introduction of computer-adaptive tests, which aim to make
more, and perhaps better, use of computer capabilities to tailor tests more closely to
individual abilities and interests. Movement toward greater use of computers in
assessment has been coupled with an assumption that computer-based tests should
be better than their traditional predecessors, and some related steps have been taken.
Corpus linguistics has provided tools to create more authentic assessments; the quest
for authenticity has also motivated inclusion of more complex tasks and constructs.
Both these innovations have begun to be incorporated into computer-based language
tests. Natural language processing has also provided some tools for computerized
scoring of essays, particularly relevant in large-scale language testing programs.
Although computer use has not revolutionized all aspects of language testing, recent
efforts have produced some of the research, technological advances, and improved
pedagogical understanding needed to support progress.
Since 2000, six new books describing the assessment of reading, listening,
language for special purposes, speaking, vocabulary, and writing have been
published in the Cambridge Language Assessment Series edited by J. Charles
Alderson and Lyle Bachman (Alderson, 2000; Buck, 2001; Douglas, 2000; Luoma,
2004; Read, 2000; Weigle, 2002). In the last chapter of every book (except for
Assessing Speaking), the authors addressed the potential and the challenges of using
the computer for assessment, following in the footsteps of Canale (1986) by alerting
us to limitations while charting directions for the future. Each author mentioned
some of the efficiency arguments that have, for many years now, been suggested for
computer-based assessment such as less time needed for testing, faster score
reporting, and provision of more convenient times for test-takers to take the test.
Authors also mentioned the arguments that have been put forward for computer-adaptive
assessment, such as more precise information about each individual’s level
of ability, which is thought to result in more challenging and more motivating
experiences for test takers than traditional pencil-and-paper tests (e.g., Chalhoub-
Deville & Deville, 1999; Madsen, 1986). The authors cautioned us that computer-based
assessment is expensive, and that there is a danger in letting advances in both
technology and in statistical procedures in psychometrics lead us down misguided
paths of development. Most importantly, though, these experts articulated their
visions of how computers may redirect second language assessment by providing us
means to enhance the authenticity of our tests and to clarify our understanding of
language abilities. In this chapter, I survey the field of computerized second
language assessment mainly in terms of English as a second language (ESL),
reflecting the directions of these commentators, looking at where we are, and, like
the others, trying to foresee what lies ahead. For current reviews of widely used ESL
tests, see Stoynoff and Chapelle (2005).
First Steps: Using Traditional Item Types and Constructs in Computerized
Formats
The ease with which multiple choice or fill-in-the-blank items testing
vocabulary, grammar, and reading comprehension can be adapted to a computerized
environment is evident in the number of language tests on the Web. Glenn Fulcher
maintains a very useful web site with links, as well as brief reviews, to many second
language tests. Dave’s ESL Cafe has a Quiz Center that provides students with a
number of short tests which are immediately scored; these quizzes illustrate typical
“low-tech” online assessments in which a hodgepodge of topics are grouped together
in one site where a learner can answer a few questions, click on a button, and receive
a percentage correct score. Somewhat more sophisticated examples offer quick
checks of a learner’s English level. There are many examples currently on the web;
for the purpose of illustration, only a few examples are included here.
ForumEducation.net offers learners two multiple choice vocabulary tests to estimate
word knowledge as an indication of English proficiency. Wordskills.com offers
visitors to its site three levels of tests (with 25 items each) which are said to
approximate (disclaimer provided) the Cambridge First Certificate, Certificate in
Advanced English, and the Certificate of Proficiency in English; the site has other
tests including assessment of negotiation skills and e-commerce. Also, Churchill
House offers online tests for students to check their level to prepare for the most
appropriate test in the University of Cambridge Local Examination Syndicate
(UCLES) suite of exams. All of the items are in multiple choice format.
Netlanguages.com offers learners a two-part test to determine their level and
appropriate placement in online courses. Learners initially select what they think
their level of English is. The first section then tests grammar by having the test taker
drag and drop a word into a sentence; if the test-taker’s score is low after the first 10
items, for example, a message indicates that another level may be more appropriate.
The second section consists of choosing a few questions from a larger set and then
writing two or three sentences in response to them and submitting the responses to
get scored by a human rater. Many of these tests are aligned with online learning
programs. At study.com learners are provided with EFI (English for Internet)
placements tests in listening, speaking, and writing in addition to vocabulary,
reading, and grammar in preparation for free, online instruction.
In 2001, Irene Thompson edited a special issue of Language Learning &
Technology, which was devoted to the topic of the theory and practice of computer-based
language testing (Thompson, 2001). Several of the columns and reviews
pointed out web resources (LeLoup & Ponterio, 2001) and tools for web test
development (Godwin-Jones, 2001; Polio, 2001; Winke & MacGregor, 2001). In his
article on web-based testing in the same issue, Roever (2001) stated that these types
of tests are most suitable for low-stakes decisions when the tests are used for checks
on learning, test preparation, or research in which learners have little or no reason to
cheat, and in situations in which test takers can take the test whenever they want, in
the privacy of their own homes, and at their own pace.
In some cases, rather than creating a new set of items, a paper-and-pencil
version of a test has been directly converted to a computerized format, which may be
on the web or not. As in the examples given previously, language ability is often
characterized in terms of performance on subtests such as vocabulary, grammar,
reading comprehension, and/or listening comprehension; in other words, constructs
that have been used in earlier tests are adapted to a new medium. In the case of the
Test of English Proficiency at Seoul National University, new technology was used
to measure English skills with accessible administration and fast score reporting
(Choi, Sung Kim, & Boo, 2003). In the case of the Test of English as a Foreign
Language (TOEFL) computer-based test, this was done for a research project to
investigate the potential threat to validity caused by lack of computer familiarity
(Taylor, Kirsch, Jamieson, & Eignor, 1999; also see ETS, 2000). Differences that
may affect performance because of the ways computer-based tests and paper-and-pencil
tests present tasks to test-takers are discussed briefly by both Alderson (2000)
and Douglas (2000). In her comparison of paper-and-pencil and computerized
reading tests, Sawaki (2001) provides us with a thorough review of many factors that
need to be considered when the mode of delivery is changed.
Second Steps: “Added Value” to Computerized Assessments
Computer-Adaptive Tests
Because the cost of developing computer-based tests is much higher than
that of traditional paper-and-pencil tests, Buck (2001) wrote that “there need to be
significant advantages to make that worthwhile” (p. 255). In many language testing
programs, a “value-added” approach to computerization of the paper-and-pencil
format was the introduction of computer-adaptive sections. An added benefit of a
computer-adaptive test is that theoretically test-takers are given items that are well-suited
to their abilities, thus making the test both more interesting and a more
accurate measure.
As described in Wainer et al. (2000) and Eignor (1999), information such as
the content and difficulty of an item needs to be determined before it can be
administered in a computer-adaptive test. Item function is characterized by item
parameters—item difficulty, item discrimination, and guessing—and the
measurement model used to estimate these parameters is Item Response Theory
(IRT). Some computer-adaptive tests use only the first parameter, item difficulty;
other tests use all three parameters. To get stable item parameters, items are typically
pretested on what is considered a representative sample of test takers. Pretesting is a
necessary, but expensive, condition for a computer-adaptive test. For example, the
ACT ESL test used over 2,000 test-takers to estimate its item parameters (ACT, Inc.,
2000).
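As background (this formulation is standard IRT and is not spelled out in the article itself), the three-parameter logistic model expresses the probability that a test taker of ability $\theta$ answers item $i$ correctly in terms of the item's difficulty $b_i$, discrimination $a_i$ and guessing parameter $c_i$:

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}$$

Tests that use only the first parameter set $a_i$ to a common constant and $c_i$ to zero, so that item difficulty alone characterises each item.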
The basic idea of computer adaptive tests is that a test taker will respond to
an item with known item parameters. That response will be automatically scored by
the computer so that a type of profile of the test taker can be initiated. As the test
taker proceeds, his or her “profile” is continually updated, based on the correctness
of response to items that are being selected as representative of a certain range of
parameters. The necessity of automatic scoring for subsequent item selection was a
relatively easy condition to meet for language tests using the traditional constructs of
vocabulary, grammar, reading comprehension, and listening comprehension that
were originally presented in a paper-and-pencil format with multiple choice items.
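The select-score-update loop described above can be sketched in a few lines. The following is an illustrative toy only (a one-parameter item bank with made-up difficulties and a deliberately crude ability update), not the algorithm used by any of the operational tests named in this chapter:

import math
import random

# Invented item bank for a one-parameter (Rasch) model: each item has a known difficulty b.
ITEM_BANK = [{"id": i, "b": b} for i, b in enumerate([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])]

def p_correct(theta, b):
    """Probability of a correct response for ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, remaining):
    """Pick the unused item whose difficulty is closest to the current ability estimate."""
    return min(remaining, key=lambda item: abs(item["b"] - theta))

def update_theta(theta, item, correct, step=0.5):
    """Crude ability update: nudge the estimate up or down in proportion to how
    surprising the response was, given the current estimate."""
    return theta + step * ((1.0 if correct else 0.0) - p_correct(theta, item["b"]))

def run_adaptive_test(true_theta=0.7, n_items=5):
    theta = 0.0                      # start from an average ability estimate
    remaining = list(ITEM_BANK)
    for _ in range(n_items):
        item = next_item(theta, remaining)
        remaining.remove(item)
        correct = random.random() < p_correct(true_theta, item["b"])  # simulated test taker
        theta = update_theta(theta, item, correct)
        print("item", item["id"], "correct" if correct else "wrong", "-> estimate", round(theta, 2))
    return theta

if __name__ == "__main__":
    run_adaptive_test()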
There are currently a number of computer-adaptive second language tests
that reflect this trend. Examples range from the English Comprehension Level Test
administered at the Defense Language Institute, the ACT ESL Placement Test
(COMPASS/ESL), the Business Language Testing Service (BULATS), and the
Structure and Written Expression section and the Listening Section of the computer-based
TOEFL; the reading section was not computer-adaptive so that test takers
could skip items and return to them later (ETS, 2000). Readers interested in issues
related to computer-adaptive testing of reading are directed to the edited volume by
Chalhoub-Deville (1999).
Because computer-adaptive tests for the most part rely on item response
theory, a necessary assumption is that items must be conditionally independent.
When several items are based on the same passage, in a reading comprehension test,
for example, this situation can be handled by treating all of the items in a set as one
“large” item, which is polytomously scored. In other words, instead of having six
reading comprehension items each scored 0–1, there would just be one item, scored
0–6. Work reported in the mid-1990s found that polytomous IRT models could be
used to combine scores from traditional, dichotomously scored items
(correct/incorrect or 1/0) with these polytomously scored items (Tang, 1996; Tang &
Eignor, 1997). However, the assumption cannot be met when a reading passage is
not only used for reading comprehension items, but also as the content for a writing
item.
“Adaptive” Tests in Different Guises
The idea of adapting test delivery and content to more closely match the test-taker's
interests and/or ability level has been appealing to language test developers,
whose tests, for one reason or another, do not meet the stringent requirements of
traditional computer-adaptive tests. Consequently, there are a number of
computerized language tests currently available or in the research and development
stages that make adaptations from a format in which all test takers are given the same
items in the same order. Two techniques which allow for the tailoring of tests to the
individual are self-assessment and short screening tests.
There are cases in which computer-adaptive tests are not currently possible,
such as speaking tests in which oral discourse is scored by human raters. Here, the
condition of automatic scoring one item, or task, to aid in the selection of subsequent
items cannot be met. Still, test developers are interested in incorporating some
flexibility, or “adaptivity,” into the selection of tasks. The COPI (Computerized Oral
Proficiency Interview) is the latest development in assessing speaking from the
Center for Applied Linguistics, following the SOPI (Simulated Oral Proficiency
Interview—tape recorded) and the OPI (oral proficiency interview—humans). In the
original COPI project, existing material from SOPIs formed a test bank. Test takers
were asked to assess their speaking level, ranging from intermediate to superior.
Seven speaking tasks were then generated from the pool—four at the level of self-assessment
and three at the next level above that (Kenyon, Malabonga, & Carpenter,
2001). Contrasting test-takers’ attitudes between the SOPI and the COPI, Kenyon
and Malabonga (2001) reported that test-takers felt that the COPI was less difficult
than the SOPI and that “the adaptive nature of the COPI allowed the difficulty level
of the assessment task to be matched more appropriately to the proficiency level of
the examinee” (p. 60) although, as Norris (2001) pointed out, this claim needs further
investigation. The Center for Applied Linguistics web site indicates that efforts are
underway to create operational versions of the COPI in Arabic and Spanish which
should be available on CD-ROM in 2006.
In other cases, items can be automatically scored, but the prerequisite pilot
testing of items needed to estimate item parameters is not feasible. Still, test
developers are interested in “adapting” the selection of items for each test taker.
Two techniques have been used recently. One technique has test takers answer
survey questions to select content of interest. The other has test takers respond to
items in a screening test to estimate their ability level so that subsequent test sections
can be selected at an appropriate level of difficulty. Longman English Assessment
(Jamieson & Chapelle, 2002; Chapelle, Jamieson, & Hegelheimer, 2003) illustrates
both of these techniques. This low-stakes test was designed to provide English
language learners with an interesting experience that would give them feedback
about their level of proficiency and recommendations for improving their English.
This test begins with a fifteen minute “interest and ability finder.” First, test-takers
answer survey questions about why they want to learn English; responses are used to
select business or more general content. Second, all test-takers are then administered
a set of vocabulary and grammar questions; responses are used to recommend a level
(i.e., beginning, intermediate, advanced) for the subsequent sections of the test.
Because of the inability to pretest items in order to determine their difficulty
empirically, the test developers relied on theory to create items and sections at
different ability levels. For example, the vocabulary items were based on word
frequency counts, the grammar items on developmental sequences, the written
structures on textbook analysis, and the reading and listening comprehension items
on a set of processing conditions (Rupp, Garcia, & Jamieson, 2002). The text
passages were analyzed using corpus linguistic techniques together with the
judgments of experienced teachers.
A final example in this section is DIALANG, which incorporates both self-assessment
and a screening test to create an “adaptive” test for individual language
learners. Like Longman English Assessment, DIALANG is a low-stakes test with
which individual language learners can find out about their ability levels in
vocabulary, grammar, writing, reading, and listening. It is currently available in 14
languages: Danish, Dutch, English, Finnish, French, German, Greek, Icelandic, Irish,
Italian, Norwegian, Portuguese, Spanish and Swedish. The theoretical foundation for
DIALANG is the Common European Framework. Test takers first select the
language and the skill in which they want to be tested. Then, DIALANG begins with
a vocabulary test which is used for initial screening; this is followed by a self-assessment
section. These are followed by a test in whatever language skill was
chosen, which is modified for the test taker based on information from the screening
test and self-assessment. A pilot version can be downloaded from the DIALANG
website.
In this section, I have illustrated language tests that are different from
traditional tests mainly because they are administered on computer rather than with
paper-and-pencil or with an interviewer, and because many of them make use of
computer technology to branch test takers to different subsets of items or tasks.
These tests can be adaptive or linear, and they can be administered via the web, CD,
or network. Each method of delivery offers potential benefits and problems, as
summarized by Brown (2004). However, many of these tests do not provide us with
an alternate construct of language ability. So, although the tests are innovative in
terms of technology, they are not particularly innovative in their operationalization of
communicative language ability.
Next Steps: Innovation in How Language is Perceived
As Douglas wrote, “language testing that is driven by technology, rather
than technology being employed in the service of language testing, is likely to lead
us down a road best not traveled” (2000, p. 275). Moving beyond technical
innovation and into the realm of how we perceive and measure language ability
illustrates progress in our ability to make use of computers. In this section, I
highlight innovations involved with language tasks and score reporting.
Language tasks
Considering both the nature of the input and the nature of the expected
response as elements in a task, we can find examples of computerized tests that are
trying to more closely replicate the target language use domain (TLU; Bachman &
Palmer, 1996), addressing the test quality of authenticity and also enriching our
understanding of language use in different situations. Read (2000) discussed how a
broader social view of vocabulary in testing would take into account the concept of
register, addressing questions such as what sort of vocabulary particular tasks
require. Douglas (2000) pointed to the potential role of corpus linguistics
investigating TLU contexts to increase our understanding of how grammar,
vocabulary, and rhetorical structure are used in different contexts.
Informed by research in corpus linguistics, the development of the new
TOEFL provides us with an example of how ideas such as these have been put into
practice. Biber and his colleagues (2002) created a corpus of spoken and written
academic language that was subsequently analyzed for lexical and syntactic
characteristics. This corpus then served as a type of baseline for language used in
different registers and genres. Spoken texts, for example, included lectures with
different amounts of interaction, service encounters, and classroom management.
Listening passages were developed following real-life examples. Biber et al. (2004)
created tools which allowed the features of the created passages to be compared to
the features of the texts analyzed in the corpus. Instead of incorporating this
interactionist construct perspective, a trait perspective was used by Laufer and her
colleagues (2004a, 2004b) in their innovative vocabulary test of size and four
degrees of strength of knowledge; the test was piloted in a computer-adaptive format.
Another innovation of the new TOEFL is that it includes tasks that resemble
those performed by students in North American colleges and universities (Rosenfeld,
Leung, & Oltman, 2001). These “integrated skills” tasks require test takers to use the
information provided in the reading and listening passages in essay and/or spoken
responses. The same passages also serve as the input for reading and listening
comprehension questions. This use of more authentic tasks provides evidence for the
representation inference in the validity argument for the new TOEFL, and it has also
forced language testers to grapple with the traditional, yet conflicting, inferences of
performance on language tests between underlying ability and task completion
(Chapelle, Enright, & Jamieson, forthcoming). Moreover, this decision to include
integrated tasks resulted in the violation of the assumption of IRT that tasks on a
single scale be conditionally independent, and without IRT, the necessary
psychometric base for calibrating items for computer-adaptive delivery was lost. The
current need for human raters of speaking and writing tasks also precluded the use of
computer-adaptive delivery (Jamieson, Eignor, Grabe, & Kunnan, forthcoming). The
decision to develop computerized tasks that better represent authentic language use
rather than to be constrained in task development by relying on known technology
and psychometrics marks a new direction in large scale, high-stakes testing.
Another task factor that has been discussed in the literature is response type;
associated with this factor is whether or not the response can be scored by a
computer. Up to this point, most of the examples discussed so far use multiple
choice responses, which may appear as familiar types where test takers click on a
circle before one of four choices, or as more novel drag and drop item types. These
items share the essential quality of having one correct answer and being
dichotomously scored as either correct or incorrect. More innovative item types
require the test taker to identify several important points from a reading or listening
passage, or to select sentences from a pool to form a paragraph. These items are
often scored polytomously—on a scale from 0 to 3, for example. However, to date
the rationales for associating the partial scores for each item have more to do with
controlling for guessing (Enright, Bridgeman, Eignor, Lee, & Powers, forthcoming)
than they have to do with the construct definition, as Chapelle illustrated in an
example of decisions for partial correction in a c-test (2001). This is because the
computer programs used to analyze responses in most current operational language
tests are not based on analysis of linguistic output, but rather on matching the pattern
of the correct response to the response given by the test-taker.
A test which stands in sharp contrast to this situation is the Spoken English
Test, formerly known as PhonePass. SET-10 is administered over a telephone; it
uses a speech recognition computer program to analyze and to score the phonological
representation of responses to each item. “The system generates scores based on the
exact words used in the spoken responses, as well as the pace, fluency, and
pronunciation of the words. . . Base measures are derived from the linguistic units . . .
based on statistical models of native speakers" (see the Ordinate website listed at the end
of this chapter; Ordinate, 1999).
Score Reports
The desire to communicate meaningful interpretations of test scores is not
limited only to computer-based assessment. However, recent language assessments
delivered with computers have incorporated some innovations in the types of
information about test-takers given to test score users.
“The possibility of recording response latencies and time on test or task
opens up a whole new world of exploration of rates of reading or word recognition”
(Alderson, 2000, p. 351). Although I think that Alderson was addressing a
fundamental change in construct definition, the closest example of a large scale,
standardized test which makes use of test takers’ response times is ACT’s
COMPASS/ESL. ACT began development of this test in 1993, in consultation with
ESL professionals. Its purpose is mainly placement of students in community
colleges in the United States. Having chosen to deliver the test in an adaptive
format, “students can proceed at their own pace on the computer, yet have their
assessment time monitored automatically, allowing institutions to use the amount of
time the student spends taking the test as an additional factor in placement decisions”
(ACT, Inc., 2000, p. 132).
Each of the three tests in the ACT battery (grammar/usage, reading, and
listening) includes a five-level score scale. The levels are mapped to the tests’
specifications and proficiency descriptors are provided for each level. These
proficiency descriptors were created by not only examining what was measured
internally on the test, but by also examining external sources such as learning
outcomes tied to college-level ESL curricula and well known benchmarks such as the
ACTFL Guidelines. An example from the listening scale follows:
Proficiency Level 2 (67-81): Students at Level 2 typically
have the ability to understand brief questions and answers relating
to personal information, the immediate setting, or predictable
aspects of everyday need. They understand short conversations
supported by context but usually require careful or slowed speech,
repetitions, or rephrasing. Their comprehension of main ideas and
details is still incomplete. They can distinguish time forms, some
question forms (Wh-, yes/no, tag questions), most common word-order
patterns, and most simple contractions, but the students may
have difficulty with tense shifts and more complex sentence
structures. (ACT, 2000, p. 45)
Proficiency scaling approaches assume that the abilities used to explain test
scores are defined by task characteristics, which are hypothesized to be responsible
for task difficulty. IRT, used as the psychometric model in the COMPASS/ESL test,
allows items and person ability to be placed on the same scale. Similar proficiency
scales were considered for test score reporting in the new TOEFL (Jamieson, Jones,
Kirsch, Mosenthal, & Taylor, 2000). However, because IRT was not going to be
used, other types of performance descriptors, for example, teacher assessments of
what a person at a certain score level could probably be expected to do, were
considered to accompany score reports (Powers, Roever, Huff, & Trapani, 2003).
Another way to make score reports meaningful to the test taker is to provide
information on how to improve one’s language ability. This type of feedback is
provided by both DIALANG and Longman English Assessment. In the latter, the
computer program generates a list of study strategies based on test takers' responses
to where they are studying English (in an English-speaking country or not), together
with their level scores.
Future Steps: Computer Scoring of Constructed Responses, New Item Types
The authors in the Cambridge Language Assessment Series look forward to
corpus-based vocabulary lists (Read, 2000) and corpora of field specific discourse
(Douglas, 2001), realistic simulations, perhaps through virtual reality (Buck, 2001),
and the capture of every detail of a learner’s progress through a test (Alderson,
2000). All of these are possible, of course, yet it is the scoring of writing by
computer (Weigle, 2002) that I think is the most probable advancement in
technology that will occur in large scale language assessment in the near future.
Three computer scoring systems are clearly described by Weigle: Latent
Semantic Analysis, Project Essay Grade (Tru-Judge, Inc.), and E-rater (Educational
Testing Service, ETS). Another automatic scoring system is the IntelliMetric Essay
Scoring Engine (Vantage Learning). Intelligent Essay Assessor uses Latent
Semantic Analysis (formerly owned by Knowledge Assessment Technologies, its
acquisition by Pearson Education was announced on June 29, 2004). According to
Weigle (2002), Latent Semantic Analysis has the practical disadvantage of not taking
word order into account. As described by Chodorow and Burstein (2004, p. 1), each
system trains on essays that have been read by human raters who have assigned a
holistic score. From the training examples, the system learns which features are the
best predictors of the essay score. “The differences among the systems lie in the
particular features that they extract and in the ways in which they combine the
features in generating a score” (Chodorow & Burstein, 2004, p. 1). E-rater’s
particular features are based on four kinds of analysis: syntactic, discourse, topical,
and lexical. In version 2.0, because essay length was found to be “the single most
important objectively calculated variable in predicting human holistic scores," essay
length as measured by number of words was included in e-rater’s updated feature set
(Attali & Burstein, 2004, p. 4). Although Weigle stated that this trend is
controversial among writing teachers, these automated systems have none the less
reported high correlations with human raters (Burstein & Chodorow, 1999).
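To make the general recipe concrete (train on human-scored essays, extract features, learn how the features combine into a score), here is a deliberately simplified sketch. The surface features (word count, average word length, sentence count) and the tiny training set are invented for illustration and bear no relation to the actual feature sets of e-rater, IntelliMetric or Intelligent Essay Assessor:

import numpy as np

def features(essay):
    """Extract a few crude surface features from an essay (illustrative only)."""
    words = essay.split()
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    return [len(words),                                        # length in words
            sum(len(w) for w in words) / max(len(words), 1),   # average word length
            len(sentences)]                                    # number of sentences

# Invented training data: essays that human raters have already scored holistically (0-6 scale).
training_essays = [
    "Short essay.",
    "A somewhat longer essay with several more words in it. It has two sentences.",
    "A much longer and more developed essay. It contains several sentences. "
    "Each sentence adds some detail. The vocabulary is also slightly more varied.",
]
human_scores = np.array([1.0, 3.0, 5.0])

# Learn feature weights by least squares: how do the features combine to predict the human score?
X = np.array([features(e) + [1.0] for e in training_essays])   # extra column = intercept
weights, *_ = np.linalg.lstsq(X, human_scores, rcond=None)

def predict_score(essay):
    """Predict a holistic score for an unseen essay from its surface features."""
    return float(np.array(features(essay) + [1.0]) @ weights)

if __name__ == "__main__":
    print(round(predict_score("A new essay of moderate length. It makes two simple points."), 2))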
COMPASS/ESL offers a writing section called COMPASS e-Write that is
scored using IntelliMetric. An overall score ranging in value from 2–8 is given to
each essay, along with feedback on content, style, organization, focus, and
mechanics. This technology is available on the web; the instructional part of the
software is available at MyAccess! Although not clear from the COMPASS/ESL
website, MyAccess! is available for ESL students, and has been developed to score
essays written by nonnative speakers of English. E-rater has been used by ETS for
the Graduate Management Admissions Test Analytic Writing Assessment since
1999. Criterion Online Essay Evaluation is the instructional software using e-rater
from ETS (Burstein, Chodorow, & Leacock, 2003). ETS currently uses automated
scoring of essays, e-rater, in the Next Generation TOEFL Practice Test, as well as
ScoreItNow. Research is underway to evaluate the feasibility of using e-rater
operationally as second rater for new TOEFL independent writing samples. At
present there are no plans for automated scoring of speech for new TOEFL (personal
communication, M. K. Enright, October 27, 2004).
As research progresses, it will be interesting to watch for advances in
computer scoring of constructed, or free, responses. Another ETS program, C-rater,
has been developed to measure a student’s understanding of specific content without
regard for writing skills (Leacock, 2004). Also, the Spoken English Test has included
a free response section on its Set-10 Form for many years. Information from this
section has not been included in score reports, but rather has been included to collect
data for research on automated scoring of spoken free responses. It was announced
on September 29, 2004, that the company that developed PhonePass, Ordinate, was
acquired by the large testing company, Harcourt Assessment, Inc.
Finally, to return to the topic of task types on language assessments, nothing
has yet been mentioned about the use of video. Certainly this is one area in test
development that seems natural to include in the creation of authentic language input.
Multimedia computer-assisted language learning software such as Longman English
Interactive (Rost & Fuchs, 2003) includes video in its quizzes and tests, but most
stand-alone computer-based language tests do not. Why? It is expensive to develop,
it requires high-end technology to transmit, and it is unclear how it affects the
construct. Douglas (2000) commented that language testers have so far not made
much progress in adapting multimedia to the goals of measurement; he cautions us
that, although it is tempting to make use of video to provide authentic content and
contextualization cues, we need to understand and control for test method effects
associated with these technological resources. With this caution in mind, it is still
intriguing to imagine new types of tasks to broaden our perspective of what is
possible for language tests. (A good place to begin is at DIALANG’s web site; click
on “New item types.”)
Summary
As we read in Burstein, Frase, Ginther, and Grant (1996), the role of
computers in assessment is broad because computers have a number of functions
essential for an operational testing program such as the TOEFL, including item
creation and presentation; response collection and scoring; statistical analysis; and
storage, transmission, and retrieval of information. This review has not gone into
depth on many of these issues, but has rather highlighted a few recent trends mainly
surveying the literature in which authors predict that technology will play a role in
language assessment in the future. We can also see that the future envisioned by
many of these commentators is the distant future. While discussing the advancement
of applied linguistics through assessment, Chapelle (2004) referred to “the tunnel of
efficiency” and “the panorama of theory.” Although we may be mesmerized by
panoramic visions, a journey begins on a path where we place one foot in front of the
other, making what appears at first to be modest progress.
Web sites (Addresses functional as of December 8, 2004)
BULATS (http://www.bulats.org)
Churchill House (http://www.churchillhouse.com/english/exams.html)
COMPASS/ESL (http://www.act.org/compass/ or http://www.act.org/esl)
COMPASS e-Write (http://www.act.org/e-write/index.html)
COPI (http://www.cal.org/projects/copi/)
Dave Sperling’s ESL Cafe Quiz Center (http://www.pacificnet.net/~sperling/quiz/)
DIALANG (http://www.dialang.org/)
DIALANG's experimental items (http://www.lancs.ac.uk/fss/projects/linguistics/experimental/new/start.htm)
Educational Testing Service, Criterion, e-rater, and c-rater (http://www.ets.org/research/erater.html)
Educational Testing Service, Next Generation TOEFL (http://www.ets.org/toefl/nextgen/)
Elite Skills Ltd. (http://wordskills.com/level)
English Comprehension Level Test (http://www.dlielc.org/testing/ecl_test.html)
Forum Education (http://www.forumeducation.net/servlet/pages/vil/mat/index.htm)
Glenn Fulcher’s web site (http://www.dundee.ac.uk/languagestudies/ltest/ltr.html)
Intelligent Essay Assessor (http://www.knowledge-technologies.com/)
Language projects supported by the European Union (http://europa.eu.int/comm/education/programmes/socrates/lingua/index_en.html)
MyAccess! (http://www.vantagelearning.com/product_pages/myaccess.html)
Net Languages (http://www.netlanguages.com/home/courses/level_test.htm)
Ordinate, Spoken English Tests (http://www.ordinate.com/content/prod/prodlevel1.shtml)
StudyCom English For Internet (http://study.com/tests.html)
Vantage Laboratories (http://www.vantage.com/)
REFERENCES
ACT, Inc. (2000). COMPASS/ESL reference manual. Iowa City, Iowa: Author.
Alderson, J. C. (2000). Assessing reading. New York: Cambridge University Press.
Attali, Y., & Burstein, J. (2004). Automated essay scoring with E-rater V.2.0.
Available on-line at http://www.ets.org/research/erater.html
Bachman, L., & Palmer, A. (1996). Language testing in practice. New York: Oxford
University Press.
Biber, D., Conrad, S., Reppen, R., Byrd, P., & Helt, M. (2002). Speaking and writing
in the university: A multidimensional comparison. TESOL Quarterly, 36, 9–
48.
Biber, D., Conrad, S., Reppen, R., Byrd, P., Helt, M., Clark, V., Cortez, V., Csomay,
E., Urzua, A. (2004). Representing language use in the university: Analysis
of the TOEFL 2000 Spoken and Written Academic Language Corpus.
TOEFL Monograph MS-25. Princeton, NJ: Educational Testing Service.
Brown, J. D. (2004). For computerized language tests, potential benefits outweigh
problems. Essential Teacher, 1, 4, 37–40.
Buck, G. (2001). Assessing listening. New York: Cambridge University Press.
Burstein, J., & Chodorow, M. (1999, June). Automated essay scoring for nonnative
English speakers. In Proceedings of the ACL99 Workshop on Computer-
Mediated Language Assessment and Evaluation of Natural Language
Processing. College Park, MD. Available online at
http://www.ets.org/research/dload/acl99rev.pdf
Burstein, J., Chodorow, M., & Leacock, C. (2003). Criterion online essay evaluation:
An application for automated evaluation of student essays. Proceedings of
the Fifteenth Annual Conference on Innovative Applications of Artificial
Intelligence, Acapulco, Mexico. Available online at:
http://www.ets.org/research/dload/iaai03bursteinj.pdf
Burstein, J., Frase, L., Ginther, A., & Grant, L. (1996). Technologies for language
assessment. Annual Review of Applied Linguistics, 16, 240–260.
Canale, M. (1986). The promise and threat of computerized adaptive assessment of
reading comprehension. In C. Stansfield (Ed.), Technology and language
testing (pp. 30–45). Washington, DC: TESOL Publications.
Chalhoub-Deville, M. (Ed.) (1999). Issues in computer adaptive testing of reading
proficiency. Studies in Language Testing, 10. New York: Cambridge
University Press.
Chalhoub-Deville, M., & Deville, C. (1999). Computer adaptive testing in second
language contexts. Annual Review of Applied Linguistics, 19, 273–299.
Chapelle, C. A. (2004). English language learning and technology. Philadelphia:
John Benjamins.
Chapelle, C. A. (2001). Computer applications in second language acquisition. New
York: Cambridge University Press.
Chapelle, C. A., Enright, M. K., & Jamieson, J. (forthcoming). Challenges in
developing a test of academic English. In C. A. Chapelle, M. K. Enright, &
J. Jamieson (Eds.), Building a validity argument for TOEFL. Mahwah, NJ:
Erlbaum.
Chapelle, C. A., Jamieson, J., & Hegelheimer, V. (2003). Validation of a web-based
ESL test. Language Testing, 20, 409–439.
Chodorow, M., & Burstein, J. (2004). Beyond essay length: Evaluating e-rater’s
performance on TOEFL essays. TOEFL Research Reports, Report 73.
Princeton, NJ: Educational Testing Service.
Choi, I-C., Sung Kim, K., & Boo, J. (2003). Comparability of a paper-based
language test and a computer-based language test. Language Testing, 20,
295–320.
COMPUTER-BASED ASSESSMENT 241
Clapham, C. (2000). Assessment and testing. Annual Review of Applied Linguistics,
20, 147–161.
Douglas, D. (2000). Assessing languages for specific purposes. New York:
Cambridge University Press.
Educational Testing Service (ETS). (2000). The computer-based TOEFL score user
guide. Princeton, NJ: Author.
Eignor, D. (1999). Selected technical issues in the creation of computer-adaptive
tests of second language reading proficiency. In Issues in computer adaptive
testing of reading proficiency: Studies in Language Testing, 10 (pp. 167–
181). New York: Cambridge University Press.
Enright, M. K., Bridgeman, B., Eignor, D., Lee, & Powers, D. E. (forthcoming).
Designing measures of listening, reading, writing, and speaking. In C. A.
Chapelle, M. K. Enright, & J. Jamieson (Eds.), Building a validity argument
for TOEFL. Mahwah, NJ: Erlbaum.
Godwin-Jones, R. (2001). Emerging technologies. Language Learning &
Technology, 5(2), 8–12.
Jamieson, J., & Chapelle, C. A. (2002). Longman English Assessment. New York:
Pearson Longman.
Jamieson, J., Eignor, D., Grabe, W., & Kunnan, A. (forthcoming). The frameworks
for reconceptualization of TOEFL. In C. A. Chapelle, M. K. Enright, & J.
Jamieson (Eds.), Building a validity argument for TOEFL. Mahwah, NJ:
Erlbaum.
Jamieson, J., Jones, S., Kirsch, I., Mosenthal, P., & Taylor, C. (2000). TOEFL 2000
framework: A working paper. TOEFL Monograph Series 16. Princeton, NJ:
Educational Testing Service.
Kenyon, D., & Malabonga, V. (2001). Comparing examinee attitudes toward
computer-assisted and other oral proficiency assessments. Language
Learning & Technology, 5(2), 60–83.
Kenyon, D., Malabonga, V., & Carpenter, H. (2001). Response to the Norris
commentary. Language Learning & Technology, 5(2), 106–108.
Lanfer, B., Elder, C., Hill, K., & Congdon, P. (2004a). Size and strength: Do we need
both to measure vocabulary knowledge? Language Testing, 21, 202–226.
Lanfer, B., & Goldstein, Z. (2004b). Testing vocabulary knowledge: Size, strength,
and computer adaptiveness. Language Learning, 54, 399–436.
Leacock, C. (2004). Scoring free-responses automatically: A case study of a largescale
assessment. Examens, 1, 3. English version available online at
http://www.ets.org/research/erater.html
LeLoup, J., & Pontierio, R. (2001). On the net, language testing resources. Language
Learning & Technology, 5(2), 4–7.
Loumi, S. (2004). Assessing speaking. New York: Cambridge University Press.
Madsen, H. (1986). Evaluating a computer-adaptive ESL placement test. CALICO
Journal, 4(2), 41–50.
Norris, J. (2001). Concerns with computerized adaptive oral proficiency assessment.
Language Learning & Technology, 5(2), 99–105.
Ordinate Corp. (1999). PhonePass testing: Structure and content. Ordinate
Corporation Technical Report. Menlo Park, CA: Author.
242 JOAN JAMIESON
Polio, C. (2001). Review of Test Pilot. Language Learning & Technology, 5(2), 34–
37.
Powers, D. E., Roever, C., Huff, K. L., & Trapani, C. S. (2003). Validating
LanguEdge courseware scores against faculty ratings and student selfassessments.
(ETS Research Report 03–11). Princeton, NJ: Educational
Testing Service.
Read, J. (2000). Assessing vocabulary. New York: Cambridge University Press.
Roever, C. (2001). Web-based language testing. Language Learning & Technology,
5(2), 84–94.
Rosenfeld, M., Leung, S., & Oltman, P. (2001). The reading, writing, speaking, and
listening tasks important for academic success at the undergraduate and
graduate levels. (TOEFL Monograph Series, MS-21). Princeton, NJ:
Educational Testing Service.
Rost, M., & Fuchs, M. (2003). Longman English Interactive. New York: Pearson
Education.
Rupp, A., Garcia, P., & Jamieson, J. (2002). Combining multiple regression and
CART to understand difficulty in second language reading and listening
comprehension test items. International Journal of Language Testing, 1,
185–216.
Sawaki, Y. (2001). Comparability of conventional and computerized tests of reading
in a second language. Language Learning & Technology, 5(2), 38–59.
Stoynoff, S., & Chapelle, C. A. (2005). ESOL tests and testing: A resource for
teachers and administrators. Alexandria, VA: Teachers of English to
Speakers of Other Languages.
Tang, K. L. (1996). Polytomous item response theory models and their applications
in large-scale testing programs: Review of literature. (TOEFL Monograph
Series MS-2). Princeton, NJ: Educational Testing Service.
Tang, K. L., & Eignor, D. R. (1997). Concurrent calibration of dichotomously and
polytomously scored TOEFL items using IRT models. (TOEFL Technical
Report TR-13). Princeton, NJ: Educational Testing Service.
Taylor, C., Kirsch, I., Jamieson, J., & Eignor, D. (1999). Examining the relationship
between computer familiarity and performance on computer-based language
tasks. Language Learning, 49, 219–274.
Thompson, I. (2001). From the special issue editor (Introduction). Language
Learning & Technology, 5(2), 2–3.
Wainer, H., Dorans, N., Eignor, D., Flaugher, R., Green, B., Mislevy, R., Steinberg,
L., & Thissen, D. (2000). Computer adaptive testing: A primer (2nd ed.).
Mahwah, NJ: Erlbaum.
Weigle, S. C. (2002). Assessing writing. New York: Cambridge University Press.
Winke, P., & MacGregor, D. (2001). Review of Version 5, Hot Potatoes. Language
Learning & Technology, 5(2), 28–33.
http://journals.cambridge.org/download.php?file=%2FAPL%2FAPL25%2FS0267190505000127a.pdf&code=546f8cea4f09a0d85695a37a3d8fa71f
Several years ago a public service announcement ran on television about the importance of good listening skills and the difference between hearing and listening. Hearing is a physical ability, while listening is a skill. Listening skills allow one to make sense of and understand what another person is saying; in other words, listening skills allow you to understand what someone is "talking about".

In 1991 the United States Department of Labor Secretary's Commission on Achieving Necessary Skills (SCANS) identified five competencies and three foundation skills that are essential for those entering the workforce. Listening skills were among the foundation skills SCANS identified.
Why You Need Good Listening Skills

Good listening skills make workers more productive. The ability to listen carefully will allow you to:

better understand assignments and what is expected of you;

build rapport with co-workers, bosses, and clients;

show support;

work better in a team-based environment;

resolve problems with customers, co-workers, and bosses;

answer questions; and

find underlying meanings in what others say.
How to Listen Well
The following tips will help you listen well. Doing these things will also demonstrate to the speaker that you are paying attention. While you may in fact be able to listen while looking down at the floor, doing so may imply that you are not.

maintain eye contact;

don't interrupt the speaker;

sit still;

nod your head;

lean toward the speaker;

repeat instructions and ask appropriate questions when the speaker has finished.

A good listener knows that being attentive to what the speaker doesn't say is as important as being attentive to what he does say. Look for non-verbal cues such as facial expressions and posture to get the full gist of what the speaker is telling you.
Barriers to Listening
Beware of the following things that may get in the way of listening.

bias or prejudice;

language differences or accents;

noise;

worry, fear, or anger; and

lack of attention span.
Listening Starts Early
If you have children, you know what it's like to feel like you're talking to a wall. Kids have an uncanny ability to appear to be listening to you while actually paying no attention at all. While this is something that may pass with age, it is important to help children develop good listening skills early. They will do better in school and you will keep your sanity. As the SCANS report points out, good listening skills will prepare children to eventually succeed in the workforce.

When you tell your child to do something, ask him to repeat your instructions;

Teach your child to maintain eye contact when talking to or listening to someone;

Read out loud to your child and then engage her in a conversation about what you have read; and

Engage your child in age-appropriate activities that promote good listening skills.

LISTENING SKILLS
We were given two ears but only one mouth.

This is because God knew that listening was twice as hard as talking.

People need to practice and acquire skills to be good listeners, because a speaker cannot throw you information in the same manner that a dart player tosses a dart at a passive dartboard. Information is an intangible substance that must be sent by the speaker and received by an active listener.

THE FACE IT SOLUTION FOR EFFECTIVE LISTENING
Many people are familiar with the scene of the child standing in front of dad, just bursting to tell him what happened in school that day. Unfortunately, dad has the paper in front of his face and even when he drops the paper down half-way, it is visibly apparent that he is not really listening.

A student solved the problem of getting dad to listen from behind his protective paper wall. Her solution was to say, "Move your face, dad, when I'm talking to you." This simple solution will force even the poorest listener to adopt effective listening skills because it captures the essence of good listening.

GOOD LISTENERS LISTEN WITH THEIR FACES
The first skill that you can practice to be a good listener is to act like a good listener. We have spent a lot of our modern lives working at tuning out all of the information that is thrust at us. It therefore becomes important to change our physical body language from that of a deflector to that of a receiver, much like a satellite dish. Our faces contain most of the receptive equipment in our bodies, so it is only natural that we should tilt our faces towards the channel of information.

A second skill is to use the other bodily receptors besides your ears. You can be a better listener when you look at the other person. Your eyes pick up the non-verbal signals that all people send out when they are speaking. By looking at the speaker, your eyes will also complete the eye contact that speakers are trying to make. A speaker will work harder at sending out the information when they see a receptive audience in attendance. Your eyes help complete the communication circuit that must be established between speaker and listener.

When you have established eye and face contact with your speaker, you must then react to the speaker by sending out non-verbal signals. Your face must move and give the range of emotions that indicate whether you are following what the speaker has to say. By moving your face to the information, you can better concentrate on what the person is saying. Your face must become an active and contoured catcher of information.

It is extremely difficult to receive information when your mouth is moving information out at the same time. A good listener will stop talking and use receptive language instead. Use the "I see", "uh huh", and "oh really" words and phrases that follow and encourage your speaker's train of thought. This forces you to react to the ideas presented, rather than to the person. You can then move to asking questions, instead of giving your opinion on the information being presented. It is a true listening skill to use your mouth as a moving receptor of information rather than a broadcaster.

A final skill is to move your mind to concentrate on what the speaker is saying. You cannot fully hear their point of view or process information when you argue mentally or judge what they are saying before they have finished. An open mind is a mind that is receiving and listening to information.

If you really want to listen, you will act like a good listener. Good listeners are good catchers because they give their speakers a target and then move that target to capture the information that is being sent. When good listeners do not understand their speakers, they send signals to the speaker about what they expect next, or about how the speaker can change the speed of delivery to suit the listener. Above all, a good listener uses the whole face to be an active, moving listener.

THINGS TO REMEMBER
If you are really listening intently, you should feel tired after your speaker has finished. Effective listening is an active rather than a passive activity.
When you find yourself drifting away during a listening session, change your body position and concentrate on using one of the above skills. Once one of the skills is in use, the other active skills will fall into place as well.
Your body position defines whether you will have the chance of being a good listener or a good deflector. Good listeners are like poor boxers: they lead with their faces.
Meaning cannot just be transmitted as a tangible substance by the speaker. It must also be stimulated or aroused in the receiver. The receiver must therefore be an active participant for the cycle of communication to be complete.
This page is from the book CASAA Student Activity Sourcebook. You can purchase this book from our resource library. http://www.casaaleadership.ca/mainpages/resources/sourcebook/listening-skills.html
