Acknowledgements | |
| |
| |
Preface | |
| |
| |
| |
Introduction: the rubber ruler | |
| |
| |
| |
Why test language learning? | |
| |
| |
| |
What is a language test? | |
| |
| |
| |
What are the properties of measuring devices? | |
| |
| |
| |
The rubber ruler | |
| |
| |
| |
Tests, measurement and evaluation | |
| |
| |
| |
Evaluation without measurement | |
| |
| |
| |
Measurement without a test | |
| |
| |
| |
Tests | |
| |
| |
| |
Ethical uses of language tests | |
| |
| |
| |
Reliability | |
| |
| |
| |
Validity | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
| |
Measuring language ability and making decisions | |
| |
| |
| |
Measuring language ability | |
| |
| |
| |
Making decisions about learners, teachers, programmes, and policies | |
| |
| |
| |
Contexts of language use | |
| |
| |
| |
Contextual features | |
| |
| |
| |
Authenticity | |
| |
| |
| |
Making valid interpretations of test performance | |
| |
| |
| |
Consistency of measurement | |
| |
| |
| |
Validity: evidence for interpretations of test performance | |
| |
| |
| |
Conclusion: bias for best | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
| |
Test development | |
| |
| |
| |
What do I need to test? Needs analysis | |
| |
| |
| |
Define the purpose of the test | |
| |
| |
| |
Conduct a preliminary investigation | |
| |
| |
| |
Collect primary data | |
| |
| |
| |
Collect secondary data | |
| |
| |
| |
Analyse target language use task and language characteristics | |
| |
| |
| |
How am I going to test language abilities? Turning target language use tasks into test tasks | |
| |
| |
| |
Developing a test task | |
| |
| |
| |
Developing a blueprint for the test | |
| |
| |
| |
Options for test tasks | |
| |
| |
| |
How am I going to give the test? Test administration | |
| |
| |
| |
Test environment | |
| |
| |
| |
Personnel | |
| |
| |
| |
Procedures | |
| |
| |
| |
Scoring | |
| |
| |
| |
How can my computer assist me in test development? Computer-based tools | |
| |
| |
| |
Hot Potatoes | |
| |
| |
| |
Moodle | |
| |
| |
| |
WebCT | |
| |
| |
| |
Conclusion | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
| |
Alternatives in assessment | |
| |
| |
| |
Norm-referenced and criterion-referenced tests | |
| |
| |
| |
Communicative language tests | |
| |
| |
| |
'General' and 'specific purpose' language tests | |
| |
| |
| |
Discrete-point and integrative tests | |
| |
| |
| |
Formative and summative assessment | |
| |
| |
| |
Alternative approaches to assessment | |
| |
| |
| |
Conference assessments | |
| |
| |
| |
Portfolio assessment | |
| |
| |
| |
Self- and peer-assessments | |
| |
| |
| |
Task-based and performance assessment | |
| |
| |
| |
Dynamic assessment | |
| |
| |
| |
Summary | |
| |
| |
| |
Conclusion | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
| |
By the numbers: a statistics mini-course | |
| |
| |
| |
Introduction | |
| |
| |
| |
Normal distribution | |
| |
| |
| |
The average or mean | |
| |
| |
| |
Standard deviation | |
| |
| |
| |
Standard deviation as a unit of measurement | |
| |
| |
| |
Correlation | |
| |
| |
| |
Probability and statistical significance | |
| |
| |
| |
The t-test of the difference between two averages | |
| |
| |
| |
Analysis of variance | |
| |
| |
| |
Reliability | |
| |
| |
| |
Split-half method | |
| |
| |
| |
Internal consistency method | |
| |
| |
| |
Standard error of measurement | |
| |
| |
| |
The reliability of human raters | |
| |
| |
| |
Conclusion | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
| |
Technology and language testing | |
| |
| |
| |
Introduction | |
| |
| |
| |
Issues in technology and language testing | |
| |
| |
| |
Technology and test taker attitudes | |
| |
| |
| |
Language performance and different media | |
| |
| |
| |
Technology and the construct to be measured | |
| |
| |
| |
Technology and assessment tasks | |
| |
| |
| |
The limits of automated scoring | |
| |
| |
| |
Technology and language task types | |
| |
| |
| |
Listening tasks | |
| |
| |
| |
Integrated listening and speaking tasks | |
| |
| |
| |
Writing tasks | |
| |
| |
| |
Reading tasks | |
| |
| |
| |
The promise and threats of automated scoring | |
| |
| |
| |
Examples of current automated scoring programs | |
| |
| |
| |
Concerns about automated scoring | |
| |
| |
| |
Test feedback and reporting | |
| |
| |
| |
Online and computer-based resources for statistics | |
| |
| |
| |
Microsoft® Excel | |
| |
| |
| |
Online resources | |
| |
| |
| |
Conclusion | |
| |
| |
Further reading | |
| |
| |
Exercises | |
| |
| |
Afterword: the rubber ruler revisited | |
| |
| |
References | |
| |
| |
Index | |