| |
| |
| |
An Introduction to Contemporary Educational Testing and Measurement | |
| |
| |
Tests Are Only Tools: Their Usefulness Can Vary | |
| |
| |
Why We Developed This Text: Enhancing Test Usefulness | |
| |
| |
Technical Adequacy | |
| |
| |
Test User Competency | |
| |
| |
Matching the Test's Intended Purpose | |
| |
| |
Matching Diverse Test-Takers to the Test | |
| |
| |
Test Results and Diversity Considerations | |
| |
| |
Tests Are Only Tools: A Video Beats a Photo | |
| |
| |
Defining Some Test-Related Terms | |
| |
| |
Tests, Assessments, and the Assessment Process | |
| |
| |
Types of Tests/Assessments | |
| |
| |
Recent Developments: Impact on Classroom Testing and Measurement | |
| |
| |
Education Reform Meets Special Education Reform: NCLB and IDEIA | |
| |
| |
The Impact on Regular Education Teachers of the IDEIA and NCLB | |
| |
| |
Other Trends: Technology, Globalization, and International Competitiveness | |
| |
| |
Competency Testing for Teachers | |
| |
| |
Increased Interest from Professional Groups | |
| |
| |
A Professional Association-Book Publisher Information Initiative | |
| |
| |
Effects on the Classroom Teacher | |
| |
| |
About the Text | |
| |
| |
What if You're "No Good in Math"? | |
| |
| |
Summary | |
| |
| |
For Discussion | |
| |
| |
| |
High-Stakes Testing | |
| |
| |
Comparing NCLB and State High-Stakes Testing Programs | |
| |
| |
High-Stakes Testing: A Nationwide Phenomenon | |
| |
| |
High-Stakes Tests Are Only Tools | |
| |
| |
Why Does High-Stakes Testing Matter? | |
| |
| |
Promotion and Graduation Decisions Affect Students | |
| |
| |
Principal and Teacher Incentives Are Linked to HST Performance | |
| |
| |
Property Values, Business Decisions, and Politics and HST | |
| |
| |
The Lake Wobegon Effect and HST | |
| |
| |
The History of High-Stakes Testing | |
| |
| |
Education Reform | |
| |
| |
Standards-Based Reform | |
| |
| |
Types of High-Stakes Tests | |
| |
| |
Criterion-Referenced High-Stakes Tests | |
| |
| |
Norm-Referenced High-Stakes Tests | |
| |
| |
Benchmark Tests and High-Stakes Tests | |
| |
| |
The High-Stakes Testing Backlash | |
| |
| |
Is There Really a High-Stakes Testing Backlash? | |
| |
| |
What Do National Organizations Say About High-Stakes Tests? | |
| |
| |
AERA's Twelve Conditions for HST Programs | |
| |
| |
How Can a Teacher Use the Twelve Conditions? | |
| |
| |
Helping Students (and Yourself) Prepare for High-Stakes Tests | |
| |
| |
Focus on the Task, Not Your Feelings About It | |
| |
| |
Inform Students and Parents About the Importance of the Test | |
| |
| |
Teach Test-Taking Skills as Part of Regular Instruction | |
| |
| |
As the Test Day Approaches, Respond to Student Questions Openly and Directly | |
| |
| |
Take Advantage of Whatever Preparation Materials Are Available | |
| |
| |
Summary | |
| |
| |
For Discussion | |
| |
| |
| |
Response-to-Intervention (RTI) and the Regular Classroom Teacher | |
| |
| |
What Is RTI? | |
| |
| |
What if You Have Not Heard of RTI Before? | |
| |
| |
How New Is RTI? | |
| |
| |
Do Regular Education Teachers Need to Know About RTI? | |
| |
| |
An RTI Scenario | |
| |
| |
How Important Is RTI to Regular Education Teachers? | |
| |
| |
Can a Special Education Law Reform Regular Education? | |
| |
| |
How Is RTI Supposed to Help Students and Schools? | |
| |
| |
RTI Definitions, Components, and Implementation Approaches | |
| |
| |
RTI Definitions | |
| |
| |
RTI Components | |
| |
| |
RTI Implementation Approaches | |
| |
| |
How Widely Is RTI Being Implemented? | |
| |
| |
Some Benefits of RTI | |
| |
| |
RTI: The Promise and Some Controversies | |
| |
| |
Technical Issues: Reliability, Validity, and Fairness | |
| |
| |
Implementation Issues | |
| |
| |
| |
The Purpose of Testing | |
| |
| |
Testing, Accountability, and the Classroom Teacher | |
| |
| |
Types of Educational Decisions | |
| |
| |
A Pinch of Salt | |
| |
| |
"Pinching" in the Classroom | |
| |
| |
What to Measure | |
| |
| |
How to Measure | |
| |
| |
Written Tests | |
| |
| |
Summary | |
| |
| |
For Discussion | |
| |
| |
| |
Norm-Referenced and Criterion-Referenced Tests and Content Validity Evidence | |
| |
| |
Defining Norm-Referenced and Criterion-Referenced Tests | |
| |
| |
Comparing Norm-Referenced and Criterion-Referenced Tests | |
| |
| |
Differences in the Construction of Norm-Referenced and Criterion-Referenced Tests | |
| |
| |
Norm- and Criterion-Referenced Tests and Linguistic and Cultural Diversity | |
| |
| |
Norm- and Criterion-Referenced Tests and Validity Evidence | |
| |
| |
A Three-Stage Model of Classroom Measurement | |
| |
| |
Why Objectives? Why Not Just Write Test Items? | |
| |
| |
Where Do Goals Come From? | |
| |
| |
Are There Different Kinds of Goals and Objectives? | |
| |
| |
How Can Instructional Objectives Make a Teacher's Job Easier? | |
| |
| |
Summary | |
| |
| |
For Discussion | |
| |
| |
| |
Measuring Learning Outcomes | |
| |
| |
Writing Instructional Objectives | |
| |
| |
Identifying Learning Outcomes | |
| |
| |
Identifying Observable and Directly Measurable Learning Outcomes | |
| |
| |
Stating Conditions | |
| |
| |
Stating Criterion Levels | |
| |
| |
Keeping It Simple and Straightforward | |
| |
| |
Matching Test Items to Instructional Objectives | |
| |
| |
Taxonomy of Educational Objectives | |
| |
| |
Cognitive Domain | |
| |
| |
Affective Domain | |
| |
| |
The Psychomotor Domain | |
| |
| |
The Test Blueprint | |
| |
| |
Content Outline | |
| |
| |
Categories | |
| |
| |
Number of Items | |
| |
| |
Functions | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Writing Objective Test Items | |
| |
| |
Which Format? | |
| |
| |
True-False Items | |
| |
| |
Suggestions for Writing True-False Items | |
| |
| |
Matching Items | |
| |
| |
Faults Inherent in Matching Items | |
| |
| |
Suggestions for Writing Matching Items | |
| |
| |
Multiple-Choice Items | |
| |
| |
Higher-Level Multiple-Choice Questions | |
| |
| |
Suggestions for Writing Multiple-Choice Items | |
| |
| |
Completion Items | |
| |
| |
Suggestions for Writing Completion Items | |
| |
| |
Gender and Racial Bias in Test Items | |
| |
| |
Guidelines for Writing Test Items | |
| |
| |
Advantages and Disadvantages of Different Objective Item Formats | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Writing Essay Test Items | |
| |
| |
What Is an Essay Item? | |
| |
| |
Essay Items Should Measure Complex Cognitive Skills or Processes | |
| |
| |
Essay Items: Extended or Restricted Response | |
| |
| |
Examples of Restricted Response Essays | |
| |
| |
Pros and Cons of Essay Items | |
| |
| |
Advantages of the Essay Item | |
| |
| |
Disadvantages of the Essay Item | |
| |
| |
Suggestions for Writing Essay Items | |
| |
| |
Scoring Essay Questions | |
| |
| |
Scoring Extended Response and Higher Level Questions | |
| |
| |
General Essay Scoring Suggestions | |
| |
| |
Assessing Knowledge Organization | |
| |
| |
Open-Book Questions and Exams | |
| |
| |
Some Open-Book Techniques | |
| |
| |
Guidelines for Planning Essays, Knowledge Organization, and Open-Book Questions and Exams | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Performance-Based Assessment | |
| |
| |
Performance Tests: Direct Measures of Competence | |
| |
| |
Performance Tests Can Assess Processes and Products | |
| |
| |
Performance Tests Can Be Embedded in Lessons | |
| |
| |
Performance Tests Can Assess Affective and Social Skills | |
| |
| |
Developing Performance Tests for Your Learners | |
| |
| |
| |
Deciding What to Test | |
| |
| |
| |
Designing the Assessment Context | |
| |
| |
| |
Specifying the Scoring Rubrics | |
| |
| |
| |
Specifying Testing Constraints | |
| |
| |
A Final Word | |
| |
| |
Summary | |
| |
| |
For Discussion and Practice | |
| |
| |
| |
Portfolio Assessment | |
| |
| |
Ensuring Validity of the Portfolio | |
| |
| |
Developing Portfolio Assessments | |
| |
| |
| |
Deciding on the Purposes for a Portfolio | |
| |
| |
| |
Identifying Cognitive Skills and Dispositions | |
| |
| |
| |
Deciding Who Will Plan the Portfolio | |
| |
| |
| |
Deciding Which Products to Put in the Portfolio and How Many Samples of Each Product | |
| |
| |
| |
Building the Portfolio Rubrics | |
| |
| |
| |
Developing a Procedure to Aggregate All Portfolio Ratings | |
| |
| |
| |
Determining the Logistics | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Administering, Analyzing, and Improving the Test or Assessment | |
| |
| |
Assembling the Test | |
| |
| |
Packaging the Test | |
| |
| |
Reproducing the Test | |
| |
| |
Administering the Test | |
| |
| |
Scoring the Test | |
| |
| |
Analyzing the Test | |
| |
| |
Quantitative Item Analysis | |
| |
| |
Qualitative Item Analysis | |
| |
| |
Item Analysis Modifications for the Criterion-Referenced Test | |
| |
| |
Debriefing | |
| |
| |
Debriefing Guidelines | |
| |
| |
The Process of Evaluating Classroom Achievement | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Marks and Marking Systems | |
| |
| |
What Is the Purpose of a Mark? | |
| |
| |
Why Be Concerned About Marking? | |
| |
| |
What Should a Mark Reflect? | |
| |
| |
Marking Systems | |
| |
| |
Types of Comparisons | |
| |
| |
Types of Symbols | |
| |
| |
Combining and Weighting the Components of a Mark | |
| |
| |
Who Is the Better Teacher? | |
| |
| |
Combining Grades into a Single Mark | |
| |
| |
Practical Approaches to Equating Before Weighting in the Busy Classroom | |
| |
| |
Front-End Equating | |
| |
| |
Back-End Equating | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Summarizing Data and Measures of Central Tendency | |
| |
| |
What Are Statistics? | |
| |
| |
Why Use Statistics? | |
| |
| |
Tabulating Frequency Data | |
| |
| |
The List | |
| |
| |
The Simple Frequency Distribution | |
| |
| |
The Grouped Frequency Distribution | |
| |
| |
Steps in Constructing a Grouped Frequency Distribution | |
| |
| |
Graphing Data | |
| |
| |
The Bar Graph, or Histogram | |
| |
| |
The Frequency Polygon | |
| |
| |
The Smooth Curve | |
| |
| |
Measures of Central Tendency | |
| |
| |
The Mean | |
| |
| |
The Median | |
| |
| |
The Mode | |
| |
| |
The Measures of Central Tendency in Various Distributions | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Variability, the Normal Distribution, and Converted Scores | |
| |
| |
The Range | |
| |
| |
The Semi-Interquartile Range (SIQR) | |
| |
| |
The Standard Deviation | |
| |
| |
The Deviation Score Method for Computing the Standard Deviation | |
| |
| |
The Raw Score Method for Computing the Standard Deviation | |
| |
| |
The Normal Distribution | |
| |
| |
Properties of the Normal Distribution | |
| |
| |
Converted Scores | |
| |
| |
z-Scores | |
| |
| |
T-Scores | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Correlation | |
| |
| |
The Correlation Coefficient | |
| |
| |
Strength of a Correlation | |
| |
| |
Direction of a Correlation | |
| |
| |
Scatterplots | |
| |
| |
Where Does r Come From? | |
| |
| |
Causality | |
| |
| |
Other Interpretive Cautions | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Validity Evidence | |
| |
| |
Why Evaluate Tests? | |
| |
| |
Types of Validity Evidence | |
| |
| |
Content Validity Evidence | |
| |
| |
Criterion-Related Validity Evidence | |
| |
| |
Construct Validity Evidence | |
| |
| |
What Have We Been Saying? A Review | |
| |
| |
Interpreting Validity Coefficients | |
| |
| |
Content Validity Evidence | |
| |
| |
Concurrent and Predictive Validity Evidence | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Reliability | |
| |
| |
Methods of Estimating Reliability | |
| |
| |
Test-Retest or Stability | |
| |
| |
Alternate Forms or Equivalence | |
| |
| |
Internal Consistency | |
| |
| |
Interpreting Reliability Coefficients | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
Error: What Is It? | |
| |
| |
The Standard Error of Measurement | |
| |
| |
Using the Standard Error of Measurement | |
| |
| |
More Applications | |
| |
| |
Standard Deviation or Standard Error of Measurement? | |
| |
| |
Why All the Fuss About Error? | |
| |
| |
Error Within Test-Takers | |
| |
| |
Error Within the Test | |
| |
| |
Error in Test Administration | |
| |
| |
Error in Scoring | |
| |
| |
Sources of Error Influencing Various Reliability Coefficients | |
| |
| |
Test-Retest | |
| |
| |
Alternate Forms | |
| |
| |
Internal Consistency | |
| |
| |
Band Interpretation | |
| |
| |
Steps: Band Interpretation | |
| |
| |
A Final Word | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Standardized Tests | |
| |
| |
What Is a Standardized Test? | |
| |
| |
Do Test Stimuli, Administration, and Scoring Have to Be Standardized? | |
| |
| |
Standardized Testing: Effects of Accommodations and Alternative Assessments | |
| |
| |
Uses of Standardized Achievement Tests | |
| |
| |
Will Performance and Portfolio Assessment Make Standardized Tests Obsolete? | |
| |
| |
Administering Standardized Tests | |
| |
| |
Types of Scores Offered for Standardized Achievement Tests | |
| |
| |
Grade Equivalents | |
| |
| |
Age Equivalents | |
| |
| |
Percentile Ranks | |
| |
| |
Standard Scores | |
| |
| |
Interpreting Standardized Tests: Test and Student Factors | |
| |
| |
Test-Related Factors | |
| |
| |
Student-Related Factors | |
| |
| |
Aptitude-Achievement Discrepancies | |
| |
| |
Interpreting Standardized Tests: Parent-Teacher Conferences and Educational Decision Making | |
| |
| |
An Example: Pressure to Change an Educational Placement | |
| |
| |
A Second Example: Pressure from the Opposite Direction | |
| |
| |
Interpreting Standardized Tests: Score Reports from Publishers | |
| |
| |
The Press-On Label | |
| |
| |
A Criterion-Referenced Skills Analysis or Mastery Report | |
| |
| |
An Individual Performance Profile | |
| |
| |
Other Publisher Reports and Services | |
| |
| |
Summary | |
| |
| |
For Practice | |
| |
| |
| |
Types of Standardized Tests | |
| |
| |
Standardized Achievement Tests | |
| |
| |
Achievement Test Batteries, or Survey Batteries | |
| |
| |
Single-Subject Achievement Tests | |
| |
| |
Diagnostic Achievement Aptitude Tests | |
| |
| |
Standardized Academic Aptitude Tests | |
| |
| |
The History of Academic Aptitude Testing | |
| |
| |
Stability of IQ Scores | |
| |
| |
What Do IQ Tests Predict? | |
| |
| |
Individually Administered Academic Aptitude Tests | |
| |
| |
Group-Administered Academic Aptitude Tests | |
| |
| |
Standardized Personality Assessment Instruments | |
| |
| |
What Is Personality? | |
| |
| |
Objective Personality Tests | |
| |
| |
Projective Personality Tests | |
| |
| |
Summary | |
| |
| |
For Discussion | |
| |
| |
| |
In the Classroom: A Summary Dialogue | |
| |
| |
High-Stakes Testing and NCLB | |
| |
| |
Response-to-Intervention (RTI) | |
| |
| |
Criterion-Referenced Versus Norm-Referenced Tests | |
| |
| |
New Responsibilities for Teachers Under IDEIA | |
| |
| |
Instructional Objectives | |
| |
| |
The Test Blueprint | |
| |
| |
Essay Items and the Essay Scoring Guides | |
| |
| |
Reliability, Validity Evidence, and Test Statistics | |
| |
| |
Grades and Marks | |
| |
| |
Some Final Thoughts | |
| |
| |
| |
Math Skills Review | |
| |
| |
| |
Preparing for the Praxis II: Principles of Learning and Teaching Assessment | |
| |
| |
| |
Determining the Median When There Are Multiple Tied Middle Scores | |
| |
| |
| |
Pearson Product-Moment Correlation | |
| |
| |
| |
Statistics and Measurement Texts | |
| |
| |
| |
Answers for Practice Questions | |
| |
| |
Suggested Readings | |
| |
| |
References | |
| |
| |
Credits | |
| |
| |
Index | |