| |
| |
Preface | |
| |
| |
The Conceptual Orientation of This Book, Its Purpose, and the Intended Audience | |
| |
| |
Organizational Overview | |
| |
| |
Acknowledgments | |
| |
| |
About the Authors | |
| |
| |
| |
Psychometrics and the Importance of Psychological Measurement | |
| |
| |
Why Psychological Testing Matters to You | |
| |
| |
Observable Behavior and Unobservable Psychological Attributes | |
| |
| |
Psychological Tests: Definition and Types | |
| |
| |
What Is a Psychological Test? | |
| |
| |
Types of Tests | |
| |
| |
Psychometrics | |
| |
| |
What Is Psychometrics? | |
| |
| |
Francis Galton and the Scope of Psychometrics | |
| |
| |
Challenges to Measurement in Psychology | |
| |
| |
Theme: The Importance of Individual Differences | |
| |
| |
Suggested Readings | |
| |
| |
| |
Basic Concepts in Measurement | |
| |
| |
| |
Scaling | |
| |
| |
Fundamental Issues With Numbers | |
| |
| |
The Property of Identity | |
| |
| |
The Property of Order | |
| |
| |
The Property of Quantity | |
| |
| |
The Number 0 | |
| |
| |
Units of Measurement | |
| |
| |
Additivity and Counting | |
| |
| |
Additivity | |
| |
| |
Counts: When Do They Qualify as Measurement? | |
| |
| |
Four Scales of Measurement | |
| |
| |
Nominal Scales | |
| |
| |
Ordinal Scales | |
| |
| |
Interval Scales | |
| |
| |
Ratio Scales | |
| |
| |
Scales of Measurement: Practical Implications | |
| |
| |
Additional Issues Regarding Scales of Measurement | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Individual Differences and Correlations | |
| |
| |
The Nature of Variability | |
| |
| |
Importance of Individual Differences | |
| |
| |
Variability and Distributions of Scores | |
| |
| |
Central Tendency | |
| |
| |
Variability | |
| |
| |
Distribution Shapes and Normal Distributions | |
| |
| |
Quantifying the Association Between Distributions | |
| |
| |
Interpreting the Association Between Two Variables | |
| |
| |
Covariance | |
| |
| |
Correlation | |
| |
| |
Variance and Covariance for "Composite Variables" | |
| |
| |
Binary Items | |
| |
| |
Interpreting Test Scores | |
| |
| |
z Scores (Standard Scores) | |
| |
| |
Converted Standard Scores (Standardized Scores) | |
| |
| |
Percentile Ranks | |
| |
| |
Normalized Scores | |
| |
| |
Test Norms | |
| |
| |
Representativeness of the Reference Sample | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Test Dimensionality and Factor Analysis | |
| |
| |
Test Dimensionality | |
| |
| |
Three Dimensionality Questions | |
| |
| |
Unidimensional Tests | |
| |
| |
Multidimensional Tests With Correlated Dimensions (Tests With Higher-Order Factors) | |
| |
| |
Multidimensional Tests With Uncorrelated Dimensions | |
| |
| |
The Psychological Meaning of Test Dimensions | |
| |
| |
Factor Analysis: Examining the Dimensionality of a Test | |
| |
| |
The Logic and Purpose of Exploratory Factor Analysis: A Conceptual Overview | |
| |
| |
Conducting and Interpreting an Exploratory Factor Analysis | |
| |
| |
A Quick Look at Confirmatory Factor Analysis | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Reliability | |
| |
| |
| |
Reliability: Conceptual Basis | |
| |
| |
Overview of Reliability and Classical Test Theory | |
| |
| |
Observed Scores, True Scores, and Measurement Error | |
| |
| |
Variances in Observed Scores, True Scores, and Error Scores | |
| |
| |
Four Ways to Think of Reliability | |
| |
| |
Reliability as the Ratio of True Score Variance to Observed Score Variance | |
| |
| |
Lack of Error Variance | |
| |
| |
The (Squared) Correlation Between Observed Scores and True Scores | |
| |
| |
Lack of (Squared) Correlation Between Observed Scores and Error Scores | |
| |
| |
Reliability and the Standard Error of Measurement | |
| |
| |
Parallel Tests | |
| |
| |
Domain Sampling Theory | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Empirical Estimates of Reliability | |
| |
| |
Alternate Forms Reliability | |
| |
| |
Test-Retest Reliability | |
| |
| |
Internal Consistency Reliability | |
| |
| |
Split-Half Estimates of Reliability | |
| |
| |
"Raw" Coefficient Alpha | |
| |
| |
"Standardized" Coefficient Alpha | |
| |
| |
Raw Alpha for Binary Items: KR-20 | |
| |
| |
Accuracy and Use of Internal Consistency Estimates of Reliability: Theory and Reality | |
| |
| |
Internal Consistency Versus Dimensionality | |
| |
| |
Factors Affecting the Reliability of Test Scores | |
| |
| |
Sample Homogeneity and Reliability Generalization | |
| |
| |
Reliability of Difference Scores | |
| |
| |
Estimating the Reliability of Difference Scores | |
| |
| |
Factors Affecting the Reliability of Difference Scores | |
| |
| |
The Problem of Unequal Variability | |
| |
| |
Difference Scores: Summary and Caution | |
| |
| |
Summary | |
| |
| |
Note | |
| |
| |
Suggested Readings | |
| |
| |
| |
The Importance of Reliability | |
| |
| |
Applied Behavioral Practice: Evaluation of an Individual's Test Score | |
| |
| |
Point Estimates of True Scores | |
| |
| |
True Score Confidence Intervals | |
| |
| |
Behavioral Research | |
| |
| |
Reliability, True Associations, and Observed Associations | |
| |
| |
Measurement Error (Low Reliability) Attenuates the Observed Associations Between Measures | |
| |
| |
Reliability, Effect Sizes, and Statistical Significance | |
| |
| |
Implications for Conducting and Interpreting Behavioral Research | |
| |
| |
Test Construction and Refinement | |
| |
| |
Item Discrimination and Other Information Regarding Internal Consistency | |
| |
| |
Item Difficulty (Mean) and Item Variance | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Validity | |
| |
| |
| |
Validity: Conceptual Basis | |
| |
| |
What Is Validity? | |
| |
| |
The Importance of Validity | |
| |
| |
Validity Evidence: Test Content | |
| |
| |
Threats to Content Validity | |
| |
| |
Content Validity Versus Face Validity | |
| |
| |
Validity Evidence: Internal Structure of the Test | |
| |
| |
Validity Evidence: Response Processes | |
| |
| |
Validity Evidence: Associations With Other Variables | |
| |
| |
Validity Evidence: Consequences of Testing | |
| |
| |
Other Perspectives on Validity | |
| |
| |
Contrasting Reliability and Validity | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Estimating and Evaluating Convergent and Discriminant Validity Evidence | |
| |
| |
Methods for Evaluating Convergent and Discriminant Validity | |
| |
| |
Focused Associations | |
| |
| |
Sets of Correlations | |
| |
| |
Multitrait-Multimethod Matrices | |
| |
| |
Quantifying Construct Validity | |
| |
| |
Factors Affecting a Validity Coefficient | |
| |
| |
Associations Between Constructs | |
| |
| |
Measurement Error and Reliability | |
| |
| |
Restricted Range | |
| |
| |
Skew and Relative Proportions | |
| |
| |
Method Variance | |
| |
| |
Time | |
| |
| |
Predictions of Single Events | |
| |
| |
Interpreting a Validity Coefficient | |
| |
| |
Squared Correlations and "Variance Explained" | |
| |
| |
Estimating Practical Effects: Binomial Effect Size Display, Taylor-Russell Tables, Utility Analysis, and Sensitivity/Specificity | |
| |
| |
Guidelines or Norms for a Field | |
| |
| |
Statistical Significance | |
| |
| |
Summary | |
| |
| |
Notes | |
| |
| |
Suggested Readings | |
| |
| |
| |
Threats to Psychometric Quality | |
| |
| |
| |
Response Biases | |
| |
| |
Types of Response Biases | |
| |
| |
Acquiescence Bias ("Yea-Saying and Nay-Saying") | |
| |
| |
Extreme and Moderate Responding | |
| |
| |
Social Desirability | |
| |
| |
Malingering | |
| |
| |
Careless or Random Responding | |
| |
| |
Guessing | |
| |
| |
Methods for Coping With Response Biases | |
| |
| |
Minimizing the Existence of Bias by Managing the Testing Context | |
| |
| |
Minimizing the Existence of Bias by Managing Test Content | |
| |
| |
Minimizing the Effects of Bias by Managing Test Content or Scoring | |
| |
| |
Managing Test Content to Detect Bias and Intervene | |
| |
| |
Using Specialized Tests to Detect Bias and Intervene | |
| |
| |
Response Biases, Response Sets, and Response Styles | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Test Bias | |
| |
| |
Why Worry About Test Score Bias? | |
| |
| |
Detecting Construct Bias: Internal Evaluation of a Test | |
| |
| |
Item Discrimination Index | |
| |
| |
Factor Analysis | |
| |
| |
Differential Item Functioning Analyses | |
| |
| |
Rank Order | |
| |
| |
Summary | |
| |
| |
Detecting Predictive Bias: External Evaluation of a Test | |
| |
| |
Basics of Regression Analysis | |
| |
| |
One Size Fits All: The Common Regression Equation | |
| |
| |
Intercept Bias | |
| |
| |
Slope Bias | |
| |
| |
Intercept and Slope Bias | |
| |
| |
Outcome Score Bias | |
| |
| |
The Effect of Reliability | |
| |
| |
Other Statistical Procedures | |
| |
| |
Test Fairness | |
| |
| |
Example: Is the SAT Biased in Terms of Race or Socioeconomic Status? | |
| |
| |
Race/Ethnicity | |
| |
| |
Socioeconomic Status | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Advanced Psychometric Approaches | |
| |
| |
| |
Confirmatory Factor Analysis | |
| |
| |
On the Use of EFA and CFA | |
| |
| |
The Frequency and Roles of EFA and CFA | |
| |
| |
Using CFA to Evaluate Measurement Models | |
| |
| |
The Process of CFA for Analysis of a Scale's Internal Structure | |
| |
| |
Overview of CFA and Example | |
| |
| |
Preliminary Steps | |
| |
| |
| |
Specification of Measurement Model | |
| |
| |
| |
Computations | |
| |
| |
| |
Interpreting and Reporting Output | |
| |
| |
| |
Model Modification and Reanalysis (If Necessary) | |
| |
| |
Comparing Models | |
| |
| |
Summary | |
| |
| |
CFA and Reliability | |
| |
| |
CFA and Validity | |
| |
| |
Summary | |
| |
| |
| |
Generalizability Theory | |
| |
| |
Multiple Facets of Measurement | |
| |
| |
Generalizability, Universes, and Variance Components | |
| |
| |
G Studies and D Studies | |
| |
| |
Conducting and Interpreting Generalizability Theory Analysis: A One-Facet Design | |
| |
| |
| |
G Study | |
| |
| |
| |
D Study | |
| |
| |
Conducting and Interpreting Generalizability Theory Analysis: A Two-Facet Design | |
| |
| |
| |
G Study | |
| |
| |
| |
D Study | |
| |
| |
Other Measurement Designs | |
| |
| |
Number of Facets | |
| |
| |
Random Versus Fixed Facets | |
| |
| |
Crossed Versus Nested Designs | |
| |
| |
Relative Versus Absolute Decisions | |
| |
| |
Summary | |
| |
| |
Suggested Readings | |
| |
| |
| |
Item Response Theory and Rasch Models | |
| |
| |
Factors Affecting Responses to Test Items | |
| |
| |
Respondent Trait Level as a Determinant of Item Responses | |
| |
| |
Item Difficulty as a Determinant of Item Responses | |
| |
| |
Item Discrimination as a Determinant of Item Responses | |
| |
| |
Guessing | |
| |
| |
IRT Measurement Models | |
| |
| |
One-Parameter Logistic Model (or Rasch Model) | |
| |
| |
Two-Parameter Logistic Model | |
| |
| |
Graded Response Model | |
| |
| |
Obtaining Parameter Estimates: A 1PL Example | |
| |
| |
Item and Test Information | |
| |
| |
Item Characteristic Curves | |
| |
| |
Item Information and Test Information | |
| |
| |
Applications of IRT | |
| |
| |
Test Development and Improvement | |
| |
| |
Differential Item Functioning | |
| |
| |
Person Fit | |