**Predictive Validity**

PURPOSE

In this activity, the predictive validity of two tests for selecting applicants to a university is studied. Test scores for 40 individuals, tested in their senior year of high school, are provided along with their grade point average (GPA) after the first year of college. The student computes the correlation between each test and GPA, prepares expectancy tables to show the relationship graphically, and sets cutting scores to examine the different types of prediction errors.

DATA FOR THE EXERCISE

Table 1 presents the data. The first column contains scores on the “College Aptitude Test,” a test of verbal comprehension, quantitative reasoning, and abstract thinking. The second test, the “World Affairs Test,” covers knowledge of current events, political affairs, cultural and sports activities, and general information in recent history.

These tests were administered while the students were still in high school, but university officials did not use the results in deciding whom to select. The test results were “locked up” and not made available to anyone selecting or counselling students, or to any faculty members teaching their courses.

Test scores are withheld from decision makers so that the usefulness of the tests, over and above the current method of selection, can be examined cleanly. Keeping the scores from the university faculty reduces the chance that the criterion (GPA) is artificially related to, that is, “contaminated” by, the test scores. For example, teachers who knew the scores might give extra encouragement to high scorers and ignore low scorers.

The research design described here is the only appropriate way to study predictive validity. Unfortunately, practical considerations often lead to test scores being used for selection before they have been properly validated.
The criterion for the research is the grade point average (GPA) after one year of studies. This is the cumulative GPA for all courses taken. A word of caution: GPA at any one point in time may be unreliable, and it certainly measures only a limited range of academic performance.

CORRELATION

Tables 2 and 3 should be used to compute the Spearman rank-order correlation between each test and the criterion. (The instructor may instead ask the class to compute Pearson product-moment correlation coefficients.)

EXPECTANCY TABLE

Next, the student should prepare an expectancy table for the College Aptitude Test. An expectancy table shows the probability of reaching various levels on the criterion (GPA) for each level on the test. For example, if an examinee gets a low score on the test, what is the probability of earning a GPA of 3.5 or above?

Tables 4 and 5 should be used for this step. In Table 4, make a tally mark in the “box” corresponding to the test and criterion scores for each subject. On the right-hand side of the table, record the number of subjects at each score level. In Table 5, record the percent of subjects at each score level who earn the various grade point averages; for example, if 10 students get test scores from 501-600, figure out what percent of those 10 get a GPA of less than 1.0, 1.0-1.4, and so on. Note that these are “horizontal” percents: each row answers the question, of all the subjects at this score level on the test, what proportion achieved each level of GPA?

The validity coefficient and the expectancy table for any test should be cross-validated on another comparable sample of subjects before the test is actually used. Cross-validation should always be done where it is technically feasible. For the purpose of this exercise, cross-validation will not be carried out.

A test that has predictive validity can be used for selection purposes along with other application information. The class should identify what score level could be used to screen future applicants.
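The rank-order correlation and the “horizontal” percents described above can be sketched in Python as follows. This is a minimal illustration and not part of the original exercise: the six score/GPA pairs, the band boundaries, and the function names (`average_ranks`, `spearman`, `band_of`, `expectancy_table`) are hypothetical stand-ins. The real values should come from Table 1, and the real bands from Tables 4 and 5.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def band_of(value, bands):
    """bands: (label, lower_bound) pairs in ascending order of lower_bound."""
    chosen = bands[0][0]
    for label, lower in bands:
        if value >= lower:
            chosen = label
    return chosen


def expectancy_table(scores, gpas, score_bands, gpa_bands):
    """For each score band, return n and the row percents per GPA band."""
    tallies = {label: Counter() for label, _ in score_bands}
    for s, g in zip(scores, gpas):
        tallies[band_of(s, score_bands)][band_of(g, gpa_bands)] += 1
    table = {}
    for label, counts in tallies.items():
        n = sum(counts.values())
        percents = {g: (100 * counts[g] / n if n else 0.0) for g, _ in gpa_bands}
        table[label] = {"n": n, "percents": percents}
    return table


# Hypothetical toy data, NOT the values from Table 1.
toy_scores = [420, 450, 530, 560, 610, 640]
toy_gpas = [1.8, 2.4, 2.1, 3.0, 2.9, 3.6]
# Hypothetical bands; the exercise's own tables use finer score levels.
toy_score_bands = [("below 500", 0), ("500 and above", 500)]
toy_gpa_bands = [("below 2.0", 0.0), ("2.0-2.9", 2.0), ("3.0 and above", 3.0)]

print(round(spearman(toy_scores, toy_gpas), 3))
for band, row in expectancy_table(
    toy_scores, toy_gpas, toy_score_bands, toy_gpa_bands
).items():
    print(band, row["n"], row["percents"])
```

Each row's percents are computed from that row's own count, so every non-empty row sums to 100. This is what makes them “horizontal” percents rather than percents of the whole sample.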
DECISION THEORY

The correlation coefficient gives a single, summary index of the predictive validity of the test. The expectancy table shows the probability of earning various GPAs given a particular test score. The scatter plot in Table 4 provides additional information about correct and incorrect predictions for individuals.

Note that there is a heavy line at the 2.0 GPA. This can be called the criterion cut-off. Below this level, students are “on probation,” and if they continue to perform this way, they will not graduate. We might say that these students have been “unsuccessful” (at least thus far in their academic careers). Also note that there is a heavy black line at the test score of 500. This can be called the test cut-off. Above this line, there is a high probability of success; below the line, a low probability of success. What are these probabilities?

The two cut-offs divide the scatter plot into four segments:

1. High hits: students who scored high on the test and were successful. Correct predictions.
2. False positives: students who scored high on the test but were unsuccessful. Errors in prediction.
3. Low hits: students who scored low on the test and were unsuccessful. Correct predictions.
4. False negatives: students who scored low on the test but were successful. Errors in prediction.

In general, a valid test will result in more correct predictions (hits) than errors. The interesting, and sometimes tragic, cases are the errors in prediction (segments 2 and 4). If the test were used to decide who should be admitted to the university, some individuals would be rejected who would, in fact, have succeeded (false negatives), and some individuals would be accepted who would, in fact, fail (false positives). The realization that a single test results in errors of prediction (i.e., false positives and false negatives) leads us to use more than one selection device for assessment.
Additional information about applicants can be gathered by interviews, application blanks, letters of recommendation, etc. Even under the best conditions, predictions about academic success are not always accurate. Many unknown and changing factors in the individual and school environment prevent error-free prediction.
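The four-segment classification from the decision theory section, and the question “What are these probabilities?”, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the score/GPA pairs are hypothetical (not the Table 1 data), the function names `classify` and `summarize` are invented for this example, and a score of exactly 500 or a GPA of exactly 2.0 is assumed to fall on the “high” or “successful” side of the cut-off, which the text does not specify.

```python
from collections import Counter

SEGMENTS = ("high hit", "false positive", "low hit", "false negative")


def classify(score, gpa, test_cutoff=500, criterion_cutoff=2.0):
    """Assign one student to one of the four scatter-plot segments.

    Assumption: values exactly at a cut-off count as high / successful.
    """
    predicted_success = score >= test_cutoff
    actual_success = gpa >= criterion_cutoff
    if predicted_success and actual_success:
        return "high hit"
    if predicted_success:
        return "false positive"  # scored high but was unsuccessful
    if actual_success:
        return "false negative"  # scored low but was successful
    return "low hit"


def summarize(scores, gpas, test_cutoff=500, criterion_cutoff=2.0):
    """Count the four segments and estimate the success probabilities."""
    counts = Counter(
        classify(s, g, test_cutoff, criterion_cutoff)
        for s, g in zip(scores, gpas)
    )
    n_high = counts["high hit"] + counts["false positive"]
    n_low = counts["low hit"] + counts["false negative"]
    n_all = n_high + n_low
    return {
        "counts": {segment: counts[segment] for segment in SEGMENTS},
        # Proportion of high scorers who succeeded.
        "p_success_given_high": counts["high hit"] / n_high if n_high else None,
        # Proportion of low scorers who succeeded anyway.
        "p_success_given_low": counts["false negative"] / n_low if n_low else None,
        # Overall proportion of correct predictions (hits).
        "proportion_correct": (
            (counts["high hit"] + counts["low hit"]) / n_all if n_all else None
        ),
    }


# Hypothetical toy data, NOT the values from Table 1.
toy_scores = [420, 450, 530, 560, 610, 640]
toy_gpas = [1.8, 2.4, 1.9, 3.0, 2.9, 3.6]

for key, value in summarize(toy_scores, toy_gpas).items():
    print(key, value)
```

Rerunning `summarize` with a different `test_cutoff` shows the trade-off between the two kinds of error: raising the cut-off tends to reduce false positives while increasing false negatives, and lowering it does the reverse.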