Have a nice day. Well, this major was having a pretty good day until a familiar voice called him on the phone. That's true. Now, wait a minute, Doc. You don't mean that half of our graduates can't... You saw it yourself? I don't know what to tell you. I'm fresh out of alibis. Hold it for a second, will you? Hey, Chief. Yes, sir. Get me the test results on the 211 class, will you? The part on using clinical instruments. All right. I'm going to check the test scores on that class, Doc. But if half of them can't take a decent blood pressure reading for you, something's got to be haywire. Yeah, for sure. And hey, why don't you call me... Yeah, call me with some good news once in a while, will you? I'll get back to you. Results you asked for. As you can see, they all passed. Yeah. On the, uh, what is that, the sphygmomanometer, they had, let's see, how many was that? Two questions in the written test. And 94% of them got the right answers. 94%. So if they passed the test, why can't they perform on the job? Possibly the reason is the school didn't test them on their ability to perform, but on their ability to answer some questions. Having knowledge about a job is one thing. Being able to do that job is something else entirely. That's where performance testing comes in. This is another film in the Air Force series on instructional technology, the state of the art. As you know, criterion-referenced testing doesn't measure each student against all other students, but rather against a criterion, a standard. The criterion test derives from the course objectives, stating what skill the student must demonstrate, the conditions of the test, and the minimum acceptable performance. Whatever the stated objectives of a training program may be, the true objectives show up in the test at the end. If the student knows he's going to have to actually fix a malfunctioning radio in his test, he'll practice with that radio. Take it apart, handle its components.
If he knows he's just going to have to write answers on a question sheet, why handle the radio all that much? Just study the tech order. Students aren't dumb. They'll learn whatever they're going to be tested on. Let's hear now from Dr. F. W. Shofoltoski, Technical Director of DCS Plans, Headquarters Air Training Command. Finding out if the student has learned enough from the instruction to meet the objectives is a critical part of the ISD process. Where the objectives call for the student to be able to do something, the best way to determine whether the student is really able to do it is to use a performance test, such as we're doing here with this flight simulator. Let me stress that word performance. While performance testing may be done by paper and pencil or simulation, the more a test resembles a hands-on measurement, the easier it is to reach an accurate performance measurement. With me is Hilton Goldman, and he's Chief of the Advisory Service Branch, Headquarters Air Training Command. First of all, let me ask you, even though performance tests are the best measure of a student's skills, they're comparatively expensive and time-consuming, so with that in mind, when should a performance test be used? Use it to validate the instruction. This is necessary to be sure that instruction teaches what it's supposed to teach. However, once instruction has been validated and as long as there is no major change in student characteristics or subject matter, performance tests can then be limited to critical tasks. Now by critical tasks, I mean tasks where a loss of life or property or mission failure could result from misperformance. Incidentally, the same performance test item that lets you know if the student can do the job also lets you know if the student has acquired the related knowledge. Let me show you an example here.
Now, here's a performance test item using a Webster's abridged dictionary: find and state the correct pronunciation, the meaning, and the part of speech for the word charisma. Okay, here are some knowledges that this would test: how words are arranged in the dictionary, that it has a key to pronunciation, how to use the key to pronunciation, identify what part of speech a word is. And these are some of the skills that it tests: ability to locate a word, ability to produce sounds indicated by the symbols and word examples in the key to pronunciation. Now, must the student always use actual equipment in a performance test? In some cases, it may be either too expensive or inappropriate to use actual equipment, or the actual equipment may not even be available. In such cases, it is necessary to use simulation. Often, there may be a choice of degrees of simulation. Take flying training as a good example. It's not feasible to test the student pilot's ability to perform emergency ejection procedures on this T-38 aircraft. However, you might use this very realistic instrument flight simulator or this classroom training aid to teach emergency ejection procedures. Notice, as we went through this list of alternatives, the simulation device became less expensive, but also it provided less fidelity. Well, if you have several alternatives, on what basis do you choose which one to use? Use the simplest form of simulation that will let you measure the tasks or skills you want to measure. In the emergency procedures example that we just mentioned, if you only want to test the student's ability to accomplish critical procedures, the ejection seat trainer provides all that is needed. Now, how can you be sure that what you've chosen will do the job? You must validate the simulation. And how do you do that? Well, try the test item on a group of people who can do the job and also on another group who cannot. The former group should pass the test item and the latter group should fail it.
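That validation check can be sketched in a few lines of code. This is a minimal illustration, assuming pass/fail results for each person in the two groups; the pass-rate thresholds are illustrative assumptions, not figures from the film.

```python
def item_is_valid(master_passed, novice_passed,
                  master_min=0.9, novice_max=0.1):
    """Validate a test item against two groups.

    master_passed / novice_passed: lists of booleans, True meaning
    that person passed the item. The item discriminates (is valid)
    when nearly all who can do the job pass it and nearly all who
    cannot do the job fail it. Thresholds here are illustrative.
    """
    master_rate = sum(master_passed) / len(master_passed)
    novice_rate = sum(novice_passed) / len(novice_passed)
    return master_rate >= master_min and novice_rate <= novice_max
```

An item that every master passes and every novice fails comes out valid; an item the novices also pass, or the masters also fail, does not, and should not be used.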
If it doesn't work that way, the test item is not valid and should not be used. If you want to test the student's ability to do something that requires motor skill, fire a rifle, type a letter, have the student perform the task. Similarly, to test if the student can perform a procedural task, placing a machine in operation, assembling a unit, filling out a form, have the student do it. But for many procedures, the skilled aspect of the task is mental. These are procedures that could be performed with little or no practice if one knows what, when, and how to do them. In such cases, you could measure proficiency by multiple-choice test items, like the one you see here. Troubleshooting. How you test for ability to do troubleshooting depends on the objective you're testing. If you want to test whether the student understands the logic of a system, you can do this with a paper-and-pencil simulator. In this example, the student is given a wiring diagram, a problem, and this representation of the panel's test points. Find and repair the trouble if one exists. Circle each measurement you use. Instead of actually making the measurements needed to find the trouble, the student draws a circle around each measurement he would use. I need to know the voltage at this point. For each measurement he indicates he needs, the student gets a picture of the reading he would get if he actually used the voltmeter. The voltage is a positive 30.05 volts. This point here. Of course, if you want to test the student's ability to use the voltmeter, then have the student use it. One of the attitudes to be taught in basic training is respect for the flag. You can identify certain behaviors that would be displayed by someone who has this attitude. Then observe those who have completed the training and see if, without prompting, they behave appropriately. In this case, obviously the instruction was not effective.
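The paper-and-pencil troubleshooting simulator described above can be sketched in code. This is a minimal, hypothetical illustration, not an actual Air Force trainer; the class name, test-point labels, and readings are all assumptions made for the example.

```python
class TroubleshootingSim:
    """Sketch of the paper-and-pencil troubleshooting simulator:
    instead of probing a live panel, the student 'circles' a test
    point and is shown the reading he would have gotten with a
    voltmeter, and every measurement he asks for is recorded."""

    def __init__(self, readings, faulty_component):
        self.readings = readings            # test point -> voltage reading
        self.faulty_component = faulty_component
        self.measurements_used = []         # each circled measurement is logged

    def circle(self, test_point):
        # Record the measurement and reveal the reading, as the tester
        # does when handing the student a picture of the meter face.
        self.measurements_used.append(test_point)
        return self.readings[test_point]

    def diagnose(self, component):
        # Score on two counts: was the fault correctly identified,
        # and how many measurements did it take to find it?
        return component == self.faulty_component, len(self.measurements_used)
```

A student working the item might circle "TP1", see the 30.05-volt reading, and then name the faulty component; the simulator reports both whether he was right and how many measurements he spent getting there.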
So now you're convinced that performance tests are necessary and you know what makes up a performance test item. Is that all there is to know about performance testing? Not quite. You also must know how to conduct a performance test. That involves trying it out to be sure it measures what it should, administering it properly, and scoring it. Before any performance test is incorporated into a training course, it must be tried out. Let's say the test is for dental technician trainees on the assembly and disassembly of the anesthetic syringe. Today you will be tested on how to properly assemble and disassemble a dental syringe using the long and short needle. You'll have five minutes to complete this test with 100% accuracy. Do you understand the instructions? Yes. Let's go. Since the end result of their work must be a properly assembled syringe, and since assembly requires following precisely a series of steps, the observer will make note of any missteps. On this test item, mistakes will show up in two ways, exceeding the allowed time limit or an improperly assembled syringe. But this is a tryout. Perhaps the test itself is faulty, so the students who don't do well should be questioned. Did you understand the instructions for this test? Yes. Do you feel that we allowed you enough time to complete the test? Well, I guess so. Was it the same type of syringe that you had been training with? Yes. Okay, that's the end of the test. When those questions, among others, are answered yes, the test can be considered validated. In administering a performance test, the procedures you set up must not contaminate the test results and also must ensure fairness to each student. This makes it essential that the administration of the test be standardized. But performance testing, by its very nature, introduces a possibility that not every student will be tested in exactly the same way.
Where variables may exist, sometimes an extra effort is needed to counter them, to prevent their affecting one student or one group of students adversely in comparison to other students. Let's talk about environmental variables for a moment. Environmental variables include conditions such as lighting, temperature, and background noise level. So the conditions required for testing should be stated in the directions. It's the responsibility of the tester to ensure that these conditions exist at the time of testing. Now, let's consider the next item. Personal variables include physical and emotional conditions. Again, where these conditions are important in your testing, it's up to the test supervisors to ensure that the timing of the test, in relation to meal time, rest time, exercise time, and work time, is standardized for every student. Now, for the final item on our list of variables, and by far the most critical of all, instructions and tester variables. These are items that test-administering personnel can control, and should, in the interest of fairness. Instructions to students who are about to begin a performance test must be standardized: consistent, agreed upon in advance by all testers and observers, and delivered to all students without variation. One good way is to hand each student a written instruction sheet or to read them aloud and check to see that they're clearly understood. All right, you know what you're supposed to do. You're supposed to show me how smart you are. I want you to disassemble that piece there. Don't get any of the parts dirty. Reassemble it like your life depended on it. Now, let's just get this over with. Go. Ad-libbing the instructions verbally is not good practice. Different testers will have different ways, not all of them strictly impersonal. That's what you might call rough and ready instructions, but a true performance test doesn't start that way.
The tester left out the time limits, some of the conditions of the test, and the exact end performance expected. So students' performance cannot really be measured, except in the subjective opinion of the tester. And what about the attitude that was displayed? It's far better to give test instructions orally from a written text and then give copies to each trainee. Do you understand? You're on your own. Don't help each other. Bob, how much time do you have for the test? Ten minutes. How many readings are we going to take? Four. Okay, both radios and frequency counters? We'll go over that a little bit before you begin. Do you have any questions about the test? No. Everything's clear then? Yes. Okay. Let's start on the radios, and remember, we've got ten minutes. Where test instructions are clear, the results of the test should be reliable with all trainees month after month, provided other conditions are also standardized, which brings us to tester variables, meaning the personal behavior of test givers and observers. You've already seen two examples of tester attitude, one just wanting to get it over with, the other taking care to assure a fair start. There's no question as to which is the proper approach. The tester's manner must be strictly impersonal, calm, reasonable, and noncommittal. No hint of favoritism or antagonism must be given. One of the easiest ways to ruin a performance test is for the tester to mother-hen the trainee by giving hints or making corrections. At the end, the trainee may have done the job correctly, but it'll be more of the tester's work than the trainee's. Now we come to the bottom line in performance testing: the result of the test in terms of the score awarded the trainee. You may have done everything else right, but if you haven't established a proper method of scoring your test, well, you've wasted a lot of your effort. Let's run down this list briefly to help you select the method best suited to your particular needs.
Assisted testing versus non-interference testing is pretty much self-explanatory. Let's say a chief cook is testing a food service trainee in a mess hall. Assisting is quite appropriate here. You want the trainees to demonstrate their knowledge, but you don't want to waste the food. A quick note on the scoresheet: you would note afterwards that the trainee needed help in that step of breakfast preparation. Comparing the number of assists needed against the criterion gives you a relative score for that segment of training. The go-no-go method of scoring is generally used to score simple, objective processes or products. Here, either the performance standard is met or it's not. There's no middle ground. For example, a test involving removal and replacement of the power supply in a receiver-transmitter. The trainee's performance is measured as go or no-go on two criteria. First, was the correct part removed and replaced? And secondly, does the transmitter work properly afterward? The go-no-go scoring method can of course be used on much more complicated performances involving many separate steps. The advantage of this method is that the tester's subjective judgment is not a factor, and the result of the test is a clear-cut yes or no. Fixed-point scoring is appropriate to tasks where the trainee obtains the end product through processes which aren't necessarily lined up in a rigid order. A welding performance test is an example of this type of scoring. Normally, a checklist is used, which sets forth all the behaviors required in preparing the tools and the materials to be welded, the method of handling, the safety precautions to be taken, and so on. In a typical welding performance test, there may be as many as six different points on tool and material preparation, ten points on work procedure, six on safety measures, and ten on evaluation of the finished product. Was the tool held at the proper angle, one point? Was the base metal preheated, one point?
Was the fire extinguisher placed within easy reach, one point? Was the bead free of irregularities, three points? With the fixed-point scoring method, the tester can set up a test with as many separately scored behaviors as deemed necessary. At the end, the tester will know at a glance which areas require additional training, and if a consistent pattern of poor performance in one area is seen over a period of time, it's obvious the training program needs to be revised. Our next item, mixed scoring, applies to tests where two or more types of scoring methods can be applied. An example would be a two-fold performance test with a distance-measuring navigational receiver-transmitter, where the trainee is required to first identify and correct a malfunction in the receiver and then calibrate the receiver with a signal generator. The first part of the test would be scored on a go-no-go basis, and the second part on a basis of points given for accuracy within a specified tolerance. Mixed scoring can be appropriately used in any performance test where the trainee must perform troubleshooting on an item of equipment and then operate it. Now we come to the rating scale type of scoring. A simple type of rating scale would be one in which values are assigned to behavior on the basis of its closeness to perfection, such as marching, and in this particular example, flight drill. Each marching movement to be graded is assigned a value on an explicit basis so that independent raters are able to agree consistently on their scoring. This type of rating scale is especially useful where the criterion objective specifies characteristics of an acceptable action or product. When a rating scale involves qualities that cannot be quantified, the judgment of the tester must be a major factor. Here's an example.
To rate this cake, you could use a descriptive scale, such as this one, but it's not a very satisfactory way because judgment may differ, and the trainee has no way of knowing what the tester thinks is important. Appearance, texture, volume, moistness, flavor. A performance that is rated excellent by one tester may seem only fair to another. If this type of scoring has to be used, the best way to ensure fairness is to have a list of factors that should be considered and the relative weight for each and to use several testers as a cross-check against one another. If you have a choice in the matter of scoring methods, these two, go-no-go or fixed-point scoring, are the most valid and reliable measures of a student's performance. But however you conduct or score a test, the important thing is that you do performance testing, that you make it the goal line, so to speak, that the student must cross. Performance testing ensures that trainees will be able to perform productive work immediately on going out into the real world of the Air Force. Course objectives almost always require a student to do something, yet too many get only a multiple-choice test when actual performance tells the real story. Yes, performance testing takes time, and no, you needn't demand it for every detail of a lengthy job. Be selective. Take the most important parts of the job for hands-on testing. Remember what was said earlier, that students aren't dumb. They'll learn whatever they're going to be tested on. Believe me, if they know they're going to have to perform, they will prepare themselves.
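The fixed-point method described in the welding example can be sketched as a simple checklist scorer. The function is a minimal illustration, and the checklist items and point values below merely echo the welding example from the film; they are not taken from an actual Air Force scoring form.

```python
def fixed_point_score(checklist, observed):
    # checklist: behavior -> point value assigned in advance
    # observed:  behavior -> True if the tester saw it performed
    # Sum the points for every behavior the tester marked as observed.
    return sum(points for behavior, points in checklist.items()
               if observed.get(behavior, False))

# Illustrative items echoing the welding example in the film.
welding_checklist = {
    "tool held at proper angle": 1,
    "base metal preheated": 1,
    "fire extinguisher within easy reach": 1,
    "bead free of irregularities": 3,
}
```

Because every behavior carries its point value on the sheet, two independent testers marking the same performance arrive at the same score, and a glance at the unchecked items shows exactly which areas need additional training.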