Lesson 7 - Assessment Instruments
At this point you’ve come quite a ways in the instructional design process. You assessed your needs and wrote an instructional goal statement. Then you analyzed your goal to identify goal steps, substeps, subordinate skills, and entry behaviors. After that you analyzed the learners and both the performance and learning context. And in the last lesson you wrote objectives for each of the skills in your instructional analysis. The next step in the Dick and Carey model is to develop assessment instruments so that you will be able to determine if your learners have achieved your objectives. In the ASSURE model, assessment is not really discussed until the final step (Evaluate and Revise).
As an instructional designer, this emphasis on assessment matters: appropriate, well-thought-out assessment helps us determine which objectives learners have and have not achieved, and it also supports the formative evaluation. As Robert Mager states in his book Making Instruction Work, "If it’s worth teaching, it’s worth finding out whether the instruction was successful. If it wasn’t entirely successful, it’s worth finding out how to improve it" (pg. 83). If you think of objectives as describing where you are going, the assessment items are the means by which you find out whether you got there.
You may wonder why test items are created now, before you have even developed your instruction. The idea is that your assessment items should stem directly from your objectives. The performance asked for in an assessment item should match the performance described in the objective. Items should not be based on what you think are good or fun test questions, nor on what your instructional activities are; in fact, the activities should be based on your objectives and assessment items. The good news is that if you’ve written worthwhile objectives, you already know what content to test for. Then it’s just a matter of creating good test items that measure the acquisition of the skills, knowledge, or attitudes you are looking for.
Introduction to Assessment
As discussed in Chapter 7 of Dick and Carey, learner-centered assessment is linked very closely to the traditional notion of criterion-referenced tests. The name criterion-referenced is derived from the purpose of the test: to find out whether the criteria stated in an objective have been achieved. Criterion-referenced assessments are composed of items or performance tasks that directly measure skills described in one or more behavioral objectives. The importance of criterion-referenced assessment from an instructional design standpoint is that it is closely linked to instructional goals and a matched set of performance objectives, giving designers an opportunity to evaluate performance and revise instructional strategies if needed. In other words, criterion-referenced assessment allows instructors to decide how well the learners have met the objectives that were set forth. It also facilitates a reflective process in which learners are able to evaluate their own performance against the stated objectives and assessment items. Smith and Ragan (1999) note that criterion-referenced tests have also been referred to as objective-referenced or domain-referenced instruments. They believe that this testing strategy is effective for determining "competency", especially as it relates to meeting instructional objectives.
In contrast to criterion-referenced tests, norm-referenced tests are designed to yield scores that compare each student’s performance with that of a group or with a norm established by group scores. They provide a spread of scores that generally allows decision makers to compare or rank learners. They are not based on each student achieving a certain level of mastery. In fact, in many cases items are selected to produce the largest possible variation in scores among students. As a result, items that all students are able to master are often removed in order to maintain a certain spread of scores. An example of a norm-referenced test is the SAT. Scores from this test are used to compare students for various purposes (such as college admission). Although this form of assessment can be learner-centered, it differs in the manner in which it defines the content that is to be assessed. In this course we will mainly concern ourselves with criterion-referenced assessment.
Types of Criterion-Referenced Tests
Dick, Carey, and Carey discuss four different types of criterion-referenced tests that fit into the design process:
1. Entry Behaviors Test
An entry behaviors test is given to learners before instruction begins. It is designed to assess learners’ mastery of prerequisite skills: the skills that appear below the dotted line you drew on your instructional analysis flowchart. If you have no entry behaviors, there is no need to develop such a test. However, if you have entry behaviors that you are unsure about, you should test your learners to help determine whether they are indeed entry behaviors after all.
2. Pretests
A pretest is used to determine whether learners have already mastered some of the skills in your instructional analysis. If they have, then they do not need as much instruction for those skills. If it becomes obvious that they lack certain skills then your instruction can be developed with enough depth to help them attain those skills. When using a pretest in this manner you are not trying to get a score that you can compare with a later posttest in order to document gains.
A pretest is often combined with an entry behaviors test. However, it is important to keep in mind the purpose of each test. The entry behaviors test determines whether students are ready to begin your instruction, while the pretest helps determine which skills in your main instructional analysis they may already be familiar with. Of course, if you already know that your learners have no clue about the topic you are teaching, then a pretest may not be necessary.
3. Practice Tests
Practice tests solicit learner participation during the instruction by providing them with a chance to rehearse the new skills they are being taught. They also allow instructors to provide corrective feedback to keep learners on track.
4. Posttests
Posttests are given following instruction and help you determine whether learners have achieved the objectives you set out for them in the beginning. Each item on a posttest should match one of your objectives, and the test should assess all of the objectives, with particular focus on the terminal objective. If time is a factor, it may be necessary to create a shorter test that assesses only the terminal objective and any important related subskills.
Posttests are used by instructors to assess learner performance and assign grades, but for the designer the primary purpose of a posttest is to help identify areas where the instruction is not working. If learners are not performing adequately on the terminal objective, then something is wrong with your instruction, and you will have to identify the areas that are not working. Since each test item should correspond to one of your objectives, it should be relatively easy to figure this out.
Designing Tests & Writing Items
There are quite a few issues to consider when designing assessment instruments. Let’s spend a little time discussing some of the more important ones.
Types of Assessment Items
The first thing we want to look at is the various types of items you can use when creating assessments. Earlier we discussed different types of tests (entry behaviors tests, pretests, practice tests, and posttests); now we are discussing individual test items. Possible item types range from objective formats such as multiple-choice and fill-in-the-blank questions to essays, product creations, and live performances.
In the table on page 154, Dick and Carey give some guidelines for selecting item types according to the type of behavior specified in your objective. This table provides a good starting point for deciding on what item type to use for a particular objective. However, when it comes right down to it, the wording of your objective should guide the selection of item type. You should select the type of item that gives learners the best opportunity to demonstrate the performance specified in the objective. For example, if our objective was for students to state the capital of Virginia, it would be best to have them state it from memory (fill-in-the-blank) and not pick it from a list of choices (multiple-choice).
In addition to selecting the appropriate test item type, it is also important to consider the testing environment. If your test items require special equipment and facilities – as specified in the "conditions" component of your objective – you will need to make sure that those things will be available to your learners. If not, you will need to create a realistic alternative to the ideal test item. Keep in mind that the farther removed the behavior in the assessment is from the behavior specified in the objective, the less likely you are to be able to predict whether learners can perform the objective.
Matching Learning Domain and Item Type
The next issue we want to look at is that of matching the learning domain with an appropriate item type. Organizing your objectives according to learning domain can also aid you in selecting the most appropriate type of assessment item. If you remember, Gagné defined four main learning domains (categories): verbal information, intellectual skills, psychomotor skills, and attitudes.
Writing Test Items
You should write an assessment item for each objective whose accomplishment you want to measure. Mager provides these steps to follow when writing a criterion assessment item:
If you follow these steps and still find yourself having trouble drafting an assessment item, it is almost always because the objective isn’t clear enough to provide the necessary guidance.
Criteria for Writing Test Items
Dick and Carey list four criteria that you should consider when writing test items: the items should be congruent with your goals and objectives, appropriate for your learners, realistic with respect to the performance and learning contexts, and clearly written.
Let’s take a brief look at each one.
As we have already implied, test items should be congruent with the terminal and performance objectives by matching the behavior involved. This means that each test item should measure the exact behavior and response stated in the objective. The language of the objective should guide the process of writing the assessment items; a well-written objective will prescribe the form of test item that is most appropriate for assessing achievement of the objective. Appropriate assessment items should answer "yes" to questions such as these: Does the performance called for in the item match the performance stated in the objective? Are the conditions in the item consistent with the conditions stated in the objective?
For example, if the performance of an objective states that learners will be able to state or define a term, the assessment item should ask them to state or define the term, not to choose the definition from a list of answers.
Another common bad practice is teaching one thing and then testing for another. You should not use a test item that asks for a different performance than the one called for by your objectives. For example, if you have an objective that says students need to be able to make change, it would be misleading to then use test items such as the following:
None of these items asks the student to do what the objective asks, which is to make change. As a result you will not know if your students can perform as required. Kemp, Morrison, and Ross (1998) cite several more examples of "mismatches" between objectives and assessment. In one example a college professor whose objective asked students to analyze and synthesize developments of the Vietnam War simply asked students to list those developments in the final exam. Other examples include a corporate training course on group leadership skills that included objectives that were performance or skill-based, yet the sole assessment items were multiple-choice. As these examples illustrate, it is important to determine which learning domain your objective falls into in order to write the most appropriate type of assessment.
So why do we keep saying that the performance indicated in the assessment item has to match the performance in the objective? The point of testing is to be able to predict whether your learners will be able to do what you want them to do when they leave you, and the best way to do that is to observe the actual performance you are trying to develop. Mager (1988) illustrates this point with a story in which a surgeon, standing over the patient with gloved hands, reveals that his mastery of the operation was assessed entirely by written tests.
So, would you prefer your surgeon to have had some meaningful, practical types of assessments or strictly paper-and-pencil-tests?
Test items should take into consideration the characteristics and needs of the learners, including their vocabulary and language levels, motivational and interest levels, experiences and backgrounds, and any special needs. To start with, test items should be written using language and grammar that are familiar to the learners. Learners should also be familiar with the experiences and contexts referenced in the items; they should not be asked to demonstrate a desired performance in an unfamiliar context or setting. The examples, question types, and response formats should likewise be familiar, and your items should be free of any gender, racial, or cultural bias.
Remember the context analysis you wrote? When writing test items you should consider both the performance context and the learning context you wrote about. It is important to make your test items as realistic and as close to the performance setting as possible. This will help ensure the transfer of skills from the learning environment to the eventual performance environment. According to Dick and Carey, "the more realistic the testing environment, the more valid the learners’ responses will be" (pg. 153). It is also important to make sure the learning environment contains all the tools necessary to adequately simulate the performance environment.
Test items should be well written and free of spelling, grammar, and punctuation errors. Directions should be clearly written to avoid any confusion on the part of the learner. It’s also important to avoid writing "tricky" questions that feature double negatives, deliberately confusing directions, or compound questions. Your learners should miss questions because they do not have the necessary skill, not because your directions were unclear or because you tried to throw them off with confusing wording.
Dick and Carey provide a checklist of these four criteria on page 165. Use this checklist as you create your own test items.
How Many Items?
The question inevitably arises: how many items are necessary to determine mastery of an objective? For some skills a single item is sufficient; others may require several. For example, a second grade student may be asked to demonstrate mastery of an arithmetic rule by means of the item: 3M + 2M = 25; M = ? Obviously, the purpose of the assessment is to determine whether the student can perform the whole class of arithmetic operations this item represents, not whether he or she can perform this single one. Generally, several items of the same type and class would be employed to ensure the reliability of the results.
Also, on any single item a student may respond correctly because he or she has seen the correct response before, or has simply made a good guess. In such cases several items may be warranted. With some assessment items, though, guessing is unlikely to produce a correct response, so a single performance may suffice. Another possibility is that a single item is missed because the student was misled by some confusing characteristic of the item, or simply made a careless mistake.
It is essential to keep in mind that, no matter how many items are created for an objective, the conclusion aimed for should not be, "how many did they get correct?" but rather, "does the number correct indicate mastery of the objective?" Also keep in mind that while two items may be better than one, it may also yield a 50-50 result, with a student getting one right and one wrong. Would this indicate mastery? Gagné (1988) suggests having three items in this case instead of two, as two out of three provides a better means of making a reliable decision about mastery.
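For readers who like to see the numbers behind this advice, here is a quick back-of-the-envelope sketch in Python. The 4-option multiple-choice format and the specific cutoffs are illustrative assumptions, not from the reading; the sketch simply computes how often a learner who guesses on every item would still appear to have "mastered" the objective:

```python
from math import comb

def p_pass_by_guessing(n_items, cutoff, p_guess=0.25):
    """Probability that a learner who guesses on every item
    (chance p_guess per item, e.g. a 4-option multiple-choice item)
    reaches the mastery cutoff on an n-item assessment.
    Sums the binomial probability of getting cutoff..n_items correct."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess)**(n_items - k)
        for k in range(cutoff, n_items + 1)
    )

# One item, must get it right: a pure guesser "masters" it 25% of the time.
print(p_pass_by_guessing(1, 1))   # 0.25
# Two items, both required: guessing rarely passes (6.25%),
# but an honest one-right-one-wrong split leaves the decision ambiguous.
print(p_pass_by_guessing(2, 2))   # 0.0625
# Gagne's two-out-of-three rule: still hard to pass by luck (about 15.6%),
# and every possible score gives a clear mastery/non-mastery decision.
print(p_pass_by_guessing(3, 2))   # 0.15625
```

The sketch makes the trade-off concrete: adding items both reduces the influence of lucky guesses and removes the 50-50 ambiguity that a two-item test can produce.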
Assessment of Performances, Products, and Attitudes
Some intellectual skills, as well as psychomotor and attitudinal skills, cannot be assessed using common objective-type test items. They require either the creation of some type of product or a performance of some sort. These types of performances need to be assessed using an evaluation or observation instrument. Dick and Carey suggest that you provide guidance during the learning activities and construct a rubric to assist in the evaluation of the performance or product.
Attitudes are unique in that they are not directly measurable. Instead, the best way to assess attitudes is to observe whether the learner exhibits the desired attitude. During observation, it is important that learners be given the choice to behave according to their attitudes. If you are observing a performance and the learners know they are being observed, their behavior may not reflect their true attitudes. If direct observation isn’t possible, you can have students respond to a questionnaire or an open-ended essay question. Much care should be taken when constructing such instruments, though. If you simply give learners a test with leading questions or directions describing the nature of the test, they are likely to give you the answer they think you want to see. The results would not tell you how they would act when faced with a real-world situation involving that attitude.
Dick and Carey make the following suggestions regarding the development of these types of assessment instruments:
Directions for performance and products should clearly describe what is expected of the learners. You should include any special conditions and decide on the amount of guidance you will provide during the assessment. In some situations you may want to provide no guidance.
Developing the Instrument
When assessing performances, products, or attitudes you will need to create an assessment instrument to help you evaluate the performance, product, or attitude. Dick and Carey offer five steps to creating this instrument:
Dick and Carey provide good examples of assessment instruments for evaluating psychomotor skills and attitudes on pages 169 and 170 of their book.
Portfolio Assessment
Many of you are probably familiar with this type of assessment. Portfolios are collections of work that together represent learners’ achievements over an extended period of time. A portfolio could include tests, products, performances, essays, or anything else related to its goals. Portfolios allow you to assess learners’ work as well as their growth during the process. As with all other forms of assessment, whatever is included in the portfolio must be related to specific goals and objectives. The choice of what to include can be decided entirely by the teacher, or in cooperation with students. Each portfolio component is assessed as it is completed, and the overall assessment of the portfolio is carried out at the end of the process using rubrics. In addition, learners are given the opportunity to assess their own work by reflecting on the strengths and weaknesses of the various components. Portfolios can also be used as part of the evaluation process to determine what students did and did not learn, and that information can then be used to strengthen the instruction.
Evaluating Congruence in the Design Process
One of the most crucial aspects of the assessment phase of the design process is to be able to evaluate the congruence of the assessment against the objectives and analyses that have been performed. Remember that this is a systematic approach to instructional design, which means that every step in the process influences subsequent steps. As such, all of your skills, objectives, and assessment items should be parallel. One way to clearly represent this relationship is to create a three-column table that lists each of the skills from your instructional analysis, the accompanying objective, and the resulting assessment item. At the bottom of the table you would finish up with your main instructional goal, the terminal objective, and the test item for the terminal objective.
Design Evaluation Chart
It is important at this point to make sure that your design is adequate so that you will be able to move on to the next step in the instructional design process. The next step involves developing an instructional strategy based on all of the design work you have done up to this point. But before we move on, let’s close with one more note from Mager regarding objectives and assessment:
Here are some examples of good and bad assessment items:
Objective: The student will state the time shown on an analog clock to the nearest 5 minutes.
Objective: The student will set up an attractive merchandise display in the student store, with appropriate signs.
Objective: Students will write a descriptive essay of at least 300 words.
In addition, if you return to Appendix D in the Dick and Carey book you will see that they have a Design Evaluation Chart that lists the skills, objectives, and test items for a portion of their project on story writing. This will provide a good example for you to follow.
Instructional Design Project Part Four (cont.)
The activities in this lesson should be added to the document you began in the last lesson (objectives.doc). If you recall, in the last lesson you began Part Four of your ID Project by writing objectives for each of the skills and subskills in your instructional analysis. Now that you have drafted a list of objectives describing what you want your learners to be able to do after your instruction, it is time to create test items that will determine whether or not they have achieved those objectives. While you may want to administer an entry behaviors test or pretest to your learners, for now we will concentrate on creating posttest assessment items. To complete Part Four of your ID Project, perform the following tasks:
To begin with, write down each of your objectives in order, including the terminal objective. Beneath each one, answer the following questions:
When you have answered these questions, create a criterion-referenced assessment item or evaluation tool for the objective. The criterion-referenced items or evaluation tools do not need to be paper-and-pencil tests, but they must accurately assess the behavior or performance called for by each of your objectives, and they should attempt to provide the conditions stated in the objective. If you feel you need more than one item in order to assess achievement of the objective, feel free to include them. However, at this point you are only required to create one item per objective.
If you are assessing a product, performance, or attitude, you will not create an objective-type item. Instead, describe the product you will have them create or the behavior you will have them perform. Then, list some of the criteria you would include in a checklist or rating scale for that item. These criteria should reflect the characteristics of the product, the steps in the performance, or the items you will use to determine the presence of the attitude. Also indicate how these criteria will be rated.
When writing your assessment items, keep in mind the four test item qualities discussed earlier: congruence with your objectives, appropriateness for your learners, realism with respect to the performance and learning contexts, and clear, well-constructed wording.
If you need to, use the checklist on page 165 of your book to help you evaluate your assessment items.
Here’s an example of what your document should look like for each objective. This example shows an intellectual skill with an objective-type test item.
When you are finished you should have an assessment item for each of your objectives. The final thing you need to do is create a design evaluation chart that indicates the congruence between your skills, objectives, and assessment items. Following the examples in the book, create a three-column table. In the left column list all of your goal steps, substeps, and subordinate skills in order. In the middle column list the accompanying objective for each skill. In the last column list your assessment item(s) for that skill and objective. Make sure that everything is lined up properly, so that it is obvious which skill goes with which objective, and which objective goes with which test item.
Submitting Part Four of Your ID Project
At the end of this lesson you will submit your completed Part Four. To recap, Part Four of your ID Project should be typed up in Microsoft Word. At the top of the paper you should have "ID Project Part Four: Objectives and Assessment". Underneath that should be your name, email address, and the date. Also, make sure the file is named "objectives.doc". When you have completed Part Four, upload the Word document to the "instrdes" folder in your Filebox. When you have finished uploading your file, proceed to the online student interface to officially submit your activities for grading.