Face validity is the extent to which a test is subjectively viewed as covering the concept it purports to measure. It refers to the transparency or relevance of a test as it appears to test participants. In other words, a test can be said to have face validity if it "looks like" it is going to measure what it is supposed to measure. For instance, if a test is prepared to measure whether students can perform multiplication, and the people to whom it is shown all agree that it looks like a good test of multiplication ability, this demonstrates face validity of the test. Face validity is often contrasted with content validity and construct validity.
Some people use the term face validity to refer only to the validity of a test to observers who are not expert in testing methodologies. For instance, if a test is designed to measure whether children are good spellers, and parents are asked whether the test is a good test, this measures the face validity of the test. If an expert is asked instead, some people would argue that this does not measure face validity. This distinction seems too careful for most applications. Generally, face validity means that the test "looks like" it will work, as opposed to "has been shown to work".
In simulation, the first goal of the system designer is to construct a system which can support a task to be accomplished, and to record the learner's task performance for any particular trial. The task(s)—and therefore, the task performance—on the simulator should be representative of the real world that they model. Face validity is a subjective measure of the extent to which this selection appears reasonable on the face of it—that is, subjectively to an expert after only a superficial examination of the content. Some assume that it is representative of the realism of the system, according to users and others who are knowledgeable about the real system being simulated. Those would say that if these experts feel the model is adequate, then it has face validity. However, in fact face validity refers to the test, not the system.