We can improve the quality of face validity assessment considerably by making it more systematic. For instance, if you are trying to assess the face validity of a math ability measure, it would be more convincing if you sent the test to a carefully selected sample of experts on math ability testing and they all reported back with the judgment that your measure appears to be a good measure of math ability.
In content validity, you essentially check the operationalization against the relevant content domain for the construct. This approach assumes that you have a good, detailed description of the content domain, something that is not always true. For instance, we might lay out all of the criteria that should be met in a program that claims to be a "teenage pregnancy prevention program."
Then, armed with these criteria, we could use them as a type of checklist when examining our program. Only programs that meet the criteria can legitimately be defined as "teenage pregnancy prevention programs."
But for other constructs (e.g., self-esteem, intelligence), it will not be easy to decide on the criteria that constitute the content domain. In criterion-related validity, you check the performance of your operationalization against some criterion. How is this different from content validity? In content validity, the criteria are the construct definition itself -- it is a direct comparison.
In criterion-related validity, we usually make a prediction about how the operationalization will perform based on our theory of the construct. The difference among the criterion-related validity types lies in the criteria they use as the standard for judgment. In predictive validity, we assess the operationalization's ability to predict something it should theoretically be able to predict.
For instance, we might theorize that a measure of math ability should be able to predict how well a person will do in an engineering-based profession. We could give our measure to experienced engineers and see if there is a high correlation between scores on the measure and their salaries as engineers.
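That predictive-validity check (math scores against engineering salaries) can be sketched as a simple correlation. This is a minimal illustration with invented numbers; the `pearson_r` helper and both data lists are assumptions for the sketch, not data from any real study.

```python
# Hypothetical illustration of predictive validity: correlate scores on a
# math-ability measure with a criterion it should predict (engineers' salaries).
# All numbers below are invented for the sketch.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

math_scores = [52, 61, 70, 74, 80, 88]   # measure under validation
salaries    = [48, 55, 63, 60, 72, 79]   # criterion (in $1000s)

r = pearson_r(math_scores, salaries)
print(f"r = {r:.2f}")  # a high r is evidence of predictive validity
```

A correlation near 1 would support the prediction; a correlation near 0 would count against the measure's predictive validity.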
A high correlation would provide evidence for predictive validity -- it would show that our measure can correctly predict something that we theoretically think it should be able to predict.

In concurrent validity, we assess the operationalization's ability to distinguish between groups that it should theoretically be able to distinguish between. For example, if we come up with a way of assessing manic depression, our measure should be able to distinguish between people diagnosed with manic depression and those diagnosed with paranoid schizophrenia. If we want to assess the concurrent validity of a new measure of empowerment, we might give the measure to both migrant farm workers and farm owners, theorizing that our measure should show that the farm owners are higher in empowerment.
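The empowerment example above amounts to checking that the measure separates two groups it theoretically should. A minimal sketch, using invented scores and a standardized mean difference (Cohen's d) as the discrimination index -- the groups, scores, and helpers are all assumptions for illustration:

```python
# Hypothetical concurrent-validity check: does the new empowerment measure
# separate farm owners from migrant farm workers as theorized?
# All scores are invented.

def mean(xs):
    return sum(xs) / len(xs)

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    def ss(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs)
    return ((ss(a) + ss(b)) / (len(a) + len(b) - 2)) ** 0.5

owners  = [74, 81, 78, 85, 79]   # theorized to score higher
workers = [55, 62, 58, 60, 57]

d = (mean(owners) - mean(workers)) / pooled_sd(owners, workers)
print(f"Cohen's d = {d:.2f}")  # a large positive d supports concurrent validity
```

In practice one would also run a significance test (e.g., a t-test), but the standardized difference already conveys how well the measure discriminates.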
As in any discriminating test, the results are more powerful if you are able to show that you can discriminate between two groups that are very similar.

When we conduct research, we are continually flitting back and forth between these two realms: between what we think about the world and what is actually going on in it. When we are investigating a cause-effect relationship, we have a theory (implicit or otherwise) of what the cause is (the cause construct).
For instance, if we are testing a new educational program, we have an idea of what it would look like ideally. Similarly, on the effect side, we have an idea of what we are ideally trying to affect and measure (the effect construct).
But each of these, the cause and the effect, has to be translated into real things, into a program or treatment and a measure or observational method. We use the term operationalization to describe the act of translating a construct into its manifestation. In effect, we take our idea and describe it as a series of operations or procedures.
Now, instead of it only being an idea in our minds, it becomes a public entity that anyone can look at and examine for themselves. It is one thing, for instance, for you to say that you would like to measure self-esteem (a construct). But when you show a ten-item paper-and-pencil self-esteem measure that you developed for that purpose, others can look at it and understand more clearly what you intend by the term self-esteem.
Now, back to explaining the four validity types. They build on one another, with two of them (conclusion and internal) referring to the land of observation on the bottom of the figure, one of them (construct) emphasizing the linkages between the bottom and the top, and the last (external) being primarily concerned about the range of our theory on the top. Assume that we took these two constructs, the cause construct (the WWW site) and the effect (understanding), and operationalized them -- turned them into realities by constructing the WWW site and a measure of knowledge of the course material.
Here are the four validity types and the question each addresses. Conclusion validity asks: in this study, is there a relationship between the two variables? In the context of the example we're considering, the question might be worded: in this study, is there a relationship between use of the WWW site and knowledge of the course material? There are several conclusions or inferences we might draw to answer such a question. We could, for example, conclude that there is a relationship. We might conclude that there is a positive relationship. We might infer that there is no relationship.
We can assess the conclusion validity of each of these conclusions or inferences. Internal validity asks: assuming that there is a relationship in this study, is the relationship a causal one? Just because we find that use of the WWW site and knowledge are correlated, we can't necessarily assume that WWW site use causes the knowledge. Both could, for example, be caused by the same factor. For instance, it may be that wealthier students, who have greater resources, would be more likely to have access to and use a WWW site and would excel on objective tests.
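The confounding story can be made concrete with a tiny simulation. In the model below (entirely invented), "wealth" drives both site use and test scores, while site use has no causal effect on scores at all -- yet the two still correlate strongly:

```python
# Minimal simulation of a confound: wealth causes both WWW-site use and
# test scores, so they correlate even though neither causes the other.
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(0)
wealth   = [random.gauss(0, 1) for _ in range(1000)]
site_use = [w + random.gauss(0, 0.5) for w in wealth]   # caused by wealth
scores   = [w + random.gauss(0, 0.5) for w in wealth]   # also caused by wealth

print(f"r(site_use, scores) = {pearson_r(site_use, scores):.2f}")
# The correlation is substantial, yet site_use never appears in the
# equation generating scores -- correlation here is not causation.
```

This is exactly why finding a relationship (conclusion validity) is not enough to establish a causal one (internal validity).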
Martyn Shuttleworth, Oct 20 (retrieved Sep 11 from Explorable.com). The text in this article is licensed under the Creative Commons Attribution 4.0 License; it may be used freely with a link, and reprinted in books, blogs, newsletters, course material, papers, and presentations with clear attribution.
Internal validity dictates how an experimental design is structured and encompasses all of the steps of the scientific research method. Even if your results are great, sloppy and inconsistent design will compromise your integrity in the eyes of the scientific community.
Research validity in surveys relates to the extent to which the survey measures the right elements, that is, the elements that need to be measured. In simple terms, validity refers to how well an instrument measures what it is intended to measure.
INTERNAL VALIDITY is affected by flaws within the study itself, such as failure to control some of the major variables (a design problem) or problems with the research instrument (a data collection problem). When we think about validity in research, most of us think about research components. We might say that a measure is a valid one, that a valid sample was drawn, or that the design had strong validity.
Different methods vary with regard to these two aspects of validity. Experiments, because they tend to be structured and controlled, are often high on internal validity. However, their strength with regard to structure and control may result in low external validity. Validity of research can be explained as the extent to which the requirements of the scientific research method have been followed during the process of generating research findings. Oliver () considers validity to be a compulsory requirement for .