Technical Assistance
What are Outcomes and How Are They Used?
Outcomes are benefits or changes for individuals or populations during or after participating in program activities. They are influenced by a program's outputs. Outcomes may relate to behavior, skills, knowledge, attitudes, values, condition, or other attributes. They are what participants know, think, or can do; how they behave; or what their condition is that is different following the program.
What are the different levels of OUTCOMES?
Why Measure OUTCOMES?
Funding for nonprofits is decreasing while community needs are increasing. An outcome evaluation can look at the impacts and benefits to clients during and after participation in your programs.
Evaluation and Measurement
As we all know, research is an important part of developing and maintaining an effective treatment program. This manual is designed not only to explore various assessment tools but also to introduce or refresh your knowledge of the research process, and more specifically the evaluation research process. Such an evaluation can be conducted on programs, employees, or clients. Evaluation can look at specific questions, such as "How does our program impact the recidivism of adolescent females who have substance abuse problems?" or more general questions, such as "What type of social activities do the clients in our program enjoy?" Evaluation can be a threatening and uncomfortable process for some people. Many groups and organizations struggle with how to build a good evaluation capability into their everyday activities and procedures. Most agencies have incorporated research into their quality assurance program. We will talk more about this process later.
Evaluation is a methodological area that is closely related to, but distinguishable from, more traditional social research. Evaluation uses many of the same methodologies as traditional social research, but because it takes place within an agency such as ours, it requires group skills, management ability, political dexterity, sensitivity to multiple stakeholders, and other skills that social research in general does not require. The following is a discussion of the major terms and issues in the field.
What is Evaluation?
Probably the most frequently given definition is: the systematic acquisition and assessment of information to provide useful feedback about some object (this could be a program, policy, technology, person, need, activity, etc.). Evaluation work involves collecting and sifting through data, making judgments about that data, and drawing inferences from it about a program or process.
Goals of Evaluation
The generic goal of most evaluations is to provide "useful feedback" to a variety of audiences including sponsors, donors, clients, groups, administrators, staff, and other relevant constituencies. Most often, feedback is perceived as "useful" if it aids in decision-making. But the relationship between an evaluation and its impact is not a simple one. Studies that seem critical sometimes fail to influence short-term decisions, and studies that initially seem to have no influence can have a delayed impact when more congenial conditions arise. Despite this, there is broad consensus that the major goal of evaluation should be to influence decision-making or policy formulation through the provision of empirically driven feedback.
Types of Evaluation
There are many different types of evaluations depending on the object being evaluated and the purpose of the evaluation. Perhaps the most important basic distinction in evaluation types is that between formative and summative evaluation. Formative evaluations strengthen or improve the object being evaluated -- they help form it by examining the delivery of the program or technology, the quality of its implementation, and the assessment of the organizational context, personnel, procedures, inputs, and so on. Summative evaluations, in contrast, examine the effects or outcomes of some object -- they summarize it by describing what happens subsequent to delivery of the program or technology; assessing whether the object can be said to have caused the outcome; determining the overall impact of the causal factor beyond only the immediate target outcomes; and estimating the relative costs associated with the object.
Formative evaluation includes several evaluation types:
Summative evaluation can also be subdivided:
Sampling
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen.
What is a sample?
A sample is a finite part of a statistical population whose properties are studied to gain information about the whole (Webster, 1985). When dealing with people, it can be defined as a set of respondents (people) selected from a larger population for the purpose of a survey.
What is a population?
A population is a group of individuals, persons, objects, or items from which samples are taken for measurement. For example, if you were looking at a substance abuse program, the population would be all the clients in that substance abuse program.
What is sampling?
Sampling is the act, process, or technique of selecting a suitable sample, or a representative part of a population, for the purpose of determining parameters or characteristics of the whole population.
What is the purpose of sampling?
To draw conclusions about populations from samples, we must use inferential statistics, which enable us to estimate a population's characteristics by directly observing only a portion (or sample) of the population. We obtain a sample rather than a complete enumeration (a census) of the population for many reasons. Obviously, it is cheaper to observe a part rather than the whole, but we should be prepared to cope with the dangers of sampling. In this tutorial, we will investigate various kinds of sampling procedures. Some are better than others, but all may yield samples that are inaccurate and unreliable. The dangers can be minimized, but some potential error is the price paid for the convenience and savings that samples provide.
What is the difference between probability (random) and non-probability (non-random) sampling?
The difference between non-probability and probability sampling is that non-probability sampling does not involve random selection and probability sampling does. Does that mean that non-probability samples aren't representative of the population? Not necessarily. But it does mean that non-probability samples cannot depend upon the rationale of probability theory. At least with a probability sample, the researchers know the odds or probability that the population is represented. In general, researchers prefer probability or random sampling methods to non-random ones, and consider them to be more accurate and rigorous. However, in applied social research there may be circumstances where it is not feasible, practical, or theoretically sensible to do random sampling.
Random Sampling
This may be the most important type of sample. A random sample allows a known probability that each elementary unit will be chosen. For this reason, it is sometimes referred to as a probability sample. This is the type of sampling that is used in lotteries and raffles. For example, if you want to select 10 players randomly from a population of 100, you can write their names on slips of paper, fold them up, mix them thoroughly, and then pick ten. In this case, every name has an equal chance of being picked. Random numbers can also be used (see Lapin, page 81).
Non-Random Sampling
Purposeful sampling selects information-rich cases for in-depth study. Size and specific cases depend on the study purpose. They are briefly described below for you to be aware of them. The details can be found in Patton (1990), pp. 169-186.
Sample Size
Using a sample in research saves money and time. To reduce sampling errors, the researcher should use a suitable sampling strategy and an appropriate sample size. A sample should yield valid and reliable information. Sample size is symbolized in research articles or reports as the letter "N." The question of sample size can be a difficult one.
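The lottery-style draw described under Random Sampling, together with the idea of a fixed sample size, can be sketched in a few lines of Python. This is a minimal illustration and not part of the original manual: the function name `draw_simple_random_sample`, the player labels, and the seed value are hypothetical, and the standard library's `random.Random.sample` stands in for mixing and drawing folded slips of paper.

```python
import random


def draw_simple_random_sample(population, n, seed=None):
    """Draw n units without replacement so that every unit has an equal chance.

    `seed` is optional and only makes the draw repeatable for demonstration.
    """
    units = list(population)
    if n > len(units):
        raise ValueError("sample size n cannot exceed the population size")
    rng = random.Random(seed)
    return rng.sample(units, n)


# Hypothetical population of 100 players, as in the example in the text.
players = [f"player_{i:03d}" for i in range(1, 101)]

# Sample size of 10; the seed value is arbitrary.
sample = draw_simple_random_sample(players, 10, seed=42)
print(len(sample), "players selected")
```

Fixing the seed is only a convenience for reproducing the same draw; leaving it as `None` gives a fresh random selection each time.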
Sample size can be determined by various constraints. For example, the available funding may pre-specify the sample size. When research costs are fixed, a useful rule of thumb is to spend about one half of the total amount for data collection and the other half for data analysis. This constraint influences the sample size as well as sample design and data collection procedures. In general, sample size depends on the nature of the analysis to be performed, the desired precision of the estimates one wishes to achieve, the kind and number of comparisons that will be made, and the number of variables that have to be examined.
Measurement
Measurement is the process of observing and recording the observations that are collected as part of a research effort. There are two major issues that will be considered here. First, one must understand reliability of measurement, including consideration of true score theory and a variety of reliability estimators. Second, one must understand the different types of measures that one might use in social research. Four broad categories of measurements are usually considered:
Reliability
What is Reliability?
Reliability is the consistency of your measurement, or the degree to which an instrument measures the same way each time it is used under the same conditions with the same subjects. In short, it is the repeatability of your measurement. A measure is considered reliable if a person's score on the same test given twice is similar. It is important to remember that reliability is not measured; it is estimated. There are two ways that reliability is usually estimated: test/retest and internal consistency.
Test/Retest
Test/retest is the more conservative method to estimate reliability. Simply put, the idea behind test/retest is that the score on test 1 should be the same as the score on test 2. The three main components to this method are as follows:
Internal Consistency
Internal consistency estimates reliability by grouping questions in a questionnaire that measure the same concept. For example, you could write two sets of three questions that measure the same concept (say class participation) and, after collecting the responses, run a correlation between those two groups of three questions to determine if your instrument is reliably measuring that concept. One common way of computing correlation values among the questions on an instrument is by using Cronbach's alpha. In short, Cronbach's alpha splits all the questions on your instrument every possible way and computes correlation values for them all (we use a computer program for this part). In the end, your computer output generates one number for Cronbach's alpha and, just like a correlation coefficient, the closer it is to one, the higher the reliability estimate of your instrument. Cronbach's alpha is a less conservative estimate of reliability than test/retest. The primary difference between test/retest and internal consistency estimates of reliability is that test/retest involves two administrations of the measurement instrument, whereas the internal consistency method involves only one administration of that instrument.
Validity
Definition: Validity is the strength of our conclusions, inferences or propositions. More formally, Cook and Campbell (1979) define it as the "best available approximation to the truth or falsity of a given inference, proposition or conclusion." In short, were we right?
Types of Validity: There are five types of validity commonly examined in social research.
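Returning to the two reliability estimates discussed earlier in this section (test/retest and internal consistency), the following Python sketch shows how each might be computed by hand. It is an illustration under stated assumptions, not the procedure of any particular statistics package: the function names `pearson_r` and `cronbach_alpha` and all of the scores are hypothetical, and Cronbach's alpha is computed with the standard variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), rather than by literally enumerating every split of the questions.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores.

    For test/retest, x holds the scores from the first administration
    and y the scores from the second. Scores must not all be identical.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(responses):
    """Cronbach's alpha from a single administration of an instrument.

    `responses` is a list of rows, one per respondent, each row holding
    that respondent's score on every item (question).
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical test/retest scores for five clients.
test_1 = [10, 12, 15, 18, 20]
test_2 = [11, 12, 16, 17, 21]
print("test/retest r:", round(pearson_r(test_1, test_2), 3))

# Hypothetical answers from four respondents to three questions
# intended to measure the same concept.
answers = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]]
print("Cronbach's alpha:", round(cronbach_alpha(answers), 3))
```

Both values are read the same way as described above: the closer the number is to one, the higher the reliability estimate.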
Presentations
A presentation was created by the Center for Urban Studies regarding outcomes monitoring. This presentation is available for download.
Download the CDBG Outcomes Training
Download the CDBG and Performance Measurement Training
Databases
A list of books and other resources that may be helpful when using SPSS has been compiled and is available for download.
Download the Introduction to SPSS
A brief SPSS training has also been created and is available for download.
Download the SPSS Training
Logic Model
Logic models have been an essential tool when developing and monitoring outcomes. A program logic model is a systematic, visual way to present a planned program with its underlying assumptions and theoretical framework. It is a picture of why and how you believe a program will work. Logic models are tools for program planning, management, and evaluation. They can be used at any point in the evolution of a program and can lead to better programs. Program logic models describe the sequence of events for bringing about change and relate activities to outcomes.
Download the Logic Model Worksheet
Source: Measuring Program Outcomes: A Practical Approach. United Way of America, 1996
Assessments
The following pages have various assessment tools. Please note that this is not an exhaustive collection, nor are the tools necessarily the "best" of their category.
Buros Institute
This website allows you to search by topic or by name of assessment for reviews of assessments. It will give you the title, author, purpose of assessment, publisher, and publisher's address. You can purchase a full review of each assessment for $15 per assessment. Note that these reviews are descriptions and evaluations of the tests, not the actual tests themselves. To purchase the actual test materials, you will need to contact the test publisher(s).
CHIPTS
The Center for HIV Identification, Prevention, and Treatment Services (CHIPTS) is a collaboration of researchers who want to enhance the collective understanding of HIV research and to promote early detection, effective prevention, and treatment programs for HIV. This website will allow you to search by topic and (when available) will give you the assessment name, background of the scale (i.e., number of items and what it was designed for), the assessment developer, any copyright information and who holds the copyright, the psychometric measures (i.e., reliability and validity), the actual assessment items, how to score the assessment, and any related references.
Assessment Publishers
Directories of test publishers are included in most major testing reference books (MMY, Tests, TIP). The size and scope of the directory usually reflects how many tests are included in that book. For example, TIP provides brief information on the greatest number of commercially available tests and, thus, has an extensive publisher directory. The Test Collection at Educational Testing Service (ETS) has a free pamphlet entitled Major U.S. Publishers of Standardized Tests, which lists the names, addresses, and phone numbers of 28 major test publishers. Call or write to them for your free copy at ETS, Library, Rosedale Road, Princeton, NJ, 08541, (609) 734-5667.
Assessment References
Download Things to Consider When Evaluating an Assessment Tool
Glossary
Causal relationship - a relationship where variation in one variable causes variation in another.
Concurrent validity - the ability to distinguish between groups that should be theoretically distinguishable.
Content validity - whether or not your instrument reflects the content you are trying to measure.
Convergent validity - measures that should be related are related.
Discriminant validity - measures that should not be related are not.
Correlation - a measure of the association between two variables; the closer its absolute value is to 1, the stronger the correlation.
Covariation - a measure of how two variables both vary relative to one another.
Deviation - the difference of a score from the mean.
Error Component - the part of the variance of an observed variable that is due to random measurement errors.
Face Validity - addresses whether or not a measurement instrument is valid on its face.
Hypothesis - a theory or prediction made about the relationship between two variables.
Interaction - when the effect of one variable (or factor) is not the same at each level of the other variable (or factor).
Linear Correlation - a statistical measure of the strength of the relationship between variables (e.g., treatment and outcome). The closer the coefficient is to +1 or -1, the stronger the relationship. A positive correlation implies a direct relationship between the variables; a negative correlation implies an inverse relationship.
Linear Regression - the prediction equation that estimates the value of the outcome variable ("y") for any given treatment variable ("x").
Main Effect - the effect of a factor on the dependent variable (response) measured without regard to other factors in the analysis.
Mean - the average of your sample, computed by taking the sum of the individual scores and dividing it by the total number of individuals (sample size, "n").
Median - if you rank the observations according to size, the median is the observation that divides the list into equal halves.
Mode - the observation that occurs most frequently.
Null Hypothesis - the prediction that there is no relationship between your treatment and your outcome.
Predictive validity - the ability to predict something you want to predict.
Random sample - a sample of a population where each member of the population has an equal chance of being in the sample.
Significance level - the probability of finding a relationship between your treatment and effect when there isn't one in reality.
Type I Error - rejecting the null hypothesis when it is true.
Type II Error - accepting (failing to reject) the null hypothesis when it is false.
Variation - a measure of the spread of the variable, usually used to describe the deviation from a central value (e.g., the mean). Numerically, it's the sum of the squared deviations from the mean.
Resources
Coming Soon!
