Evaluation is a systematic determination of a subject's merit, worth and significance, using criteria governed by a set of standards. It can assist an organization, program, design, project or any other intervention or initiative to assess any aim, realisable concept/proposal, or any alternative, to help in decision-making; or to ascertain the degree of achievement or value in regard to the aim and objectives and results of any such action that has been completed. The primary purpose of evaluation, in addition to gaining insight into prior or existing initiatives, is to enable reflection and assist in the identification of future change.
Evaluation is often used to characterize and appraise subjects of interest in a wide range of human enterprises, including the arts, criminal justice, non-profit organizations, health care, and other human services. It is long term and done at the end of a period of time.
Evaluation is the structured interpretation and giving of meaning to predicted or actual impacts of proposals or results. It looks at original objectives, and at what is either predicted or what was accomplished and how it was accomplished. So evaluation can be formative, that is, taking place during the development of a concept or proposal, project or organization, with the intention of improving the value or effectiveness of the proposal, project, or organisation. It can also be summative, drawing lessons from a completed action or project or an organisation at a later point in time or circumstance.
Evaluation is inherently a theoretically informed approach (whether explicitly or not), and consequently any particular definition of evaluation would have been tailored to its context: the theory, needs, purpose, and methodology of the evaluation process itself. Having said this, evaluation has been defined as:
* A systematic, rigorous, and meticulous application of scientific methods to assess the design, implementation, improvement, or outcomes of a program. It is a resource-intensive process, frequently requiring resources such as evaluator expertise, labor, time, and a sizable budget.
* "The critical assessment, in as objective a manner as possible, of the degree to which a service or its component parts fulfills stated goals" (St Leger and Wordsworth-Bell).
The focus of this definition is on attaining objective knowledge, and scientifically or quantitatively measuring predetermined and external concepts.
* "A study designed to assist some audience to assess an object's merit and worth" (Stufflebeam).
In this definition the focus is on facts as well as value-laden judgments of the program's outcomes and worth.
The main purpose of a program evaluation can be to "determine the quality of a program by formulating a judgment" (Marthe Hurteau, Sylvain Houle and Stéphanie Mongiat, 2009).
An alternative view is that "projects, evaluators, and other stakeholders (including funders) will all have potentially different ideas about how best to evaluate a project since each may have a different definition of 'merit'. The core of the problem is thus about defining what is of value."
From this perspective, evaluation "is a contested term", as "evaluators" use the term evaluation to describe an assessment, or investigation of a program whilst others simply understand evaluation as being synonymous with applied research.
Evaluation serves two functions, distinguished by purpose. Formative evaluations provide information for improving a product or a process. Summative evaluations provide information on short-term effectiveness or long-term impact, to inform decisions about adopting a product or process.
Not all evaluations serve the same purpose: some evaluations serve a monitoring function rather than focusing solely on measurable program outcomes or evaluation findings, and a full list of types of evaluations would be difficult to compile.
This is because evaluation is not part of a unified theoretical framework, drawing on a number of disciplines, which include management and organisational theory, policy analysis, social anthropology, and social change.
However, strict adherence to a set of methodological assumptions may make the field of evaluation more acceptable to a mainstream audience, but it also tends to prevent evaluators from developing new strategies for dealing with the myriad problems that programs face.
It is claimed that only a minority of evaluation reports are used by the evaluand (client) (Datta, 2006).
One justification of this is that "when evaluation findings are challenged or utilization has failed, it was because stakeholders and clients found the inferences weak or the warrants unconvincing" (Fournier and Smith, 1993).
Some reasons for this situation may be the failure of the evaluator to establish a set of shared aims with the evaluand, or creating overly ambitious aims, as well as failing to compromise and incorporate the cultural differences of individuals and programs within the evaluation aims and process.
None of these problems are due to a lack of a definition of evaluation but are rather due to evaluators attempting to impose predisposed notions and definitions of evaluations on clients. The central reason for the poor utilization of evaluations is arguably due to the lack of tailoring of evaluations to suit the needs of the client, due to a predefined idea (or definition) of what an evaluation is rather than what the client needs are (House, 1980).
The development of a standard methodology for evaluation will require arriving at applicable ways of asking and stating the results of questions about ethics such as agent-principal, privacy, stakeholder definition, limited liability; and could-the-money-be-spent-more-wisely issues.
Depending on the topic of interest, there are professional groups that review the quality and rigor of evaluation processes.
Evaluating programs and projects, regarding their value and impact within the context they are implemented, can be ethically challenging. Evaluators may encounter complex, culturally specific systems resistant to external evaluation. Furthermore, the project organization or other stakeholders may be invested in a particular evaluation outcome. Finally, evaluators themselves may encounter "conflict of interest (COI)" issues, or experience interference or pressure to present findings that support a particular assessment.
General professional codes of conduct, as determined by the employing organization, usually cover three broad aspects of behavioral standards, and include inter-collegial relations (such as respect for diversity), operational issues (due competence, documentation accuracy and appropriate use of resources), and conflicts of interest (nepotism, accepting gifts and other kinds of favoritism).
However, specific guidelines particular to the evaluator's role that can be utilized in the management of unique ethical challenges are required. The Joint Committee on Standards for Educational Evaluation has developed standards for program, personnel, and student evaluation. The Joint Committee standards are broken into four sections: Utility, Feasibility, Propriety, and Accuracy. Various European institutions have also prepared their own standards, more or less related to those produced by the Joint Committee. They provide guidelines about basing value judgments on systematic inquiry, evaluator competence and integrity, respect for people, and regard for the general and public welfare.
The American Evaluation Association has created a set of Guiding Principles for evaluators. The order of these principles does not imply priority among them; priority will vary by situation and evaluator role. The principles run as follows:
* Systematic Inquiry: evaluators conduct systematic, data-based inquiries about whatever is being evaluated. This requires quality data collection, including a defensible choice of indicators, which lends credibility to findings.
Findings are credible when they are demonstrably evidence-based, reliable and valid. This also pertains to the choice of methodology employed, such that it is consistent with the aims of the evaluation and provides dependable data. Furthermore, utility of findings is critical such that the information obtained by evaluation is comprehensive and timely, and thus serves to provide maximal benefit and use to stakeholders.
* Competence: evaluators provide competent performance to stakeholders. This requires that evaluation teams comprise an appropriate combination of competencies, such that varied and appropriate expertise is available for the evaluation process, and that evaluators work within their scope of capability.
* Integrity / Honesty: evaluators ensure the honesty and integrity of the entire evaluation process. A key element of this principle is freedom from bias in evaluation, and this is underscored by three principles: impartiality, independence, and transparency.
Independence is attained by upholding independence of judgment, such that evaluation conclusions are not influenced or pressured by another party, and by avoiding conflict of interest, such that the evaluator does not have a stake in a particular conclusion. Conflict of interest is at issue particularly where funding of evaluations is provided by particular bodies with a stake in conclusions of the evaluation, and this is seen as potentially compromising the independence of the evaluator. Whilst it is acknowledged that evaluators may be familiar with agencies or projects that they are required to evaluate, independence requires that they not have been involved in the planning or implementation of the project. A declaration of interest should be made, stating any benefits from or association with the project. Independence of judgment is required to be maintained against any pressures brought to bear on evaluators, for example, by project funders wishing to modify evaluations such that the project appears more effective than findings can verify.
Impartiality pertains to findings being a fair and thorough assessment of strengths and weaknesses of a project or program. This requires taking due input from all stakeholders involved and presenting findings without bias and with a transparent, proportionate, and persuasive link between findings and recommendations. Thus evaluators are required to delimit their findings to evidence. A mechanism to ensure impartiality is external and internal review. Such review is required of significant (determined in terms of cost or sensitivity) evaluations. The review is based on quality of work and the degree to which a demonstrable link is provided between findings and recommendations.
Transparency requires that stakeholders are aware of the reason for the evaluation, the criteria by which evaluation occurs and the purposes to which the findings will be applied. Access to the evaluation document should be facilitated through findings being easily readable, with clear explanations of evaluation methodologies, approaches, sources of information, and costs.
* Respect for People: evaluators respect the security of the respondents, program participants, and other stakeholders with whom they interact. This is particularly pertinent with regard to those who will be affected by the evaluation findings.
Protection of people includes ensuring informed consent from those involved in the evaluation, upholding confidentiality, and ensuring that the identity of those who may provide sensitive information towards the program evaluation is protected. Evaluators are ethically required to respect the customs and beliefs of those who are affected by the evaluation or program activities. Such respect is demonstrated, for example, by respecting local customs (e.g. dress codes), respecting people's privacy, and minimizing demands on others' time.
Where stakeholders wish to place objections to evaluation findings, such a process should be facilitated through the local office of the evaluation organization, and procedures for lodging complaints or queries should be accessible and clear.
* Responsibilities for General and Public Welfare: evaluators articulate and take into account the diversity of interests and values that may be related to the general and public welfare. Access to evaluation documents by the wider public should be facilitated such that discussion and feedback is enabled.
Furthermore, international organizations such as the IMF and the World Bank have independent evaluation functions. The various funds, programmes, and agencies of the United Nations have a mix of independent, semi-independent and self-evaluation functions, which have organized themselves as a system-wide UN Evaluation Group that works together to strengthen the function, and to establish UN norms and standards for evaluation. There is also an evaluation group within the OECD-DAC, which endeavors to improve development evaluation standards. The independent evaluation units of the major multinational development banks (MDBs) have also created the Evaluation Cooperation Group to strengthen the use of evaluation for greater MDB effectiveness and accountability, share lessons from MDB evaluations, and promote evaluation harmonization and collaboration.
The word "evaluation" has various connotations for different people, raising issues related to this process that include: what type of evaluation should be conducted; why there should be an evaluation process; and how the evaluation is integrated into a program for the purpose of gaining greater knowledge and awareness.
There are also various factors inherent in the evaluation process, for example, critically examining influences within a program, which involves gathering and analyzing relevant information about the program. Michael Quinn Patton advanced the idea that the evaluation procedure should be directed towards:
* Making judgments on a program
* Improving its effectiveness
* Informing programming decisions
From another perspective on evaluation, offered by Thomson and Hoffman in 2003, situations can arise in which the process is not advisable: for instance, when a program is unpredictable or unsound. This includes a program lacking a consistent routine, or the concerned parties being unable to reach an agreement regarding the purpose of the program. A further case is an influencer, or manager, refusing to incorporate relevant, important central issues within the evaluation.
There exist several conceptually distinct ways of thinking about, designing, and conducting evaluation efforts. Many of the evaluation approaches in use today make truly unique contributions to solving important problems, while others refine existing approaches in some way.
Classification of approaches
Two classifications of evaluation approaches, by House [House, E. R. (1978). Assumptions underlying evaluation models. Educational Researcher. 7(3), 4-12.] and by Stufflebeam and Webster [Stufflebeam, D. L., & Webster, W. J. (1980). An analysis of alternative approaches to evaluation. Educational Evaluation and Policy Analysis. 2(3), 5-19.], can be combined into a manageable number of approaches in terms of their unique and important underlying principles.
House considers all major evaluation approaches to be based on a common ideology entitled liberal democracy. Important principles of this ideology include freedom of choice, the uniqueness of the individual, and inquiry grounded in objectivity. He also contends that they are all based on subjectivist ethics, in which ethical conduct is based on the subjective or intuitive experience of an individual or group. One form of subjectivist ethics is utilitarian, in which "the good" is determined by what maximizes a single, explicit interpretation of happiness for society as a whole. Another form of subjectivist ethics is intuitionist, in which no single interpretation of "the good" is assumed and such interpretations need not be explicitly stated nor justified.
These ethical positions have corresponding epistemologies for obtaining knowledge. The objectivist epistemology is associated with the utilitarian ethic; in general, it is used to acquire knowledge that can be externally verified (intersubjective agreement) through publicly exposed methods. The subjectivist epistemology is associated with the intuitionist/pluralist ethic and is used to acquire new knowledge based on existing personal knowledge, as well as experiences that are (explicit) or are not (tacit) available for public inspection. House then divides each epistemological approach into two main political perspectives. Approaches can take an elite perspective, focusing on the interests of managers and professionals, or they can take a mass perspective, focusing on consumers and participatory approaches.
Stufflebeam and Webster place approaches into one of three groups, according to their orientation toward the role of values and ethical consideration. The political orientation promotes a positive or negative view of an object regardless of what its value actually is or might be; they call this pseudo-evaluation. The questions orientation includes approaches that might or might not provide answers specifically related to the value of an object; they call this quasi-evaluation. The values orientation includes approaches primarily intended to determine the value of an object; they call this true evaluation.
When the above concepts are considered simultaneously, fifteen evaluation approaches can be identified in terms of epistemology, major perspective (from House), and orientation.
Two pseudo-evaluation approaches, politically controlled and public relations studies, are represented. They are based on an objectivist epistemology from an elite perspective. Six quasi-evaluation approaches use an objectivist epistemology. Five of them (experimental research, management information systems, testing programs, objectives-based studies, and content analysis) take an elite perspective. Accountability takes a mass perspective. Seven true evaluation approaches are included. Two approaches, decision-oriented and policy studies, are based on an objectivist epistemology from an elite perspective. Consumer-oriented studies are based on an objectivist epistemology from a mass perspective. Two approaches, accreditation/certification and connoisseur studies, are based on a subjectivist epistemology from an elite perspective. Finally, adversary and client-centered studies are based on a subjectivist epistemology from a mass perspective.
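The classification just described can be written down as a small lookup table. The following is a minimal sketch in Python and is not part of House's or Stufflebeam and Webster's work: the `Approach` record, the `APPROACHES` list, and the `find` helper are hypothetical names chosen for illustration, and the attribute values are copied from the paragraph above.

```python
from collections import Counter
from typing import NamedTuple


class Approach(NamedTuple):
    """One evaluation approach and its three classifying attributes."""
    name: str
    orientation: str   # "pseudo", "quasi", or "true" (Stufflebeam and Webster)
    epistemology: str  # "objectivist" or "subjectivist" (House)
    perspective: str   # "elite" or "mass" (House)


# Hypothetical table; rows follow the order used in the text.
APPROACHES = [
    Approach("politically controlled", "pseudo", "objectivist", "elite"),
    Approach("public relations", "pseudo", "objectivist", "elite"),
    Approach("experimental research", "quasi", "objectivist", "elite"),
    Approach("management information systems", "quasi", "objectivist", "elite"),
    Approach("testing programs", "quasi", "objectivist", "elite"),
    Approach("objectives-based", "quasi", "objectivist", "elite"),
    Approach("content analysis", "quasi", "objectivist", "elite"),
    Approach("accountability", "quasi", "objectivist", "mass"),
    Approach("decision-oriented", "true", "objectivist", "elite"),
    Approach("policy studies", "true", "objectivist", "elite"),
    Approach("consumer-oriented", "true", "objectivist", "mass"),
    Approach("accreditation/certification", "true", "subjectivist", "elite"),
    Approach("connoisseur", "true", "subjectivist", "elite"),
    Approach("adversary", "true", "subjectivist", "mass"),
    Approach("client-centered", "true", "subjectivist", "mass"),
]


def find(orientation=None, epistemology=None, perspective=None):
    """Return the names of approaches matching every attribute that is given."""
    return [
        a.name
        for a in APPROACHES
        if (orientation is None or a.orientation == orientation)
        and (epistemology is None or a.epistemology == epistemology)
        and (perspective is None or a.perspective == perspective)
    ]


# Count approaches per orientation and look up one cell of the classification.
counts = Counter(a.orientation for a in APPROACHES)
print(len(APPROACHES), dict(counts))
print(find(orientation="true", epistemology="subjectivist", perspective="mass"))
```

Filtering on a single attribute, for example `find(perspective="mass")`, lists every approach that House would place in the mass perspective, whatever its orientation.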
Summary of approaches
The following table is used to summarize each approach in terms of four attributes—organizer, purpose, strengths, and weaknesses. The organizer represents the main considerations or cues practitioners use to organize a study. The purpose represents the desired outcome for a study at a very general level. Strengths and weaknesses represent other attributes that should be considered when deciding whether to use the approach for a particular study. The following narrative highlights differences between approaches grouped together.
Politically controlled and public relations studies are based on an objectivist epistemology from an elite perspective. Although both of these approaches seek to misrepresent value interpretations about an object, they function differently from each other. Information obtained through politically controlled studies is released or withheld to meet the special interests of the holder, whereas public relations information creates a positive image of an object regardless of the actual situation. Despite the application of both studies in real scenarios, neither of these approaches is acceptable evaluation practice.
Objectivist, elite, quasi-evaluation
As a group, these five approaches represent a highly respected collection of disciplined inquiry approaches. They are considered quasi-evaluation approaches because particular studies legitimately can focus only on questions of knowledge without addressing any questions of value. Such studies are, by definition, not evaluations. These approaches can produce characterizations without producing appraisals, although specific studies can produce both. Each of these approaches serves its intended purpose well. They are discussed roughly in order of the extent to which they approach the objectivist ideal.
* Experimental research is the best approach for determining causal relationships between variables. The potential problem with using this as an evaluation approach is that its highly controlled and stylized methodology may not be sufficiently responsive to the dynamically changing needs of most human service programs.
* Management information systems (MISs) can give detailed information about the dynamic operations of complex programs. However, this information is restricted to readily quantifiable data usually available at regular intervals.
* Testing programs are familiar to just about anyone who has attended school, served in the military, or worked for a large company. These programs are good at comparing individuals or groups to selected norms in a number of subject areas or to a set of standards of performance. However, they only focus on testee performance and they might not adequately sample what is taught or expected.
* Objectives-based approaches relate outcomes to prespecified objectives, allowing judgments to be made about their level of attainment. Unfortunately, the objectives are often not proven to be important or they focus on outcomes too narrow to provide the basis for determining the value of an object.
* Content analysis is a quasi-evaluation approach because content analysis judgments need not be based on value statements. Instead, they can be based on knowledge. Such content analyses are not evaluations. On the other hand, when content analysis judgments are based on values, such studies are evaluations.
Objectivist, mass, quasi-evaluation
* Accountability is popular with constituents because it is intended to provide an accurate accounting of results that can improve the quality of products and services. However, this approach quickly can turn practitioners and consumers into adversaries when implemented in a heavy-handed fashion.
Objectivist, elite, true evaluation
* Decision-oriented studies are designed to provide a knowledge base for making and defending decisions. This approach usually requires close collaboration between an evaluator and a decision-maker, which makes it susceptible to corruption and bias.
* Policy studies provide general guidance and direction on broad issues by identifying and assessing potential costs and benefits of competing policies. The drawback is these studies can be corrupted or subverted by the politically motivated actions of the participants.
Objectivist, mass, true evaluation
* Consumer-oriented studies are used to judge the relative merits of goods and services based on generalized needs and values, along with a comprehensive range of effects. However, this approach does not necessarily help practitioners improve their work, and it requires a very good and credible evaluator to do it well.
Subjectivist, elite, true evaluation
* Accreditation / certification programs are based on self-study and peer review of organizations, programs, and personnel. They draw on the insights, experience, and expertise of qualified individuals who use established guidelines to determine if the applicant should be approved to perform specified functions. However, unless performance-based standards are used, attributes of applicants and the processes they perform often are overemphasized in relation to measures of outcomes or effects.
* Connoisseur studies use the highly refined skills of individuals intimately familiar with the subject of the evaluation to critically characterize and appraise it. This approach can help others see programs in a new light, but it is difficult to find a qualified and unbiased connoisseur.
Subjectivist, mass, true evaluation
* The adversary approach focuses on drawing out the pros and cons of controversial issues through quasi-legal proceedings. This helps ensure a balanced presentation of different perspectives on the issues, but it is also likely to discourage later cooperation and heighten animosities between contesting parties if "winners" and "losers" emerge.
* Client-centered studies address specific concerns and issues of practitioners and other clients of the study in a particular setting. These studies help people understand the activities and values involved from a variety of perspectives. However, this responsive approach can lead to low external credibility and a favorable bias toward those who participated in the study.
Methods and techniques
Evaluation is methodologically diverse. Methods may be qualitative or quantitative, and include case studies, survey research, statistical analysis, model building, and many more such as:
* Accelerated aging
* Action research
* Advanced product quality planning
* Alternative assessment
* Appreciative Inquiry
* Axiomatic design
* Case study
* Change management
* Clinical trial
* Cohort study
* Competitor analysis
* Consensus decision-making
* Consensus-seeking decision-making
* Content analysis
* Conversation analysis
* Cost-benefit analysis
* Data mining
* Delphi Technique
* Design Focused Evaluation
* Discourse analysis
* Educational accreditation
* Electronic portfolio
* Environmental scanning
* Experimental techniques
* Factor analysis
* Factorial experiment
* Feasibility study
* Field experiment
* Fixtureless in-circuit test
* Focus group
* Force field analysis
* Game theory
* Goal-free evaluation
* Historical method
* Iterative design
* Marketing research
* Most significant change technique
* Multivariate statistics
* Naturalistic observation
* Observational techniques
* Opinion polling
* Organizational learning
* Outcome mapping
* Outcomes theory
* Participant observation
* Participatory impact pathways analysis
* Policy analysis
* Post occupancy evaluation
* Process improvement
* Project management
* Qualitative research
* Quality audit
* Quality circle
* Quality control
* Quality management
* Quality management system
* Quantitative research
* Questionnaire construction
* Root cause analysis
* Six Sigma
* Standardized testing
* Statistical process control
* Statistical survey
* Strategic planning
* Structured interviewing
* Systems theory
* Student testing
* Theory of change
* Total quality management
* Wizard of Oz experiment
* Monitoring and Evaluation is a process used by governments, international organizations and NGOs to assess ongoing or past activities
* Assessment is the process of gathering and analyzing specific information as part of an evaluation
* Competency evaluation is a means for teachers to determine the ability of their students in other ways besides the standardized test
* Educational evaluation is evaluation that is conducted specifically in an educational setting
* Immanent evaluation, opposed by Gilles Deleuze to value judgment
* Performance evaluation is a term from the field of language testing. It stands in contrast to competence evaluation
* Program evaluation is essentially a set of philosophies and techniques to determine if a program 'works'
* Donald Kirkpatrick's Evaluation Model for training evaluation
* Efficiently updatable neural network, a neural network based evaluation function