Interpersonal Models of Program Evaluation
- Assumptions
- Non-expert Evaluation
- Goal-free Evaluation
- Peer or Expert Evaluation
- The Values Approach
- Interactive Evaluation
- Stakeholder Evaluation (Steps, Alleged Weaknesses, Causes for Concern, Strengths)
- The Awkward Question of Trust
- Overdependence on Internal Consistency
- Interpersonal Models and Accreditation
Some models of program evaluation are based on the roles of interest groups and consensus.
Some models assume that one particular kind of interest group should do the evaluation because it knows better than the others. Some give all interest groups a role in the evaluation because they assume that no particular group should control the process. A couple even assume that the primary qualification of an evaluator is that he must not be a member of an interest group. Still others are more interested in the appropriate kinds of interaction between groups.
These models tend to see school communities as consisting of various groups with different or competing interests. These groups are unlikely to act against what they see as their own interests, but the issue is not just competing forms of selfishness. Different interest groups bring different kinds of knowledge to an evaluation, either special areas of expertise or knowledge available only to people with a particular perspective and experience of the program.
Although these types of evaluation appear methodologically different, they offer compatible insights for program evaluation suited to accreditation.
Interpersonal styles of evaluation depend on interaction between people, and unashamedly use biased personal opinions as their basic material.
Flexner held that accreditation was best done by non-experts: average laypeople using a little common sense and simple process criteria.
He did not want to belong to an interest group; he felt that experts blindly accept questionable practices, are unwilling to criticize, and have professional relationships to protect. (Floden, 1983:267, 272)
While nobody suggests that his evaluation methods remain adequate, his comments on vested interests are still relevant.
Goal-free evaluation focuses on the consumer as the interest group. In the original form of the model, Scriven held that an evaluator becomes biased when he knows a program's goals because he interprets his observations according to program goals. Almost by definition, the evaluator could not be someone in the program who knew its objectives. Considered an extremist view at its inception, goal-free evaluation was a necessary counterbalance in an era dominated by means-ends thinking and product evaluation.
The evaluator observes what happens in the program to find its actual products and effects. Rather than asking about the goals of the program, he must infer the actual goals from his observations. He might also be able to infer the reasons why the program exists and why it uses the approaches it does. If the program's reality matches its stated formulation and suits its consumers, the evaluator's inferences should resemble the goals as originally formulated. Scriven did not present a clear methodology, but the basic steps seem to be:
identify the program,
observe the processes,
question participants on what they are doing (but not why),
find out their personal responses to the program,
infer real effects and actual goals, and
prepare a report for use in a product evaluation.
This model has its problems. It either lacks a value-base from which the evaluator can draw conclusions, or uses only that of the consumer. That is, it is biased in favor of one interest group, even if it portrays the program accurately from that vantage point. It also relies heavily on inductive logic and the evaluator's ability to interpret his observations. Besides, it cannot replace product evaluation, although it is an appropriate complement. (See also Meyers, 1981:122ff)
On the other hand, its central assumption and chief advantage is that intended and actual products can differ greatly. What a program is really doing and achieving might be very different from its written goals. It is also sensitive to side-effects, which can be more significant and more real than intended products; product evaluators are less well positioned even to find them, let alone judge their importance or evaluate them. Goal-free evaluation is a consumer's kind of evaluation because it looks at the product the consumer gets, giving an undeniably important viewpoint. In this sense, the product view is producer-centered because producers can specify goals to try to limit their liability to their consumers. (See Scriven, 1974; Browne, 1984:49 also differentiates between "blueprint" and "product.")
It is hard to say that goal-free evaluation is basically flawed; it is more accurate to say that it is inherently limited and unable to serve as a true whole-of-program evaluation.
In this view, the team of expert or peer reviewers is the group which must come to a consensus. Barnett even suggests that peer review is a defining element of higher education. (1988a:108) The review team is in effect a kind of interschool consensus group, except that the relationship between the parties is quite different because it reflects the accreditor's power.
Peer evaluation is the most common form of accreditation evaluation and the literature frequently mentions it. Peer reviewers are usually staff members of other schools that are already accredited, and are presumed to be subject matter experts. Ferris also mentions "accreditation by the expert" as a separate model (1986:4), meaning that accreditation is the business of only a few people whose expertise everybody else should trust. Eisner articulated it as a form of qualitative program evaluation, calling it the connoisseurship model. (See Eisner, 1983; Guba and Lincoln, 1982:18f) These kinds are considered together here because they have much in common.
Expert evaluation has several important strengths. Experts attempt to face what they believe are the most important issues in their field. To disregard their expertise is to prefer ignorance. Subject matter experts also protect legitimate interests in disciplinary or professional content, the importance of which is hard to overestimate. Despite the approach's weaknesses, it has at least promoted high content standards.
The issue of tacit knowledge. Sadler makes a useful critique of expert (tacit) knowledge. Expert knowledge is not articulated clearly enough to be accessible to laymen or even students, but if experts substantially agree, then the knowledge must exist. He lists four weaknesses: first, its mystique makes students dependent on their teachers; second, experts can seriously disagree (also Ferris, 1986:4), and when they do, their standards appear arbitrary; third, evaluation systems using tacit knowledge are very labor-intensive; and fourth, consensus can depend more on group dynamics than on the justified application of standards.
Sadler also points out strengths. Tacit knowledge is better when a small number of criteria is inadequate and a comprehensive list is too long to be workable. It is also better when many exceptions to the standards may be made under certain conditions. (1987:199f)
At worst, experts can too easily become technocrats. "Expertise" seriously threatens accreditation when it becomes the secret knowledge of an elite, being neither explicit nor open to evaluation. Even proving that such knowledge exists in a particular case can be difficult and time-consuming. In a real evaluation, the accreditor assumes that members of the evaluation team really have such knowledge. There is no way, however, to differentiate between the use of real tacit knowledge and subjective judgments, which are highly unreliable. (Cf. e.g. Sadler, 1987:194) That is, one cannot know whether the evaluation even uses tacit knowledge unless the evaluation team dedicates considerable time and effort to communicating the rationale for its conclusions.
Where it does exist, it uses a consensus view of quality. It does not necessarily support the ineffable view because, in any one case, it is possible for experts to explain their criteria and the reasons for them. There is no shortage of effability; criteria can be reconceptualized and re-expressed in indefinitely many ways.
Weaknesses. Besides those relating to tacit knowledge, the approach as a whole is fraught with problems of many kinds:
1. Experts can easily impose incompatible values on the evaluee.
2. Experts cannot easily observe the role of values in an institution. (Ferris, 1986:4)
3. Self-studies done by elites and for elites do not involve others. They hardly encourage others to use them to improve programs, except perhaps by technocratic or bureaucratic decision.
4. There is some evidence that it fails to give good insight into specific programs (Kalkwijk, 1991).
5. In Dressel's opinion, expert reviews have tended to be broad generalizations made quickly with minimal data, and overly influenced by political and value considerations. (1976:4)
6. Peer reviewers easily become preoccupied with processes. This is a particular problem in many kinds of nontraditional education, where processes are very different from those of ordinary campuses. In such cases, peer reviewers should come from programs similar enough for them to be able to make useful, unprejudiced statements.
7. It does not pass the interest group test. On one hand, issues of functionality lie outside the interests protected by content experts. They are rightly concerned with the body of knowledge which students must know, but not necessarily with whether it is communicated well to students or whether programs are well-run. Like gate-keepers of professional guilds (cf. Sadler, 1987:199), experts can succumb to the private club mentality. Reviewers are by definition biased in favor of one set of particular interests, which they can most effectively protect by withholding a recommendation favorable to the school.
8. There are many different kinds of experts. To claim that expertise alone is an adequate basis for evaluation conclusions, a team would need enough experts in each relevant area to come to some sort of consensus based on a body of responsible opinion. It would also need to ensure that team members did not become over-involved outside their own particular areas of ability. This hardly appears likely, especially given the many different kinds of relevant expertise. A subject matter expert is an authority in a particular content area. Others know more about program administration and are better qualified to give advice on administrative efficiency. Still other expert groups are educators and program evaluators, who have seldom played a role in accreditation. Academics differ from practicing professionals, and it is unwise to presume that they always accept each other's opinions. For example, rather than combining several theories eclectically, academics tend to construct single theories, which individually may be inadequate for practice. Basic research often questions (and sometimes disproves) the working assumptions of professionals.
Conclusion. It is impossible for accreditors to develop pools of expertise in all relevant fields, and schools can take responsibility for getting consultant help where necessary. However, review by content experts is indispensable in protecting the issues of content, and it has a good track record doing so. Nevertheless, it cannot stand alone as an adequate model; it favors a particular interest group yet has no built-in checks and balances.
The next approach to accreditation purports to be mostly concerned with values, but it is not as different from the other models as it seems. Although it could be considered essentially different, it is an interpersonal model in that it assumes that an identifiable community of schools and accreditors can share a common set of values.
In its original form, it rests on several main tenets:
- The school, not the accreditation agency, should evaluate how well the school has adapted to its context.
- Criteria should only measure things that are general throughout all schools, not how a school fits into its local context.
- Accreditors should make their values explicit, and may even dictate an ideology of education.
- Accreditation should depend very heavily on stakeholder evaluation.
Narrow definition of schools. It must be admitted that the original ATA manual written by Robert Ferris was for a very narrowly-defined kind of school, for which it would probably be a successful accreditation tool. It would not normally be in accreditors' best interests to define their kinds of schools so precisely; the assumptions in the ATA document are too narrow to include some good schools and consequently become an imposition. For example, it assumes that all schools are instructional institutions created to train church personnel. The literature on the goals of higher education, for all its weaknesses, mostly avoids this problem while playing much the same role in defining values. Most accreditors restrict their ideology to education.
The manual overtly dictates a particular ideology of Theological Education by Extension and then identifies it with educational quality. This could be taken to imply that any variation in philosophy is poor-quality education, a stance far more dangerous than dictating process specifics, which is one of the very problems the manual is trying to avoid. (ATA, 1987:4)
Interactive Evaluation: Stake, Cronbach
Stake (1967) proposed the countenance model with its thirteen categories of information. It seems a little too complex to be suited to accreditation, but Plueddemann (1987:59) summarized it in six leading questions:
1. What context was assumed during planning?
2. What learning activities were intended?
3. What outcomes were intended?
4. What was the actual context of the program?
5. What learning activities were actually used?
6. What were the actual outcomes?
This model had several strengths. It differentiated between what was intended and what actually happened, which implied a feedback evaluation of program goals. It looked at both processes and outcomes. Its original form included places for information on the rationale of the program, and for standards and judgments.
According to Stufflebeam (1983:122), Stake eventually incorporated the countenance model into the responsive model, where evaluators could use it as an advance organizer to plan the evaluation. (Stake, 1983:295)
In responsive evaluation, the evaluator is part of the evaluation process, not a disinterested and remote separate party. The evaluation emerges from the way people respond to each other. Stake suggested twelve activities, summarized in the following seven points:
1. Observe the program and draw conclusions about its scope and processes,
2. Talk with program participants,
3. Locate and conceptualize emerging issues,
4. Discover purposes and concerns,
5. Prepare portrayals,
6. Select observers, judges and instruments, and
7. Assemble reports. (1983:297f)
A strength of the model is that these activities can occur in almost any order (p. 297); one lesson from curriculum studies is that people do not follow a neat set of steps in exact order (cf. Brady's view of interactive curriculum, 1983:64ff). A strict order is impossible when two steps must be handled simultaneously in relationship to each other, and it is always necessary to think ahead to future steps, then later re-check and revise what was done. (See also Ferris, 1986:4f)
However, the lack of a clear procedure is also a weakness waiting to ensnare the inexperienced evaluator. A step-by-step procedure helps the evaluator conceptualize what he is trying to achieve and why, and shows where to start and how to proceed. Some ordering is obligatory; an evaluator could hardly start by writing the final report. The truth is more likely that ordering is necessary but cannot be rigid. Interactive evaluation closely resembles the realities of product accreditation, and the relationship between its "steps" is much as in most other models.
Cronbach presents a similarly interactive approach to evaluation. Unlike Stake, however, he suggests that the interactive aspect of evaluation is unavoidable, whatever model one uses. For example, evaluations depend greatly on people with political power for support, acceptance, and implementation. (1980:6) Similarly, he suggests that an evaluator has political power even if he does not want it. (1980:3) Evaluators play important roles in linking people together, suggesting evaluation models, establishing criteria, and implying success or failure; consequently, program participants attempt to do that which will be considered successful.
Like the consensus view of quality, the interactive model not only depends on interaction between people, it describes some dynamics that will happen anyway. Cronbach suggests it is better to make multiple evaluations of programs using various models than trust everything to one evaluation (1980:7), decreasing the expectations thrust upon any one evaluator.
As an approach to accreditation, it inherits the strengths of the countenance framework. That it is so dynamic without a more definite procedure, however, means that an evaluator needs considerable expertise and time to produce a suitable report. Stake saw "risks" in the approach, saying that it relied overly on subjective perceptions and that it ignored causes (1983:304), concerns that stem from his naturalistic presuppositions. It is difficult to see how it could work if the evaluator were part of either the school or the accreditation agency. It is also best suited to providing descriptive and formative data, and gives little expectation of summative results.
An obvious way to evaluate a program is simply to ask the people in it what they think.
The evaluator can get many opinions on its problems and many ideas on how to improve it, although the technique is more difficult than that. In this view, the consensus group includes all the school's interest groups: students, teachers, experts, administrators, funders, employer groups, representatives of practicing professionals, and graduates, in fact anybody who "holds a stake" in the success of the institution.
Many stakeholder groups, most notably students, are not permanent administrative units in the school. By involving everyone, stakeholder evaluations differ from ordinary accreditation self-studies, which are done primarily by small groups of influential people.
Stakeholders have differing kinds of interests. For example:
- Most students want a fair system with an efficient, cooperative administration. They want to be satisfied with teaching, and they want a credible degree which will open doors to careers and to further education.
- Staff want career development and a sense of academic or professional achievement.
- Employers are mainly interested in getting competent employees.
Stakeholders also have different levels of interest. An employer group can usually recruit graduates from other schools if need be, but full-time staff must think about what they will do all day in the long-term future. (Cf. also Harris, 1990:41)
A good stakeholder evaluation encourages openness, honesty, and willingness to solve problems. Each group can raise issues from their unique perspectives, presumably counterbalancing each other in the way they interact to protect their partisan interests.
Stakeholders have good reasons to work together if they want to become accredited and believe that they are ready for it. Using multiple perspectives naturally leads to the use of various tools. (Furnham has noted that one tends to think of everything as a nail if one's only tool is a hammer. [1990:109])
Stakeholderism makes several assumptions. As each school is presumably unique, stakeholderism uses an internal consistency criterion, although some stakeholders might want to copy other schools. Using their different views of the program as starting points, stakeholder groups seek consensus on appropriate ways to improve their program. Consequently, negotiation plays a prominent role in finding emergent truth about the program. It is almost true to say that stakeholders must negotiate their own accreditation with each other, and the negotiation aspect of curriculum becomes very illuminative.
Consensus refers to stated acceptance of a given proposal by appointed stakeholder representatives in a meeting, in such a way that they all become willing to act upon it. It does not necessarily reflect personal belief, because it is the result of the buffeting of negotiation and compromise.
By avoiding questions that predetermine answers, the evaluator does not decide what is good education. He is a facilitator whose questions help stakeholders to articulate their thoughts and observations.
The Steps in Stakeholder Evaluation
The versions of stakeholderism differ in mostly small ways. Unlike other versions, which suit a wider range of applications, Ferris adds more formalized steps appropriate to an accreditation evaluation. This largely answers questions about what is to be evaluated and why--a school is being evaluated for accreditation. It also makes the process more concrete and suited to untrained evaluators, being somewhat simpler and having fewer steps. In contrast, Guba and Lincoln add steps so that each stakeholder group can resolve some issues internally before dealing with other stakeholder groups.
When Ferris applied stakeholderism to accreditation, it was only one part of the process, the self-study. The accreditor then checked the credibility (integrity) of the self-study. That is, the accreditor not only sets a procedure for the self-study but the visiting accreditation team checks it afterward. The accreditor retains some leverage to avoid abuse, so that stakeholders cannot simply vote themselves accredited.
The following is a combination of Guba and Lincoln (1989) and Ferris (1986), with several additions from JCSEE (1981):
1. The school contacts the accreditor.
2. The school then appoints a two-member team to direct the evaluation.
3. The team identifies stakeholders and explains the full evaluation procedure.
4. The team chooses stakeholder representatives, who form a stakeholder committee.
5. The team helps each stakeholder group as follows:
- Each group interprets their role in the school.
- Each group identifies concerns and issues which they feel need to be resolved if the school is to improve.
- Each group locates new information and better ways to process it, so that it can better understand the school from its perspective. This should resolve some issues.
- Each group reviews the results of other stakeholder groups' discussions, which should also resolve some issues.
6. The team holds a stakeholder committee meeting and
- clarifies the purpose of the school,
- describes how it operates,
- identifies factors influencing the development of the school,
- discusses evidence on the effectiveness of the school,
- sorts out which issues have already been resolved,
- prioritizes unresolved issues,
- determines what information will be necessary to resolve them, and
- delegates information collection.
(The results of the first three points comprise the basis of a statement on program characteristics.)
7. Delegated people collect information on unresolved issues.
8. Meanwhile, the evaluation team evaluates the school using the criteria provided by the accreditor. It also prepares a report on "integrity," that is, the credibility of the evaluation before the accreditor. This comprises a justification of the degree levels and a description of how the evaluation was done, including problems encountered and the team's solutions.
9. The evaluation team prepares an agenda for negotiation.
10. The stakeholder committee meets and
- negotiates solutions,
- identifies unresolved issues,
- compiles a full self-study report (which also suggests program improvements), and
- comes to agreement on the report, revising it as necessary.
11. The school informs the accreditor that it is ready for the accreditor's visiting team and encloses copies of the report.
12. The accreditor's team studies the report before the visit.
13. The visiting team from the accreditor checks that the school adhered to the procedure responsibly, verifies the level of the degrees awarded, and establishes that the school meets the accreditor's criteria. It also checks that the study of special program characteristics is adequate but may not question its results.
14. The visiting team reports back to the accreditor, which may confer accredited status on the school.
During the later stages of the process, the final report is also distributed to all stakeholder groups and a summary is made public.
The sources neglect to say that each group should review progress on program improvements since the last review, and that the school should implement the suggestions for school improvement.
Alleged Weaknesses in Stakeholderism: Weiss's Critique
Weiss's criticism of stakeholderism has some good points (below), but much of it is a little unfair. It was based on an evaluation that stood little chance of success--an evaluation of a government program where the government wanted to diffuse responsibility for the evaluation. The program involved political factors, complex funding, and various social groupings. Its sheer size and geographical spread were unfavorable to a good evaluation, and it was highly ambiguous, with activities varying greatly from day to day and from place to place. The evaluation also asked stakeholders to prespecify their information needs as if they were performing a quantitative evaluation. (1986b:191)
Weiss's case did not use functional units, and stakeholder evaluation was perhaps unsuitable for evaluating the whole of such a large and complex program anyway. Cronbach (1980:7) advises that no single evaluation could be adequate, regardless of its kind. Stakeholders were naturally harder to identify (1986a:151) and could only be further removed from the evaluation process. In a government program, political factors play a very significant role in decision-making, even if the public is likely to accept the evaluation's findings. (1986b:192f)
She also counts it a weakness of stakeholder evaluation that evaluators cannot specify information needs in advance. (1986b:190f) In fact, however, this applies to all kinds of qualitative evaluation; truth about a program is emergent and the main issues emerge only during the evaluation.
Other supposed weaknesses are no more than limitations. For example, she says that the evaluation still needed knowledge of products (1986a:153), which fell outside the scope of the evaluation. There is nothing in the stakeholder model to say that evaluators should not study products or do empirical studies, as long as they do not replace qualitative information.
Causes for Concern in Stakeholderism
Stakeholderism cannot comprise the entire accreditation evaluation. The accreditor's visiting team needs to review the integrity of the self-study and check whether the school has adequately functioning mechanisms to maintain content standards. However, this is a limitation rather than a weakness. Other limitations are internal consistency and necessary inconsistencies, discussed later in this chapter. Besides these, stakeholderism has more than a few potential real weaknesses:
Who is the evaluator? The evaluation literature assumes that the evaluator is a specialized professional. In accreditation, however, it is a little more ambiguous. The evaluator would more likely be chosen from the school's staff (as Ferris suggested), or from the accreditor's staff.
Evaluator pressures. The process puts unreasonable pressures on evaluators. The evaluator too easily becomes a clearing-house, a liaison between stakeholders, and a supposedly value-free font of wisdom. (Weiss, 1986a:153) He must refrain from using his position to protect his own interests, especially as he might become an arbiter or power-broker. It is questionable whether Guba and Lincoln's proposal helps (1989:246f); they suggested good rules of negotiation, which seem a little too idealistic to cope with many cases of natural inequities.
Inequities. Stakeholderism faces natural inequities because stakeholders do not have equal power, information, or negotiating skills.
Power structures are a problem because stakeholderism presupposes that all stakeholders have a right to contribute to the evaluation. In a more hierarchical society, powerful stakeholders can nearly dictate consensus, and strong personalities and skillful negotiators sometimes jostle for influence and power. In many societies, deans and rectors naturally negotiate from a position of power and status, while students have comparatively little voice. Stakeholderism offers no solutions for such problems.
Similarly, stakeholder groups are not equally well-informed about the business of the school, nor will they be equally close to the evaluation process. Students and junior staff can only comment on their particular experiences; they have limited viewpoints and lack the power that comes from having more complete information about the program.
Guba and Lincoln emphasize the empowering of weaker stakeholder groups to make the process fairer. (1989:246) Alternatively, some inequities might be justifiable when some stakeholders have much more at risk than others. Some might have their careers at stake, while a professional association can easily ignore a small school. It might be fairer to give some stakeholders more rights than others.
Who decides about stakeholders? Weiss points out that the approach does not have a way to clarify who should make decisions about who is a stakeholder. (1986a:152) Although most stakeholder groups are easy to define, some are rather borderline. In a program that trains professionals, consumers and potential consumers of professional services might have important opinions on the program.
Who protects the accreditor's interests? Evaluator agencies can use stakeholderism to diffuse responsibility and reduce vulnerability. (Weiss, 1986a:154) With a pattern of devolved authority, it becomes unclear who is protecting the interests of the accreditor.
Some parties can have a stake in the evaluation without having a stake in the school. The accreditor and other schools in the association also have quite justifiable interests to protect if they accept degrees and transfer credits from accredited schools. They also want to protect the credibility of the accreditor, especially if professional licensing is at stake.
It does not make much sense to extend the stakeholder concept one step further. Other schools cannot protect their interests by evaluating the evaluee school's internal evaluation, because the evaluee uses criteria generated internally through its stakeholders. A stakeholder evaluation may be good for the evaluee but it does little to ensure that a program is accreditable.
In any case, if outside groups (such as other schools or the accreditor) have a stake in the evaluation, then consistency becomes partly external.
Diffused responsibility. Like evaluator agencies, schools can also face problems of devolved authority. Despite variance between organizations and cultures, the increasing power of stakeholder groups at some point becomes administrative irresponsibility. The democratic ideal may be admirable, but not a chaotic imbalance of power where administrators can no longer carry their responsibilities.
Unhelpful expectations. Stakeholderism assumes that stakeholders have high expectations. Actually, however, both teachers and students can have very low expectations, especially if they have only ever known empty typewriter education and dysfunctional administration. It is almost as likely that stakeholders will be too idealistic and become disappointed with the evaluation.
Too much information. Scriven adds that bringing more people into an evaluation can simply make a problem more complicated, and perhaps unsolvable. (1986:64) For example, stakeholders can unnecessarily manufacture problems and opinions. Even if the input is very good, there can be too much of it to use. Stufflebeam, however, suggested that this was not such a bad problem if it remained in tension with the search for pertinent information. (1983:123) It also parallels the tension between models of evaluation; an accurate model produces such large amounts of information that it risks becoming unmanageable or irrelevant. (Cf. Stufflebeam, 1983:123) The opposite tendency is to limit information to keep it manageable, at the risk of simplistic, surrogate measures that bring their own kinds of inaccuracies.
Bias. Stakeholderism is inherently biased. Stakeholders cannot make etic observations, as mentioned by Scriven in value-free evaluation (1974; cf. also Guba and Lincoln, 1989:210). Then again, it is impractical to expect the accreditor's visiting team to make them either, if for no other reason than that it would require too large a time commitment.
By definition, all stakeholders hold a stake in the school, so they have interests in common which they seek to protect. In the case of accreditation, they all share an interest in gaining accredited status for their school. That is, they can have ulterior motives for reaching a consensus. (Cf. Kogan, 1986:133) Consequently, stakeholder consensus does not guarantee that a school is good. Schools naturally try to pretend they are good and rationalize their programs likewise. (Ramsey, 1978:214; Kogan, 1986:137) It has yet to be shown that people will not hide sensitive problems, especially in cultures which are afraid of losing face. Not only that, there is no basis for assuming that all stakeholders have compatible and valid views of quality; they can share the philosophy of a degree mill or can simply know too little about education. They do not necessarily subscribe to principles higher than their own idiosyncrasies.
The cause of the problem is overdependence on internal consistency; stakeholderism needs the counterbalancing effects of external consistency.
Disagreement. The converse of agreement through shared bias is that stakeholders do not always agree. It is naive to assume that participation in an evaluation will always motivate people to perform better. (Premfors, 1986:173; Dressel, 1976:384) The evaluation can easily uncover a nest of problems over which stakeholder groups might disagree (cf. Weiss, 1986b:191). In some cases, evaluation can also foster unnecessary conflicts, as some types of personalities may perform poorly in evaluation settings, even though they are important to the school. (Premfors, 1986:173) Elsewhere, disagreement can reflect inconsistencies over which the school has little control.
Stakeholders may also disagree with evaluation results after the evaluation; the approach does not ensure that stakeholders will accept its results. Seeing how the study is done can easily disappoint some stakeholders who subsequently become less committed to its results. Some of them can feel threatened when shown to be wrong. (Weiss, 1986b:191f)
Some of these problems are not the exclusive property of stakeholderism. All models of qualitative program evaluation work with fallible people and face the potential problems of inequities, evaluator pressures, unhelpful expectations, disagreement, excessive information, and diffused responsibility. Not only that, the long list of weaknesses tends to be a worst-case scenario. Some are truly substantial but most are only potential; what can happen is not always what will happen.
Finding strengths in stakeholderism is not very difficult, and the list is quite convincing:
1. Stufflebeam's committee proposed standards for any type of evaluation, but in their opinion, any type of evaluation should include at least an element of stakeholderism. One of the reasons is ethical; stakeholders are those who have a right to know about the evaluation. (JCSEE, 1981:21, 28, 40, 47, 56, 77)
2. It shares a main strength of Scriven's goal-free evaluation (1974) in eliciting stakeholders' personal responses to the program and inferring real effects and actual goals.
3. It does not claim to create a perfect program. By aiming at improvement, it represents the continually changing dynamics of real programs.
4. Without stakeholder evaluation, schools would still have most of the same problems but they would be less aware of them.
5. In person-oriented cultures, stakeholders are the real starting point in the school anyway, not commitment to an abstract, impersonal statement of institutional mission.
6. By using an internal consistency criterion, stakeholderism avoids unnecessary value conflicts between schools and accreditors, and protects each school's uniqueness.
7. Stakeholders include groups of people that are interested in both functionality and content.
8. There is the potential for beneficial digressions.
9. It provides ways of interpreting abstract criteria.
10. The consensus view of quality enables stakeholder populations to extend beyond the school, thus creating a level of external validity on the issues of content.
11. The ATA view predetermines a procedure. Compared to human services evaluations as a whole, accreditation of schools is a very small category, so accreditors can be more specific in defining procedures which leave less room for ambiguity and improvisation.
12. Stakeholders are well placed to understand the program at least from their own perspective and can see some kinds of problems most clearly. Stakeholders can bring to light many inconsistencies and dysfunctionalities which could not be found using predetermined criteria. For example:
- A school with separate campus and extension programs might find that they are inconsistent with each other.
- A technological approach assumes that its theoretical and technical aspects are consistent with each other, but an uncomfortable mixture of technical and scientific knowledge can be inconsistent.
- Students' immediate interests or felt needs are often inconsistent with the subject matter they really need to learn. (Pring, 1976:48ff)
- A degree which signifies real learning might be inconsistent with students' expectations to rote-learn. (Cf. Samuelowicz, 1987:123ff; Adam:n.d.)
The internal consistency principle assumes that these will be consistent or can be made consistent, even when the school has no control over them.
Program evaluation assumes that highly-trained evaluators will coordinate evaluations; they will not be stakeholders, so they will be as neutral as possible. Accreditation, however, is different. Firstly, there are two sets of evaluators: those who conduct the school's self-study and those of the accreditor's visiting team. The former clearly represents the interests of the school, but the role of the latter is less clear. Even if visiting accreditation teams get training in evaluation and include professional evaluators, it is unlikely that they will be neutral. They are chosen to represent the accreditor's interests, however defined, and schools can find it difficult to trust them.
Complaints about the lack of trust between accreditors and schools are commonplace in the literature. The issue of legalism, discussed above, weakens relationships between accreditors and accreditees. Furthermore, in Miller and Barak's survey of undergraduate evaluations, one of the most frequent responses was resistance or reluctance from institutions. (1987:27; see also p. 28)
Similarly, accreditation too easily falls prey to the personal prejudices of members of its evaluation teams. Cross et al. raise the common complaint about prejudiced individual team members, who can undermine the efforts of otherwise open-minded accreditors (1974:159f). Barnett contrasts team chairmen (often fair and open-minded) with the "waywardness" of individual members. It does not help that different players can interpret accreditation criteria quite differently. (Barnett, 1987:288, based on Alexander and Wormald, 1983:108 and Billing, 1983:34)
Scriven blames accreditors for selecting evaluation team members who are blind to deep-seated biases. (1983:251f) The problem is not much more than a private club mentality. The team is chosen according to its ability to represent the interests of the accreditor and the other schools, with no assurance that they will protect the interests of the evaluee. It is hard to see how an evaluation team could meet the standards for evaluator credibility, and their situation is readily interpreted as a conflict of interest. (See JCSEE, 1981:24f, 70ff)
A lack of trust also affects formative evaluation for school improvement. Too many schools have not believed accreditors who say that schools should use evaluation results to improve according to the unique characteristics of their schools. Evaluees easily feel that evaluators secretly want to use the evaluation as a tool of judgment, not improvement. Some schools do not trust what accreditors say they want, expecting a hidden agenda of standardization based on processes. They then interpret examples of possible improvements as rigid requirements for processes. (Rutherford, 1987b:95; Ewell, 1987:28; Van Os et al., 1987:252) It does not help that such a suspicion has often been well-founded, with accreditors sometimes paying only lip-service to the character and particular goals of each institution.
Accreditors, then, have too often had poor relationships with the schools which seek their accreditation, especially nontraditional schools. In fact, a reading of the literature causes a suspicion that lack of trust between accreditors and schools might well be the main cause of problems in the North American style of accreditation, and why discussion easily produces heat rather than light.
Van Os et al. do not understate the case when they say that trust between accreditor and accreditee is a conditio sine qua non. Trust is two-way; the accreditor can expect honest information from the school, and the school expects the accreditor to use the information responsibly. (1987:252, 255) Trust is especially necessary to accreditation which depends on consensus.
Harris suggested criterion-referenced specifications of performance to avoid subversion by interest groups (1990:52). However, this is simply a reversion to the process model of accreditation. Besides, interest groups could easily subvert it at the stage of formulating criteria or of interpreting and applying them in concrete situations.
Accreditors can take steps to minimize these problems. Firstly, training and orientation of accreditation personnel have become increasingly important. (E.g. Kells, 1986:145)
Secondly, schools might be allowed to become non-accredited members of the accrediting association so that they can see and trust the accreditor's deliberating processes, and if necessary, urge for reform from inside rather than from outside. Barnett is clearly right to say that accreditation depends heavily on people's personalities and on how well they have been "socialized" into the "game." (Barnett, 1987:288, based on Alexander and Wormald, 1983:108)
Thirdly, selection of members of the accreditor's visiting team is important because interest-group ethics play such an important role in establishing trust and fairness. It is difficult to assume that training will help them protect interests other than those they are chosen to protect. Although they face in-built tensions between the school's and accreditor's interests, they should be acceptable to both accreditor and evaluee so that both parties feel that their interests are protected.
A most attractive solution is to give the evaluee the right to nominate evaluation team members while the accreditor retains the right to approve them. (SPABC, 1988:8) This clearly favors the school, who can select people sympathetic to their particular environment and goals. The accreditor protects its interests both by setting out the criteria for team members and by the right to deny approval. (Alternatively, the accreditor could nominate team members while the evaluee approves them, but this would favor the accreditor's interests and considerably disadvantage the school.)
Overdependence on Internal Consistency
Both product evaluation and stakeholderism rest on an internal consistency criterion with some serious weaknesses. The obvious problem with fully depending on internal consistency is the lack of external consistency.
Values. The question of values recurs and is so closely related to internal consistency that separating them neatly is difficult. Consistency is itself a value, although people sometimes happily accept their inconsistencies. The product model of accreditation uses a fitness-for-purpose value to attain consistency.
A program that fails to reach its goals can simply make them less demanding, so that the means and ends are more consistent with each other. The internal consistency criterion is inadequate to say whether this is a valid and necessary adjustment or an unacceptable lowering of standards; that is, it implies some relativist values.
Scriven points out that the product model is almost pseudo-evaluation. It is a hybrid between managerialism and social science. It uses a managerialist ideology, which manipulates means to achieve ends and then finds out whether the means reached the ends. It also uses a social science ideology that claims to be objective and value-free; evaluators try objectively to evaluate (that is, assign a value to) something, while avoiding responsibility for the values implied in it. This is a contradiction. (1983:234) Scriven shows that means-ends thinking is a value system. In this case, the central problem is that internal consistency is being used as an over-riding value; evaluation is quite clearly not value-free.
Necessary inconsistencies. Inconsistency is unavoidable; some things will always be in tension. In any system of evaluation that uses an internal consistency criterion, the problem is that some things are necessarily inconsistent. The literature presents many examples:
- Academics thrive on differences of opinion, even when it makes no difference. (Dressel, 1976:380)
- Administrators are often deliberately ambiguous in their communications because they deal with many interest groups. (Dressel, 1976:381)
- Students and teachers see different purposes in evaluation. (Kogan 1986:136; Talbot and Bordage, 1986)
- Self-evaluation for improvement is in natural tension with summative external evaluation. (Kogan, 1986:135)
- Consensus is in natural tension with authoritativeness. (Kogan, 1986:135)
- Financial considerations in decision-making are often in tension with academic considerations, because increased funding can often improve a program. (E.g. Tatum, 1987:650)
- Academic interests are not the same as those of the wider constituency. For example, schools need to anticipate future needs at least as much as serve present needs. Some curriculum models are concerned with understanding the reasons why something works, while practitioners are satisfied to know that it does. (Bragg, 1984:191)
- The school's constituency might change their expectations of the school.
- The kind of prospective students available might not suit the roles of graduates.
Although it is admirable that schools strive for consistency among things over which they sometimes have little control, complete consistency is an impossible ideal.
Ferris's solution. Ferris confronted the problem of lack of external consistency more indirectly by de-emphasizing the accept-reject distinction. He implied that recognition is context-based; anyone wanting to recognize degrees and receive transfer credits must have a similar context to the school where the credits or degrees were earned. This implies that credit is not always transferable because it can less easily cross contextual boundaries. (ATA, 1987:7f)
Such a cautious attitude seems more accepting of weaker programs but implies less recognition than accreditation has traditionally given. This attitude has the advantages of accounting for both context and the value-added effect, but it is not much of an answer to give accreditation more freely and have it mean less. The principle of only transferring credit to compatible programs applies anyway, but less quality assurance for recognition is little help to anyone asking whether credits and degrees are worthy of recognition.
Internal consistency at accreditor level. At the macrocosmic level, the accreditor faces the same overdependence on internal consistency that schools face at the microcosmic level. Bias and conflict of interest problems loom large. Scriven complained that schools in the same accrediting community are "incestuous" when they accredit each other. (1983:252) While nobody respects a degree mill that sets up its own accreditor to accredit itself (Bear, 1980:28), the question is whether a group of schools can accredit themselves. After all, self-regulation is a euphemism for a cartel. All participants share interests in common, potentially biasing consensus groups which function at accreditor level. Apparently responding to the same problem, Kells noted that many accreditors stipulate that peer reviewers must not be staff of schools which compete directly with the evaluee school. (1986:142) Although Kells' suggestion does not solve the theoretical problem, it is helpful at a practical level. JCSEE pointed out that program evaluations frequently have conflict of interest problems, and the challenge is not so much to avoid them as to deal with them. (1981:70; see pp. 24ff, 70ff for further discussion.)
Generic objectives and interschool consensus groups are still systems of internal consistency, simply in a bigger group. Both use a sociological ethical base with its overtones of relativism; something is deemed to be correct if people in a given population agree on it. Stakeholderism, for example, only ensures that the program evaluation will satisfy identifiable stakeholders as well as possible at the time of the evaluation. House calls this a subjectivist ethic based on maximized happiness in a society, adding that an objectivist ethic of "justice-as-fairness" is possible but nobody has suggested an evaluation model using it. (1983:49f) However, Scriven's consumerism shows more than a trace of the idea of justice, and formulations of generic objectives are at least open to rational debate. Accreditors' policies are normally written in a handbook, where they are open to examination and evaluation.
External consistency and consensus groups. Schools can be easier to evaluate if part of the process is already done through adequate external consistency mechanisms. Where possible, a school can be more sure of its standards if its consensus group extends beyond its immediate stakeholders and even beyond the accreditation association to an even wider community of schools and practicing professionals. In this way, schools can have comparable standards (that is external consistency) with each other without having uniform processes and program goals.
A wider consensus group is not always possible. If other schools and experts are not available to form one, a school has no option but to depend on internal consensus. A wider consensus group is not always desirable either. Despite undoubted success in maintaining standards of content, a network of schools which becomes a consensus group faces the temptation to become a private club. A school's interests might conflict with the opinions of the wider academic community. A school might need someone to tell it that its program is all wrong, and it cannot always respond with accusations that others are unwilling to understand nontraditional education. Then again, the school might have an excellent program, but others might be so steeped in "traditional" models they consider the school to be weak. It is also quite possible that consensus groups think mainly in terms of institutional prestige or that they encourage empty typewriter practices that lower program quality.
The British have traditionally approached the problem by using wide consensus groups. While they also expect staff to be well-qualified and their institutions to be autonomous, new schools often seek a consensus group through other institutions. For example, all new British universities have started with an Academic Advisory Committee to ensure the establishment of adequate standards. (Perry, 1976:121; see also Booth and Booth, 1989:282) British-style schools employ thesis readers and examination markers from other schools to ensure comparable standards. London University and the CNAA have both tutored into existence new institutions, which only took more responsibility in granting their own degrees when they could maintain comparable standards. In this way, the new schools had strong programs from their inception.
Some consensus group members can have ongoing administrative authority in the school, such as the members of an examination board. Others might carry permanent portfolios on an advisory board according to their specialist areas of expertise in content, professional practice, or education; alternatively, they might join the board temporarily to do a particular task to a set of specifications. External thesis readers and examination markers can make real decisions according to their areas of expertise without becoming part of the administrative structure.
Subject matter specialists maintain informal links with other people teaching in the same discipline, and the relationship can be important even without organizational links. Brennan calls the interschool network of people in the same field of study the "invisible college." According to one study, it is a more important reference point to British university teachers than their employing school. (1986:152)
Consensus groups can easily be national, and can become international when schools send theses to foreign readers. While American schools tend to sacrifice a potentially wider consensus group to maintain autonomy, the British prefer to forgo some autonomy in order to gain a wider consensus group. Both face the reality that schools cannot have complete academic independence while depending on others for academic standards.
Accreditation models, if they are to be adequate, cannot rely completely on internal consistency. The idea of wider consensus groups and the interests of the accrediting community all imply that external consistency between schools and with schools further afield is necessary and helpful. It is interesting that the SPABC allows members of the accreditor's visiting evaluation team to come from outside their association. (1988:8) This is a helpful trend as it allows cross-fertilization with a wider consensus group and avoids excessive dependence on internal consistency.
Interpersonal Models: Tentative Conclusions
Stakeholderism facilitates eclecticism, being open to many different sources and topics of information.
For example, it is open to information about processes and products, and to the opinions of content experts and program managers. It is similarly open to the opinions and observations of non-experts and consumers. It would be difficult to justify the neglect of any of these bodies of insight. Stakeholders might also want information from outside consultants and empirical studies. Its procedure is highly interactive; Ferris even sees responsive evaluation as a variation of stakeholderism. (1986:4f) Whatever other concepts of quality are in use, interpersonal models greatly depend on consensus and can take advantage of its many strengths.