Quality in Education: Conceptions
Ross Woods
"Quality" is one of education's vaguest ideas. A great many books and articles on educational quality never define it, using their pages to discuss evaluation or being content to promote idiosyncratic views of curriculum or schooling. In a book entitled Improving the Quality of Schooling, for example, Hopkins devotes less than a page to defining quality. (1987:4f)
If accreditors want to certify it, they need to conceptualize it clearly in flexible enough terms to suit various curriculum models and program styles. As these chapters show, no current conception of quality is so faultless that it can stand alone, and hardly any is so weak that it can never be used. This is more than simple eclecticism; the nature of the beast is that it is necessarily multifaceted.
Quality is not necessarily the same as expensive education or big schools (e.g. Solmon, 1981:12), nor is it the same as excellence, which is usually the idea that schools should aim to be of high quality rather than be content with meeting minimum standards. On the point of excellence, Perry notes that academics have traditionally seen their highest goal as the formation of scholars, although only a select few eventually become scholars. Perry observes that an education system also needs to provide for the majority of its students, who take careers outside academia. (1976:53, 282f)
The idea of effectiveness is attractive, but Murphy shows that it is complex and unclear. Its as yet unreconciled views form a microcosm of models analogous to models of program evaluation. (1987:48f)
A key issue in the different views of quality is its referent, that is, what it is that is said to have quality. Referents vary markedly between models, reflecting different conceptualizations of education; where a model's referent is clear, it is identified below.
The conceptions below either come from, or can be derived from, the literature so far, and they undergird different views of curriculum, program evaluation, and accreditation.
The Speaker-approval View
"Purr words" are a category of words which, largely devoid of conceptual content, do little more than signify the speaker's approval. (Leech, 1974:51f) The term "quality" often seems like this; it is the norm rather than the exception to use it without definition. Consequently, to say that a school has quality without implying a clear meaning of quality is to invite the criticism that one is only saying that he likes it. Being conceptually vacuous, this is the most useless of all meanings of quality.
Dressel wryly notes how academics can easily lose their objectivity. They cannot say what quality is, but they are sure they are producing it. They complain about funding cuts lowering quality, but when cuts come, they will not admit publicly that standards are lower. Every program has high quality judged by its own definition of quality. (1976:381)
The Ineffable View
Some educators conceive of quality as too complex and situation-specific for human words to describe. For all practical purposes, infinitely many things are part of "quality" so no single description can suffice. (George, 1982:46; cf. also Pirsig, 1975:178; HEC, 1992:5f) As Dressel says, there are no clear criteria of success that have met general consensus, and with nothing to talk about, the ineffable view is sometimes no different from the speaker-approval view of quality. (1976:381) Quality is like the emperor's new clothes.
The great weakness of this view is that it prevents much further discussion. Fortunately, it usually has some friends. One is the idea that the goals of education are ultimately ineffable (discussed later) and the other is the consensus view of quality.
Value for Money
Another contemporary view of quality is value for money. It is actually a kind of non-definition in that it provides no definition of quality itself, that is, of what it is for which one pays. However, it shows that, given the law of diminishing returns and normal financial constraints, program directors must find an optimum cost/benefit compromise.
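The logic of such a compromise can be sketched in a few lines of code (a purely illustrative example in Python; the budget levels and benefit figures are invented, and real educational benefits resist such easy quantification):

    # Hypothetical budget options and their estimated educational benefit.
    # All figures are invented; note the diminishing returns at higher spending.
    spending_options = [100_000, 200_000, 300_000, 400_000]
    estimated_benefit = {100_000: 150_000, 200_000: 260_000,
                         300_000: 330_000, 400_000: 370_000}

    def net_value(cost):
        # Crude "value for money": estimated benefit minus cost.
        return estimated_benefit[cost] - cost

    optimum = max(spending_options, key=net_value)
    print(f"Best compromise: spend {optimum:,}")  # here, 200,000

Past the optimum, each extra dollar buys less benefit than it costs, which is why the biggest budget is not automatically the best value.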
This, however, gives rise to the question of who must decide what is a reasonable compromise and what is scrimping that unduly affects program quality. From the consumers' viewpoint, a product that is not as good as another might be preferable if it costs considerably less. This might even be construed as an ethical issue; at some ill-defined point, very poor value for money appears dishonest.
Structuralism
Another of the least helpful concepts of quality is structuralism, the idea that quality is organizational structure. Some accreditors have largely assumed that the prominent element in quality education is good institutional, financial, time, and academic management, together with prescribed employment policies and predetermined credit structures. Brennan mentions it as "organizational characteristics" as opposed to intellectual content. (1986:153)
Despite having some value, these things do not in themselves give a reasonable assurance that students learn anything. By overemphasizing program management, they give inadequate attention to teaching and learning. (Hefferlin, 1974:154f, 165f; also Pomerville, 1973:31)
Comparative View
Some writers see quality as a comparison between institutions. For Seneca and Taussig, quality is what the most prestigious institutions do that others want to emulate. (1987:27) Similarly, Solmon reports the peculiarly American idea of rating schools in order starting from the "best"; he wisely adds that it is wrong and harmful always to view those not at the top as failures (1981:8, 12). Johnes and Taylor suggest that schools compare the quality of their graduates. (1987:582) George gives one description of quality as reputation, especially in the opinions of peers. (1982:46)
This idea of quality is also very weak. Admittedly, some schools simply are better than others, but the view is fraught with problems. While it can stimulate well-known institutions into healthy competition (Solmon, 1981:9), it is not necessarily accurate (p. 8) and has nothing to offer small, new, or innovative programs, no matter how good they are.
Its greatest weakness, however, is that it completely begs the question of what ranking criteria one uses. Webster's survey of various ranking methods is particularly good. He lists their weaknesses and strengths, and concludes that no ranking system by itself is adequate. He also notes that it is "remarkable" how few quality rankings have been based on how much students learn. (Webster, 1981)
Intensity
Peters and Austin mention that intensity and dedication do more to ensure a high-quality program than what is actually done. An otherwise bad program with lots of vitality, empathy, and intensity can be much better than a program with good methods and no intensity. (1985:465ff) According to this view, quality refers to teachers' attitudes.
This factor affects educational research in the form of the Hawthorne effect. That is, experimental subjects respond differently when they know that they are part of an experiment; they try harder to make the experiment a success. For example, an experimenter might test a very poor teaching method. If the experiment's teachers realize they are part of an experiment using a new method and become enthusiastic, they might actually produce good learning results. That experimenters in education work so hard to neutralize this factor is acknowledgement of its power. (Cf. Tuckman, 1978:102f)
This view is most relevant to the interpersonal models of evaluation, which give much more prominence to the personal opinions and values of program participants.
Environment or Experience
In this view, quality is that to which students are exposed which evokes learning, not so much what students are shown to have learned. Quality refers to the environment or the student's personal experience of learning.
George describes it as good process characteristics, such as good delivery systems, teaching, intellectual climate, and school morale. (1982:48) Hopkins expresses it as "learning climate" and the "teaching-learning process" (1987:4), and Bliss as students realizing their potential and their individual learning goals (1988:8). Carr speaks of it as the school's aspirations and its ethos (1986:25), almost equating it with excellence. Levine talks about the environment and culture of the school as shared beliefs and values, and their result in the quality of teaching. (1986:150-152; see also Barnett, 1987:281) This view best suits interactive models of teaching and humanistic models of curriculum, and it indirectly implies an interpersonal model of program evaluation.
The Value-added View
A number of writers list the value-added view as a type of quality. This view asks how far the student travelled during his study, without enquiring where he was when he started or finished. In other words, how much "value" did education add to the student? The idea is that a remarkably good student would start well ahead of the average student, but both would learn equal amounts. According to this view, the referent of quality is the aspect of the school seen as a catch-up or accelerator program.
Strictly speaking, value-added evaluation is empirical and belongs to the quantitative rather than qualitative class of program evaluations. Consequently, it is not a suitable way for an accreditor to evaluate schools. Tyler proposed value-added studies in his theory of curriculum and it became a specialized field. A researcher uses a predetermined set of objectives to test students before and after instruction so he can draw conclusions about how much "value" was added to them. In other words, this type of study aims to show how much the school taught, and whether the school only evaluates what students already knew. Although such studies are generally too difficult for widespread use, some government authorities have tried to demand that schools make them an internal procedure.
(Tyler, 1949:106; Bloom, 1970:28-30; Stake, 1970:87; Ball and Halwachi, 1987:400; Browne, 1984:46; Ewell, 1987:24; HEC, 1992:7; cf. also Solmon, 1981:11f. Barnett [1988b:17-20] and Dressel [1976:7] discuss some theoretical problems with value-added studies.)
At an experimental level, this closely resembles an objectives and product approach. The difference is that adding value means that students finish at different places, whereas uniform objectives imply that all students arrive at about the same place. Otherwise, value-added thinking cannot stand alone; it relies on some other model (usually product) even to measure the amount of value which is added.
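The arithmetic of a value-added study can be shown in a minimal sketch (Python, with invented pre- and post-test scores):

    # Two hypothetical students measured against the same set of objectives.
    students = {
        "strong entrant": {"pretest": 80, "posttest": 100},
        "weak entrant":   {"pretest": 40, "posttest": 60},
    }

    for name, scores in students.items():
        gain = scores["posttest"] - scores["pretest"]  # the "value added"
        print(f"{name}: {scores['pretest']} -> {scores['posttest']}, gained {gain}")

On a pure value-added reading, the school has served both students equally well; both gained twenty points, although they finished at very different standards.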
The concept of value-added quality has several important advantages. It is useful for monitoring student progress. Besides, a weak student who puts in the effort to become average is in a way better than an above-average student who is satisfied with maintaining average results, even though both get the same results. The student who learns a great deal from a practicum gets more benefit than an already-capable student who does no more than what he already knows. The value-added view can reveal that an elite school with very capable graduates might really be achieving very little if it accepts only the best applicants and does not exploit their potential. Similarly, "average" schools are very successful if applicants of average ability generally become above-average achievers.
In essence, value-added thinking rejects absolute standards. In fact, its greatest value lies in showing that standards cannot be absolute. Lower standards are justified when culture, not educational short-cutting, makes schools include lots of basic teaching.
Examples abound. Schools inevitably achieve less when applicants have had an "empty typewriter" high school education. With a preconception of education that discourages creativity and initiative, students are disadvantaged when they enter a program that focuses on real learning. Many non-Western students have less general knowledge about their field of study when they commence study. (This is particularly the case in theological studies where many students are first generation Christians and can only start at a lower rung on the ladder.)
In studying culture-related subjects like theology, the literature in that language and culture might be at such an early stage of development that schools cannot aim to have a strongly academic program based on a long tradition and a large literature. "University entrance" implies vastly different levels between countries with very different standards of general education. One country's Bachelor's degree can be another country's Master's degree. Such realities show that international accreditation cannot completely escape relativism.
Despite the valid variations of degree meaning between schools and countries, value-added thinking is no excuse for overly relativizing the meaning of degrees. The approach means that weak students could pass at a very low standard by travelling as far as stronger students; no longer would one student's degree be similar to another's. In the same way, a weak school could use the value-added principle to try to justify degrees with very low standards. If an association wants graduates of its member schools to be able to continue education elsewhere, it needs to maintain the standards of degrees that are normally used as prerequisites for further education.
Metricism
Metricism, a term adapted from Ferris et al.'s analogy of the strict definition of the meter length (1986:3), is the idea that educational quality can be reduced to statistical information. For example, accreditors can require schools to have a certain ratio of full-time staff to students, so many books in the library, so many doctorates, and so much lecture-room space.
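The logic of metricism is easily caricatured in code (the thresholds and school figures below are invented for illustration):

    # Invented accreditation minimums: quality is declared present whenever
    # every statistic clears its bar, whatever happens in the classroom.
    MINIMUMS = {"staff_per_student": 1 / 15,  # at least one teacher per 15 students
                "library_titles": 6000,
                "pct_staff_with_doctorates": 40}

    school = {"staff_per_student": 1 / 12,
              "library_titles": 7500,
              "pct_staff_with_doctorates": 55}

    meets_all = all(school[key] >= MINIMUMS[key] for key in MINIMUMS)
    print("Accredit" if meets_all else "Defer")  # says nothing about learning

Such a check can be passed by a school where little learning occurs, which is the burden of the criticisms below.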
Metricism is not at all new. In 1976, Dressel wrote that quantitative evaluation (including measurement) was already obsolete as a means of determining program quality (p. 3). It is surprising, then, how well its variations survive in the literature. Metricism takes several major forms, all of which obscure rather than illuminate the meaning of quality.
a. Length of time. The longer a program takes, the higher its supposed quality. For example, it is easy and practical to describe study in numbers of hours, and degrees in numbers of years of full-time study. It is also helpful to plan for course work to take a specified length of time during which students should work to capacity.
Nevertheless, the use of time totals alone is very inadequate. It says nothing about what type of learning students have, and is as feeble as any other effort to reduce education to numbers. Three-year bachelor programs can be quite equivalent (although not identical) to four-year bachelor programs by being more disciplinary and scientific.
b. Resources. Quality, supposedly, is a simple list of statistics. The school should have a certain ratio of full-time staff to students, so many books in the library, so many doctorates, and so much lecture-room space. (Ferris, 1986:3f) However, a lecturer might be boring, confusing, outdated, or over-opinionated. He might set a poor example, lecture simplistically, give too many dispensations, or use class time irresponsibly. Yet some of the most reputable accreditation agencies will accept him if he has the right degree from the right school and schedules the right amount of class time.
c. Standardized Tests and Examination Results. Some writers suggest this as the best quality measure. For example, Bee and Dolton (1985) suggest that in the English system, the proportion of first class and upper second class honors degrees compared with student intake three years earlier adequately gauges "quality." Lerner (1986) argues that aptitude tests of verbal and mathematical reasoning ability give the most reliable and standardizable results. She mentions the Scholastic Aptitude Test (SAT) as a good example (cf. also Hopkins, 1987:4f).
Tests can be very valid if they qualitatively examine real learning or aptitude, and a school and its accreditor would be foolhardy to ignore below-average scores on a relevant, standard test. The SAT has the advantage of being as free as possible from bias toward any type of curricular content (Lerner, 1986:189) and it seems to work very well for the population for which it was designed. Governments can usefully require system-wide standardized testing when student populations are sufficiently homogenous and institutional goals are very similar. Even then, however, the criterion of quality is what the tests measure, not the scores themselves. That is, test scores can only correlate with a concept of quality assumed in a given test.
For accreditors, testing and examination results are not at all a suitable model of quality. Not only do they share many of the weaknesses of indicators (see below), but tests can vary in quality, and suitable tests do not always exist. Standardized testing does not easily lend itself to a highly stratified society with many ethno-linguistic groups. Private international accreditors (such as those of ICAA) have no right to impose standardized testing on autonomous schools; schools would only accept it voluntarily if they saw it to be beneficial. Deciding the role of examinations in a school's program is likewise the prerogative of the school, not the accreditor.
Moreover, emphasis on examination results can take a very extreme form. As Hopkins points out, it reduces educational quality to examination-passing, and teaching to test preparation, thus possibly lowering the quality of teaching. (1987:5) This goes further than admitting the validity of means-ends thinking; it creates two potential dangers. The first is the assumption that accreditors have the right to impose a strict means-ends curriculum value on schools at the expense of the role of teaching as an interactive process between teachers and students. The second is that it is counter-productive if it moves students' attention away from understanding the subject and onto rote-learning to pass examinations.
d. Educational indicators. This approach assumes that quality is expressible in the kinds of standardizable statistics preferred by planners and decision-makers. Almost any statistic, no matter how indirectly related to quality, can become an educational indicator, including time, resources, and test results. The idea of indicators is not new, and its literature is as extensive as it is inconclusive.
It is probably fair to say that many indicators are helpful. Given a poor showing on relevant indicators, most responsible schools would take a careful look at their program. Indicators can be valid if they suit all schools equally. Moreover, most statistics have some underlying value; for example, a criterion asking for six thousand usable titles in a library really tries to ensure that students have adequate access to complex information. (Cf. ATA, 1987:4-6)
No indicator, however, represents quality. Rutherford claims to have developed indicators for qualitative measurement in higher education, but they depend so greatly on judgments by peer review and on arbitrary categories that they show little promise as a system of indicators. (1987a; see also Rutherford 1987b:103)
Lining up valid objections to indicators is easy, and many of them apply to statistical measures of quality in general. They show either that indicators fail to indicate quality or that present methodologies are inadequate:
1. The theoretical base of indicators is still very weak (Stern and Hall, 1987:6). They are not based on a substantiated model of education (Smith, 1988:489), so it is not surprising that there are too many suggested indicators but as yet too little consensus on which ones to use. (Eide, 1987:10) As a model is a simplification of reality, a single model cannot be comprehensive enough to portray reality accurately; even a theory can hardly claim to be the only possible explanation of all phenomena in a field of study.
2. Consequently, lists of statistics are too inflexible. There is no consensus on "good education" because different individuals and groups value different outcomes differently. (Eide, 1987:10) Many indicators are so restricted in scope that they can be irrelevant to the concerns of those involved. Selecting certain aspects of a program for making statistics can exclude other information more relevant to particular programs. (Dyer, 1973:33)
3. Quantitative methods are not yet adequate. For example, many important aspects of education are not yet quantifiable (Eide, 1987:9) and the tendency is to measure unimportant things that are easy to measure. (Rutherford, 1987b:96; Barnett, 1988a:101) Ways of aggregating statistics are still too imprecise (Stern and Hall, 1987:6) and defining outcomes is still problematical, with no concise, completely unambiguous method. (Eide, 1987:10; Dyer, 1973:19; Barnett, 1988b:21) Even if such outcome definitions were possible, it is questionable whether they could be epistemologically honest, as they would refer to simple knowledge. (Atomistic objectives are discussed in a later chapter.)
4. Some indicators depend on human judgments, assuming that individual biases will average out. However, statistics become unreliable when biases consistently favor particular values. (Dyer, 1973:32f)
5. Indicators are difficult to interpret. (Smith, 1988:488) For example, dropout rate is a statistical measure without a philosophical value-base. Even if it is agreed that high dropout rates are undesirable, it is less certain what they indicate. A high dropout rate alone might signify poor student services, inadequate admission requirements, or dissatisfied students. It might also signify high product standards in a good school, or particular delivery systems like correspondence, which normally have very high dropout rates.
6. Indicators are political. Politics plays a role in their formulation and hinders agreement on concrete forms. They are subject to political incentives and disincentives. (Smith, 1988:489) The use of incentives puts unfair pressure on evaluees to conform, or even to "pervert" the reporting system. (Smith, 1988:490) For example, managerialists can use indicators to exercise power over academics. (Barnett, 1988a) While the intended accountability is laudable, this is hardly a valid use of power in accreditation.
7. Teachers might not take much notice of them anyway. (Fuhrman, 1988:486)
8. Indicators assume comparability between schools and similarity between delivery systems. The assumption does not hold when programs are very different; both schools and school populations can be different. (Stern and Hall, 1987:6; Smith, 1988:490) To make matters worse, decisions about indicators are seldom made at local levels, where they would be more useful. (Fuhrman, 1988:486) They become even less meaningful when students come from different cultures and educational backgrounds, and when institutions vary in models of schools, delivery systems, philosophies of education, and kinds of degrees. As statistical measures cannot suit all situations, accreditors cannot justifiably enforce them. For example, some programs, such as individually tutored higher degrees, might require a much higher staff-student ratio than is average.
9. A set of indicators only reflects selected aspects, never a whole educational system. As a result, it inherits the limitations of quantitative evaluation without the strengths of either qualitative or quantitative evaluation. At least in good quantitative evaluation, evaluators can select program aspects that are relevant to the programs being evaluated.
10. Indicators are indirect; they hardly ever deal with what students actually learn. (Barnett, 1988a:102) Consequently they make unexamined assumptions (cf. Stern and Hall, 1987:6). Statistics are by nature arbitrary; they give a number rather than the principle for determining how much is enough.
Some indicators relate to teaching rather than learning. Means consequently become ends; if indicators are used to measure means, then program personnel aim to improve their means, without regard to the real aims of the program. (Barnett, 1988a:102)
Many are even further removed from student learning, emphasizing factors like facilities and staff qualifications, which at best only promote good teaching. (Cross et al., 1974:166) For example, accreditors become interested in the percentage of staff with doctorates. Their assumptions are questionable; it does not necessarily follow that having more staff with doctorates will somehow promote better research supervision, that doctorate holders are better teachers, or that they use their extra knowledge in everyday teaching. Another false assumption is that lower student-teacher ratios mean that students get greater personal attention, or that more books in the library mean that students will read and learn more from them. Metricist systems never seek to prove the truth of these assumptions. (Dyer, 1973:19f) Yet another indefensible assumption of many indicator systems is a cause-effect relationship between teaching and outcome, which is very hard to prove. (Ibid.:20f) They incorrectly assume that if the teacher teaches, then the students learn. (When they do not learn, the "empty typewriter" syndrome occurs.) Hard evidence of a cause-effect relationship is especially difficult to get in forms of higher education where students initiate learning or study independently.
Statistics make too many other assumptions. Of course it is normally true that library-dependent programs can suffer through lack of books, that poorly informed teachers tend to lower standards, and that overcrowded classrooms do not encourage good learning. The situation is much more complex than that, as later chapters will show. Suffice it to say at this stage that these assumptions only tend to be true, and are sometimes untrue. The questions then arise, "Under what conditions are they untrue? Are they ever a hindrance?"
(See also Smith, 1988; Rutherford, 1987a; Ball and Halwachi, 1987; Haley, 1988; Madgic, 1988; Litten and Hall, 1989; Gregory, 1991:49)
The Product View
In this view, the quality of education is the quality of the objectives it reaches. Using means-ends thinking, it inquires into the "product," that is, what the student has done to show that he has learned something as predetermined in a set of objectives. Its essential values are purposefulness, fitness for purpose, and the articulation and realization of purposes. HEC (1992:6) even goes so far as to name the product view "fitness for purpose".
The original analogy was to a factory; manufacturing processes contribute to making a product. Many educators accepted a sharp process-product differentiation for a long time, but the distinction has now become less sharp. In a field such as education, process and product are so intimately interrelated that too distinct a differentiation is often unrealistic and artificial. (Cf. Dressel, 1976:51, etc.)
The product view is closely akin to behavioral objectives and particular models of curriculum, evaluation, and accreditation. It is so difficult to discuss them in isolation that it is perhaps best to discuss the general issues under the topic of quality.
Besides normal course work objectives, another important type of product is research and writing projects. For this reason, major thesis programs normally do not have a semester hour rating; the quality of the product is far more important than time expenditure. No school ever accepted a doctoral dissertation based on how long it took to write.
Some sources mention product quality as little more than the ability to reach goals (e.g. Hopkins, 1987:4). Some additional elements are no more than part of goal-reaching, such as reaching the target group of a program and meeting real needs. (House, 1982:5-8; Freedman, 1987:165-167) Other authors mention several variations of this view of quality:
a. Ways of reaching goals. In a means-end mentality, this is the means. The idea is that a school should not only show that it achieves its goals, but that it should check the quality of the ways it achieves them. Young includes it in his definition of quality (Young, ed., 1983:450f), and Smith divides it into two parts, one being inputs and resources and the other being processes (1988:488). Wentling (1980:17) includes the evaluation of processes because qualitative evaluation by definition evaluates the whole program, not just outcomes. Brenninkmeijer et al. (1985) mention the efficient use of means, appropriate planning, and cost-effectiveness. Similarly, House asks whether a program is efficient, how much it costs, and if it is cost-effective (1982:5-15). If a program achieves its goals, it might still have a quality problem if it has poorly organized content, wastes its resources, costs more than the institution can realistically afford, or costs too much for what it produces.
b. Conformance between goals and actions. (George, 1982:47) If what the school does suits its goals, then the program has quality, on this count at least. Needless to say, organizations easily busy themselves with activities that do not support their goals. The interpersonal models say more on this matter because the product view lacks the means to differentiate between goal-achieving and what the people in the program really do.
c. Immediate post-instruction product. The behavioral objectives literature often assumes that product refers to what students can do immediately after instruction. Unfortunately, this means that teachers should formulate specific objectives for each instruction period. Accreditors cannot enforce this and interactive, process-oriented teachers do not do it anyway.
d. Product at the end of the subject. For example, it is not too difficult to write a list of specific, useful objectives for an individual subject, like Philosophy 206 or History 101. Means-ends curriculum developers suggest this level of goal, although they sometimes call them "aims" or "general objectives."
e. Graduation product. Some writers refer to the product at the end of the program of study, although the literature does not differentiate it from immediate post-instruction product or end-of-subject product. These conceptions of product often appear to be the meaning of effectiveness (House, 1982:5-8) and outcomes (Smith, 1988:488), and are sometimes translated into test scores. (House, 1982:5-8; cf. also George, 1982:48f) In this case, quality refers to graduates some time after graduation. (See also Elbow, 1971:241)
The product-at-graduation view is also quite compatible with research institutes and assessment programs that are more concerned with a tangible product (a thesis or a passed examination).
f. Culminating product. Some products are concrete pieces of work which represent the highest level of achievement of which the student is capable. Students in many Indonesian Bachelor of Theology programs formally present both a minor thesis and a project-like report of the major intensive practicum. These represent the culminations of the academic and practical aspects of their program.
g. School product. Especially in product-based accreditation, an important kind of product is defined in the school's statement of mission.
h. Eventual product. This kind of quality asks what students eventually do after graduation. Johnes and Taylor ask whether university graduates get jobs and how good their jobs are. (1987:582; also Barnett, 1987:281) Another variation mentions the accomplishments, professional expertise, and problem-solving ability of graduates. (Brenninkmeijer et al., 1985; Solmon, 1981:7, 11; Freedman, 1987:105)
i. Generic objectives. In this view, quality is the extent to which a school's goals and activities contribute to achieving the aims of higher education. The literature of the subject is very large; many academics write their own lists of goals for theoretical reasons with little practical purpose. Some of the lists of objectives are useful in that they articulate assumptions which would otherwise be left unsaid. American education generally has included enculturation and communication skills, while British education tends to emphasize critical thinking. (See e.g. Dressel and Thompson, 1973:178; Dressel, 1976:31; George, 1982:49ff; Barnett, 1988a:100, 104-6) Perhaps Barnett is correct when he says that defining higher education purposes is a hermeneutic exercise; the people in the system are constantly re-interpreting their experience. (Barnett, 1988a:105, based on Habermas, 1978)
As product accreditation depends on each school being an internally consistent unit, the idea of generic objectives is the basis for consistency between schools and is one of the main ways of distinguishing between higher education and anything else.
Nevertheless, it has its own problems. Firstly, there is not much point in evaluating particular lists because their content varies so greatly. As Barnett says, the problem is not that nobody knows, but that there are too many conceptions. (Cf. Barnett, 1988a:100) Besides, there is little hope that a major network of autonomous private schools will agree upon an expression of higher education goals that is concrete enough to demonstrate what schools should and should not do. It might even be impossible, because higher education goals are too generalized to have the advantages of specifics. It is doubtful whether such a statement of goals would be used anyway; there is little motivation to spend decades to test a statement operationally (even if it were possible), and little impetus to change should it be shown to be wrong.
Defining higher education as aims (i.e. teleologically) is not the only alternative either; it is equally valid to define what it is (i.e. ontologically).
Oddly enough, most lists of aims of higher education share a sameness, and the differences between them are mostly inconsequential. That institutions of higher education mainly teach and perform research is clear, but the correct or ideal relationship between teaching and research is a perennial and largely unresolved issue.
The goals of education are at best more a tentative conclusion than a starting-point in accreditation; a study of the types of degrees, the models of curriculum, the taxonomy of objectives, and the assumption of cognitivism all point towards particular conceptualizations of the types of knowledge that higher education is supposed to produce. No matter how helpful descriptive statements of educational goals are, they are not a prescriptive concept of quality. As descriptions, they are good examples of how strictly linear thought does not work. It is impossible to start at an ideal first cause and use it to prescribe programs. It is better to work with the subject at hand and describe its assumptions explicitly, defending them where necessary.
Strengths of the Product View
As an approach to quality, the product view has some major advantages. Firstly, it is flexible. It fits any program no matter how unique or contextualized, as long as it can express its goals as objectives. It has already been noted that the JCSEE standards for qualitative evaluation still tend to use means-ends thinking even when evaluation does not use product definitions. Secondly, as an epistemologically "hard" view, it assumes effability; the issues of quality and the goals of education are essentially expressible in language. Another implication of being epistemologically hard is that students produce concrete evidence of learning; the view has the advantages of behavioral objectives.
Thirdly, it provides a rational means-ends basis for formulating and evaluating programs; it is hard to deny that means should suit their ends. For example, whether on campus or in extension, it is unfair to provide an education that includes only theoretical and research skills but expect graduates to be fully-developed practitioners. It is equally unfair to expect practitioner trainees to have the same academic skills as their scientific counterparts.
Fourthly, the concept of product is valid and necessary; students need to be able to do the job for which they are trained. If they cannot, then the school has failed no matter how high its academic standards are and no matter how much its students learn. Besides, used correctly, the product approach helps minimize the empty typewriter syndrome, which is the result of clear means but unclear ends.
Fifthly, it provides a way to respond to students who are already practitioners and who might, to a large extent, already be producing the "product" before graduating. The hoped-for change is then the discrepancy between the student's present skills and the program objectives.
Sixthly, it suits both teaching and non-teaching schools. Schools that teach need good teaching. Assessment and research programs by definition do not teach and depend totally on product evaluation, at least as far as product can be separated from process. Admittedly, the notion that means should have quality is inconsistent with the view that only the product is important; the two are mutually exclusive. However, it would be an extreme view that a teaching school need not take any responsibility for its teaching, that is, its means. Otherwise, the two notions are alternatives rather than contradictory opposites.
Solving Some of Its Problems
Some of its most important problems lie within reach of solution. It is easy to identify erroneously the product concept of quality with all the weaknesses of behavioral objectives. The most valid criticisms do not refer to the concept of quality but either to kinds of objectives or to the content which they represent. A later chapter responds to these problems in greater detail.
It might be argued that the product view does not have inbuilt criteria to evaluate objectives, in the same way that some have argued that it has no clear concept of the sources of objectives. In reply, it must be said that the criteria for any one set of objectives are highly complex; they include context, curriculum presage (philosophical presuppositions), learners' needs, and feedback from previous implementation. (Cf. e.g. Print, 1987:22, 26f)
Furthermore, a strict product mentality does not suit people who prefer to think in terms of interactive processes. (Print, 1987:26) Dressel quite sensibly points out that many objectives reflect the learning processes needed to achieve the objective, not just a product. (1976:51) It can be better to simply change the mindset and keep the essential values. The product conception of quality really refers to what students learn. To criticize it is to say that education should be aimless or ineffable, hardly rational approaches to education.
Too many levels? It is both a strength and a weakness that the product view has so many different levels of product. Unfortunately, there is not much consensus about which ones are most important, and some are not normally differentiated. The list of levels gives the false impression that there are too many of them for accreditation purposes, and that objectives are intentionally given excessive emphasis.
In their favor, not all levels affect every program and not all affect accreditors. The levels almost make up a taxonomy because most levels supposedly subsume all levels beneath them. Teaching programs need to take responsibility for the ways they reach goals and the conformity between actions and goals. For the most part, this includes the immediate post-instruction product, which is really the teacher's responsibility as part of the teaching process; the teacher may well decide not to evaluate it. Some levels do not affect assessment programs, which by definition do not teach.
The product at the end of the subject is the lowest level that affects accreditors, and supposedly it is part of the graduation product, which is a way of defining degrees. Culminating products can be the product at the end of either a subject or a degree; it may be the only activity in the program (such as the dissertation in a European research degree), or it may be the highest level, representing what was learnt in previous course work. When students achieve degree objectives, the school achieves its institutional purposes as stated in its statement of mission. Naturally schools like to see how successful their graduates eventually become, both as a source of program feedback, and to see how effectively they have achieved their goals.
The tenuous link between cause and effect. A satisfactory product description coupled with successful students does not prove that it is the school which caused the student's progress, especially in long-term programs. Measuring change over long periods is not a reliable guide to the success of the program, as people mature anyway whether or not they are studying. (Tuckman, 1978:97f) That is, evaluation procedures need methods other than the use of objectives. Additionally, student success at any stage is not totally dependent on the school; one cannot presume a strict cause-effect relationship between the school and later life. Some good students could succeed in whichever school they chose. Graduates of even the best schools can drop out or fail professionally after graduation. It is almost a natural law of education that every program will produce some unintended outcomes; there will always be students whose learning will differ from what the school has planned.
Although a study of eventual products might give the best idea of overall success and be a very useful source of program feedback, the link between graduation and eventual products is quite vague. Its conclusions can say no more than what has tended to be the case, and cannot specify the extent to which the school gave graduates the knowledge that made them successful. Moreover, many programs simply do not have goals for eventual products, and such goals do not suit non-career generalist programs.
The links are very tenuous between what the student can do immediately after instruction, at the end of a subject, and at graduation time; they become even more tenuous in a long-term program. This relationship is a proverbial can of worms because it interrelates the value of instruction and the meaning of the degree, and a later chapter discusses it more fully.
The weak link between means and ends is not so much a fault in the product view of quality, but a warning against presuming too much in evaluation. Many other models do not even ask these questions. It is wrong to presume that evaluations should produce findings as certain as those produced under experimental conditions. In qualitative program evaluation, it is not necessary, possible, or even very helpful. (Cf. Cronbach, 1980:4f, 11)
A softer view of product. Many of its other so-called problems modify the model, showing how it includes epistemologically soft elements. As Houle has said, education refers to complex aspects of human beings which are highly resistant to mechanistic formulations. (1978:183) Seen in this light, some of these "problems" are not much more than valuable insights on how the model works. For example, the extremely hard version of the model assumes that programs are formulated logically in prescribed steps, whereas they are actually negotiated between people with different opinions who must come to consensus. It is worth remembering that means-ends curriculum evaluations use interpersonal and open-ended feedback systems.
As another example, practicing curriculum developers cannot follow the strict order of starting with objectives. The model has two starting points: the people in the program, who have a particular concept of what they intend to do, and the multifaceted needs of the target population. Besides, curriculum developers in reality seldom try to follow the order. (Houle, 1978:172; Skilbeck, n.d.:26; Print, 1987:26) It is better to see the product view of quality and the means-ends curriculum as a rationale rather than as a rigid method. Certainly Skilbeck's model of curriculum is based on that "dynamic" view of means-end elements. He uses means-ends thinking, but does not use step-by-step formulation to predicate a program upon its objectives. (Skilbeck, n.d.)
Moreover, Solmon says that, ideally, a product is a school's ability to meet its institutional goals, but in practice it is the available resources which correspond to the probability that a school will reach its objectives. (1981:7, based on Troutt, 1979)
Similarly, "product" is a moving target. Program goals are mainly imperfect strategies to meet a perceived set of needs, but real needs faced in the field can be quite different. Consequently, by attempting to meet real needs, the program can run quite differently from its design. Completely static programs simply do not exist; evaluation and modification start no later than when implementation begins, and sometimes even before then. (Cronbach, 1980) Programs do not actually produce exactly what they intended; the actual product differs from the intended product, and this is not necessarily bad. (Browne, 1984:49)
Scriven has complained that product evaluations fail to evaluate program goals, side-effects, and factors not included in the goals (such as cost, lost alternatives), because they use program goals as the criterion of success. (1986:63) He also adds that the "rhetoric of intent" is no substitute for evidence of real success and that side-effects can be more important and desirable than intended products. Program goals can even blind evaluators to anything other than what they see in the light of the goals. Not only that, a program's goals can be very different from what actually happens in the program, and the evaluator needs to evaluate the whole program. (Scriven, 1974:34-42) This does not mean that product evaluation is invalid, but that it needs an infusion of the wider scope and naturalistic methodology found in other models.
(See also George, 1982:48f; Kaiser et al.: 1981:82. Beard et al. is one of the few books specifically on objectives in higher education, but adds little to other works.)
The Consensus View of Quality
The quality of a program, at least in this view, is whatever a group of people decides it is after discussing it in the light of their shared and competing values. The group normally contains different interest groups which must negotiate with each other. The dynamics of the organization determine what constitutes a consensus, who should reach it, and how they reach it. That is, quality is an interactive process resembling interactive models of teaching and negotiation and dynamic models of curriculum; its referent is the program conceived as a complex whole. The NUS puts it another way, saying that it is futile to seek a universal definition of quality because quality is a value judgment made according to the values of particular people or groups. (1992:24f)
The idea of consensus plays an important role in maintaining standards, particularly relating to content. Consensus groups need to be large and capable enough to functionally maintain standards. For accreditors, the matter is rather simple; whichever way a school chooses, it must show that it has a capable consensus group. Tatum interprets Freedman's view of quality as an issue of perspective, mentioning three possibilities: the producer, the consumer, or a composite of both. (1987:650) The undeveloped producer-consumer theme echoes the issues of goal-free evaluation, and the types of groups reflect some descriptors of the three main concepts of consensus group.
The Importance of the Consensus View
This view is highly influential, especially as traditional academia depends so heavily on consensus group evaluation. Perhaps its main weakness is that it provides only a sociological basis for ethics and educational ideals; it does not subscribe to ideals higher than group opinion. Like product accreditation, it depends on internal consistency, which brings its own share of problems and is discussed in a later chapter.
It has several major strengths. It has usually maintained high content standards, and assumes that infinitely many factors can affect quality. It also has the advantages of being qualitative and epistemologically "soft", and can utilize any other view of quality to which people are willing to agree.
To accept this view is to accept reality. Consensus has decision-making power to determine what will happen regardless of other factors; evaluation is partly a political process and in this sense at least, this view is obligatory. It is important, however, to differentiate between consensus groups as something that will happen anyway and as a positive tool in maintaining standards. (Cf. also Browne, 1984:45; Meyers, 1981:16)
To its credit, it does not impose values from outside the consensus group. Small groups often depend on wider consensus groups; for example, a school can find a guiding consensus in an association of schools, and an association can depend on a national education system or an international network.
Internal to each school. One kind of consensus group is the group of people who play a part in the accountability structures of a school, including members of the board of governors, teaching staff, administrators, and thesis readers. North Americans aim to develop a highly qualified teaching staff so that each school has its own consensus group and so maintains its autonomy. Having a qualified faculty is the simplest and usually most practical consensus group for day-to-day internal quality control, and private accreditors normally require it. For example, Freedman says that the regular campus faculty can become the consensus group. (1987:163-165; cf. also Sadler, 1987:199)
If the regular teaching staff are the only members of the consensus group, a review of their degrees and the dynamics of their ways of forming consensus is sometimes almost enough for accrediting that aspect of the school. Freedman implies that degrees are an adequate assurance of quality (1987:164). In the past, it has been this group that has carried out the institutional self-evaluation for accreditation. Unfortunately, this tends to maintain conservative values, resist innovations, and protect the vested interests of staff. It also assumes that all staff have traditional accredited degrees, and does not work so well for staff whose foreign qualifications do not easily translate into local degrees, or for staff whose ability is equivalent to that of degreed personnel but who are not so certified. (Better equivalency systems could circumvent this weakness if accreditors were to accept them.)
The consensus group extends no further than its own school, raising questions about how big it needs to be to maintain standards. The size of a faculty depends on the school's delivery systems and its optimum student-teacher ratio; some types of school can have too few faculty members to maintain standards. Relapsing to metric criteria does not help; the number of staff depends on the school and the abilities of its staff.
Another kind of consensus group, which is also limited to the school, is the academic advisory council. The council should take active responsibility for standards, not just lend the names of well-known scholars to act as a nominal "rubber stamp." This option is most needed in schools that are very new or small or have staff without traditional qualifications, but it can still play a significant role in larger schools.
Degrees. Accreditation methods have traditionally assumed that academic degrees represent consensus groups. Degrees, presumably, are a measure of expertise according to the standards of the school that issued the degree. A school seeks a consensus group with other schools by accepting their graduates as teachers. For example, if a teacher has a Master's degree from a good school, it is assumed that he has mastered his field of content well enough to teach it in a Bachelor program at an academic standard comparable to his old school. He will remember what his school expected of him as a student and expect something comparable of his students.
One does not need to be much of a philosopher to see the weaknesses of the assumption, especially if it is extended to claim that students will learn as much as their teacher did in his own Bachelor program. It needs some obvious qualifications, including (among others) the teacher's teaching ability, his field of expertise, his concern for standards, and the similarity of student populations.
The problem of credentialism has already been mentioned, and it is not the only problem in the assumption. One cannot assume that teaching staff are unaffected by what they perceive to be the academic standards in the school where they teach. A well-qualified teacher can easily lower his expectations if he perceives that his students are generally of low ability.
Furthermore, some schools follow the instructional institution model so closely that some teachers only appear on campus to teach. Although they determine what goes on in the classroom and often take responsibility in evaluating students, they contribute little to the faculty's policies on quality. In extreme cases, the academic dean alone develops policies on program quality.
Overdependence on degrees as a measure of quality decreases dependence on consensus within schools and between schools. It follows that if these consensus systems are adequate, then degrees are not really necessary; peer review is an adequate alternative as a consensus-based method for schools to guard their content standards. In the context of schools functioning as self-critical consensus groups, it is easy to see why Barnett sees peer review as a descriptor of higher education. (1988a:108) Webster mentions an interesting case where a well-known Harvard department had three full professors without Ph.D. degrees at one time; one had a law degree, one a Master of Education, and one a Bachelor of Arts. (1981:22)
Nevertheless, countries do not solve their financial problems by abolishing money. Degrees are a helpful gauge of expertise and schools will continue to use them, although accreditors should recognize their limitations.