
PERCEPTION OF CHEESE: A COMPARISON OF QUALITY SCORING, DESCRIPTIVE ANALYSIS AND CONSUMER RESPONSES

MARGRETHE HERSLETH1,2,6, MARI AUSTVOLL ILSENG3, MAGNI MARTENS1,4 and TORMOD NÆS1,5
1 Matforsk, Norwegian Food Research Institute, Osloveien 1, N-1430 Ås, Norway
2 Norwegian University of Life Sciences, Department of Chemistry, Biotechnology and Food Science
3 Tine BA, Oslo, Norway
4 The Royal Veterinary and Agricultural University, Copenhagen, Denmark
5 University of Oslo, Blindern, Oslo, Norway

Accepted for Publication January 5, 2005

ABSTRACT

The main objective of this study was to examine the perception of cheeses by
comparing quality scores from expert assessors, descriptive profiling data
from selected assessors and consumer responses. Twelve cheeses were evaluated
by expert assessors and profiled by selected assessors. Five cheeses were
selected for consumer testing and rated for hedonic liking, plus flavor intensity
and degree of soft/firm texture. Analysis of variance and multivariate analyses
of the data showed that the expert assessors’ scores for consistency, flavor and
overall quality correlated positively with descriptive profiling attributes such as
mature flavor/odor, firmness, graininess and dryness of the cheeses. Preference
mapping showed an even distribution of the consumers in the sensory
map, which indicated different sensory segments. Some consumers preferred a
firm cheese with a mild, mature flavor and others preferred a doughy cheese
with more acid, fermented flavor. The expert assessors represented the preferences
of the first group in their scoring procedure.
6 Corresponding author. TEL: +47-64-97-01-00; FAX: +47-64-97-03-33; EMAIL: margrethe.hersleth@matforsk.no

Journal of Food Quality 28 (2005) 333–349. All Rights Reserved.
© Copyright 2005, The Author(s)
Journal compilation © Copyright 2005, Blackwell Publishing

INTRODUCTION

A basic requirement of any sensory quality control system in the food
industry is the definition of standards with tolerance limits on a sensory basis
for the product (Lawless and Heymann 1998). The standards and the tolerance
limits should be based on consumers’ expectations of the products, and it is
important for the industry to quantify and understand these expectations. ISO
(2000) defines quality in the following way: “Quality is the ability of a set of
inherent characteristics of a product, system or process to fulfill requirements
of customers and other interested parties.” When it comes to food products,
sensory properties are probably the most fundamental characteristics to fulfill.
In addition, product attributes linked to nutrition, convenience and image may
also play an important role for consumers. An extensive description of quality
programs and quality rating methods is given in Muñoz et al. (1992). More
recently a review of sensory evaluation in quality control including new
developments and future opportunities was published by Muñoz (2002).
The evaluation of sensory quality of dairy products has traditionally
been done by highly trained experts who recognize product defects and
assign overall quality scores (Bodyfelt et al. 1988). According to Lawless and
Claassen (1993) these methods provide a standardized vocabulary for sensory
defects and are cost-efficient, rapid methods for identifying problems and
scoring their potential impact on product acceptance. However, Muñoz et al.
(1992) pointed out the following critical aspects of using these methods: “The
sensory terms used are vague, integrated and technically incorrect, the terms
are lacking attention to attributes that affect the product’s acceptance, the
scales are inappropriate and incorrect statistical analyses are often performed.”
Published studies that discuss the use of quality scores from dairy experts and
the correlation with consumer acceptability are McBridge and Hall (1979),
Smuskowski et al. (2003) and Lawless and Claassen (1993).
Quality classification of dairy products and sensory descriptive analysis
have different objectives and the methods differ in several ways. Dairy “expert
assessors” (IDF-99C 1997) predict the likelihood of consumer rejection when
defects occur, while “selected assessors” (ISO 1993), performing descriptive
analyses, do not evaluate acceptability at all. The vocabularies used by the two
panels are also different. Dairy expert assessors refer to defects, while in
descriptive tests trained assessors use a detailed description of the products. A
sensory profile allows the sensory scientist to obtain a complete sensory
description of products, making it possible to identify underlying ingredient
and process variables and/or to determine which sensory attributes are impor-
tant for acceptance (Lawless and Heymann 1998). Sensory profiling is there-
fore a necessary tool when working with product development. A review of
descriptive sensory analysis is given in Murray et al. (2001).
Whether working with quality scores or descriptive data, it is often
relevant and important to relate the collected data to consumer acceptance
data. In this article, the main objective was to study perception of different
cheeses by comparing quality scores from cheese expert assessors, descriptive
data from selected assessors and consumer responses. We chose to work with
different samples of one cheese variety. Some cheese samples had a quality
corresponding to established sensory product specifications and some samples
had minor defects. Essential questions were whether the consumers were
sensitive enough to perceive this variation in cheese quality and, if so, how
this variation influenced the reported hedonic liking.

MATERIALS AND METHODS

Products
Altogether 12 cheese samples of the same variety (“Norvegia cheese”)
were evaluated by cheese expert assessors and profiled by a trained panel of
selected assessors. The cheese samples were produced at three different dairies
in Norway, represented different production batches and had a 12-week
storage period. Based on the results from the expert assessors and the sensory
profiling, five cheeses were selected for consumer testing.

Sensory Analysis by Expert Assessors
Five expert assessors, who had extensive experience with sensory evaluation
of cheese, evaluated the 12 different samples in two independent, randomized
replicates. The procedure used is described in Tine (2003) and is
based on IDF-99C (1997). The cheeses were presented coded (numbers from
1 to 24) and the experts evaluated consistency (body and texture), flavor (odor
and taste) (IDF-99C 1997) and overall score (Tine 2003). The scoring was
performed on a 1–5 point numerical interval scale where 5 means “in accordance
with the sensory specification” (corresponds to the highest quality of the
cheese) and 1 means “considerable deviation from the sensory specification”
(corresponds to the lowest quality of the cheese). The expert assessors used
half and whole points for determination of the scores. When the given scores
for consistency and/or flavor were below 4, the expert assessors reported
terms for defects.
Cheese blocks of 5 kg were tempered to 13 °C and the experts sampled the
cheeses with a slicer according to the standard procedure in Tine (2003). Water
was served to rinse the mouth during the test. Serving order of the samples was
randomized.

Sensory Analysis by Selected Assessors
The cheese samples were evaluated by a sensory panel at Matforsk using
descriptive sensory profiling according to “Generic Descriptive Analysis”
described in Lawless and Heymann (1998). The sensory panel consisted of 10
selected assessors (ISO 1993). The sensory laboratory used was designed
according to guidelines in ISO (1988). The surface of the cheese samples was
removed and 50 g were served to each assessor. The cheeses were tempered to
13 °C. Water was served to rinse the mouth during the test. Assessors developed
a test vocabulary describing differences between samples and they agreed upon
a list of 17 attributes in total. No attribute describing appearance of the cheese
was included. A continuous, nonstructured scale was used for evaluation. The
left side of the scale corresponded to the lowest intensity of each attribute (value
1.0) and the right side corresponded to the highest intensity (value 9.0). In a
pretest session, the assessors were trained in the definition of the attributes by
testing samples that were considered as extreme on selected attributes typical
for the cheeses. Each assessor did a monadic evaluation of the samples at
individual speed on a computerized system for direct recording of data (CSA
Compusense, version 4.4, Guelph, Ontario, Canada). Two replicates were
performed by each assessor for each cheese sample. All samples and replicates
were served in a randomized order. The average responses over replicates and
assessors for the significant attributes were used in the multivariate analyses.

Consumer Testing
A consumer panel consisting of 110 consumers evaluated the five selected
cheeses. The consumers were recruited from local clubs and associations in the
community and were selected according to the following criteria: eating
“Norvegia cheese” at least twice a week, being 25–55 years old and not being employed
at Matforsk or the nearby Norwegian University of Life Sciences, Department
of Chemistry, Biotechnology and Food Science. The test was carried out in
sensory booths in the laboratory at Matforsk and the consumers arrived in
groups of 10. The consumers were presented with five coded samples of 200 g
cheese (blind test), each supplied with a cheese slicer, and were
requested to remove one slice from the sample before tasting. Water was
served to rinse the mouth during the test. The consumers were given two
different questionnaires. The first questionnaire asked the consumers to score
the cheese samples for overall liking on a seven-point hedonic, numerical,
category scale anchored with “dislike extremely” and “like extremely” and
with a neutral center point of “neither like nor dislike.” After removing this
questionnaire the consumers got a new questionnaire and they were asked to
report the perception of flavor and texture for each of the cheese samples on
seven-point intensity scales (also numerical and categorical). For intensity
rating of cheese flavor, the scale was anchored with “little flavor” to “much
flavor” with a center point of “neither little nor much flavor.” For intensity
rating of cheese texture the scale was anchored with “soft” to “firm” with a
center point of “neither soft nor firm.” No more information concerning the
products or the experiment was given during the tasting session. Serving order
was varied according to a cyclic design balanced for order and carry-over
effects (MacFie et al. 1989).
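One common way to obtain serving orders that are balanced for position and for first-order carry-over is a Williams design. The sketch below builds such a design for five products. It is an illustration only: the article does not state which of the designs in MacFie et al. (1989) was used, and the function name `williams_design` is hypothetical.

```python
def williams_design(n_products):
    """Return serving sequences as lists of product indices 0..n_products-1."""
    # The first row zig-zags through the products: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n_products - 1
    while low <= high:
        first.append(low)
        if low != high:
            first.append(high)
        low += 1
        high -= 1
    # Cyclic shifts of the first row form a Latin square.
    rows = [[(p + shift) % n_products for p in first]
            for shift in range(n_products)]
    # For an odd number of products the mirrored rows are needed as well.
    if n_products % 2 == 1:
        rows += [row[::-1] for row in rows]
    return rows


samples = [3, 4, 7, 10, 11]  # sample codes presented to the consumers
design = williams_design(len(samples))
serving_orders = [[samples[i] for i in row] for row in design]
for order in serving_orders:
    print(order)
```

With five products this construction yields 10 sequences in which every sample occupies every serving position equally often and every sample is preceded by every other sample equally often. A panel of 110 consumers would correspond to 11 repeats of the set; whether the orders were assigned in exactly this way is not stated in the text.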

Statistical Methods
Analysis of Variance (ANOVA) was performed on the three data sets. The
models for expert data and descriptive data included main effect of product and
main effect of assessor, plus interaction effects between product and assessor.
The effects of products were considered fixed, while the effects of assessors
and the interaction effects were considered random (Næs and Langsrud 1998).
For consumer data we used an ANOVA model with main effects for product
and consumer. The interaction in this model was confounded with the error
term and not estimable due to the design. The analyses of the expert data and
the consumer data were performed using Minitab version 14. The statistical
analysis of the sensory descriptive data was performed using SAS version 8.2.
A reported P-value of 0.01 means a P-value equal to or less than 0.01.
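The consequence of treating assessors as random can be sketched for a balanced design: the product effect is then tested against the product × assessor interaction rather than against the residual error. The code below is a minimal illustration in plain Python, not the Minitab or SAS procedure used in the study; the function name `mixed_anova_f` and the toy scores are hypothetical.

```python
def mixed_anova_f(scores):
    """scores[i][j] holds the replicate scores of assessor j for product i."""
    a, b, r = len(scores), len(scores[0]), len(scores[0][0])

    def mean(values):
        return sum(values) / len(values)

    cell = [[mean(scores[i][j]) for j in range(b)] for i in range(a)]
    prod = [mean(cell[i]) for i in range(a)]
    assr = [mean([cell[i][j] for i in range(a)]) for j in range(b)]
    grand = mean(prod)

    # Sums of squares for a balanced two-way layout with replicates.
    ss_prod = b * r * sum((m - grand) ** 2 for m in prod)
    ss_int = r * sum((cell[i][j] - prod[i] - assr[j] + grand) ** 2
                     for i in range(a) for j in range(b))
    ss_err = sum((v - cell[i][j]) ** 2
                 for i in range(a) for j in range(b) for v in scores[i][j])

    ms_prod = ss_prod / (a - 1)
    ms_int = ss_int / ((a - 1) * (b - 1))
    ms_err = ss_err / (a * b * (r - 1))
    # Product is tested against the interaction because assessors are random;
    # the interaction itself is tested against the residual error.
    return {"F_product": ms_prod / ms_int, "F_interaction": ms_int / ms_err}


# Invented toy scores: 2 products x 2 assessors x 2 replicates.
toy = [[[1, 3], [3, 5]],
       [[4, 6], [8, 10]]]
result = mixed_anova_f(toy)
print(result)
```

The P-value for the product effect would then be read from an F distribution with (a − 1) and (a − 1)(b − 1) degrees of freedom, where a is the number of products and b the number of assessors.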
Principal component analysis (PCA) and partial least squares regression
(PLSR) were performed using the Unscrambler statistical package (Camo,
version 8.0, Oslo, Norway). PCA was used to study the main sources of
systematic variation in the average sensory descriptive data. Furthermore,
PLSR was conducted to study the relationship between descriptive data from
selected assessors and scores from the expert assessors and the relationship
between descriptive data and hedonic liking from the consumers. The variables
were standardized and full cross-validation was applied. Correlation loading
plots were applied (Westad et al. 2003) with circles indicating 50 and 100%
explained variance, respectively. In the correlation loadings plots products
were included as dummy variables (passified in the data matrix) to improve the
visual interpretation (Martens and Martens 2001).
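A correlation loading is the correlation between an original variable and the score vector of a component, so its square is the fraction of that variable's variance explained by the component; the 50% and 100% circles therefore have radii of about 0.71 and 1. The sketch below computes correlation loadings with a NIPALS-style PCA in plain Python. It assumes standardized variables and is not the Unscrambler implementation; all function names and the toy profile are hypothetical.

```python
import math


def standardize(X):
    """Center each column and scale it to unit standard deviation."""
    n = len(X)
    cols = []
    for col in zip(*X):
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1))
        cols.append([(v - mu) / sd for v in col])
    return [list(row) for row in zip(*cols)]


def correlation(u, v):
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def nipals_component(E, n_iter=500, tol=1e-12):
    """Extract one principal component (scores t, loadings p) from E."""
    cols = list(zip(*E))
    # Start from the column with the largest sum of squares.
    t = list(max(cols, key=lambda c: sum(v * v for v in c)))
    for _ in range(n_iter):
        tt = sum(v * v for v in t)
        p = [sum(x * s for x, s in zip(c, t)) / tt for c in cols]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        t_new = [sum(x * w for x, w in zip(row, p)) for row in E]
        change = sum((x - y) ** 2 for x, y in zip(t_new, t))
        t = t_new
        if change < tol:
            break
    return t, p


def correlation_loadings(X, n_components=2):
    """Rows of X are samples, columns are sensory attributes."""
    Xs = standardize(X)
    E = [row[:] for row in Xs]
    loadings = []
    for _ in range(n_components):
        t, p = nipals_component(E)
        loadings.append([correlation(col, t) for col in zip(*Xs)])
        # Deflate: remove the part of E explained by this component.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
    return loadings


# Invented toy profile: 4 samples x 3 attributes (the third doubles the first).
toy_profiles = [[1, 1, 2], [1, 3, 2], [3, 1, 6], [3, 3, 6]]
loadings = correlation_loadings(toy_profiles)
print(loadings)
```

In this toy example the first and third attributes are perfectly correlated and load fully on the first component, while the second attribute loads on the second component. The sign of a component is arbitrary, so only the relative positions of variables in the plot carry meaning.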

RESULTS

Sensory Analysis by Expert Assessors
The results from ANOVA of scores from expert assessors showed that the
cheese samples were perceived as significantly different at the 5% level for
flavor, consistency and overall score (P = 0.01 for all three parameters). The
random effects of assessors were also significant at the 5% level for all three
parameters (P = 0.01). The interaction effect (product × assessor) for flavor
was not significant (P = 0.74), interaction effect for consistency showed a
P-value of 0.05 and interaction effect for overall score showed a P-value of
0.01. These P-values indicated some differences between assessors with
respect to the scoring of perceived difference between samples for consistency
and overall score.
Figure 1 shows average scores for the 12 cheese samples with letters
indicating significantly different samples at the 5% level. The average scores
for flavor ranged from 3.0 to 4.0 (Fig. 1a), for consistency from 2.4 to 4.1
(Fig. 1b) and overall score ranged from 2.6 to 3.8 (Fig. 1c). Figure 1a shows
that sample number 1 got the lowest average score for flavor (3.0), closely
followed by sample number 3 (3.0). Sample number 5 got the highest average score
for flavor (4.0). Figure 1b shows that sample number 3 got the lowest average
score for consistency (2.4) while sample number 4 got the highest average
score for consistency (4.1). Figure 1c shows that sample number 3 got the
lowest average overall score (2.6) while sample numbers 4 and 10 got the
highest average overall scores (3.8 for both samples). The raw data (not
shown) showed that the assessors used a relatively small part of the scale:
the individual scores given ranged from 2.0 to 4.5.
The expert assessors’ performance was evaluated individually by use of
graphical procedures (Lea et al. 1995). The plots demonstrated that the MSE-
values, that is, the repeatability for the expert assessors, varied to some extent.
One of the five expert assessors had P = 0.47 for flavor and P = 0.19 for overall
score (based on individual F-values). The assessor with best performance had
the following P-values: flavor P = 0.02, consistency P = 0.01 and overall score
P = 0.01.
The samples chosen for consumer testing were 3, 4, 7, 10 and 11. Average
scores from the expert assessors and reported defect terms for these samples
are given in Table 1. The table shows that the expert assessors reported defects
for all samples although overall score ranged from 2.6 to 3.8. The total number
of terms given was considerably higher for sample 3 (overall score 2.6) than
for the other cheese samples.

Sensory Analysis by Selected Assessors
Results from ANOVA showed significant differences (P-values = 0.01)
for all 17 attributes used. Figure 2 shows the correlation loadings plot with the
first two significant principal components from PCA on average sensory
attributes with samples included as dummy variables. PC1 accounted for 77%
of the systematic variation in the data and PC2 only 10%. Figure 2 indicates
that the variation between samples was explained by several flavor and texture

[Figure 1: bar charts of average expert scores (vertical axis 0 to 5) for samples 1 to 12; panels (a) Flavor, (b) Consistency and (c) Overall score. Letters above the bars mark significance groupings.]

FIG. 1. AVERAGE SCORES FOR (a) FLAVOR (ODOR AND TASTE), (b) CONSISTENCY (BODY AND TEXTURE) AND (c) OVERALL SCORE GIVEN BY EXPERT ASSESSORS
Different letters mean different ratings at the 5% level of significance. n = 12 samples.

TABLE 1.
AVERAGE SCORES AND DEFECT TERMS FROM EXPERT ASSESSORS

Sample  Flavor (odor  Defect terms, flavor              Consistency (body  Defect terms, consistency      Overall  Total number of
        and taste)                                      and texture)                                      score    defect terms

3       3.0           Sour, harsh, nontypical, unclean  2.4                Doughy                         2.6      25
4       3.7           Unclean, harsh                    4.1                Doughy                         3.8      8
7       3.7           Nontypical, harsh                 3.4                Firm, heavily soluble, grainy  3.5      13
10      3.8           Unclean, sour                     3.8                Firm                           3.8      10
11      3.9           Salty, sour                       3.4                Heavily soluble, firm, dry     3.4      11

Samples selected for consumer testing, flavor, consistency and overall score.

[Figure 2: correlation loadings plot with both axes running from –1 to 1. Plotted attributes: saltiness, odour intensity, flavour intensity, acidity, fermented flavour, firmness, elasticity, firmness on cutting, solubility, mature odour, doughiness, dryness, mature flavour, after flavour, graininess, firmness on chewing and bitter flavour. Samples 1 to 12 are shown as dummy variables.]

FIG. 2. CORRELATION LOADINGS PLOT FROM PCA OF SIGNIFICANT DESCRIPTIVE SENSORY DATA FOR 12 CHEESE SAMPLES (PC1: 77%, PC2: 10%)
Samples for consumer testing are marked.

attributes along PC1, while PC2 mainly was described by variation in saltiness
and bitter flavor. When interpreting the results from the descriptive profiling, it
is important to note that the cheese samples were quite similar; any differences
consisted of minor deviations from the specified quality.
The performance of the descriptive panel was evaluated by use of graphi-
cal procedures, especially the “egg-shell plot” (Lea et al. 1995). This plot
demonstrated the variation in agreement between selected assessors for dif-
ferent attributes. The plots revealed a relatively large variation in some of the
odor and flavor attributes, that is, odor intensity, flavor intensity, mature odor
and mature flavor. The reason was probably that the samples were perceived as
quite similar for these attributes and possibly that the assessors were not
sufficiently calibrated against each other. This variation also caused relatively high MSE-
values and high P-values for some attributes and some individual assessors.
This weakness was to a certain extent compensated for by the relatively high
number of selected assessors in the trained panel.
Samples selected for the consumer test were 3, 4, 7, 10 and 11, and Fig. 2
shows that these samples represented the variation among the cheese samples.
Cheese number 3 was characterized by a relatively high degree of fermented
flavor, acidity, flavor intensity and odor intensity. Additionally, this sample had
a relatively high degree of doughiness and solubility. Cheese numbers 7 and 11
were, on the other hand, characterized by a relatively high degree of mature
odor and mature flavor. In addition, these samples were firmer, drier, grainier
and more elastic than the other samples presented to the consumers.

Consumer Testing
The result from ANOVA showed that the effect of sample on reported
hedonic liking was close to significance at the 5% level (P = 0.07). The
cheeses got very similar average ratings, ranging from 4.4 (sample number 11)
to 4.8 (sample number 3). It was interesting to note that cheese number 3,
which the dairy experts gave the lowest score, was the cheese that got the
highest average score from the consumers. This is commented on further
when discussing the results from descriptive profiling (Figs. 3 and 4).
Figure 5 shows the average ratings for intensity of (a) flavor and (b)
texture given by the consumers. The consumers were able to distinguish
between samples for intensity of flavor (P = 0.01) and texture (P = 0.01), and
the effect of consumer was also significant at the 5% level in both analyses
(P = 0.01). Sample 3 was perceived to have a significantly higher intensity of
flavor than the other samples. Sample 3 was also perceived as the softest
sample and sample 11 as the firmest sample. These results from the consumer
test agree well with the results from the quality scoring and the descriptive
testing, and show that the consumers perceived the cheeses as different
with regard to flavor and texture properties.

Relationship between Different Data Sets
For a full interpretation of the relationship between descriptive data and
expert assessors’ scores, a PLSR was performed using descriptive attributes as
X-variables and scores from the cheese experts as Y-variables. The result from
the analysis is shown in Fig. 3. Explained variance from PLSR was relatively
high, as the first two significant components described 72% of the variation in
Y. Figure 3 shows that consistency, flavor and overall score by the cheese
experts correlated positively with descriptive attributes such as mature flavor/odor,
graininess, firmness, elasticity and dryness. On the other hand, the same
parameters (experts) correlated negatively with flavor intensity, fermented
flavor, acidity, doughiness and solubility.
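How a PLS regression accumulates explained variance in Y can be sketched for a single response. The code below is a minimal PLS1 (NIPALS) illustration in plain Python with standardized X-variables. The study used the Unscrambler with several Y-variables and full cross-validation, neither of which is reproduced here; the function names and toy data are hypothetical.

```python
import math


def center(values):
    mu = sum(values) / len(values)
    return [v - mu for v in values]


def autoscale_columns(X):
    """Center each column and scale it to unit standard deviation."""
    cols = []
    for col in zip(*X):
        c = center(col)
        sd = math.sqrt(sum(v * v for v in c) / (len(c) - 1))
        cols.append([v / sd for v in c])
    return [list(row) for row in zip(*cols)]


def pls1_explained_y(X, y, n_components=2):
    """Cumulative fraction of Y-variance explained after each component."""
    E = autoscale_columns(X)
    f = center(y)
    ss_total = sum(v * v for v in f)
    explained = []
    for _ in range(n_components):
        # Weights: the direction in X with maximal covariance with y.
        w = [sum(x * r for x, r in zip(col, f)) for col in zip(*E)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(x * wj for x, wj in zip(row, w)) for row in E]
        tt = sum(v * v for v in t)
        p = [sum(x * ti for x, ti in zip(col, t)) / tt for col in zip(*E)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y before extracting the next component.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        explained.append(1 - sum(v * v for v in f) / ss_total)
    return explained


# Invented toy data: 4 samples, 2 descriptive attributes, 1 response.
toy_X = [[1, 1], [2, 3], [3, 2], [4, 5]]
toy_y = [0, -1, 1, -1]  # equals attribute 1 minus attribute 2
explained = pls1_explained_y(toy_X, toy_y)
print(explained)
```

The per-component percentages quoted for Y in a figure caption correspond to the increments of such a cumulative series. Calibration values like these are optimistic compared with cross-validated ones, which is why full cross-validation was applied in the study.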
The result from a PLSR using descriptive data as X-variables and liking
rated by the consumers as Y-variables is shown in Fig. 4. Explained variance

[Figure 3: correlation loadings plot with both axes running from –1 to 1. X-variables: the 17 descriptive attributes. Y-variables: the expert scores for consistency, flavour and overall score. Samples 1 to 12 are shown as dummy variables.]

FIG. 3. CORRELATION LOADINGS PLOT FROM A PARTIAL LEAST SQUARES REGRESSION USING DESCRIPTIVE DATA AS X-VARIABLES AND QUALITY PARAMETERS AS Y-VARIABLES (X-EXPL: 77%, 9%; Y-EXPL: 41%, 31%)

was also relatively high in this case; the first two significant components
described 54% of the variation in Y. Figure 4 shows an even distribution of
samples and consumers; for example, some consumers rated sample number
11 highest while some rated sample number 3 highest. This distribution in
hedonic liking illustrates the reason why we found only minor differences in
average hedonic liking of the cheese samples.

DISCUSSION

The expert assessors differentiated significantly between samples for all the
three quality parameters scored. The random effects of expert assessors were
significant, which demonstrated a different use of the scale. Additionally, we

[Figure 4: correlation loadings plot with both axes running from –1 to 1, showing the 17 descriptive attributes (X-variables), the cheese samples (marked) and the 110 individual consumers (Y-variables, numbered dots).]

FIG. 4. CORRELATION LOADINGS FROM A PARTIAL LEAST SQUARES REGRESSION OF CHEESE SAMPLES (MARKED) WITH DESCRIPTIVE DATA AS X-VARIABLES AND HEDONIC RATINGS AS Y-VARIABLES (X-EXPL: 73%, 18%; Y-EXPL: 32%, 22%)
Consumers are marked as dots, n = 110.

observed interaction effects between expert assessors and products for consis-
tency and overall score. However, in view of the relatively small number of
expert assessors, we may conclude that this panel performed a consistent
scoring of presented cheeses. When looking at reported terms for flavor and
consistency in Table 1, the impression may be that all the cheeses had a similar
degree of defects. However, the total number of terms shows that sample
number 3 had considerably more defects than the other cheeses, which prob-
ably caused the relatively low scores.
It is important to point out that the principles of sensory evaluation of
cheese by expert assessors, who use integrated parameters like “flavor,”
“consistency” and “overall score” and in addition report defect terms, are very
different from those of descriptive profiling. It is obvious that scores given by the

[Figure 5: bar charts of average consumer ratings (vertical axis 1 to 7) for samples 3, 4, 7, 10 and 11; panels (a) Flavor intensity and (b) Texture (soft – firm). Letters above the bars mark significance groupings.]

FIG. 5. AVERAGE INTENSITY RATINGS OF CHEESES FOR (a) FLAVOR AND (b) CONSISTENCY, N = 110 CONSUMERS
Different letters mean different rating at the 5% level of significance. For flavor: 1 = little flavor,
4 = neither little nor much flavor, 7 = much flavor. For texture: 1 = soft, 4 = neither soft nor firm,
7 = firm.

expert assessors do not describe the attributes in cheeses that can be relevant in
a product development project. Current dairy technology is often aimed at
development of alternative ingredients and processes to achieve desired nutri-
tional profiles. New ingredients may give rise to unexpected flavors and
textural changes during storage, which do not correspond to existing specifi-
cations, the basis for the expert assessments. For example, a newly developed
low fat cheese may be an excellent example of low fat cheese, and yet receive
low scores in traditional dairy scoring systems. Therefore, descriptive profiling
is a necessary tool for the dairy industry as a supplement to expert assessments.
Another issue of interest is the scale, of which the expert assessors only use a
relatively small part. This often makes it difficult to differentiate between
samples (and to demonstrate significant effects) when using quality scores as
responses in design of experiments. The way the expert assessors use the scale
may raise a question about the appropriateness of using a parametric test for
analyzing data. Therefore, we also transformed the scoring data to ranking data
and performed a Friedman two-way ANOVA (nonparametric test). The results
from this analysis supported the results already shown in Fig. 1.
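The rank transformation and the Friedman statistic can be sketched as follows. Scores are ranked within each block (for example, one assessor's scores for all samples), tied scores share their average rank, and the statistic is corrected for ties, which are likely when half and whole points on a narrow part of the scale are used. This is a minimal illustration rather than the procedure run in the study; the function names and toy scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        size = end - start + 1
        tie_term += size ** 3 - size
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks, tie_term


def friedman_statistic(blocks):
    """blocks[i][j] is the score given in block i to sample j."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    ties = 0
    for scores in blocks:
        ranks, tie_term = average_ranks(scores)
        ties += tie_term
        for j, rank in enumerate(ranks):
            rank_sums[j] += rank
    q = (12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums)
         - 3 * n * (k + 1))
    # Tie correction; undefined if every block consists of ties only.
    correction = 1 - ties / (n * k * (k * k - 1))
    return q / correction


# Invented toy scores: 3 assessors (blocks) x 3 samples, half-point scale.
toy_scores = [[3.0, 3.5, 4.0],
              [2.5, 3.5, 4.0],
              [3.5, 3.5, 3.5]]
q_toy = friedman_statistic(toy_scores)
print(q_toy)
```

The statistic is compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of samples. In the toy example the third assessor gives all samples the same score and therefore contributes no ranking information after the tie correction.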
Descriptive profiling showed that much of the variation between samples
was described by the first component in a PCA plot (Fig. 2). This is not
unexpected when profiling different samples from only one variety of cheese.
The maturing process causes decomposition of proteins and amino acids,
which increases the complexity in flavor at the same time as the texture
changes. However, Figs. 2 and 3 show that mature flavor was anticorrelated
with flavor intensity, fermented flavor and acidity. The same figures show that
the selected assessors obviously perceived a positive correlation between the
mature flavor/odor dimension and the firmness, the graininess and the dryness
of the cheeses. Moreover, Fig. 3 indicates that a relatively high intensity of
these attributes was positively correlated to an ideal maturing process defined
by the expert assessors (consistency, flavor and overall score in the PLSR-
plot). On the other hand, a relatively high degree of fermented flavor and
acidity perceived in descriptive profiling was probably linked to the flavor
defects reported by the expert assessors, such as “sour, harsh, strange and unclean”
(Table 1, sample 3).
As already observed, the consumers as a group had no large differences
in reported overall liking. However, it is important to interpret average hedonic
liking data with caution. The correlation loadings plot in Fig. 4 gives a more
complete impression of the reported liking from the consumers. This figure
shows an even distribution of the consumers in the sensory map; some of them
are situated close to sample 3 (2.6 in overall score) and some of them are
situated close to sample 10 (3.8 in overall score). This clearly indicates dif-
ferent sensory segments in the consumer sample. Some consumers preferred a
firm cheese with a mild, mature flavor and others preferred a doughy cheese
with more acid, fermented flavor. What is interesting to note is that the expert
assessors represented the preferences of the first group of consumers in their
scoring procedure. However, there is probably also a potential in the market for
the “doughy” and “sour” cheese. These results correspond with reported
results from McBridge and Hall (1979). They compared Cheddar cheeses with
different scorings with preference tests by 1754 consumers and found no
correlation between overall consumer preference scores and dairy “graders”
scores. However, when investigating different age groups, they found that
those aged 21 years and over preferred the cheeses considered by graders to be
second quality. These cheeses were assigned lower scores because of various
flavor defects and tended to have stronger flavors. McBridge and Hall (1979)
summed up their study like this: “Thus it is impossible to deduce if, for
example, the adult consumers liked the cheese graded 86 points specifically
because of its ‘unclean, fermented, bitter’ character, or simply because it had
more flavor.” Another study which also demonstrates the importance of
looking for different clusters of consumers within a sampled population was
performed by Lawlor and Delahunty (2000). This study identified different
sensory segments with regard to preference for each of the speciality cheeses
investigated. Examination of overall preference alone would have missed this
important point.
An important principle for food producers in quality control is not to
tolerate too much product variation, but to keep within the tolerance limits given in
the specification. In the dairy industry the critical expert assessors probably
want to be on the conservative side in demanding the highest quality obtain-
able. Possibly they also evaluate products based on anticipated shelf-life or
aging potential. However, this study shows that mild levels of sensory defects
in dairy products may not always be objectionable to consumers. It is impor-
tant for a food producer to regularly compare the approved sensory specifica-
tions with consumer preference data. If a food company identifies changes in
the consumer target group or formation of new segments within this group, this
may open new possibilities for market strategies and increased sales.
It is important to do more research on methods for obtaining sensory
specifications with consumer input. An interesting article which investigated
how large the intensity of a sensory defect could be before a consumer rejected
the product was recently published by Hough et al. (2004). This article pre-
sented the use of “survival analysis statistic” as a tool to answer this question.
It is, however, likely that the food industry performs many studies where the
aim is to identify consumer tolerance limits for food products without publishing
the results. We encourage product developers and sensory scientists
from the industry to publish their results more often, to increase the knowledge
about this topic in the area of food science.

ACKNOWLEDGMENTS

The authors wish to acknowledge the sensory panel at Matforsk and in
particular the panel leader, Laura Blümlein, for their contribution to the reported
experiment. Per Lea is thanked for help with the statistical analyses and
Steffen Solem, Norsk Matanalyse, is thanked for valuable discussions.

REFERENCES

BODYFELT, F.W., TOBIAS, J. and TROUT, G.M. 1988. Sensory Evaluation
of Dairy Products, pp. 36–58, Van Nostrand/AVI Publishing, New York,
NY.
HOUGH, G., GARITTA, L. and SÁNCHEZ, R. 2004. Determination of con-
sumer acceptance limits to sensory defects using survival analysis. Food
Qual. Prefer. 15, 729–734.
IDF-99C. 1997. Sensory evaluation of dairy products by scoring. Reference
method, In International IDF Standard pp. 1–15.
ISO. 1988. Sensory analysis – General guidance for the design of test rooms,
In International IDF Standard pp. 1–10.
ISO. 1993. Sensory analysis – Methodology – General guidance for the selec-
tion, training and monitoring of assessors – Part 1: Selected assessors, In
International IDF Standard pp. 1–11.
ISO. 2000. Quality management systems – Fundamentals and vocabulary ZH,
In International IDF Standard pp. 1–29.
LAWLESS, H.T. and CLAASSEN, M.R. 1993. Validity of descriptive and
defect-oriented terminology systems for sensory analysis of fluid milk.
J. Food Sci. 58, 108–119.
LAWLESS, H.T. and HEYMANN, H. 1998. Sensory Evaluation of Food,
Principles and Practices, p. 827, Aspen Publishers, Gaithersburg, MD.
LAWLOR, J.B. and DELAHUNTY, C.M. 2000. The sensory profile and
consumer preference for ten speciality cheeses. Int. J. Dairy Technol. 53,
28–36.
LEA, P., RØDBOTTEN, M. and NÆS, T. 1995. Measuring validity in sensory
analysis. Food Qual. Prefer. 6, 321–326.
MACFIE, H.J.H., BRATCHELL, N., GREENHOFF, K. and VALLIS, L.V.
1989. Designs to balance the effect of order of presentation and first-order
carry-over effects in hall tests. J. Sens. Stud. 4, 129–148.
MARTENS, H. and MARTENS, M. 2001. Multivariate Analysis of Quality.
An Introduction, p. 445, J. Wiley & Sons Ltd, Chichester, U.K.
MCBRIDGE, R.L. and HALL, C. 1979. Cheese grading versus consumer
acceptability: An inevitable discrepancy. Aust. J. Dairy Technol. 34,
66–68.
MUÑOZ, A.M. 2002. Sensory evaluation in quality control: An overview, new
developments and future opportunities. Food Qual. Prefer. 13, 329–339.
MUÑOZ, A.M., CIVILLE, G.V. and CARR, B.T. 1992. Sensory Evaluation
in Quality Control, p. 240, Van Nostrand Reinhold, New York, NY.
MURRAY, J.M., DELAHUNTY, C.M. and BAXTER, I.A. 2001. Descriptive
sensory analysis: Past, present and future. Food Res. Int. 34, 461–471.
NÆS, T. and LANGSRUD, Ø. 1998. Fixed or random assessors in sensory
profiling? Food Qual. Prefer. 9, 145–152.
SMUSKOWSKI, M., PING, Y., WENDORFF, W.L. and RAO, R.D. 2003.
Cheese defects in U.S. graded cheeses. Dairy Pipeline 15(3), 1–12.
TINE BA. 2003. Tine Meierienes Analysebok, A database.
WESTAD, F., HERSLETH, M., LEA, P. and MARTENS, H. 2003. Variable
selection in PCA in sensory descriptive and consumer data. Food Qual.
Prefer. 14, 463–472.
