Preference Testing

Sensory Evaluation of Food

Part of the book series: Food Science Text Series (FSTS)

Abstract

Preference testing refers to consumer tests in which the consumer is given a choice and asked to indicate their most liked product, usually from a pair. Although these tests appear straightforward and simple, several complications are encountered in the methods, notably how to treat replicated data and how to analyze data that include a “no-preference” option as a response. Additional methods are discussed including ranking more than two products, choosing both the best and worst from a group, and rating the degree of preference.

The number of judges that are involved in a study may be such that rather unimportant differences may receive undue attention. It is quite possible to produce statistically significant differences in preference for product which have little practical value by simply increasing the number of judges that are utilized.

—H. G. Schutz (1971).

References

  • Amerine, M. A. and Roessler, E. B. 1983. Wines: Their Sensory Evaluation. Freeman, San Francisco.

  • Angulo, O. and O’Mahony, M. 2005. The paired preference test and the no preference option: Was Odesky correct? Food Quality and Preference, 16, 425–434.

  • ASTM International. 2008. Standard Guide for Sensory Claim Substantiation. Designation E 1958–07. Vol. 15.08 Annual Book of ASTM Standards. ASTM International, Conshohocken, PA, pp. 186–212.

  • Basker, D. 1988a. Critical values of differences among rank sums for multiple comparisons. Food Technology, February 1988, 79–84.

  • Basker, D. 1988b. Critical values of differences among rank sums for multiple comparisons. Food Technology, July 1988, 88–89.

  • Bech, A. C., Engelund, E., Juhl, H. J., Kristensen, K. and Poulsen, C. S. 1994. Qfood: Optimal design of food products. MAPP Working Paper 19, Aarhus School of Business, Aarhus, Denmark.

  • Beckman, K. J., Chambers, E. IV and Gragi, M. M. 1984. Color codes for paired preference and hedonic testing. Journal of Food Science, 49, 115–116.

  • Berglund, B., Berglund, U. and Lindvall, T. 1975. Scaling of annoyance in epidemiological studies. Proceedings, Recent Advances in the Assessments of the Health Effects of Environmental Pollution. Commission of the European Communities, Luxembourg, Vol. 1, pp. 119–137.

  • Bi, J. 2006. Sensory Discrimination Tests and Measurements. Blackwell, Ames, IA.

  • Braun, V., Rogeaux, M., Schneid, N., O’Mahony, M. and Rousseau, B. 2004. Corroborating the 2-AFC and 2-AC Thurstonian models using both a model system and sparkling water. Food Quality and Preference, 15, 501–507.

  • Cardello, A. V. and Schutz, H. G. 2006. Sensory science: Measuring consumer acceptance. In: Handbook of Food Science, Technology and Engineering, CRC Press, Boca Raton, FL. Vol. 2, Ch. 56.

  • Caul, J. 1957. The profile method of flavor analysis. Advances in Food Research, 7, 1–40.

  • Chapman, K. W., Grace-Martin, K. and Lawless, H. T. 2006. Expectations and stability of preference choice. Journal of Sensory Studies, 21, 441–455.

  • Chapman, K. W. and Lawless, H. T. 2005. Sources of error and the no-preference option in dairy product testing. Journal of Sensory Studies, 20, 454–468.

  • Cochrane, C.-Y. C., Dubnicka, S. and Loughin, T. 2005. Comparison of methods for analyzing replicated preference tests. Journal of Sensory Studies, 20, 484–502.

  • Coetzee, H. 1996. The successful use of adapted paired preference, rating and hedonic methods for the evaluation of acceptability of maize meal produced in Malawi. Abstract, 3rd Sensometrics Meeting, June 19–21, 1996, Nantes, France, pp. 35.1–35.3.

  • Coetzee, H. and Taylor, J. R. N. 1996. The use and adaptation of the paired-comparison method in the sensory evaluation of Hamburger-type patties by illiterate/semi-literate consumers. Food Quality and Preference, 7, 81–85.

  • Ennis, D. M. 2008. Tables for parity testing. Journal of Sensory Studies, 23, 80–91.

  • Ennis, D. M., and Ennis, J. M. 2009. Equivalence hypothesis testing. Food Quality and Preference, doi:10.1016/j.foodqual.2009.06.005.

  • Engen, T. 1974. Method and theory in the study of odor preferences. In: A. Turk, J. W. Johnson, Jr. and D. G. Moulton (Eds.), Human Responses to Environmental Odors. Academic, New York, pp. 121–141.

  • Ferris, G. E. 1958. The k-visit method of consumer testing. Biometrics, 14, 39–49.

  • Finn, A. and Louviere, J. J. 1992. Determining the appropriate response to evidence of public concern: The case of food safety. Journal of Public Policy and Marketing, 11, 12–25.

  • Filipello, F. 1957. Organoleptic wine-quality evaluation. 1. Standards of quality and scoring vs. rating scales. Food Technology, 11, 47–51.

  • Gacula, M. C. and Singh, J. 1984. Statistical Methods in Food and Consumer Research. Academic, Orlando, FL.

  • Gacula, M., Singh, J., Bi, J. and Altan, S. 2009. Statistical Methods in Food and Consumer Research. Elsevier/Academic, Amsterdam.

  • Garnatz, G. 1952. Consumer acceptance testing at the Kroger food foundation. In: Proceeding of the Research Conference of the American Meat Institute, Chicago, IL, pp. 67–72.

  • Gridgeman, N. T. 1959. Pair comparison, with and without ties. Biometrics, 15, 382–388.

  • Harker, F. R., Amos, R. L., White, A., Petley, M. B. and Wohlers, M. 2008. Flavor differences in heterogeneous foods can be detected using repeated measures of consumer preferences. Journal of Sensory Studies, 23, 52–64.

  • Hein, K. A., Jaeger, S. R., Carr, B. T. and Delahunty, C. M. 2008. Comparison of five common acceptance and preference methods. Food Quality and Preference, 19, 651–661.

  • Jaeger, S. R. and Cardello, A. V. 2009. Direct and indirect hedonic scaling methods: A comparison of the labeled affective magnitude (LAM) scale and best-worst scaling. Food Quality and Preference, 20, 249–258.

  • Jaeger, S. R., Jørgensen, A. S., Aaslyng, M. D. and Bredie, W. L. P. 2008. Best-worst scaling: An introduction and initial comparison with monadic rating for preference elicitation with food products. Food Quality and Preference, 19, 579–588.

  • Jellinek, G. 1964. Introduction to and critical review of modern methods of sensory analysis (odour, taste and flavour evaluation) with special emphasis on descriptive sensory analysis (flavour profile method). Journal of Nutrition and Dietetics, 1, 219–260.

  • Kim, H. S., Lee, H. S., O’Mahony, M. and Kim, K. O. 2008. Paired preference tests using placebo pairs and different response options for chips, orange juices and cookies. Journal of Sensory Studies, 23, 417–438.

  • Kimmel, S. A., Sigman-Grant, M. and Guinard, J.-X. 1994. Sensory testing with young children. Food Technology, 48(3), 92–94, 96–99.

  • Koster, E. P., Couronne, T., Leon, F., Levy, C. and Marcelino, A. S. 2003. Repeatability in hedonic sensory measurement: A conceptual exploration. Food Quality and Preference, 14, 165–176.

  • Lucas, F. and Bellisle, F. 1987. The measurement of food preferences in humans: Do taste and spit tests predict consumption? Physiology and Behavior, 39, 739–743.

  • Marchisano, C., Lim, J., Cho, H. S., Suh, D. S., Jeon, S. Y., Kim, K. O. and O’Mahony, M. 2003. Consumers report preference when they should not: A cross-cultural study. Journal of Sensory Studies, 18, 487–516.

  • Meyners, M. 2007. Easy and powerful analysis of replicated paired preference tests using the χ² test. Food Quality and Preference, 18, 938–948.

  • Moskowitz, H. R. 1983. Product Testing and Sensory Evaluation of Foods. Marketing and R&D Approaches. Food and Nutrition, Westport, CT.

  • Mueller, S., Francis, I. L. and Lockshin, L. 2009. Comparison of best-worst and hedonic scaling for the measurement of consumer wine preferences. Australian Journal of Grape and Wine Research, 15, 1–11.

  • Newell, G. J. and MacFarlane, J. D. 1987. Expanded tables for multiple comparison procedures in the analysis of ranked data. Journal of Food Science, 52, 1721–1725.

  • Odesky, S. H. 1967. Handling the neutral vote in paired comparison product testing. Journal of Marketing Research, 4, 199–201.

  • Prescott, J., Norris, L., Kunst, M. and Kim, S. 2005. Estimating a “consumer rejection threshold” for cork taint in white wine. Food Quality and Preference, 16, 345–349.

  • Roessler, E. B., Pangborn, R. M., Sidel, J. L. and Stone, H. 1978. Expanded statistical tables for estimating significance in paired-preference, paired difference, duo-trio and triangle tests. Journal of Food Science, 43, 940–941.

  • Quesenberry, C. P. and Hurst, D. C. 1964. Large sample simultaneous confidence intervals for multinomial proportions. Technometrics, 6, 191–195.

  • Saliba, A. J., Bullock, J. and Hardie, W. J. 2009. Consumer rejection threshold for 1,8 cineole (eucalyptol) in Australian red wine. Food Quality and Preference, 20, 500–504.

  • Scheffé, H. 1952. On analysis of variance for paired comparisons. Journal of the American Statistical Association, 47, 381–400.

  • Schmidt, H. J. and Beauchamp, G. K. 1988. Adult-like odor preference and aversions in three-year-old children. Child Development, 59, 1136–1143.

  • Schraidt, M. F. 1991. Testing with children: Getting reliable information from kids. ASTM Standardization News, March 1991, 42–45.

  • Schutz, H. G. 1971. Sources of invalidity in the sensory evaluation of foods. Food Technology, 25, 53–57.

  • Sidel, J. L., Stone, H. and Bloomquist, J. 1981. Use and misuse of sensory evaluation in research and quality control. Journal of Dairy Science, 64, 2296–2302.

  • Sidel, J., Stone, H., Woolsey, A. and Mecredy, J. M. 1972. Correlation between hedonic ratings and consumption of beer. Journal of Food Science, 37, 335.

  • Stone, H. and Sidel, J. L. 1978. Computing exact probabilities in discrimination tests. Journal of Food Science, 43, 1028–1029.

  • Tepper, B. J., Shaffer, S. E. and Shearer, C. M. 1994. Sensory perception of fat in common foods using two scaling methods. Food Quality and Preference, 5, 245–252.

  • Tuorila, H., Hyvonen, L. and Vainio, L. 1994. Pleasantness of cookies, juice and their combinations rated in brief taste tests and following ad libitum consumption. Journal of Sensory Studies, 9, 205–216.

  • Van Trijp, H. C. M. and Schifferstein, H. N. J. 1995. Sensory analysis in marketing practice: Comparison and integration. Journal of Sensory Studies, 10, 127–147.

  • Vickers, Z. and Mullan, L. 1997. Liking and consumption of fat free and full fat cheese. Food Quality and Preference, 8, 91–95.

13.1 Appendix 1: Worked Example of the Ferris k-Visit Repeated Preference Test Including the No-Preference Option

A consumer test with 900 respondents is completed with the following results.

Questions: (1) Is there a significantly higher preference for product A or product B? (2) Is the preference for the winning product higher than 45%? (Example from Ferris (1958) and Bi (2006), pp. 72–76).

Table 6

(N = 900)

Here are the basic equations we need:

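  • \(N_y = N_{{\textrm{ao}}} + N_{{\textrm{oa}}} + N_{{\textrm{bo}}} + N_{{\textrm{ob}}}\)

  • \(N_x = N_{{\textrm{ab}}} + N_{{\textrm{ba}}}\)

  • \(M = N - N_{{\textrm{aa}}} - N_{{\textrm{bb}}}\)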

$$p = \frac{{M - \sqrt {\left[ {M^2 - (N_{{\textrm{oo}}} + N_y /2)(2N_x + N_y )} \right]} }}{{2N_{{\textrm{oo}}} + N_y }}$$

So for this data set:

  • \(M\) = 100 (all those not showing AA or BB behavior, i.e., a consistent choice for one of the two products)

  • \(N_x\) = 8 + 14 = 22

  • \(N_y\) = 14 + 12 + 17 + 11 = 54

  • \(N_{{\textrm{oo}}}\) = \(M - N_x - N_y\) = 100 – 22 – 54 = 24

  • p = 0.287
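The arithmetic for p can be checked with a few lines of Python. This is only a sketch under stated assumptions: the variable names are illustrative, the assignment of the quoted counts to individual response patterns (for example 8 to AB and 14 to BA) is assumed since only the sums matter here, and N_oo is taken as M − N_x − N_y. Evaluating the formula with these counts gives p ≈ 0.287, which is also the value that reproduces the segment estimates reported in the rest of this example.

```python
import math

# Counts quoted in the worked example. Which count belongs to which
# response pattern is an assumption; only the sums N_x and N_y are used.
n_ab, n_ba = 8, 14
n_ao, n_oa, n_bo, n_ob = 14, 12, 17, 11
m = 100  # respondents who were not consistent AA or BB choosers

n_x = n_ab + n_ba
n_y = n_ao + n_oa + n_bo + n_ob
# Assumption: M consists only of the N_x, N_y and N_oo groups.
n_oo = m - n_x - n_y

# p formula from the text.
p = (m - math.sqrt(m**2 - (n_oo + n_y / 2) * (2 * n_x + n_y))) / (2 * n_oo + n_y)

print(f"N_x = {n_x}, N_y = {n_y}, N_oo = {n_oo}, p = {p:.3f}")
```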

Now we need the equations for the best estimates of each segment/proportion:

$$\pi _{\textrm{A}} = \frac{{\left[ {N_{{\textrm{aa}}} (1 - p^2 )} \right] - \left[ {(N - N_{{\textrm{bb}}} )p^2 } \right]}}{{N(1 - 2p^2 )}}$$
$$\pi _{\textrm{B}} = \frac{{\left[ {N_{{\textrm{bb}}} (1 - p^2 )} \right] - \left[ {(N - N_{{\textrm{aa}}} )p^2 } \right]}}{{N(1 - 2p^2 )}}$$
and
$$\pi _{\textrm{o}} = 1 - \pi _{\textrm{A}} - \pi _{\textrm{B}}$$

Now we can get our segment size estimates:

  • \(\pi_{\textrm{A}}\) = 0.497, or 49.7% true preference for product A.

  • \(\pi_{\textrm{B}}\) = 0.370, or 37.0% true preference for product B.

  • \(\pi_{\textrm{o}}\) = 0.133, or 13.3% no real preference.
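The segment estimates can be reproduced with the short sketch below. Two inputs are assumptions rather than values stated in the text: the individual counts N_aa and N_bb are not listed here (only their sum, N − M = 800, is implied), so the split 457/343 is an illustrative choice that happens to reproduce the reported estimates; and p is taken as 0.287, the value the p formula yields for the quoted counts. The function name `segment_estimate` is hypothetical.

```python
N = 900
# ASSUMED split of the 800 consistent responders; not given in the text.
n_aa, n_bb = 457, 343
p = 0.287  # assumed input: result of the p formula for the quoted counts


def segment_estimate(n_same, n_other, n, p):
    """Estimate a true-preference proportion (pi_A or pi_B formula above)."""
    p2 = p**2
    return (n_same * (1 - p2) - (n - n_other) * p2) / (n * (1 - 2 * p2))


pi_a = segment_estimate(n_aa, n_bb, N, p)
pi_b = segment_estimate(n_bb, n_aa, N, p)
pi_o = 1 - pi_a - pi_b

print(f"pi_A = {pi_a:.3f}, pi_B = {pi_b:.3f}, pi_o = {pi_o:.3f}")
```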

Next, we need the variability and covariance estimates for the Z-tests:

$${\textrm{Var}}(\pi _{\textrm{A}} ) = \frac{{\pi _{\textrm{A}} (1 - \pi _{\textrm{A}} ) + 3\pi _{\textrm{o}} p^2 /2}}{N}$$
$${\textrm{Var}}(\pi _{\textrm{B}} ) = \frac{{\pi _{\textrm{B}} (1 - \pi _{\textrm{B}} ) + 3\pi _{\textrm{o}} p^2 /2}}{N}$$
$${\textrm{COV}}(\pi _{\textrm{A}} ,\pi _{\textrm{B}} ) = \frac{{\pi _{\textrm{o}} p^2 /2 - \pi _{\textrm{A}} \pi _{\textrm{B}} }}{N}$$

  • Var(\(\pi_{\textrm{A}}\)) = 0.000296 (so \(\pi_{\textrm{A}}\) is 49.7% ± 1.7%, 0.017 = \(\sqrt{0.000296}\))

  • Var(\(\pi_{\textrm{B}}\)) = 0.000277

  • COV(\(\pi_{\textrm{A}}\), \(\pi_{\textrm{B}}\)) = –0.000198

Now for the hypothesis tests:

$$Z = \frac{{\pi _{\textrm{A}} - \pi _{\textrm{B}} }}{{\sqrt {{\textrm{Var}}(\pi _{\textrm{A}} ) + {\textrm{Var}}(\pi _{\textrm{B}} ) - 2\,{\textrm{COV}}(\pi _{\textrm{A}} ,\pi _{\textrm{B}} )} }}$$

Note that this is a little different from the simple binomial test for paired preference. In the simple case we test the larger of the two proportions against a null value of 0.5. In this case we actually test for a difference of the two proportions, since we do not expect a 50/50 split any more with the no-preference option.

So the Z-test of A versus B gives Z = 4.067, a clear win for product A.

Finally, a test against a minimum required proportion or benchmark:

$$Z = \frac{{\pi _{\textrm{A}} - 0.45}}{{\sqrt {{\textrm{Var}}(\pi _{\textrm{A}} )} }}$$

Z-test for A versus benchmark of 0.45 (45%).
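As a rough check on the variance, covariance, and Z formulas, the sketch below plugs in the rounded estimates from this example (π_A = 0.497, π_B = 0.370, π_o = 0.133, N = 900). The value p = 0.287 is an assumption, being what the p formula yields for the quoted counts, and the function name `variance` is illustrative. Because the inputs are rounded, the A-versus-B statistic can differ in the second decimal place from the 4.067 reported above.

```python
import math

# Rounded inputs taken from the worked example (not recomputed here).
N = 900
pi_a, pi_b, pi_o = 0.497, 0.370, 0.133
p = 0.287  # assumed input: result of the p formula for the quoted counts
benchmark = 0.45


def variance(pi, pi_o, p, n):
    """Variance of a segment estimate, following the Var() formula above."""
    return (pi * (1 - pi) + 3 * pi_o * p**2 / 2) / n


var_a = variance(pi_a, pi_o, p, N)
var_b = variance(pi_b, pi_o, p, N)
cov_ab = (pi_o * p**2 / 2 - pi_a * pi_b) / N

# Test of the difference between the two preference proportions.
z_a_vs_b = (pi_a - pi_b) / math.sqrt(var_a + var_b - 2 * cov_ab)
# Test of the winning proportion against the required minimum.
z_benchmark = (pi_a - benchmark) / math.sqrt(var_a)

print(f"Var(A) = {var_a:.6f}, Var(B) = {var_b:.6f}, COV = {cov_ab:.6f}")
print(f"Z (A vs B) = {z_a_vs_b:.2f}, Z (A vs 0.45) = {z_benchmark:.2f}")
```

Under these inputs the benchmark statistic comes out at roughly 2.7, which exceeds the one-tailed 5% critical value of 1.645, so the estimated preference for A would be judged higher than 45%.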

13.2 Appendix 2: The “Placebo” Preference Test

In this method a pair of identical samples is given on one of two preference test trials (Alfaro-Rodriguez et al., 2007; Kim et al., 2008). These physically identical samples are not expected to differ, hence the parallel to a placebo, or a sham medical intervention with no expected therapeutic value. In theory, this could provide a baseline or control condition, against which performance in the preference test (with a no-preference option) could be measured. However, the amount of information gained from this design is relatively small and once again the analysis becomes more complicated. For these reasons, the sensory professional should consider the potential cost, additional analysis, and interpretations that are necessary. A recommended analysis is given at the end of this section.

Issues and complications. The use of a no-preference option was proposed to be a solution to the problem of a 50/50 preference test result, which could result from two stable segments of consumers who have a (perhaps strong) preference for each of the versions, respectively. Hence the idea was to offer a no-preference option with the reasoning that if there were no preferences (rather than stable segments) respondents should opt for the no-preference response. However, persons given identical samples will avoid the no-preference option 70–80% of the time, as discussed earlier in this chapter. So an answer concerning the question of stable segments cannot be obtained by this approach. Evidence for stable segments could be found by replicated testing or by converging evidence from different kinds of tests and/or questions.

Possible analyses. It might be tempting to just eliminate those persons expressing a preference for one of the identical pair members, on the grounds that such respondents are biased. However, this could eliminate 70–80% of the consumers. It is generally not advisable to pre-select consumers on any grounds other than their product consumption, and this approach eliminates individuals who are in fact part of the representative population to which we are trying to generalize the results. Such persons may not be “biased” in any dysfunctional way. Even identical samples may seem different from moment to moment. Furthermore, individuals are clearly responding to the demands of the task (you are expecting them to state a preference in a preference test!).

If historical data are available concerning the frequencies of response to identical pairs, a chi-square test can be performed as shown below. Note that this analysis cannot be done for a test in which the same subjects provide the “placebo” judgments because the chi-square test assumes independent samples.

Placebo analysis #1. Using historical data for expected frequencies.

Cells in the top row (A1, NP1, and B1) are expected frequencies (expected proportion × N judges). Cells in row 2 (A2, NP2, and B2) are the obtained data, frequencies of response in the preference comparison for the actual (different) test samples.

Table 7
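A minimal sketch of placebo analysis #1 follows. All numbers are hypothetical and chosen only for illustration: the historical proportions for identical pairs (40% A, 20% no preference, 40% B) and the observed counts are not from the chapter. Expected frequencies are the historical proportions times the number of judges, and the statistic is the usual chi-square goodness-of-fit sum with 2 degrees of freedom. For 2 df the upper-tail probability has the closed form exp(−χ²/2), so no statistics library is needed.

```python
import math

# HYPOTHETICAL historical proportions for identical ("placebo") pairs.
historical_props = {"A": 0.40, "NP": 0.20, "B": 0.40}
# HYPOTHETICAL observed counts for the actual (different) test samples.
observed = {"A": 55, "NP": 15, "B": 30}

n_judges = sum(observed.values())
# Row 1 of the table: expected frequency = expected proportion x N judges.
expected = {k: prop * n_judges for k, prop in historical_props.items()}

# Chi-square goodness-of-fit statistic over the three response categories.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Upper-tail probability; this closed form is exact only for df = 2.
p_value = math.exp(-chi_sq / 2)
CRITICAL_5PCT_2DF = 5.99

print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.4f}")
print("differs from historical pattern:", chi_sq > CRITICAL_5PCT_2DF)
```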

Placebo analysis #2. The same consumers participate in the placebo trial and the test pair.

If the same people are used for the “placebo” trial and the normal preference comparison trial, an option is to recast the data into another 2 × 3 table, with the rows now representing whether the individual expressed a preference (or not) on the placebo trial. The columns remain Prefer A, no preference, Prefer B. A chi-square test will now tell you whether the proportions of preference changed comparing those who elected the no-preference option for the placebo pair versus those who expressed a “false preference” for the identical pair. This does not provide evidence for the existence of any stable segments nor will it tell you if there is a significant preference from this analysis alone.

If the chi-square is not significant, you can feel justified in combining the two rows. If it is significant, you may analyze each row separately. The appropriate analyses are those described in the no-preference Section 13.4: eliminating no-preference judgments, apportioning them, doing a test of d-prime values, or the confidence interval approach if the assumptions are met.

Each judge is classified into one of the six cells, A1, A2, B1, B2, NP1, or NP2. The first row contains the data from people who reported no preference on the placebo pair. The second row contains the data from people who expressed a preference with the placebo pair. A chi-square test will now show whether the two rows have different proportions. If not, the rows may be combined. If they are different, separate analyses may be performed on each row, using the methods of analysis of the no-preference option discussed earlier in the chapter.

Table 8
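The 2 × 3 comparison in placebo analysis #2 can be sketched as a chi-square test of independence. The counts below are hypothetical and serve only to show the mechanics: expected cell frequencies come from the row and column totals, and df = (2 − 1)(3 − 1) = 2, so the same 5.99 critical value and the exp(−χ²/2) tail probability apply.

```python
import math

# HYPOTHETICAL counts. Columns: prefer A, no preference, prefer B.
table = [
    [18, 15, 17],  # row 1: reported no preference on the placebo pair
    [52, 10, 38],  # row 2: stated a preference on the placebo pair
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        # Expected count if the two rows share the same proportions.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
p_value = math.exp(-chi_sq / 2)  # exact only because df = 2
rows_differ = chi_sq > 5.99      # 5% critical value for 2 df

print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.4f}")
print("analyze rows separately" if rows_differ else "rows may be combined")
```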

13.3 Appendix 3: Worked Example of Multinomial Approach to Analyzing Data with the No-Preference Option

This approach yields multinomial-distribution confidence intervals for data that include a “no-preference” option. Data should be from a large test (N > 100) in which the no-preference option was used rarely (<20%) (Quesenberry and Hurst, 1964, p. 193, Eq. (2.9)).

Upper and lower confidence interval boundaries are given by

$${\textrm{CI}} = \frac{{\chi ^2 + 2X \pm \sqrt {\chi ^2 \left[ \displaystyle{\frac{{\chi ^2 + 4X(N - X)}}{N}} \right]} }}{{2\left( {N + \chi ^2 } \right)}}$$

where

  • \(\chi _{{\textrm{critical}}}^{\textrm{2}} = 5.99\) for α at 5% and 2 df,

  • X = number of observed preference votes for one sample,

  • N = sample size.

Example: N = 162, \(X_1\) = 83, \(X_2\) = 65, and no preference = 14.

First find the confidence interval for product \(X_1\) (choices = 83/162):

$$\begin{aligned}{\textrm{CI}}&= \frac{{5.99 + 2(83) \pm \sqrt {5.99\left[ {\frac{{5.99 + 4(83)(162 - 83)}}{{162}}} \right]} }}{{2\left( {162 + 5.99} \right)}}\\& = \frac{{171.99 \pm 31.15}}{{335.98}}\end{aligned}$$
which gives an interval from 0.42 to 0.60 for product \(X_1\).

Then find the confidence interval for product \(X_2\) (choices = 65/162):

$$\begin{aligned}{\textrm{CI}}&= \frac{{5.99 + 2(65) \pm \sqrt {5.99\left[ {\frac{{5.99 + 4(65)(162 - 65)}}{{162}}} \right]} }}{{2\left( {162 + 5.99} \right)}}\\& = \frac{{135.99 \pm 30.54}}{{335.98}}\end{aligned}$$
which gives an interval from 0.31 to 0.50 for product \(X_2\). The lower bound for the higher proportion (0.42) lies below the upper bound for the lower proportion (0.50), so the two intervals overlap. We therefore conclude that there is not enough evidence for any difference in the preference proportions.
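The two intervals can be recomputed with a small helper that implements the confidence interval formula exactly as written in this appendix. The function name `multinomial_ci` is illustrative, and the 5.99 critical value is the one given above for α = 5% and 2 df.

```python
import math

CHI_SQ_CRIT = 5.99  # critical chi-square for alpha = 5% and 2 df (from the text)


def multinomial_ci(x, n, chi_sq=CHI_SQ_CRIT):
    """Return (lower, upper) bounds using the CI formula in this appendix."""
    half_width = math.sqrt(chi_sq * (chi_sq + 4 * x * (n - x)) / n)
    centre = chi_sq + 2 * x
    denom = 2 * (n + chi_sq)
    return (centre - half_width) / denom, (centre + half_width) / denom


n = 162
lo1, hi1 = multinomial_ci(83, n)  # product X1
lo2, hi2 = multinomial_ci(65, n)  # product X2

print(f"X1: {lo1:.2f} to {hi1:.2f}")
print(f"X2: {lo2:.2f} to {hi2:.2f}")
# The intervals overlap when the lower bound of the larger proportion
# falls below the upper bound of the smaller proportion.
print("intervals overlap:", lo1 < hi2)
```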

© 2010 Springer Science+Business Media, LLC

Cite this chapter

Lawless, H., Heymann, H. (2010). Preference Testing. In: Sensory Evaluation of Food. Food Science Text Series. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6488-5_13
