Sensory Testing Methods

Sources: Coffee Sensory and Cupping Handbook by Fernández-Alduenda & Giuliano (SCA, 2021)

Sensory science uses three distinct categories of tests, each answering a different question. They should not be mixed in a single session, as they activate different cognitive modes in the taster.

Category	Question	Who answers it	Example tools
Difference testing	“Are these two coffees detectably different?”	Trained tasters or cuppers	Triangulation, 3-AFC
Affective testing	“How much do people like this?”	Consumers or cuppers	9-point hedonic scale, JAR scales
Descriptive analysis	“What are this coffee’s sensory characteristics?”	Trained descriptive panel	CATA, full descriptive panel

Difference Testing

Purpose

To determine whether a sensory difference exists between two (or more) coffee samples. Does not identify what is different or whether the difference is desirable — only that it is detectable.

Triangulation test

The most widely used difference test in coffee. A taster is presented with three cups — two of the same coffee (A, A) and one different (B) — and asked to identify the “odd” cup.
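A taster who perceives no difference still picks the odd cup one time in three by guessing, so the number of correct answers is usually judged against a binomial distribution with p = 1/3. The following is a minimal sketch of that check using only the Python standard library; the function name and the example counts are illustrative assumptions, not taken from the handbook.

```python
from math import comb


def triangle_p_value(correct, tasters, chance=1 / 3):
    """One-sided binomial probability of `correct` or more right answers by guessing alone."""
    return sum(
        comb(tasters, k) * chance**k * (1 - chance) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


# Hypothetical session: 18 tasters, 10 of whom picked the odd cup.
p = triangle_p_value(10, 18)
print(f"p = {p:.4f}")  # a small p suggests the coffees are detectably different
```

A p-value below the chosen significance level (0.05 is a common convention) is read as evidence that the panel can tell the coffees apart.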

Setting up a triangulation:

Grind and brew the two “same” cups from a single batch so that they are truly identical
Serve at the same temperature; use identical cups; control visual cues (cup crust, color)
Randomize the position of the odd cup (AAB, ABA, BAA, BBA, BAB, ABB, each used equally often)
Minimum 6 tasters per session for reliable results
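One way to keep the six serving orders balanced is to generate the plan before the session. The sketch below is an assumption about how this could be done, not a prescribed procedure; `assign_orders` and its `seed` parameter are hypothetical names.

```python
import random

ORDERS = ["AAB", "ABA", "BAA", "BBA", "BAB", "ABB"]


def assign_orders(n_tasters, seed=None):
    """Give each taster one serving order, using all six orders as evenly as possible."""
    rng = random.Random(seed)
    full_rounds, remainder = divmod(n_tasters, len(ORDERS))
    # Every order appears `full_rounds` times; leftover tasters get distinct orders.
    plan = ORDERS * full_rounds + rng.sample(ORDERS, remainder)
    rng.shuffle(plan)
    return plan


for taster, order in enumerate(assign_orders(12, seed=1), start=1):
    print(f"Taster {taster:2d}: {order}")
```

With a multiple of six tasters every order is used exactly the same number of times; otherwise the counts differ by at most one.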

Interpreting triangulation results: The δ’ (delta-prime) statistic measures the sensory magnitude of the difference between two coffees; the larger δ’, the easier the odd cup is to find. At δ’ = 1.75, about 10 of 18 tasters are expected to identify the odd cup; at δ’ = 2.98, about 14 of 18.
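The two figures above are consistent with the Thurstonian model commonly used for the triangle test, in which the expected proportion of correct answers rises from 1/3 (pure guessing) toward 1 as δ’ grows. The sketch below assumes that model and integrates it numerically with Simpson's rule; the function names, the integration limit, and the step count are illustrative choices, and it is an assumption that the handbook's numbers come from this exact model.

```python
from math import erf, exp, pi, sqrt


def phi(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def triangle_pc(delta, upper=8.0, steps=2000):
    """Expected proportion correct in a triangle test for a given delta-prime."""
    shift = delta * sqrt(2 / 3)

    def f(u):
        return 2 * phi(u) * (Phi(-u * sqrt(3) + shift) + Phi(-u * sqrt(3) - shift))

    # Composite Simpson's rule on [0, upper]; `steps` must be even.
    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


for delta in (0.0, 1.75, 2.98):
    pc = triangle_pc(delta)
    print(f"delta' = {delta:.2f}: pc = {pc:.3f}, about {18 * pc:.1f} of 18 tasters")
```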

Controlling difficulty (δ’): The level of difficulty of a triangulation can be deliberately adjusted by blending two very different coffees (C. arabica and C. canephora) at different ratios. A 20:80 arabica:canephora blend vs. pure arabica creates a very high difficulty level (low δ’); a 5:95 blend vs. pure arabica creates a lower difficulty level (higher δ’).

Applications:

Q Grader exams: candidates must pass 5/6 triangulations at a given difficulty level
Taster training: assess and develop individual sensory discrimination ability
Processing experiments: test whether a change in fermentation time, drying method, etc. produces a detectable difference

3-AFC (Alternative Forced Choice)

3-AFC is much more powerful than triangulation for detecting a difference in a specific attribute. A taster is given three cups and asked to identify which has the highest intensity of a named attribute (e.g., “Which cup has the highest acidity?”).

3-AFC is approximately twice as powerful as triangulation for the same attribute and sample — meaning you need fewer tasters to detect the same level of difference. However, the attribute must be clearly defined and tasters must understand it.
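The power advantage can be illustrated by comparing the expected proportion of correct answers in the two tests at the same δ’. The sketch below assumes the standard Thurstonian models for both tests (names and integration settings are illustrative). It does not reproduce the "twice as powerful" figure; it only shows that 3-AFC yields more correct answers for the same underlying difference, which is why fewer tasters are needed.

```python
from math import erf, exp, pi, sqrt


def phi(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def triangle_pc(delta):
    """Proportion correct in a triangle test (taster looks for the odd cup)."""
    s = delta * sqrt(2 / 3)
    return simpson(
        lambda u: 2 * phi(u) * (Phi(-u * sqrt(3) + s) + Phi(-u * sqrt(3) - s)), 0.0, 8.0
    )


def three_afc_pc(delta):
    """Proportion correct in 3-AFC: the stronger cup must exceed both weaker cups."""
    return simpson(lambda z: phi(z - delta) * Phi(z) ** 2, -8.0, delta + 8.0)


for delta in (0.5, 1.0, 1.5, 2.0):
    print(
        f"delta' = {delta:.1f}: "
        f"triangle {triangle_pc(delta):.2f}, 3-AFC {three_afc_pc(delta):.2f}"
    )
```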

Applications:

Research into processing variables (does longer fermentation increase acidity intensity?)
Roast profile comparison (does a 1°C temperature change affect body intensity?)

Setting up difference tests well

Ensure “same” samples come from the same batch and brew
Serve all cups at the same temperature (differences are slightly easier to find in cooler cups; hotter cups release the most volatile aromatics)
Use black or very dark cups to eliminate visual cues if color of brew is a variable
Prohibit communication between tasters until all forms are submitted
Use paper forms rather than digital during the test (reduces social signaling)

Affective Testing

Purpose

To measure how much tasters like a product, or which of two products they prefer. Affective tests measure subjective experience — the whole point is to capture individual variation, not suppress it.

9-point hedonic scale

The gold standard for affective food testing. Developed by the US Armed Forces in the 1940s to measure soldiers’ food preferences. Terms:

Score	Label
9	Like extremely
8	Like very much
7	Like moderately
6	Like slightly
5	Neither like nor dislike
4	Dislike slightly
3	Dislike moderately
2	Dislike very much
1	Dislike extremely

The labels were chosen so that the psychological distance between adjacent levels is roughly equal. The scale has been validated across thousands of studies and numerous product categories. It can be used with cartoon faces instead of words for children or to reduce language barriers.

Limitation: does not tell you why people like or dislike — only how much. Must be combined with descriptive data to be actionable.
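Because affective tests are meant to capture individual variation, a summary of hedonic scores is more informative when it reports the spread and the full distribution alongside the mean. A minimal sketch, with hypothetical scores and an illustrative `summarize_hedonic` helper (the "share liking" cut-off of 6 or above is an assumption for the example, not a standard from the source):

```python
from collections import Counter
from statistics import mean, stdev

LABELS = {
    9: "Like extremely", 8: "Like very much", 7: "Like moderately",
    6: "Like slightly", 5: "Neither like nor dislike", 4: "Dislike slightly",
    3: "Dislike moderately", 2: "Dislike very much", 1: "Dislike extremely",
}


def summarize_hedonic(scores):
    """Mean, spread, share of 'like' answers (6 or above), and count per level."""
    if any(s not in LABELS for s in scores):
        raise ValueError("scores must be integers from 1 to 9")
    counts = Counter(scores)
    return {
        "n": len(scores),
        "mean": mean(scores),
        "stdev": stdev(scores),
        "share_liking": sum(s >= 6 for s in scores) / len(scores),
        "counts": {s: counts.get(s, 0) for s in sorted(LABELS, reverse=True)},
    }


# Hypothetical scores from ten consumers for one coffee.
summary = summarize_hedonic([7, 8, 6, 9, 5, 7, 4, 8, 7, 6])
print(f"mean {summary['mean']:.1f}, stdev {summary['stdev']:.2f}")
for score, count in summary["counts"].items():
    print(f"{score} {LABELS[score]:<26} {'#' * count}")
```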

JAR (Just About Right) scales

JAR scales measure whether the intensity of a specific attribute is appropriate. Example for “acidity”:

Much too low
A little too low
Just about right
A little too high
Much too high

Useful for diagnosing specific consumer complaints and optimizing recipes (e.g., finding the ideal brew strength for a café’s house blend).
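JAR responses are often summarized as the share of consumers on each side of "Just about right". The sketch below tallies those shares; the response data and the `jar_shares` name are hypothetical.

```python
from collections import Counter

JAR_LEVELS = [
    "Much too low",
    "A little too low",
    "Just about right",
    "A little too high",
    "Much too high",
]


def jar_shares(responses):
    """Share of responses that are too low, just about right, and too high."""
    counts = Counter(responses)
    unknown = set(counts) - set(JAR_LEVELS)
    if unknown:
        raise ValueError(f"unexpected responses: {sorted(unknown)}")
    n = len(responses)
    return {
        "too_low": (counts[JAR_LEVELS[0]] + counts[JAR_LEVELS[1]]) / n,
        "just_about_right": counts[JAR_LEVELS[2]] / n,
        "too_high": (counts[JAR_LEVELS[3]] + counts[JAR_LEVELS[4]]) / n,
    }


# Hypothetical acidity ratings from 20 café customers.
responses = (
    ["Much too low"] * 1
    + ["A little too low"] * 3
    + ["Just about right"] * 10
    + ["A little too high"] * 4
    + ["Much too high"] * 2
)
for label, share in jar_shares(responses).items():
    print(f"{label}: {share:.0%}")
```

A lopsided split (for example, far more "too high" than "too low") points to the direction in which the recipe could be adjusted.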

Preference mapping

Combines hedonic data with descriptive data to build a “map” of consumer preferences. Uses hierarchical cluster analysis and principal component analysis (PCA) to:

Segment consumers by shared preference patterns
Identify which sensory attributes drive liking in each segment
Map coffee products to consumer segments
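The segmentation step can be sketched with a small agglomerative (bottom-up hierarchical) clustering of consumers by their hedonic scores. This is a simplified illustration using average linkage and Euclidean distance in pure Python; the PCA step is omitted, and the data, brew strengths, and function names are all hypothetical. A real study would typically use a statistics package.

```python
from itertools import combinations
from math import dist
from statistics import mean


def average_linkage(a, b, rows):
    """Mean Euclidean distance between all cross-cluster pairs of consumers."""
    return mean([dist(rows[i], rows[j]) for i in a for j in b])


def cluster_consumers(rows, k):
    """Merge the two closest clusters repeatedly until only k clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], rows),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


def cluster_means(rows, cluster):
    """Mean liking per product within one cluster."""
    return [mean([rows[i][c] for i in cluster]) for c in range(len(rows[0]))]


# Hypothetical 9-point liking scores; columns are brews at three strengths.
brews = ["weak", "medium", "strong"]
liking = [
    [3, 5, 8], [2, 6, 9], [4, 5, 8],  # consumers 0-2
    [8, 6, 3], [9, 5, 2], [7, 6, 4],  # consumers 3-5
]
for cluster in cluster_consumers(liking, k=2):
    means = ", ".join(f"{b} {m:.1f}" for b, m in zip(brews, cluster_means(liking, cluster)))
    print(f"consumers {cluster}: {means}")
```

The per-cluster means show which products each segment prefers; relating them to descriptive data is what identifies the attributes that drive liking.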

Example finding relevant to Kaiserblick: Research on brewing preferences across a student population found two consumer clusters: “strong coffee likers” (TDS ~1.5%, driven by nutty, roasted, dark chocolate attributes) and “weak coffee likers” (TDS ~1.0%, driven by tea/floral, sweet, cereal attributes). Kaiserblick’s specialty light roast targets the second cluster, who are underserved by mainstream commodity coffee.

Descriptive Analysis

Purpose

To objectively quantify the sensory attributes of a coffee. Unlike affective testing, descriptive analysis deliberately suppresses preference and focuses on neutral, accurate characterization. Output can be correlated with processing variables, chemical composition, cupping scores, or consumer preference data.

Full descriptive panel

8–12 trained panelists with calibrated sensory references
Panelists assess attribute intensities on unstructured 15 cm line scales (no numbers; tick marks only)
Data analyzed by ANOVA, spiderweb plots, principal component analysis
Gold standard for research; expensive and time-consuming for commercial operations
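To illustrate the ANOVA step, line-scale marks can be read as distances in centimetres and compared across coffees. The sketch below computes a one-way F statistic for a single attribute. It is a deliberate simplification: panel data are usually analyzed with panelist (and replicate) as additional factors, and no p-value is computed because the standard library has no F distribution. All data and names are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic: between-coffee mean square divided by within-coffee mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical acidity marks (cm from the left end of a 15 cm line), 8 panelists each.
acidity_cm = {
    "Coffee A": [9.5, 10.2, 8.8, 11.0, 9.9, 10.5, 9.1, 10.0],
    "Coffee B": [6.1, 7.0, 5.5, 6.8, 6.4, 7.2, 5.9, 6.6],
    "Coffee C": [9.0, 9.8, 8.5, 10.1, 9.4, 9.9, 8.7, 9.6],
}
f_stat = one_way_anova_f(list(acidity_cm.values()))
print(f"F = {f_stat:.1f} on 2 and 21 degrees of freedom")
```

A large F relative to the critical value for the stated degrees of freedom indicates that at least one coffee differs in that attribute.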

CATA (Check-All-That-Apply)

A rapid, lower-cost profiling method. Panelists check all descriptors from a predefined list that apply to the sample. More practical than full descriptive analysis for commercial use.

Setting up a CATA test:

Use the nine primary categories of the Coffee Taster’s Flavor Wheel as the starting attribute list, or a subset appropriate to the coffees being evaluated
Each panelist checks all applicable terms independently
Statistical analysis: chi-square test on frequency of each attribute; correspondence analysis builds “flavor maps”
CATA data can be overlaid with affective data (hedonic scores) to connect descriptors to liking
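The frequency analysis can be sketched as follows: count how often each descriptor was checked per coffee, then compare two coffees on one descriptor with a 2x2 chi-square test. This is a simplified illustration with hypothetical ballots and function names. It treats the two sets of ballots as independent; when the same panelists assess both coffees, a paired test may be more appropriate.

```python
from collections import Counter
from math import erfc, sqrt


def tally(ballots):
    """Count how many panelists checked each descriptor."""
    return Counter(term for ballot in ballots for term in ballot)


def cata_chi_square(checked_a, checked_b, n_panelists):
    """Chi-square (1 df) and p-value for one descriptor's frequency in two coffees."""
    a, b = checked_a, n_panelists - checked_a
    c, d = checked_b, n_panelists - checked_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # descriptor checked by everyone or by no one
        return 0.0, 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    return chi2, erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 df


# Hypothetical ballots: each set holds the descriptors one panelist checked.
coffee_a = [{"floral", "fruity"}] * 10 + [{"sweet"}] * 2
coffee_b = [{"floral", "roasted"}] * 3 + [{"roasted", "nutty/cocoa"}] * 9

freq_a, freq_b = tally(coffee_a), tally(coffee_b)
chi2, p = cata_chi_square(freq_a["floral"], freq_b["floral"], n_panelists=12)
print(f"floral: A {freq_a['floral']}/12, B {freq_b['floral']}/12, chi2 = {chi2:.2f}, p = {p:.4f}")
```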

Descriptive cupping (Fernández-Alduenda rapid method): Combines trained cuppers with the SCA Cupping (Cata) protocol and CATA descriptors. Cuppers use structured SCA scoring plus CATA attribute notes. Dramatically cheaper than a full sensory panel while still providing usable descriptive flavor data. The minimum panel size is 6 cuppers.

Applying These Methods at Kaiserblick

Use case	Recommended method
Q Grader certification training	Triangulation, controlled difficulty level
Pre-shipment vs. arrival comparison	3-AFC (specific attributes) or triangulation
Processing experiment (fermentation variable)	Triangulation + 3-AFC for acidity/body
Lot description for export customers	Descriptive cupping (CATA) using Flavor Wheel
Café customer preference research	9-point hedonic scale + JAR for brew strength
Roast profile development	Descriptive cupping + hedonic scale
