Every cyclist who's spent more than six months training with a power meter has done at least one FTP test and wondered whether the number was real. Maybe the number went up by 15 watts after a good block and it felt like validation. Maybe it went down after a hard week and it felt like failure. Maybe it came back the same number as last time and you couldn't tell if your fitness was stable or your test was inconsistent. The frustration is universal, and it comes from a misunderstanding that most cycling content encourages: the belief that FTP is a fixed physiological number you're trying to measure, when in fact FTP is an estimation protocol for a number the protocol can't actually measure directly.
This guide is the practical, evidence-based version of what FTP tests measure and don't measure. It draws on Andrew Coggan's original work on the FTP framework, the lactate-threshold research that underlies it, and the critical literature that has complicated the picture in the twenty years since FTP entered amateur cycling culture. The goal is to give you an honest picture of what your FTP number actually represents, why different test protocols give different numbers for the same athlete, and how to use the number for training decisions without being misled by its apparent precision.
What is FTP supposed to represent?
Functional Threshold Power, as originally defined by Andrew Coggan in the early 2000s, is 'the highest power output a cyclist can maintain in a quasi-steady state for approximately one hour'. That definition is important because it's simultaneously physiological and operational. Physiologically, FTP is meant to correspond to the maximal lactate steady state — the intensity at which lactate production and lactate clearance are still in balance, and above which lactate accumulates rapidly until the effort becomes unsustainable. Operationally, FTP is meant to be measurable in the field without requiring a lab lactate test.
The practical framework that Coggan built around FTP — the 7-zone training system, TSS, Intensity Factor, Normalized Power, the Performance Management Chart — is what made the concept dominant in amateur cycling coaching. Zones, load tracking, and training targets all derive from a single reference number, and that reference number is FTP. If the FTP number is accurate, the whole framework works. If the number is off, every derived calculation is off too.
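To make that dependence concrete, here is a minimal Python sketch using the commonly published Coggan formulas for Intensity Factor (IF = NP / FTP) and TSS. The ride values and the three candidate FTP numbers are hypothetical, chosen only to show how a 10-watt error in the anchor moves the derived load figure.

```python
def intensity_factor(normalized_power: float, ftp: float) -> float:
    """IF = NP / FTP."""
    return normalized_power / ftp


def training_stress_score(duration_s: float, normalized_power: float, ftp: float) -> float:
    """TSS = (duration_s * NP * IF) / (FTP * 3600) * 100."""
    i_f = intensity_factor(normalized_power, ftp)
    return (duration_s * normalized_power * i_f) / (ftp * 3600) * 100


# Hypothetical ride: 90 minutes at a Normalized Power of 200 W.
ride_seconds = 90 * 60
ride_np = 200.0

# Same ride scored against three candidate FTP values.
for ftp in (240.0, 250.0, 260.0):
    i_f = intensity_factor(ride_np, ftp)
    tss = training_stress_score(ride_seconds, ride_np, ftp)
    print(f"FTP {ftp:.0f} W -> IF {i_f:.2f}, TSS {tss:.0f}")
```

Because FTP appears in the denominator twice, TSS scales with 1/FTP squared, so a 4 percent error in FTP shifts the TSS of every ride by roughly 8 percent.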
The catch is that there is no easy way to measure the true physiological threshold in the field. A proper lactate threshold test requires a laboratory, a gas analyzer, a blood lactate measurement, and a trained physiologist to interpret the results. The 20-minute test and the ramp test and the 8-minute test are all attempts to approximate that physiological reality with a field test that any cyclist with a power meter can perform. The approximation is usually good enough to drive training decisions, but it is an approximation, and the difference between approximation and measurement is the source of most confusion about FTP numbers.
What does the Coggan 20-minute test actually measure?
The 20-minute test, as Coggan originally proposed it, consists of a warm-up, a 5-minute all-out effort (to open up the legs and take the edge off the anaerobic reserve so it inflates the main effort less), 10 minutes of easy spinning, and then a 20-minute maximal effort. The average power of the 20-minute effort is multiplied by 0.95 to get FTP.
The 0.95 multiplier comes from the empirical observation that, for most trained cyclists, 1-hour maximum sustainable power is about 95 percent of 20-minute maximum sustainable power. Coggan derived this from a large sample of trained cyclists' actual power data, and it holds reasonably well for the population: most athletes are within a few percent of the 95 percent rule. But individual athletes can be systematically off in either direction. Some cyclists have a strongly aerobic profile and can hold 97 to 98 percent of their 20-minute power for an hour; others have unusually high anaerobic capacity, which inflates the 20-minute number, and can only hold 90 to 92 percent. For those athletes, the 0.95 multiplier produces a systematically wrong FTP: too low for the first group, too high for the second.
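A small Python sketch of that arithmetic, assuming a hypothetical athlete with a 20-minute average of 280 W. The function name and the three ratios are illustrative; only the 0.95 default comes from the protocol itself.

```python
def ftp_from_20min(avg_power_20min: float, hour_ratio: float = 0.95) -> float:
    """Estimate FTP as a fixed fraction of 20-minute average power."""
    return avg_power_20min * hour_ratio


p20 = 280.0  # hypothetical 20-minute test result in watts
default_estimate = ftp_from_20min(p20)

# What the same test implies if the athlete's real hour/20-minute ratio
# differs from the population default of 0.95.
for true_ratio in (0.91, 0.95, 0.97):
    true_hour_power = ftp_from_20min(p20, true_ratio)
    error = default_estimate - true_hour_power
    print(f"true ratio {true_ratio:.2f}: hour power {true_hour_power:.1f} W, "
          f"default estimate off by {error:+.1f} W")
```

For an athlete at either end of that range, the default multiplier misses by several watts in one direction on every test, which is a bias rather than random noise.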
The 20-minute test also measures what you can do in 20 minutes, which is not the same as what you can do in 60 minutes. The metabolic systems that fuel a 20-minute effort include a meaningful anaerobic contribution — blood lactate rises significantly during the test, the cyclist is working above the true sustainable threshold for some portion of the effort, and the final few minutes are often in a state of accumulating fatigue. The 0.95 correction tries to subtract the anaerobic component to isolate the sustainable power, but the correction is an average, not an individual truth.
The 20-minute test is still the most commonly used FTP test in amateur cycling, and it's a reasonable default for most athletes. But it should be understood as an indirect estimate that assumes you're average, and athletes who suspect they're not average should cross-check with other test protocols or pay attention to their actual sustainable-hour power when they do long efforts.
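One way to run that cross-check is to compare your best real 60-minute power against your best 20-minute power. The sketch below assumes 1 Hz power samples in watts and uses synthetic rides; `best_average_power` is a hypothetical helper, not a function from any particular training platform.

```python
from itertools import accumulate


def best_average_power(samples_w, window_s):
    """Highest mean power over any contiguous window of window_s samples (1 Hz data)."""
    if window_s <= 0 or len(samples_w) < window_s:
        raise ValueError("need at least window_s samples")
    # Prefix sums let each window total be computed in constant time.
    prefix = [0.0, *accumulate(samples_w)]
    best_total = max(
        prefix[i + window_s] - prefix[i]
        for i in range(len(samples_w) - window_s + 1)
    )
    return best_total / window_s


# Synthetic data: a long hard ride containing a steady hour at 250 W,
# and a test ride containing a 20-minute effort at 270 W.
hard_hour_ride = [180.0] * 600 + [250.0] * 3600 + [150.0] * 300
test_ride = [150.0] * 900 + [270.0] * 1200 + [120.0] * 600

best_60 = best_average_power(hard_hour_ride, 3600)
best_20 = best_average_power(test_ride, 1200)
personal_ratio = best_60 / best_20

print(f"best 60 min: {best_60:.0f} W, best 20 min: {best_20:.0f} W")
print(f"personal hour/20-minute ratio: {personal_ratio:.3f} (protocol assumes 0.95)")
```

The comparison only means something if both efforts were genuinely maximal and ridden in similar conditions, so treat a personal ratio as a rough correction rather than a measured constant.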
What does the ramp test measure?
The ramp test, popularized by Zwift and TrainerRoad, consists of a warm-up followed by a progressive ramp where power increases by 20 to 25 watts every minute until the athlete cannot continue. FTP is calculated as 75 percent of the average power during the last minute of the test (or equivalently, 75 percent of the maximal aerobic power, which is what the ramp test is really measuring).
The ramp test measures maximal aerobic power (MAP) — the highest sustained 1-minute power the athlete can reach before failure — and then uses a fixed multiplier (0.75) to estimate FTP. This is a fundamentally different approach than the 20-minute test. Instead of estimating sustainable power from a long effort, it estimates sustainable power from a maximal aerobic ceiling. The assumption is that the MAP-to-FTP ratio is stable across athletes at approximately 0.75, and for the population average this is roughly correct.
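The calculation can be sketched in a few lines of Python. This assumes 1 Hz power samples and a hypothetical ramp that starts at 100 W and climbs 20 W per minute; the helper names are illustrative and real platforms may differ in details such as step size.

```python
def build_ramp(start_w, step_w, full_steps, seconds_into_last_step):
    """Synthetic 1 Hz ramp: full 60-second steps, then a partial step before failure."""
    samples = []
    for step in range(full_steps):
        samples.extend([start_w + step * step_w] * 60)
    samples.extend([start_w + full_steps * step_w] * seconds_into_last_step)
    return samples


def ramp_test_ftp(samples_w, map_ratio=0.75):
    """Return (MAP, FTP estimate): MAP is the mean of the final 60 seconds."""
    if len(samples_w) < 60:
        raise ValueError("need at least 60 seconds of data")
    map_w = sum(samples_w[-60:]) / 60
    return map_w, map_w * map_ratio


# Hypothetical athlete completes the 340 W step and fails 30 s into the 360 W step.
ramp = build_ramp(start_w=100, step_w=20, full_steps=13, seconds_into_last_step=30)
map_w, ftp = ramp_test_ftp(ramp)
print(f"MAP {map_w:.0f} W -> FTP estimate {ftp:.1f} W")
```

Note that the final minute usually straddles two steps, so MAP lands between the last completed step and the step where the athlete stopped.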
But individual variance here is even larger than for the 20-minute test. The ratio between MAP and FTP depends on an athlete's aerobic-to-anaerobic profile. Athletes with strong anaerobic capacity can produce very high MAP numbers without having correspondingly high sustained power, which makes their ramp-test FTP estimate too high. Athletes with strong aerobic base but less explosive capacity have the opposite problem — their MAP is modest, but their sustainable power is closer to their MAP than the 0.75 ratio assumes, and their ramp-test FTP comes in too low.
The practical consequence is that the ramp test often gives a different FTP number than the 20-minute test for the same athlete on the same day. Differences of 10 to 20 watts are common. Neither number is 'right' in an absolute sense — they're both estimates of the same underlying physiology from different angles, and the 'correct' number depends on which estimate better captures the individual athlete's profile.
What does the 8-minute test measure?
The 8-minute test, developed by Hunter Allen and Andrew Coggan as an alternative to the 20-minute test, consists of two 8-minute maximal efforts separated by 10 minutes of easy spinning. FTP is calculated as the average of the two 8-minute efforts multiplied by 0.9.
The rationale is that some athletes cannot sustain a 20-minute all-out effort well: the pacing is harder, the mental demand is greater, and the result is sometimes unreliable. Two shorter 8-minute efforts are easier to pace and produce more consistent numbers. The 0.9 multiplier is smaller than the 0.95 of the 20-minute test because the 8-minute effort is shorter and more anaerobic, so sustainable hour power sits further below the measured value.
The 8-minute test has a similar systematic issue to the 20-minute test: the multiplier is a population average, not an individual truth. Athletes with strong aerobic profiles hold 92 to 94 percent of their 8-minute power for an hour; athletes with high anaerobic capacity hold only 85 to 88 percent. For the outliers, the 0.9 multiplier is wrong.
In practice, the 8-minute test is less commonly used than the 20-minute test, and most cyclists who have done both report that the 8-minute test produces a slightly higher FTP number than the 20-minute test. The difference is usually 5 to 15 watts, which is meaningful for training zones and suggests that the two tests are not measuring exactly the same thing even when applied correctly.
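Putting the three protocols side by side shows how the same athlete can end up with three different anchors. The sketch below uses the multipliers described in this guide; the test results themselves are hypothetical numbers for a single imaginary athlete.

```python
def ftp_20min(avg_power_20min, ratio=0.95):
    """20-minute test: 95 percent of 20-minute average power."""
    return avg_power_20min * ratio


def ftp_ramp(last_minute_power, ratio=0.75):
    """Ramp test: 75 percent of the final-minute average power."""
    return last_minute_power * ratio


def ftp_8min(effort_1, effort_2, ratio=0.90):
    """8-minute test: 90 percent of the mean of the two 8-minute efforts."""
    return (effort_1 + effort_2) / 2 * ratio


# Hypothetical results for one athlete.
estimates = {
    "20-minute": ftp_20min(280.0),
    "ramp": ftp_ramp(350.0),
    "8-minute": ftp_8min(305.0, 299.0),
}

for protocol, watts in estimates.items():
    print(f"{protocol:>10}: {watts:.1f} W")

spread = max(estimates.values()) - min(estimates.values())
print(f"spread across protocols: {spread:.1f} W")
```

A spread of this size is enough to move zone boundaries, which is the practical argument for picking one protocol and staying with it.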
What is MLSS and why doesn't any field test actually measure it?
Maximal Lactate Steady State is the true physiological definition of threshold: the highest constant-power effort at which blood lactate concentration remains stable over 30 minutes, rather than rising progressively. Above MLSS, lactate accumulates and the effort becomes unsustainable within roughly an hour. Below MLSS, the effort could theoretically be sustained much longer. MLSS is what FTP is trying to estimate, and it's the closest thing to a physiological gold standard for endurance threshold power.
Measuring MLSS directly requires a laboratory protocol. The athlete rides at a fixed constant power for 30 minutes while blood lactate is drawn at multiple intervals. If lactate rises by more than 1 mmol/L across the effort (lab protocols commonly judge this between minute 10 and minute 30, after the initial rise has settled), the power is above MLSS and the test is repeated at a lower power on another day. If lactate is stable, the power is at or below MLSS, and the next session tries a higher power. A complete MLSS determination typically requires 3 to 5 test sessions on separate days to bracket the exact value, which is why it's not a routine part of amateur cycling assessment.
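The bracketing logic can be sketched as follows. This is an illustration under stated assumptions, not a lab procedure: the lactate values are invented, the function names are hypothetical, and the choice to measure the rise from minute 10 onward is an assumption based on common lab convention rather than something this guide specifies.

```python
def lactate_is_stable(samples, baseline_minute=10.0, max_rise_mmol=1.0):
    """samples: time-ordered (minute, lactate_mmol_per_l) tuples from one 30-minute trial.

    Stable means lactate rose by no more than max_rise_mmol between the first
    sample at or after baseline_minute and the final sample.
    """
    window = [(t, la) for t, la in samples if t >= baseline_minute]
    if len(window) < 2:
        raise ValueError("need at least two samples after the baseline minute")
    rise = window[-1][1] - window[0][1]
    return rise <= max_rise_mmol


def bracket_mlss(trials):
    """trials: dict of power_w -> samples. Returns (highest stable, lowest unstable) power."""
    verdicts = {power: lactate_is_stable(samples) for power, samples in trials.items()}
    stable = [p for p, ok in verdicts.items() if ok]
    unstable = [p for p, ok in verdicts.items() if not ok]
    return (max(stable) if stable else None, min(unstable) if unstable else None)


# Invented data: three constant-power trials on separate days.
trials = {
    240: [(10, 3.1), (20, 3.3), (30, 3.4)],  # rise 0.3 mmol/L
    250: [(10, 3.6), (20, 4.0), (30, 4.3)],  # rise 0.7 mmol/L
    260: [(10, 4.2), (20, 5.1), (30, 6.0)],  # rise 1.8 mmol/L
}

lower, upper = bracket_mlss(trials)
print(f"MLSS lies between {lower} W and {upper} W")
```

Each additional session narrows the bracket, which is why a full determination takes several lab visits.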
FTP, measured by any field test, does not equal MLSS exactly. The relationship is close but not identical. Research comparing field-test FTP to lab-tested MLSS in trained cyclists has found differences of 5 to 15 watts in either direction, depending on the athlete and the specific test protocol. For most athletes, FTP is within 10 watts of MLSS, which is close enough for training purposes but not physiologically identical.
The practical implication is that FTP is an operational proxy for MLSS, not a direct measurement of it. Training zones derived from FTP are useful because they roughly approximate the physiological zones you'd get from a true MLSS-based framework, but they're not perfect, and the 'my zones should be exactly X percent of FTP' level of precision is an artifact of the framework rather than a biological reality.
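To see how much apparent precision the percentages carry, the sketch below computes zone ceilings from FTP using the commonly published Coggan zone boundaries (55, 75, 90, 105, 120, and 150 percent of FTP). Those percentages are not quoted in this guide, so treat them as an assumption, and the two FTP values are hypothetical.

```python
# Upper bound of each zone as a percentage of FTP (commonly published Coggan
# values). Zone 7, Neuromuscular Power, has no upper bound and is omitted.
COGGAN_ZONE_UPPER_PCT = [
    ("Z1 Active Recovery", 55),
    ("Z2 Endurance", 75),
    ("Z3 Tempo", 90),
    ("Z4 Lactate Threshold", 105),
    ("Z5 VO2max", 120),
    ("Z6 Anaerobic Capacity", 150),
]


def zone_upper_bounds(ftp):
    """Return {zone name: upper bound in whole watts} for a given FTP."""
    return {name: round(ftp * pct / 100) for name, pct in COGGAN_ZONE_UPPER_PCT}


# Two plausible anchors for the same athlete, 10 W apart.
low_anchor = zone_upper_bounds(250)
high_anchor = zone_upper_bounds(260)

for name, _ in COGGAN_ZONE_UPPER_PCT:
    shift = high_anchor[name] - low_anchor[name]
    print(f"{name:<22} {low_anchor[name]:>4} W vs {high_anchor[name]:>4} W (shift {shift} W)")
```

A 10-watt uncertainty in the anchor moves every boundary by several watts, so a target quoted to the single watt is more exact than the underlying estimate can support.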
For most amateur cyclists, the distinction between FTP and MLSS is academic — training zones based on field-test FTP produce results that are close enough to ideal to drive good training decisions. The distinction matters more for elite athletes fine-tuning race performance and for researchers trying to isolate specific physiological adaptations.
Why is your indoor FTP different from your outdoor FTP?
Almost every cyclist who tests both indoors on a smart trainer and outdoors on the road finds that the numbers differ. The difference is usually 5 to 20 watts, and the outdoor number is almost always the higher one. The reasons are practical, not physiological, and they matter for how you interpret your FTP number.
- Cooling. Indoor cycling produces much more heat accumulation than outdoor cycling because the airflow is almost zero and sweat evaporation is poor. Core temperature rises faster, heart rate drifts higher at the same power, and sustainable output drops. Adding a large fan indoors can close the gap but rarely eliminates it.
- Motivation and mental demand. A 20-minute all-out indoor effort in a silent room is much harder mentally than a 20-minute effort on a favorite outdoor segment with Strava running. Athletes usually produce higher numbers when mentally engaged with the outdoor environment, even when the physical capacity is identical.
- Pacing. Outdoor tests allow the athlete to use terrain and traffic cues for pacing, which is easier than staring at a power number on a screen. Better pacing means more uniform effort distribution, which produces a higher average power.
- Power meter variance. Indoor smart trainers and outdoor crank or pedal power meters are often calibrated differently and can diverge by 3 to 8 percent for the same athlete. A number from a Kickr and a number from a Stages crank are not always comparable, and the 'difference' between indoor and outdoor FTP sometimes reflects meter variance rather than real physiological difference.
- Position. Indoor training is often done in a more upright, comfortable position than outdoor racing. This factor runs the other way: the upright position opens the hip angle and produces slightly higher power at the same effort, so it offsets a little of the indoor deficit, but the difference is usually small.
The practical rule is to test FTP in the environment you'll primarily train in, and to use that number as your training anchor. If you mostly train indoors, test indoors. If you mostly train outdoors, test outdoors. Comparing numbers across environments will just confuse you.
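The power meter variance point in the list above is easy to check with arithmetic. A minimal sketch, assuming a hypothetical rider producing 250 W and the 3 to 8 percent disagreement range quoted in that bullet:

```python
def meter_gap_watts(power_w, disagreement_pct):
    """Watts of apparent difference caused purely by two meters disagreeing."""
    return power_w * disagreement_pct / 100


rider_power = 250.0  # hypothetical threshold-level effort in watts
for pct in (3, 8):
    gap = meter_gap_watts(rider_power, pct)
    print(f"{pct} percent disagreement at {rider_power:.0f} W = {gap:.1f} W")
```

At that power the meter disagreement alone spans 7.5 to 20 W, which is the same order as the whole indoor-outdoor gap, so comparing numbers from two different devices cannot separate equipment from physiology.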
Does a rising FTP always mean you got fitter?
Not necessarily, and this is one of the more subtle issues with FTP as a fitness proxy. The number can change for reasons other than underlying fitness improvement.
A rising FTP can reflect real fitness improvement — you're producing more watts for the same effort, your aerobic system is absorbing more oxygen, your lactate clearance has improved, your muscular efficiency has risen. This is the case most cyclists hope for after a training block, and it's genuinely common.
But a rising FTP can also reflect other things: a fresher test day, better weather, better pacing on the test, improved fueling, better rest, or just a lucky test where you pushed harder than usual. Conversely, a flat or declining FTP after a training block doesn't always mean the block failed. It can reflect test-day fatigue, subpar pacing, equipment issues, or the athlete being at a different point in the training cycle (a base phase is not the best context for peak FTP).
The most reliable way to tell whether FTP has really changed is to look at the trend across several tests rather than a single result. Multiple tests over a season, with consistent protocols and conditions, produce a more meaningful picture than a single test interpreted in isolation. Athletes who retest regularly (every 6 to 12 weeks) have more reliable data than athletes who retest once a season and treat each result as definitive.
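One simple way to read a trend is to fit a straight line through several test results. The sketch below uses an ordinary least-squares slope in watts per week; that choice of method is an assumption for illustration, the guide does not prescribe one, and the test dates and values are invented.

```python
from datetime import date


def ftp_trend_w_per_week(tests):
    """tests: list of (date, ftp_w). Returns the least-squares slope in watts per week."""
    if len(tests) < 2:
        raise ValueError("need at least two tests")
    first_day = min(day for day, _ in tests)
    weeks = [(day - first_day).days / 7 for day, _ in tests]
    watts = [ftp for _, ftp in tests]
    mean_x = sum(weeks) / len(weeks)
    mean_y = sum(watts) / len(watts)
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    if sxx == 0:
        raise ValueError("tests must fall on at least two different dates")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, watts))
    return sxy / sxx


# Invented history: four tests, eight weeks apart, same protocol each time.
history = [
    (date(2024, 1, 8), 250.0),
    (date(2024, 3, 4), 262.0),
    (date(2024, 4, 29), 255.0),
    (date(2024, 6, 24), 261.0),
]

slope = ftp_trend_w_per_week(history)
print(f"trend: {slope:+.2f} W per week across {len(history)} tests")
```

In this invented history the second test looks like a 12-watt jump and the third like a 7-watt drop, while the fitted line shows a much gentler rise, which is the kind of smoothing a single result cannot give you.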
The other reliable indicator is performance in actual training and racing. If your FTP went up and your sweet-spot sessions at the new targets feel appropriately hard, your 70.3 bike split is faster, and your perceived effort at a given power is lower, the fitness change is real. If your FTP went up but your training sessions feel the same and your race times are identical, the number change is probably a test artifact rather than a real adaptation.
What are the most common FTP testing mistakes?
Five mistakes catch most amateur cyclists.
- Testing when you're tired and interpreting the result as 'declining fitness'. FTP is a maximal test, and maximal tests require fresh legs. Testing at the end of a hard training week reliably produces a number 10 to 30 watts below your actual sustainable power. The fix is to schedule tests after recovery days or easy weeks, not after hard ones.
- Pacing the 20-minute test like a sprint instead of like a time trial. The correct pacing for a 20-minute FTP test is to start slightly below your expected average power, hold steady through the middle 10 minutes, and finish above it. Athletes who go out too hard blow up and produce a low number; athletes who go out too easy leave watts on the table. Getting the pacing right usually takes 2 to 3 attempts.
- Comparing FTP numbers across different test protocols. A 20-minute test FTP and a ramp test FTP for the same athlete are not the same number and shouldn't be compared directly. If you switch protocols, expect a meaningful shift in the number that doesn't reflect a change in fitness.
- Treating FTP as a universal fitness score rather than a training anchor. FTP is useful for setting your training zones, calculating TSS, and prescribing interval intensities. It is less useful as a generic fitness comparison, because it doesn't account for durability, endurance over long rides, climbing ability, anaerobic capacity, or race execution. A higher FTP does not always mean a better cyclist.
- Retesting at the wrong time. Testing FTP during a base phase usually produces a lower number than testing after a build phase, even if fitness is equal, because FTP tests are acutely sensitive to recent intensity work. Testing a few days after a tough interval block or a race, once the acute fatigue has cleared, can produce a higher number than the athlete's true sustainable capacity because the top end is freshly sharpened. The most representative tests come at the beginning of a build block after a recovery week.
Key takeaways
- FTP is an estimation protocol for the maximal sustainable power over approximately one hour, and an indirect proxy for the maximal lactate steady state (MLSS).
- The 20-minute test multiplies 20-minute power by 0.95; the ramp test multiplies 1-minute max aerobic power by 0.75; the 8-minute test multiplies 8-minute average by 0.9. All three use population-average multipliers that can be wrong for individual athletes.
- None of these tests measure MLSS directly. MLSS requires a laboratory protocol with blood lactate measurement, and field-test FTP typically differs from true MLSS by 5 to 15 watts.
- Different test protocols produce different FTP numbers for the same athlete — differences of 10 to 20 watts between ramp test and 20-minute test are common.
- Indoor FTP is usually 5 to 20 watts lower than outdoor FTP due to heat, motivation, pacing, and power meter variance, not real physiological difference.
- A rising FTP does not always reflect fitness improvement — it can reflect better pacing, freshness, or luck on the test day. Look at trends, not single tests.
- FTP is a useful training anchor for setting zones, calculating TSS, and prescribing intervals, but it's not a universal fitness score and shouldn't be compared directly across athletes or test protocols.
- The best practice is to test in the environment you train in, with consistent protocol and conditions, and to interpret results as trend data rather than as individual measurements.
Frequently asked questions
Which FTP test is most accurate?
Depends on the athlete. The 20-minute test is the most widely used and produces a reasonable estimate for most athletes, but it requires good pacing discipline and mental commitment. The ramp test is faster, less physically demanding, and easier to pace, but it's more sensitive to individual differences in the MAP-to-FTP ratio. The 8-minute test is shorter and some athletes prefer it. None is 'most accurate' in an absolute sense — the best test is the one you can execute consistently and that produces numbers you can trust as trend data. Most cyclists pick one protocol and stick with it.
Why does my FTP number change between tests in the same week?
Usually because FTP tests are maximal efforts and maximal efforts are acutely sensitive to pacing, freshness, motivation, and environmental conditions. A test on Monday after a recovery day and a test on Friday after a hard week can easily differ by 15 to 25 watts without any real change in fitness. The solution is to test less often, under consistent conditions (same time of day, same warm-up, similar pre-test training), and to interpret trends over multiple tests rather than reacting to individual results.
Is FTP the same as lactate threshold?
Close but not identical. Lactate threshold has several definitions in the research literature — LT1 is the first rise in blood lactate above resting baseline (top of true Zone 2), LT2 or MLSS is the highest sustainable intensity where lactate production and clearance balance. FTP is an estimate of LT2 / MLSS, not LT1. But even for LT2, FTP is an operational proxy, not a direct measurement. Field-test FTP typically differs from lab-tested MLSS by 5 to 15 watts in either direction depending on the athlete and the test protocol.
Does FTP matter for anything outside cycling?
Not directly. FTP is a cycling-specific metric because it depends on the power-sensing capabilities of bike-specific equipment. Running has its own threshold concept (critical speed, or running threshold) measured differently. Swimming has its own threshold pace concept. The zones and training principles derived from FTP on the bike do have parallels in running and swimming, but the numbers themselves don't transfer — a 300W cyclist isn't automatically a sub-3-hour marathoner. Each discipline has its own threshold measurement and its own training zones.
How often should I re-test my FTP?
Every 6 to 12 weeks during an active training cycle, less often in base or recovery phases. Testing too frequently produces noisy data and reacts to short-term variance rather than real fitness changes. Testing too rarely means your training zones drift out of date as your fitness shifts. Most coaches working with amateur cyclists re-test at major phase transitions: start of build, start of specific phase, and sometimes mid-cycle if the athlete requests it. Between tests, trust the zones and adjust only if the perceived effort at zone targets clearly diverges from expectation.
Can I just use perceived effort instead of FTP?
Yes, and many experienced cyclists do. Perceived effort (RPE) is a legitimate training intensity metric that requires no equipment and no test protocol. It's less precise than power-based zones for short intervals but just as useful for most aerobic work, and it's robust to fatigue, fueling, and environment in ways that power numbers are not. The case for FTP and power zones is that they're more precise for prescribing specific interval intensities and for load tracking. The case for RPE is that it doesn't depend on any assumptions or estimations. Many good plans use both — power for intervals, RPE for easy days and long rides.
How CoreRise handles your FTP number
When you enter your FTP in CoreRise, the coach treats it as a training anchor for setting zones, prescribing intervals, and calculating TSS — not as a universal fitness score or a fixed physiological constant. The plan adjusts intensity targets based on your FTP, but the coach also pays attention to your perceived effort and the actual execution of sessions to catch cases where the number is wrong. If your sweet-spot session at 88 percent of FTP feels like a race effort, the coach will suggest re-testing or adjusting the anchor. If the same session feels suspiciously easy, the coach may push you to re-test upward.
Cora can also help you interpret FTP test results honestly. If your FTP went up by 20 watts after a base phase, the coach can help you check whether the rise reflects real fitness gains or whether it's plausibly a testing artifact. If your FTP went down after a build phase, the coach can diagnose whether it's fatigue, pacing, or real decline — and recommend the appropriate response. FTP is one of the most misinterpreted numbers in amateur cycling, and the coach's job is to make sure the number is serving your training rather than the other way around.
- FTP is used as a training anchor for zones and prescription, not as a fixed physiological measurement.
- Session intensity targets adjust to your FTP but the coach cross-checks against perceived effort and session execution.
- Re-testing is scheduled at major phase transitions rather than on a fixed calendar, reducing noise from over-frequent testing.
- Indoor and outdoor FTP are treated as potentially different numbers and the coach knows which environment you primarily train in.
- FTP changes are interpreted against session execution and race results rather than as standalone fitness verdicts.