Comparing the Real-World Accuracy of Mechanical Cycling Power Meters

Mechanical cycling power meters are delicate devices that perform hundreds of measurements every minute, in the very noisy environment of a bike pounded continuously by an aggressive rider, the wind, and the bumps of the road. It is remarkable that, with all these complications, the manufacturers of mechanical power meters report accuracies as good as 1-3%. Independent tests reported in scientific journals have indeed shown comparable levels of accuracy (albeit closer to 3-5%; see, e.g., the results for SRM, PowerTap, and Stages), but only under carefully controlled conditions, with riding at constant speed and constant power output. The question on everyone’s mind, of course, is how well power meters perform in the real world, when used by cyclists on real roads, with all the speed and power changes of a regular ride. (A future post will feature a comparison of the accuracy of mechanical and GPS-based cycling power meters.)

Mechanical power meters estimate the power output of a cyclist using the simple expression

Power = Force * Cadence

where the Force on a rotating device (e.g., the crank, the hub, or the pedal, depending on the manufacturer) is measured by a number of strain gauges. The most immediate concern with this technique is, of course, the accuracy of the strain gauge calibration, which has been discussed at length elsewhere. This calibration depends on the ambient temperature, which is usually taken into account using models devised and calibrated by each manufacturer. It also depends on any minor deformation of the device caused by regular usage, which is why you need to send a power meter back to the manufacturer regularly for servicing and recalibration.

There are, however, two fundamental limitations in this approach to power measurement that are hard to overcome, even in the unrealistic case that one uses a perfectly calibrated set of strain gauges. These limitations arise from the fact that, during a single revolution of the crank, the cadence cannot be measured at the 1% level, nor does the force remain constant at that level. Let’s consider each of these two effects in turn, starting with the cadence.

Measuring the Cadence

Most cyclists ride at a cadence of 80-90 RPM, but values as low as 60 RPM or as high as 120 RPM are not uncommon during steep uphills and downhills. This range of cadence corresponds to 1-2 revolutions per second in the crank (or something similar in the hub, depending on the gear ratio used). In other words, in order to achieve a 1% accuracy in power reported every second, a mechanical power meter needs to measure the cadence (i.e., the number of revolutions per second) with an accuracy of 1%, by using at most one or two full revolutions of the crank. This is a tall order, even for the most sophisticated devices found in research labs.
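To put a number on this requirement, here is a minimal Python sketch. It assumes (this is not stated above, and real devices may work differently) that the cadence is derived by timing a whole number of crank revolutions, so that a fractional error in the timed duration carries over as roughly the same fractional error in the cadence. The function name and its parameters are hypothetical.

```python
def timing_resolution_ms(cadence_rpm, revolutions=1, target_accuracy=0.01):
    """Timing precision (in ms) needed to know the cadence to within
    `target_accuracy`, assuming the cadence is obtained by timing
    `revolutions` full turns of the crank (an illustrative assumption)."""
    seconds_per_rev = 60.0 / cadence_rpm
    window_s = revolutions * seconds_per_rev
    # cadence = revolutions / window, so a small fractional error in the
    # window gives approximately the same fractional error in the cadence.
    return 1000.0 * target_accuracy * window_s


for rpm in (60, 90, 120):
    print(f"{rpm} RPM: one revolution must be timed to within "
          f"{timing_resolution_ms(rpm):.1f} ms")
```

Under this assumption, the faster the cadence, the shorter the revolution and the tighter the timing has to be.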

If a cyclist keeps his/her cadence constant for many revolutions (as is the case in the studies linked above), then simply averaging the measurements over many seconds increases the accuracy of the measured cadence. During a normal ride, however (and especially during accelerations and decelerations), the cadence changes continuously. The accuracy can still be improved by averaging over several revolutions (i.e., several seconds), but there is a tradeoff: the longer the averaging window, the less able the device is to follow the rapid changes in power that may occur every few seconds.

In a study published in 2009 in the International Journal of Sports Medicine, researchers explored this limitation using a dynamic calibration rig to apply torque on a stationary bicycle equipped with an SRM power meter (which is arguably the most accurate mechanical power meter available commercially). The benefit of using the rig (as opposed to a human) is that one can dial in any desired power and cadence profile. When the cadence and power were kept constant for a prolonged time (30 minutes to an hour) and the data were averaged accordingly, the accuracy of the SRM was as good as two tenths of a percent (0.2%). However, when the cadence was increased from 0 to 120 RPM over 3-4 seconds, SRM showed a lag in measuring the corresponding power increase and reported lower peak power (both of which are evidence of averaging).
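The two signatures of averaging mentioned here (a lag and a lower peak) are easy to reproduce with a toy example. The sketch below applies a simple trailing average to a made-up power profile sampled once per second. Neither the profile nor the 4-second window comes from the study, and the SRM's actual filtering is not described here, so this only illustrates the general effect.

```python
def trailing_average(values, window):
    """Average each sample with the (up to) `window - 1` samples before it."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical applied power (W), one sample per second: a short surge.
applied = [100] * 5 + [300, 600, 600, 600, 300] + [100] * 5
# Hypothetical device that reports a 4-second trailing average.
measured = trailing_average(applied, window=4)

peak_applied = max(applied)
peak_measured = max(measured)
lag_s = measured.index(peak_measured) - applied.index(peak_applied)

print(f"applied peak:  {peak_applied} W")
print(f"measured peak: {peak_measured} W")
print(f"peak reported {lag_s} s late")
```

A longer window smooths out more noise but lowers and delays the reported peak further, which is the tradeoff described above.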

The Figure below is based on the data of the above study and shows the difference between the applied (Rig) and the measured (SRM) power during one of the cycles of rapid change in the pedaling cadence. If we average the power during the first 12 seconds of the acceleration (shown by the dashed line), the difference in power is 11%. The same SRM device that was accurate to a fraction of a percent (0.2%) during a prolonged test at constant cadence becomes roughly 50 times less accurate when the cadence changes rapidly.


The variable nature of the force on the pedals

Most cycling power meters report power at 1 second intervals or, in other words, power averaged over one or two revolutions of the crank. It is well known, however, that the force applied on the pedals is not constant but changes between the downward and upward motion of the revolution. Even if this change is relatively smooth, how accurately a device can measure this average depends on how often it records the force during each revolution of the crank.

If the force changes by X% during a revolution and is measured N times, then the accuracy of the average measurement is simply X/N percent. (This is straightforward to show but too elaborate to derive here; for the mathematically inclined, the important point is that the error does not scale with the square root of N.) In other words, if we assume that the force changes by X=50% (factors of 2 or 3 are often seen in cyclists at an intermediate level) and the power meter makes N=10 measurements per revolution, then the best accuracy that can be achieved is about 50/10=5%. This is, again, a limitation on the accuracy of the measurement at the 5-10% level.
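For readers who want to play with the numbers, the short Python sketch below simply takes the X/N rule quoted above at face value and tabulates it. It does not derive or verify the rule, and the function names are made up for this illustration.

```python
import math


def sampling_error_percent(force_variation_percent, samples_per_rev):
    """Best achievable accuracy (%) according to the X/N rule in the text."""
    return force_variation_percent / samples_per_rev


def samples_needed(force_variation_percent, target_percent):
    """Smallest N for which X/N does not exceed the target accuracy."""
    return math.ceil(force_variation_percent / target_percent)


X = 50  # assumed force variation during one revolution, in percent
for n in (5, 10, 25, 50):
    print(f"N = {n:2d} samples per revolution -> "
          f"about {sampling_error_percent(X, n):.0f}%")

print("samples per revolution needed for 1%:", samples_needed(X, 1))
```

With the same rule, a larger force variation (the factors of 2 or 3 mentioned above) raises the required number of samples in proportion.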

Real world comparisons

What is then the true accuracy of a mechanical power meter in a real-world situation? The above arguments suggest that there are at least two limitations (at the 5-10% level) in the measurement of cycling power with a mechanical power meter due to the variable nature of the applied force and of the cadence. On top of these one needs to add uncertainties in the calibration of the strain gauges, the presence of additional stresses (e.g., caused by chain stretching in crank-based devices), losses of power in the drive train (which can be as large as 3%), and environmental effects related to temperature changes and usage deformations of the power meter.

Unfortunately, it is very difficult to predict how large the combination of these effects is for each device and rather impractical to measure it by putting a (very large and heavy) calibrated rig on a bicycle during a ride in the countryside. The next best thing is to attach two different power meters to the same bicycle and compare their output in detail. Even though neither of the two power meters might be precise or accurate, it is extremely unlikely that both will be affected by the same errors at the same time.
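As a rough idea of what such a comparison looks like in practice, here is a minimal Python sketch. It assumes two time-aligned lists of power samples, averages each over fixed-length segments, and reports the percentage difference per segment. The sample values, the segment length, the 5% threshold, and the choice to express differences relative to the second meter are all illustrative assumptions, not data from any of the rides discussed here.

```python
def segment_means(samples, segment_len):
    """Mean of each consecutive, non-overlapping block of `segment_len` samples."""
    return [sum(samples[i:i + segment_len]) / segment_len
            for i in range(0, len(samples) - segment_len + 1, segment_len)]


def percent_differences(a, b):
    """Per-segment difference of `a` relative to `b`, in percent."""
    return [100.0 * (x - y) / y for x, y in zip(a, b)]


def fraction_exceeding(diffs, threshold_percent):
    """Fraction of segments whose absolute difference exceeds the threshold."""
    return sum(abs(d) > threshold_percent for d in diffs) / len(diffs)


# Made-up, time-aligned power samples (W) from two meters on one bike.
meter_a = [210, 205, 190, 260, 150, 155]
meter_b = [200, 195, 200, 220, 160, 150]

diffs = percent_differences(segment_means(meter_a, 2),
                            segment_means(meter_b, 2))
print([round(d, 1) for d in diffs])
print("fraction of segments differing by more than 5%:",
      round(fraction_exceeding(diffs, 5.0), 2))
```

The overall offset between the two averages and the spread of the per-segment differences are separate pieces of information, which is why it is worth looking at both.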

There have been several studies comparing two commonly used power meters on the same bicycle (SRM and PowerTap), both in the scientific literature (see, e.g., one in 2001 and one in 2005) and at a more informal level (see, e.g., the Rosetta Stone Files), as well as more daring tests cross-comparing three power meters (such as Boyd Johnson’s comparison).

The Figure below compares the power recorded by SRM to that recorded by PowerTap during the Rosetta Stone ride (see this link for the details of the ride). There is an overall 7% difference in the average power, which is comparable to the number reported in the detailed study published in 2001. However, a closer look at the two power curves reveals that PowerTap does not simply report 7% lower values of power during the whole ride. Instead, the two power measurements oscillate around each other, especially during segments of significant change in power output.

[Figure: STP PowerTap comparison]

The following Figure shows the distribution of the power differences in each segment, for the entire ride. As expected, the distribution peaks at +7%, which is the difference between the two averages. However, the distribution is significantly broader, with 1/3 of the segments having differences larger than 15% in the reported power. Is this surprising? Not at all, given our discussion above. It only takes 2-3 sources of uncertainty, each contributing about 7%, to reach an overall level of accuracy of 15%. And to put it in more useful numbers, if you average 150 Watts in a ride, do not bother looking at ups and downs in your power curve that are smaller than about 20 Watts; they are just noise.

[Figure: distribution of SRM and PowerTap power differences]
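The arithmetic behind these last numbers can be sketched in a few lines of Python. The text does not say how the individual uncertainties are combined, so the sketch shows two common conventions side by side, a straight sum and a sum in quadrature (appropriate for independent errors); both are assumptions made here for illustration, as are the function names.

```python
import math


def combine_linear(errors_percent):
    """Worst case: the errors simply add up."""
    return sum(errors_percent)


def combine_quadrature(errors_percent):
    """Independent errors: add the squares and take the square root."""
    return math.sqrt(sum(e * e for e in errors_percent))


def noise_floor_watts(average_power_w, accuracy_percent):
    """Power changes smaller than this are within the overall uncertainty."""
    return average_power_w * accuracy_percent / 100.0


for n_sources in (2, 3):
    errors = [7.0] * n_sources
    print(f"{n_sources} sources of 7%: "
          f"linear {combine_linear(errors):.0f}%, "
          f"quadrature {combine_quadrature(errors):.1f}%")

print("15% of 150 W:", noise_floor_watts(150, 15), "W")
```

Under either convention, a handful of 7% effects lands in the 10-20% range, which is the same ballpark as the 15% figure quoted above.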


The punch line? Mechanical power meters have come a long way toward providing useful results for cycling training and racing. However, if you find yourself caring about 20-30 Watt changes during your rides, then (to repeat a wonderful quote from DC Rainmaker), please send me the money you would have spent on a power meter and I will happily send you a regular supply of random numbers to use instead.

