Testing and Profiling Athletes: Recommendations for Test Selection, Implementation, and Maximizing Information
Jonathon Weakley, Georgia Black, Shaun McLaren, Sean Scantlebury, Timothy J. Suchomel, Eric McMahon, David Watts, and Dale B. Read
1School of Behavioural and Health Sciences, Australian Catholic University, Brisbane, Queensland, Australia; 2Sports Performance, Recovery, Injury and New Technologies (SPRINT) Research Centre, Australian Catholic University, Brisbane, QLD, Australia; 3Carnegie Applied Rugby Research (CARR) Centre, Carnegie School of Sport, Leeds Beckett University, Leeds, United Kingdom; 4Queensland Firebirds, Nissan Arena, Brisbane, Queensland, Australia; 5Newcastle Falcons Rugby Club, Newcastle Upon Tyne, United Kingdom; 6England Performance Unit, Rugby Football League, Leeds, United Kingdom; 7Department of Human Movement Sciences, Carroll University, Waukesha, Wisconsin; 8National Strength and Conditioning Association, Colorado Springs, Colorado; 9Queensland Academy of Sport, Brisbane, Queensland, Australia; and 10Department of Sport and Exercise Sciences, Institute of Sport, Manchester Metropolitan University, Manchester, United Kingdom

Address correspondence to Jonathon Weakley, [email protected].

ABSTRACT
Understanding the physical qualities of athletes can lead to improved training prescription, monitoring, and ranking. Consequently, testing and profiling athletes is an important aspect of strength and conditioning. However, results can often be difficult to interpret because of the wide range of available tests and outcome variables, the diverse forms of technology used, and the varying levels of standardization implemented. Furthermore, physical qualities can easily be misrepresented without careful consideration if fundamental scientific principles are not followed. This review discusses how to develop impactful testing batteries so that practitioners can maximize their understanding of athletic development while helping to monitor changes in performance to better individualize and support training. It also provides recommendations on the selection of tests and their outcome measures; considerations for the proper interpretation, setup, and standardization of testing protocols; methods to maximize testing information; and techniques to enhance visualization and interpretation.

KEY WORDS: physical qualities; monitoring; S&C; technology; coaching; strength and conditioning

INTRODUCTION
The testing and profiling of athletes are essential for strength and conditioning coaches. Data from carefully constructed testing batteries can ensure a competitive edge over the opposition by providing information to better guide training prescription and monitor changes in performance (58). Furthermore, information gleaned from testing can be used to identify talent and help justify the selection of athletes (16,36,72,81). However, testing can also be misused, resulting in physical qualities being misunderstood or misrepresented (34,51). Therefore, if information is being gathered to help guide the decisions of coaches, it is important to ensure that the most accurate and impactful information is being collected and presented. This is particularly important for teams or sporting organizations investing significant time and resources into an athlete.
Considering the importance of testing for coaches and athletes, it is essential to consider why and how the testing is being implemented. While the growing acceptance of sports science and technology has helped to continue the development and innovation within strength and conditioning (85,99), it has also led to extremely large amounts of data often being available (56). This can cause practitioners to be overwhelmed with information (i.e., "paralysis through analysis"), select inappropriate testing methods or outcomes (i.e., the lack of understanding of the test and its underpinning physiological/biomechanical constructs), or cause "testing for testing's sake." Thus, understanding the "why" can support decisions around what information is retained and help determine the purpose, which in turn can help guide the tests that are selected. Furthermore, once the tests have been decided on, "how" testing occurs is essential to establish as this ensures the integrity of the retrieved information. How testing is conducted can make a substantial difference to the outcomes of nearly all tests and encompasses how tests are standardized and implemented, the equipment and variables used, and how the data are handled.

With physical testing being an integral part of strength and conditioning, it is important to acknowledge and detail the key considerations that can ensure effective, efficient, and impactful implementation. This narrative review builds on previous work (45,46) by providing an overview of essential reasoning and justification that can help improve test selection, provide practical and scientific recommendations to ensure accurate and reproducible testing that can maximize the interpretation of physical qualities, and offer suggestions to promote optimal uptake of information. It will also provide examples and real-world evidence to support the interpretation of recommendations.

Figure 1. When deciding on a test, it is important to consider whether you can rank, monitor, and prescribe training for athletes with the collected data. Although 2 of these outcomes may suffice, ideally, a test would have all 3. An example of a commonly used test with all 3 considerations is the 1 repetition maximum (1RM) back squat. Coaches can prescribe with these data (particularly if these are combined with a load-velocity profile), use this information to help rank athletes as strength is an important physical quality across most sports, and monitor changes in strength over time as it has acceptable levels of reliability. 1RM = 1 repetition maximum.
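Figure 1 mentions combining 1RM data with a load-velocity profile for prescription. The sketch below is a minimal illustration of that idea, assuming an approximately linear individual load-velocity relationship; all loads, velocities, and the minimum velocity threshold are hypothetical and would need to be established for the individual athlete and exercise rather than taken from this example.

```python
# Minimal sketch: combining 1RM-style information with an individual
# load-velocity profile. All numbers below are hypothetical.
import numpy as np

# Submaximal back squat sets: load (kg) and mean concentric velocity (m/s)
loads = np.array([60.0, 80.0, 100.0, 120.0])
velocities = np.array([0.95, 0.78, 0.60, 0.43])

# Individual load-velocity relationships are approximately linear, so fit
# load as a function of velocity (least squares).
slope, intercept = np.polyfit(velocities, loads, 1)

# Assumed minimum velocity threshold at 1RM (illustrative only).
v_1rm = 0.30
estimated_1rm = slope * v_1rm + intercept
print(f"Estimated 1RM: {estimated_1rm:.1f} kg")

# Session loads expressed as percentages of the estimated 1RM.
for pct in (0.70, 0.80, 0.90):
    print(f"{pct:.0%} of 1RM -> {estimated_1rm * pct:.1f} kg")
```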
SELECTING TESTS
Testing within strength and conditioning should be simple. Fundamentally, important physiological qualities should be assessed (e.g., speed or strength) and testing protocols should be completed consistently across time. Although it may be tempting to try and make a test seem more "specific" to a sport (e.g., adding a basketball free throw after a change of direction test), by altering a test, the assessment of the underlying physiological quality is often lost, and what is being quantified is no longer clear. Ironically, attempts to make a test more sport specific often undermine the development of an athlete because the test loses construct and ecological validity. Therefore, when testing athletes, the physiological quality must be identified (e.g., maximum strength or aerobic capacity) and practitioners should be comfortable knowing that a single test cannot assess all physiological capacities simultaneously.

The tests that coaches select and implement with their athletes should serve a purpose. Both athletes and coaches have limited time, and the collection of data that is unusable or not maximized can be a waste of time and resources. Therefore, it is important to consider the test's purpose and what can be gleaned from its completion. To help guide practitioners in their selection of tests, it is proposed that when assessing physical qualities, at least 2 (and ideally 3) of the following concepts can be achieved:
- Ranking
- Monitoring
- Prescription

These concepts, which are not listed in rank order of importance, help ensure that there is a purpose behind each assessment and that the test can be used to guide practice. Figure 1 and the explanations below discuss ranking, monitoring, and prescription.

The ability to rank athletes is an important concept that helps guide athlete selection. Ranking refers to the concept that if 2 athletes from the same playing pool are compared, and all other physical qualities and technical/tactical skillsets are equal, the athlete with the greater ability in the tested quality should be ranked higher. It should be noted that the physical quality should be important for sporting performance or have established indirect relationships with performance. For example, a wide receiver in American football needs to have high levels of acceleration and maximum speed (63). Therefore, if 2 players were to be compared and all other physical qualities and technical/tactical skillsets were equal, the player with the greatest acceleration and maximum speed should be preferentially ranked as this would promote greater performance outcomes. Alternatively, if 2 rugby league players were to be compared, one who had high levels of lower-body strength and another who had low levels of strength, it could be argued that the stronger athlete should be ranked higher than the weaker athlete. Although rugby league is a complex sport and the relationship between improvements in lower-body strength and on-field performance is difficult to directly ascertain, greater strength is likely essential for helping mitigate the effects of collisions, support fundamental skills (e.g., wrestling within rucks), and support recovery postmatch (18,35,82). Consequently, selecting tests that accurately measure fundamental and important qualities can be used to guide the ranking of athletes.
Grounded in the concepts of reliability and sensitivity, selecting tests that allow practitioners to accurately monitor whether improvement has occurred is essential for longitudinal tracking. Ideally, the test should be reliable so that there are small amounts of noise (i.e., variability in performance) and sensitive enough to measure when an improvement in the physical quality has occurred. In tests that have a range of outcome measures (e.g., the countermovement jump), the use of highly variable metrics, such as rate of force development (26), is not recommended as these make monitoring changes extremely difficult. It is acknowledged that an outcome variable can be theoretically interesting but, because of the variability associated with the measure, difficult to monitor.

Monitoring performance of a test should also be placed within the context of an athlete's entire physical development. Tests can be confounded by a range of variables that, if not accounted for, may shroud the true change in an athlete's performance. For example, athlete sprint times may not seem to improve over a collegiate career. However, when body mass is accounted for, it is clear that substantial improvements in momentum could have occurred (42). For collision sports, this is naturally a great advantage. Similarly, increases in body mass may mask improvements in aerobic capacity as athletes develop. However, increased body mass alongside maintained performance in aerobic field tests can indicate greater running economy and improved high-intensity running ability (11). Similar statements can be made for commonly implemented tests, such as the countermovement jump and corresponding kinetic variables (e.g., force), which can be strongly influenced by changes in body mass. Consequently, practitioners must carefully scrutinize their data beyond absolute values and understand the interaction of other physical qualities on performance. This can not only provide an improved understanding of physical changes for practitioners but also reassure and educate athletes who have not seen the results they desire from a test.
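To make the momentum example concrete, the sketch below approximates sprint momentum as body mass multiplied by average velocity over a 10-m split. The body masses and split times are hypothetical, but they show how a nearly unchanged sprint time can still reflect a substantial improvement in momentum once body mass is considered.

```python
# Minimal sketch: interpreting sprint times alongside body mass via momentum.
# Times and body masses are hypothetical; momentum is approximated here as
# body mass x average velocity over the 10-m split.

def sprint_momentum(body_mass_kg: float, split_time_s: float, distance_m: float = 10.0) -> float:
    """Approximate momentum (kg*m/s) using average velocity over the split."""
    avg_velocity = distance_m / split_time_s
    return body_mass_kg * avg_velocity

# Same athlete at the start and end of a collegiate career (illustrative values).
year_1 = sprint_momentum(body_mass_kg=85.0, split_time_s=1.80)   # ~472 kg*m/s
year_4 = sprint_momentum(body_mass_kg=95.0, split_time_s=1.81)   # ~525 kg*m/s

print(f"Year 1 momentum: {year_1:.0f} kg*m/s")
print(f"Year 4 momentum: {year_4:.0f} kg*m/s")
print(f"Change: {100 * (year_4 - year_1) / year_1:.1f}%")
```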
Using testing information to guide training prescription should be a primary consideration for the strength and conditioning practitioner. The ability to test athletes, identify their strengths and weaknesses, and also improve their training is essential, and tests have varying levels of application. For example, the 30–15 intermittent fitness test (30–15 IFT) has greater application than a Yo-Yo intermittent recovery test because several programming tools have been developed and validated to guide prescription from this test (4). Alternatively, tests of maximal dynamic strength (e.g., 1-3RM back squat) have greater prescriptive utility than an isometric assessment (e.g., isometric midthigh pull [IMTP]) because strength coaches can prescribe loads as a percentage of maximum capacity using this information. Considering this, if faced with the need to assess a capacity, coaches should strategically select tests that allow for improved training prescription to help individualize and maximize the subsequent training block.
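As a minimal sketch of how a 30–15 IFT result might feed prescription (this is an illustration under assumed values, not the validated programming tools cited above), one common approach is to anchor interval running speeds to a percentage of the final test velocity (VIFT) and convert them into per-repetition distances. The VIFT, intensity, and work duration below are hypothetical.

```python
# Minimal sketch: turning a 30-15 IFT result (final velocity, VIFT) into an
# interval speed and per-repetition distance. All values are illustrative.

v_ift_kmh = 19.5      # athlete's final 30-15 IFT velocity (km/h), hypothetical
intensity = 0.95      # assumed fraction of VIFT targeted for the work interval
work_seconds = 15     # assumed work-interval duration (s)

target_speed_kmh = v_ift_kmh * intensity
target_speed_ms = target_speed_kmh / 3.6
rep_distance_m = target_speed_ms * work_seconds

print(f"Target speed: {target_speed_kmh:.1f} km/h ({target_speed_ms:.2f} m/s)")
print(f"Distance per {work_seconds}-s rep: {rep_distance_m:.0f} m")
```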
VALIDITY, RELIABILITY, AND SENSITIVITY—THE HEART OF ATHLETE TESTING
Fundamental to athlete testing and profiling are the concepts of validity, reliability, and sensitivity. If a test has low validity and/or reliability, the data collected are often a poor reflection of the individual's capacity or not a reflection of that capacity at all. Furthermore, when the sensitivity of a test is poor, interpretation of changes in the test between time points can be extremely difficult. Consequently, when considering whether to use a test, it is important to establish whether a test indeed measures that physical quality. In addition, it is important to quantify the normal variation between assessments (i.e., the repeatability of the test and its outcomes).

Validity. Validity refers to whether a test indeed measures what it was designed to measure (32). There are several forms of validity, which can be classified based on the accuracy of the outcome measure (i.e., test validity) or how trustworthy the protocols, conclusions, and generalizations are (i.e., methodological validity [often termed experimental validity in a research setting]; Figure 2). In Table 1, we detail the different forms of validity and how they relate to testing physical qualities.

Figure 2. Types of validity and how they interact with each other.

Table 1. Overview and explanations of the different types of validity

Test validity
- Translational: The extent to which a test outcome is a good reflection of what it intends to represent. A test can be considered to have good translational validity if it possesses adequate face and/or content validity.
- Face: A subtype of translational validity and sometimes referred to as logical validity. What a test superficially seems to measure, regardless of what it actually measures. A vertical CMJ might have limited face validity for inferring an athlete's upper-body power. Note, this does not mean it is not a useful test for an intended purpose, or that it may not be correlated with upper-body power.
- Content: A subtype of translational validity. The extent to which the content of a test matches and measures all elements of a given construct. A questionnaire intending to assess perceived recovery may need to contain several items that cover different domains of recovery (e.g., physical or mental). These items should be determined by consensus from subject matter experts (48).
- Construct: The test's ability to accurately represent the underlying construct. Tests such as the 30–15 IFT and the YYIRT (1 and 2) share strong associations and are believed to represent high-intensity intermittent running capacity—a performance construct underpinned by several physiological qualities (e.g., aerobic, anaerobic, and neuromuscular) (70).
- Convergent: A subtype of construct validity. The extent to which 2 tests that should seem reflective of a similar construct are indeed related. A large correlation has been evidenced between a standardized running test that is purported to measure "leg stiffness" and a repeated hopping test that is claimed to assess "leg stiffness" (41).
- Discriminant: A subtype of construct validity. The extent to which test outcomes or groups tested on an outcome that should not be related are indeed unrelated. Isometric midthigh pull peak force is able to distinguish between amateur and professional rugby players. Discriminant validity is evident because these 2 groups can be expected to differ in their maximal strength due to training status and playing standard (known group difference) (13).
- Criterion: The strength of an association between the scores from an alternative test and the scores from a criterion measure. There is a strong, positive association between velocity calculated from 3D motion capture systems and linear position transducers during resistance training exercises (90).
- Predictive: A subtype of criterion validity. How accurately an alternative test can predict future behavior of a criterion measure or performance indicator. A test of maximal dynamic strength (e.g., a maximal back squat) may accurately predict performance in weightlifting (e.g., competition snatch 1RM) (73).
- Concurrent: A subtype of criterion validity. The strength of association and agreement between 2 different assessments measured at the same time. The concurrent validity of 10-m sprint time (timing gates vs motion capture) would be substantially reduced if the athlete was to start 50 cm behind the starting gates rather than directly behind (e.g., 1 cm) the starting gates (89).

Methodological validity
- Internal: The degree of control taken to account for potential confounding variables that can influence a test outcome. A field-based test of maximal aerobic speed may be internally valid if all assessments are performed on the same surface, at the same time of day, and in comparable weather conditions (e.g., wind, heat).
- External: The extent to which test results can be generalized to other athletes, places, or time points. The influence of growth and maturation on physical test performance is generalizable across both sexes and sports. Athletes who mature early typically have a physical advantage over their less-mature counterparts (80).
- Ecological: A subtype of external validity. How well a test relates to actual sporting performance and matches the athlete's real sporting context. Using a running-based assessment of aerobic capacity for sports that involve running would help strengthen the ecological validity of the test.
- Conclusion: Sometimes referred to as statistical conclusion validity. The extent to which conclusions about relationships or effects are accurate, credible, or believable, as far as statistical issues are concerned. Concluding that athletes with a higher weekly internal training load had greater improvements in preseason fitness (based on a positive correlation between the 2) may be inaccurate if baseline fitness (a confounding variable) was not accounted for.
All types of validity are important when assessing an athlete's physical qualities, and evidence for validity in several of its subdomains is often necessary. With the growing uptake of sports technology for the monitoring of athletes, it is important to establish whether the equipment being used provides an accurate reflection compared with a standard measure (i.e., criterion validity) (86,87). Recent reviews of global and local positioning systems (10) and commonly used resistance training monitoring devices (e.g., linear transducers or accelerometers) (90) have highlighted several concerns and considerations with these forms of technology. Specifically, these reviews highlight the importance of comparing devices to a "gold-standard" criterion. This is important because if the criterion does not accurately reflect a measure, then the device that it is being compared with can have a misleading amount of error (either increased or decreased). Furthermore, it is essential to establish the accuracy of different outcome measures that are reported from technology. For example, when measuring back squat performance, mean and peak barbell velocity can both be assessed. However, a single device can report very different levels of accuracy dependent on which outcome measure is used (9,87,91).
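As a minimal sketch of the kind of device-versus-criterion comparison described above (the paired velocities are hypothetical), the example below computes the mean bias, the spread of the differences, and the correlation between a practice device and a criterion measure. A complete validity analysis would typically go further, for example examining limits of agreement across the velocity range.

```python
# Minimal sketch: comparing a practice device against a criterion measure.
# Paired mean-velocity values (m/s) are hypothetical.
import numpy as np

criterion = np.array([0.52, 0.61, 0.70, 0.78, 0.85, 0.93])  # e.g., 3D motion capture
device = np.array([0.55, 0.63, 0.74, 0.80, 0.90, 0.97])     # e.g., linear transducer

bias = np.mean(device - criterion)               # mean (systematic) bias
diff_sd = np.std(device - criterion, ddof=1)     # spread of the differences
r = np.corrcoef(criterion, device)[0, 1]         # association only, not agreement

print(f"Mean bias: {bias:+.3f} m/s")
print(f"SD of differences: {diff_sd:.3f} m/s")
print(f"Pearson r: {r:.3f}")
```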
Threats to validity can occur not only from technology but also from the test instructions and protocols used. For example, when assessing accelerative ability with a 10-m sprint, starting an athlete 50 cm behind a timing gate or triggering timing using a front foot trigger (as is commonly conducted within practice and throughout the scientific literature (12,92,93)) substantially reduces the concurrent (criterion) validity because these methods routinely miss approximately 20–50% of the athlete's acceleration phase (89). In this instance, the criterion validity of the timing gates is not changed (i.e., the timing system is accurate), but modifications to the starting method have substantially altered the outcome. In a situation such as this, criterion validity becomes the victim, but the issue stems from internal validity (i.e., the test design does not allow a true reflection of the observed results).

Conversely, on the opposite end of the methodological validity spectrum, ecological validity refers to how well a test relates to actual athlete performance and whether it can be applied to real-life settings. For instance, asking field hockey athletes to complete a cycling time trial to establish V̇O2max has limited ecological validity. Alternatively, a field-based running assessment (e.g., 30–15 IFT) may be more appropriate. This example additionally highlights the consideration for construct validity. Coaches may use tests such as a laboratory-based V̇O2max assessment or the 30–15 IFT to assess cardiovascular "fitness." The former achieves this through direct measurement of aerobic capacity, whereas the latter is a construct within itself (high-intensity intermittent running ability) that comprises aerobic capacity, as well as other physical qualities such as anaerobic and neuromuscular qualities. Therefore, it is important for practitioners to understand which physical constructs are being assessed and the extent to which the tests used are an accurate representation of the definitions of that construct.

Reliability. Reliability refers to the degree of repeatability, reproducibility, or consistency in a measure (49,102). A test outcome can be reliable even if it is not valid (Figure 3), but if it is not reliable, then it cannot be valid. To be able to assess changes in performance, the reliability of the test needs to be established (test-retest reliability). If a test cannot be reliably reproduced, coaches cannot confidently state whether an athlete has truly improved in a test. As with internal validity, a range of factors can influence reliability, and these factors are often unique to a test or a specific outcome measure. For example, jump height during the countermovement jump could be influenced by the instructions provided to the athlete (37,62), the method of calculation (e.g., flight time versus impulse-momentum relationship versus take-off velocity) (55), or the technology used (60). Alternatively, for anthropometry and body composition, food or fluid consumption could alter outcomes and should be standardized across days (59). Consequently, to make accurate inferences about changes in performance, coaches should quantify the reliability of each test and outcome measure with their cohort of athletes or have strong grounds to justify the reliability from a similar cohort in the literature (8,65,68). Recommendations for enhancing test reliability and reducing measurement error are supplied in Table 2.

Figure 3. Visual representation of validity and reliability.

Table 2. Recommendations and considerations for improving test reliability in sports science
Recommendations for the improvement of testing reliability and outcomes:
- All equipment and recording forms are available and prepared.
- All equipment is calibrated and operating correctly.
- Testing conditions are similar. This includes environmental conditions and surfaces (e.g., running track).
- Time of day is consistent.
- All pretest protocols (e.g., warm-ups) have been specified and implemented consistently.
- Testing order is consistent.
- Testers are familiar and competent with all testing protocols.
- Athletes are in good health, have had sufficient rest before testing, and are injury free.
- Athletes are dressed appropriately (e.g., light and nonrestrictive clothing) and consistently (e.g., running spikes are not used on only 1 sprint occasion).
- Athletes are familiar with testing protocols.
- Similar levels of encouragement are provided on each occasion.
- Ensure that biological and technological error are established and appropriately attributed to the correct source.
- Normal dietary intake is consumed, with alcohol and caffeine consumption limited before testing.

Based on Woolford et al. (102).
For tests of physical performance or capacity, it is recommended that the reliability of a test is established across the period that data will be routinely collected and interpreted (i.e., between-day reliability). Typically, longer periods between test-retest assessments result in less reliable outcomes. This has implications for tests of a more exhaustive nature, such as those assessing maximal high-intensity intermittent running ability, which are typically performed ≥6 weeks apart (50). Furthermore, it is important to test in a standardized state (e.g., 48 hours of rest before the test) and when physical performance/capacity would not be believed to have changed (e.g., not immediately after strenuous exercise). If human error can be introduced through assessment (e.g., skinfold measurements for estimates of body composition), intrarater and interrater reliability should be quantified and, if possible, minimized. To reduce this variability and improve measurement reliability, all assessors should be adequately trained (e.g., through the International Society for the Advancement of Kinanthropometry for body composition measurements), and changes in assessor between premeasurements and postmeasurements should be avoided if possible. Finally, environmental conditions (e.g., temperature, wind, and testing surface) should be standardized to enhance the reliability of physical performance testing. Naturally, this can be difficult when testing outdoors. Therefore, practitioners should carefully consider where and when testing occurs.

Figure 4. Visual example of how "test error" can influence the interpretation of a performance score. Bars are the CV (the standard error of measurement as a percentage, shown hypothetically as 2%, 5%, and 10%). CV = coefficient of variation.

Figure 5. Annotated example of change in 10-m sprint performance in a single athlete (youth soccer player) across 12 weeks. The top figure illustrates the raw times (s) presented with the SEM, which is approximately 1.6% (24). The bottom figure demonstrates the corresponding test change score relative to the first testing occasion, presented with the adjusted SEM. The shaded region depicts the SWC for 10-m sprint time in soccer players, which is said to be around 2% (24). A difference of 2% in 10-m time would allow a player to be ahead of an opponent over this distance in a one-on-one duel to win the ball (25). SEM = standard error of measurement; SWC = smallest worthwhile change.
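The interpretation illustrated in Figure 5 can be reduced to a simple decision rule: compare the observed change with the test's typical error (SEM) and the smallest worthwhile change (SWC). The sketch below uses the approximate values cited in the caption (SEM of about 1.6% and SWC of about 2%) with hypothetical baseline and follow-up times; more formal approaches estimate the probability that the true change exceeds the SWC rather than applying hard thresholds.

```python
# Minimal sketch of the interpretation shown in Figure 5: compare an observed
# change in 10-m sprint time against the test's SEM and the SWC.
# Baseline and follow-up times are hypothetical; the SEM (~1.6%) and SWC (~2%)
# follow the approximate values cited in the figure caption.

baseline_s = 1.85
follow_up_s = 1.80

change_pct = 100 * (follow_up_s - baseline_s) / baseline_s  # negative = faster
sem_pct = 1.6
swc_pct = 2.0

print(f"Observed change: {change_pct:+.1f}%")

if abs(change_pct) <= sem_pct:
    print("Change is within the test's typical error -> interpret cautiously.")
elif abs(change_pct) < swc_pct:
    print("Change exceeds typical error but not the SWC -> possibly trivial.")
else:
    print("Change exceeds both the typical error and the SWC -> likely meaningful.")
```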
Approximate between-day coefficients of variation (CVs; the SEM expressed as a percentage of the mean) for commonly used physical capacity tests are provided in Table 3. The CV or SEM can be used to assess test sensitivity, such as tracking changes within an individual. Full details on reliability

Table 3. Approximate between-day coefficient of variation (%) for commonly used measures of physical capacity
Test: approximate CV (%), with reference in parentheses.

Strength measures
- Back squat: 2.0 (2)
- Front squat: 2.5 (96)
- Bench press: 2.0 (96)
- Chin up: 3.5 (96)
- Prone bench pull: 2.5 (19)
- Isometric midthigh pull peak force: 3.5 (53,78)

Jump and jump-related variables
- CMJ height: 3.5 (8,65,68)
- CMJ concentric peak power: 3 (8,65)
- CMJ concentric mean power: 4 (8,65)
- CMJ concentric peak force: 3 (8,65)
- CMJ concentric mean force: 2 (8,65)
- CMJ concentric impulse: 2 (52)
- Squat jump height: 5 (43,61)
- Dynamic strength index (SJ:IMTP PF): 5 (78)
- Reactive strength index (FT:GCT): 4 (6)

Sprint times
- 10-m sprint: 3 (12,68)
- 20-m sprint: 2 (12,68)
- 30-m sprint: 2 (12,68)
- 40-m sprint: 2 (12,68)

Field endurance assessments
- YoYo IR1: 10 (1,15,38,76)
- YoYo IR2: 10 (1,39,76)
- 30–15 IFT (VIFT): 2 (15,77)
- 2-km time trial: 2 (23)

All values are approximate due to differences between populations and technology used during data collection. SJ = squat jump; IMTP = isometric midthigh pull; PF = peak force; FT = flight time; GCT = ground contact time; IR = intermittent recovery; IFT = intermittent fitness test; VIFT = final running velocity at the completion of the 30–15 intermittent fitness test.
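To show how a between-day CV of the kind reported in Table 3 can be estimated, the sketch below applies one common approach: the typical error (SEM) is taken as the standard deviation of the between-day differences divided by the square root of 2 and then expressed as a percentage of the mean, consistent with the definition given above. The jump heights are hypothetical.

```python
# Minimal sketch: estimating a between-day CV (SEM as a percentage of the
# mean) from two testing days. Jump heights (cm) are hypothetical, and this
# uses one common approach (typical error = SD of the between-day
# differences divided by sqrt(2)).
import numpy as np

day_1 = np.array([38.2, 41.5, 35.0, 44.1, 39.8, 36.6])  # CMJ height (cm), day 1
day_2 = np.array([37.5, 42.3, 35.9, 43.2, 40.6, 37.4])  # same athletes, day 2

differences = day_2 - day_1
sem = np.std(differences, ddof=1) / np.sqrt(2)           # typical error
grand_mean = np.mean(np.concatenate([day_1, day_2]))
cv_pct = 100 * sem / grand_mean

print(f"SEM (typical error): {sem:.2f} cm")
print(f"Between-day CV: {cv_pct:.1f}%")
```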