FST Research

9.6 Field Validation Studies

9.6.1 1977 Study[1] Overview

In 1977, NHTSA undertook a study:

(1) To evaluate currently used physical coordination tests to determine their relationship to intoxication and driving impairment,

(2) To develop more sensitive tests that would provide more reliable evidence of impairment, and

(3) To standardize the tests and observation[2].

Tests Considered in the Study

The researchers initially evaluated the following tests:

“Alcohol Gaze Nystagmus (AGN)

The jerking movement of the eye, which is known as Alcohol Gaze Nystagmus, occurs upon lateral gaze when BAC exceeds a critical level (≈.06%). The eye jerks in the direction of gaze, independent of head position.

Person is asked to cover one eye and follow movement of a small light or object with other eye without changing head position. Light is moved slowly to points requiring 30° and 40° lateral deviation of the gaze. Test is then repeated with the other eye. Eye is observed for jerking movement.

Walk and Turn, Heel-Toe

Person is instructed to walk straight line, touching heel to toe each step for nine steps, then turn and return along same line in the same manner. Demonstration is given.

Romberg (Balance)

Person is instructed to stand with feet together, head tipped back, eyes closed, arms at side. Position is demonstrated. Observe anterior-posterior sway, 45 sec. trial.


Finger to Nose

Person stands erect with eyes closed, arms extended horizontally. Instructions are to touch nose with index finger, alternating right and left hands as instructed. Demonstration is given.

One-Leg Stand

Person is instructed to stand with one leg held straight, slightly elevated off floor, forward, for 30 sec. trial. Eyes remain open.

Finger Count

Person is instructed to touch and count each finger in succession, counting aloud. Demonstrate, “Watch what I do. 1-2-3-4-5-5-4-3-2-1.”

Tongue Twisters

Person is asked to repeat such words as “methodist, episcopal, sophisticated statistics.”

Subtraction, Addition, Count Backwards

Person is instructed to subtract 3, beginning for example at 102, continuing to some specified number (or add continuously). Same general instructions are given for counting backwards.

Tapping Rate

Person is instructed to tap a telegraph key as rapidly as possible. Number of taps are recorded by electronic counter during 10 sec. trial.

Letter Cancellation

Person is asked to cancel all of a given letter in a paragraph of text during 30 sec. trial.


Tracing

Person is asked to trace paper pathway (maze). Three 20 sec. trials are given.

Grip Strength

Person is instructed to squeeze as hard as possible a dynamometer of the type shaped like a pistol grip with grooves for each finger. This instrument measures force exerted in isometric contraction.

Coin Pick-Up

Three coins (or chips, matches) are placed on floor. Person is instructed to stand in one location and to pick up the coins one at a time, handing them to the examiner. Demonstration is given.

Two-Point Tactile Discrimination

Person is given 2-point tactile stimulation (forearm or back of hand, eyes closed) beginning with no separation of the two points, and is asked “How many places am I touching your arm?” Trials are repeated with increasing separation. Response measure is the first separation to which person responds “two.”

Color Naming (Attention Diagnostic Method, modified)

Card presents number 10-59, in random order, in 4 colors by row. Person is instructed to find sequence of 10 numbers, beginning with some designated number, and to report the color of each. Verbal response, for example, might be, “Ten-blue, eleven-white, twelve-yellow, thirteen-red, etc. . .” Response measure is the time to report the colors of ten numbers.

Serial Performance

The device for this test consists of a small box. Five toggle switches and a small bulb are mounted on the face of the box. The box is presented to the subject with all switches in the center position. Subject is told to move the switches and that when they are in the correct sequence of up-down positions, the red light will come on”[3].

Final Battery of Six Field Sobriety Tests

The test battery was narrowed down to six tests: HGN, Walk and Turn, One-Leg Stand, Finger to Nose, Finger Count, and Tracing.

It is important to note that all of the tests were administered in a laboratory setting. NHTSA even recognized that a consistent setting would improve the results.

“The most common practice is to test a DWI suspect at roadside, but it also is possible to delay all tests until the person has been transported to the station. There is considerable advantage to always giving tests in the same environment.”[4]

“At BACs ≥.10% the officers correctly decided to arrest 84% of the cases, and for BACs <.10% they made the correct decision to release 73% of the time. However, note that the officers indicated they would have arrested 101 persons, 47 of whom had BACs below .10%. Obviously, an error rate of 47% in making arrests is not acceptable. Actually, officers in the field are reluctant to err in the direction of false alarms, and observations indicate that the most common error probably is a false negative. In the laboratory where the same consequences do not ensue from false alarm decisions to arrest, there was a tendency to be less conservative and to lower the criterion for arrest.

There is a fundamental problem for the officers, stemming from the fact that BAC is a continuously distributed measure. As with any such distribution there is a limit on the related decision process, because the human organism can discriminate accurately only a limited number of points on such a scale. Since .10% is an arbitrary level which does not coincide with the onset of impairment, the difficulty of the task of categorizing DWI suspects is increased”[5].
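The 47% figure quoted above follows directly from the two counts in the passage. A minimal sketch of that arithmetic in Python; the variable names are illustrative and are not taken from the study:

```python
# Counts quoted from the 1977 study passage above.
arrest_decisions = 101      # persons the officers said they would have arrested
arrests_below_limit = 47    # of those, persons whose BAC was below .10%

# Share of arrest decisions that were false alarms, and the remainder.
false_arrest_share = arrests_below_limit / arrest_decisions
correct_arrest_share = 1 - false_arrest_share

print(f"False arrests:   {false_arrest_share:.1%}")
print(f"Correct arrests: {correct_arrest_share:.1%}")
```

Note that this is a different measure from the 84% and 73% figures in the same passage. Those are computed over everyone above or below .10%, whereas the 47% is computed only over the persons the officers decided to arrest.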

Again, the tests are designed TO MEASURE BAC, NOT IMPAIRMENT. “If the officer was required simply to decide whether or not a driver showed impairment, or if the criterion BAC was closer to the point where impairment initially is apparent, there would be fewer decision errors at roadside”[6].

The results for the six tests were broken down to show the percentage of correct decisions when only that test was used.


Columns: % Overall, % < .10, % >= .10

Rows: Walk and Turn; Finger Count; One-Leg Stand; Nystagmus (left, right, total)

NHTSA recognized “as apparent in the false alarms, decision errors occur most often with middle range levels of intoxication. Quite simply, there are no behavioral cues which differentiate infallibly in a ± .02% BAC margin”[7].

The results showed “the nystagmus measure is superior to any other single test and compares favorably to a long battery. (Note: the differences between left and right eye seem to be due primarily to vision problems. e.g., restricted vision in one eye due to brain injury, one artificial eye, etc.)”[8]

“Nystagmus (was) the best single index of intoxication. It is particularly valuable because it is an involuntary response. Police officers can readily learn to observe and evaluate the jerking movement. A simple device can be used to control the extent of eye deviation precisely, but the phenomenon also can be induced and observed in any environment without special equipment[9].”

It is important to note that the nystagmus test used in the laboratory is significantly different from the HGN test as presently administered. In the study, “for the Alcohol Gaze Nystagmus measure a simple device was developed by SCRI which utilizes the position of the small light to control the angle of eye deviation. The individual was asked to cover the left eye and to follow with the right eye the movement of the small light as the examiner moved it to 30° and 40° positions on the right. He then was asked to cover the right eye, and the same procedure was followed for the left eye in the left visual field.”[10]

The researchers determined that “the jerking movement of the eye, which is known as Alcohol Gaze Nystagmus, occurs upon lateral gaze when BAC exceeds a critical level (~.06%)”[11].

So, in the study, the researchers worked in a lab with a device to measure angles, checked for nystagmus only at fixed angles (30° and 40°), covered one eye, and determined the test would show a BAC of .06. This is quite different from the present version, which is done on the side of the road without optimum light, without covering an eye, while guessing at 45 degrees, using three different tests for HGN, and concluding that BAC is .04, .08, or .10 depending on which NHTSA study is used.

After concluding the initial research, NHTSA decided to do further testing and came up with the standardized battery of field sobriety tests used today: One Leg Stand, Walk and Turn, and HGN.


9.6.2 1981 Study[12] Overview

“Administration and scoring procedures were standardized for a sobriety test battery consisting of the walk-and-turn test, the one leg stand test, and horizontal gaze nystagmus. The effectiveness of the standardized battery was then evaluated in the laboratory and, to a limited extent, in the field. Ten police officers administered the tests in the laboratory to 297 drinking volunteers with blood alcohol concentrations (BACs) ranging from 0 to 0.18%. The officers were able to classify 81% of these volunteers, on the basis of their test scores, with respect to whether their BACs were above or below 0.10%. Officer estimates of the BACs of people they tested differed by 0.03% on the average from the actual BAC”[13].
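To make the reported 0.03% average difference concrete, the following Python sketch shows the band of actual BACs that sits one average error on either side of an officer's estimate. It assumes, purely for illustration, that the error is symmetric. The function name and that assumption are mine, not the study's, and an average error is not a maximum error.

```python
def one_error_band(estimated_bac: float, avg_error: float = 0.03) -> tuple[float, float]:
    """Return (low, high): the estimate minus and plus one average error."""
    low = round(estimated_bac - avg_error, 2)
    high = round(estimated_bac + avg_error, 2)
    return low, high

# An estimate at the .10% threshold used in the study.
low, high = one_error_band(0.10)
print(f"Estimate .10 -> actual BAC within one average error: {low:.2f} to {high:.2f}")
```

On these assumptions, an estimate of .10 is consistent with an actual BAC anywhere from .07 to .13.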

It is interesting to note the .03 margin of error. If the test indicates a BAC of .10, then with that margin of error the actual BAC is just as likely to be .07. How many cases with a chemical test of .07 are not brought forth by the police, are reduced by the prosecutor, or are won at trial? Yet the State (and usually the judge) thinks the FSTs are the best evidence available, besides a chemical test.

Study Findings

The 1981 study revealed some important limitations of the FSTs, even though they were administered under conditions more ideal than the side of the road. Again, most of the tests were conducted in a laboratory.

Walk and Turn

Walk and Turn Test: “The suspect is asked to assume a heel-to-toe position on a designated line, with his/her arms at the sides, while the remainder of the instructions are given. He or she is then told to make nine heel-to-toe steps on the line, to turn around keeping one foot on the line, and to return in nine heel-to-toe steps. The suspect is requested to watch his/her feet at all times, making sure that every step is heel-to-toe and that the steps are taken in a straight line.[14]” (emphasis added).

“Requesting that people ‘watch their feet’ while performing this test also increases its sensitivity to alcohol, but makes the task difficult for people with monocular vision (i.e., poor depth perception). Performing the walk-and-turn task with the eyes open with enough light to see some frame of reference is essential if sober individuals are to perform the test without difficulty. Finally, we found that the time taken to walk the line and the number of steps taken were relatively unimportant variables in terms of altering the sensitivity of the test to alcohol.”[15]

How often do police ask the driver if the driver suffers from monocular vision? I have never heard an officer testify that is one of the things asked about. If someone has poor depth perception, according to NHTSA, the test is “difficult.” Further, “enough light”, while not defined, certainly could be absent in DWI stops on a side of the road at night where the only light is the police officer’s headlights and/or spotlight or flashlight shining directly behind the person.

“Certain individuals have difficulty with this test when sober, including: people over 65 years of age; people with back, leg, or middle-ear problems; and people with high-heeled shoes (over two inches). We recommend that only the nystagmus test be used with the first four categories of stopees, while people with high-heeled shoes should be asked to remove them[16].”

So, it appears NHTSA originally conceded the test should not even have been offered to those individuals, suggesting it has no reliability. That certainly sounds different than a “totality of the circumstances”, or going to the “weight of the evidence.”

“Standardizing this test for every possible road condition was beyond the scope of this project, so we recommend that the walk-and-turn test be performed on a dry, hard, level, nonslippery surface and under relatively safe conditions. If these requirements cannot be met at roadside, we recommend that the suspect be asked to perform the test elsewhere or that only the nystagmus test be used. The test also requires a line which the police officer can manufacture.”[17] (emphasis added).

If the arresting officer actually administered the FSTs in a laboratory or police station, maybe the results would be a little more reliable.

One Leg Stand

“The suspect is asked to stand with his/her heels together, feet at a slight angle and arms at the sides. He or she is then asked to raise one leg about six inches off the ground (i.e., with both legs kept straight) and to hold that position while counting rapidly from 1001 to 1030. Either leg may be raised.

Generally, few variables alter the sensitivity of the one-leg stand test. The most sensitive variable was time. We found that a suspect at a BAC of 0.10% might easily keep his/her balance for 20-25 seconds, but would likely falter after that time period. Consequently, the officer must ask the stopee to count aloud from 1001 to 1030 in order to estimate the passage of 30 seconds.”[18]

I am not sure how NHTSA determined that counting rapidly from 1001 to 1030 takes 30 seconds (testing it myself, I average around 15-20 seconds). It is interesting that NHTSA emphasizes that time is crucial, yet lets the officer base it upon the person's count instead of a watch (using a watch is the present guideline/standard for this test).

“Two other important variables are that: (1) the suspect must be able to see in order to orient himself or herself; and (2) the police officer must stand back from the suspect in order not to provide an artificial reference frame which could distract the suspect. Generally, if the stopee cannot see or orient with respect to a perpendicular frame of reference, then this test will be difficult to perform even if sober[19].”

What must the suspect be able to see? How often have you heard an officer testify that he stood in a spot so as not to provide an artificial frame of reference?

“Certain individuals will have difficulty performing this test under sober conditions, including: people over 65 years of age; people with leg, back, or middle ear problems; people who are overweight by 50 or more pounds. These individuals should only be given the nystagmus test. Suspects who are wearing over two-inch heels should remove them before performing the test.

The one-leg stand test should be performed only on a hard, dry, level, nonslippery surface under relatively safe conditions. When these requirements are not met at roadside, then the stopee should be asked to perform the test elsewhere or only the nystagmus test should be used[20].”

Alcohol Gaze Nystagmus (Presently HGN)

“We checked for nystagmus in 42 sober individuals, including 27 former alcoholics and 25 staff members. Approximately half of the people tested showed a slight nystagmus in at least one eye when their eyes were deviated maximally. The occurrence of nystagmus in these sober individuals was not related to (1) age, (2) visual acuity, or (3) a history of alcoholism. We did notice that the maximal angle of deviation, measured twice by each of two observers using the device shown in Figure 1 was 3.03 degrees larger in the left eye than in the right eye (t(40) = 5.8, p < .001). This occurred in 28 of the 42 subjects and was not related to handedness. We saw no tendency for nystagmus to occur more often in one eye than the other.[21]”

This explanation shows why it is important that officers presently check for SUSTAINED nystagmus at maximum deviation. Or, you can try asking the officer if he is aware that approximately half of the sober individuals NHTSA tested showed nystagmus when their eyes were deviated maximally.

It is interesting to note the difference between the left and the right eye. Some research, including NHTSA's, says that nystagmus in one eye but not the other can indicate a medical condition.

“Instructions: First, corrective lenses should be removed. The stimulus should be placed above the eyes in order to elevate them and reduce squinting. At night, if the street lighting is inadequate, a penlight must be used as the stimulus or a flashlight is required to illuminate the face. In looking for the onset of nystagmus, we recommend that the stimulus be moved fairly slowly (i.e., at about 10 degrees per second), but not too slowly, otherwise normal oscillation of the eyeball may be mistaken for nystagmus. The suspect should keep his/her head still. The officer’s free hand makes a good chin rest for suspects who persist in moving his/her head. The officer should move the stimulus twice to the left twice to the right, looking at the eye on the side of the head to which he is moving the stimulus. On the first movement, the officer should observe whether or not the onset of the nystagmus occurs before 45 degrees with at least 10% of the conjunctive (i.e., the white of the eye) showing. The 45 degree angle is easy to estimate as it splits the angle connecting the tip of the nose and the center of the ear with the middle of the head. Some individuals cannot deviate their eyes more than 45 degrees, so at least 10% of the white of the eye must show to ascertain that nystagmus is not occurring at the most extreme deviation for that individual.

The second movement in each direction should be faster (about 20 degrees per second) and the observer should note whether or not the suspect can follow smoothly and how distinct the nystagmus is at the maximum lateral deviation. The breakdown of the smooth pursuit and greater amplitude nystagmus at maximum deviation are also good signs of a BAC over 0.10%. Thus, the police officer has three eye signs to look for: (1) onset of nystagmus before 45 degrees; (2) the distinctness of the nystagmus at the maximum lateral deviation; and (3) the breakdown of smooth pursuit eye movements.”[22]
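The stimulus speeds in the instructions above translate into sweep times, which is one way to check whether a roadside administration resembles the procedure the study describes. A rough sketch in Python, assuming a constant speed and a sweep that starts from straight ahead (0 degrees); the function name is illustrative and does not come from the study:

```python
def sweep_seconds(target_degrees: float, degrees_per_second: float) -> float:
    """Time for the stimulus to travel from center to the target angle."""
    return target_degrees / degrees_per_second

# First pass: about 10 degrees per second, watching for onset before 45 degrees.
first_pass = sweep_seconds(45, 10)
# Second, faster pass: about 20 degrees per second, timed to the same 45-degree mark.
second_pass = sweep_seconds(45, 20)

print(f"First pass to 45 degrees:  {first_pass} s")
print(f"Second pass to 45 degrees: {second_pass} s")
```

On these assumptions, a first pass that reaches 45 degrees in much less than about four and a half seconds is moving faster than the speed the study recommends.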

It is again interesting that NHTSA thinks the nystagmus at maximum deviation is important given half of their test subjects not drinking alcohol had it.

How often in determining nystagmus prior to 45 degrees does the officer look for at least 10% conjunctive or testify it was not normal oscillation of the eye? (Try using these big words after you ask the officer where he got his degree in ophthalmology).

“The gaze nystagmus test may not be applicable to individuals wearing contact lenses, since hard contacts may prevent extreme lateral eye movements. About 3% of the population will show early-onset nystagmus, and impaired balance, with no alcohol in their system. This nystagmus could be the result of drugs other than alcohol (e.g., barbiturates or phencyclidine), the result of brain damage, of illness (e.g., Korsakoff’s syndrome), or of unknown etiology.

Since police officers often arrest intoxicated persons after midnight, possible effects of fatigue or circadian rhythms on gaze nystagmus could be significant.”[23]

So, I guess the nystagmus observed in all those DWI arrests that occur at night might be due to fatigue and circadian rhythms.

Other Notes of the Study

In the study, the officers were supposed to fill out their data forms immediately after the arrest. However, “One problem that arose in filling out both data forms was that most deputies waited until the end of their shift to fill out their forms. At this point in time all forms were completed at once from their police logs.

We urged the deputies to fill out the forms immediately, but our urgings did not help as most of them continued to fill out the forms at the end of the shift. We then stressed the importance of filling out forms for suspects given sobriety tests, so that the tests would be properly scored. We doubt that most officers complied with this request except when observers were in the car.”[24]

If it is important to write a police report immediately after an arrest, how reliable are all those police reports written days later?

Finally, the study provided evidence that speeding is not a sign of impairment. While this may be common sense, and is in line with the present NHTSA manual, some officers still insist speeding is a sign of impairment.

“Based upon our police officer estimates of the BAC of the stopees, only 5.1% of the speeders were over 0.10%, which is probably less than the percentage of legally intoxicated drivers on the road.”[25]

9.6.3 1983 Study[1]

The 1983 study was the final one before NHTSA implemented its Standardized Field Sobriety Testing manuals. The research paper was very short compared to the two previous ones.

The objectives of the study were to:

“develop standardized, practical and effective procedures for police officers to use in reaching an arrest/no arrest decision when giving one or more of the three sobriety tests;

test the feasibility of use in operational conditions by police officers; and

secure data to help determine if the tests will discriminate about as well in the field as in the lab.”[2]

“Police officers participating in the field evaluation were requested to administer the sobriety battery tests to all persons they stopped for suspicion of DWI during a three month period. This was done in conjunction with their normal DWI arrest. They were asked to administer and score the sobriety battery tests prior to using a preliminary breath testing (PBT) device. The reason for this ordering was to reduce the possibility that the police officers’ scoring of the sobriety tests might be influenced by the BAC results obtained from the PBT device[3].”

“Efforts were made to secure data for all DWI traffic stops for all tests and to minimize the possibility that knowledge of PBT results would be available to officers before administering or recording battery scores. However, the data were collected in operational situations where the first priority was law enforcement and public protection rather than research data collection. It was not possible for researchers to routinely accompany the patrols and supervise or observe the actual data collection.”[4]

Accordingly, NHTSA cannot confirm that the officers always used the PBT only AFTER giving a BAC estimate.

The percent of drivers that were given all three sobriety tests varied from a low of 70 percent to a high of 88 percent[5].

“The accuracy of the Combined Procedure for all Police Agencies (83 percent) compares favorably with the 80 percent accuracy computed from the laboratory data. Of the misclassifications; 16 percent involved classification of a driver’s BAC as greater than or equal to 0.10% when his/her BAC was less than 0.10%; and 1 percent involved classifying a driver’s BAC as less than 0.10% when his/her BAC was greater than or equal to 0.10%.”[6]

So, even under NHTSA's ideal situation, a false arrest (a false positive) is 16 times more likely than a false negative based upon the results of the FSTs.
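The breakdown of the misclassifications quoted above can be checked with a few lines of Python. The variable names are illustrative; the percentages are the ones reported in the quoted passage.

```python
correct = 83          # percent classified correctly (Combined Procedure)
false_positive = 16   # percent classified >= 0.10% when BAC was < 0.10%
false_negative = 1    # percent classified < 0.10% when BAC was >= 0.10%

# The three outcomes account for every driver classified.
total = correct + false_positive + false_negative

# How many wrongful "over the limit" classifications occur per missed one.
ratio = false_positive / false_negative

print(f"Total accounted for: {total}%")
print(f"False positives per false negative: {ratio:.0f}")
```

The three figures sum to 100 percent, so the 16-to-1 comparison is between shares of all classified drivers, not shares of the errors alone.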

“The data… should NOT be used to draw conclusions about the precise accuracy of using only one given test by itself as opposed to using another one of the three by itself. The main reason is that in most cases, all three tests were given in the same order with gaze nystagmus first. The results of the gaze nystagmus test were then known to the officer and may have had some subtle influence on his expectations and scoring of the next two tests[7].”

“Two major reasons make it necessary to be extremely cautious in analyzing the data collected in this study to draw conclusions about the relative effectiveness of the different techniques that were used. First of all, officers were not randomly assigned to different groups and differences in outcomes may be due to selection and assignment bias. Second, the only effectiveness data available in this study relates to the BAC distributions for subjects who were arrested, and for some others who were given PBTs. There are a number of problems in using these data. We do not know how those given a PBT differ from or are representative of the rest. Perhaps most significant of all, except for North Carolina, all agencies had PBTs available, and in the great majority of the cases, PBT data were available to the officer for a driver before he was arrested. Thus, most arrest decisions were based on PBT data, rather than just test battery data. Given these limitations and constraints, a few additional analyses were done that can be used to help compare and assess the different DWI detection techniques [8].”

It is somewhat surprising, to me at least, that NHTSA is rather forthcoming in this study about how the data may not be very accurate.


Percent in Each BAC Category for Drivers Arrested by Various Procedures[9]


BAC categories (columns): False Positive, 0 - .04%; Difficult To Assess (Depends on Other Data), .05 - .09%; Arrest Supported By BAC Data, .10%+

Procedures (rows):

Normal Procedure Using PBT (D.C. Control)

Sobriety Test Battery and PBT (D.C., MD & Arlington)

Sobriety Test Battery, No PBT (NC); Arrest Indicated by 2 Test Combined Decision Rule Only

Sobriety Test Battery, No PBT (NC); Officer Arrest Only

Normal Procedures, No PBT (NC)

There are a couple of interesting points to be made about this data. The accuracy of just the PBT was 89%. When a decision was made based upon the PBT and the FSTs, the result was nearly identical. When just the FSTs were used, and no PBT, the accuracy dropped to around 84.5%.

The final part of the study may show how inaccurate the Walk and Turn is. When the officer found the person failed the walk and turn, yet passed the HGN, the officer was accurate only 53% of the time. 23% of the time the person’s BAC was under .05, and the remaining 23% of the time it was between .05 and .09. So, the results were just as inaccurate in close cases as they were in low BAC cases[10].
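The three outcomes described above can be tallied to see how the failed walk-and-turn, passed-HGN cases split. A short Python sketch using the percentages as stated in the text; the dictionary labels are mine:

```python
outcomes = {
    "BAC .10%+ (arrest supported)": 53,
    "BAC .05 - .09%": 23,
    "BAC under .05%": 23,
}

# Everything other than the .10%+ category is an arrest the BAC data does not support.
unsupported = sum(
    share for label, share in outcomes.items()
    if not label.startswith("BAC .10%+")
)
accounted_for = sum(outcomes.values())

print(f"Not supported by BAC data: {unsupported}%")
print(f"Total accounted for: {accounted_for}%")
```

The stated figures add up to 99 rather than 100, presumably because of rounding in the reported percentages.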

[1] Field Evaluation of a Behavioral Test Battery for DWI, DOT HS-806-475, Anderson, et al., Sept. 1983. (Hereafter referenced as NHTSA 1983. Please note the study has no page numbers, so my references are to the page number of the .pdf version.)

[2] NHTSA 1983 at 4

[3] Id. at 6

[4] Id. at 7

[5] Id. at 9

[6] Id. at 9

[7] Id. at 9-10

[8] Id. at 10

[9] Id. at 11

[10] Id. at 12

[1] PSYCHOPHYSICAL TESTS FOR DWI ARREST, Burns & Moskowitz, June 1977. Note: Due to the manual not having page numbers, my citations refer to the page of the .pdf version.

[2] Id. at 9

[3] Id. at 14-16

[4] Id. at 14

[5] Id. at 32

[6] Id. at 32

[7] Id. at 32

[8] Reminder, the original HGN (AGN) checked for one eye while the other eye was covered.

[9] Id. at 55

[10] Id. at 20

[11] Id. at 14

[12] DEVELOPMENT AND FIELD TEST OF PSYCHOPHYSICAL TESTS FOR DWI ARREST, Tharp, et al., March 1981. Note: Due to the manual not having page numbers, my citations refer to the page of the .pdf version.

[13] Id. at 2

[14] Id. at 14


[16] Id. at 15

[19] Id. at 14

[20] Id. at 15

[21] Id. at 17

[22] Id. at 19

[23] Id. at 19

[24] Id. at 55

[25] Id. at 78