Re: WOMBAT Validation

Donald Talleur (dtalleur_at_uiuc.edu)
Tue, 24 Feb 1998 13:11:20 -0600


Mr. LaRoche,
My memory may have failed me, but I recall that I was the one who
set up this project for Dr. Lintern as an informal investigation that
Stanley Roscoe was interested in. We had the subjects, so it was appropriate
to do. I am well aware of what the Aviation Research Lab has published, and
it includes nothing about this study. Nor was it reported as a finding from
the study that these subjects were actually involved in.
I never had any discussion with Dr. Lintern (with whom I happened to
work from 1989 until the day he left for Australia) that
indicated that our study had conclusively proven anything it set out to
prove. I can't dispute that there may have been a high correlation value
achieved as you described, but I don't recall having a sufficient number of
subjects to establish this correlation as significant, given the
inter-subject variance among the conditions under which they were training
in the parent study.
You know, I go back to the Predicting Human Performance book
published by you, Roscoe and Corl, and find only one reference to Dr.
Lintern, and that is labeled a personal communication. Given that the book
was published in 1997, I find it hard to believe that we produced something
more substantive and yet it did not find its way into the book. (By the
way, I thought the book was well written and does much to further the
understanding of the field and, in particular, the goals of WOMBAT.)
But this is all academic. I was never discounting the success of
WOMBAT in other tests or its overall usefulness. We simply did not do a
substantive study of the equipment's predictive validity, nor was what we
did do of publishable quality. But then it wasn't meant to be. I was not the
"Doctor" who supervised the study. You are correct in that. I was the guy
Dr. Roscoe sat down and trained on the system when we first got it. I
tracked and monitored the entire process.
The reason I'm skeptical about the correlation data is that we
didn't have an even design of students from the several experimental groups
that we took subjects from. There was no analysis of possible interactions
between the experimental assignment and those who tested on WOMBAT.
Statistically it was a poor design (the best we could offer, however) with
inadequate data to make any generalizations from.
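
To make the sample-size concern concrete, here is a minimal sketch (not
an analysis of the actual data) of how wide the uncertainty around a
correlation of 0.80 is for small groups. The value 0.80 is the figure
quoted in the message below; the subject counts are hypothetical, since
the number of students tested is not stated in this thread. The interval
uses the standard Fisher z approximation, which assumes independent
subjects and roughly bivariate-normal data, so it says nothing about the
unbalanced-groups problem described above.

```python
import math


def correlation_t(r, n):
    """t statistic for testing rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r (Fisher z)."""
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    r = 0.80                        # correlation quoted in the thread
    for n in (6, 10, 20, 40):       # hypothetical subject counts
        low, high = fisher_ci(r, n)
        t = correlation_t(r, n)
        print(f"n={n:2d}  t={t:5.2f}  95% CI for r: {low:.2f} to {high:.2f}")
```

The interval narrows as the subject count grows, which is why the same
observed correlation can be weak evidence in a small sample and strong
evidence in a large one.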

Incidentally, the first tests were not even dual joystick. They
used the single-joystick prototype, which I understand is not what is being
promoted currently. The later test used the dual-joystick system and seemed
to produce "better" data, but I'm just not convinced that we had a good
enough design that we'd be able to replicate those findings.

OK, let's step aside and get something positive out of this. It
seems that the area of primary training prediction (using WOMBAT) is
probably under-investigated. Since it has been historically difficult to
find a good prediction tool (such as the Sternberg or Spartan tests, neither
of which produced exciting results for us), maybe more effort should be
given to this area. WOMBAT has the potential to predict adequately if we can
track people past solo. The proof to me is how they do down the road beyond
solo, i.e., time to private license, etc. But we've got to do it in a Part 61
environment. We have a set program that just about guarantees when the
average student will reach each new level of certification. Those that lag
behind tend to wash out. You can't really call them outliers and treat
them as such because you never get the chance to track them after they leave.
I am open to suggestions on how we might get around this problem.

Thanks for your response, Jean. I was under the impression that no one even
cared about this area of research (other than the guy who originally asked
about it)!

At 01:00 PM 2/24/98 -0500, you wrote:
>
>To:
>Donald Anders Talleur
>Assistant Av. Ed. Specialist/Assistant CFI
>Aviation Research Laboratory/ Pilot Training
>Institute of Aviation
>Willard Airport
>
>
>Dear Mr. Talleur:
>
>In a contribution to this mailing list, you recently wrote:
>
>
> Talleur> I ran WOMBAT (It is capitalized because it stands for
> Talleur> something) to try to predict performance of new pilots. It
> Talleur> failed to predict anything. However, I'm not convinced we
> Talleur> were measuring the right things to correlate to the WOMBAT
> Talleur> scores.
>
> Talleur> In any event, WOMBAT is a game and unless it has changed
> Talleur> dramatically I fail to see how it would predict pilot
> Talleur> performance in an emergency better than testing in a real
> Talleur> simulator. Also, how often do pilots do a dual tracking task
> Talleur> (one with each hand) and in different planes of motion? I
> Talleur> have flown over 30 types of aircraft and a dozen different types
> Talleur> of sims and have yet to find myself doing this type of stuff.
> Talleur> But is this a non-similar part-task training device of some
> Talleur> sort that, although it has little or nothing to do with
> Talleur> flying activities, has good predictive power of how a pilot
> Talleur> may perform? I'm not convinced. For regular line flying you
> Talleur> might be able to collect data to correlate to WOMBAT scores.
> Talleur> But the issue is predicting how pilots will react under real,
> Talleur> life-threatening pressure. I still contend that we'll have a
> Talleur> hard time producing meaningful data on these events that, in
> Talleur> hindsight, can be correlated to WOMBAT scores. Don't think
> Talleur> I'm knocking WOMBAT however. It has been proven to have some
> Talleur> useful applications in aviation. I just don't believe that
> Talleur> this is one of them!
>
> Talleur> Don Talleur
>
>
>You administered WOMBAT tests to students at the University of Illinois
>Institute of Aviation during the Fall of 1993 under the supervision of
>Dr. Gavan Lintern. The purpose of the experiment was to find the
>correlation between the WOMBAT scores and the flight training required
>for ab initio pilots to fly solo. Although the developers of WOMBAT have
>always worked to predict distant success (10, 15, 20 years down the line;
>see David O'Hare's paper in Dec. 97 issue of the journal Human Factors),
>it was interesting to see whether WOMBAT would predict short-term
>training success, something of practical importance to training
>organizations such as yours.
>
>Consequently, for this study, flight instructors were asked to record the
>numbers of landings and flight hours prior to the time they judged their
>students were sufficiently proficient to fly solo safely, although most
>soloed at a later time due to internal presolo experience requirements.
>There was no incentive to take the WOMBAT test other than to participate
>in the study, and it has been our experience that candidates take the
>test more seriously and generally perform better when they have reason to
>believe the scores may affect their acceptance for training or future
>employment.
>
>Despite these limitations, the data you collected were analyzed by Dr.
>Lintern. He obtained some of the strongest correlations ever found in
>this type of study. In fact, WOMBAT scores correlated 0.80 with the
>number of landings to presolo proficiency and 0.78 with the number of
>flight hours. These results were released by Dr. Lintern, have appeared in
>published books and articles, and have been available from Aero
>Innovation. It is incorrect to say, "It failed to predict anything."
>Whether you were aware of Lintern's findings or not, you have no basis
>for such a statement other than your subjective observations.
>
>There is no doubt that WOMBAT has the interface of a game. However, the
>principle behind the test is to recognize the complexity of the future
>flight performances we are trying to predict and design a test with the
>same level of complexity in a culture-free environment. WOMBAT is quite
>complex in the sense of its multiple attention demands but is independent
>of flying experience in the sense that the component tasks are
>individually unlike any component tasks in flying an airplane.
>Furthermore, the current version of the test (Version 4.6) is far more
>complex and hence more demanding and "stressful" than the version you
>used in 1993.
>
>Flight simulators, on the other hand, are the quintessence of a
>culture-dependent environment. Manufacturers strive for the ultimate in
>verisimilitude to an airplane and the flight environment. Simulators are
>designed this way because they are meant for high training transfer and
>for assessing an experienced pilot's currency. They are not designed for
>assessing an individual's potential for future pilot performance. How one
>performs in a simulator depends so much on flight experience and currency
>and on the variability in the ratings of different check pilots that it
>is a poor basis for selection.
>
>In the four corners of the world we have observed countless breakdowns in
>performance under stress during WOMBAT testing, countless attempts by
>testees to explain their behavior by blaming the interface, the game-like
>environment, their poor comprehension of the tasks to be performed, and
>so on. Many of these candidates, when trained by the hiring organization,
>have failed to perform acceptably prior to certification, and worse yet,
>some have failed on the line. When access to the job of your life depends
>on a WOMBAT score, the level of "WOMBAT stress" is comparable to
>real-life emergency operational stress. A few thousand people can testify
>to this.
>
>Regards,
>
>
>
>Jean LaRoche, President
>Aero Innovation, inc.
>Montreal
>
>

Donald Anders Talleur
Assistant Aviation Education Specialist/
Assistant Chief Flight Instructor
Aviation Research Laboratory/ Pilot Training
Institute of Aviation
Willard Airport
217-244-8687 or 217-244-8606