Industry CRM Developers

 

CRM METRICS: A RECAP

by Dave Wilson

In a whimsical moment of weakness at the recent Aviation Symposium in Columbus, I volunteered to construct a review of the message traffic about CRM Metrics that has appeared on the CRM Developers' list-serv. I am now beginning to write after capturing 45 messages written by 18 different authors, covering 10 threads of thought. I am sure that those of you who have done a literature search before are laughing and scratching about how easy a task this is compared to the one(s) you have done. This is not my first, but it is a challenge.

The challenge comes from the wide variety of thought and concept of what CRM metrics are and what they should do. I should add: "and what they can do for us." The variety of the threads in the archives gives an idea of the complexity.

  • CRM Metrics (CRM Shift)
  • CRM Metrics and Checklists
  • Underpinnings of CRM
  • Boundaries of CRM
  • CRM Competency Indicators
  • CRM Evals (Carried under the subject line: Many thanks and some thoughts; more about that later).
  • Miscellaneous (Three relevant messages did not clearly fit into another thread).

Making sense of what we have said about metrics is further complicated by the fact that some threads, though chronological, were separated by several months. They also appeared under different titles, as afterthoughts, or as gratuitous commentary on another subject. Further, there are peripheral subjects that come from the metrics discussion or lead into it. Where to begin? It is a puzzlement.

I chose to address the threads in the order that makes sense to me. The underpinnings and boundaries of CRM need to be defined before any metric system is practical. The subjects of competency indicators and evaluations are next because evaluation and indicators are an integral part of a metric system. Next, I took on the papers that spoke directly to the subject of CRM Metrics and organized them in what to me was a logical order.

The order is as follows:

As you know, the messages do not always follow a logical pattern. As a result, the threads of a given argument may run through several messages in different groups. When that happens, I have tried to note it. I hope this document is a useful tool for you and for a CRM Metrics Tiger Team, which I hope will soon be organized.

This is my compilation and opinion of what at least 18 of us have said about some facet of metrics. If you have a "pearl of great value" that you do not see represented here or if I have not presented your viewpoint accurately, please bring it up in response to this epistle. If you wish to take issue with a point or concept, please do.


UNDERPINNINGS OF CRM:

Under this thread there were eight messages, by four authors (Mancuso, Krey, Mudge and Wilson). The first message was dated 11/23 and the last 12/31.

The thread started in a message from John Wise on another subject. He suggested that we define the underpinnings of CRM before we attempt to describe a metric. Vince Mancuso was the first to take up the challenge.

Vince felt that the focus should be on performance measurement, since that would reflect both the process and the outcome. He further narrowed his focus to the "management skills." Vince stated: "...if we hope to measure anything, we have to have a set of standards of expectations unique to the job or position as a gauge for performance."

Vince's company has a defined set of management skill expectations that serve as a foundation for assessing and training the CRM "management skills." He notes that some frequently used metrics require the rater to decide satisfactory performance on a four-point scale. The fatal flaw in this approach is that the rater is required to define the quality of performance without a set of expectations other than the rater's opinion.

Vince likens their CRM task list and performance evaluation to an audit trail. The latter won't work without the former. He holds that we must define the underpinning of skill if we are to measure it, and suggests the following definition: "Skill is having an appropriate response for a given set of circumstances." He says that this is as true for CRM skills training as it is for technical skills. Given that, we must define the appropriate response we expect from the crew/individual when faced with certain circumstances. This is not to say that we define every possible set of circumstances, but they could be covered at the macro level. At that level IF-THEN statements could be used. They could be worded as: "Given Set of Circumstances" . . . "Appropriate Response"
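
As a rough illustration only, macro-level IF-THEN expectations of this kind could be written down as a simple lookup table. The Python sketch below is not Vince's company's task list; the class, function, and circumstance names are all hypothetical, and the single entry is borrowed from the speak-up expectation mentioned in this discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """One macro-level IF-THEN statement: circumstances -> appropriate response."""
    circumstances: str
    appropriate_response: str


# Hypothetical entry for illustration; a real task list would be written
# by program designers and focus groups, not taken from this sketch.
TASK_LIST = [
    Expectation(
        circumstances="subordinate crew member is uncertain",
        appropriate_response="speak up",
    ),
]


def expected_response(circumstances, task_list=TASK_LIST):
    """Return the documented response for the circumstances, or None."""
    for item in task_list:
        if item.circumstances == circumstances:
            return item.appropriate_response
    return None


def meets_expectation(circumstances, observed_response, task_list=TASK_LIST):
    """Compare observed behavior with the documented standard.

    Returns None when no expectation is documented, so that the rater's
    opinion is never silently substituted for a written standard.
    """
    expected = expected_response(circumstances, task_list)
    if expected is None:
        return None
    return observed_response == expected
```

The point of the sketch is the audit trail: the evaluation function can only answer yes or no where the task list already holds a written expectation.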

Vince feels that it is perfectly appropriate for a manager to set this kind of task expectation for crew members. For example, subordinate crew members are expected to speak up when they are uncertain and will be held accountable if they do not.

Vince followed with another message that spoke to the issue that "metrics are the tools and methods used to measure something. The dictionary defines metric as 'a standard of measure'." He also included a quote that brought Neil Krey into this part of the discussion. Einstein once said: "I wouldn't give a fig for simplicity this side of complexity. I would give my right arm for simplicity on the other side of complexity." Vince noted that the simplicity is only gained by understanding and harnessing the complexity. He noted that pilots, instructors, and managers do not care how metrics work, only that they do work.

In his next message Vince noted that most of his company's management skill (CRM) expectations were "non-phase specific," meaning that they were written at the macro level to apply to all phases of flight.

Neil Krey came online with some thoughts about the "distance" between the performance/behavior and the metrics. He used a quote from Peter Senge in The Fifth Discipline that describes two types of complexity, detail complexity (many variables) and dynamic complexity (when cause and effect are not close in time and space). Neil then notes that we may have to trade the dynamic complexity of the current evaluation system for the detail complexity of the system that measures performance closer to the actual intervention.

He joins Skip Mudge in opining that most of us have two parallel sets of metrics in place, those used during evaluations or check rides and those used on the line. Our challenge, he says, is to identify the differences, unify the parallels, and implement a single standard. Fly as you train and train as you fly!

Skip Mudge came online to reinforce a point made by Vince Mancuso. He echoed the opinion that CRM program designers and focus groups ought to identify and document the norms to be placed in the task list; they should not be determined on the fly by evaluators.

Skip spoke to Vince's comment that performance measurement can be reflected in both the process and the outcome. Skip feels that the process should be measured. He feels that if the system is properly used, and the crew still experiences negative outcomes, then the system failed, not the aircrew. First, "fix the system, then train the pilots on the modifications."

He notes that most grading scales require some sort of judgement by the evaluator. He prefers a "yes/no" observation. He concurs that to do this you must have a "logical set of clearly observable behaviors."

Skip concludes: "Effective CRM, whether it be a management system or not, should be proactive, in addition to being reactive. It should guide the crews to make better decisions at all times, and significantly reduce the risk of potential problems even developing. . . ."

About this time I joined in the discussion with a wave of agreements suggesting that with the traffic up to that time maybe we were at the edge of needing a Tiger Team on Metrics. The management skills that Vince describes should be termed performance skills. I tried to summarize the differences in Vince's and Skip's approaches as stated thus far and went on to raise a question as to the difference between the measurements of process versus performance. I felt that as the crew's performance is where the training we design shows up, that is where it should be measured. I endorsed Vince's idea of "Given Set of Circumstances . . . Appropriate Response." I also gave two examples where that approach would be proactive and reactive.

Vince leapt back into the fray with a message supporting the idea of a Tiger Team. He was trying to get their CRM task list released to the group. He was working that issue as this was written.

Vince went on to define the differences between his understanding of policy and procedure. He suggests that "the only difference between the statement of expectations in a policy versus procedure appears to be the specific reference to a phase of flight and reference to a checklist." He thinks that any Tiger Team should develop two task lists, one that is phase-of-flight specific and requires reference to a checklist and one that is non-phase-of-flight specific and not dependent on a checklist.

Neil Krey closed this subject thread with a call for other comments as to the formation of a Tiger Team.


BOUNDARIES OF CRM:

The boundaries thread involved five messages from four authors (Helmreich, Keasal, Mudge, and Mancuso). The first message was on 1/26 and the last was on 2/11.

The thread started with Bob Helmreich's response to a statement from Skip Mudge on the universality of the elements of CRM. Skip's comments were in his message about November's Topic. Bob agreed that many things taught in CRM apply far beyond the cockpit, but protested representing CRM as a means to a better life. In a number of airlines the credibility of CRM decreased in direct proportion to the extent that it was seen as a panacea for the ills of human interaction.

Bob noted that the majority of pilots practice, and have practiced, CRM naturally and that much of our efforts are focused on those who resist or see the efforts as misguided. Those who resist are in the minority, but are likely to be turned off by programs that make the point that they can cure the range of human ills from family to the cockpit. Bob's data show that we should limit CRM to the reduction and management of human error.

Dave Keasal also responded to Skip's message that CRM skills must be integrated in the individual. He noted that a person starts with a "set of tools" and, as they (and the ability to use them) mature, the person should be using them at all times. Dave agrees with Helmreich that CRM skills are not the only things required. Technical training, physical skills, etc. are also on the list. He feels that we have oversold CRM as a panacea for flight safety and that the most important thing is to have a proper operational management attitude. He suggests a Boeing publication "On Design of Flight Deck Procedures" for a good discussion of this point. CRM is a skill, but one that must exist within a proper management system. He hopes for the day when CRM is not a separate subject, but is a part of the process of becoming an aviation professional.

Skip, in response, agreed that CRM skills cannot be considered instead of technical and physical skills. He notes that the technically proficient pilot who lacks effective team management skills could lead the crew down the poor judgement chain. ". . . it (CRM) is not a panacea, but it is an essential element." He sees CRM as a skill, but one that exists in a management system. He also concurs that the company's management must fully support (and require) these management procedures. One of the things required is a firm commitment from management that the procedures are required at all times. "It (CRM) is just one component of the management system. It is not a separate entity."

Vince Mancuso came on in full support of Bob Helmreich. He describes the actions in his company to establish printed definitions of both CRM and human factors in their flight operations manual. He also describes a study he is participating in with Dr. Carolyn Prince (US Navy). They are conducting surveys in the Navy and several airlines to determine pilot perceptions of CRM and Human Factors. The results indicate that there is a long way to go in undoing misperceptions.

Vince sees a problem in that many organizations do not have a realistic set of boundaries for their CRM programs. He feels that CRM and Human Factors have become buzz words that have lost their meaning to many. He is dismayed that many primary players in aviation use the terms interchangeably. (editor's note: For an interesting discussion of definitions, see the messages in May 97 Re: CRM Definition).

Vince provides a humorous example of the different meanings CRM has for people when one of their senior executives brought some graffitied flight logs into the CRM office and asked them to fix the problem.

Vince feels that some are supporting a broad, even Life, approach to CRM. He prefers Jim Reason's taxonomy of task, workplace, and organization. He notes that there are many things in his job as a mid-level manager and CRM developer that he is to work and many others that he does not. He goes on to describe the definitions they have developed:

Human Factors: The science of what contributes to or detracts from human performance.

CRM: A subset of human factors with the specific definition as follows: The management skills used to direct, control, and coordinate all available resources for safe and effective operations. He goes on to describe the elements of their program as skills (communication, crew coordination, planning, workload management, decision making, and situational awareness management).

He feels that if it is not a management skill then it does not belong in CRM training. Issues like fatigue, personality, stress, nutrition, etc. are human performance issues and are trained to an awareness level in their program. Vince states that when one looks at error management from a larger perspective, then CRM is not the only tool, and that there are other programs that can better deal with those topics.

Extending CRM tools into the workplace and organizational problems is a mistake, according to Vince. He notes that many of our earlier programs tried that approach, among others that were ineffective, such as awareness briefings designed to inform rather than train. He feels that the military emphasis on Operational Risk Management is a perfect example of a program designed to address workplace and organizational problems.

Vince points out that in the limited amount of time the CRM developer has to work with, there is a finite limit to what he can expect to accomplish. He points to the extraordinary cost of just one day of training in his company . . . $5 million. He hopes for two or three new ways to recognize and respond to error producing conditions in the cockpit. Vince does not call his program CRM any more due to the excess baggage the term brings with it. He uses the term Management Skill Training. The expectations are clearly established IAW the USAF Instructional Systems Design process. AQP has required a disciplined approach to CRM training.

"When an airline (ed. note: or any other aviation organization) takes on the error/risk management approach to human performance, then it is perfectly acceptable for CRM to be just a task-level training solution." Vince concludes: ". . . we have to be sure that the tool fits the task and traditional CRM doesn't in my opinion, fit the organizational level error management task very well."

Skip Mudge rejoined the discussion with an explanation of how their management system is designed. He explained that it is designed, taught and targeted to the specific duties and responsibilities of the flight crew. He noted that the applications outside that environment had arisen quite unintentionally and viewed that as "just another benefit" for their crews. The benefit reinforces the desired behavior on all flights.

Skip noted that most "traditional" CRM programs are developed primarily by psychologists and sociologists with the support of pilots. Their program starts with the pilots then is reviewed by the social scientists. Their course is the product of a thorough task analysis that produced a list of 240 objectives. They were developed into professional responsibilities and standard management procedures, the latter of which are observable, required, and monitored. He says this ensures that there is never any doubt about what is expected and required of each crewmember.

As they begin with a client, his company gathers data on the corporate and departmental culture and values desired and practiced within the organization. This ensures that the training developed is consistent with the client's desired or existing value system. This process continues from initial through recurrent training and gives the client a proven standardized management system tailored to their organization.

Skip notes here that if you have a management system that is successful in the cockpit, and works in other situations as well, why not use it?

He concludes with a response to a question about definitions of CRM and Human Factors. He does not really differentiate between the two, but says that human factors is a broader discipline that encompasses CRM.


COMPETENCY INDICATORS:

Two messages from across much water (Australia and the Netherlands; if I got it wrong, Rien, my apologies) both touched on the issue of competency. Both the authors (Clynick and Doorn) wrote on 3/13. The international dateline does wonders for rapid communications on the same day, unless, of course, you are going the wrong way.

Greg Clynick said that for some time he had been occupied with this: "If we are to teach it, then we must be able to define it and measure it. - For 'it' read CRM."

He argues that CRM in the context of LOFT is to be a non-jeopardy process, yet we want to claim that it makes for better pilots. How do we demonstrate that other than looking at "end of career" accident rates? We need indicators to bear out what we say.

Greg notes that the problem gets more complicated when one considers ab-initio pilots. They can truly benefit from CRM training, but what indicators do we use to make that determination?

Rien Doorn responded on the same day with his thought that the first goal for ab-initio training should be better performance during training. He argues that to learn how to improve behavior, it's important to understand it. He says that to improve functioning, start with knowing how you function, and use that information to improve. (Ed. note: That requires some form of measure or metric).


CRM FAILURES:

This thread involves nine messages by eight authors (Joering, Mancuso, Leimann, Talleur, Wilson, Bent, Heybroek, and Rippey). The first message was dated 10/29 and the last 4/12.

Gerald Joering started this thread off with a statement and a question: "I am very interested in how other airlines are handling their CRM failures (boomerangs). Are there any special techniques for requalifying pilots after bad LOE rides?"

Vince Mancuso provided the first response to Gerald's question. After a quick definition of the LOE, Vince asked if their LOE had an assessment/evaluation tool that pinpointed an LOE deficiency as a CRM deficiency. He noted that not every LOE failure will be due to a CRM deficiency, and even if it is, it may be due to the failure of one's management skills. That would not necessarily indicate that they DO NOT BELIEVE IN CRM. Vince understands that the "boomerang" in CRM is one who not only does not believe in CRM, but one who comes back after training worse than when he/she started.

Vince feels that the important thing is for the CRM program to focus on skill building. Another caution he offers concerns programs in which the focus is on skill building, but the metric uses attitude/perception to decide training success.

Vince sees a danger in using the term "boomerang" to separate believers and nonbelievers. He feels that to do so runs the risk of making CRM sound like a religion. This can happen if we focus on perceptions/attitudes rather than outcome and process. Vince sees a danger in focusing on "attitude change" as the primary outcome of a CRM program. He notes that many past post-training metrics have focused on perceptions rather than on training outcomes. He suggests reading John Wise's comments (on the dangers of perception measures as a metric for training outcomes) and his own comments on CRM outcome and process metrics. (Ed. Note: See the CRM Developers web site archives for these comments).

He closes with a restatement of his first question. Is the LOE failure a CRM problem? If so, is it an attitude or a skill problem? How are you determining either?

The next message was entitled "Many thanks and some thoughts" from Hugo Oscar Leimann. Hugo's message touched on several topics, among which was a continuation of the failure/evaluation subject.

Hugo introduces his thoughts on the subject with questions: "If we know that we are dealing with nothing less than 80% of the accidents here, is it not the time to think about establishing or reinforcing some kind of "CRM-exam" like with the technical stuff?", and "Are we confident that all the "approved" courseware are enough to cope with this huge problem?" He notes that they were pretty optimistic till the three B-757 accidents. The reports show that there are still CRM problems and that the training did not reach everyone. In some situations CRM skills and attitudes are overrun or forgiven and again we have a pick (up) in accident statistics. Hugo concludes with a suggestion that it is time to enforce some kind of check system to approve or not CRM behaviors just like we do with the technical skills.

Donald Talleur also came on line with a question: "How does one "bust" on CRM during a sim or check flight?" He sees no evidence that the industry has qualified what constitutes CRM to begin with. At their lab they have researchers who are having difficulty defining adequate metrics for CRM. He feels that the purpose of CRM training should be to enhance skills by example and practice. If we pass/fail crews on this what have we accomplished? Donald raises the specter of a CRM bust(s) leading to a career failure and suggests that we wait on a pass/fail system till everyone has had a chance to be trained in CRM from day one of their career. Then all will have an equal experience level. He suggests healthy discussion to establish CRM criteria before we go to a pass/fail system.

Donald raises another point by question: "Put a new crew together for the first time and how do they perform?" He feels this is bound to affect any measurement of CRM performance. He concludes: "And by the way: It is not good enough to say that because they didn't crash that CRM was adequate!"

Dave Wilson (your friendly editor) entered the discussion at this point. With thanks to others I pointed out that even on a pass/fail scale, evaluations constitute a measure of some sort. I asked if anyone out there had a metric system for CRM either objective or subjective. I noted that both Delta and the Air Force Air Mobility Command had such systems with which I was familiar.

I thought it would be interesting to tie a metric system into CRM training for inexperienced crew members, say those with 0 to 500 hours. That could lead to answers to the interesting questions of when and how to start CRM training.

A true metric system for CRM could serve several purposes. One of the most valuable would be to identify and remediate the crew member who "just does not get it." I noted that we would not hesitate to "retrain" a crewmember who was deficient in a technical skill. I concluded that maybe it was time for the CRM police. :-)

John Bent was the next one to enter the fray with comments in response to Talleur's question: "How does one bust CRM (per se)?" John notes that CRM is not a "stand alone" item, but a broad title covering numerous preferred "behaviors." He also notes that the NASA/UT/FAA Crew Research Project, led by Bob Helmreich, has developed "behavior markers" to identify CRM competencies.

John goes on to say that at his company they modified the markers to their own language and "for years" have used them to evaluate CRM during all checking processes. They use the markers renamed: team building, appropriate assertion, and communication. They also look closely at sharing and teamwork in all evaluations, sim ratings, and line (route) checks. "The 'multiplying consequence' of breaking down the components of effective CRM in an organization, and requiring inspection of these, is to educate the aircrew body as a whole." They use a "Line Check Form" that has ratings for the markers from 1-5 (failure - excellent). Most of the markers relate to CRM behaviors.

Rick Heybroek came online with his answers to some of the questions raised thus far. As to CRM failure during recurrency, Rick relates that different operators have different policies. He recalled a notable CRM only failure briefed at a 1992 AQP meeting. The ride was scored with an AQP LOE form that is fairly typical. The operator in the case noted that the CRM only failure suddenly changed attitudes toward CRM among the pilots in that company.

Rick also pointed out that there is extensive literature and empirical research to answer the question: "What is CRM?" He notes that a web search yields quite a few titles. He suggests that solid research data shows the FAA CRM categories agree very well with the Naval Training Systems Center research (Drs. Prince, Salas, and NASA's Dr Kanki).

Rick notes that we always reach a point where the question is not, does this really measure X, but does it measure an interesting property of X? Rick commented that the AQP LOFT Focus Team debated the issue of the value of pass/fail criteria for crews. The LOFT provides a non-jeopardy experience and the LOE with integrated CRM. There are still training captains who do not understand this, he feels, though hopefully not many. LOFT as a learning experience describes it well. Pass/fail applies to LOE, and Rick thinks that is what some find objectionable.

The advice from the AQP working group has always been to involve your unions in the scenario validation. Rick feels that one would have to have a really bad attitude and really work at failing LOE persistently on pure CRM. He says at this point we refer back to the FedEx Capt. Queeg video.

In response to the question of crew cohesion/familiarity, Rick notes that there IS a difference between flown-together and not-flown-together crew performance and refers to a 1986 paper by Foushee, Lauber, Baetge, and Acomb. AQP usually schedules LOE towards the end of the footprint to allow familiarization. However, he notes that since crew changes are common, the ability to cope with them is arguably a reasonable parameter for leadership and team building skills.

Rick concludes with a request for a discussion among CRM staff at AQP-qualified operators.

Unable to keep quiet, your friendly editor came back into the discussion in response to the questions raised by Donald Talleur.

As to how one busts on a CRM sim or check flight, I recalled an incident from my AF days. An evaluator pilot failed an entire aircrew on an aircraft checkride because the copilot did not complete a hydraulic panel checklist item and no one caught it (other than the evaluator). The crew was failed under the seldom-used "pass/fail" item called Crew Coordination. A problem surfaced in the corrective action. The evaluator prescribed two four-hour simulator missions. Unfortunately, at that time the simulator rides were focused on technical issues and procedures. I suggest that the incident illustrates two things: We have always known how to "bust" a crew for a CRM failure. Once that happens, we are not sure what to do about it, and the corrective action does not always fit the CRM deficiency.

Donald suggested that the purpose of CRM training should be to enhance skills by example and practice. He also expressed concern for one's career after a "bust". I responded that though I agree with the first, I do not know how we can achieve our purpose if we have NO metrics. The metrics exist, we just have not defined them for a global application. Most of the major airlines and military commands have metric systems of some sort. Anyone who has evaluated aircrew performance will tell you that they can recognize good and bad resource management when they see it. The hard part is isolating the CRM performance from the technical. I repeated a question Bob Helmreich had posed at the Eighth Symposium. He described a crew flying an ILS. The Captain did not handle things well and let things get out of hand. The first officer compounded the problem. The approach had to be aborted and they went missed approach, because they were out of limits for a safe approach. Was the problem the pilot's flying skills or CRM?

As to the question of one's career after a bust: I think the problem would be in the training and the form of the check. If the training system is properly set up, the aircrew's substandard performance should define the remedial training. If a crew member cannot respond satisfactorily, then he (she) should be doing work other than the cockpit or cabin.

I responded to Donald's suggestion that we wait until everyone had CRM training from day one so we would have an equal experience base. It seems we can define the presence or lack of CRM from the CVR after the accident, but apparently not before the fact.

I also noted that there have been several studies comparing the performance of crews who fly together for a long time versus that of crews who come together for only a short time. Usually the crew that stays together for a few days (or weeks or months) handles problems better than the newly formed or "pool" crew. However, even the "hard" crew begins to take things and each other for granted and their resource management slips as they mature as a crew. I related my 20+ years of flying in an Air Force "pool" system where training and standardization overcame the initial awkwardness. If it does not then the training or the process is the problem, not the evaluation system. I also related my experience over the last 10 years of training 100 to 150 crews a year in that same system. They usually overcome that initial problem. When they don't, our training focus is more clearly defined.

I feel that CRM training should help a crew handle the problems that they face in their operational system. If it does not, the problem is in the training not the evaluation. Evaluations should reflect the operation as it takes place on the line.

The problems of pass/fail are critical to the acceptance of any system of metrics. CRM metrics have many values to aviation in crew performance, corporate training emphasis, designating the culture of an organization, and others. I find it frustrating that we cannot come up with some universally accepted way to measure something that I think we can all recognize as good, not-so-good, or bad when we see it.

The last message in this thread was from Colin Rippey. He noted that they, too, had been using the NASA/UT Performance indicators for a couple of years. They are currently working the issue of grading and what to do with failing grades. Their early efforts to get the "old salts" to buy in were to design their program as non-jeopardy. As Colin described it: "You have to take part in the training, but if you don't display attitudes and behaviours promoting good CRM, then no big deal."

Now they are working toward AQP, and grading CRM performance will be as important as any other area. In their case, a CRM failure by an individual or crew, on the line or in the sim, is dealt with on a case-by-case basis (and they do not have many), generally including an interview/counselling by one of their CRM development/facilitator team members.


METRICS AND CHECKLISTS:

In early Oct. 96 two messages appeared that linked metrics and checklists. They came from two authors (Barnes and Wilson) and were dated 10/8 and 10/10 respectively.

Bob Barnes spoke from a Certification Flight Test perspective. He has noticed that there is a marked difference in the CRM used by three-pilot versus two-pilot crews. If those seeking certification do not account for these differences through effective training program design and appropriate checklists, there can be a perception of an increased workload and in some cases an unacceptable rating of the design by the line evaluation pilots.

I responded to another's message that touched on the subject of checklists from a similar angle. I noted that good procedures are no guarantee of good performance, but that incidents and accidents are not necessarily caused by poor performance in the technical areas; when they are, we train (human skill) or fix (mechanical) the problem away (we hope!). In either case our objective is finite, so we develop finite measures to help us assess the success of our fix. In the realm of human interaction we have only recently (10 years or so) begun to train pilots, then crews, now teams in resource management. That perception alone reinforces the assertion that the "operating envelope" is substantially broader than the imagination of the procedure writer.


SEPARATES:

Three messages spoke to metrics but did not tie as clearly to other threads, so I have chosen to characterize them as SEPARATES. They came from two authors (Deen and Leimann) on 25 and 29 Nov. and 17 Dec.

Greg Deen, in his first input to the group, spoke to the relationship of the CRM program and management's acceptance or even endorsement of it. For management to continue to support the program or to expand it requires some sort of demonstration that the program is doing something productive. Greg humorously likens the CRM program with which he works to a giant cement wheel. It moves and has inertia; if you try to stop it you get run over, and if you try to push it you get a hernia.

He feels the success of the program is best shown by a simple sales technique. Get management to buy into the program by showing that it gains a benefit or avoids a loss. Greg says that gaining a benefit is more challenging to document and demonstrate to management than avoiding a loss. It requires an active and relevant metrics system. He cautions that we should not see metrics as a stick over the crews' heads, but as a system to document sincerely candid success stories, not just hazards.

Avoiding a loss can be measured by improved launch reliability, fewer mishaps, fuel savings etc. Tie those to the CRM course, and show management the relationship.

Greg also asks the group a question: Is it possible that we are trying to take on too much in a metrics quest? He urges us to simplify our efforts to find a measure, and notes that our original need for metrics was to evaluate the effectiveness of our training program. He feels that deviations from the objectives of the course will be easy to spot, and the harder question will be the WHY of the deviation.

He also cautions that management, especially senior management, must take the lead here. If they see their leaders playing fast and loose with the rules, the crews will follow. He also repeats his caution about negative reinforcement, and challenges us to show the crews what to do, not just what not to do.

Hugo Leimann came on line with a rather light-hearted input called "Misteriometer," which translates as "my little farm," in which he chides us for being protective of our corporate secrets in our quest for a metric system. He notes that our forum is the most open and sincere of those in which he has participated. He understands that we have a real dilemma with the dualistic nature of our participation. That is, "To say or not to say?", "What are our current projects or discoveries?", etc. He notes that there are several writers who have addressed this dilemma and hopes that we can find a way around it in the area of CRM metrics. He suggests a strict code of ethics and keeping a record of what is said and by whom.

Greg Deen came back on line with a Metrics test. He related the story of a crew that he was training in a MOST (military version of LOFT) mission. He relates the situation and the crew's dilemma, indecisions, conversation, and decision. He then asked how we would apply our metrics theory to the training event he described.


METRICS:

This is the last chapter of the summary. Like most last chapters, it has its beginnings in the earlier sections and wraps up most, if not all, of the threads. There are fourteen messages by six authors (Wilson, Wise, Hendy, Hulin, Mancuso, and Deen). The messages started on 10/3 and ran through 1/29.

I started this thread with a message on metrics on 3 Oct. 96. I was working on a project concerning a customer's requirement to develop a CRM metric system. I had noted an earlier developer's list comment about a "shift" of CRM from a separate training program to one integrated with the technical training program. I warned of "over-integration" to the extent that the CRM program lost its identity. Our company made an unsuccessful attempt to integrate CRM into technical training. Actually, I guess we were too successful; we integrated it so well it lost its identity and integrity. After a program review, we recognized what was happening and rejuvenated the CRM program with its own identity, but still integrated. I noted that aircrew responses to our efforts were very positive. I opined that though not solid metrics, such inputs coupled with those from evaluators, check airmen, and instructors could provide some answers to the CRM metrics question.

John Wise cautioned in his message that "aircrew response" is a bad place to start because the data are highly suspect. He supported his comment by saying that anytime people feel that their job is on the line, they are going to give the politically correct answer. He suggested that such temporary measures seem to stick around for a LONG time, especially when they give the desired answers. They gain a certain validity just because they are used by all the "right" people, and the results gain a respect they do not deserve. Since they generate the answer people want, there is little reason to try to develop a valid measure.

I responded in two messages in partial agreement with John. I noted that it was not my intent to rest all our metrics on the questionable laurels of aircrew response. But neither should we ignore it. I suggested that to rely totally on that type of information may be asking the fox how he liked the hen house. However, the check ride and exam results seem to be, at least for the present, the best objective measures of performance and knowledge we have.

I then wrote a short lament that it would be great to have as exacting, finite, and objective a measure for CRM as we have for checkrides and writtens. That raised some questions in my mind about CRM metrics: What and how are we going to measure? What are the criteria for success? The first answer is directly related to the "underpinnings of CRM." The latter is related to the use one chooses for the information gleaned from the measures.

If CRM training is only about behavior modification, the answers are in the realm of the psychologist. If CRM training is about the acquisition of certain skills then the presence or absence of those skills should be fairly easy to detect by operational evaluators. If CRM depends on both, then experts from both houses should have valid inputs.

I also asked the question: What is the purpose for evaluating CRM output? If we want to improve the training methods or program, the approach would be different than if we were trying to support investment in CRM training. There are some questions of definition and direction that need answers before we can develop a valid measuring system.

Keith Hendy came into the conversation with a description of the progress they are making in CRM metrics in his organization. He suggests that "all aviation training is about instilling behaviours in aircrews that will result in timely and appropriate decisions/actions." For those decisions to be appropriate he suggests that the situation must be attended to, assessed correctly, the decision/action implemented, and the result monitored to make sure the goal is satisfied. Therefore, CRM training should be aimed at those domains to ensure that the goal is achieved. In this way, Keith feels that CRM becomes the management of attention, workload, and knowledge. He suggests that this approach can become the basis for an objective metric of CRM or even aircrew proficiency. They have begun to use this approach among their CC-130 crews.

Keith goes on to explain that their approach grew out of some theoretical work they were doing on what operator workload was, how to describe it and how to measure it. He feels that operator workload management is really time management. If you add in the concepts from Perceptual Control Theory (PCT) you get a powerful tool to explain both individual and group behaviors. This approach argues for a skill based approach to CRM. Keith feels that the modification of crew attitudes is at least difficult and the vote is still out on the success of modifying behavior.

John Wise rejoined the discussion at this point and suggested that behavior is the only thing worth measuring in a CRM environment. If not behavior, why bother? He also expressed a desire for more information on Keith Hendy's work at DCIEM. (Ed. Note: Keith's very interesting paper was presented at the 9th Symposium and will be included in the Proceedings. Also see Keith's work on Measuring Situational Awareness and Workload Management in the "What's New" section of the Industry CRM Developer's Web Site.)

Charles Hulin entered the discussion stating that behaviors are what we are attempting to change and they are what we should assess in CRM training courses. He feels that the behaviors we are trying to change cannot be changed without some change in cognitive or attitudinal state or condition, though he is not sure whether changed attitudes or behaviors will come quickly or automatically. He does feel that long-term behavioral change without accompanying changes in cognitive states is not going to happen. So we try to change behavior by changing the supporting cognitive structure.

Vince Mancuso joined the thread by noting that all possible measures, distilled to their core, appear to fall into one of three bins: 1. Outcomes, 2. Processes, 3. Perception Measurement Tools (surveys, etc.). Each has its own useful purposes and limitations.

Vince noted that John Wise highlighted the limits and traps of perception measures. John thought that the practitioners might want to retain these tools. They can be useful if they are used with caution and respect for their shortfalls. Outcomes measures are very difficult to obtain because of all the confounding influences. Though not a simple task, Vince thinks that we could identify the unexplored areas and identify the dependent variables that the operational managers could use. Perhaps then we could identify some independent CRM/Human Factors variables that have a predictable influence. However, Vince feels that Process measures are the most fertile for development in the area of CRM/Human Factors. When the task list or clearly defined management behaviors are identified, there exists a focal point for observing and assessing the process. The safety analysts and researchers tell us the conditions that lead to crew error. They can be added to the task lists. Many will have a surface validity that will allow the operational manager to make them into a requirement.

Vince concludes with the thought that there are many unexplored areas in the measurement of CRM process and this forum is just the place to explore them.

In response to a request from Neil Krey, I provided some words on how the debriefing in the C-130 MOST program works. He also asked that I address the differences from the civilian world. Since my experience is primarily in the world of military aviation, mostly with the C-130, I chose to stick to my subject and let the readers do the comparisons. There followed a brief description of the subject followed by a tie into metrics.

After ten years, our training program still gets the highest ratings from the aircrews. Since they seem quite at ease criticizing anything else that they feel wastes their time, I see no reason to discount their praise. I know from personal experience that the crews say the same things about our CRM training in private that I read in their critiques. Still, I would truly like to have some sort of quantitative and qualitative measure to lead us to further improvements.

In my next message I continued the discussion and tried to describe our dilemma. I know that though we have a metric system, we have to define what to measure and a scale. In CRM there are three things that are suggested to be measured: attitude, behavior, and courseware. Any aspect of CRM can be pressed into those categories.

Attitude is very difficult to measure; it is much easier to do a comparison (a la Helmreich). But then who defines the subtleties of attitude? Does attitude in the testing room reflect attitude or performance in the aircraft? And, if it does, is there a correlation?

Performance or behavior is easier to measure, but in the CRM arena it is not always observable. How do we design a check ride or LOE to elicit all aspects of CRM, especially when any one of several behaviors may provide a satisfactory conclusion to a problem? If we build a scenario designed to elicit a particular behavior, it probably will.

The behavior expected from CRM training is dependent on the courseware. Here is where I see Vince Mancuso's comments on the subject of CRM program makeup as particularly relevant. He suggests the elements of STRUCTURE, CONTENT, METHODS, AND DEVICES must be addressed. But before that we need to know what goals we are trying to attain.

I suggested that maybe our focus for a metric has been too macro, perhaps even in the wrong direction. We know that most experienced operational aircrew evaluators (and several researchers) can identify good CRM when they see it. I followed that comment with a description of our program that focuses on the definition of CRM tools, and where and when to use them. But we still have the question of how to measure results.

I noted that we can and do test in the academic situation. We can, and do, evaluate crew performance in the simulator. We design scenarios to simulate an actual logistic or combat mission. Will the lessons taught so far reach the aircraft (simulator)? That is where I maintain the objective of the metric system should be focused.

I suggest that there are two ways to evaluate a crew: self-evaluation and outside evaluation. The self-evaluation, particularly if it follows a formal checklist and is conducted after every flight with sufficient time allotted by the company, is the most productive. The evaluation would have to involve the whole crew, front and back. The time allocation and the funds to support it show a commitment by "management."

The outside evaluation should be accomplished by a CRM "qualified" evaluator or check airman. There also needs to be a reckoning that not every mission will produce all expected crew behaviors. Another point to be reckoned with would be the CRM bust when the mission was successful or vice versa. I noted that that was the essence of this thing we are dealing with. It is different from the application and demonstration of technical skills.

In the next message in the thread Greg Deen gave his philosophy of CRM Metrics. He stressed the importance of keeping focused on the goal of CRM training: To improve mission effectiveness. He did not limit that to just the cockpit, but to all the teams that constitute the aviation system. He noted that understanding the behavior within and among the teams is paramount to evaluation and training of the team members. He noted that some seem to think that metrics are the tool for understanding the process; Greg maintains that simply understanding the process is not the solution.

He notes that though we understand gravity, we are not free of it. Understanding only gives us the knowledge to adapt or use. In aviation, the understanding comes from evaluation. That requires a tool that we are calling "metric."

Greg feels that metrics are simply the feedback to training. Tests and evaluations are metrics and the cycle completes when we evaluate the effectiveness of training. The key he says is in the training document and training objectives you have set. Did the aviator respond properly to a "given circumstance?"

Greg suggests that the question of metrics seems to be "a chicken or the egg" dilemma. Which comes first? He warns that some training is fairly easy, and is easily undermined if the process itself is not understood. Reference to a copilot as the "wetware" beside you or training him to just watch and listen dilutes the training program and teaches the copilot to just become a passenger. Crews should be trained to share the work load, and according to company policy, the PIC should delegate flying duties. When the appropriate behavior is reinforced with repetitive training, metrics becomes easy.

Metrics is easy, Greg says, if the training, management and evaluation system all work in concert toward clearly defined behavioral goals. The effectiveness of the teamwork system will be judged by the philosophy of the metric system.

Vince Mancuso challenged Greg's assertion that "metrics is easy." He was cautious about the development of metrics, management, training and evaluation systems. Vince invited Greg to share some specifics about the C-130 scenario mission that Greg had presented to the group. (Ed. note: See Greg's message in the CRM Developers Group archive, dated 12/17/96).

Greg responded with a message that started "Metrics IS easy." He then noted that his comment was somewhat facetious, but he does believe that metrics could be a simple part of the aviation system. He noted that we have had metrics systems in place for years in the form of flight evaluations. The results are often combined, and when a negative trend is detected, the training program changes to focus on the shortfall. Greg says that the evaluation system identified a problem area to management, who directed training to respond. They did and the problem was resolved.

Greg then applied that series to CRM. He noted the experience in one company over several years. Early on they had a DC-8 crew that ran out of fuel in a holding pattern due largely to a lack of communication among the crew. Many years later one of their DC-10 crews experienced a catastrophic and total hydraulic failure and brought the plane in for an "almost" landing. When the CVR of the latter crew was analyzed, the results of the company's CRM program were both clear and fascinating. As the stress and workload increased, so did communication in both quantity and effectiveness. As a quantifiable entity, "ineffective" communication was the least apparent. The entire culture of the company had changed due to the method of decision making. Greg emphasized that the change did not take place overnight; it took years.

The company started a training program, management supported and endorsed it, and the evaluation program was tied to the training program. All three worked together in concert: he noted that one cannot survive without the other.

Greg started his message with a comment about flight evaluation that looks at "technical" skills. If CRM is taught as a technical skill, the evaluations simplify. As to the MOST scenario that Vince asked about, Greg notes that there are lots of technical skills and CRM teachings available to "grade." He commented that if management would support the CRM evals like they do the technical, we would make a lot of progress toward improving crew performance. The same crew that Greg described flew the next day and greatly improved their CRM skills. He closed with assurances that he would recap the responses on the "CRM Crew" in a short time.

True to his word, Greg came back online with an analysis of the "CRM Incident" he had described in an earlier message. In response to his request, he had received three responses that provide a CRM assessment of the mission he described. He was coming from the premise that "metrics" is the "grading" of CRM behavior. He concluded that we are not agreed on what metrics is.

He received three responses, all with a common thread: the crew was marginally qualified, quite dangerous, and needed retraining, particularly the PIC. One of the responses rated each of the crew members' behavior in technical terms. Greg said that he watched the crew through the entire mission as a CRM facilitator and WST operator. He was not as critical of the crew as were the respondents to his challenge to rate the crew. Greg then asked a question that continues through the rest of the message. Do we rate the whole mission or just one incident?

Using the LLC-4 developed by Dr. Helmreich et al., the respondents' ratings of the crew would have been a "1", meaning poor and unsafe. Greg placed them in the 2 - 3 categories. In some parts of the mission the crew was sharp. Communications were effective and decisions were clear. So what about the landing dilemma, which even Greg called "lousy" from both a technical and CRM standpoint? The pilot should/could have given control to the copilot, thereby lessening his own workload.

Greg closed the last message with almost the same question that opened it: Do we assess the ends, or the means?
