Commented summary of the article Delphi methodology in healthcare research: How to decide its appropriateness

The paper Delphi methodology in healthcare research: How to decide its appropriateness by P. Nasa, R. Jain, and D. Juneja was published in 2021. It is, of course, neither the first nor the last paper dealing with the use of Delphi methods in healthcare; its clarity, however, makes it worth introducing to Czech readers, as the method is applied quite often in healthcare even though opinions on it differ.

Among the essential advantages of the Delphi method are anonymity, repeatability, controlled feedback, and statistical stability of the consensus.

The Delphi method is a systematic forecasting process that uses the collective opinion of panel members; it is also a structured method of building consensus among them. Delphi methodology has become established in various fields of medicine. Over the last few decades it has taken on a key role, for example, in making treatment recommendations using collective intelligence where research is limited, ethically or logistically difficult, or where the evidence is conflicting. There are no generally accepted quality parameters for evaluating Delphi studies in health research, and in some cases questions have been raised specifically about the quality of their results.

Nasa, Jain and Juneja propose a total of nine quality-assessment points for Delphi studies in healthcare across the four steps of the methodological process. This is essentially a decomposition of the entire Delphi process into four parts and nine areas that quality assessment should focus on.

1) Problem definition

Delphi studies are practical in problem areas where evidence based on statistical models is not available, or where knowledge is uncertain and incomplete and collective expert judgment is better than an individual opinion. The problem to be solved can be identified by different approaches: (1) an extensive systematic literature search; (2) group discussion within a defined steering group; and (3) open discussion rounds among panel members.

2) Panel members

Persons who participate in anonymous voting within a Delphi study are called panellists. The basic issues are panel homogeneity and expertise. A heterogeneous (diverse) panel helps to achieve a broader perspective and increases the generalizability of the results; on the other hand, a homogeneous panel may be more useful for previously unaddressed questions within the targeted problem. Expertise is a common label for panellists that is sometimes difficult to satisfy, and the basic question is who counts as an expert. There is general agreement that selection criteria such as education, length of experience, specific experience with the issue to be addressed, or formal position in the organization should be defined before the research begins. Panel size usually ranges from 10 to 1,000, but more typically between 10 and 100; 30-50 panellists seems optimal.

3) Individual rounds of questioning

Analyses of successive rounds provide an opportunity to evaluate the answers in terms of consistency and stability between two consecutive rounds. Repeated and interactive rounds are useful for gathering qualitative information, improving the framing of statements for the panellists, and reaching a consensus. It is important to maintain strict anonymity of panel members and their responses.

4) Criteria for conclusion (termination) of the study

The definition of consensus used in published Delphi studies varies. A commonly used definition is a percentage of agreement exceeding a predefined threshold, a measure of central tendency, or a combination of both. However, the required percentage of agreement varies widely, from 50% to 97%, and is chosen arbitrarily.
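The percentage-of-agreement criterion can be sketched in a few lines of code. This is a minimal illustration, not the paper's method: it assumes ratings on a 1-5 Likert scale where 4 and 5 count as agreement, and the 75% threshold is a hypothetical choice within the 50-97% range mentioned above.

```python
# Minimal sketch of a percentage-of-agreement consensus check.
# Assumptions (illustrative, not from the paper): ratings on a
# 1-5 Likert scale, where 4 ("agree") and 5 ("strongly agree")
# count as agreement, and a 75% threshold is chosen a priori.

def agreement_percentage(ratings, agree_levels=(4, 5)):
    """Share (in %) of panellists whose rating counts as agreement."""
    agreeing = sum(1 for r in ratings if r in agree_levels)
    return 100.0 * agreeing / len(ratings)

def consensus_reached(ratings, threshold=75.0):
    """Consensus = agreement percentage at or above the threshold."""
    return agreement_percentage(ratings) >= threshold

round_ratings = [5, 4, 4, 3, 5, 4, 2, 5, 4, 4]  # 10 panellists
print(f"{agreement_percentage(round_ratings):.0f}%")  # 8/10 agree -> 80%
print(consensus_reached(round_ratings))               # True at 75% threshold
```

Because both the agreement levels and the threshold are arbitrary, a real study would have to fix and justify them a priori, exactly as the paper recommends.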

The criteria for completing the study should include at least four rounds; the essence of a good study is the iterative process and controlled feedback. In most Delphi studies, however, the termination criterion is a consensus reached after a predetermined number of rounds (usually two). With only two rounds, the stability of the responses or of the consensus cannot be verified. Studies declared as "modified Delphi" arbitrarily use two or three rounds of polling, decided a priori, as the termination criterion. The term "modified" is, however, used inconsistently across Delphi studies, without any universally accepted definition.

The stability of responses is even more confusing than consensus, and the stability of consensus is rarely used as a termination criterion in Delphi studies. Traditionally, reaching consensus or completing a predetermined number of rounds has served as the termination criterion. This carries the risk that a significant change in responses in the last round affected the stability of the results or of the consensus. Some authors have therefore argued that reaching consensus is not meaningful when responses are unstable, and stability of results is considered a necessary criterion. Stability is defined as the consistency of responses between two successive rounds of the study.
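One simple way to operationalize stability as defined above is to measure, per statement, the share of panellists whose answer changed between two successive rounds. This is a hedged sketch: the 15% cut-off for "stable" is an illustrative assumption, not a definition from the paper.

```python
# Hypothetical stability check between two successive Delphi rounds.
# Stability here = only a small share of panellists changed their
# answer; the 15% cut-off is an illustrative assumption, not a
# standard from the literature.

def change_rate(prev_round, next_round):
    """Fraction of panellists whose answer differs between the rounds."""
    assert len(prev_round) == len(next_round), "same panellists, same order"
    changed = sum(1 for a, b in zip(prev_round, next_round) if a != b)
    return changed / len(prev_round)

def responses_stable(prev_round, next_round, max_change=0.15):
    """Responses are 'stable' if the change rate stays below the cut-off."""
    return change_rate(prev_round, next_round) <= max_change

round2 = [5, 4, 4, 3, 5, 4, 2, 5, 4, 4]
round3 = [5, 4, 4, 4, 5, 4, 2, 5, 4, 4]  # one panellist moved 3 -> 4
print(change_rate(round2, round3))        # 0.1
print(responses_stable(round2, round3))   # True
```

A study could then terminate only when both consensus and such a stability condition hold in the final round, which addresses the risk of a late swing in responses described above.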

Conducting a good Delphi study is relatively challenging: it requires preparing the design, selecting the panellists (experts), conducting a sufficient number of rounds, and achieving stability of results. On the other hand, the method provides a very solid basis for decision making, especially where other methods are impossible, expensive, or otherwise ineffective. The anonymity of individual panel members removes natural biases such as dominance or group conformity ("groupthink") that can be observed in face-to-face group meetings.

The above description shows that this method is not just a kind of "table talk": it is not based on the dominant opinion of a single expert, and it can be used to address a range of healthcare issues.


References:

Nasa, P., Jain, R., & Juneja, D. (2021). Delphi methodology in healthcare research: How to decide its appropriateness. World journal of methodology, 11(4), 116–129. https://doi.org/10.5662/wjm.v11.i4.116