Limitations of A Measurement Tool to Assess Systematic Reviews (AMSTAR) and suggestions for improvement

Burda, Brittany U.; Holmer, Haley K.; Norris, Susan L.

doi:10.1186/s13643-016-0237-1

Table 1 Concerns regarding AMSTAR items, instructions, responses, and suggested revisions

AMSTAR tool^a: item	AMSTAR tool^a: instructions	Issues related to the item	Issues related to the instructions	Issues related to the responses	Suggested revisions: item	Suggested revisions: instructions	Suggested revisions: responses
1. Was an “a priori” design provided?	The research question and inclusion criteria should be established before the conduct of the review. Note: Need to refer to protocol, ethics approval, or pre-determined/a priori published research objectives to score “yes.”	The phrase “a priori design” is unclear.	Unless a protocol is available or the authors explicitly state that the design was developed a priori, a “yes” response is not indicated; thus “cannot answer” is likely the most common response. Many review authors state that they developed the research questions and inclusion criteria prior to executing the search; however, according to the instructions, a report of such an approach would still be “cannot answer” as there is no reference to a protocol, for example.	“Not applicable” is not an appropriate response.	Reword: Were the review questions and inclusion/exclusion criteria clearly delineated prior to executing the search strategy?	Reword: The review questions and inclusion/exclusion criteria should be established a priori as evidenced by a published protocol or an explicit statement in the review. Note: If the review refers to a protocol, ethics approval, or to pre-determined research questions and inclusion/exclusion criteria, score “yes.”	Remove the “not applicable” response.
2. Was there duplicate study selection and data extraction?	There should be at least two independent data extractors and a consensus procedure for disagreements should be in place. Note: Two people do study selection, two people do data extraction, consensus process or one person checks the other’s work.	None.	The main sentence relates to extraction only, and the “note” relates to the other aspects of the question. The “note” is not clearly written; for example, does the consensus process apply to study selection as well?	“Not applicable” is not an appropriate response.	None.	Reword: There should be at least two independent assessors for study selection (i.e., title, abstract and full-text screening). There should be either duplicate independent data extraction or verification of extracted data by a second person. A consensus process should be used when disagreements arise in either study selection at the full-text stage or in data extraction. Note: If two independent people do study selection and data extraction is verified, with consensus used in the event of disagreements, then indicate “yes.”	Remove the “not applicable” response.
3. Was a comprehensive literature search performed?	At least two electronic sources should be searched. The report must include years and databases used (e.g., Central, EMBASE, and MEDLINE). Key words and/or MESH terms must be stated and where feasible the search strategy should be provided. All searches should be supplemented by consulting current contents, reviews, textbooks, specialized registers, or experts in the particular field of study, and by reviewing the references in the studies found. Note: At least two sources plus one supplementary strategy used, select “yes.”	This item should precede the current item 2.	Additional clarity is needed and inclusion and exclusion criteria related to language of publication should be explicitly addressed.	“Not applicable” is not an appropriate response.	Reorder: This item should precede current item 2.	Reword: At least two bibliographic databases should be searched. The report must include years and databases examined (e.g., Central, EMBASE, and MEDLINE). Key words and/or MESH terms must be reported and the search strategy should be available. All searches should be supplemented by consulting reviews, specialized registers, or experts in the particular field of study, and by reviewing the references in the studies found. Publications in all relevant languages should be sought and a justification provided when there are language restrictions. Note: If at least two bibliographic databases plus one supplementary strategy were used, select “yes.”	Remove the “not applicable” response.
4. Was the status of publication (i.e. gray literature) used as an inclusion criterion?	The authors should state that they searched for reports regardless of their publication type. The authors should state whether or not they excluded any reports (from the systematic review), based on their publication status, language, etc. Note: If review indicates that there was a search for “gray literature” or “unpublished literature,” indicate “yes.” Single database, dissertations, conference proceedings, and trial registries are all considered gray for this purpose. If searching a source that contains both gray and non-gray, must specify that they were searching for gray/unpublished literature.	As written, this item is a reporting issue and not a quality issue. The item implies that if publication status was an inclusion (or exclusion) criterion, you respond “yes.” This differs from the instructions which focus on the appropriate inclusion of gray literature.	The second sentence suggests that the review simply has to state if any reports were excluded based on publication type, which is a reporting issue and not a quality issue. Language of publication is primarily an issue of gray literature.	“Not applicable” is not an appropriate response.	Reword: Was relevant gray literature included in the review? Reorder: This item should follow current item 3.	Reword: The authors searched for and considered gray literature (e.g., trial registries, conference abstracts, dissertations, and unpublished reports) as appropriate to the research question. Note: If the review indicates that there was a search for gray literature that is appropriate to the research question, score “yes.”	Remove the “not applicable” response.
5. Was a list of studies (included and excluded) provided?	A list of included and excluded studies should be provided. Note: Acceptable if the excluded studies are referenced. If there is an electronic link to the list, but the link is dead, select “no.”	None.	Including a list of all excluded studies may not be feasible, even if online capabilities are available. It is unclear which stage the list of excluded studies refers to: the full-text stage or the title and abstract stage.	“Not applicable” is not an appropriate response.	None.	Reword: A list of included and excluded studies at the full-text stage should be available to the reader (either within the publication, in an online appendix, or from the review authors). Note: If a list of both included and excluded studies (the latter at the full-text stage) is available either directly or by inquiry, then score “yes.”	Remove the “not applicable” response.
6. Were the characteristics of the included studies provided?	In an aggregated form such as a table, data from the original studies should be provided on the participants, interventions and outcomes. The ranges of characteristics in all the studies analyzed e.g. age, race, sex, relevant socioeconomic data, disease status, duration, severity, or other diseases should be reported. Note: Acceptable if not in table format as long as they are described as above.	As written, this question focuses on reporting and not quality.	It should be emphasized that the ranges of characteristics should be tailored to the review question.	“Not applicable” is not an appropriate response.	None.	Reword: In summary form, relevant data from the individual studies should be provided on the participants, interventions, comparators and outcomes. Note: If the summary provides the information necessary for the reader to understand the key characteristics of each study, score “yes.”	Remove the “not applicable” response.
7. Was the scientific quality of the included studies assessed and documented?	A priori methods of assessment should be provided (e.g., for effectiveness studies if the author(s) chose to include only randomized, double-blind, placebo controlled studies, or allocation concealment as inclusion criteria); for other types of studies alternative items will be relevant. Note: Can include use of a quality scoring tool or checklist (e.g., Jadad scale, risk of bias, sensitivity analysis, etc.), or a description of quality items with some kind of result for each study (“low” or “high” is fine, as long as it is clear which studies scored “low” and which scored “high”; a summary score/range for all studies is not acceptable).	The meaning of “scientific quality” is unclear. At the individual study level, an assessment of the risk of bias is likely to be more useful than consideration of quality. It is also unclear if this item refers to the individual study or to the body of evidence.	The meaning of the phrase “a priori methods of assessment” is unclear. The tools used to assess risk of bias should be reliable, valid and tailored to the study design and include relevant contextual issues. Quality scoring tools are not generally recommended because they require each item to be weighted relative to other items. A sensitivity analysis is not a type of quality tool or checklist.	“Not applicable” is not an appropriate response.	Reword: Was the risk of bias assessed for each included study, taking into account important potential confounders and other sources of bias relevant to the review question?	Reword: At least two authors should assess the risk of bias using an instrument appropriate to the study design and context. A consensus process should be used to determine the final assessment. The risk of bias should be reported for each study. Quality scores should not be used; categories such as high, moderate, and low are preferred. Note: If the risk of bias of each included study was appropriately assessed and reported, score “yes.”	Remove the “not applicable” response.
8. Was the scientific quality of the included studies used appropriately in formulating conclusions?	The results of the methodological rigor and scientific quality should be considered in the analysis and the conclusions of the review, and explicitly stated in formulating recommendations. Note: Might say something such as “the results should be interpreted with caution due to poor quality of included studies.” Cannot score “yes” for question if scored “no” for question 7.	The meaning of “scientific quality” is unclear.	Systematic reviews should not contain recommendations; the difference between methodological rigor and scientific quality is unclear; and additional guidance is needed on how best to use quality assessments when formulating conclusions. The item refers only to conclusions; however the instructions refer to both analysis and conclusions. It is unclear how quality should be considered in analyses.	It is unclear how the response “not applicable” would be applied.	Reword: Was the quality of the body of evidence appropriately assessed and considered in formulating the conclusions of the review?	Reword: The review authors should have assessed the quality of the body of evidence for each important outcome across studies using GRADE or another explicit and transparent approach [37, 60, 61], and the review conclusions should reflect that assessment. Note: Score “yes” if the review authors appropriately considered the quality of the body of evidence (across studies) for each important and critical outcome in the review’s conclusions.	Remove the “not applicable” response.
9. Were the methods used to combine the findings of studies appropriate?	For the pooled results, a test should be done to ensure the studies were combinable, to assess their homogeneity (i.e., χ² test for homogeneity, I²). If heterogeneity exists a random effects model should be used and/or the clinical appropriateness of combining should be taken into consideration (i.e. is it sensible to combine?). Note: Indicate “yes” if they mention or describe heterogeneity (i.e., if they explain but cannot pool because of heterogeneity/variability between interventions).	The item addresses the method for combining studies, yet the instructions relate to issues of statistical heterogeneity and imply that a meta-analysis was performed.	It is not appropriate to examine statistical heterogeneity before clinical appropriateness: the latter should always be performed first. Tests for heterogeneity do not “ensure the studies were combinable.”	None.	Reword: Were the data appropriately synthesized in a qualitative manner and if applicable, was heterogeneity assessed? If a meta-analysis was performed, was it appropriate?	Reword: Authors should provide a qualitative synthesis and explore heterogeneity if applicable. If a meta-analysis was performed, it should have been performed in an appropriate manner. Note: Score “yes” if the qualitative synthesis is appropriate, if heterogeneity was explored, and if a meta-analysis was performed, it was appropriate.	None.
10. Was the likelihood of publication bias assessed?	An assessment of publication bias should include a combination of graphical aids (e.g., funnel plot, other available tests) and/or statistical tests (e.g., Egger regression test). Note: If no test values or funnel plots included, score “no.” Score “yes” if mentions that publication bias could not be assessed because there were fewer than 10 included studies.	None.	These tests examine the issue of small study bias, not publication bias per se. Often more important than graphical and statistical tests in exploring publication bias is information that can be retrieved from study registries, and from regulatory and other agencies (e.g., gray literature).	“Not applicable” may be an appropriate response if the assessment of publication bias is inappropriate (e.g., fewer than 5-10 studies) or was assessed as part of the tool used to evaluate the body of the evidence (item 8).	None.	Reword: The potential for publication bias should have been considered in the review, using other information as relevant, and graphical aids and statistical tests as appropriate. The limitations of the statistical and graphical tests should be noted in the review. Note: A “yes” response can be used if the review authors explored the data and other relevant information sources for evidence of small study or publication bias. A “not applicable” response should be used if publication bias was considered as part of quality assessment of the body of evidence in item 8.	None.
11. Was the conflict of interest stated?	Potential sources of support should be clearly acknowledged in both the systematic review and the included studies. Note: To get a “yes,” must indicate source of funding or support for the systematic review AND for each of the included studies.	The phrase “conflict of interest” is unclear. This likely refers to whether there is a disclosure of conflicts, but it is unclear whether this refers to individual authors of the review and/or included studies or to the funder of the review and/or included studies.	The instructions are not congruent with the item. “Sources of support” could refer to funding for the review, financial support for the review authors, or funding of the included studies. Conflict of interest includes other interests that may interfere with the authors’ objectivity, such as personal financial interests.	“Not applicable” is not an appropriate response.	Reword: Were conflicts of interest disclosed for all of the review authors and was the funding source of the review and of each study within the review reported?	Reword: Disclosures of relevant interests should be provided for all review authors and the source of funding for the review and for each study included in the review should be reported. Note: “Yes” is indicated if disclosures of interest are provided for all review authors, the funding for the review is provided and is not likely to be a source of bias to the review’s conclusions, and the funding for all included studies is indicated (or if not reported in the individual studies then this is indicated).	Remove the “not applicable” response.
12. Proposed new Item	Not applicable	Not applicable	Not applicable	Not applicable	Were relevant subgroups considered in the review process, analysis, and conclusions?	Relevant population subgroups and characteristics should be considered in the scope and in the key questions for the review, and in searching, data extraction and analysis and in the review’s conclusions. Note: “Yes” is indicated if the main relevant subpopulations and characteristics were considered throughout the review process.	Yes, no, cannot answer

^aItems, instructions, and notes listed on AMSTAR’s website (http://amstar.ca/Amstar_Checklist.php) as of June 10, 2015
