Citing Evidence

Fact checking articles link to a number of types of resources, including:

  • appearances of a claim in different contexts and by different people
  • pointers to related discussion and coverage of a claim, for example previous fact checks by the same or other organisations
  • links to supporting evidence, data and legislation that is relevant to the analysis of a claim and the conclusion reached by the fact checker

In previous notes we’ve explored how to provide structured data about the first of these types of links, by associating appearance lists with a Claim. This helps third parties associate a specific fact check with those appearances.

But what about providing structured data about other types of links?

Why provide structured data about citations?

What are the reasons for providing structured data for other types of links?

There’s no clear argument for providing structured data about every link in a fact checking article. If a consumer of the data needed a full list, it’s simple enough to extract the links from the HTML page.

But there are reasons for providing more information about references to evidence, data and legislation, as more formal citations.

Scholarly research, reports, white papers and similar outputs all recognise the important role of clear citations, which includes:

  • acknowledging and attributing the work of others
  • helping to build trust and confidence in the statements or conclusions made in a document by directly linking to evidence or data that supports it
  • providing a means for readers to find related data and insights, so that they may conduct further reading, research and analysis
  • supporting understanding of the impact of research by facilitating analysis of the networks of citations that link together published works

At least the first three are relevant within the fact checking space.

The final benefit might also be helpful in building stronger engagement between fact checks and publishers of official statistics and research by making it easier to identify how and where that data is being used.

In their eligibility criteria for Claim Review, Google explain that:

Your fact check analysis must be traceable and transparent about sources and methods, with citations and references to primary sources.

So providing structured data about those citations would seem like a useful addition to current practice.

How can we provide citations for fact checks?

Schema.org defines the citation property of a CreativeWork as:

A citation or reference to another creative work, such as another publication, web page, scholarly article, etc.

Because a ClaimReview is a type of CreativeWork, we can use this property to provide lists of citations:

{
	"@context": "http://schema.org",
	"@type": "ClaimReview",
	"identifier": "bf7ebf4d-820e-4777-81ee-38553d6e3b28",
	"datePublished": "2021-05-12",
	"description": "There is no evidence that...",
	"url": "https://fullfact.org/online/ethylene-oxide-covid-test-ivermectin/",
	"claimReviewed": "Ivermectin can cure Covid-19 in 48 hours.",
	"reviewRating": {
		"@type": "Rating",
		"alternateName": "A study looking at Ivermectin\u2019s impact on the SARS-CoV-2 virus...",
		...
	},
	"reviewBody": "A study looking at Ivermectin\u2019s impact...",
	"itemReviewed": {
		"@type": "Claim",
		"appearance": [{
			"@type": "CreativeWork",
			"url": "https://www.facebook.com/holly.myhal/videos/2956582674589404/",
			"datePublished": "2021-05-02",
			...
		}]
	},
	"citation": [
		{
			"@type": "CreativeWork",
			"name": "Comparing the COVID-19 Vaccines: How Are They Different?",
			"datePublished": "2021-05-13",
			"url": "https://www.yalemedicine.org/news/covid-19-vaccine-comparison",
			...
		},
		{
			"@type": "CreativeWork",
			"url": "https://doi.org/10.1016/j.antiviral.2020.104787",
			"name": "The FDA-approved drug ivermectin inhibits the replication of SARS-CoV-2 in vitro",
			"datePublished": "2020-06",
			...
		}
	]
}

The above example lists the two key external references associated with checking one of the specific claims.

Consistent with how a CreativeWork listed as an appearance would be described, the example also provides the title, URL and date of publication of each work. Other metadata, such as the author and publisher, might also be useful to include.
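As a rough illustration of how a publishing pipeline might assemble this data, here is a minimal Python sketch. The make_citation helper is hypothetical (it is not part of Schema.org or any Full Fact tooling), and the ClaimReview is trimmed to a few properties from the example above.

```python
import json
from typing import Optional


def make_citation(name: str, url: str, date_published: Optional[str] = None,
                  work_type: str = "CreativeWork") -> dict:
    """Build one entry for the `citation` list (hypothetical helper)."""
    citation = {"@type": work_type, "name": name, "url": url}
    if date_published:
        # Only include the date when it is known, rather than emitting null.
        citation["datePublished"] = date_published
    return citation


# A trimmed ClaimReview using values from the example above.
claim_review = {
    "@context": "http://schema.org",
    "@type": "ClaimReview",
    "claimReviewed": "Ivermectin can cure Covid-19 in 48 hours.",
    "citation": [
        make_citation(
            "The FDA-approved drug ivermectin inhibits the replication of SARS-CoV-2 in vitro",
            "https://doi.org/10.1016/j.antiviral.2020.104787",
            "2020-06",
        ),
    ],
}

# Serialise to JSON-LD text, ready to embed in a page or return from an API.
claim_review_json = json.dumps(claim_review, indent=2)
```

Passing work_type="Dataset" to the same helper would produce an entry typed as a Dataset instead of a generic CreativeWork.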

How can we cite datasets?

Because a Dataset is also a CreativeWork, we can use citation to link directly to a Dataset:

"citation": [
  {
    "@type": "Dataset",
    "name": "Deaths registered weekly in England and Wales, provisional",
    "url": "https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/weeklyprovisionalfiguresondeathsregisteredinenglandandwales",
    "datePublished": "2021-05-25",
    ...
  }
]

Given the work that the ODI, Full Fact and other fact checkers are doing to engage with data users, explicitly linking to datasets may help evidence the benefits of closer collaboration.

How can we provide more context to a citation?

While it would be straightforward to include all referenced material as citations, some editorial review would make the data more useful.

If the list of citations included only the most important or significant resources and data used to develop a conclusion, then this would add more value for the data consumer. This might be done by asking the author to confirm the list of resources prior to publication.

In the context of fact checking it’s important to distinguish between something referenced as an appearance and something referenced as a citation, even though both may refer to datasets, articles and research papers. They’re different types of link.

For example this review on some ONS data highlighted that the data was misleading. It was the source of some incorrect claims. The dataset should be identified as an appearance that was later corrected, rather than as a citation.

Similarly, this article reviews claims made in a research paper that was later retracted. It’s another example of an appearance rather than a citation.

In this fact check about Ivermectin a research paper is clearly being cited, although not as a direct confirmation or rebuttal of the claim. The fact check instead references the paper in order to discuss its limitations and how it has been misinterpreted. The paper was not a direct source of an incorrect claim.

Looking through the range of material that Full Fact have published, “Study X doesn’t actually show Y” is a recurring theme. Citations aren’t always endorsements.

There have been many debates about how to capture the nuance around citation practices in scholarly research. Various approaches and ontologies have been proposed but none yet seem to have been widely adopted.

One of the more recent is the CiTO ontology which proposes a variety of properties that clarify the reason for citing another work.

For example, it distinguishes between citing a work as a data source, as a source of discussion, or as support for an argument. Some of these citation forms clearly overlap with the meaning of appearance.

The ontology has been designed to align with Schema.org. It could be used as a source of additional custom properties (or simply inspiration) if Full Fact wanted to add more nuance in its data around how works are being cited.

That would clearly create additional work for authors, so it would be worth testing with potential data users to see whether the distinctions are helpful.
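To make the idea concrete, the sketch below shows one way a CiTO property could sit alongside the Schema.org data in JSON-LD. It is an assumption about how the two vocabularies might be combined, not an established Full Fact or Schema.org convention. The add_typed_citation helper is hypothetical; the namespace http://purl.org/spar/cito/ and the property name discusses come from CiTO.

```python
import json

# CiTO namespace, mapped to the "cito" prefix in the JSON-LD context.
CITO = "http://purl.org/spar/cito/"


def add_typed_citation(review: dict, cito_property: str, url: str) -> dict:
    """Return a copy of `review` with a CiTO-typed link added (hypothetical helper)."""
    enriched = dict(review)
    # Extend the context so that "cito:..." keys resolve to CiTO property IRIs.
    enriched["@context"] = ["http://schema.org", {"cito": CITO}]
    key = f"cito:{cito_property}"
    enriched[key] = list(review.get(key, [])) + [{"@id": url}]
    return enriched


review = {
    "@context": "http://schema.org",
    "@type": "ClaimReview",
    "claimReviewed": "Ivermectin can cure Covid-19 in 48 hours.",
}

# The Ivermectin fact check discusses the paper rather than endorsing it.
enriched = add_typed_citation(
    review, "discusses", "https://doi.org/10.1016/j.antiviral.2020.104787"
)
enriched_json = json.dumps(enriched, indent=2)
```

Other CiTO properties, such as citesAsDataSource or citesAsEvidence, could be passed in the same way. Whether that extra detail is worth the authoring effort is the question to test with data users.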

Use of DOIs and monitoring retractions

Finally, when citing academic research I would recommend adopting a policy of linking to works via their DOI (where available).

So, instead of linking to:

https://www.sciencedirect.com/science/article/pii/S0166354220302011

The Ivermectin fact check would link to:

https://doi.org/10.1016/j.antiviral.2020.104787

The reader will end up being redirected to the same location, but there are several benefits of using the HTTP version of a DOI as the primary link:

  • They provide a stable link to academic research, so will be a more permanent way to link to research that won’t be impacted by, for example, publishers revising their websites
  • There are scholarly research tools and infrastructure (e.g. the CrossRef Event Data service) that mine web content looking for references to DOIs. This will help to surface discussion and citation of research contained in fact checks to researchers and publishers
  • It could help Full Fact and other fact checkers monitor for corrections and retractions to research that may be useful to reflect in fact checks
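A policy like this is easier to apply if references are normalised in one place. The following is a minimal Python sketch, assuming references arrive as free text such as "doi:10.1016/..." or an older dx.doi.org link. The doi_to_url function and the regular expression are illustrative rather than a complete DOI parser. Note that a publisher URL like the ScienceDirect one above does not contain the DOI, so it cannot be converted by string handling alone.

```python
import re
from typing import Optional

# A DOI starts with "10.", a registrant code of 4-9 digits, then "/" and a suffix.
# This simplified pattern covers common modern DOIs, not every legacy form.
DOI_PATTERN = re.compile(r"(10\.\d{4,9}/\S+)")


def doi_to_url(reference: str) -> Optional[str]:
    """Return the https://doi.org/ form of a DOI found in `reference`, or None."""
    match = DOI_PATTERN.search(reference.strip())
    if match is None:
        return None
    # Drop trailing punctuation that often follows a DOI in running text.
    return "https://doi.org/" + match.group(1).rstrip(".,;")


canonical = doi_to_url("doi:10.1016/j.antiviral.2020.104787")
legacy = doi_to_url("http://dx.doi.org/10.1016/j.antiviral.2020.104787")
publisher = doi_to_url("https://www.sciencedirect.com/science/article/pii/S0166354220302011")
```

Here canonical and legacy both resolve to the same https://doi.org/ link, while publisher is None and would need a manual or API lookup.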

Here is a simple example of the last point.

In this fact check of a paper called “Facemasks in the COVID-19 era: A health hypothesis”, Full Fact highlighted a number of issues with the work.

The fact check notes that:

Full Fact approached the journal’s editor for more detail on how this particular paper was reviewed who told us: “We are aware [of] all the issues related to the publication in question. Actions are in progress we will inform you as soon as final actions are complete.”

In fact the paper was later retracted and this correction is now clearly displayed on the website, if a reader follows the link.

The status of the article is also available via the CrossRef API in a machine-readable form:

"update-to": [
  {
    "updated": {
      "date-parts": [
        [
          2021,
          1,
          1
        ]
      ],
      "date-time": "2021-01-01T00:00:00Z",
      "timestamp": 1609459200000
    },
    "DOI": "10.1016/j.mehy.2020.110411",
    "type": "retraction",
    "label": "Retraction"
  }
],

Using DOIs to link to research and then monitoring the (open, free to use) CrossRef API for retractions or revisions would allow some automation of updates to relevant fact checks.
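A monitoring job along these lines could be sketched in Python as follows. The find_retractions function works only on the update-to structure shown above. The fetch_updates function is an assumption about how the lookup might be done: it relies on the public CrossRef REST API at api.crossref.org and its updates filter, needs network access, and has no error handling or rate limiting. Both function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def find_retractions(record: dict) -> list:
    """Return (doi, date) pairs for retraction entries in a record's `update-to` list."""
    results = []
    for update in record.get("update-to", []):
        if update.get("type") == "retraction":
            date = update.get("updated", {}).get("date-time")
            results.append((update.get("DOI"), date))
    return results


def fetch_updates(doi: str) -> list:
    """Fetch records that CrossRef lists as editorial updates to `doi`.

    Assumes the `updates` filter of the CrossRef REST API; requires network access.
    """
    query = urllib.parse.quote(doi, safe="/")
    url = "https://api.crossref.org/works?filter=updates:" + query
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["message"]["items"]


# The record fragment shown above, used here without any network call.
sample_record = {
    "update-to": [
        {
            "updated": {
                "date-parts": [[2021, 1, 1]],
                "date-time": "2021-01-01T00:00:00Z",
                "timestamp": 1609459200000,
            },
            "DOI": "10.1016/j.mehy.2020.110411",
            "type": "retraction",
            "label": "Retraction",
        }
    ]
}

retractions = find_retractions(sample_record)
```

In a real job, each cited DOI would be passed to fetch_updates and every returned record to find_retractions, with any hits flagged for editorial review rather than applied automatically.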

This could be useful not just in refuting claims but also in circumstances when a fact check might need to be reviewed or revised. This would be a direct benefit from investing in improvements to the data.

Recommendations

  • Use citation to list the key papers, legislation and datasets referenced when creating a ClaimReview
  • Include details about title, author, publication dates in cited works
  • Decide on the editorial policy for when a work should be cited and expose that in any API documentation
  • Use DOIs to link to academic research and datasets to help surface their use in fact checking content
  • Consider using DOIs to monitor for retractions and corrections to work that is referenced as an appearance or citation
  • If more nuance is required around forms of citation, then consider using the CiTO ontology to enrich the data further