Describing corrections, retractions and fact checking interventions

As this article makes clear, fact checking involves more than just publishing articles that review false or misleading claims. Fact checkers also follow up directly with the authors of those claims to request corrections, and use other levers, like putting pressure on regulators to intervene where they have powers to do so.

Structured data describing the fact checking process should ideally reflect the full range of this activity. So, in addition to publishing data about the results of fact checking (ClaimReview), how can we:

  • describe when corrections and retractions are made to published material?
  • be transparent about when fact checkers themselves need to correct their work?
  • describe the broader interventions that fact checkers take, and their outcomes?

This section of the notes explores ways to answer these questions.

Publication and modification dates

The majority of the resources that Full Fact describe in their data are types of CreativeWork. This includes the web page containing a review, an individual ClaimReview, a distinct Claim, and the content referenced via the appearance property.

At a basic level, it can be useful to just know when these works have been revised.

Consistent use of the datePublished and dateModified properties will help capture this information and allow consumers to accurately identify and query for changes.

These properties are currently provided for ClaimReview in the Full Fact Profile.

I would suggest at least adding them to sightings listed in the appearance property, when Full Fact identifies a relevant change (e.g. a correction).
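
As a minimal sketch, assuming illustrative URLs and dates, consistent use of these two properties on a review and its sightings might look like this:

{
  "@type": "ClaimReview",
  "datePublished": "2019-05-23",
  "dateModified": "2019-06-01",
  "itemReviewed": {
    "@type": "Claim",
    "appearance": [{
      "@type": "CreativeWork",
      "url": "https://example.org/news/plastic-straws",
      "datePublished": "2019-05-22",
      "dateModified": "2019-05-31"
    }]
  }
}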

Besides recording timestamps, how can we describe what has changed?

Corrections to sightings

Corrections to published articles are often handled by publishing a correction notice. This might appear in the article itself, for example:

This article was amended on 31 May 2019 because an earlier version said that nearly 5bn plastic straws were being used each year in the UK. That is the figure for England. This has been corrected.

Or, the notices might be published separately.

Schema.org has proposed a means of describing these correction notices, as a CorrectionComment.

"appearance": [{
  "@type": "CreativeWork",
  "url": "https://www.theguardian.com/environment/2019/may/22/england-plastic-straws-ban",
  "datePublished": "2019-05-22",
  "dateModified": "2019-05-31",
  "author": {
    "@type": "Organization",
    "sameAs": "https://www.wikidata.org/wiki/Q192621",
    "name": "The Guardian"
  },
  "correction": {
    "@type": "CorrectionComment",
    "text": "This article was amended on 31 May 2019 because an earlier version said...",
    "datePublished": "2019-05-31",
    "correctedWork": "https://www.theguardian.com/environment/2019/may/22/england-plastic-straws-ban"
  }
}],

Full Fact might not want to incorporate the full text of a correction. So alternatively a CorrectionComment could be published with a description that briefly summarises the change. E.g.:

"correction": {
  "@type": "CorrectionComment",
  "@id": "...",
  "description": "The Guardian published a correction to one of the claims in this article",
  "datePublished": "2019-05-31",
  "correctedWork": "https://www.theguardian.com/environment/2019/may/22/england-plastic-straws-ban"
}

There’s an editorial decision to be made here about how much detail to capture. Would it be useful to record the full notice? Or would Full Fact instead prefer to just record when a correction has been made, with a brief summary? There’s potentially extra value in having more detail, but capturing it will cost additional time.

If a correction notice has been published separately from the article, then we can just add a url to the comment:

"correction": {
  "@type": "CorrectionComment",
  "@id": "...",
  "url": "https://www.dailymail.co.uk/home/article-9547717/Corrections-clarifications.html"
  "description": "The Daily Mail corrected a claim about Putin's yacht",
  "datePublished": "2019-05-05"
}

As an aside, it appears that the Mail don’t link to corrected articles, whereas the Guardian does. In the Daily Mail example there are in fact several articles being corrected. The data model needs to give some flexibility here, so that we can provide multiple values in correctedWork.
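
For example, assuming correctedWork is defined to allow multiple values, the Daily Mail notice might be described along these lines (the correctedWork values are placeholders for the individual corrected articles):

"correction": {
  "@type": "CorrectionComment",
  "@id": "...",
  "url": "https://www.dailymail.co.uk/home/article-9547717/Corrections-clarifications.html",
  "datePublished": "2019-05-05",
  "correctedWork": [
    "...url of the first corrected article...",
    "...url of the second corrected article..."
  ]
}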

As we’ll see later, one of the benefits of describing a CorrectionComment as a separate resource, ideally with its own @id, is so that we can reference it from other parts of the data.

Formal Retractions

There are other ways in which articles might be modified post-publication. Scholarly research recognises retractions alongside the corrections and addenda already discussed.

While in some cases articles might simply be removed, the most common way that retractions are handled is by marking the article and then separately publishing a retraction notice.

These retractions could also be added to our model. While using CorrectionComment might fit, with a url referring to the location of the published notice, it would be better if Schema.org also defined a RetractionComment type to distinguish these cases. (And perhaps a retraction property, although I’d argue comment still fits).

This addition is something that Full Fact could propose alongside CorrectionComment.
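
To illustrate, assuming a RetractionComment type were adopted, a retracted sighting might be marked up along these lines (the URLs, dates and description are placeholders):

"appearance": [{
  "@type": "CreativeWork",
  "url": "https://example.org/retracted-article",
  "datePublished": "2020-01-10",
  "correction": {
    "@type": "RetractionComment",
    "url": "https://example.org/retraction-notice",
    "description": "The publisher retracted this article",
    "datePublished": "2020-03-02"
  }
}]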

Deletions of articles or postings

Less formally, admissions of error are frequently handled by simply deleting an incorrect tweet or posting.

Schema.org doesn’t currently provide a means of describing the status of a work. So some new terms would be required if Full Fact consider it useful to track deleted appearances.

Alternatively, these could be recorded as other examples of a RetractionComment. For example:

"appearance": [{
  "@type": "CreativeWork"
  "datePublished": "2021-04-20",
  "author": {
    "@type": "Person",
    "sameAs": "https://twitter.com/jennyrickson"
    "description": "Twitter user"
  },
  "correction": {
    "@type": "RetractionComment",
    "datePublished": "2021-04-30",
    "description": "The user later deleted their tweet"
    "author": {
      "@type": "Organization",
      "name": "Full Fact"
    }
  }
}]

In this example the author of the comment is marked as Full Fact, making it clear that the comment comes from a different source than the author of the tweet.

In scholarly publishing a retraction notice will often be published by the editorial staff rather than the author of the original work. So this example aligns with existing practice, with the difference here being that the content has also been deleted.

Other types of outcomes

Full Fact are not looking to track all corrections or retractions across the sightings recorded in their data. Their primary goal is to track the outcomes of the interventions they have made, e.g. a request to an author to correct their article.

What we’ve looked at so far are the positive, verifiable outcomes that might result from an intervention: circumstances where a correction or retraction notice can be directly linked to an appearance.

The team currently have an editorial process for describing interventions and the technical team are exploring ways to surface that information in a structured way.

Reviewing the range of outcomes catalogued so far, it’s easy to identify a number that are less “positive” and/or harder to evidence. For example:

  • no action is taken
  • the suggestion to correct or retract a claim was refused or disputed
  • the issues with the claim were discussed with the source, and while this did not resolve in a correction, it may influence future behaviour
  • the action taken was not directly related to the appearance of a Claim. For example, a methodology was published to provide more insight into the figures used to make a claim, or a regulator wrote a letter to the author of the claim
  • some action was taken, but it is hard to clearly evidence. For example a headline or some content was revised without acknowledging a change
  • …etc

Given the range of activities involved and the wider group of actors engaged in them, it would be impractical to model all of these outcomes in a consistent, unambiguous way.

My suggestion would be to start with a simple approach that combines a general description of an outcome with a reference to public evidence, where available.

Describing fact checking interventions

A basic model for describing a fact checking intervention would need to cover the following:

  • who performed the intervention (e.g. Full Fact)
  • the motivation for making the intervention
  • who was the intervention directed at, e.g. which person or organization was requested to perform some action?
  • the status of the intervention, from the perspective of the fact checker
  • a description of what was requested
  • a description of the outcome, once the request is complete
  • evidence of a successful outcome, e.g. a pointer to some result, where available
  • timestamps to record activity

This aligns with the current data that Full Fact have internally and is sufficient to record the progress of an intervention from initiation through to completion.

Sketched out as JSON-LD, this might look something like:

{
  "@type": "FactCheckingRequest",
  "agent": {
    "@type": "Organization",
    "name": "Full Fact"
  },
  "sourceReview": {
    ...the ClaimReview motivating the request...
  },
  "recipient": {
    ...the person or organization contacted...
  },
  "description": "...a general description of what has been requested...",
  "datePerformed": "date",
  "dateUpdated": "date",
  "requestStatus": {
    "@type": "DefinedTerm",
    "name": "..."
  },
  "result": [{
    "@type": "Outcome",
    "description": "a description of the outcome",
    "correctionComment": {
        "@type": "CorrectionComment",
        "@id": "..."
    }
  }]
}

Key things to note about this sketch:

  • the name FactCheckingRequest needs some review. Perhaps FactCheckingIntervention instead? Would FactCheckingRequest imply a request to check a claim, rather than a follow-up?
  • a request is made as a result of completing a fact checking exercise. The FactCheckingRequest has a sourceReview property which links to the ClaimReview that motivated it. Associating the request with a review helps to put it in context
  • there is a single recipient per request. If several people are contacted, then these are described with separate requests, each with their own timestamps, status, etc. So a request tracks the interaction with that recipient and the outcome of that interaction.
  • a recipient will be either a Person or Organization with sameAs links where possible
  • a result may need to have multiple values, so it is given as an array. This would allow capturing details like an organisation correcting an article and also issuing a press notice to explain their error. But this needs more input on whether multiple outcomes are common.
  • the requestStatus would ideally be a DefinedTerm rather than a free-text field. The Full Fact data shows some obvious values for this simple taxonomy, e.g. “Awaiting Response”, “Closed - unresolved” (see the sketch after this list)
  • the correctionComment property links an outcome to a specific change in an article, via the @id of a CorrectionComment, allowing the intervention to be traced through to some change in a sighting
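
For illustration, a requestStatus drawing on a small, Full Fact-maintained term set might be expressed like this; the DefinedTermSet and its name are hypothetical:

"requestStatus": {
  "@type": "DefinedTerm",
  "name": "Awaiting Response",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "@id": "...",
    "name": "Intervention statuses"
  }
}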

A useful inverse of sourceReview would be to allow a ClaimReview to have an optional factCheckingRequest property:

{
  "@context": "http://schema.org",
  "@type": "ClaimReview",
  "url": "...",
  "claimReviewed": "...",
  "reviewRating": {
    "@type": "Rating",
    "alternateName": "..."
  },
  "factCheckingRequest": {
    "@type": "FactCheckingRequest"
    ...
  }
}

Taken together, this model allows the full fact checking process to be articulated in the data, as the combined sketch after this list illustrates:

  • the fact checker writes up a claim, resulting in a ClaimReview being published
  • as a result of that process, the fact checker makes FactCheckingRequests to one or more organisations; these are surfaced in the data
  • the fact checker updates that request, as required, to record updates of its status and outcomes
  • the recipient of the request might correct their articles
  • the fact checker can update their data to record the correction and close down the request
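
As flagged above, here is a hypothetical worked example, loosely based on the Guardian correction cited earlier, of how a completed request might look once its result is linked (via @id) to the CorrectionComment recorded against the sighting. The status value, description and dates are illustrative:

{
  "@type": "FactCheckingRequest",
  "agent": {
    "@type": "Organization",
    "name": "Full Fact"
  },
  "recipient": {
    "@type": "Organization",
    "name": "The Guardian",
    "sameAs": "https://www.wikidata.org/wiki/Q192621"
  },
  "description": "Requested a correction to the figure given for plastic straw use",
  "datePerformed": "2019-05-23",
  "dateUpdated": "2019-05-31",
  "requestStatus": {
    "@type": "DefinedTerm",
    "name": "Closed - resolved"
  },
  "result": [{
    "@type": "Outcome",
    "description": "The Guardian published a correction notice",
    "correctionComment": {
      "@id": "..."
    }
  }]
}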

The impacts on the Full Fact editorial process are likely to be:

  • tracking multiple requests, not just a single intervention per review
  • the need to more clearly define who has been contacted
  • more structure around recording status of an intervention
  • capturing and recording data about corrections and retractions

Some variations on this proposal

There are ways to refine this proposal a little more, but these may require further changes to the editorial process to capture more structured categories for requests and outcomes.

Categorising requests

There may be some common categories of FactCheckingRequest, e.g. “Correction Requested” or “Contacted Regulator”, which it might be helpful to expose as something more structured than free text in a description.

These can be added to the current proposal, as follows:

{
  "@type": "FactCheckingIntervention"
  ...
  "requestCategory": {
    "@type": "DefinedTerm",
    "name": "Correction Request"
  }
}

If there are likely to be multiple categories, then the requestCategory might need to be an array.

By using a category, whose values are extensible, we avoid the need to fully define and agree a set of request types. Fact checkers can converge on this separately, or adopt a category provided by Full Fact.

Categorising outcomes

Taking a similar approach, the outcome of a request might also be categorised:

{
  "@type": "Outcome",
  "description": "a description of the outcome",
  "outcomeCategory": {
    "@type": "DefinedTerm",
    "name": "Correction Published"
  },
  "correctionComment": {
    ...a link to a CorrectionComment or RetractionComment attached to a sighting...
  }
}

The Full Fact data has less obvious structure in this area. But there are definitely patterns that would allow these categories to be collected.

Alignment with Schema.org

How does the proposed approach align with Schema.org?

Schema.org has vocabulary for describing an Action. There is a set of different types of action including a CommunicateAction which is similar to the FactCheckingRequest described here.

The model proposed here could build on that framework. The above examples already reuse some relevant property names.
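
As a rough sketch of that alignment, rather than a firm recommendation, an intervention could be typed directly as a CommunicateAction, reusing the existing agent, recipient, actionStatus and result properties:

{
  "@type": "CommunicateAction",
  "agent": {
    "@type": "Organization",
    "name": "Full Fact"
  },
  "recipient": {
    "@type": "Organization",
    "name": "..."
  },
  "description": "...a general description of what has been requested...",
  "startTime": "date",
  "endTime": "date",
  "actionStatus": "https://schema.org/CompletedActionStatus",
  "result": {
    "@type": "CorrectionComment",
    "@id": "..."
  }
}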

But there are some differences in design:

  • An Action has an ActionStatusType, but this has a limited set of values which are very different to the range of statuses that Full Fact use for tracking their interventions
  • An Action has a startTime and endTime. Is a request active for a period of time, or is it performed and then its status updated? My sketch takes the latter approach, but this could be revised to align
  • There are terms defined for Action which are just not relevant, e.g. target, instrument, location, object, error

My initial recommendation would be to explore the above proposal (or something similar) to see if it fits Full Fact’s requirements for its API, and only then take this as a proposal to Schema.org. That might require some later realignment, but any pending changes could be communicated to API users.

This is clearly a custom area of the data model, so some experience of applying it in practice (and having real-world data to back it up) might help drive the conversation forward with Schema.org and other fact checkers.

Corrections to fact checking articles

Finally, how can Full Fact be transparent about corrections to their own work? Consider, for example, a correction notice that Full Fact has published on one of its own fact checks.

I’d recommend adding these corrections to your data as well, using the same CorrectionComment model as for sightings.

But what is being corrected in this instance? In the example, the claim review itself wasn’t changed. Similarly, here is a comment that relates to an article, but not to the verdict of the review. But there may be other cases where a correction has some bearing on the final verdict.

This exposes a limitation in the current ClaimReview model: it doesn’t clearly distinguish between a ClaimReview and the web page or article that contains it.

In some, and possibly many cases, the review and the article are one and the same.

But as we’ve seen in our previous discussion, a single page on the Full Fact website might actually contain several reviews, each covering different claims from different authors.

Different fact checkers will have adopted different approaches here. And a page may only ever contain a single ClaimReview of a single Claim.

At the moment we can’t clearly say “We corrected a typo on this page” separately from “We corrected this ClaimReview”. There is no FactCheckingArticle (similar to a NewsArticle) that contains the review.

It might be helpful to have a broader community discussion around whether this distinction would be useful. For example, it might help to align the ClaimReview model with the news markup aspects of Schema.org: ReviewNewsArticle has a very similar design to ClaimReview.

There is clearly overlap between the output of news and fact checking organisations: news articles might also contain reviews of claims alongside other opinion and commentary. Being able to capture this information, whilst distinguishing between news articles and content which is primarily fact checking, might be helpful.

The data model for news articles in Schema.org also includes the publishingPrinciples property, for linking to policies around corrections and fact checking. Having these policies is part of the eligibility requirements that Google has placed on inclusion of fact checks in its search results, so exposing them as structured data might be beneficial.
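
As a minimal sketch, the property could be attached to a ClaimReview (it is also available on Organization); the value here is simply a placeholder for a link to a published corrections policy:

{
  "@type": "ClaimReview",
  "url": "...",
  "publishingPrinciples": "...link to a published corrections policy..."
}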

For now, I’d recommend automatically adding a CorrectionComment to any ClaimReview when an article is updated, assuming it is easy to do so. This would make the data available in the short term; further revisions can come later.
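
A minimal sketch of what that automatic addition might produce, with a purely illustrative description and placeholder dates:

{
  "@type": "ClaimReview",
  "url": "...",
  "datePublished": "...",
  "dateModified": "...",
  "correction": {
    "@type": "CorrectionComment",
    "datePublished": "...",
    "description": "We corrected a typo in this article; the verdict of the review is unchanged"
  }
}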

Recommendations

  • Use datePublished and dateModified to provide basic publication and modification dates for articles, ClaimReview and sightings
  • Propose that correction, CorrectionComment and correctedWork are formally included in Schema.org, giving a clear intention to use it, citing the original issue. Further, I’d suggest proposing that correctedWork is an array.
  • Propose the inclusion of a RetractionComment to cover retraction notices
  • Test out the proposed outline for FactCheckingRequest using existing data and the API, before taking this as a formal proposal to Schema.org. Make it clear to API users if this area might change in future. (Or alternatively, just accept this is a more custom addition?)
  • Consider also publishing structured data about corrections to Full Fact’s own articles