Best Practices

Accurate and efficient “each stage: first time right”
The Data Quality Challenge invites you to lead the change
Real-world successes in upstream metadata
Complete and correct metadata for author affiliations and research funding are essential. The most effective approach? Fix it upstream.
Integrate metadata creation - collection, quality control and updating - in your systems and workflows before publication, and relay it throughout the editorial, production, and publication processes.
Below, we've gathered real-life examples of publishers putting this into practice. Use these best practices as inspiration for strengthening your own approach. Get inspired!
EMS Press best practice
Structured from the start: capturing metadata via ‘manuscript extraction’ as early as submission, and building on globally valid identifiers wherever possible (ROR IDs, DOIs, ORCIDs)
-
EMS Press authors generally submit in TeX format, which enables parsing the source of a submission and extracting all of the information needed.
-
The EMS Press publication process builds on globally valid identifiers wherever possible: ROR IDs, DOIs, zbMATH identifiers, MRIDs, and ORCIDs. The team continuously develops tooling and incorporates data from other metadata providers, such as the ROR registry, as reliable sources of truth.
-
After peer review and at handover to the typesetter, custom LaTeX style files are provided to semantically annotate each publication. Organization information is annotated with ROR IDs at the source level.
-
Source files are parsed at the time of publication using both open-source and custom-built tools, and the extracted information is cross-referenced with the production database to provide feedback to the team responsible for publishing and quality control. The continuously enriched in-house data lake informs and feeds the tools and processes for the next submission-to-publication cycle.
-
By considering different output formats at the moment of submission, EMS Press can treat JATS, PDF, and other output formats as equal transformations of the source file. Global identifiers keep publications and metadata identifiable in different systems, wherever the data ends up being used.
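The source-level extraction described above can be sketched roughly as follows. The EMS Press style files are not shown in this text, so the `\author[orcid=...]{...}` and `\affiliation[ror=...]{...}` macros, the function names, and the placeholder ROR value are all assumptions made for illustration only.

```python
import re

# Hypothetical annotation syntax (an assumption, not EMS Press's style file):
#   \author[orcid=<id>]{<name>}
#   \affiliation[ror=<id>]{<organization name>}
ANNOTATED = re.compile(
    r"\\(?P<kind>author|affiliation)"
    r"(?:\[(?P<key>orcid|ror)=(?P<pid>[^\]]+)\])?"
    r"\{(?P<text>[^{}]*)\}"
)


def extract_metadata(tex_source):
    """Collect authors and affiliations from the source, with any identifiers."""
    records = {"author": [], "affiliation": []}
    for match in ANNOTATED.finditer(tex_source):
        records[match.group("kind")].append(
            {"text": match.group("text").strip(), "pid": match.group("pid")}
        )
    return records


def missing_pids(records):
    """List entries without an identifier, as feedback for quality control."""
    return [
        entry["text"]
        for group in records.values()
        for entry in group
        if entry["pid"] is None
    ]


# The ROR value below is a placeholder, not a real registry record.
sample = r"""
\author[orcid=0000-0002-1825-0097]{Josiah Carberry}
\affiliation[ror=https://ror.org/placeholder]{Example University}
\affiliation{Institute Without Identifier}
"""

records = extract_metadata(sample)
print(missing_pids(records))  # ['Institute Without Identifier']
```

Because the identifiers live in the source file, the same parsed records could feed every output format, which is the point of treating JATS and PDF as equal transformations.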


eLife best practice
Capturing affiliations at submission with ‘author select’, ensuring that ROR IDs are introduced early and verified before publication, coupled with a quality assurance process during proofing
-
In the submission system eJP, the author is presented with a ‘Search for Organizations’ widget to add an affiliation. The widget uses the author's country and auto-completion to narrow the choices, and the selected affiliation and its corresponding ROR ID are added.
-
The ROR IDs are exported in the package that is handed over to their vendor Kriyadocs.
-
During proofing, a quality assurance process involving automated XML validation, queries to ROR’s public API, and manual checks by production vendors and in-house eLife staff ensures that missing ROR IDs are added and that content and metadata are both complete and correct.
-
The ROR IDs are included in the JATS XML for the published articles which are deposited to Crossref.
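One automated check of this kind could look like the sketch below, which flags affiliations in a JATS file that carry no ROR ID. It assumes the common JATS tagging of `institution-id` with `institution-id-type="ror"` inside `aff`; the function name and sample values are illustrative, and eLife's actual validation rules are not described in this text. A flagged affiliation would then go to a ROR lookup or a manual check.

```python
import xml.etree.ElementTree as ET


def affiliations_missing_ror(jats_xml):
    """Return the ids of <aff> elements that carry no ROR institution-id."""
    root = ET.fromstring(jats_xml)
    missing = []
    for aff in root.iter("aff"):
        ror_ids = [
            el.text.strip()
            for el in aff.iter("institution-id")
            if el.get("institution-id-type", "").lower() == "ror" and el.text
        ]
        if not ror_ids:
            missing.append(aff.get("id"))
    return missing


# Minimal JATS-like sample; the ROR value is a placeholder, not a real record.
sample = """
<article>
  <front><article-meta>
    <aff id="aff1">
      <institution-wrap>
        <institution-id institution-id-type="ror">https://ror.org/placeholder1</institution-id>
        <institution>Example University</institution>
      </institution-wrap>
    </aff>
    <aff id="aff2">
      <institution>Institute Without Identifier</institution>
    </aff>
  </article-meta></front>
</article>
"""

print(affiliations_missing_ror(sample))  # ['aff2']
```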
American Society for Microbiology best practice
Combining AI-powered submission tools with editorial oversight via expert manual checks
-
Authors submit manuscripts via ChronosHub and receive AI-powered assistance with metadata entry and PID matching.
-
The Ringgold IDs are exported as person identity metadata in the MECA XML-compliant package that is passed to Kriyadocs at acceptance.
-
The accepted manuscript is handed over to the vendor Kriyadocs, which extracts the author affiliations from the manuscript. These data are reconciled against and combined with the submission metadata.
-
Newly captured institutions are validated automatically against the Ringgold database, using both the main institution name and location (city and country). If there is a match, the Ringgold ID is captured.
-
If there is no match, a query is inserted for copyeditors to analyze the affiliation. They try to resolve name mismatches and spelling anomalies so that more author affiliations receive a Ringgold ID.
-
Ringgold IDs are captured in the published JATS XML and deposited to Crossref.
-
Ringgold IDs are transformed into ROR IDs for certain downstream deliverables.
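The match-or-query step above can be illustrated with a small sketch. It assumes a simple in-memory registry keyed by normalized name, city, and country; the registry contents, the placeholder IDs, and the function names are hypothetical, and the real Ringgold lookup is not described in this text.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


# Made-up registry entries keyed by (name, city, country); IDs are placeholders.
REGISTRY = {
    ("example university", "springfield", "us"): "RINGGOLD-0001",
    ("universite d exemple", "lyon", "fr"): "RINGGOLD-0002",
}


def resolve_affiliation(name, city, country):
    """Return a match with its ID, or a query for a copyeditor to review."""
    key = (normalize(name), normalize(city), normalize(country))
    org_id = REGISTRY.get(key)
    if org_id:
        return {"status": "matched", "id": org_id}
    note = f"No match for '{name}' ({city}, {country}); please check."
    return {"status": "query", "note": note}


print(resolve_affiliation("Université d'Exemple", "Lyon", "FR"))
print(resolve_affiliation("Exmaple Univ.", "Springfield", "US"))
```

The second call has a misspelled name, so it falls through to a query, which is the case the copyeditors handle manually.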


Rockefeller University Press best practice
Maintaining ROR IDs across the full publishing workflow, from ‘author select’ at submission through metadata deposits upon publication
-
In eJP, the author is presented with a ‘Search for Organizations’ widget to add an affiliation. The widget uses the author's country and auto-completion to narrow the choices, and the selected affiliation and its corresponding ROR ID are added.
-
These ROR IDs persist throughout their workflow. Staff ensure that authors have applied ROR IDs to the “manuscript affiliations” (those checked for Read & Publish deal eligibility) and that all ROR IDs are included in the metadata exported to their vendor, TNQ, which merges the data into the JATS XML for the article.
-
Upon publication, Silverchair includes the ROR IDs in downstream deposits to Crossref and others.
American Chemical Society best practice
Multi-method PID matching with near-complete coverage
-
ACS applies a suite of methods at submission to link a Ringgold ID to an author affiliation, including:
-
Extraction from the submitted manuscript through the ACS Publishing Center powered by ChronosHub.
-
A submitting author is presented with a pick-list of suggestions based on the extraction data and/or their profile.
-
Applying real-time proprietary algorithms to a submitting author typing free text affiliation(s).
-
Together, these methods link 96–97% of affiliations to a Ringgold ID.
-
The Ringgold IDs are stored in the in-house production system and data lake, and quality assurance processes are in place to ensure consistency between content and underlying metadata throughout the workflow. The last 3% of Ringgold coverage in author affiliations is achieved by combining human review with tools.
-
The Ringgold IDs are included in the JATS XML and the metadata file for the published articles which are deposited to Crossref.
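A pick-list built from free-text input, as described above, might be sketched like this. The ACS algorithms are proprietary and not described here, so the example substitutes Python's standard `difflib` fuzzy matching as a stand-in; the organization names, placeholder IDs, and the `suggest` function are all hypothetical.

```python
from difflib import get_close_matches

# Made-up organizations with placeholder identifiers.
KNOWN_ORGS = {
    "Example University": "ID-0001",
    "Example Institute of Technology": "ID-0002",
    "Sample State University": "ID-0003",
}


def suggest(free_text, limit=3, cutoff=0.6):
    """Return up to `limit` (name, id) suggestions, best match first."""
    lookup = {name.lower(): name for name in KNOWN_ORGS}
    hits = get_close_matches(free_text.lower(), list(lookup), n=limit, cutoff=cutoff)
    return [(lookup[hit], KNOWN_ORGS[lookup[hit]]) for hit in hits]


# A typo in the input still surfaces the intended organization first.
print(suggest("Exampel University"))
```

Raising `cutoff` trades recall for precision: fewer suggestions appear, but the ones shown are closer to what the author typed. Inputs that match nothing return an empty list and would be left for human review.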


Pensoft best practice
AI-assisted extraction with human review and in-house metadata validation
-
A proprietary tool extracts the metadata from the submitted manuscript (assisted by AI) and authors check/edit this information.
-
Throughout peer review and production, in-house staff ensure consistency between information in the manuscript/article and the structured metadata.
-
Included in the JATS XML for authors: ORCID iDs and CRediT roles (ROR IDs for affiliations coming soon).
-
Included in the JATS XML for funders: ROR IDs and DOIs.
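As one example of what an automated validation step could check before an identifier reaches the JATS XML, the sketch below verifies the check character of an ORCID iD using the ISO 7064 MOD 11-2 algorithm that ORCID documents. It is an illustration only, not Pensoft's validation code, and the function names are assumptions.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Accept a bare or URL-form ORCID iD and verify its check character."""
    compact = orcid.replace("https://orcid.org/", "").replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False
```

A check like this catches typing errors only; it does not confirm that the iD belongs to the named author, which still requires a registry lookup or author confirmation.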
Beilstein-Institut best practice
Post-acceptance metadata QA through automation and expert review
-
Automated extraction of author affiliation information from the manuscript after acceptance*).
-
Human involvement from in-house staff to search and confirm the correct affiliation – ROR ID combination.
-
Quality assurance for consistency between content and structured metadata is an integral part of the editorial, production and publication workflow and system.
-
Beilstein-Institut includes ROR IDs in their JATS XML, as well as ORCID iDs, CRediT roles, and funder DOIs. After an article is published, the ROR IDs are included in Crossref deposits.
*) Beilstein-Institut has a proprietary submission, peer review, production, and publication system.

The Royal Society best practice
Embedding metadata in OA payment and agreement workflows
The Royal Society leverages CCC RightsLink to manage their OA program, including agreements. In doing so:
-
The accepted manuscript metadata and author affiliation IDs are passed via API from the editorial system into RightsLink.
-
RightsLink leverages Ringgold IDs in the metadata, automatically checking the author's Ringgold affiliation alongside other deal eligibility criteria to assign manuscripts to the appropriate deal, apply discounts and waivers, and fulfill institutions’ reporting needs. Other organizational IDs are also supported.
-
RightsLink stores the precise identifiers that drove the match to an agreement, and those IDs are shared with publishers and their institutional customers in detailed agreement reporting.
-
This affiliation metadata can be automatically passed to other third-party systems via API or custom connectors. Here is a sample payload:
