NIH Preprint Pilot

The NIH Preprint Pilot is a project of the National Library of Medicine (NLM). During the pilot, NLM will make preprints resulting from research funded by the National Institutes of Health (NIH) available via PubMed Central (PMC) and, by extension, PubMed. The pilot aims to explore approaches to increasing the discoverability of early NIH research results posted to eligible preprint servers. PMC already makes available more than one million peer-reviewed papers resulting from NIH-supported research collected under the NIH Public Access Policy. This pilot builds on PMC’s NIH repository role as well as 2017 NIH guidance (NOT-OD-17-050) that encourages investigators to use interim research products, such as preprints, to speed the dissemination and enhance the rigor of their work.

What is a preprint?

Preprints are complete and public drafts of scientific documents, not yet certified by peer review. These documents ensure that the findings of the research community are widely disseminated and that priority of discoveries is established, and they invite feedback and discussion to help improve the work.

Certification by peer review is the key distinction between a preprint and an accepted author manuscript or published article. Many preprints are submitted to journals for publication, and as a result, subsequent versions of the paper may also be made available after peer review. Readers of preprints should be aware that any aspect of the research, including the results and conclusions, may change as a result of peer review (see PMC Disclaimer). Authors may also revise preprints and post updated versions to the preprint server.

A paper's lifecycle from preprint to author manuscript to published article

A preprint also has a different permanent unique identifier (e.g., DOI) and citation than the version accepted for publication and published in a journal.

If you have questions about the NIH Preprint Pilot that are not addressed in the FAQs, or feedback for NLM on any aspect of the pilot, please contact pmc-preprints@ncbi.nlm.nih.gov.

Scope

The first phase of the NIH Preprint Pilot will focus on increasing the discoverability of preprints with NIH support relating to the SARS-CoV-2 virus and COVID-19. NLM curation efforts are limited to preprints included in the iSearch COVID-19 Portfolio tool at this time.

In addition to inclusion in the COVID-19 Portfolio, a preprint must have NIH-affiliated authors or acknowledge NIH support in the preprint to be in scope for the first phase of the pilot.
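The Phase 1 scope rule described above combines two conditions: the preprint must be in the COVID-19 Portfolio, and it must either have NIH-affiliated authors or acknowledge NIH support. The following is only an illustrative sketch of that logic, not NLM's actual curation code. The `Preprint` record, its field names, and the example DOIs are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preprint:
    """Hypothetical record; field names are illustrative, not an NLM schema."""
    doi: str
    in_covid19_portfolio: bool       # listed in the iSearch COVID-19 Portfolio
    has_nih_affiliated_author: bool  # at least one NIH-affiliated author
    acknowledges_nih_support: bool   # NIH support acknowledged in the preprint


def in_scope_phase1(preprint: Preprint) -> bool:
    """Return True if the preprint meets the Phase 1 scope rule as stated above."""
    has_nih_link = (
        preprint.has_nih_affiliated_author or preprint.acknowledges_nih_support
    )
    return preprint.in_covid19_portfolio and has_nih_link


# Made-up example records (placeholder DOIs, not real preprints).
examples = [
    Preprint("10.0000/example-a", True, False, True),   # in portfolio, NIH support
    Preprint("10.0000/example-b", True, False, False),  # in portfolio, no NIH link
    Preprint("10.0000/example-c", False, True, True),   # NIH link, not in portfolio
]
in_scope_dois = [p.doi for p in examples if in_scope_phase1(p)]
```

In this sketch only the first example record would be in scope, since it is the only one that satisfies both conditions.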

NLM will expand the pilot to include preprints resulting from the broader spectrum of NIH-supported research as curation and ingest workflows are refined, automated, and made scalable.

NLM also plans to streamline the process for adding preprint citations to My Bibliography in Summer 2020 to enable NIH investigators to more easily report preprints as products of their award(s).

Timeline

The pilot will run for a minimum of 12 months, starting in June 2020. Regular updates on the pilot will be posted to the NLM Technical Bulletin.

Workflow, Display & Access

The NIH Preprint Pilot workflow aims to minimize effort required by NIH investigators to make their early research results discoverable via PMC. There will not be a separate submission process or system to add preprints to PMC. Investigators are strongly encouraged to follow the current NIH guidance on claiming interim research products to streamline identification of preprints with NIH support.

NIHPP Preprint Pilot Workflow

Step 1

NLM will identify preprints that are in scope for the pilot using available curation tools for each phase.

Step 2

Preprint citation and abstract metadata will be pulled from available web services to create an article header record in PMC. A PMCID will be assigned at this time. This early record aims to facilitate rapid discovery.

Step 3

Once loaded to PMC, a corresponding PubMed record will be created, and the preprint will be converted to archival full-text XML format as license terms allow.

The conversion process will take a few days.

Step 4

Upon completion, the full-text web version of the preprint will be made available in PMC for full-text searching and integrated with the rest of the literature.

Screenshot of NIHPP paper in PMC with banner and info and related content boxes

To ensure that researchers, clinicians, and the public can all easily distinguish between preprints and the journal literature, preprints in PMC will be identified by an NIH-branded preprint banner. All preprint records will be clearly indicated as such in the citation as well as via a prominent green info box. PubMed will use a similar info box display for preprint records. The info box will also clearly designate that the preprint has not been peer reviewed.

The yellow box for related content in PMC will include a) a pointer to the preprint on the server website and b) a link between the preprint and the published article, when available. Users can always access and view the preprint record directly from the server via the DOI link.

Preprint Server Eligibility

Phase 1 of the pilot will include preprints with NIH support identified in the iSearch COVID-19 Portfolio tool developed by the NIH Office of Portfolio Analysis. This tool includes content from the preprint servers with the highest volume of papers relating to COVID-19: medRxiv, bioRxiv, ChemRxiv, arXiv, Research Square, and SSRN.

More generally, in determining eligibility of a preprint server for inclusion in the pilot, NLM considers the following server policies and practices:

NIHPP Preprint Server Recommended Practices

These considerations are based on NIH guidance for selecting interim research product repositories (NOT-OD-17-050) and the recommendations for preprint servers outlined in the Committee on Publication Ethics Discussion Document (Version 1). Where applicable to preprints, NLM also looks for conformance with the Principles of Transparency and Best Practice in Scholarly Publishing (joint statement by COPE, DOAJ, WAME, and OASPA).

Further, for this pilot, NLM also considers:

  • The estimated volume of preprints with NIH support currently available in a preprint server.
  • The scope of a preprint server. Servers that generally include content other than papers (e.g., datasets, posters) or that frequently accept articles that have already been accepted for journal publication or even published in a journal are out of scope for the pilot. PMC and PubMed have existing mechanisms for selecting and ingesting content that has been accepted for publication or published in a journal.
  • Completeness of openly available metadata. Metadata must be sufficient to support accurate citation and location of the preprint.

Finally, consistent with NIH guidance to investigators (NOT-OD-17-050), NLM strongly encourages eligible preprint servers to make available Creative Commons Attribution license options or the option to dedicate the work to the public domain.

NLM anticipates that these considerations may evolve over time as we continue to engage with the preprint server and broader scholarly communications communities.

Last updated: Thurs., 7 Jan 2021