
JATS-XML: Why Scholarly Publishers Should Embrace the Standard

Kevin Boshold

Created on 03. June 2024


"Without standards, there can be no improvement," Taiichi Ohno, the Japanese engineer and one of the founding fathers of the Toyota Production System, is reported to have said, highlighting the timeless value of uniformity in processes and products. Across modern industries, standardization serves as the backbone of efficiency, safety, and interoperability.

By establishing common languages, practices, and benchmarks, industries can streamline operations, reduce costs, and foster innovation. This enhances product quality and consumer trust and facilitates international trade and cooperation, proving that standardization is not merely a tool but a catalyst for global progress and sustainability.

These advantages of standardization all hold equally true in scholarly publishing. One of the most powerful standards in this business is the XML format JATS (Journal Article Tag Suite).  
 

1. Decoding XML and JATS

XML (Extensible Markup Language) is a markup language that encodes content in a way that is both human-readable and machine-readable, while also being layout-independent. Various industries utilize distinct XML standards to author media-neutral content, process it efficiently, and publish it to various channels and platforms. In the scholarly publishing domain, the JATS XML format has become the industry standard. But what is JATS and why has it become so popular?      

JATS, also known as NISO JATS, grew out of the NLM journal article DTDs created at the U.S. National Library of Medicine. It is now maintained by the National Information Standards Organization (NISO) as ANSI/NISO Z39.96, a standard approved by the American National Standards Institute. It is an internationally used, standardized markup format specifically designed for the digital publication and exchange of scientific articles and publications.

In simple terms, JATS is a set of XML elements and attributes that describe the structure and semantics of a journal article, facilitating its automated processing and publication. Originally, the JATS standard was primarily used for scientific, technical, medical, and engineering journal articles. However, journals from other fields, such as the humanities, economics, and the social sciences, now also rely on it.
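To make this more concrete, the following Python sketch parses a deliberately tiny JATS-style fragment and lists its main building blocks. The element names (`article`, `front`, `body`, `back`, `sec`, `ref-list`, `ref`) come from the JATS tag library; the sample content, the constant `JATS_SAMPLE`, and the helper `outline` are made up for illustration, and a real article carries far more markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified JATS-style article (illustration only).
JATS_SAMPLE = """\
<article article-type="research-article">
  <front>
    <article-meta>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph of the article.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="r1"><mixed-citation>Example, A. (2020). A cited work.</mixed-citation></ref>
    </ref-list>
  </back>
</article>
"""


def outline(xml_text):
    """Return a small summary of the structure of a JATS-style article."""
    root = ET.fromstring(xml_text)
    return {
        # Attribute on the root element that classifies the article.
        "article_type": root.get("article-type"),
        # Direct children of <article>: metadata, content, and back matter.
        "top_level": [child.tag for child in root],
        # Titles of the top-level sections in the body.
        "section_titles": [t.text for t in root.findall("./body/sec/title")],
        # IDs of the references listed in the back matter.
        "reference_ids": [r.get("id") for r in root.findall("./back/ref-list/ref")],
    }


if __name__ == "__main__":
    for key, value in outline(JATS_SAMPLE).items():
        print(f"{key}: {value}")
```

Because the markup names what each part is (a section title, a reference) rather than how it should look, software can pick out these parts without guessing from fonts or layout.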

There are three JATS tag sets that target three different groups in the publishing lifecycle:

  • Article Authoring — for article authors
  • Journal Publishing — for publishers, used in the production workflow
  • Journal Archiving and Interchange — for archives and libraries

Related tag suites derived from JATS include the NISO Standards Tag Suite (NISO-STS) and the Book Interchange Tag Suite (BITS).

 

2. Why JATS has become the Industry Standard

Scholarly publishers are increasingly adopting JATS for their content production, and in some cases the JATS format is already a requirement for indexing services and delivery channels. What makes this XML standard so appealing?

  • Developed for articles: JATS is tailored to the journal production use case and is aligned with the state-of-the-art publishing approach for journals and preprints.
  • Dynamic standard: It is constantly evolving to meet the needs of publishers and other users.
  • Structural and declarative: JATS markup is structural and declarative, which facilitates software-based processing of journal articles and ensures data preservation.
  • Available knowledge base: The JATS standard is very well documented. Comprehensive tag libraries are available online free of charge, with instructions and examples for the use of elements and attributes as well as many best-practice recommendations.
  • Advantages over a custom XML schema: Using a uniform industry standard instead of developing a proprietary schema has further benefits for publishers, including improved interoperability with third-party systems and platforms, significantly lower implementation costs, increased attractiveness to authors and editors, and valuable knowledge building within the organization.
       

3. The Advantages of JATS-based Publications

Relying on JATS pays off throughout the entire editorial process. From authoring and management to publication and distribution, there are a host of benefits. To name just a few:

  • More efficient, streamlined production of consistent, high-quality and media-neutral content
  • Effortless output in various formats including HTML, EPUB, (print) PDF, eBook or XML
  • Seamless dissemination across any required channels and platforms such as websites, web stores, content delivery platforms or scientific databases and libraries
  • Increased visibility and hence expanded audience
  • Improved findability of publications, both on the web (through search engines such as Google Scholar or in scholarly databases and libraries etc.) and in in-house content management systems
  • Simplified citation in other publications
  • Consistent display on all output channels and devices
  • Enhanced readability and accessibility
  • Greater efficiency in document management compared to documents in traditional formats like Microsoft Word
  • Easier and faster production of new editions or entirely new publications (e.g. scientific anthologies etc.) and consequently easier reuse and remonetization of existing content
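The idea of producing several output formats from one media-neutral source can be sketched in a few lines of Python. The example below renders the same JATS-style body once as HTML and once as plain text. It is only a minimal illustration: the sample article and the helpers `body_to_html` and `body_to_text` are hypothetical, inline markup such as `<italic>` is simply flattened to text, and production pipelines typically rely on dedicated transformation tooling (XSLT, for example) rather than hand-written code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified JATS-style source (illustration only).
SOURCE = """\
<article>
  <body>
    <sec>
      <title>Results</title>
      <p>Standards make <italic>reuse</italic> easier.</p>
    </sec>
  </body>
</article>
"""


def _flat_text(element):
    """Concatenate all text inside an element, dropping inline markup."""
    return "".join(element.itertext()).strip()


def body_to_html(xml_text):
    """Render top-level sections of the body as simple HTML."""
    root = ET.fromstring(xml_text)
    parts = []
    for sec in root.findall("./body/sec"):
        parts.append("<section>")
        title = sec.find("title")
        if title is not None:
            parts.append(f"<h2>{escape(_flat_text(title))}</h2>")
        for para in sec.findall("p"):
            parts.append(f"<p>{escape(_flat_text(para))}</p>")
        parts.append("</section>")
    return "\n".join(parts)


def body_to_text(xml_text):
    """Render the same sections as plain text, one block per title or paragraph."""
    root = ET.fromstring(xml_text)
    blocks = []
    for sec in root.findall("./body/sec"):
        title = sec.find("title")
        if title is not None:
            blocks.append(_flat_text(title))
        blocks.extend(_flat_text(para) for para in sec.findall("p"))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(body_to_html(SOURCE))
    print()
    print(body_to_text(SOURCE))
```

Both functions read the same source and differ only in how they present it, which is the essence of media-neutral content.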

In a nutshell, JATS files, or XML files in general, are an ideal fit for scholarly publishing given their versatility, time savings, and adherence to quality standards.

This allows authors and publishers to keep their content well organized and easily accessible on various platforms with little effort on their part.

 

4. Integrating JATS into the editorial process

Since JATS is essential for efficient publication across various formats and channels, the content must be available in this format at some point during the editorial process. It is generally recommended to use JATS as early as possible. There are essentially two ways in which publishers can obtain JATS content.

4.1 Converting traditional Word content to JATS:

During the editorial process, articles and related content are often provided in Microsoft Word format. At some point, this content must be converted to JATS for efficient processing and publication. There are several tools available for this conversion process, but publishers often outsource this step, especially to low-wage countries.

4.2 Creating JATS content from the start:

The other option is to take an "XML first" approach and create JATS content from the beginning using an XML editor (e.g. Xeditor) or XML-based editorial system (e.g. Xpublisher for Scholarly Publishing). This practice is becoming increasingly popular among publishers because it offers several important advantages:

  • Increased data control: Creating content in XML from the start eliminates the need for data conversion later in the process. This removes an often critical production step and increases data control and security. In the past, such conversions were often outsourced, which meant that sensitive material, such as research data, was shared with third parties and external service providers without adequate control.
  • Improved semantic content quality: 
    Modern XML editors provide guided authoring and real-time validation, ensuring everyone is creating valid XML at all times. This prevents editing errors as well as time-consuming and error-prone manual reworking. The result is a significantly higher semantic content quality, which is especially beneficial for subsequent processing.
  • Enhanced process efficiency through automation and standardization: 
    Using an XML editor, or an editorial system that integrates one, makes it far easier to automate routine tasks and standardize content production. Publishers save time and money, shorten the content development cycle, and achieve greater consistency across documents, which improves the overall quality and coherence of the published content.
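What automated validation means in practice can be hinted at with a small Python sketch. It does not validate against the actual JATS schema (that requires a schema-aware validator and the official DTD or schema files). Instead, it checks well-formedness plus a handful of illustrative rules. The rule list `REQUIRED_PATHS` and the function `check_article` are assumptions made for this example and do not describe how any particular editor works.

```python
import xml.etree.ElementTree as ET

# Illustrative rules only: a tiny, hypothetical subset of what a real
# JATS validation would enforce.
REQUIRED_PATHS = [
    "./front",
    "./front/article-meta",
    "./front/article-meta/title-group/article-title",
]


def check_article(xml_text):
    """Return a list of human-readable problems; an empty list means no findings."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not even well-formed XML, so no further checks are possible.
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "article":
        problems.append(f"root element is <{root.tag}>, expected <article>")

    for path in REQUIRED_PATHS:
        if root.find(path) is None:
            problems.append(f"missing element: {path}")

    title = root.find("./front/article-meta/title-group/article-title")
    if title is not None and not "".join(title.itertext()).strip():
        problems.append("article title is empty")

    return problems


if __name__ == "__main__":
    draft = "<article><front><article-meta/></front></article>"
    for problem in check_article(draft):
        print(problem)
```

An editor that runs checks of this kind on every change can flag a missing title the moment it occurs instead of leaving it to a later production step.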

In the past, creating XML content required technical expertise from trained editors.   

However, modern web-based tools now offer intuitive interfaces similar to Microsoft Word, and support various XML standards such as JATS out-of-the-box. This makes it easier for publishers to adopt an XML-first approach. The following graphic illustrates the interface of our XML editor Xeditor, which closely resembles the familiar Word layout.

Screenshot of the Xeditor user interface

 

5. The Important Role of Metadata

The brief definition of JATS contained in the first chapter as "a set of XML elements and attributes that describe the structure and semantics of a journal article and facilitate its automated processing and publication" is quite general. For a better understanding of the practice, more technical details about the metadata are needed.

Metadata is one part of the aforementioned "set of XML elements and attributes" and plays a key role in the efficient distribution of publications across different media channels. Professional publishing platforms, as well as abstracting and indexing services that store scholarly content, typically extract the metadata from a publication so that it is automatically available on these services.

Examples of metadata include:

  • Document title
  • Article title
  • Article version
  • Authors and affiliations
  • Abstract
  • Bibliography
  • Copyright information
  • List of keywords
  • Digital object identifiers (DOIs)
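How a platform or indexing service might read such metadata from a JATS file can be sketched in Python. The element names used below (`article-meta`, `article-id`, `title-group`, `contrib-group`, `kwd-group`, `permissions`) are standard JATS elements. Everything else is hypothetical: the sample record, including its DOI and author names, is invented, and the helper `extract_metadata` is only one possible way to collect the fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the DOI and all names are made up.
SAMPLE = """\
<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1234/example.2024.001</article-id>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Roe</surname><given-names>Richard</given-names></name>
        </contrib>
      </contrib-group>
      <permissions>
        <copyright-statement>Copyright 2024 The Authors</copyright-statement>
      </permissions>
      <abstract><p>A short abstract.</p></abstract>
      <kwd-group>
        <kwd>JATS</kwd>
        <kwd>XML</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
"""


def _text(element):
    """Flattened text of an element, or None if the element is absent."""
    if element is None:
        return None
    return "".join(element.itertext()).strip()


def extract_metadata(xml_text):
    """Collect a few common metadata fields from <article-meta>."""
    meta = ET.fromstring(xml_text).find("./front/article-meta")
    if meta is None:
        raise ValueError("no <article-meta> element found")

    # An article can carry several IDs; keep only the one typed as a DOI.
    doi = next(
        (_text(el) for el in meta.findall("article-id")
         if el.get("pub-id-type") == "doi"),
        None,
    )

    authors = []
    for contrib in meta.findall("./contrib-group/contrib"):
        name = contrib.find("name")
        if contrib.get("contrib-type") != "author" or name is None:
            continue
        given = _text(name.find("given-names")) or ""
        surname = _text(name.find("surname")) or ""
        authors.append(f"{given} {surname}".strip())

    return {
        "doi": doi,
        "title": _text(meta.find("./title-group/article-title")),
        "authors": authors,
        "abstract": _text(meta.find("abstract")),
        "keywords": [_text(k) for k in meta.findall("./kwd-group/kwd")],
        "copyright": _text(meta.find("./permissions/copyright-statement")),
    }


if __name__ == "__main__":
    for field, value in extract_metadata(SAMPLE).items():
        print(f"{field}: {value}")
```

Since every field sits in a predictable place, the same few lines work for any article that follows the standard, which is what makes automated ingestion by third-party services feasible.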

A modern and intuitive XML editor, as described in the previous chapter, automatically prompts for this metadata, which is then filled in by the person responsible, e.g. the author or editor. The following graphic illustrates how this might look in practice in a JATS XML editor like Xeditor:

Screenshot of the Xeditor metadata view


JATS XML files and corresponding metadata are indispensable to modern scholarly publishing. They facilitate the storage of bibliographic data and enable researchers and readers worldwide to access relevant information and publications rapidly and conveniently.

Thus, they make an important contribution to a major scientific goal: disseminating knowledge as widely as possible and making it available to the public.