Use Cases

We built two demonstrators and one case study that focus on the evaluation and deployment of the tools and workflows developed by the SoFAIR project.

For more details, see the dedicated Evaluation Report.

We worked on the following use cases:

Demonstrator 1: Linking research studies to software in life sciences for Europe PMC

  • Aim: To evaluate a workflow for integrating TEI-annotated content, generated by SoFAIR machine learning models, into the existing Europe PMC annotations infrastructure.
  • Outcome: A pipeline was developed to convert Softcite annotations into the Europe PMC format and successfully upload them to the platform for ingestion via its public API.
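
As a rough illustration of the conversion step described above, the following sketch extracts software mentions from a small TEI fragment and maps them to flat annotation records. The `<rs type="software">` markup, the sample sentence, the function name `extract_mentions`, and the output field names (`type`, `exact`, `prefix`, `postfix`) are assumptions made for illustration only; they do not reproduce the actual Softcite output or the Europe PMC annotation schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical TEI fragment; real annotated documents are much richer.
SAMPLE_TEI = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">Reads were aligned with '
    '<rs type="software">Bowtie</rs> and analysed in '
    '<rs type="software">R</rs>.</p>'
)


def extract_mentions(tei_xml, context=20):
    """Return one record per <rs type="software"> child, with surrounding text."""
    root = ET.fromstring(tei_xml)
    # Rebuild the plain text while recording each mention's character span.
    text = root.text or ""
    spans = []
    for child in root:
        inner = "".join(child.itertext())
        if child.tag == TEI_NS + "rs" and child.get("type") == "software":
            spans.append((len(text), len(text) + len(inner)))
        text += inner + (child.tail or "")
    return [
        {
            "type": "software",  # hypothetical field names, not a real schema
            "exact": text[start:end],
            "prefix": text[max(0, start - context):start],
            "postfix": text[end:end + context],
        }
        for start, end in spans
    ]


records = extract_mentions(SAMPLE_TEI)
for record in records:
    print(record)
```

Keeping a short prefix and postfix around each mention is one common way to let a receiving platform re-locate the mention in its own copy of the text; whether the real pipeline does this is not stated here.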

Demonstrator 2: Validating extracted software mentions within an institutional repository

  • Aim: To develop a reliable and scalable end-to-end system for identifying, processing, and validating software mentions in publications deposited in the HAL institutional repository.
  • Outcome: A scalable pipeline was implemented on the Grid5000 infrastructure, using GROBID and Softcite for document processing and COAR Notify for interoperable communication with HAL and Software Heritage (SWH). Authors or administrators take part in a validation step in the HAL interface, where they accept or reject the detected software mentions.
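
To make the validation step more concrete, here is a minimal sketch of how an accept or reject decision on a detected mention might be turned into a notification message. The message shape is only loosely modeled on the Activity Streams 2.0 vocabulary that COAR Notify builds on; the function name `build_decision`, the mention identifiers, and the `example.org` URLs are hypothetical placeholders, not the payloads actually exchanged with HAL or Software Heritage.

```python
import json
import uuid


def build_decision(mention_id, accepted, origin, target):
    """Build a simplified Accept/Reject message for one detected software mention."""
    return {
        "@context": ["https://www.w3.org/ns/activitystreams"],
        "id": f"urn:uuid:{uuid.uuid4()}",
        "type": "Accept" if accepted else "Reject",
        "origin": {"id": origin, "type": "Service"},  # placeholder sender
        "target": {"id": target, "type": "Service"},  # placeholder receiver
        "object": {"id": mention_id},                 # the mention being judged
    }


# Hypothetical decisions made by an author or administrator.
decisions = {
    "urn:example:mention-1": True,
    "urn:example:mention-2": False,
}

messages = [
    build_decision(
        mention_id,
        accepted,
        origin="https://repository.example.org",
        target="https://pipeline.example.org",
    )
    for mention_id, accepted in decisions.items()
]

print(json.dumps(messages[0], indent=2))
```

Each decision is sent as its own message with a unique identifier, so the receiving service can process or retry them independently.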

Case study in the digital humanities

  • Aim: To investigate the potential of automated software-mention detection for analyzing digital transformation processes in the humanities by examining software usage in publications.
  • Outcome: The study found that automated detection is valuable for scientometric and infrastructural applications. A high proportion of publications in digital humanities (DH) journals contained software mentions (30-60%), whereas the proportion in Traditional Linguistics and Literary Studies (TLL) journals rarely exceeded 10-15%.