
Overview of the data and preprocessing steps

Data and Preprocessing

The Sustainability Office of the University of Basel supplied data from two different databases containing information on the University’s electronic research output (publications and projects).

The publications dataset had to be processed to generate abstracts that could be searched using search queries in English:


  1. The database included journal articles (N = 43,229), book items (chapters, 13,443), journal items (5,028), proceedings items (2,144), and books (1,897).
  2. We selected publications with English titles (~85%) and a DOI (~80%) that were published between 2010 and 2020 (~60%), using a custom-made R script. About half of all publications remained.
  3. We searched for the abstracts of all selected publications using the rscopus R package. About 85% of abstracts could be retrieved.
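The selection in step 2 can be sketched as follows. This is an illustrative Python sketch, not the custom R script mentioned above. The `Publication` record, the `title_language` field, and the inclusive treatment of the 2010 and 2020 boundaries are assumptions made for the example; the source does not say how title language was determined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Publication:
    title: str
    year: int
    doi: Optional[str]
    # Hypothetical field: assumes the title language has already been detected.
    title_language: str


def keep_publication(pub: Publication, first_year: int = 2010, last_year: int = 2020) -> bool:
    """Return True if the publication passes all three selection criteria."""
    has_doi = bool(pub.doi and pub.doi.strip())  # missing or blank DOIs are rejected
    in_range = first_year <= pub.year <= last_year  # boundaries assumed inclusive
    return pub.title_language == "en" and has_doi and in_range


def select_publications(pubs: List[Publication]) -> List[Publication]:
    """Keep only publications with an English title, a DOI, and a year in range."""
    return [p for p in pubs if keep_publication(p)]
```

Applying the three criteria jointly is what reduces the set to roughly half, even though each criterion on its own retains a larger share.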

Publications & Projects

  1. Team members identified faculty-level organizational units within the University of Basel (including associated and cross-disciplinary institutes) manually, using the descriptors currently in use at the University of Basel.
  2. To detect whether titles or abstracts were SDG-relevant, we applied translations of the SDG queries of Elsevier (Jayabalasingham, Boverhof, Agnew & Klein, 2019) and the Aurora network with the corpustools R package.
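The idea behind the SDG detection step can be illustrated with a much simplified Python sketch. It is not the corpustools workflow and does not reproduce the Elsevier or Aurora queries, which are far more elaborate Boolean expressions. The terms in `EXAMPLE_QUERIES` and the labels `SDG-01` and `SDG-13` are hypothetical placeholders chosen for the example.

```python
import re
from typing import Dict, List, Optional

# Hypothetical placeholder terms; NOT the actual Elsevier or Aurora queries.
EXAMPLE_QUERIES: Dict[str, List[str]] = {
    "SDG-01": ["poverty", "social protection"],
    "SDG-13": ["climate change", "greenhouse gas"],
}


def compile_query(terms: List[str]) -> "re.Pattern[str]":
    """Build one case-insensitive pattern matching any term as a whole word or phrase."""
    pattern = "|".join(r"\b" + re.escape(term) + r"\b" for term in terms)
    return re.compile(pattern, re.IGNORECASE)


def tag_sdgs(
    title: Optional[str],
    abstract: Optional[str],
    queries: Dict[str, List[str]] = EXAMPLE_QUERIES,
) -> Dict[str, bool]:
    """Return one logical flag per SDG: True if the title or abstract matches."""
    text = f"{title or ''} {abstract or ''}"
    return {sdg: bool(compile_query(terms).search(text)) for sdg, terms in queries.items()}
```

Each record ends up with one logical value per goal, which corresponds to the shape of the SDG-* columns in the shared data.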

The data is shared via this GitHub repository.



Variables in the publications dataset:

| Variable | Description |
|---|---|
| type | Type of publication. |
| year | Year of publication. |
| authors | Author list of publication. |
| title | Title of publication. |
| abstract | Abstract of publication. |
| outlet | Name of journal, book, or proceedings publication. |
| peer_reviewed | Whether publication has been subjected to peer review. |
| organisation | Original list of Unibas organizations involved in publication. |
| unibas_authors | Publication authors at University of Basel. |
| unibas_publcation | Publication officially affiliated with University of Basel. (?) |
| doi | Digital object identifier. |
| Org-* | Logical vectors indicating Unibas faculty-level organizations involved in publication. |
| SDG-* | Logical vectors indicating whether publication addresses the 17 sustainable development goals. |


Variables in the projects dataset:

| Variable | Description |
|---|---|
| lead | Name of project lead researcher. |
| co_lead | Names of project co-lead researchers. |
| organisation | Original list of Unibas organizations involved in project. |
| title | Title of project. |
| year | Year of project start. |
| abstract | Abstract of project. |
| keywords | Keywords of project. |
| type | Type of project: self- or externally financed. |
| status | Status of project: active or completed. |
| id | Internal identifier number. |
| Org-* | Logical vectors indicating Unibas faculty-level organizations involved in project. |
| SDG-* | Logical vectors indicating whether project addresses the 17 sustainable development goals. |
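
Because the Org-* and SDG-* columns are logical, counts per goal or per unit follow from summing them. Below is a minimal Python sketch, assuming the data are available as CSV with logical values written as TRUE/FALSE and with columns named like `SDG-01`. Neither the file format nor the exact column names are stated above, so both are assumptions.

```python
import csv
import io
from typing import Dict, Iterable


def count_sdg_hits(rows: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Count, for every SDG-* column, how many rows are flagged TRUE."""
    counts: Dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            if column.startswith("SDG-"):
                is_hit = value.strip().upper() == "TRUE"  # assumed TRUE/FALSE encoding
                counts[column] = counts.get(column, 0) + int(is_hit)
    return counts


# Tiny in-memory example with made-up rows, standing in for the real file.
sample = "title,SDG-01,SDG-13\nA,TRUE,FALSE\nB,TRUE,TRUE\n"
example_counts = count_sdg_hits(csv.DictReader(io.StringIO(sample)))
```

The same function works for the projects dataset, since both datasets share the SDG-* column convention.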