Training machine learning algorithms for assessing ECGs

PTB publishes largest ECG database accessible to the public to date

PTBnews 3.2020
Especially interesting for: AI experts

A database developed at PTB provides more than 20 000 ECG recordings together with machine-readable diagnostic findings and cardiologists' comments. It was prepared primarily for the development of machine learning algorithms and is divided into training and test sections so that results can be compared. The database, named PTB-XL, is publicly available at PhysioNet.

Overview of the database's contents. Each entry corresponds to a row of the table, ordered chronologically from top to bottom. Black pixels show available values; white pixels indicate missing values.

Artificial intelligence is a major trend in medicine. Particularly in fields such as ECG assessment, where much practice and experience are necessary, deep learning can offer great advantages. The corresponding algorithms are capable of recognizing patterns in large amounts of data in a way that only experienced cardiologists have been able to do to date. They thus support physicians in the time-consuming task of checking numerous ECG signals. Until now, however, high-performance algorithms have typically been trained on non-public datasets, so that they could not be used by large parts of the scientific community. Public datasets, on the other hand, were previously too limited to be used for training and, in particular, for the reliable evaluation of machine learning algorithms. In addition, the evaluation methodology has not been standardized, so that results cannot be adequately compared.
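A fixed, published split is one way to make evaluations comparable: every group trains and tests on the same records. The following is a minimal sketch of how such a predefined split might be applied. The field name `fold`, the sample records and the choice of fold 10 as the test section are assumptions made for illustration and are not taken from this article.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, int]


def split_by_fold(
    records: List[Record], test_folds: Set[int]
) -> Tuple[List[Record], List[Record]]:
    """Split records into training and test lists using a stored fold number.

    The fold is read from the metadata and never recomputed, so every user
    of the dataset ends up with exactly the same test records.
    """
    train = [r for r in records if r["fold"] not in test_folds]
    test = [r for r in records if r["fold"] in test_folds]
    return train, test


# Hypothetical metadata rows; a real dataset would supply these from a file.
records = [
    {"ecg_id": 1, "patient_id": 101, "fold": 1},
    {"ecg_id": 2, "patient_id": 102, "fold": 4},
    {"ecg_id": 3, "patient_id": 103, "fold": 9},
    {"ecg_id": 4, "patient_id": 104, "fold": 10},
    {"ecg_id": 5, "patient_id": 105, "fold": 10},
    {"ecg_id": 6, "patient_id": 106, "fold": 2},
]

train, test = split_by_fold(records, test_folds={10})
print(len(train), len(test))  # prints "4 2"
```

Because the fold assignment is stored with the data rather than drawn at random by each user, results reported by different groups refer to the same held-out records.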

The new PTB-XL database contains 21 837 ten-second ECG signals from 18 885 patients. It is thus approximately 40 times larger than the PTB Diagnostic Database, which has been very widely used to date. In the EU EMPIR project Medalcare, PTB is working with the Fraunhofer Institute for Telecommunications – Heinrich-Hertz-Institut (HHI) to compare different machine learning algorithms using this large dataset. The first benchmark study in this field compares conventional classification algorithms on a number of different tasks with clearly defined evaluation procedures. The results have been published in the IEEE Journal of Biomedical and Health Informatics and are intended as an incentive for other scientists to continue working with the database.
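The figures above imply that some patients contribute more than one recording (about 1.16 recordings per patient on average). A common precaution in such cases is to keep all recordings of one patient in the same section of a split, so that a model is never tested on a patient it has already seen. The sketch below illustrates this idea with a deterministic hash of the patient ID. It is an assumption-laden illustration, not the procedure used to build PTB-XL; the function names, the sample records and the number of folds are hypothetical.

```python
import zlib
from collections import defaultdict

N_RECORDS = 21837   # number of ECG signals stated in the article
N_PATIENTS = 18885  # number of patients stated in the article
RECORDS_PER_PATIENT = N_RECORDS / N_PATIENTS  # roughly 1.16


def assign_fold(patient_id, n_folds=10):
    """Map a patient ID to a fold in 1..n_folds, deterministically.

    The fold depends only on the patient, so all recordings of one
    patient necessarily land in the same fold.
    """
    digest = zlib.crc32(str(patient_id).encode("utf-8"))
    return digest % n_folds + 1


def folds_per_patient(records, n_folds=10):
    """Collect the set of folds that each patient's recordings fall into."""
    seen = defaultdict(set)
    for _ecg_id, patient_id in records:
        seen[patient_id].add(assign_fold(patient_id, n_folds))
    return dict(seen)


# Hypothetical (ecg_id, patient_id) pairs; patients 101 and 103 have
# several recordings each.
records = [(1, 101), (2, 101), (3, 102), (4, 103), (5, 103), (6, 103)]
folds = folds_per_patient(records)

# Every patient appears in exactly one fold.
assert all(len(f) == 1 for f in folds.values())
```

Hashing alone does not balance fold sizes or diagnostic classes. A stratified assignment would need additional logic, which is omitted here.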

The dataset is available to the public at PhysioNet: https://www.physionet.org/content/ptb-xl


Tobias Schäffter
Division 8 Medical Physics and Metrological Information Technology
Phone: +49 30 3481-7343


P. Wagner, N. Strodthoff, R.-D. Bousseljot, D. Kreiseler, F. I. Lunze, W. Samek, T. Schaeffter: PTB-XL, a large publicly available electrocardiography dataset. Scientific Data 7, 154 (2020)

N. Strodthoff, P. Wagner, T. Schaeffter, W. Samek: Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL. IEEE Journal of Biomedical and Health Informatics 2020. Early Access Article