Project outline

The aim of this work package is to develop a representative reference ECG database of a virtual population with healthy, ischemic and arrhythmic ECG traces amenable to statistical evaluation. The database will be used to assess and compare different classification approaches for advanced data analysis in WP2. It will include at least 10,000 cases in which, by definition, the ground truth of the underlying pathology is precisely known. At the end of the project, the synthetic ECG reference database will be made freely available to other researchers for them to test their own algorithms.

Existing biophysical modelling tools developed at KIT and MUG to simulate de- and repolarisation of the heart tissues in realistic anatomies will be employed for the creation of the virtual ECGs. The following pathologies will be simulated:

•    ischemia and infarction of varying degrees of severity and various sizes in different areas of the ventricles
•    ectopic foci driving premature complexes and fusion beats at various timings and positions in both atria and ventricles
•    fibrotic lesions of various types and scar areas of various sizes, morphologies and locations in the atria and in the ventricles
•    slow conduction or conduction blocks at various positions
•    various re-entrant activation patterns of varying complexity in both the atria and the ventricles

These pathologies will be implemented in different heart geometries and the heart will be embedded in different thorax geometries. Both a 12-lead-ECG and a Body Surface Potential Map (BSPM, 252 electrodes) will be simulated. In addition to the various pathological cases, many non-pathological (healthy) ECGs will also be produced. Simulated ECGs will be mixed with clinical ECGs and expert cardiologists will be asked to separate one from the other (see WP3) in a ‘clinical Turing test’.
The influence of varying simulation input parameters on the virtual ECG signal traces will be characterised by a sensitivity analysis for different pathologies and the healthy variations.

This synthetic ECG database serves as a basis for WP2 to address the problem of distinguishing between different classes of pathologies using the ECG.

First results

Various ECGs of healthy hearts have been simulated. Atrial modelling delivered realistic P-waves and ventricular modelling resulted in typical QRS-T waves.

Figure 1: Left: Simulated depolarization wave on an atrial geometry. Right: Time course of the action potential in one single cell.

To date, the transmembrane voltages have been simulated on various atrial and ventricular geometries, and forward calculations to the body surface have been conducted to extract the 12-lead ECG.
In addition to the simulation of healthy signals, some simulation input parameters have been varied to produce ECGs representing cardiac diseases. For example, the conductivities in the left bundle branch were decreased, and an ectopic focus located at the left superior pulmonary vein was used as a stimulus, yielding ECGs characteristic of left bundle branch block and premature atrial contractions, respectively.

From publicly available ECG data, the statistics of intra-patient time intervals (PQ-interval, QT-interval, etc.) and amplitude variations (P-max, R-max, etc.) were extracted and employed to concatenate the simulated single beats to a continuous ECG with a length of 10 seconds.

Figure 2: Example of a synthesized ECG representing two consecutive beats in the Einthoven and Goldberger leads. Inter-beat variations were extracted from clinical ECG recordings to modify and concatenate the single beat simulations of the P-waves and QRS-T-complexes.
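The concatenation step described above can be sketched in a few lines. The beat template, sampling rate and RR-interval statistics below are illustrative assumptions for this sketch, not the values or templates used in the project:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 500  # sampling rate in Hz (assumed for this sketch)

# Hypothetical single-beat template: a narrow Gaussian pulse standing in
# for one simulated P-QRS-T beat of 0.6 s duration.
t = np.arange(0, 0.6, 1 / fs)
beat = np.exp(-((t - 0.3) ** 2) / (2 * 0.01 ** 2))

# Inter-beat statistics as they might be extracted from clinical recordings:
# mean RR interval of 0.8 s with 40 ms standard deviation (illustrative values).
rr = rng.normal(0.8, 0.04, size=12)

# Place successive beats at the sampled RR intervals to build a 10 s trace.
ecg = np.zeros(10 * fs)
onset = 0.0
for interval in rr:
    start = int(onset * fs)
    if start >= ecg.size:
        break
    stop = min(start + beat.size, ecg.size)
    ecg[start:stop] += beat[: stop - start]
    onset += interval

print(ecg.shape)  # (5000,)
```

In the project the sampled inter-beat variations additionally modify the morphology and amplitudes of the P-waves and QRS-T-complexes; this sketch only shows the timing aspect.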


Project outline

The aim of this work package is to provide a complete pathway ‘from data to decision’, together with an understanding of the effects of various sources of uncertainty on this process. The pathway consists of extracting features from 12-lead ECG signals, including both real signals and the synthetic signals provided by Work Package 1; these features are then used by machine learning methods to classify the signals as coming from subjects who do or do not have particular cardiac conditions. Both standard approaches and novel methods will be used for feature extraction and machine learning. Various methods will also be investigated for converting the signals to images, which are then classified using deep learning methods (i.e. without pre-processing of the signals for feature extraction). In addition, deep learning methods will be applied directly to the raw ECG signals without any feature extraction.

The performance of the different classification methods will be compared using various publicly available datasets, including:

•    A small dataset of real ECG signals taken from the PhysioNet PTB Diagnostic ECG Database for 100 subjects that are either healthy or have myocardial infarction;
•    A larger dataset containing ECG signals for 2,863 subjects grouped into four distinct classes, extracted from the data made available for the PhysioNet/Computing in Cardiology Challenge 2020.

The classification methods developed will also be compared using the reference ECG signals generated in Work Package 1.

An important aspect of this work package is to gain an understanding of the machine learning process. The relative importance of the different features extracted from the ECG signals will be assessed, together with which parts of the ECG signals determine these key features. For the deep learning methods, the heat-mapping approach developed by TUB and their collaborators will be applied both to the images derived from the ECG signals and to the raw ECG signals themselves, in order to determine which parts of the images and signals are most significant for the classification.

The effect of variability in the ECG signals, such as signal length, noise and mislabelled data, on the machine learning classification process will also be investigated. This work will link to the cardiac modelling undertaken in Work Package 1 to understand the sensitivity of an ECG signal to particular cardiac conditions, and the effect of these variations on classification performance will be considered. There are also many other sources of uncertainty that can influence the results of the pathway, including the type and accuracy of the training dataset, and these will be investigated to determine their effect on classification performance.


First results

A variety of features have been extracted from the real ECG signals, including:

•    ECG intervals and amplitudes (see Fig. 1);
•    Heart Rate Variability (HRV) features;
•    Wavelet features;
•    Symmetric Projection Attractor Reconstruction (SPAR) features.
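As an illustration of the simplest feature family above, two standard time-domain HRV measures (SDNN and RMSSD) can be computed directly from a series of RR intervals. The RR values below are invented for illustration; the project's actual feature set is considerably larger:

```python
import numpy as np

def hrv_features(rr_ms):
    """Two standard time-domain HRV features from an RR-interval series (ms)."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    return {
        "SDNN": rr.std(ddof=1),                # overall variability
        "RMSSD": np.sqrt(np.mean(diff ** 2)),  # beat-to-beat variability
    }

# Hypothetical RR series (ms) for illustration only
feats = hrv_features([800, 810, 790, 805, 795, 820])
print(round(feats["RMSSD"], 2))  # 17.03
```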

Machine learning has been used on these different sets of features for classification of the ECG signals.


A variety of methods for converting ECG signals to images have also been used, including:

•    Poincaré plots;
•    Scalograms (wavelet coefficients);
•    Spectrograms (short time Fourier transform);
•    SPAR images.

See Fig. 2 for an example of each of these. Classification of these images has been achieved using bespoke deep neural networks as well as transfer learning using pre-trained networks.
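To make one of these conversions concrete, a spectrogram (magnitude short-time Fourier transform) can be built from framed, windowed FFTs; the resulting 2D array is what gets rendered as an image and fed to a network. This is a minimal numpy sketch with a toy sinusoid standing in for an ECG lead; the actual work would typically use library routines with tuned window and overlap parameters:

```python
import numpy as np

def spectrogram(x, win=128, hop=64):
    """Magnitude short-time Fourier transform: a time-frequency 'image'.
    Rows are frequency bins, columns are time frames."""
    window = np.hanning(win)
    frames = [x[i:i + win] * window for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T

fs = 250
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 5 * t)  # toy 5 Hz tone standing in for an ECG lead
S = spectrogram(x)
print(S.shape)  # (65, 14): 65 frequency bins, 14 time frames
```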

In addition, deep neural networks have been used for classifying the raw ECG signals.




Noise Study

ECG signals are often noisy. We are investigating the effect of physiological noise on classification performance.

The dataset we are using has been derived from data made available for the PhysioNet/Computing in Cardiology Challenge 2020. We chose 2,678 lead II ECG recordings split into three approximately equal classes: normal, atrial fibrillation and ST depression. The signals were filtered to remove any noise, giving a clean dataset. Physiological noise, originally recorded and prepared for the MIT-BIH Noise Stress Test Database, was then added to the filtered signals. The available noise types were baseline wander, electrode movement, motion artefact and a combination of all three. Examples of the four noise types can be seen in Figure 3.
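The core operation of adding recorded noise at a controlled level can be sketched by scaling the noise trace to a target signal-to-noise ratio before summing. The signals, the Gaussian stand-in for the recorded physiological noise, and the 6 dB target below are all assumptions for illustration, not the study's actual data or SNR levels:

```python
import numpy as np

def add_noise(clean, noise, snr_db):
    """Scale a noise trace to a target SNR (in dB) and add it to a clean signal."""
    p_sig = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * np.arange(2000) / 200)  # stand-in for a filtered ECG
noise = rng.standard_normal(2000)                  # stand-in for a recorded noise trace
noisy = add_noise(clean, noise, snr_db=6)

# Verify the achieved SNR
snr = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print(round(snr, 1))  # 6.0
```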

We then generated scalogram and attractor images from the ECG signals (see Figure 4) and applied transfer learning using several pre-trained deep neural networks to classify the images.

To investigate the impact of physiological ECG noise on transfer learning classification performance, three variations of the model training and testing were carried out:

(i) Model trained and tested on data taken from the same dataset

(ii) Model trained on clean data used to classify all datasets

(iii) Model trained on the data containing the mixture of noises and used to classify all datasets
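The three evaluation protocols share one structure: fit on one dataset, score on all of them. The sketch below illustrates variation (ii) using a toy nearest-centroid classifier on synthetic features in place of the pretrained deep networks and image data; the dataset names and noise levels are invented stand-ins. Variations (i) and (iii) only change which dataset the model is fitted on:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_dataset(noise_sd):
    """Toy two-class feature data standing in for images from one noise condition."""
    x0 = rng.normal(0.0, noise_sd, size=(50, 8))
    x1 = rng.normal(1.0, noise_sd, size=(50, 8))
    return np.vstack([x0, x1]), np.array([0] * 50 + [1] * 50)

def fit_centroids(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def accuracy(centroids, X, y):
    classes = sorted(centroids)
    dists = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    pred = np.array(classes)[dists.argmin(axis=0)]
    return float((pred == y).mean())

# Hypothetical noise conditions (higher sd stands in for stronger corruption)
datasets = {name: make_dataset(sd)
            for name, sd in [("clean", 0.3), ("baseline_wander", 0.6),
                             ("electrode_movement", 0.6), ("mixture", 0.8)]}

# Variation (ii): train on clean data, evaluate on every dataset.
model = fit_centroids(*datasets["clean"])
for name, (X, y) in datasets.items():
    print(name, round(accuracy(model, X, y), 2))
```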

Results will be summarised here once the paper about this work has been published.


Local Interpretable Model-agnostic Explanations (LIME)

The classification process performed by a neural network is considered a black box in terms of the steps it takes to reach the end result. The aim here is to use Local Interpretable Model-agnostic Explanations (LIME) to better understand how a neural network reaches its classification result. LIME is a method that highlights the features of an image that are considered important for the classification.

The dataset being used comprises ECG signals that have been converted to attractor images (Figure 5). Attractor images are a way of condensing longer ECG waveforms into a single image: one attractor image can hold information from several minutes of ECG signal.
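The attractor construction behind these images takes three delay-shifted copies of the signal and projects them onto the plane orthogonal to (1, 1, 1), so that a long recording traces out a bounded 2D figure. The following is a simplified sketch of that projection with a toy sinusoid in place of an ECG; the project renders the density of the resulting (u, v) points as the actual image:

```python
import numpy as np

def spar_coords(x, tau):
    """Three-coordinate delay embedding of a signal, projected onto the plane
    orthogonal to (1, 1, 1) — a simplified sketch of the SPAR construction."""
    z0, z1, z2 = x[2 * tau:], x[tau:-tau], x[:-2 * tau]
    u = (z0 + z1 - 2 * z2) / np.sqrt(6)
    v = (z0 - z1) / np.sqrt(2)
    return u, v

fs = 500
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * t)           # periodic toy signal standing in for an ECG lead
u, v = spar_coords(x, tau=fs // 3)  # delay of one third of the (known) cycle length
print(u.shape)  # (4668,)
```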

These attractor images are then subjected to the classification process, which assigns each attractor image to one of four categories: normal, atrial fibrillation, ST depression or ST elevation.

Once the classification has been performed, the LIME method can be applied. It generates a heatmap highlighting the features that the neural network deemed important for the classification. An example of the generated heatmaps can be seen in Figure 6.
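LIME's perturb-and-explain idea can be shown in miniature: partition the image into segments, randomly occlude subsets of segments, query the model on each perturbed image, and fit a linear surrogate whose coefficients form the importance heatmap. This sketch uses a toy scoring function in place of the trained network and omits LIME's proximity weighting and superpixel segmentation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 'network': scores an 8x8 image by the mean intensity of its top-left
# quadrant, so only that region should matter for the score.
def model_score(img):
    return img[:4, :4].mean()

image = rng.random((8, 8))

# Partition the image into four quadrant segments (LIME's 'superpixels').
segments = np.zeros((8, 8), dtype=int)
segments[:4, 4:] = 1
segments[4:, :4] = 2
segments[4:, 4:] = 3
n_segments = 4

# Perturb: randomly switch segments on/off and record the model's score.
masks = rng.integers(0, 2, size=(200, n_segments))
scores = np.array([model_score(image * m[segments]) for m in masks])

# Fit a linear surrogate model: coefficients measure segment importance.
A = np.column_stack([masks, np.ones(len(masks))])
coef, *_ = np.linalg.lstsq(A, scores, rcond=None)
importance = coef[:n_segments]
heatmap = importance[segments]  # per-pixel importance map
print(importance.argmax())  # 0: the top-left quadrant drives the score
```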

Furthermore, average heatmaps displaying patterns in classification and misclassification specific to each category will also be generated, to help gain a more in-depth understanding of the classification process with respect to each class.

After the LIME heatmaps have been explored, we will move on to a backpropagation-based method (Layer-wise Relevance Propagation), which also generates heatmaps, and will explore whether the two methods give similar results.

Project outline

The aims of this work package are i) to clinically validate the synthetic ECG and BSPM database created in WP1, ii) to compare simulated ECG data with directly measured data, and iii) to benchmark the classification performance of the machine learning algorithms developed in WP2 against that of clinical experts.

Specifically, the project aims to ascertain that the synthetic ECG and BSPM database replicates all clinically observable electrophysiological features and is consistent with clinical observations, using two successive validation approaches. First, synthetic ECGs will be presented to an expert panel of cardiologists to ensure that they are indiscernible from clinically measured ECGs and exhibit the features of a given pathology necessary for accurate clinical classification. Secondly, a detailed mechanistic validation of the biophysical ECG model will be performed using multiparametric data (BSPMs or ECGs, medical imaging and EP maps). Finally, to compare the classification performance of clinical experts with that of the machine learning algorithms developed in WP2, a benchmark test will be performed on the synthetic database.