AI in heart research

How computer-assisted genomic medicine is searching for the causes of cardiovascular diseases

Computational biologist Marcel Schulz is working toward the goal of personalized medical treatment – step by step and with the help of artificial intelligence.

When calcium accumulates in the coronary arteries, it compromises oxygen supply to the heart muscle and its ability to pump blood. The effects of this condition, called coronary heart disease (CHD), range from chest pain (angina pectoris) and cardiac arrhythmia to an infarction or sudden cardiac death. According to the World Health Organization (WHO), coronary heart disease is the most common cause of death worldwide. Among the risk factors are an unhealthy lifestyle – not enough exercise, a fatty diet, alcohol, cigarettes – and diseases such as diabetes or hypertension. However, genetic factors can also play a role. That is why medical scientists are attempting to identify specific segments of the human genome among its 3.26 billion building blocks. What they are specifically searching for are segments with genetic variants that trigger CHD and other cardiovascular diseases. In the most inconspicuous of these variants, a single base pair in a DNA segment is different; the technical term for such a variant is “single nucleotide polymorphism”, or SNP. For example, one person has the base pair adenine-thymine at a certain position in their genome, whereas another has guanine-cytosine. AT or GC – this variant of the base sequence can determine whether someone is at risk of a heart attack or perfectly healthy.

Computational biologist Marcel Schulz from the Institute for Computational Genomic Medicine is hot on the trail of such gene variants that pose health risks. His specialist field, bioinformatics, endeavors to answer biological questions with the help of computing power – using statistical surveys or machine learning models. In the age of genome sequencing, such technical tools are indispensable, as only these can filter valuable findings from large datasets. Or as Schulz says: “To really understand why a person becomes ill, we need bioinformatics. Without it, there would be many questions where we cannot make any further progress.”

Guilty SNPs

Such questions include the following: Why do people suffer from coronary heart disease? And why do calcium deposits narrow coronary arteries? To unravel the molecular mechanisms behind this, Schulz brings genome-wide association studies (GWAS) into play. With this statistical tool, he searches for SNPs in the genomes of a large group of people and compares the gene variant profiles of sick and healthy individuals. If certain SNPs occur markedly more often in sick people than in healthy ones, this suggests that they are a contributing factor to the disease. Schulz is currently searching for this link between disease and gene variants in a dataset of CHD patients. He is especially looking at pathogenic SNPs in what are known as non-coding RNA molecules (ncRNAs). The human genome contains about 26,000 of these ncRNAs (see also p. 5). They are not translated into proteins, but instead control gene regulation as well as important cardiovascular processes. This makes them promising targets for the future treatment of cardiovascular diseases.
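The comparison at the heart of such a study can be illustrated with a small calculation. The following is only a sketch of a single-SNP association test, not the pipeline used by Schulz: it assumes a plain chi-square test on a 2x2 table of allele counts, and the counts themselves are invented for illustration.

```python
import math


def allele_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Test one SNP: is the variant allele more common in cases than controls?

    Returns (odds_ratio, chi_square, p_value) for a 2x2 allele-count table.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    # Pearson chi-square: compare observed counts with the counts expected
    # if the allele were unrelated to disease status.
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Invented counts: the variant allele appears on 300 of 1,000 chromosomes
# from patients, but only on 200 of 1,000 chromosomes from healthy people.
odds_ratio, chi_square, p_value = allele_association(300, 700, 200, 800)
print(f"odds ratio {odds_ratio:.2f}, chi-square {chi_square:.2f}, p = {p_value:.1e}")
```

A real GWAS repeats a test of this kind for very many SNPs at once, which is why it has to correct for multiple testing and for confounders such as ancestry; both are left out here.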

Schulz is developing computational methods to assess which SNPs play a role, e.g. by thwarting correct gene regulation, in the pathogenesis of CHD. Finding these risk SNPs in ncRNA genes is no easy task, as over 20,000 SNPs occur more frequently in CHD patients than in healthy individuals. To make things even more difficult, they are not necessarily located in the ncRNA itself, as the regulatory region in the genome often lies outside the actual gene region. For example, an ncRNA gene might occupy positions 1,000 to 5,000 on chromosome 12, while the regulatory regions that influence its activity lie at position 250 or 18,000 on the same chromosome. Despite its distance from the ncRNA gene, a SNP in such a regulatory region might well interact with it, for instance if it influences the binding of proteins that come into contact with the ncRNA gene through DNA folding in the cell nucleus. Despite such obstacles, Schulz has already struck lucky: He has so far linked 144 ncRNA genes to SNPs that encourage CHD – the first 144 candidates for a future therapy where these genes would then be systematically turned off, for example.
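The position example above can be written down as a small lookup. This is a deliberately simplified sketch: the gene name, the width of the regulatory windows around positions 250 and 18,000, and the function names are assumptions made for illustration. In practice, links between regulatory regions and genes have to be inferred from experimental data rather than listed by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    # Regulatory regions linked to this gene, as (start, end) intervals.
    regulatory: list = field(default_factory=list)


def genes_for_snp(chrom, pos, genes):
    """Return (gene name, location) pairs for every gene a SNP could affect."""
    hits = []
    for gene in genes:
        if gene.chrom != chrom:
            continue
        in_body = gene.start <= pos <= gene.end
        in_regulatory = any(s <= pos <= e for s, e in gene.regulatory)
        if in_body:
            hits.append((gene.name, "gene body"))
        elif in_regulatory:
            hits.append((gene.name, "regulatory region"))
    return hits


# Hypothetical ncRNA gene at positions 1,000 to 5,000 on chromosome 12, with
# assumed regulatory windows around positions 250 and 18,000.
genes = [Gene("ncRNA_X", "12", 1_000, 5_000, [(200, 300), (17_900, 18_100)])]

print(genes_for_snp("12", 18_000, genes))  # SNP far away from the gene itself
print(genes_for_snp("12", 9_000, genes))   # SNP with no known link
```

A SNP at position 18,000 is assigned to the gene although it lies 13,000 positions beyond the gene's end, which is exactly the situation that makes risk SNPs hard to attribute.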

For Schulz, understanding the effect of individual SNPs is the key to personalized medicine in the future. “If we know what role certain gene variants play in cardiovascular diseases, we can treat patients individually.” One example is cholesterol: Too much of this blood fat causes calcium deposits to build up in the blood vessels, increasing the risk of vascular disorders, heart attacks and strokes. “But if we know which SNPs inhibit the activity of a person’s cholesterol metabolism pathways, we can intervene at an early stage.” Patients could be invited for regular check-ups while they are still young and, if necessary, be prescribed medication to lower their cholesterol. In this way, medicine would no longer simply react to a disease but have the knowledge required to anticipate it and act accordingly.

Simplified diagram of a convolutional neural network. The input data (left) are reduced to individual features in several layers with the help of automatically learned hierarchies. All features are then strung together and combined in further layers to produce an optimal result (right). Diagram: Marcel Schulz

Mutation of blood stem cells

Other tools that Schulz is using are found in machine learning, a branch of artificial intelligence. Multimodal autoencoders are among them. These are particularly good at compressing input data, e.g. on gene activity: in other words, they encode the same information with fewer bits and store it in a “latent space”, a kind of transit area. Data compression makes it possible to push the “noise” generated by superfluous information to one side and only include the most important data elements. This makes it easier to spot patterns hidden in the data. The original data are not lost in the process: the autoencoder can largely reconstruct them from the compressed representation. Schulz uses the tool to find therapeutic targets that could help in the treatment of chronic heart failure. Among the risk factors for this condition are mutations in blood stem cells of the red bone marrow. These mutations give the descendants of the affected blood stem cells a certain advantage, so that they form a subpopulation of altered blood cells, a clone. This phenomenon is called clonal hematopoiesis of indeterminate potential (CHIP). CHIP itself is not a disease, but it can trigger one. Here, too, genetic variations play a role: Patients suffering from chronic heart failure were found to have both CHIP and SNPs in certain genes. This is a job for the autoencoder, which Schulz is using to categorize the activities of around 20,000 genes from a dataset of patients suffering from chronic heart failure. This, too, is a step in the direction of personalized medicine: “Using the data of an individual patient as our basis, we want to be able to predict whether a particular drug would positively influence gene expression, that is, the formation of RNA molecules or proteins. To do this, we compare the gene expression of sick and healthy people.”
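How compression into a latent space works can be shown with a toy example. The sketch below is not the multimodal autoencoder described here: it is a minimal linear autoencoder with a single latent value, trained by plain gradient descent, and the four "patients" with three gene activity values each are invented. All names in it are assumptions made for illustration.

```python
def reconstruction_loss(samples, enc, dec):
    """Mean squared reconstruction error over all samples."""
    total = 0.0
    for x in samples:
        z = sum(e * xi for e, xi in zip(enc, x))            # encode to 1 number
        total += sum((d * z - xi) ** 2 for d, xi in zip(dec, x))  # decode, compare
    return total / len(samples)


def train_linear_autoencoder(samples, steps=1000, lr=0.1):
    """Learn encoder and decoder weights that squeeze each sample into one value."""
    n = len(samples[0])
    enc = [0.1] * n  # encoder weights: n features -> 1 latent value
    dec = [0.1] * n  # decoder weights: 1 latent value -> n features
    for _ in range(steps):
        grad_enc = [0.0] * n
        grad_dec = [0.0] * n
        for x in samples:
            z = sum(e * xi for e, xi in zip(enc, x))
            residual = [d * z - xi for d, xi in zip(dec, x)]
            back = sum(r * d for r, d in zip(residual, dec))
            for j in range(n):
                grad_dec[j] += 2 * residual[j] * z
                grad_enc[j] += 2 * back * x[j]
        enc = [e - lr * g / len(samples) for e, g in zip(enc, grad_enc)]
        dec = [d - lr * g / len(samples) for d, g in zip(dec, grad_dec)]
    return enc, dec


# Invented activity values of three genes in four patients. The genes rise
# and fall together, so one latent value per patient captures almost everything.
SAMPLES = [
    [0.10, 0.20, 0.30],
    [0.20, 0.41, 0.59],
    [0.05, 0.10, 0.16],
    [0.15, 0.29, 0.45],
]

initial_loss = reconstruction_loss(SAMPLES, [0.1] * 3, [0.1] * 3)
enc, dec = train_linear_autoencoder(SAMPLES)
final_loss = reconstruction_loss(SAMPLES, enc, dec)
print(f"reconstruction error before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

The small error that remains after training corresponds to the "noise" that does not fit the shared pattern. Real autoencoders use several non-linear layers and far larger latent spaces, but the principle of encoding, compressing and decoding is the same.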

Monitoring heart rhythm

Deep neural networks, DNNs, mimic the way the human brain works. They, too, are a subfield of machine learning. Schulz works with convolutional neural networks (CNNs), a special and often used type. CNNs present a simplified image of the visual cortex’s neural circuit and work as follows: The front processing layer extracts simple features from the data input and feeds them to the layers behind. There, the extracted features are combined. This process is called meshing. “Let’s assume that the front layer has extracted 100 features. All the neurons in the layer behind have access to them. If the meshing layer consists of 50 neurons, for example, the network learns 100 features multiplied by 50 neurons, or 5,000 combinations. This is how it generates the desired output.”
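The arithmetic in this quote can be checked in a few lines. The sketch below assumes a standard fully connected layer with a ReLU activation, which the article does not specify; the feature values and weights are placeholders, not learned values.

```python
def dense_weight_count(n_inputs, n_neurons):
    """Each neuron has one weight per incoming feature."""
    return n_inputs * n_neurons


def dense_layer(features, weights, biases):
    """Combine all input features in every neuron (fully connected layer)."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * f for w, f in zip(neuron_weights, features)) + bias
        outputs.append(max(0.0, total))  # ReLU activation (an assumption)
    return outputs


n_weights = dense_weight_count(100, 50)  # 100 features x 50 neurons

# Placeholder values: 100 extracted features feeding 50 neurons.
features = [1.0] * 100
weights = [[0.01] * 100 for _ in range(50)]
biases = [0.0] * 50
outputs = dense_layer(features, weights, biases)
print(n_weights, len(outputs))
```

Every one of the 50 neurons sees all 100 features, so the layer holds 5,000 adjustable weights, which is what the quote calls combinations.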

CNNs are a useful instrument for processing lots of different data. Schulz has developed one that analyzes data from electrocardiograms. In the future, it could find its way into heart monitors for patients with cardiac arrhythmia. These small computer chips, which are implanted under the skin, usually close to the collarbone, record the heart’s electrical activity and in doing so can make even inconspicuous arrhythmias visible. Schulz has developed his CNN specifically for atrial fibrillation. How exactly does it monitor heart activity? After the input layer of the artificial neural network has recorded the peaks and slopes of the ECG signals, the layers behind process the combination of these signals to conclude whether the length of the waves has changed. Here, the CNN superimposes its own learned ECG curves onto the ECG signal actually received. As soon as it detects cardiac arrhythmia, it sounds the alarm. “Our CNN could be extended to different arrhythmias without raising the heart monitor’s energy consumption,” explains Schulz. This could be an advantage for patients in the future, since the battery would last longer and the device would not need replacing so often, or ideally not at all. The CNN would also be suitable for implanted defibrillators, which in the event of arrhythmias emit an electrical impulse that stops them.
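The idea of sliding a curve over the incoming signal can be sketched without any neural network library. The example below is not Schulz's CNN: it uses one hand-set filter instead of learned ones, a synthetic signal instead of a real ECG, and an assumed tolerance of 20 percent for deciding that the rhythm is irregular.

```python
def correlate(signal, kernel):
    """Slide the kernel over the signal (the core operation of a 1D convolution layer)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def find_peaks(response, threshold):
    """Positions where the filter response is a local maximum above the threshold."""
    return [
        i
        for i in range(1, len(response) - 1)
        if response[i] > threshold
        and response[i] >= response[i - 1]
        and response[i] > response[i + 1]
    ]


def is_irregular(peaks, tolerance=0.2):
    """Flag a rhythm whose beat-to-beat intervals deviate strongly from their mean."""
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    if len(intervals) < 2:
        return False
    mean = sum(intervals) / len(intervals)
    return any(abs(interval - mean) > tolerance * mean for interval in intervals)


def synthetic_ecg(beat_positions, length):
    """Toy signal: a small triangular spike at each heartbeat, flat elsewhere."""
    signal = [0.0] * length
    for p in beat_positions:
        signal[p - 1] += 0.5
        signal[p] += 1.0
        signal[p + 1] += 0.5
    return signal


KERNEL = [0.5, 1.0, 0.5]  # hand-set spike template; a CNN would learn its filters

regular = synthetic_ecg([10, 30, 50, 70], 80)
irregular = synthetic_ecg([10, 22, 50, 61], 80)

regular_peaks = find_peaks(correlate(regular, KERNEL), threshold=1.2)
irregular_peaks = find_peaks(correlate(irregular, KERNEL), threshold=1.2)
print(is_irregular(regular_peaks), is_irregular(irregular_peaks))
```

In this sketch the interval check is written by hand; in a trained CNN, the later layers learn from example recordings which combinations of filter responses indicate an arrhythmia.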

Explainable AI

Schulz has many more big plans for convolutional neural networks. They are currently being trained to predict atypical gene activity in over 50 cell types. “We are building a separate CNN for each cell type, which learns which factors in the DNA sequence are important for its specific cell type. Some of these factors only occur in this one cell type; they are cell type-specific. The CNN algorithms learn such things all by themselves. We don’t have to teach them.”
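To make the idea of sequence factors more concrete, the following sketch shows how a convolutional filter can scan a DNA sequence. It rests on several assumptions: the sequence is one-hot encoded, each filter is simply set to a fixed short motif rather than learned, and the two cell types and their motifs are placeholders with no biological meaning.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode each base as a vector with a single 1, the usual CNN input for DNA."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence]


def scan(sequence, conv_filter):
    """Slide a filter along the sequence and score every window."""
    x = one_hot(sequence)
    width = len(conv_filter)
    return [
        sum(x[i + j][k] * conv_filter[j][k] for j in range(width) for k in range(4))
        for i in range(len(sequence) - width + 1)
    ]


def best_match(sequence, conv_filter):
    """Return (position, score) of the window the filter responds to most strongly."""
    scores = scan(sequence, conv_filter)
    position = max(range(len(scores)), key=scores.__getitem__)
    return position, scores[position]


# Placeholder filters, one per hypothetical cell type. A trained network would
# learn these weights from data instead of having them written down.
filters = {
    "cell_type_A": one_hot("GATA"),
    "cell_type_B": one_hot("TTCC"),
}

sequence = "ACGGATAAGT"
for cell_type, conv_filter in filters.items():
    print(cell_type, best_match(sequence, conv_filter))
```

The filter for the first placeholder cell type finds a perfect match in the sequence, while the second finds none. A separate network per cell type can in this way come to respond to different sequence patterns.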

As far as predictions are concerned, Schulz thinks that explainable AI is an important component. Although artificial intelligence learns to interpret complex correlations between features, it does not deliver any information about what it has actually learned. “What makes things even more difficult is that artificial neural networks learn many non-linear correlations. These are difficult for us humans to follow.” That is why we will need a type of AI in the future that not only helps us find the causes of cardiovascular diseases but also reveals how it arrives at its results.

Photo: Arezoo Haghiri

About / Marcel Schulz, born in 1981, is Professor for Artificial Intelligence in Genome Research and has set up the new Institute for Computational Genomic Medicine. He studied bioinformatics in Berlin and earned his doctoral degree at the Max Planck Institute for Molecular Genetics in 2010, after which he spent some time as a postdoctoral researcher at Carnegie Mellon University in Pittsburgh, USA. Schulz was then a group leader at the Max Planck Institute for Informatics and at Saarland University in Saarbrücken before joining the Institute of Cardiovascular Regeneration at Goethe University Frankfurt in 2018. Schulz is a member of the Cluster of Excellence “Cardio-Pulmonary Institute (CPI)” and the German Center for Cardiovascular Research (DZHK).
marcel.schulz@em.uni-frankfurt.de

Photo: private

The author / Andreas Lorenz-Meyer, born in 1974, lives in the Palatinate and has been working as a freelance journalist for 16 years. His areas of specialization are climate research, renewable energies, digitalization and biology. He publishes in daily newspapers, specialist newspapers, university and youth magazines.
andreas.lorenz.meyer@nachhaltige-zukunft.de

Further issues of Forschung Frankfurt
