Credit: University of Toronto Engineering
Frey’s team used computational deep learning techniques to train a system that mimics the process of splicing in the cell (left panel). Features such as motifs, RNA secondary structures and nucleosome positions are computationally determined from the DNA sequence (right panel); these features are combined to detect complex patterns, and the patterns are combined again to predict how splicing will occur for the exon within the DNA sequence. The effect of a DNA mutation is assessed by applying the system to the sequence with and without the mutation and measuring the change in the computed splicing level. This procedure enables the identification of potentially harmful mutations across a wide range of diseases. Frey’s team applied it to spinal muscular atrophy, cancers and autism and in each case identified previously unknown genetic determinants of the disease.
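
The with-and-without-mutation comparison described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `apply_substitution`, `mutation_effect` and `toy_predictor` are hypothetical names, and `toy_predictor` is a trivial placeholder standing in for the trained system, not the team's model.

```python
from typing import Callable


def apply_substitution(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based index pos replaced by alt."""
    if not 0 <= pos < len(seq):
        raise IndexError("position outside sequence")
    return seq[:pos] + alt + seq[pos + 1:]


def mutation_effect(predict: Callable[[str], float],
                    seq: str, pos: int, alt: str) -> float:
    """Predicted splicing level of the mutated sequence minus the reference."""
    mutated = apply_substitution(seq, pos, alt)
    return predict(mutated) - predict(seq)


def toy_predictor(seq: str) -> float:
    """Placeholder (assumption): fraction of G/C bases, NOT a splicing model."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    # Score a single A->G substitution at index 0 of a made-up sequence.
    print(mutation_effect(toy_predictor, "AAAA", 0, "G"))
```

Any function that maps a sequence to a predicted splicing level could be passed in place of `toy_predictor`; a change with a large absolute value would flag the mutation as a candidate for closer study.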