David Sankoff


Professor, Department of Mathematics and Statistics, University of Ottawa, Canada

Title: The rise and fall of massive paralogy in plant genomes after recurrent whole genome duplication and subsequent waves of gene loss

Abstract: The evolution of plants over many tens of millions of years evidences repeated cycles of genome duplication and fractionation, the genome-wide process of losing one gene per duplicate pair. A major type of data in the study of these processes is the frequency distribution of similarities between the two genes, over all the duplicate (paralogous) pairs in the genome.

We develop a special birth-and-death model as well as inference methods for these processes. Our model is quite general, accounting for repeated duplication, triplication or other multiplication events, as well as a general fractionation rate in any time period among multiple progeny of a single gene. It also has a biologically and combinatorially well-motivated way of handling the tendency for at least one sibling to survive fractionation. We show how the method settles previously unexplored questions about the expected number of gene pairs tracing their ancestry back to each multiplication event. We illustrate the algebraic concepts inherent in our models and methods on Brassica rapa, whose evolutionary history is well known. Finally, we demonstrate the quantitative analysis of high-similarity gene pairs and triples to confirm the known ploidies of events in the lineage of B. rapa.
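
As a rough illustration of the kind of quantity discussed above (the number of gene pairs tracing back to each multiplication event), the following toy simulation may help. It is only a sketch under assumptions that are not taken from the talk: fractionation is applied once, right after each event; each of the r sibling copies is kept independently with a hypothetical probability keep_prob; and the "at least one sibling survives" constraint is imposed by simple resampling. The function names and parameters are invented for illustration and are not the speaker's model or inference method.

```python
import random
from itertools import combinations


def fractionate(r, keep_prob, rng):
    """Number of survivors among r siblings (toy rule, an assumption).

    Each copy is kept with probability keep_prob (must be > 0); the draw is
    repeated until at least one sibling survives.
    """
    while True:
        k = sum(rng.random() < keep_prob for _ in range(r))
        if k >= 1:
            return k


def simulate_lineages(ploidies, keep_prob, rng):
    """Follow one ancestral gene through successive multiplication events.

    ploidies[i] is the multiplicity of event i (2 = duplication,
    3 = triplication, ...). Each surviving gene is returned as a tuple of
    sibling labels, one label per event.
    """
    genes = [()]
    for r in ploidies:
        next_genes = []
        for g in genes:
            k = fractionate(r, keep_prob, rng)
            next_genes.extend(g + (j,) for j in range(k))
        genes = next_genes
    return genes


def pairs_by_event(genes, n_events):
    """Count surviving gene pairs by the event at which they diverged."""
    counts = [0] * n_events
    for a, b in combinations(genes, 2):
        # The first position where the labels differ is the event that
        # created this pair.
        i = next(i for i in range(n_events) if a[i] != b[i])
        counts[i] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    ploidies = [2, 3]  # hypothetical history: a duplication, then a triplication
    totals = [0] * len(ploidies)
    n_runs = 10_000
    for _ in range(n_runs):
        genes = simulate_lineages(ploidies, 0.6, rng)
        for i, c in enumerate(pairs_by_event(genes, len(ploidies))):
            totals[i] += c
    # Monte Carlo estimate of the expected number of pairs per event
    print([t / n_runs for t in totals])
```

Averaging pairs_by_event over many ancestral genes gives a Monte Carlo estimate of the expected pair count per event under these toy assumptions; an analytical model of the kind described in the abstract would presumably yield such expectations in closed form.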