Synthetic DNA sequences

2018-

An important and largely unsolved problem in synthetic biology is how to control gene expression levels through cellular programming. Endogenously, such control is achieved by regulatory elements such as promoters, enhancers and insulators. A hallmark of these elements is their accessibility to factors that promote transcription of their cognate genes. To achieve cellular programming by manipulating regulatory elements, it is therefore desirable to dictate the accessibility patterns of those elements across cellular conditions of interest.

We propose a fully data-driven framework for designing synthetic human sequence elements with predicted cell-context-specific chromatin accessibility behavior. We use a comprehensively annotated, high-resolution map of accessible genomic elements as the basis for a training set, from which we learn and then apply the foundational rules of how regulatory elements are encoded in the human genome. This approach yields highly variable pools of synthetic sequences that are not observed in nature.

We use a supervised adaptation strategy to evolve these sequences towards multiple pre-defined cellular contexts. We show that this tuning process not only retains characteristics of endogenous regulatory sequences, but often strengthens them through an increased abundance of relevant regulatory signals.
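
To make the idea of tuning a sequence towards a target context concrete, here is a minimal toy sketch. It is not the method used in this project: the source does not specify the model or the optimization procedure. The scoring function `toy_score`, the motif `GATA`, and the greedy single-base mutation loop in `tune` are all hypothetical stand-ins chosen for illustration. In a real setting the score would presumably come from a trained model of cell-context-specific accessibility.

```python
import random

BASES = "ACGT"


def toy_score(seq: str, motif: str = "GATA") -> int:
    """Hypothetical stand-in for a trained accessibility model.

    Sums, over every window of the sequence, the number of positions
    that match the motif. This gives a smooth score that rewards
    partial matches as well as full ones.
    """
    k = len(motif)
    return sum(
        sum(1 for a, b in zip(seq[i:i + k], motif) if a == b)
        for i in range(len(seq) - k + 1)
    )


def tune(seq: str, score_fn, n_steps: int = 500, seed: int = 0):
    """Greedy hill climbing: propose single-base changes, keep improvements."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    best, best_score = seq, score_fn(seq)
    for _ in range(n_steps):
        pos = rng.randrange(len(best))
        base = rng.choice(BASES)
        if base == best[pos]:
            continue  # proposal would not change the sequence
        candidate = best[:pos] + base + best[pos + 1:]
        cand_score = score_fn(candidate)
        if cand_score > best_score:  # accept only strict improvements
            best, best_score = candidate, cand_score
    return best, best_score


start = "C" * 40  # arbitrary starting sequence with a toy score of 0
tuned, tuned_score = tune(start, toy_score)
print(tuned, tuned_score)
```

The sketch only shows the general shape of score-guided tuning: a candidate sequence is repeatedly modified, and modifications are kept when a scoring function for the desired context improves. Swapping `toy_score` for a different scoring function changes the target without changing the loop.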

Our approach and its current results provide a foundation for subsequent experimental validation through the targeted integration of synthetic sequences in genomic loci in relevant cellular conditions.

Code available on GitHub

Work with Peter Bromley.

Wouter Meuleman
Principal Investigator

My research interests include computational (epi)genomics, genome organization, and data visualization.