St. Jude Scientists Develop BOUQUET Algorithm to Map 3D Gene Regulation

Scientists at St. Jude Children's Research Hospital have developed a new algorithm named BOUQUET to study the three-dimensional (3D) molecular machinery that governs gene expression. Studies have often treated the genome as a linear sequence, even though DNA and its associated proteins function in 3D.

BOUQUET uses machine learning to reveal that specific sets of genes and their regulatory elements can interact within protein condensates, the high-density, membraneless droplets found within cell nuclei. The finding offers new insight into how cells regulate the genes that establish their specialized identities.

Unlocking 3D Enhancer Architecture

The algorithm addresses a long-standing challenge: identifying comprehensive sets of enhancers, the DNA elements that activate gene expression, together with their accompanying proteins. The task is difficult because enhancers can be located thousands of DNA bases away from their target genes.

BOUQUET employs a machine learning-based graph theory framework that specifically considers the 3D architecture of enhancers. This allows it to pinpoint genes that are likely to be located inside transcriptional protein condensates.
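To make the graph idea concrete, here is a minimal illustrative sketch in Python. It is not BOUQUET's actual method, which is not described in this article. It assumes a hypothetical list of 3D contacts between enhancers and genes, treats each contact as an edge in a graph, and groups elements into communities by taking connected components. The element names and the function `enhancer_communities` are invented for illustration.

```python
from collections import defaultdict


def enhancer_communities(contacts):
    """Group elements into communities (connected components of the contact graph).

    `contacts` is a list of (element_a, element_b) pairs, each a hypothetical
    3D contact between two genomic elements such as an enhancer and a gene.
    """
    graph = defaultdict(set)
    for a, b in contacts:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    communities = []
    # Visit nodes in sorted order so the output order is deterministic.
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first traversal collects everything reachable from `node`.
        stack = [node]
        seen.add(node)
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        communities.append(component)
    return communities


# Hypothetical contacts: two enhancers touch gene_A, one of them also touches gene_B.
contacts = [
    ("enhancer_1", "gene_A"),
    ("enhancer_2", "gene_A"),
    ("enhancer_2", "gene_B"),
    ("enhancer_3", "gene_C"),
]
communities = enhancer_communities(contacts)
```

In this toy input, gene_A and gene_B fall into the same community because they share enhancer_2, even though no contact links them directly. A real analysis would likely use weighted contacts and a more selective community-detection method than plain connected components.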

According to Brian Abraham, a corresponding author of the study, "This method quantifies the activating protein apparatus associated with each gene, allowing for the prediction of gene expression from protein binding maps and the identification of genes interacting with transcriptional condensates."

From Super-Enhancers to Enhancer Communities

Enhancers play a crucial role in activating gene expression through protein binding and by making contact with target genes. Previous research by Abraham's team had identified 'super-enhancers' as linear groups of enhancers that are significantly involved in controlling cell identity.

BOUQUET expands upon this concept by integrating protein binding maps to define what the team calls 'enhancer communities.' The research team is believed to be the first to demonstrate a quantitative correlation between enhancer/protein binding patterns and gene expression.

Unveiling 3D-Super-Enhancers and Co-Transcription

The communities identified by the Abraham lab are considered fundamental units of gene regulation, as their components exhibit correlated activities. Those communities with the highest protein levels were designated '3D-super-enhancers,' reflecting their relationship to the previously identified linear super-enhancers.
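The ranking step can be pictured with a short Python sketch. This is an assumption-laden illustration, not the published procedure: it supposes each enhancer carries a hypothetical protein-binding signal value, sums those values per community, and labels the top-ranked communities. The function `rank_communities`, the `top_n` cutoff, and all numbers are invented for illustration; the article does not say how the real threshold is chosen.

```python
def rank_communities(communities, signal, top_n=1):
    """Rank communities by summed protein-binding signal.

    `communities` is a list of sets of element names, `signal` maps element
    names to a hypothetical binding signal (missing elements count as 0.0),
    and `top_n` is an assumed cutoff for the highest-signal communities.
    Returns (top, rest), each a list of (total_signal, sorted_members).
    """
    totals = [
        (sum(signal.get(member, 0.0) for member in members), sorted(members))
        for members in communities
    ]
    # Highest total signal first.
    totals.sort(key=lambda item: item[0], reverse=True)
    return totals[:top_n], totals[top_n:]


# Hypothetical communities and binding signals.
example_communities = [
    {"enhancer_3", "gene_B"},
    {"enhancer_1", "enhancer_2", "gene_A"},
]
example_signal = {"enhancer_1": 5.0, "enhancer_2": 7.5, "enhancer_3": 2.0}

top, rest = rank_communities(example_communities, example_signal, top_n=1)
```

Here the community containing gene_A ranks first because its enhancers carry the most combined signal, which mirrors the idea of designating the most protein-rich communities as 3D-super-enhancers.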

The study's findings indicated that all genes previously known to interact with transcriptional condensates were located within these 3D-super-enhancers. Furthermore, the number of these protein-rich communities matched earlier counts of transcriptional condensates.

Researchers observed instances where two genes from the same community, separated by half a million base pairs, shared the same condensate and were co-transcribed within it. The two genes therefore experienced the same biochemical and transcriptional environment at the same time, a level of coordinated gene regulation that had not been seen before.

Implications for Disease and Beyond

Understanding the molecular machinery that controls cell identity through transcription is critically important, especially since dysregulated transcription is a central feature of malignant cell identity.

This research provides a valuable new tool to investigate whether condensates might control the expression of disease-causing genes. The study received support from the Transcription Collaborative of St. Jude Children's Research Hospital and ALSAC.