<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Louise A. Huuki-Myers</title>
<link>https://lahuuki.github.io/blog.html</link>
<atom:link href="https://lahuuki.github.io/blog.xml" rel="self" type="application/rss+xml"/>
<description>Louise&#39;s personal webpage &amp; blog</description>
<generator>quarto-1.8.27</generator>
<lastBuildDate>Wed, 09 Apr 2025 00:00:00 GMT</lastBuildDate>
<item>
  <title>Deconvolution Benchmark: TL;DR</title>
  <dc:creator>Louise A. Huuki-Myers</dc:creator>
  <link>https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/</link>
  <description><![CDATA[ 




<section id="introduction" class="level1">
<h1>Introduction</h1>
<p>This blog post provides a high-level summary of our paper <a href="https://doi.org/10.1186/s13059-025-03552-3">“Benchmark of cellular deconvolution methods using a multi-assay dataset from postmortem human prefrontal cortex”</a> published in <em>Genome Biology</em> in April 2025 <span class="citation" data-cites="huuki-myers">(Huuki-Myers et al., n.d.)</span>.</p>
<p>In this deconvolution benchmark project we set out to determine the most accurate method for predicting cell type composition in bulk RNA-seq data from brain tissue. We also evaluated methods for selecting marker genes, and introduced the <em>MeanRatio</em> method for marker gene selection. The dataset developed for this experiment, <em>MeanRatio</em> functions, and other helpful tools for deconvolution are available in the <a href="https://bioconductor.org/packages/devel/bioc/html/DeconvoBuddies.html"><em>DeconvoBuddies</em></a> Bioconductor package.</p>
<section id="what-is-deconvolution" class="level2">
<h2 class="anchored" data-anchor-id="what-is-deconvolution">What is deconvolution?</h2>
<p>Complex tissue is made up of different cell types that express genes at different levels. In bulk RNA-seq this heterogeneity of the tissue is obscured, and the gene expression measurements represent a mixture of all of the cells and cell types in the sample. Differences in the cell type composition between samples, either technical or biologically real, can confound downstream analysis such as differential expression.</p>
<p>Deconvolution is an analysis that infers the cell type composition of bulk RNA-seq data, using gene expression profiles from single cell data.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/Deconvolution.png" class="img-fluid figure-img"></p>
<figcaption>Cartoon overview of deconvolution</figcaption>
</figure>
</div>
</section>
<section id="how-to-preform-deconvolution" class="level2">
<h2 class="anchored" data-anchor-id="how-to-preform-deconvolution">How to perform deconvolution?</h2>
<p>To run deconvolution you’ll need:<br>
</p>
<ol type="1">
<li><p>Your bulk RNA-seq gene expression data</p></li>
<li><p>A reference single cell RNA-seq gene expression data set from the same tissue type</p></li>
<li><p>A deconvolution method (computational algorithm)</p></li>
</ol>
</section>
<section id="available-deconvolution-methods" class="level2">
<h2 class="anchored" data-anchor-id="available-deconvolution-methods">Available Deconvolution Methods</h2>
<p>Reviewing the literature, we found 20+ deconvolution methods available. This presents quite an overwhelming choice for researchers! Are there big differences between methods? If so, how can we choose the most accurate method?</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/method_choice.png" class="img-fluid figure-img" width="395"></p>
<figcaption>Choosing a method</figcaption>
</figure>
</div>
</section>
<section id="existing-benchmarks" class="level2">
<h2 class="anchored" data-anchor-id="existing-benchmarks">Existing Benchmarks</h2>
<p>Benchmark studies aim to test and rank the performance of available methods. There have been several benchmark studies on deconvolution methods, both within papers presenting new methods and as separate studies. However, there is not much of a consensus on which method is the most accurate:</p>
<p><strong>Benchmarking results from different papers on “real” data</strong></p>
<ul>
<li><p><strong>MuSiC paper</strong> <span class="citation" data-cites="wang2019">(Wang et al. 2019)</span><strong>:</strong> MuSiC &gt; NNLS &gt; BSEQ-sc &gt; CIBERSORT</p></li>
<li><p><strong>Bisque paper</strong> <span class="citation" data-cites="jew2020">(Jew et al. 2020)</span><strong>:</strong> Bisque &gt; MuSiC &gt; CIBERSORT&nbsp;</p></li>
<li><p><strong>Cobos benchmark</strong> <span class="citation" data-cites="avilacobos2020">(Avila Cobos et al. 2020)</span><strong>:</strong> DWLS &gt; MuSiC &gt; Bisque &gt; deconvoSeq</p></li>
<li><p><strong>Jin et al.&nbsp;benchmark</strong> <span class="citation" data-cites="jin2021">(Jin and Liu 2021)</span><strong>:</strong> CIBERSORT, MuSiC &gt; EPIC*, TIMER, DeconRNAseq</p></li>
<li><p><strong>Dai et al.&nbsp;benchmark</strong> <span class="citation" data-cites="dai">(Dai et al., n.d.)</span><strong>:</strong> dtangle &gt; Bisque &gt; other methods</p></li>
</ul>
<p>Additionally, the Avila Cobos et al. 2020 benchmark study shows that different methods perform best on different data sets <span class="citation" data-cites="avilacobos2020">(Avila Cobos et al. 2020)</span>.</p>
<p>A challenge in benchmark studies is producing a “ground truth” estimate for cell type composition. Often in benchmarks pseudobulk mixtures created from the single cell data are used as the bulk data, so the absolute composition is known.</p>
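<p>To make the idea concrete, here is a minimal Python sketch of building one pseudobulk sample from labeled single cell counts. The toy counts, cell labels, and the <code>make_pseudobulk()</code> helper are hypothetical and only illustrate the principle; they are not taken from any of the benchmarks above.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"># Toy single cell counts (hypothetical): each cell is (cell type, counts per gene)
cells = [
    ("Excit", [5, 0, 1]),
    ("Excit", [4, 1, 0]),
    ("Astro", [0, 6, 2]),
    ("Micro", [1, 0, 7]),
]

def make_pseudobulk(cells):
    """Sum counts over cells and record the known cell type proportions."""
    n_genes = len(cells[0][1])
    bulk = [0] * n_genes
    n_per_type = {}
    for cell_type, counts in cells:
        # Pseudobulk expression is the gene-wise sum over all cells
        bulk = [total + c for total, c in zip(bulk, counts)]
        n_per_type[cell_type] = n_per_type.get(cell_type, 0) + 1
    # The "ground truth" composition is the fraction of cells of each type
    proportions = {k: n / len(cells) for k, n in n_per_type.items()}
    return bulk, proportions

bulk, proportions = make_pseudobulk(cells)
print(bulk)
print(proportions)</code></pre></div>
<p>Because the label of every cell is known, the composition of the mixture (here 50% Excit, 25% Astro, 25% Micro) is known exactly, which is what makes pseudobulk data convenient as a benchmark input.</p>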
<p>However, we think pseudobulk data might not be a good stand-in for real bulk RNA-seq data. It is better to use orthogonal measurements of cell type composition paired with real bulk RNA-seq data. We were also curious about the performance of methods specifically in brain RNA-seq data.</p>
<p>This motivated us to run our own deconvolution benchmark study!</p>
</section>
</section>
<section id="deconvolution-benchmark-study" class="level1">
<h1>Deconvolution Benchmark Study</h1>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/Deconvolution_compare_proportions.png" class="img-fluid figure-img"></p>
<figcaption>Benchmark study design: Use orthogonal RNAScope cell type proportions to evaluate accuracy of deconvolution methods</figcaption>
</figure>
</div>
<section id="study-design" class="level2">
<h2 class="anchored" data-anchor-id="study-design">Study Design</h2>
<p>We designed an experiment to evaluate the performance of deconvolution methods on human brain tissue, specifically the dorsolateral prefrontal cortex (DLPFC). We used consecutive slices of 22 DLPFC brain blocks from 10 neurotypical donors to create three assays:</p>
<ol type="1">
<li><p>RNAScope: orthogonal measurement of cell type compositions for six major cell types (n=25)</p></li>
<li><p>snRNA-seq: reference single nucleus data (n=19)</p></li>
<li><p>Bulk RNA-seq: using a variety of library types and RNA extraction methods (n=110)</p></li>
</ol>
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/sfigu_study_design.png" class="img-fluid" alt="Diagram of study design. A. Cartoon of brain region and brain block with consecutive slices. B. Tile plot showing available samples and quality control status"><br>
</p>
<section id="rnascope-cell-type-proportions" class="level3">
<h3 class="anchored" data-anchor-id="rnascope-cell-type-proportions">RNAScope Cell Type Proportions 🔬</h3>
<p>To obtain orthogonal measurements of cell type proportions for six major cell types in the DLPFC, we utilized multiplex single molecule fluorescent in situ hybridization (smFISH) combined with immunofluorescence (IF) using RNAScope/IF.</p>
<p>We designed two probe combinations:</p>
<ol type="1">
<li><p><strong>Star</strong> measures:</p>
<ol type="1">
<li><p>Excitatory Neurons (Excit)</p></li>
<li><p>Microglia (Micro)</p></li>
<li><p>Combined Oligodendrocytes and Oligodendrocyte Precursor Cells (OligoOPC)</p></li>
</ol></li>
<li><p><strong>Circle</strong> measures:</p>
<ol type="1">
<li><p>Inhibitory Neurons (Inhib)</p></li>
<li><p>Endothelial/Mural cells (EndoMural)</p></li>
<li><p>Astrocytes (Astro)</p></li>
</ol></li>
</ol>
<p>We used HALO to segment and label cell types, then calculated cell type proportions for each sample.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/RNAscope_crop.png" class="img-fluid figure-img"></p>
<figcaption>RNAScope/IF Experiment Design. A. Star and Circle probe combinations measure 3 cell types each. Example fluorescent images of B. Star and C. Circle. D. Bar plots of estimated cell type compositions</figcaption>
</figure>
</div>
</section>
<section id="single-nucleus-reference-dataset" class="level3">
<h3 class="anchored" data-anchor-id="single-nucleus-reference-dataset">Single Nucleus Reference dataset</h3>
<p>The snRNA-seq data was previously analyzed as part of the spatialDLPFC project (see <a href="https://doi.org/10.1126/science.adh1938">Huuki-Myers et al.</a>, or the <a href="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/#single-nucleus-rna-seq-1">previous blog post</a>, for more details). This reference consists of 56k nuclei from 19 samples with seven broad cell types.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/snRNA_overview.png" class="img-fluid figure-img"></p>
<figcaption>tSNE plot and overall cell type composition for snRNA-seq dataset</figcaption>
</figure>
</div>
</section>
<section id="bulk-rna-seq-data" class="level3">
<h3 class="anchored" data-anchor-id="bulk-rna-seq-data">Bulk RNA-seq Data</h3>
<p>For the bulk RNA-seq we were curious if using different library types (polyA or RiboZero) and RNA extractions (nuclear, cytoplasmic, or total) would impact the accuracy of deconvolution. So for each brain block we prepared one sample of each library combination.</p>
<p>Analyzing just the bulk RNA-seq data, we saw large differences in gene expression between the different preparations: principal component analysis shows the data divide by library type and RNA extraction. We suspected that these technical differences in gene expression would impact deconvolution estimates. A good deconvolution method should be robust to these differences between data types.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/bulk_overview.png" class="img-fluid figure-img"></p>
<figcaption>Tile plot showing the number of samples by library type and RNA extraction. PCA of gene expression shows PC1 separates library type and PC2 separates RNA extraction</figcaption>
</figure>
</div>
</section>
<section id="which-methods-to-test" class="level3">
<h3 class="anchored" data-anchor-id="which-methods-to-test">Which methods to test?</h3>
<p>From the large number of available methods we selected six that were top performers in other benchmark papers and that apply a range of different approaches: DWLS, Bisque, MuSiC, hspe, BayesPrism, and CIBERSORTx (detailed below).</p>
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/methods.png" class="img-fluid"></p>
<table class="caption-top table">
<colgroup>
<col style="width: 18%">
<col style="width: 7%">
<col style="width: 14%">
<col style="width: 29%">
<col style="width: 14%">
<col style="width: 13%">
</colgroup>
<thead>
<tr class="header">
<th>Method</th>
<th>Citation</th>
<th>Approach</th>
<th>Marker Gene Selection</th>
<th>Availability</th>
<th>Top Benchmark Performance</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>DWLS</strong><br>
(Dampened weighted least-squares)</td>
<td><span class="citation" data-cites="tsoucas2019">(Tsoucas et al. 2019)</span></td>
<td>weighted least squares</td>
<td>-</td>
<td>R package on CRAN</td>
<td><span class="citation" data-cites="avilacobos2020">(Avila Cobos et al. 2020)</span></td>
</tr>
<tr class="even">
<td><strong>Bisque</strong></td>
<td><span class="citation" data-cites="jew2020">(Jew et al. 2020)</span></td>
<td>Bias correction: Assay</td>
<td>-</td>
<td>R package on GitHub</td>
<td><span class="citation" data-cites="dai">(Dai et al., n.d.)</span></td>
</tr>
<tr class="odd">
<td><strong>MuSiC</strong><br>
(Multi-subject Single-cell)</td>
<td><span class="citation" data-cites="wang2019">(Wang et al. 2019)</span></td>
<td>Bias correction: Source</td>
<td>Weights Genes</td>
<td>R package on GitHub</td>
<td><span class="citation" data-cites="jin2021">(Jin and Liu 2021)</span></td>
</tr>
<tr class="even">
<td><strong>BayesPrism</strong></td>
<td><span class="citation" data-cites="chu2022">(Chu et al. 2022)</span></td>
<td>Bayesian</td>
<td>Pairwise t-test</td>
<td>Webtool, R package on GitHub</td>
<td><span class="citation" data-cites="hippen2023">(Hippen et al. 2023)</span></td>
</tr>
<tr class="odd">
<td><strong>hspe</strong> (<strong>dtangle</strong>)<br>
(hybrid-scale proportion estimation)</td>
<td><span class="citation" data-cites="hunt2019">(Hunt et al. 2019)</span></td>
<td>High collinearity adjustment</td>
<td>Multiple options; default is “ratio” (1vALL mean expression ratio)</td>
<td>R package on GitHub</td>
<td><span class="citation" data-cites="dai">(Dai et al., n.d.)</span></td>
</tr>
<tr class="even">
<td><strong>CIBERSORTx</strong></td>
<td><span class="citation" data-cites="newman2019">(Newman et al. 2019)</span></td>
<td>Machine Learning</td>
<td>Differential Gene expression</td>
<td>Webtool, Docker Image</td>
<td><span class="citation" data-cites="jin2021">(Jin and Liu 2021)</span></td>
</tr>
</tbody>
</table>
</section>
</section>
<section id="marker-gene-selection" class="level2">
<h2 class="anchored" data-anchor-id="marker-gene-selection">Marker Gene Selection</h2>
<p>A strategy to improve accuracy in deconvolution is to limit the analysis to a set of cell type marker genes, reducing noise in the analysis. To help select cell type specific marker genes we have developed the <em>Mean Ratio</em> method.</p>
<p>The <em>Mean Ratio</em> method works by selecting genes with large differences between gene expression in the target cell type and the closest non-target cell type. We calculate the <code>MeanRatio</code> for a target cell type for each gene by <strong>dividing the mean expression of the target cell type by the mean expression of the next highest non-target cell type</strong>. Genes with the highest <code>MeanRatio</code> values are selected as marker genes.</p>
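<p>The calculation for a single gene can be sketched in a few lines of Python. This is only an illustration of the ratio described above, using made-up expression values and a hypothetical <code>mean_ratio()</code> helper; it is not the <code>get_mean_ratio()</code> function from DeconvoBuddies, which is an R function with its own inputs and outputs.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">from statistics import mean

def mean_ratio(expression, cell_types, target):
    """Mean Ratio of one gene for a target cell type.

    expression: per-cell expression values for a single gene
    cell_types: cell type label for each cell (same order as expression)
    """
    # Group expression values by cell type
    by_type = {}
    for value, label in zip(expression, cell_types):
        by_type.setdefault(label, []).append(value)

    # Mean expression in the target and in every non-target cell type
    target_mean = mean(by_type[target])
    other_means = [mean(v) for k, v in by_type.items() if k != target]

    # Divide by the highest non-target mean (the closest cell type).
    # Assumes that mean is not zero; this sketch does not handle that case.
    return target_mean / max(other_means)

# Toy data (hypothetical): one gene measured in six cells
expression = [8.0, 10.0, 2.0, 4.0, 1.0, 1.0]
cell_types = ["Astro", "Astro", "Micro", "Micro", "Oligo", "Oligo"]

ratio = mean_ratio(expression, cell_types, "Astro")
print(ratio)</code></pre></div>
<p>In this toy example the gene has a mean of 9 in Astro and a mean of 3 in Micro, the next highest cell type, giving a ratio of 3. A real analysis would repeat this for every gene and every cell type, then rank genes by the ratio.</p>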
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/get_mean_ratio.png" class="img-fluid figure-img"></p>
<figcaption>Illustration of <em>Mean Ratio</em> marker selection method, and heatmap of top <em>Mean Ratio</em> marker genes</figcaption>
</figure>
</div>
<p>For more information about selecting marker genes with <em>Mean Ratio</em> see <a href="https://research.libd.org/DeconvoBuddies/articles/Marker_Finding.html">Finding Marker Genes with DeconvoBuddies</a>.</p>
<p>In our benchmark we found that methods responded differently and unpredictably to different marker gene sets, but top methods performed better using the top 25 <em>Mean Ratio</em> marker genes for each cell type.</p>
</section>
<section id="method-performance" class="level2">
<h2 class="anchored" data-anchor-id="method-performance">Method Performance 🏆</h2>
<p>On to the main event: time to <strong>evaluate the deconvolution methods!</strong></p>
<p>We performed deconvolution on the 110 bulk RNA-seq samples with each of the six selected methods, using the top 25 <em>Mean Ratio</em> marker genes.</p>
<p>We then compared the estimated cell type proportions with the RNAScope cell type proportions. We calculated <strong>Pearson’s correlation and the root mean squared error (RMSE)</strong> between the two. Methods with high correlation and low RMSE are the most accurate.</p>
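<p>For readers who want to see the two metrics spelled out, here is a small Python sketch. The proportions are invented for illustration and the <code>pearson()</code> and <code>rmse()</code> helpers are hypothetical; this is not the analysis code from the paper.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">from math import sqrt

def pearson(x, y):
    """Pearson's correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rmse(x, y):
    """Root mean squared error between two equal-length lists."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Toy proportions (hypothetical) for four cell types in one sample
measured = [0.40, 0.30, 0.20, 0.10]    # e.g. from an orthogonal assay
estimated = [0.35, 0.35, 0.20, 0.10]   # e.g. from a deconvolution method

r = pearson(measured, estimated)
error = rmse(measured, estimated)
print(round(r, 3), round(error, 3))</code></pre></div>
<p>The two metrics are complementary: correlation rewards estimates that rise and fall with the measured proportions, while RMSE penalizes estimates that are far from the measured values in absolute terms.</p>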
<p><strong>Overall <em>Bisque</em> and <em>hspe</em> were the top performing methods.</strong> 🏆</p>
<p>These were also the top methods in the Dai et al. benchmark, which also examined brain data <span class="citation" data-cites="dai">(Dai et al., n.d.)</span>.</p>
<p><em>Bisque</em> performed slightly better in polyA data, <em>hspe</em> slightly better in RiboZero data. <em>CIBERSORTx</em> was a close third place, performing similarly to <em>Bisque</em> and <em>hspe</em> in polyA data.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/benchmark.png" class="img-fluid figure-img"></p>
<figcaption>A. Scatter plot of RNAScope proportions vs.&nbsp;method estimated proportions. B. Pearson’s correlation for each method over bulk RNA-seq library combinations; point size corresponds to RMSE</figcaption>
</figure>
</div>
</section>
<section id="other-results" class="level2">
<h2 class="anchored" data-anchor-id="other-results">Other Results</h2>
<p>Above I have highlighted the main study design and conclusions of our deconvolution benchmark. In the paper we explored many more facets of deconvolution method performance. Some other results to highlight:</p>
<ul>
<li><p><em>hspe</em> is sensitive to marker gene selection</p></li>
<li><p><em>Bisque</em> can perform poorly with &lt; 4 donors</p></li>
<li><p><em>Bisque</em> and <em>hspe</em> are unaffected by including “case” donors in the snRNA-seq reference</p></li>
<li><p><em>Bisque</em> is biased toward the cell type proportions in the reference snRNA-seq data set</p></li>
<li><p><em>Bisque</em> and <em>hspe</em> had relatively fast runtimes and low memory requirements</p></li>
</ul>
<p>Be sure to check out the paper for more! 📃<br>
<br>
</p>
</section>
</section>
<section id="deconvobuddies" class="level1">
<h1>DeconvoBuddies</h1>
<p><img src="https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/images/deconvobuddies_logo.png" class="img-fluid"></p>
<p>In conjunction with this study we have developed a Bioconductor package <a href="https://research.libd.org/DeconvoBuddies">DeconvoBuddies</a>.</p>
<p>DeconvoBuddies is currently on the <a href="https://bioconductor.org/packages/devel/bioc/html/DeconvoBuddies.html">devel branch</a> and will be included in the next release of Bioconductor (April 2025).</p>
<p>The main features of the package are:<br>
</p>
<p><strong>Find Marker Genes</strong></p>
<ul>
<li><p>Implements Mean Ratio marker gene selection&nbsp;<code>get_mean_ratio()</code></p></li>
<li><p>Implements 1 vs.&nbsp;All marker gene selection <code>findMarkers_1vAll()</code></p></li>
</ul>
<p><strong>Plotting tools</strong></p>
<ul>
<li><p>Quickly plot gene expression over cell types (or other category)&nbsp;<code>plot_gene_express()</code></p></li>
<li><p>Plot top marker genes with annotated statistics <code>plot_marker_express()</code></p></li>
<li><p>Plot composition bar plots of deconvolution outputs&nbsp;<code>plot_composition_bar()</code></p></li>
</ul>
<p><strong>Access Data</strong></p>
<ul>
<li><p>Access paired data from consecutive slices of human DLPFC, used in deconvolution benchmark <code>fetch_deconvo_data()</code></p>
<ul>
<li>Access the RNAScope, snRNA-seq, and bulk RNA-seq data described above</li>
</ul></li>
</ul>
</section>
<section id="truly-tldr" class="level1">
<h1>Truly TL;DR</h1>
<p>In this benchmark we used a multi-assay dataset from the human DLPFC to compare deconvolution performance in six top methods. RNAScope/IF cell type estimates were utilized as an orthogonal measurement of the true cell type composition. We developed the <em>Mean Ratio</em> method to select highly specific cell type marker genes.</p>
<p><strong>The top performing deconvolution methods in brain were <em>hspe</em> <span class="citation" data-cites="hunt2019">(Hunt et al. 2019)</span> and <em>Bisque</em> <span class="citation" data-cites="jew2020">(Jew et al. 2020)</span>.</strong> 🏆</p>
<p>We found that many factors, such as the number of reference donors, marker gene selection, and the library type of the bulk RNA-seq, can impact the performance of deconvolution methods. The dataset, <em>MeanRatio</em> function, and other useful functions for deconvolution are included in our Bioconductor package <a href="https://research.libd.org/DeconvoBuddies">DeconvoBuddies</a>.</p>
<p>Be sure to check out the paper for the full exploration of deconvolution method performance <span class="citation" data-cites="huuki-myers">(Huuki-Myers et al., n.d.)</span>! <a href="https://doi.org/10.1186/s13059-025-03552-3" class="uri">https://doi.org/10.1186/s13059-025-03552-3</a></p>



</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-avilacobos2020" class="csl-entry">
Avila Cobos, Francisco, José Alquicira-Hernandez, Joseph E. Powell, Pieter Mestdagh, and Katleen De Preter. 2020. <span>“Benchmarking of Cell Type Deconvolution Pipelines for Transcriptomics Data.”</span> <em>Nature Communications</em> 11 (November): 5650. <a href="https://doi.org/10.1038/s41467-020-19015-1">https://doi.org/10.1038/s41467-020-19015-1</a>.
</div>
<div id="ref-chu2022" class="csl-entry">
Chu, Tinyi, Zhong Wang, Dana Pe’er, and Charles G. Danko. 2022. <span>“Cell Type and Gene Expression Deconvolution with BayesPrism Enables Bayesian Integrative Analysis Across Bulk and Single-Cell RNA Sequencing in Oncology.”</span> <em>Nature Cancer</em> 3 (4): 505–17. <a href="https://doi.org/10.1038/s43018-022-00356-3">https://doi.org/10.1038/s43018-022-00356-3</a>.
</div>
<div id="ref-dai" class="csl-entry">
Dai, Rujia, Tianyao Chu, Ming Zhang, Xuan Wang, Alexandre Jourdon, Feinan Wu, Jessica Mariani, et al. n.d. <span>“Evaluating Performance and Applications of Sample-Wise Cell Deconvolution Methods on Human Brain Transcriptomic Data.”</span> <a href="https://doi.org/10.1101/2023.03.13.532468">https://doi.org/10.1101/2023.03.13.532468</a>.
</div>
<div id="ref-hippen2023" class="csl-entry">
Hippen, Ariel A., Dalia K. Omran, Lukas M. Weber, Euihye Jung, Ronny Drapkin, Jennifer A. Doherty, Stephanie C. Hicks, and Casey S. Greene. 2023. <span>“Performance of Computational Algorithms to Deconvolve Heterogeneous Bulk Ovarian Tumor Tissue Depends on Experimental Factors.”</span> <em>Genome Biology</em> 24 (1): 239. <a href="https://doi.org/10.1186/s13059-023-03077-7">https://doi.org/10.1186/s13059-023-03077-7</a>.
</div>
<div id="ref-hunt2019" class="csl-entry">
Hunt, Gregory J, Saskia Freytag, Melanie Bahlo, and Johann A Gagnon-Bartsch. 2019. <span>“Dtangle: Accurate and Robust Cell Type Deconvolution.”</span> <em>Bioinformatics</em> 35 (12): 2093–99. <a href="https://doi.org/10.1093/bioinformatics/bty926">https://doi.org/10.1093/bioinformatics/bty926</a>.
</div>
<div id="ref-huuki-myers" class="csl-entry">
Huuki-Myers, Louise A., Kelsey D. Montgomery, Sang Ho Kwon, Sophia Cinquemani, Nicholas J. Eagles, Daianna Gonzalez-Padilla, Sean K. Maden, et al. n.d. <span>“Benchmark of Cellular Deconvolution Methods Using a Multi-Assay Reference Dataset from Postmortem Human Prefrontal Cortex.”</span> <a href="https://doi.org/10.1101/2024.02.09.579665">https://doi.org/10.1101/2024.02.09.579665</a>.
</div>
<div id="ref-jew2020" class="csl-entry">
Jew, Brandon, Marcus Alvarez, Elior Rahmani, Zong Miao, Arthur Ko, Kristina M. Garske, Jae Hoon Sul, Kirsi H. Pietiläinen, Päivi Pajukanta, and Eran Halperin. 2020. <span>“Accurate Estimation of Cell Composition in Bulk Expression Through Robust Integration of Single-Cell Information.”</span> <em>Nature Communications</em> 11 (1): 1971. <a href="https://doi.org/10.1038/s41467-020-15816-6">https://doi.org/10.1038/s41467-020-15816-6</a>.
</div>
<div id="ref-jin2021" class="csl-entry">
Jin, Haijing, and Zhandong Liu. 2021. <span>“A Benchmark for RNA-Seq Deconvolution Analysis Under Dynamic Testing Environments.”</span> <em>Genome Biology</em> 22 (1): 102. <a href="https://doi.org/10.1186/s13059-021-02290-6">https://doi.org/10.1186/s13059-021-02290-6</a>.
</div>
<div id="ref-newman2019" class="csl-entry">
Newman, Aaron M., Chloé B. Steen, Chih Long Liu, Andrew J. Gentles, Aadel A. Chaudhuri, Florian Scherer, Michael S. Khodadoust, et al. 2019. <span>“Determining Cell Type Abundance and Expression from Bulk Tissues with Digital Cytometry.”</span> <em>Nature Biotechnology</em> 37 (7): 773–82. <a href="https://doi.org/10.1038/s41587-019-0114-2">https://doi.org/10.1038/s41587-019-0114-2</a>.
</div>
<div id="ref-tsoucas2019" class="csl-entry">
Tsoucas, Daphne, Rui Dong, Haide Chen, Qian Zhu, Guoji Guo, and Guo-Cheng Yuan. 2019. <span>“Accurate Estimation of Cell-Type Composition from Gene Expression Data.”</span> <em>Nature Communications</em> 10 (July): 2975. <a href="https://doi.org/10.1038/s41467-019-10802-z">https://doi.org/10.1038/s41467-019-10802-z</a>.
</div>
<div id="ref-wang2019" class="csl-entry">
Wang, Xuran, Jihwan Park, Katalin Susztak, Nancy R. Zhang, and Mingyao Li. 2019. <span>“Bulk Tissue Cell Type Deconvolution with Multi-Subject Single-Cell Expression Reference.”</span> <em>Nature Communications</em> 10 (1): 380. <a href="https://doi.org/10.1038/s41467-018-08023-x">https://doi.org/10.1038/s41467-018-08023-x</a>.
</div>
</div></section></div> ]]></description>
  <category>paper preview</category>
  <category>Deconvolution</category>
  <category>single cell</category>
  <guid>https://lahuuki.github.io/posts/2025-04-09-DeconvoBenchmark/</guid>
  <pubDate>Wed, 09 Apr 2025 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Spatial DLPFC: TL;DR</title>
  <dc:creator>Louise A. Huuki-Myers</dc:creator>
  <link>https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/</link>
  <description><![CDATA[ 




<section id="introduction" class="level1">
<h1>Introduction</h1>
<p>This blog post provides a high-level summary of our paper <a href="https://doi.org/10.1126/science.adh1938">“A data-driven single cell and spatial transcriptomic map of the human prefrontal cortex”</a> published in <em>Science</em> in May 2024 (aka <strong>spatialDLPFC</strong>) <span class="citation" data-cites="Huuki-Myers2024">(Huuki-Myers et al. 2024)</span>.</p>
<p>In the spatialDLPFC project we set out to learn more about the organization of the dorsolateral prefrontal cortex (aka DLPFC), its cell types, and gene expression profile 🧠.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/SpatialDLPFC_GraphAbs_revision2.4.png" class="img-fluid figure-img"></p>
<figcaption>Graphical abstract for the spatialDLPFC project published in <a href="https://www.science.org/doi/10.1126/science.adh1938"><em>Science</em></a></figcaption>
</figure>
</div>
</section>
<section id="background" class="level1">
<h1>Background</h1>
<section id="dlpfc" class="level2">
<h2 class="anchored" data-anchor-id="dlpfc">DLPFC</h2>
<p>The dorsolateral prefrontal cortex region of the brain is especially important for executive functions including working memory, cognitive flexibility, and planning. Disruptions of the DLPFC have been associated with several psychiatric and neurodevelopmental disorders, including schizophrenia and autism spectrum disorder.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/background.png" class="img-fluid figure-img"></p>
<figcaption>Location of the DLPFC, its laminar structure (illustration from <span class="citation" data-cites="House_Pansky">(House and Pansky, n.d.)</span>), and major cell types.</figcaption>
</figure>
</div>
</section>
<section id="rna-sequencing" class="level2">
<h2 class="anchored" data-anchor-id="rna-sequencing">RNA-sequencing&nbsp;</h2>
<p>One of the ways that we can understand the functions of different cell types and structures in the brain is to study what genes they express by sequencing the RNA in a tissue. Recently, several advanced transcriptomic<sup>1</sup> approaches using RNA sequencing have emerged, enhancing our ability to analyze gene expression in the brain.</p>
<script defer="" class="speakerdeck-embed" data-slide="3" data-id="044430d9444f4ce1b9e016125f22cad2" data-ratio="1.7772511848341233" src="//speakerdeck.com/assets/embed.js"></script>
<p>This LEGO brain schematic demonstrates the evolution from bulk RNA sequencing, which provides a mixture of cell types, to single cell/single nucleus RNA-seq, which reveals the transcriptional profiles of individual cell types. The latest advancement, spatial transcriptomics, links gene expression to specific anatomical locations, providing deeper insights into the relationships between brain structure and function.</p>
<section id="single-nucleus-rna-seq" class="level3">
<h3 class="anchored" data-anchor-id="single-nucleus-rna-seq">Single Nucleus RNA-seq</h3>
<p>Single nucleus or single cell RNA sequencing (snRNA-seq) enables us to examine the gene expression of individual cells or nuclei. This technique relies on uniquely barcoded gel beads that attach to a single cell or nucleus, tagging all RNA molecules from that cell. When sequenced, these tagged RNA molecules can be traced back to their original cell. Cells or nuclei are then typically clustered by their gene expression profiles to identify different cell type populations. The expression profiles and cluster assignments are often visualized using reduced dimension plots such as UMAPs or tSNE. In these plots, each point represents a cell, and the distance between points indicates their similarity<sup>2</sup>; closer points represent more similar cells, which are often of the same cell type (shown by different colors).</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/background_sn.png" class="img-fluid figure-img"></p>
<figcaption>Cartoon of 10x snRNA-seq process (via 10x Genomics), and tSNE plot</figcaption>
</figure>
</div>
<p>In this experiment we are working with nuclei, as the cell membrane is destroyed when the brain tissue is frozen. The major cell type populations to identify in the DLPFC are neurons (Excitatory and Inhibitory), glial cells (ex: Astrocytes, Microglia, Oligodendrocytes, OPC), and vascular cells (Endothelial &amp; Mural).</p>
</section>
<section id="spatially-resolved-transcriptomics-visium" class="level3">
<h3 class="anchored" data-anchor-id="spatially-resolved-transcriptomics-visium">Spatially Resolved Transcriptomics (Visium)</h3>
<p>Spatially resolved transcriptomics maps RNA to specific locations on a tissue sample, allowing us to profile gene expression across anatomical features such as blood vessels, glands, or, in our case, layers of the brain’s cortex.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/background_spatial.png" class="img-fluid figure-img"></p>
<figcaption>Cartoon of Visium spatial transcriptomics (via 10x genomics), and example spot plots</figcaption>
</figure>
</div>
<p>We used Visium slides, which feature a grid of approximately 5,000 spots arranged in a 6.5x6.5 mm area. Each spot has a unique barcode that binds to the RNA in the contacted tissue. When the RNA is sequenced, these molecules can be traced back to their specific grid locations, similar to the barcodes in snRNA-seq.&nbsp;</p>
<p>This RNA-seq data is paired with a high-definition histology image of the original tissue, providing additional information and aiding in data visualization. We can visualize the gene expression of each spot in “spot plots” using color gradients overlaid on these images. In the example above, <em>SNAP25</em>, a gene highly expressed in neurons, highlights the gray matter; <em>MBP</em> highlights white matter; and <em>PCP4</em> marks layer 5.</p>
</section>
<section id="study-design" class="level3">
<h3 class="anchored" data-anchor-id="study-design">Study Design&nbsp;</h3>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/study_design.png" class="img-fluid figure-img"></p>
<figcaption>Study design for spatialDLPFC</figcaption>
</figure>
</div>
<p>In this study we analyzed the DLPFC of ten healthy adult donors. We sampled three locations of the DLPFC: the anterior, middle, and posterior. All 30 samples were analyzed with Visium spatial transcriptomics, and 19 (about 2 from each donor) were selected for snRNA-seq.</p>
</section>
</section>
</section>
<section id="data-driven-spatial-domains" class="level1">
<h1>Data-Driven Spatial Domains</h1>
<p>An earlier Lieber Institute study of spatial transcriptomics in the DLPFC <span class="citation" data-cites="maynard_2021">(Maynard et al. 2021)</span> relied on manually annotating the known layers of the cortex based on the histological images and the expression of select genes. This dataset has been invaluable for testing methodologies in spatial transcriptomics. However, manual annotation is tedious, time-consuming, and prone to human error and bias.</p>
<p>In our current study, which builds on the previous DLPFC project, we aimed to use unsupervised clustering to annotate the layers of the DLPFC, thereby avoiding the labor-intensive process of manual annotation and potentially discovering novel or unknown layers in the brain.</p>
<p>Based on benchmarking against the manually annotated layer data, we chose <em>BayesSpace</em> as the best method for clustering spatial data. We clustered the 30 Visium slides at a large range of resolutions, from <em>k</em>=2 to 28 (<em>k</em> denotes the number of clusters). We refer to these clusters as spatial domains. To name these domains we used the syntax <img src="https://latex.codecogs.com/png.latex?Sp_%7Bk%7DD_%7Bd%7D">, where <em>k</em> is the clustering resolution and <em>d</em> is the spatial domain number, so <img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B1%7D"> is spatial domain 1 when <em>k</em>=9.</p>
<p>We found that <em>k</em>=2 did a great job separating the white matter from the gray matter. With an increasing number of clusters, the layers of the cortex begin to emerge. This brings us to a question: which level of clustering best captures biologically important layers of the DLPFC?</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/clusters.png" class="img-fluid figure-img"></p>
<figcaption>A. Histological images of three DLPFC tissue sections B. spatial clustering at k=2, 9, and 16</figcaption>
</figure>
</div>
<section id="spatial-registration-of-bayesspace-clusters" class="level2">
<h2 class="anchored" data-anchor-id="spatial-registration-of-bayesspace-clusters">Spatial Registration of <em>BayesSpace</em> Clusters</h2>
<p>To check which resolution of <em>BayesSpace</em> clusters best matches the six histological layers plus white matter, we used an analysis we developed called “spatial registration”. We will delve into the details of this analysis in a future blog post; its application is covered in this <a href="https://research.libd.org/spatialLIBD/articles/guide_to_spatial_registration.html">vignette</a>.</p>
<p>Briefly, this analysis compares the gene expression profiles of a reference set of clusters (such as spatial regions or domains, annotated features, or cell type populations; in this case the manual annotations from the pilot dataset) to a query set of clusters we want to learn more about (the <em>BayesSpace</em> clusters). The <em>t</em>-statistics from an enrichment analysis in the query and the reference set are correlated pairwise across all groups. We visualize this in a heatmap where high correlation is green and low correlation is purple. Where a query cluster has high correlation with a reference cluster, we can say the two groups are associated, and if the correlation passes our threshold we annotate the query group with the reference.</p>
<p>In the example below, <img src="https://latex.codecogs.com/png.latex?Sp_%7B7%7DD_%7B7%7D"> has a high correlation with the manual white matter annotation, so we annotate it as&nbsp;<img src="https://latex.codecogs.com/png.latex?Sp_%7B7%7DD_%7B7%7D%5Csim%20WM">. This annotation helps add biological context to our newly defined spatial domains.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/spatial_registration.jpeg" class="img-fluid figure-img"></p>
<figcaption>Example spatial registration between manual layers and <em>k</em>=7 <em>BayesSpace</em> clusters</figcaption>
</figure>
</div>
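<p>To make the idea concrete, here is a minimal sketch of the correlate-then-annotate step in plain Python. It is not the spatialLIBD implementation: the function names, the toy <em>t</em>-statistics, and the gene order are all made up for illustration, and the 0.25 cutoff is borrowed from the confidence threshold quoted for the snRNA-seq registration figure later in this post (the merge ratio is left out).</p>

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of t-statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def register(query_stats, reference_stats, threshold=0.25):
    """Annotate each query cluster with its best-correlated reference group.

    Both inputs map a group name to enrichment t-statistics over the same
    genes in the same order. A query cluster is left unannotated (None)
    when its best correlation does not pass the threshold (an assumption
    of this sketch, not the published procedure).
    """
    annotations = {}
    for query_name, query_t in query_stats.items():
        cors = {
            ref_name: pearson(query_t, ref_t)
            for ref_name, ref_t in reference_stats.items()
        }
        best = max(cors, key=cors.get)
        passes = cors[best] > threshold
        annotations[query_name] = best if passes else None
    return annotations


# Toy enrichment t-statistics for five hypothetical genes.
reference = {
    "WM": [8.0, -3.0, -2.0, 6.0, -4.0],
    "L1": [-2.0, 7.0, -1.0, -3.0, 1.0],
    "L5": [-3.0, -2.0, 9.0, -1.0, 0.5],
}
query = {
    "Sp7D7": [7.5, -2.5, -2.5, 5.0, -3.0],
    "Sp7D1": [-1.0, 6.0, 0.0, -2.0, 0.5],
}

annotations = register(query, reference)
for domain, layer in annotations.items():
    print(domain, "~", layer)
```

<p>In practice the <em>t</em>-statistics cover many genes and the full correlation matrix is what gets drawn as the heatmap; the sketch only keeps the best match per query cluster.</p>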
<p>From this process we learned that <em>k</em>=9 best recapitulated the expected pattern of six layers + white matter, matching each spatial domain to only one layer. In contrast, at the <em>k</em>=7 resolution some of the spatial domains (<img src="https://latex.codecogs.com/png.latex?Sp_%7B7%7DD_%7B2%7D"> and <img src="https://latex.codecogs.com/png.latex?Sp_%7B7%7DD_%7B3%7D">) matched more than one layer. At <em>k</em>=9, white matter and Layer 1 were each split into two spatial domains with unique gene expression.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/bayesSpace_k09_spatial_registration_heatmap_color.png" class="img-fluid figure-img"></p>
<figcaption><em>BayesSpace</em> <em>k</em>=9 cluster spatial registration vs.&nbsp;manual layers</figcaption>
</figure>
</div>
<p>For higher-resolution clustering, the fast H+ statistic identified <em>k</em>=16 as the optimal number of clusters, making it the most data-supported resolution. This further split the six original layers into 2-3 sub-layers each. The maximum number of clusters we could comfortably run on our computing setup was <em>k</em>=28; at this high number of clusters we lose the laminar definition.</p>
</section>
<section id="novel-biology-in-spatial-domains" class="level2">
<h2 class="anchored" data-anchor-id="novel-biology-in-spatial-domains">Novel Biology in Spatial Domains</h2>
<p>So what does all this clustering and layer matching help us learn about the brain?</p>
<p>At each resolution, differentially expressed genes were detected between the spatial domains, showing the complex organization of gene expression across the DLPFC tissue.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/Sp09D01_CLDN5.png" class="img-fluid figure-img"></p>
<figcaption>Clustering at <em>k</em>=9 highlighting <img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B1%7D%5Csim%20L1">, with spot plots and violin plots of <em>CLDN5</em> expression</figcaption>
</figure>
</div>
<p>The data-driven clustering at <em>k</em>=9 revealed a sub-layer of the white matter that differs in gene expression as much as the previously described layers differ from one another. It also found a thin band of vascular tissue (<img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B1%7D%5Csim%20L1">) in layer 1 with high expression of endothelial genes like <em>CLDN5</em>. These were both novel findings resulting from the unsupervised clustering. The sub-layers found at <em>k</em>=16 also had distinct gene expression profiles.</p>
<p>These new spatial domains help refine the layered anatomy of the DLPFC. Neat! 🎉</p>
</section>
</section>
<section id="single-nucleus-rna-seq-1" class="level1">
<h1>Single Nucleus RNA-seq</h1>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/Brain Cell Types Text.png" class="img-fluid figure-img"></p>
<figcaption>Cartoons of brain cell types, Created with BioRender.com</figcaption>
</figure>
</div>
<p>On the single nucleus side of the experiment, we processed 56k nuclei from n=19 samples. The first round of clustering (hierarchical clustering) found 29 distinct cell type clusters from seven broad cell types (note the abbreviations):&nbsp;</p>
<p><strong>Glia &amp; Vascular cells: provide structure to the brain, support neurons</strong><sup>3</sup>&nbsp;</p>
<ol type="1">
<li><p>Astrocytes (Astro): link neurons to blood supply, clear neurotransmitters</p></li>
<li><p>Endothelial/Mural cells (EndoMural): blood vessels/vascular tissue</p></li>
<li><p>Microglia (Micro): immune function</p></li>
<li><p>Oligodendrocytes (Oligo): myelin sheath</p></li>
<li><p>Oligodendrocyte Precursor cells (OPCs)</p></li>
</ol>
<p><strong>Neurons: send and receive signals in the brain</strong></p>
<ol start="6" type="1">
<li><p>Excitatory Neurons (Excit)</p></li>
<li><p>Inhibitory Neurons (Inhib)<br>
</p></li>
</ol>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/TSNE_cellType.png" class="img-fluid figure-img"></p>
<figcaption>tSNE plot of snRNA-seq with 29 hierarchical clusters</figcaption>
</figure>
</div>
<p>Sub-populations in EndoMural, Oligos, and the Excit/Inhib Neurons were found in the first round of clustering.</p>
<p>In the DLPFC we know that different populations of excitatory neurons exist between the six layers of gray matter. To annotate our 13 Excit clusters we brought back our spatial registration tool, comparing all of the 29 hierarchical clusters to the manually annotated clusters from <span class="citation" data-cites="maynard_2021">(Maynard et al. 2021)</span> as well as the <em>BayesSpace</em> spatial domains at <em>k</em>=9 &amp; <em>k</em>=16.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/spatial_registration_sn_heatmap_bayesAnno.png" class="img-fluid figure-img"></p>
<figcaption>Spatial registration between the 29 snRNA-seq hierarchical clusters vs.&nbsp;histological layers or spatial domains at <em>k</em>=9 &amp; 16. Annotations with good confidence (cor &gt; 0.25, merge ratio = 0.1) are marked with “X” and those with poor confidence are marked with “*”.</figcaption>
</figure>
</div>
<p>We found Oligo and OPC cell types mapped to white matter, and EndoMural plus Astro mapped to Layer 1. Inhib neurons had a weak association with Layers 2-4, and the Excit neurons had strong associations with 1-3 layers each across the gray matter.&nbsp;The same patterns were found and refined in the spatial domains, such as the EndoMural groups mapping to <img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B1%7D%5Csim%20L1">.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/TSNE_cellType_layer_label.png" class="img-fluid figure-img"></p>
<figcaption>tSNE plot of snRNA-seq data with layer level annotations</figcaption>
</figure>
</div>
<p>The layer associations were used to annotate the excitatory neuron populations by their strongest associated layers, while other cell types were collapsed to their broad cell types. This resulted in our “layer-level” annotation with 13 cell types, including 7 populations of Excit neurons.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/markers_heatmap_layer_mean.png" class="img-fluid figure-img"></p>
<figcaption>Heatmap of the scaled pseudo-bulked logcounts for the top 10 marker genes for each layer level cell type</figcaption>
</figure>
</div>
<p>For each cell type we identified cell type specific marker genes with the <em>Mean Ratio</em> method described in <span class="citation" data-cites="huuki-myers">(Huuki-Myers et al., n.d.)</span>. The end product is gene expression profiles for layer annotated cell types in the human DLPFC! 🦠</p>
</section>
<section id="data-integration" class="level1">
<h1>Data Integration</h1>
<p>With this combined spatial and snRNA-seq data, there are a number of interesting downstream analyses possible. Here I will briefly touch on two ways we integrated these data types.</p>
<section id="spot-deconvolution" class="level2">
<h2 class="anchored" data-anchor-id="spot-deconvolution">Spot Deconvolution</h2>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/spot_deconvolution.jpeg" class="img-fluid figure-img"></p>
<figcaption>Overview of spot deconvolution: multiple cells exist in each spot, deconvolution predicts the cell type composition of each spot.</figcaption>
</figure>
</div>
<p>A challenge with Visium spatial transcriptomics is that each spot is larger than a single cell, and contains about 3 cells on average. To better understand the gene expression of each spot, we employed an analysis called spot deconvolution, which predicts what cell types exist in the tissue for each Visium spot.</p>
<script defer="" class="speakerdeck-embed" data-slide="39" data-id="ca75f869c0764642b48e8fa7143218ff" data-ratio="1.7772511848341233" src="//speakerdeck.com/assets/embed.js"></script>
<p>Through a benchmark experiment, we determined that the methods <em>Tangram</em> and <em>Cell2location</em> were the most accurate for predicting cell type compositions.&nbsp;From there we predicted the cell type composition of the spots across the 30 Visium slides with both deconvolution methods.</p>
<p>The spot deconvolution work was performed by <a href="https://www.linkedin.com/in/nick-eagles7/">Nick Eagles</a>. Check out his <a href="https://speakerdeck.com/nickeagles/libd-seminar-spot-deconvolution">spot deconvolution slide deck</a> above for more details.</p>
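<p>For intuition about what “predicting cell type composition” means, here is a toy sketch in plain Python. It is neither <em>Tangram</em> nor <em>Cell2location</em>: it swaps in a much simpler technique, non-negative least squares solved by projected gradient descent, and assumes a spot’s expression is a weighted sum of per-cell-type reference profiles. The profile values, the three-gene panel, and the function name are all hypothetical.</p>

```python
def estimate_proportions(spot, profiles, steps=2000, step_size=0.01):
    """Estimate cell type proportions for one spot (toy model).

    spot: expression values for a few marker genes in one spot.
    profiles: cell type name mapped to that type's per-cell expression
        of the same genes, in the same order.
    Fits non-negative weights by projected gradient descent on the
    squared error, then rescales the weights to sum to one.
    """
    cell_types = list(profiles)
    n_genes = len(spot)
    weights = [1.0 for _ in cell_types]
    for _ in range(steps):
        # Expression predicted for each gene under the current weights.
        predicted = [
            sum(w * profiles[ct][g] for w, ct in zip(weights, cell_types))
            for g in range(n_genes)
        ]
        residual = [p - s for p, s in zip(predicted, spot)]
        # Gradient of half the squared error with respect to each weight.
        grads = [
            sum(residual[g] * profiles[ct][g] for g in range(n_genes))
            for ct in cell_types
        ]
        # Gradient step, then clip at zero so weights stay non-negative.
        weights = [
            max(0.0, w - step_size * gr) for w, gr in zip(weights, grads)
        ]
    total = sum(weights)
    return {ct: w / total for ct, w in zip(cell_types, weights)}


# Hypothetical per-cell profiles over three marker genes
# (think SNAP25, MBP, CLDN5); the numbers are invented.
profiles = {
    "Excit": [5.0, 0.5, 0.0],
    "Oligo": [0.5, 6.0, 0.0],
    "EndoMural": [0.0, 0.5, 4.0],
}

# A simulated spot holding two Excit cells and one Oligo cell.
spot = [10.5, 7.0, 0.0]

proportions = estimate_proportions(spot, profiles)
for cell_type, fraction in proportions.items():
    print(cell_type, round(fraction, 3))
```

<p>The step size is small enough for this toy matrix to converge; real deconvolution methods model sequencing noise, use thousands of genes, and are considerably more sophisticated than this sketch.</p>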
</section>
<section id="spatially-map-disease-ligand-receptor-interactions" class="level2">
<h2 class="anchored" data-anchor-id="spatially-map-disease-ligand-receptor-interactions">Spatially Map Disease Ligand Receptor Interactions</h2>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/ligend_receptor_spatialDLPFC.jpeg" class="img-fluid figure-img"></p>
<figcaption>Cell-cell communication, <em>EFNA5</em> and <em>EPHA5</em> co-localizing in <img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B7%7D%5Csim%20L6"> , cartoon of LR interaction in a Visium spot</figcaption>
</figure>
</div>
<p>To show how this dataset can be a rich resource to study neuropsychiatric diseases, we explored the spatial location of a ligand-receptor (LR) interaction that is associated with schizophrenia. We performed a cell-cell communication analysis, which predicts which cell types are interacting with each other, and then intersected those results with LR pairs linked to schizophrenia risk in databases. From the common set of LR pairs we examined ligand <em>EFNA5</em> &amp; receptor <em>EPHA5</em>. From the snRNA-seq populations, <em>EFNA5</em> was most expressed in Excit_L5/6, and <em>EPHA5</em> in Excit_L6. From the Visium data we identified spots where the two genes were co-expressed; these were most frequent in <img src="https://latex.codecogs.com/png.latex?Sp_%7B9%7DD_%7B7%7D%5Csim%20L6">, and these spots also had high proportions of Excit_L5/6 neurons and Excit_L6 neurons predicted by spot deconvolution. Spatially mapping LR pairs helps us gain insight into the potential for drug development. (This cool work was completed by <a href="https://x.com/BoyiGuo">Boyi Guo</a> and <a href="https://x.com/mgrantpeters">Melissa Grant-Peters</a>.)</p>
<p>This analysis used many elements of the data from the spatialDLPFC project, and is just one example of how this dataset is relevant to the study of disease. In another application we also checked for enrichment of depression and PTSD related genes between the spatial domains. There are lots of exciting applications for the study of diseases with spatial and single cell data. Stay tuned to future work from the Lieber Institute for more! 👀</p>
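<p>The co-expression step can be sketched in a few lines of plain Python. This is an illustration only, not the analysis code from the paper: the spot records are invented, and “co-expressed” is assumed here to mean that both genes have a nonzero count in the same spot.</p>

```python
from collections import defaultdict


def coexpression_by_domain(spots, ligand, receptor):
    """Fraction of spots per spatial domain where both genes are detected.

    spots: list of dicts with a "domain" label and a "counts" dict that
        maps gene name to the count observed in that spot.
    A spot counts as co-expressing when both genes have a count above
    zero (an assumption made for this sketch).
    """
    totals = defaultdict(int)
    both = defaultdict(int)
    for spot in spots:
        domain = spot["domain"]
        counts = spot["counts"]
        totals[domain] += 1
        if counts.get(ligand, 0) > 0 and counts.get(receptor, 0) > 0:
            both[domain] += 1
    return {domain: both[domain] / totals[domain] for domain in totals}


# Invented example spots; real data would have thousands per slide.
spots = [
    {"domain": "Sp9D7", "counts": {"EFNA5": 3, "EPHA5": 2}},
    {"domain": "Sp9D7", "counts": {"EFNA5": 1, "EPHA5": 4}},
    {"domain": "Sp9D7", "counts": {"EFNA5": 0, "EPHA5": 1}},
    {"domain": "Sp9D1", "counts": {"EFNA5": 2}},
    {"domain": "Sp9D1", "counts": {"EPHA5": 1}},
]

fractions = coexpression_by_domain(spots, "EFNA5", "EPHA5")
top_domain = max(fractions, key=fractions.get)
print(top_domain, round(fractions[top_domain], 2))
```

<p>Ranking domains by this fraction is one simple way to ask where an LR pair co-localizes; the deconvolution results can then be checked in the same spots.</p>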
</section>
</section>
<section id="summary" class="level1">
<h1>Summary</h1>
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/SpatialDLPFC_logo.png" class="img-fluid"></p>
<p>Overall we’ve created a paired spatial transcriptomic and single nucleus RNA-seq dataset of the human DLPFC. We’ve used spatial registration to map the new spatial domains and excitatory neurons to the classical histological layers. The data-driven spatial domains refine the layers of the DLPFC, finding laminar domains and cortical sub-layers. Spot deconvolution further refines the profile of each spot. This data has many applications in the study of neuropsychiatric diseases. We’ve made this dataset widely available to the scientific community (see below).</p>
<p>For more details be sure to check out our recently published paper in <em>Science</em> <span class="citation" data-cites="Huuki-Myers2024">(Huuki-Myers et al. 2024)</span> 🎉 <a href="https://doi.org/10.1126/science.adh1938" class="uri">https://doi.org/10.1126/science.adh1938</a></p>
<section id="data-availability" class="level2">
<h2 class="anchored" data-anchor-id="data-availability">Data Availability&nbsp;</h2>
<p>The 30 DLPFC Visium samples &amp; the 56k nuclei snRNA-seq dataset are available to explore on our <a href="https://research.libd.org/spatialDLPFC/">interactive websites</a> and Bioconductor/R package <a href="https://research.libd.org/spatialLIBD/">spatialLIBD</a>.</p>
<p>Check out how your favorite gene is expressed over the layers or cell types of the DLPFC!</p>
<p><img src="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/SpatialDLPFC_shiny-01.png" class="img-fluid"></p>
</section>
</section>
<section id="comments" class="level1">




</section>


<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">Comments 💬</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-House_Pansky" class="csl-entry">
House, Earl Lawrence, and Ben Pansky. n.d. <em>A Functional Approach to Neuroanatomy</em>. 2nd ed. Blakiston Division.
</div>
<div id="ref-huuki-myers" class="csl-entry">
Huuki-Myers, Louise A., Kelsey D. Montgomery, Sang Ho Kwon, Sophia Cinquemani, Nicholas J. Eagles, Daianna Gonzalez-Padilla, Sean K. Maden, et al. n.d. <span>“Benchmark of Cellular Deconvolution Methods Using a Multi-Assay Reference Dataset from Postmortem Human Prefrontal Cortex.”</span> <a href="https://doi.org/10.1101/2024.02.09.579665">https://doi.org/10.1101/2024.02.09.579665</a>.
</div>
<div id="ref-Huuki-Myers2024" class="csl-entry">
Huuki-Myers, Louise A., Abby Spangler, Nicholas J. Eagles, Kelsey D. Montgomery, Sang Ho Kwon, Boyi Guo, Melissa Grant-Peters, et al. 2024. <span>“A Data-Driven Single-Cell and Spatial Transcriptomic Map of the Human Prefrontal Cortex.”</span> <em>Science</em>. <a href="https://doi.org/10.1126/science.adh1938">https://doi.org/10.1126/science.adh1938</a>.
</div>
<div id="ref-maynard_2021" class="csl-entry">
Maynard, KR, L Collado-Torres, LM Weber, C Uytingco, BK Barry, SR Williams, JL Catallini, et al. 2021. <span>“Transcriptome-Scale Spatial Gene Expression in the Human Dorsolateral Prefrontal Cortex.”</span> <em>Nature Neuroscience</em> 24 (3): 425–36. <a href="https://doi.org/10.1038/s41593-020-00787-0">https://doi.org/10.1038/s41593-020-00787-0</a>.
</div>
</div></section><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>the measurement of RNA transcription is known as “transcriptomics”↩︎</p></li>
<li id="fn2"><p>The full interpretation of these kinds of plots takes much nuance we won’t discuss here↩︎</p></li>
<li id="fn3"><p>The following are brief notes on cell type function to provide context, not comprehensive descriptions of the complex roles of these cell types↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>paper preview</category>
  <category>spatialDLPFC</category>
  <category>spatial</category>
  <category>single cell</category>
  <guid>https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/</guid>
  <pubDate>Thu, 23 May 2024 00:00:00 GMT</pubDate>
  <media:content url="https://lahuuki.github.io/posts/2024-05-23-spatialDLPFC/images/spatialDLPFC_logo.png" medium="image" type="image/png"/>
</item>
</channel>
</rss>
