Gene founder events facilitate evolutionary innovations. phylomapr enables quick retrieval of precomputed gene age maps (phylomaps) in R. Gene age maps loaded from phylomapr integrate seamlessly with myTAI.
Installation
# install biomartr
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ropensci/biomartr")
# install phylomapr (requires the devtools package)
devtools::install_github("LotharukpongJS/phylomapr")
Use Cases
Retrieve gene age maps using phylomapr
Load the phylomap of Apostichopus japonicus (Japanese sea cucumber), generated using GenEra.
# either
Aj.map <- phylomapr::Apostichopus_japonicus.PhyloMap
# or alternatively
library(phylomapr)
Aj.map <- Apostichopus_japonicus.PhyloMap
head(Aj.map)
  Phylostratum GeneID
1            2 tr|A0A0B6VS88|A0A0B6VS88_STIJA
2            1 tr|A0A0G2R1N3|A0A0G2R1N3_STIJA
3            1 tr|A0A0H4BK46|A0A0H4BK46_STIJA
4            3 tr|A0A0X7YCD7|A0A0X7YCD7_STIJA
5            1 tr|A0A1B2ZDN7|A0A1B2ZDN7_STIJA
6            2 tr|A0A1X9J403|A0A1X9J403_STIJA
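A phylomap is a plain two-column table, so it can be summarised with base R. The sketch below is a minimal example that rebuilds only the six rows printed above as a toy data frame (the name `toy_map` is illustrative and not part of phylomapr) and counts genes per phylostratum. It assumes the usual convention that a higher phylostratum number denotes a younger gene.

```r
# Toy phylomap containing only the six rows shown by head(Aj.map) above
toy_map <- data.frame(
  Phylostratum = c(2, 1, 1, 3, 1, 2),
  GeneID = c(
    "tr|A0A0B6VS88|A0A0B6VS88_STIJA",
    "tr|A0A0G2R1N3|A0A0G2R1N3_STIJA",
    "tr|A0A0H4BK46|A0A0H4BK46_STIJA",
    "tr|A0A0X7YCD7|A0A0X7YCD7_STIJA",
    "tr|A0A1B2ZDN7|A0A1B2ZDN7_STIJA",
    "tr|A0A1X9J403|A0A1X9J403_STIJA"
  ),
  stringsAsFactors = FALSE
)

# Number of genes assigned to each phylostratum
ps_counts <- table(toy_map$Phylostratum)
print(ps_counts)

# Genes in the highest (assumed youngest) phylostratum of the toy map
youngest <- toy_map$GeneID[toy_map$Phylostratum == max(toy_map$Phylostratum)]
print(youngest)
```

The same two calls should work unchanged on the full `Aj.map` object once phylomapr is installed.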
To view the data description:
?Apostichopus_japonicus.PhyloMap
Apostichopus_japonicus.PhyloMap    package:phylomapr    R Documentation

Phylomap of Apostichopus japonicus

Description:
     Gene ages inferred using GenEra on reference protein sequences from UniProt proteomes. Note: DIAMOND was run using the ultra-sensitive mode.

Usage:
     Apostichopus_japonicus.PhyloMap

Format:
     A tibble with 30,032 rows and 2 variables:
     Phylostratum  dbl  Phylostratum (or gene age) assignment
     GeneID        chr  GeneID annotation from UniProt

Source:
     <https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-02895-z>
Loading gene age maps into myTAI
myTAI facilitates evolutionary transcriptomic studies. Below are some ways in which gene age maps retrieved via phylomapr can be integrated seamlessly into myTAI.
Plot the developmental hourglass (on simulated gene expression data)
First, load the phylomap of Apostichopus japonicus (Japanese sea cucumber). The developmental gene expression data used in this example are simulated.
Aj.map <- phylomapr::Apostichopus_japonicus.PhyloMap
Simulate developmental gene expression.
# Set the random seed for reproducibility
set.seed(123)
# Generate log-normally distributed counts (a debatable assumption) for each
# gene and developmental stage, and create a tibble holding the count table
Aj.ExpressionMatrix <- tibble::tibble(
  GeneID = Aj.map$GeneID,
  `24H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1),
  `48H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1),
  `72H` = stats::rlnorm(length(Aj.map$GeneID), meanlog = 3, sdlog = 1)
)
Aj.PES <- myTAI::MatchMap(Aj.map, Aj.ExpressionMatrix, remove.duplicates = FALSE, accumulate = NULL)
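Conceptually, this step joins the gene age map to the expression table by gene ID. The following base R sketch shows that idea on toy data. It is not the myTAI implementation, and the names `toy_map`, `toy_expr`, and `toy_pes` are hypothetical. It assumes the resulting table has the layout Phylostratum, GeneID, then one column per stage.

```r
# Toy gene age map and toy expression table (hypothetical gene IDs)
toy_map <- data.frame(
  Phylostratum = c(2, 1, 3),
  GeneID = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)
toy_expr <- data.frame(
  GeneID = c("geneC", "geneA", "geneD"),
  `24H` = c(5, 10, 7),
  `48H` = c(6, 12, 8),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Stage columns are everything except the gene identifier
stage_cols <- setdiff(names(toy_expr), "GeneID")

# For every gene in the map, look up its row in the expression table
idx <- match(toy_map$GeneID, toy_expr$GeneID)
keep <- !is.na(idx)  # genes present in both tables

# Combine age assignments with the matching expression values
toy_pes <- cbind(
  toy_map[keep, c("Phylostratum", "GeneID")],
  toy_expr[idx[keep], stage_cols, drop = FALSE]
)
rownames(toy_pes) <- NULL
print(toy_pes)
```

In this sketch, genes found in only one of the two tables (`geneB` and `geneD`) are dropped, which is one reason to check the dimensions of the matched table before any downstream analysis.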
Then test for the hourglass pattern on the simulated data.
myTAI::PlotSignature(tidyr::drop_na(Aj.PES))
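The signature plotted here is based on the transcriptome age index (TAI), the expression-weighted mean phylostratum of each stage: TAI_s = sum_i(ps_i * e_is) / sum_i(e_is). As a minimal sketch of that formula, and not of myTAI's own code, the toy example below computes it in base R. The data frame `toy_pes` and its values are made up for illustration.

```r
# Toy table in the layout Phylostratum, GeneID, stage columns
toy_pes <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID = c("g1", "g2", "g3", "g4"),
  `24H` = c(10, 20, 30, 40),
  `48H` = c(40, 30, 20, 10),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Expression matrix: genes in rows, stages in columns
expr <- as.matrix(toy_pes[, -(1:2)])

# Multiply each gene's expression by its phylostratum, then normalise
# by the total expression of the stage
tai <- colSums(expr * toy_pes$Phylostratum) / colSums(expr)
print(tai)
```

In the toy data, expression shifts towards the phylostratum 1 genes at 48H, so the index is lower at 48H than at 24H.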
Next, transform the simulated gene expression data. Note: this requires myTAI (version > 1.0.1.0000).
Aj.PES.log2 <- myTAI::tf(tidyr::drop_na(Aj.PES), FUN = log2, pseudocount = 1)
hist(Aj.PES.log2$`24H`)
Compare this to the distribution of raw abundance (TPM).
hist(Aj.PES$`24H`, breaks = 200)
myTAI::PlotSignature(tidyr::drop_na(Aj.PES.log2))
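To make the transformation step concrete, here is a base R sketch of a log2 transform with a pseudocount of 1 on a toy table. It assumes the pseudocount is added to every expression value before `log2` is applied, and it leaves the Phylostratum and GeneID columns untouched. The object names are hypothetical, and this is an illustration, not the myTAI implementation.

```r
# Toy table in the layout Phylostratum, GeneID, stage columns
toy_pes <- data.frame(
  Phylostratum = c(1, 2, 3),
  GeneID = c("g1", "g2", "g3"),
  `24H` = c(0, 1, 3),
  `48H` = c(7, 15, 31),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Only the stage columns hold expression values
stage_cols <- setdiff(names(toy_pes), c("Phylostratum", "GeneID"))

# Add the pseudocount first so that zero counts map to log2(1) = 0
toy_pes_log2 <- toy_pes
toy_pes_log2[stage_cols] <- log2(toy_pes[stage_cols] + 1)
print(toy_pes_log2)
```

The pseudocount keeps zero expression values finite after the transform, which matters because `log2(0)` is `-Inf` in R.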
Tutorials
Gene names in different databases: GeneIDs can differ between databases. This could be an issue when the gene age is estimated with one gene naming convention and the RNA-seq mapping is done with another. This tutorial shows how one could convert gene IDs (convertID()) between databases.
Adding phylomaps to phylomapr: Advanced gene age (phylo)mappers who ran their own gene age inference may want to contribute to phylomapr, which is at its core a collaborative effort. This tutorial shows how one could add new phylomaps to phylomapr.
Acknowledgement
I would like to thank several individuals for making this mini-project possible.
First, I would like to thank Hajk-Georg Drost for providing me with the intellectual environment that enabled this project.
Furthermore, I would like to thank Susana M. Coelho for hosting and facilitating this research, as well as the Max Planck Institute for Biology Tübingen and the Max Planck Society.
I also thank the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A532B, 031A533A, 031A533B, 031A534A, 031A535A, 031A537A, 031A537B, 031A537C, 031A537D, 031A538A).