
convertID takes a phylomap with gene IDs from one database and converts them to the gene IDs used in another database. The function wraps the packages biomartr, stringr and dplyr.

Usage

convertID(
  phylomap = phylomap,
  mart = "ENSEMBL_MART_ENSEMBL",
  dataset = NULL,
  attributes = c("ensembl_gene_id", "ensembl_peptide_id"),
  filters = "uniprot_gn_id",
  split_uniprot_gene = TRUE
)

Arguments

phylomap

a phylomap dataset, e.g. phylomapr::Homo_sapiens.PhyloMap, whose GeneIDs are to be converted.

mart

a character string specifying the mart to be used. Users can obtain available marts using biomartr::getMarts().

dataset

a character string specifying the dataset within the mart to be used, e.g. dataset = "hsapiens_gene_ensembl".

attributes

a character vector specifying the attributes that shall be used, e.g. attributes = c("ensembl_gene_id", "ensembl_peptide_id").

filters

a character vector specifying the filter (query key) for the BioMart query, e.g. filters = "uniprot_gn_id".

split_uniprot_gene

a Boolean value specifying whether the UniProt gene IDs (e.g. sp|A0A061ACU2|PIEZ1_CAEEL) should be split on "|" (via stringr::str_split(x, "[|]")), keeping the second element, i.e. the accession (here A0A061ACU2).
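The split described for split_uniprot_gene can be sketched as follows. This is a minimal illustration, assuming the accession is the second "|"-separated field; the vector uniprot_ids is a hypothetical input, and the code inside convertID may differ.

```r
library(stringr)

# Hypothetical input: UniProt-style IDs as they appear in a phylomap.
uniprot_ids <- c("sp|A0A061ACU2|PIEZ1_CAEEL", "sp|A0A024RBG1|NUD4B_HUMAN")

# Split each ID on the literal "|" and keep the second field (the accession).
# simplify = TRUE returns a character matrix with one row per input ID.
accessions <- str_split(uniprot_ids, "[|]", simplify = TRUE)[, 2]
accessions
```

The pattern "[|]" is a character class, so the pipe is matched literally rather than being read as regex alternation.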

Details

Different databases use different gene IDs for the same gene. With convertID, users can obtain the corresponding gene IDs from another database, e.g. ENSEMBL.

Note: the lowest (oldest) phylostratum for each gene is chosen.
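To illustrate the note above, here is a minimal sketch of keeping the lowest phylostratum per gene with dplyr. It assumes that duplicates arise when several source IDs map to the same converted gene ID; the data frame converted and its IDs (GENE_A, GENE_B) are hypothetical placeholders, and this is not necessarily how convertID does it internally.

```r
library(dplyr)

# Hypothetical converted phylomap in which two source IDs map to one gene.
converted <- data.frame(
  Phylostratum = c(3, 1, 2),
  GeneID = c("GENE_A", "GENE_A", "GENE_B")
)

# Keep one row per GeneID, retaining the lowest (oldest) phylostratum.
deduplicated <- converted %>%
  group_by(GeneID) %>%
  summarise(Phylostratum = min(Phylostratum), .groups = "drop") %>%
  select(Phylostratum, GeneID)
deduplicated
```

Under this assumption, each gene ID appears exactly once in the result, as in the phylomap's two-column layout (Phylostratum, GeneID).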

References

Lotharukpong JS et al. (2023) (unpublished)

Drost HG, Paszkowski J. Biomartr: genomic data retrieval with R. Bioinformatics (2017) 33(8): 1216-1217. doi:10.1093/bioinformatics/btw821.

Examples


# load the first 100 genes from the Homo_sapiens.PhyloMap.
phylomap_example <- head(phylomapr::Homo_sapiens.PhyloMap, 100)

# convert phylomap from uniprot to ensembl gene IDs.
converted_phylomap_example <- convertID(
      phylomap = phylomap_example,
      mart = "ENSEMBL_MART_ENSEMBL",
      dataset = "hsapiens_gene_ensembl",
      filters = "uniprot_gn_id"
)
#> Starting Gene ID conversion...
#> Starting BioMart query ...
#> 
#> 
#> Please cite: Drost HG, Paszkowski J. Biomartr: genomic data retrieval with R. Bioinformatics (2017) 33(8): 1216-1217. doi:10.1093/bioinformatics/btw821.

# Before conversion
head(phylomap_example)
#> # A tibble: 6 × 2
#>   Phylostratum GeneID                   
#>          <dbl> <chr>                    
#> 1            1 sp|A0A024RBG1|NUD4B_HUMAN
#> 2            1 sp|A0A075B6H7|KV37_HUMAN 
#> 3            1 sp|A0A075B6H8|KVD42_HUMAN
#> 4            1 sp|A0A075B6H9|LV469_HUMAN
#> 5            1 sp|A0A075B6I0|LV861_HUMAN
#> 6            1 sp|A0A075B6I1|LV460_HUMAN
# After conversion
head(converted_phylomap_example)
#> # A tibble: 6 × 2
#>   Phylostratum GeneID         
#>          <dbl> <chr>          
#> 1            1 ENSG00000177144
#> 2            1 ENSG00000211623
#> 3            1 ENSG00000211632
#> 4            1 ENSG00000211633
#> 5            1 ENSG00000211637
#> 6            1 ENSG00000211638