# UCSCXenaTools
**Repository Path**: ShixiangWang/UCSCXenaTools
## Basic Information
- **Project Name**: UCSCXenaTools
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: GPL-3.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-26
- **Last Updated**: 2024-04-26
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# UCSCXenaTools
**UCSCXenaTools** is an R package for accessing genomics data from UCSC
Xena platform, from cancer multi-omics to single-cell RNA-seq. Public
omics data from UCSC Xena are supported through [**multiple turn-key
Xena Hubs**](https://xenabrowser.net/datapages/), which are a collection
of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE,
and others. Databases are normalized so they can be combined, linked,
filtered, explored and downloaded.
**Who is the target audience and what are scientific applications of
this package?**
- Target Audience: cancer and clinical researchers, bioinformaticians
- Applications: genomic and clinical analyses
## Table of Contents
- [Installation](#installation)
- [Data Hub List](#data-hub-list)
- [Basic usage](#basic-usage)
- [Citation](#citation)
- [How to contribute](#how-to-contribute)
- [Acknowledgment](#acknowledgment)
## Installation
Install stable release from CRAN with:
``` r
install.packages("UCSCXenaTools")
```
You can also install devel version of **UCSCXenaTools** from github
with:
``` r
# install.packages("remotes")
remotes::install_github("ropensci/UCSCXenaTools")
```
If you want to build vignette in local, please add two options:
``` r
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
```
## Data Hub List
All datasets are available at .
Currently, **UCSCXenaTools** supports the following data hubs of UCSC
Xena.
- UCSC Public Hub:
- TCGA Hub:
- GDC Xena Hub:
- ICGC Xena Hub:
- Pan-Cancer Atlas Hub:
- UCSC Toil RNAseq Recompute Compendium Hub:
- PCAWG Xena Hub:
- ATAC-seq Hub:
- Singel Cell Xena Hub:
- Kids First Xena Hub:
- Treehouse Xena Hub:
Users can update dataset list from the newest version of UCSC Xena by
hand with `XenaDataUpdate()` function, followed by restarting R and
`library(UCSCXenaTools)`.
If any url of data hub is changed or a new data hub is online, please
remind me by emailing to or [opening an issue on
GitHub](https://github.com/ropensci/UCSCXenaTools/issues).
## Basic usage
Download UCSC Xena datasets and load them into R by **UCSCXenaTools** is
a workflow with `generate`, `filter`, `query`, `download` and `prepare`
5 steps, which are implemented as `XenaGenerate`, `XenaFilter`,
`XenaQuery`, `XenaDownload` and `XenaPrepare` functions, respectively.
They are very clear and easy to use and combine with other packages like
`dplyr`.
To show the basic usage of **UCSCXenaTools**, we will download clinical
data of LUNG, LUAD, LUSC from TCGA (hg19 version) data hub. Users can
learn more about **UCSCXenaTools** by running
`browseVignettes("UCSCXenaTools")` to read vignette.
### XenaData data.frame
**UCSCXenaTools** uses a `data.frame` object (built in package)
`XenaData` to generate an instance of `XenaHub` class, which records
information of all datasets of UCSC Xena Data Hubs.
You can load `XenaData` after loading `UCSCXenaTools` into R.
``` r
library(UCSCXenaTools)
#> =========================================================================================
#> UCSCXenaTools version 1.4.7
#> Project URL: https://github.com/ropensci/UCSCXenaTools
#> Usages: https://cran.r-project.org/web/packages/UCSCXenaTools/vignettes/USCSXenaTools.html
#>
#> If you use it in published research, please cite:
#> Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
#> from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
#> Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
#> =========================================================================================
#> --Enjoy it--
data(XenaData)
head(XenaData)
#> # A tibble: 6 × 17
#> XenaHosts XenaHostNames XenaCohorts XenaDatasets SampleCount DataSubtype Label
#>
#> 1 https://… publicHub Breast Can… ucsfNeve_pu… 51 gene expre… Neve…
#> 2 https://… publicHub Breast Can… ucsfNeve_pu… 57 phenotype Phen…
#> 3 https://… publicHub Glioma (Ko… kotliarov20… 194 copy number Kotl…
#> 4 https://… publicHub Glioma (Ko… kotliarov20… 194 phenotype Phen…
#> 5 https://… publicHub Lung Cance… weir2007_pu… 383 copy number CGH
#> 6 https://… publicHub Lung Cance… weir2007_pu… 383 phenotype Phen…
#> # … with 10 more variables: Type , AnatomicalOrigin ,
#> # SampleType , Tags , ProbeMap , LongTitle ,
#> # Citation , Version , Unit , Platform
```
### Workflow
Select datasets.
``` r
# The options in XenaFilter function support Regular Expression
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>%
XenaFilter(filterDatasets = "clinical") %>%
XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo
df_todo
#> class: XenaHub
#> hosts():
#> https://tcga.xenahubs.net
#> cohorts() (3 total):
#> TCGA Lung Cancer (LUNG)
#> TCGA Lung Adenocarcinoma (LUAD)
#> TCGA Lung Squamous Cell Carcinoma (LUSC)
#> datasets() (3 total):
#> TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#> TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#> TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
```
Query and download.
``` r
XenaQuery(df_todo) %>%
XenaDownload() -> xe_download
```
**For researchers in China, now Hiplot team has deployed several Xena
mirror sites (`https://xena.hiplot.com.cn/`) at Shanghai. You can set an
option `options(use_hiplot = TRUE)` before querying data step to speed
up both data querying and downloading.**
``` r
options(use_hiplot = TRUE)
XenaQuery(df_todo) %>%
XenaDownload() -> xe_download
#> The hiplot server may down, we will not use it for now.
#> This will check url status, please be patient.
#> All downloaded files will under directory C:\Users\ADMINI~1\AppData\Local\Temp\RtmpIfw2E2.
#> The 'trans_slash' option is FALSE, keep same directory structure as Xena.
#> Creating directories for datasets...
#> Downloading TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#> Downloading TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#> Downloading TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
```
Prepare data into R for analysis.
``` r
cli = XenaPrepare(xe_download)
class(cli)
#> [1] "list"
names(cli)
#> [1] "LUNG_clinicalMatrix" "LUAD_clinicalMatrix" "LUSC_clinicalMatrix"
```
## More to read
- [Introduction and basic usage of
UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-intro/)
- [UCSCXenaTools: Retrieve Gene Expression and Clinical Information
from UCSC Xena for Survival
Analysis](https://shixiangwang.github.io/home/en/post/ucscxenatools-201908/)
- [Obtain RNAseq Values for a Specific Gene in Xena
Database](https://shixiangwang.github.io/home/en/post/2020-07-22-ucscxenatools-single-gene/)
- [UCSC Xena Access APIs in
UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-api/)
## Citation
Cite me by the following paper.
Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
# For BibTex
@article{Wang2019UCSCXenaTools,
journal = {Journal of Open Source Software},
doi = {10.21105/joss.01627},
issn = {2475-9066},
number = {40},
publisher = {The Open Journal},
title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
url = {https://dx.doi.org/10.21105/joss.01627},
volume = {4},
author = {Wang, Shixiang and Liu, Xuesong},
pages = {1627},
date = {2019-08-05},
year = {2019},
month = {8},
day = {5},
}
Cite UCSC Xena by the following paper.
Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data
visualization and interpretation." BioRxiv (2019): 326470.
## How to contribute
For anyone who wants to contribute, please follow the guideline:
- Clone project from GitHub
- Open `UCSCXenaTools.Rproj` with RStudio
- Modify source code
- Run `devtools::check()`, and fix all errors, warnings and notes
- Create a pull request
## Acknowledgment
This package is based on [XenaR](https://github.com/mtmorgan/XenaR),
thanks [Martin Morgan](https://github.com/mtmorgan) for his work.
[](https://ropensci.org)