circBase help

Contents

  1. Searching
  2. Results
  3. Downloading data
  4. Discovering circRNAs
  5. Coordinate conversion
  6. Submit your data
  7. About

1. Searching

There are currently three main ways of obtaining data from circBase: simple search, list search and conditional retrieval via the table browser.

Simple search

The search box on the main page can be used to query the database by circBase identifier (e.g. mmu_circ_0000010), RefSeq transcript ID (NM_027671), gene symbol (Pvt1), genomic coordinates (chrII:123456-7891011) or Gene Ontology term identifier. Identifiers and names used in previously published data (such as mm9_circ_001517 or CDR1as) have been added to the database as aliases and can also be used in searches. Searches are case-insensitive.
The database can also be queried by DNA or RNA sequence, either through the simple search interface (exact matches of sequences longer than 6 nt, or of their reverse complements, will be returned) or by using Blat. To build the Blat references, each circRNA was cut opposite the head-to-tail junction, so the junction lies inside the linear reference sequence rather than at its ends. It is therefore possible to search with sequences that span circular junctions.
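The effect of cutting opposite the junction can be illustrated with a minimal Python sketch. It is not the circBase implementation: the function names and the example sequence are made up for illustration, and the sketch assumes the spliced sequence is written starting at the junction, so that its last base joins back to its first base.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA sequence (output as DNA)."""
    complement = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")
    return seq.translate(complement)[::-1]


def cut_opposite_junction(circ_seq: str) -> str:
    """Linearise a circular sequence by cutting halfway around from the junction.

    Assumption: circ_seq starts at the head-to-tail junction, so the junction
    joins its last base to its first base. After rotation the junction sits
    in the middle of the returned linear sequence.
    """
    half = len(circ_seq) // 2
    return circ_seq[half:] + circ_seq[:half]


def matches(query: str, circ_seq: str) -> bool:
    """True if the query or its reverse complement occurs in the rotated reference."""
    reference = cut_opposite_junction(circ_seq.upper().replace("U", "T"))
    query = query.upper().replace("U", "T")
    return query in reference or reverse_complement(query) in reference


# Hypothetical 14-nt circular sequence; "CCAGATT" spans its junction
# (last three bases "CCA" followed by the first four bases "GATT").
example = "GATTACAGGCTCCA"
print(matches("CCAGATT", example))
print("CCAGATT" in example)
```

The query is found in the rotated reference even though it is absent from the sequence as written from the junction. The trade-off is that a query spanning the new cut point would be missed, which is why the cut is placed as far from the junction as possible.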

List search

After selecting the organism, the user can paste or upload a list of circBase or RefSeq identifiers, gene symbols or genomic coordinates. Single-column input is expected for identifiers. For genomic coordinates, any number of columns is accepted, but only the first three columns are interpreted as the genomic position (chromosome | start | end). A 12-column BED file can therefore be uploaded to get a list of circRNAs overlapping the submitted genomic regions.
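As a rough illustration of the three-column rule, the following Python sketch reduces BED-like lines to (chromosome, start, end) tuples. How circBase itself parses uploads is not documented here; the function name and the handling of header lines and short lines are assumptions.

```python
from typing import Iterable, List, Tuple


def bed_regions(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Keep only the first three columns (chromosome, start, end) of each line."""
    regions = []
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and BED header lines carry no coordinates.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not enough columns to describe a genomic position
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


# Made-up coordinates; every column after the third is ignored.
example = [
    "track name=example",
    "chr1\t100\t200\tfeatureA\t0\t+",
    "chrX\t5\t10",
]
print(bed_regions(example))
```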

Table browser

The table browser interface enables quick and simple conditional data retrieval. Organism and dataset are mandatory selections, which can be further refined by selecting one or more cell lines, excluding circRNAs that overlap repeats, defining a genomic position and limiting the genomic or spliced sequence length.

2. Results

Results are presented in an interactive table. Columns can be sorted by clicking the column names, and some cells link out to external resources. Clicking the genomic position links to the UCSC genome browser (internal version), and clicking an entry in the circRNA name column opens additional information on that circRNA. NCBI resources (or WormBase for C. elegans) can be accessed by clicking the best transcript or gene symbol link. Clicking a cell in the dataset column opens the PubMed record of the corresponding publication.

3. Downloading data

Tables can be exported in .xlsx, tab-separated .txt or .bed format through the Download table: menu. The genomic sequences of the retrieved circRNAs can be downloaded in FASTA format.
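A downloaded FASTA file can be read with a few lines of Python, sketched below. The layout of the circBase header lines is not described here, so the sketch keeps each header as an opaque string; the function name and the example records are hypothetical.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequences may be wrapped over several lines; join them.
            records[header] += line
    return records


# Made-up records; with a real file, pass an open file handle instead.
example = [">record_1", "ACGT", "TTAA", ">record_2", "GG"]
print(read_fasta(example))
```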

4. Discovering circRNAs

All the code needed to discover circRNAs in your own (Ribominus) RNA-seq data is available from the downloads section. Please refer to the included README file for further instructions.

5. Coordinate conversion

The genome assemblies currently used in circBase are hg19 for H. sapiens, mm9 for M. musculus and ce6 for C. elegans circRNAs. To convert data between assemblies, users are encouraged to use liftOver. liftOver can be used directly from the UCSC genome browser at doRiNA ("Convert" option in the top menu), through UCSC's web interface, as a standalone command-line tool, or as part of the Galaxy platform.
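For the standalone command-line tool, a conversion could be scripted roughly as in the Python sketch below. It assumes the liftOver binary is on the PATH and that a chain file for the desired assembly pair has been downloaded from UCSC; all file names are placeholders, and the wrapper functions are illustrative rather than part of circBase.

```python
import subprocess
from typing import List


def liftover_command(bed_in: str, chain: str, bed_out: str, unmapped: str) -> List[str]:
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", bed_in, chain, bed_out, unmapped]


def run_liftover(bed_in: str, chain: str, bed_out: str, unmapped: str) -> None:
    """Run liftOver and raise CalledProcessError if it exits with an error."""
    subprocess.run(liftover_command(bed_in, chain, bed_out, unmapped), check=True)


# Placeholder file names; only the command is printed, nothing is executed.
print(" ".join(liftover_command(
    "circrnas_hg19.bed",         # BED file exported from circBase (hypothetical name)
    "hg19ToHg38.over.chain.gz",  # chain file obtained from UCSC
    "circrnas_hg38.bed",         # converted coordinates
    "unmapped.bed",              # regions that could not be converted
)))
```

Regions that cannot be converted are written to the last file, so it is worth checking that file before using the converted coordinates.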

6. Submit your data

If you would like to submit your data to circBase, or to suggest implementation of a published circRNA dataset, please contact us.

7. About

circBase is developed by the Rajewsky lab at the Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany.
Contact: petar.glazar@mdc-berlin.de

First version: April 18th 2013. Last Update: Dec 15th 2015.