PASTA2

Developed by:
Shaojun Tang and Alberto Riva
Department of Molecular Genetics and Microbiology
and UF Genetics Institute


PASTA2 is the second module in the PASTA computational pipeline for the analysis of transcriptome data from RNA-Seq experiments. Building on the splice junction predictions generated by PASTA1, PASTA2 automatically builds alternatively spliced gene models and identifies differentially expressed isoforms in case-control experiments.

Availability

PASTA2 is distributed as a 64-bit command-line executable for GNU/Linux. The downloadable package contains the program, documentation, and sample data. Source code is available upon request.


Download PASTA2

Prerequisites

PASTA2 requires the reference sequence for the genome under analysis. The reference sequence should be stored as one file per chromosome, named chrN.fa, where N is the chromosome number (for example, chr1.fa).
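If your genome is stored as a single multi-FASTA file, it has to be split into per-chromosome files first. The following is a minimal shell sketch, not part of the PASTA2 package. It assumes that each header line starts with the chromosome name (for example ">chr1"); the function name split_genome is hypothetical.

```shell
# split_genome FASTA OUTDIR
# Write each sequence of a multi-FASTA file to OUTDIR/<name>.fa, where <name>
# is the first word of the header line (">chr1 ..." goes to chr1.fa).
# Assumption: sequence names are unique and already follow the chrN pattern.
split_genome() {
    fasta=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /^>/ {
            if (out != "") close(out)          # keep only one file open at a time
            out = dir "/" substr($1, 2) ".fa"  # drop the leading ">"
        }
        out != "" { print > out }              # header and sequence lines
    ' "$fasta"
}
```

For example, split_genome genome.fa /home/user/data/ReferenceGenome/mm9/ (where genome.fa is a placeholder for your own file) would produce a directory suitable for the -refseq option, provided the header names already follow the chrN convention.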

Installation and testing

Extract the package in a suitable directory using the following command:
tar xvf pasta2-1.0.tar.gz

This creates a pasta2-1.0/ directory containing the pasta script and three subdirectories: bin/, doc/ and sample/. The bin/ directory contains the compiled code for the PASTA2 program; please do not modify anything in it. The sample/ directory contains two sample files for testing PASTA2: exonjunctions.txt (a tab-delimited file containing splice junction coordinates in the format “chr start end expression-level”) and alignments.txt (alignments in the default output format of Bowtie).
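Before using your own junction file, it can be useful to confirm that it follows the same four-column layout as the sample. The sketch below is one possible check, not something PASTA2 provides. It assumes that start and end are non-negative integers and leaves the expression-level column unchecked, since its exact format is not described here. The function name check_junctions is hypothetical.

```shell
# check_junctions FILE
# Report every line that does not consist of exactly four tab-separated
# fields (chr, start, end, expression-level) with integer start and end.
# Returns non-zero if at least one such line was found.
check_junctions() {
    awk -F '\t' '
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_junctions sample/exonjunctions.txt from the pasta2-1.0 directory prints nothing and returns success if every line of the file matches this layout.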

To run the sample analysis, go to the pasta2-1.0 directory and run sample.sh, passing the full path to the directory that contains your mouse genome reference files:
./sample.sh /dir/mouse/mm9/

Feel free to examine sample.sh to see how it invokes the PASTA2 program.

Usage

To run an analysis, use the bin/pasta2 script, supplying options on the command line and/or in a configuration file. The basic syntax of the pasta2 program is the following:

pasta2 [options] -dir dir -refseq referenceDir -exonjunction junctionsFile -mapping mappingFile

A complete description of all PASTA2 arguments can be found in this file, or can be obtained by calling pasta2 -help all. A short description of the main arguments follows (when two consecutive arguments appear before a description, they are aliases of each other):

-dir
-od
Path to the directory where the results will be stored.
-input
Path to the directory where input files are stored.
-exonjunction
The name of the input file containing predicted splice junctions.
-mapping
The name of the input file containing RNA-Seq alignment results.
-refseq
Location of the reference sequence files.
-prefix
Prefix for output files.
-expression-level
Minimum gene coverage.

Example:

/home/user/software/pasta2-1.0/pasta2 \
  -dir /home/user/rnaseq/data/ \
  -exonjunction exonjunctions.txt \
  -mapping alignments.txt \
  -refseq /home/user/data/ReferenceGenome/mm9/ \
  -expression-level 0
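
Before launching a long run, it may help to verify that the inputs named on the command line are in place. The following is a minimal sketch of such a check and is not part of PASTA2. The function name preflight is hypothetical, and it expects full paths to the junction file, the alignment file, and the reference directory.

```shell
# preflight JUNCTIONS MAPPING REFSEQ
# Check that both input files are readable and that the reference directory
# holds at least one per-chromosome file named chr*.fa.
# Prints a message for each problem and returns non-zero if any was found.
preflight() {
    junctions=$1
    mapping=$2
    refseq=$3
    status=0
    for f in "$junctions" "$mapping"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            status=1
        fi
    done
    # If the glob matches nothing it stays unexpanded, so the -e test fails.
    set -- "$refseq"/chr*.fa
    if [ ! -e "$1" ]; then
        echo "no chr*.fa files found in: $refseq" >&2
        status=1
    fi
    return $status
}
```

With the paths from the example above, the check could be combined with the actual run as preflight /home/user/rnaseq/data/exonjunctions.txt /home/user/rnaseq/data/alignments.txt /home/user/data/ReferenceGenome/mm9/ followed by "&&" and the pasta2 command. The location of the two input files is an assumption here; adjust it to wherever your files are stored.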