Lecture 3 – notes

Please download MobaXTerm from: https://mobaxterm.mobatek.net/

Note: lines starting with # are comments; lines starting with $ and in bold are commands to be executed, all other lines are command outputs.


# Go to your directory:
$ cd /ufrc/bioinf_workshop/YOURNAME

# or:
$ cd /ufrc/bioinf_workshop/share/
$ mkdir YOURNAME
$ cd YOURNAME

# Check that you are in the correct place:
$ pwd
/ufrc/bioinf_workshop/...

# Load required modules
$ module load dibig_tools bowtie2 samtools

# Generate a reference sequence
$ simseq.py reference.fa
Random sequence of 1000000bp written to file reference.fa

# Generate bowtie index of reference sequence:
$ bowtie2-build reference.fa
[unreadable messages...]

# Verify index was built
$ ls *.bt2
[several .bt2 files]

# Generate 100,000 paired-end short reads from reference
$ simreads.py -p -nr 100000 reference.fa

# Check that fastq files have been generated
$ ls -lh *.fastq.gz

# Run bowtie2, generating a bam file
bowtie2 -x reference -1 reads_R1.fastq.gz -2 reads_R2.fastq.gz | samtools view -b > aln.bam
100000 reads; of these:
100000 (100.00%) were paired; of these:
0 (0.00%) aligned concordantly 0 times
100000 (100.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
0 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
0 pairs aligned 0 times concordantly or discordantly; of these:
0 mates make up the pairs; of these:
0 (0.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
100.00% overall alignment rate

# Examine contents of bam file
$ samtools view aln.bam | more
[... bam file contents ...]

# How many reads?
$ samtools view aln.bam | wc -l
200000

# Bam files are much more useful if they are sorted and indexed.
$ samtools sort aln.bam -o aln.sort.bam
$ samtools index aln.sort.bam

# Show sequencing depth at each position:
$ samtools depth aln.sort.bam | more

# Results would be more interesting with lower-quality reads:
$ $ simreads.py -p -nr 100000 -qs 30 -qe 10 reference.fa

# Same process using predefined submit scripts:
$ submit btbuild.qsub reference.fa
48483284 (this will be different every time)

$ submit -after 48483284 bowtie2-pe.qsub reference reads_R1.fastq.gz reads_R2.fastq.gz aln.bam