Ting-You Wang

Nanopore raw data visualization using squigualiser

Posted on June 18, 2024

An Introduction to Nanopore raw data visualization using squigualiser

Software preparation

Tool for converting raw data to BLOW5 format

  • If the raw data is in POD5 format, blue-crab is required.
  • If the raw data is in FAST5 format, slow5tools is required.
# Create environment
mamba create --name ont python=3.9
mamba activate ont


# Install blue-crab
mamba install zstd
python3 -m pip install --upgrade pip
pip install blue-crab

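# Install slow5tools (distributed through bioconda; add "-c bioconda -c conda-forge" if those channels are not configured)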
mamba install hdf5
mamba install slow5tools

Aligning raw signals to basecalled reads using f5c

mamba install f5c=1.4

Signal-to-read visualization using squigualiser

pip install squigualiser
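
Before moving on, it can help to confirm that every tool installed above is actually reachable from the activated environment. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not part of any of the packages, and it only tests whether each command is on PATH (it does not check versions).

```shell
# Report which of the given command-line tools are on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (run inside the activated 'ont' environment):
# check_tools blue-crab slow5tools f5c squigualiser
```

If a tool is reported as missing, re-run the corresponding install command in the same environment before continuing.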

Step 1: Converting the data format

For multiple FAST5 files

slow5tools f2s ./fast5_dir -d blow5_dir # convert each FAST5 file into a BLOW5 file in blow5_dir
slow5tools merge blow5_dir -o data.blow5 # merge the BLOW5 files into a single file
slow5tools get data.blow5 -l read_ids.txt --to blow5 -o target.blow5 # extract the records whose read IDs are listed in read_ids.txt (one ID per line)
slow5tools index target.blow5 # index the extracted BLOW5 file
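
The `slow5tools get` command above expects read_ids.txt to hold one read ID per line. If the reads of interest are already in a FASTQ file, one possible way to build that list is sketched below. It assumes a standard FASTQ with four lines per record (sequence and quality not wrapped) and a header whose first word is the read ID; `fastq_read_ids` is a hypothetical helper name.

```shell
# Print the read ID (first word of each header line, without the leading '@')
# for every record in a four-line-per-record FASTQ file.
fastq_read_ids() {
    awk 'NR % 4 == 1 { sub(/^@/, "", $1); print $1 }' "$1"
}

# Example call (target.fastq holds the basecalled reads of interest):
# fastq_read_ids target.fastq > read_ids.txt
```

Counting header lines by position (every fourth line) rather than by a leading `@` avoids mistaking quality strings that start with `@` for headers.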

For a single POD5 file

blue-crab p2s data.pod5 -o data.blow5
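
When a project mixes both input formats, the choice of converter follows from the file extension. The sketch below only prints the command it would run, so it can be reviewed first; `conversion_command` is a hypothetical helper name, and it assumes single input files whose names contain no spaces.

```shell
# Print the conversion command for a single raw-data file, chosen by extension.
# Nothing is executed; the output file is named after the input with a .blow5 suffix.
conversion_command() {
    input=$1
    output=${input%.*}.blow5
    case "$input" in
        *.pod5)  echo "blue-crab p2s $input -o $output" ;;
        *.fast5) echo "slow5tools f2s $input -o $output" ;;
        *)       echo "unsupported input: $input" >&2; return 1 ;;
    esac
}

# Example call:
# conversion_command data.pod5
```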

Step 2: Aligning raw signals to basecalled reads

f5c resquiggle -c --rna --pore r9 -o target.paf target.fastq target.blow5
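
After resquiggling, it is worth checking that every read in the FASTQ received an alignment record, since reads without one cannot be plotted in the next step. The sketch below lists read IDs that appear in the FASTQ but not in the PAF. It assumes that column 1 of the PAF holds the read ID and that the FASTQ has four lines per record; `reads_missing_from_paf` is a hypothetical helper name.

```shell
# List read IDs that are present in the FASTQ but have no record in the PAF.
# Assumes PAF column 1 is the read ID and the FASTQ has four lines per record.
reads_missing_from_paf() {
    fastq=$1
    paf=$2
    ids_fastq=$(mktemp)
    ids_paf=$(mktemp)
    awk 'NR % 4 == 1 { sub(/^@/, "", $1); print $1 }' "$fastq" | sort -u > "$ids_fastq"
    cut -f 1 "$paf" | sort -u > "$ids_paf"
    # comm -23 keeps the lines that occur only in the first (FASTQ) list
    comm -23 "$ids_fastq" "$ids_paf"
    rm -f "$ids_fastq" "$ids_paf"
}

# Example call:
# reads_missing_from_paf target.fastq target.paf
```

An empty result means every read in the FASTQ has at least one record in the PAF.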

Step 3: Signal-to-read visualization

squigualiser plot -f target.fastq -s target.blow5 -a target.paf -o out_dir --save_svg

# plot the whole read sequence; the output is an HTML file
squigualiser plot -f target.fastq -s target.blow5 -a target.paf -o target_dir --rna --sig_scale znorm --fixed_width --base_limit 20000 --sig_plot_limit 99999999

# restrict the plot to region 200-500 (--region), without sample points or colours, for SVG output
squigualiser plot -f target.fastq -s target.blow5 -a target.paf -o target_dir --rna --sig_scale znorm --fixed_width --base_limit 20000 --sig_plot_limit 99999999 --no_samples --no_colours --save_svg --region 200-500 --xrange 6000 --plot_limit 9000

Options

-r read_id # plot only the read with this read ID
--rna # required when the reads are RNA
--fixed_width # plot with fixed base width
--base_limit # maximum number of bases to plot
--sig_plot_limit # maximum number of signal samples to plot
