Warning
vg commands do not work on compressed graphs (.gz files); they fail with an ‘invalid graph type’ error.
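One way around this is to decompress the graph before passing it to vg. The sketch below is only an illustration: the function name ensure_plain_graph is hypothetical, and it assumes that compressed graphs are recognisable by their .gz suffix.

```shell
# Hypothetical helper (not part of vg): if the graph is gzip-compressed,
# write an uncompressed copy next to it and print the name of that copy;
# otherwise print the original name unchanged.
ensure_plain_graph() {
    case "$1" in
        *.gz)
            # Keep the .gz file; strip the suffix for the output name.
            gzip -dc "$1" > "${1%.gz}" || return 1
            printf '%s\n' "${1%.gz}"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}
```

The printed name can then be handed to any vg command, for example graph=$(ensure_plain_graph graph.gfa.gz).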
Convert rGFA to an approximate GFA1
The command vg convert -Wf -r 89 graph_rgfa.gfa > graph_gfa1.gfa
adds the variations and the reference as paths in the graph. These paths do not describe genomes, because rGFA is not a container that keeps this information (it erases SNPs and loses paths), but the result is a compatible GFA1 file, which can be useful.
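The value passed to -r depends on the ranks stored in the rGFA file, where each S-line carries an SR:i tag. As a rough sketch, and assuming that -r is the highest rank whose segments should be imported as paths, the largest rank in a file can be found as follows. The function name max_rgfa_rank is hypothetical.

```shell
# Hypothetical helper: print the highest SR:i rank found on the S-lines
# of an rGFA file (prints 0 when no SR tag is present).
max_rgfa_rank() {
    awk -F'\t' '
        $1 == "S" {
            # Optional tags start at the fourth column of an S-line.
            for (i = 4; i <= NF; i++)
                if ($i ~ /^SR:i:[0-9]+$/) {
                    rank = substr($i, 6) + 0
                    if (rank > max) max = rank
                }
        }
        END { print max + 0 }
    ' "$1"
}
```

Under that assumption, the result could be used in place of the hard-coded value: vg convert -Wf -r "$(max_rgfa_rank graph_rgfa.gfa)" graph_rgfa.gfa > graph_gfa1.gfa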
Convert from GFA1.1 to GFA1
vg convert in.gfa -W -f > out.gfa
-W
suppresses W-lines (paths are written as P-lines instead)
-f
writes the output in GFA format
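A quick way to check the result of this conversion is to count the record types before and after: the output should contain no W-lines. The helper below is a minimal sketch. Its name, count_gfa_records, is hypothetical, and it assumes a tab-separated GFA file whose first column is the record type.

```shell
# Hypothetical helper: count the records of a given type in a GFA file.
#   $1 = record type found in the first column (e.g. W, P, S, L)
#   $2 = GFA file
count_gfa_records() {
    awk -F'\t' -v t="$1" '$1 == t { n++ } END { print n + 0 }' "$2"
}
```

For example, count_gfa_records W out.gfa is expected to print 0 after the conversion, while count_gfa_records P out.gfa gives the number of paths written as P-lines.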
Convert from vg or JSON to GFA
vg view [-J|-V|-F] input_graph -g > out.gfa
Call bubbles on graph to get variants
vg deconstruct -e -a -p ref graph.gfa > variants.vcf
-p [STR]
is the path to use as the reference when calling variants. For W-lines, paths are referenced with two parts: join them with a ‘#’ sign.
Generally, post-treatment is done on those VCF files, using vcfbub and vcfwave.
# For some reason the input must be gzipped: gzip file.vcf
# The VCF must also contain snarl-level annotations
vcfbub -l 0 -r 10000 -i pggb.vcf.gz >filt_pggb.vcf
# vcfwave requires the GT field to be removed (-i inplace and gensub are GNU awk extensions)
awk -i inplace '{$0=gensub(/\s*\S+/,"",3)}1' file
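To see how much the filtering step removed, the number of variant records can be compared before and after. This is only a sketch: count_vcf_records is a hypothetical helper, and it assumes that header lines start with ‘#’ and that compressed files end in .gz.

```shell
# Hypothetical helper: count the variant records (non-header, non-empty
# lines) of a VCF file, whether plain text or gzip-compressed.
count_vcf_records() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '!/^#/ && NF { n++ } END { print n + 0 }'
}
```

Running count_vcf_records pggb.vcf.gz and then count_vcf_records filt_pggb.vcf shows how many records vcfbub kept.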