Graph construction with minigraph

To build a graph from a set of input genomes with minigraph's default parameters, run:

minigraph -cxggs reference.fasta alt00.fasta alt01.fasta [...] > graph.gfa

You can pass any number of .fasta/.fa files, as well as .gfa graph files. Details about the arguments are in the manpage.

  • -c enables base-level alignment
  • -x selects a preset, here ggs, which is a simple algorithm for incremental graph generation

Publication and availability

Publication is available, and source code is available here

The output is in rGFA format, a sub-type of GFA1 that adds information about positions in the graph but drops information about which genome each fragment comes from. rGFA has no W-lines or P-lines, which are the lines that record which fragment belongs to which genome. This is a deliberate design choice in the rGFA formalism: H. Li sees his tool as a way to embed multiple genomes onto a reference, not as a reference-free approach.
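To make the positional information concrete, here is a minimal Python sketch that reads the stable-coordinate tags rGFA attaches to S-lines: SN (stable sequence name), SO (offset on that sequence) and SR (rank, 0 for the reference). The sample lines, the sequence names and the RgfaSegment / parse_rgfa_segment names are hypothetical and only serve the illustration; this is not part of minigraph.

```python
from typing import NamedTuple


class RgfaSegment(NamedTuple):
    name: str           # segment identifier, e.g. "s1"
    length: int         # segment length in bases
    stable_name: str    # SN:Z tag, sequence the segment was taken from
    stable_offset: int  # SO:i tag, 0-based offset on that sequence
    rank: int           # SR:i tag, 0 means the segment is on the reference


def parse_rgfa_segment(line: str) -> RgfaSegment:
    """Parse one rGFA S-line and return its stable coordinates."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3 or fields[0] != "S":
        raise ValueError("not a GFA S-line")
    name, sequence = fields[1], fields[2]
    tags = {}
    for field in fields[3:]:
        # Optional fields look like TAG:TYPE:VALUE; the value may contain ':'.
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    # Prefer the LN tag, since the sequence column may be "*".
    length = tags["LN"] if "LN" in tags else len(sequence)
    return RgfaSegment(name, length, tags["SN"], tags["SO"], tags["SR"])


# Hypothetical S-lines, written by hand for this example.
SAMPLE_LINES = [
    "S\ts1\tACGTAC\tLN:i:6\tSN:Z:chr1\tSO:i:0\tSR:i:0",
    "S\ts2\tGGT\tLN:i:3\tSN:Z:alt00_contig1\tSO:i:120\tSR:i:1",
]

if __name__ == "__main__":
    for sample in SAMPLE_LINES:
        seg = parse_rgfa_segment(sample)
        origin = "reference" if seg.rank == 0 else "non-reference"
        end = seg.stable_offset + seg.length
        print(f"{seg.name}\t{seg.stable_name}:{seg.stable_offset}-{end}\t{origin}")
```

Note that these tags only say where a segment was first introduced into the graph. They do not say which of the input genomes traverse it, which is the information W-lines or P-lines would carry.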

A pull request adding P-line support to minigraph was opened in 2022 but was never accepted. However, that version can still be obtained through the associated commit ID.

Warning

minigraph prefixes node names with s; this may cause crashes with some tools (such as odgi). To convert these rGFA files to standard GFA files, you can use gfautil.
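For illustration only, the renaming itself can be sketched in a few lines of Python. This is not how gfautil works internally, and it assumes that purely numeric node names are what the downstream tool expects; the function names and the sample lines are hypothetical.

```python
def strip_segment_prefix(name: str, prefix: str = "s") -> str:
    """Turn "s12" into "12"; leave names that do not match untouched."""
    rest = name[len(prefix):]
    if name.startswith(prefix) and rest.isdigit():
        return rest
    return name


def rename_gfa_line(line: str) -> str:
    """Rename segment identifiers on S-lines and L-lines of a GFA file."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "S":
        # S <name> <sequence> [tags...]
        fields[1] = strip_segment_prefix(fields[1])
    elif fields[0] == "L":
        # L <from> <from_orient> <to> <to_orient> <overlap> [tags...]
        fields[1] = strip_segment_prefix(fields[1])
        fields[3] = strip_segment_prefix(fields[3])
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical lines, written by hand for this example.
    for sample in ["S\ts1\tACGT\tLN:i:4", "L\ts1\t+\ts2\t-\t0M"]:
        print(rename_gfa_line(sample))
```

Other line types are passed through unchanged, so optional tags and header lines survive the conversion.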

According to this answer, it may be possible to recover some kind of paths from an rGFA using vg convert.