BAM Query Index (qidx)
=====================

> Warning: this is a work in progress

`qidx` is a tool for indexing BAM alignments by query name. While
samtools can sort data by query name (also called the read name),
htslib does not provide built-in utilities to retrieve alignments
by query name. Such lookups can be advantageous for examining
multi-mapped alignments.

The [bri](https://github.com/jts/bri) utility predates `qidx` and
provides the same functionality, but it reads all alignments into
memory, which is impractical for most human genome data.

Notes:

- `qidx` creates a disk-backed index using a sparse memory-mapped file. The
underlying operating system must support `mmap` and file holes
- `qidx` doesn't currently support compression. For now, it is recommended to
use block-level compression provided by the filesystem (such as `zfs` with `zstd` compression)
- the BAM file must be sorted by query name before the index is built (for example, with `samtools sort -n`)
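
To illustrate the sparse memory-mapped file mentioned in the notes, here is a minimal Python sketch of the general technique. It is not `qidx`'s actual on-disk format: the slot layout, the hash function, and the names `SLOT_SIZE`, `NUM_SLOTS`, `slot_for` and `store_and_load` are all hypothetical, and collisions are not handled. It only shows how a large file can be sized up front, mapped, and written at scattered offsets, so that on a filesystem with hole support only the touched blocks occupy disk space.

```python
import mmap
import os
import tempfile

# Hypothetical layout: one 8-byte little-endian value per slot.
SLOT_SIZE = 8
NUM_SLOTS = 1 << 20  # logical table size; unwritten slots remain holes


def slot_for(name: str) -> int:
    """Map a query name to a slot using 64-bit FNV-1a (illustrative choice)."""
    h = 0xCBF29CE484222325
    for byte in name.encode():
        h = ((h ^ byte) * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h % NUM_SLOTS


def store_and_load(path: str, name: str, value: int) -> int:
    """Write `value` into the slot for `name`, then read it back via mmap."""
    with open(path, "w+b") as f:
        # Set the logical size without writing data; filesystems that
        # support holes allocate blocks only for pages written later.
        f.truncate(SLOT_SIZE * NUM_SLOTS)
        with mmap.mmap(f.fileno(), 0) as m:
            off = slot_for(name) * SLOT_SIZE
            m[off:off + SLOT_SIZE] = value.to_bytes(SLOT_SIZE, "little")
            m.flush()
            return int.from_bytes(m[off:off + SLOT_SIZE], "little")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.idx")
        print(store_and_load(path, "read_42", 123456))
        print(os.path.getsize(path))  # logical size, not allocated blocks
```

The file's logical size is fixed by `truncate`, while the space actually allocated depends on the filesystem, which is why support for both `mmap` and file holes is needed.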