Diffstat (limited to 'README.md')
-rw-r--r-- README.md 22
1 files changed, 22 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ff14ab3
--- /dev/null
+++ b/README.md
@@ -0,0 +1,22 @@
+BAM Query Index (qidx)
+======================
+
+> Warning: this is a work in progress
+
+`qidx` is a tool for indexing BAM alignments by query name. While
+samtools can sort data by query name (also called the read name),
+htslib does not provide built-in utilities to retrieve alignments
+by query name. Such retrieval can be advantageous for examining
+multi-mapped alignments.
+
+While an earlier utility, [bri](https://github.com/jts/bri), provides
+the same functionality, it reads all alignments into memory, which
+is impractical for most human genome data.
+
+Notes:
+
+- `qidx` creates a disk-backed index using a sparse memory-mapped file. The
+underlying operating system must support `mmap` and file holes.
+- `qidx` doesn't currently support compression. For now, it is recommended to
+use block-level compression (such as `zfs` `zstd` compression).
+- The BAM file must be sorted by query name (`samtools sort -n`) before the index is built.
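
The sparse memory-mapped file mentioned in the README notes above can be illustrated with a short Python sketch. This is not `qidx`'s implementation: the function name `create_sparse_mapping`, the file name `example.idx`, and the sizes are all hypothetical, and whether the unwritten range really occupies no disk space depends on the filesystem supporting holes.

```python
import mmap
import os
import tempfile


def create_sparse_mapping(path, size):
    """Create a file with logical length `size` and map it read/write.

    Hypothetical helper for illustration only. Extending a file with
    truncate() writes no data, so filesystems that support holes leave
    the untouched range unallocated.
    """
    with open(path, "wb") as f:
        f.truncate(size)
    fd = os.open(path, os.O_RDWR)
    try:
        # The mapping keeps its own reference to the file, so the
        # descriptor can be closed once the mapping exists.
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.idx")
        size = 64 * 1024 * 1024  # 64 MiB logical size (arbitrary)
        mapping = create_sparse_mapping(path, size)
        # Writing near the end dirties only the pages that are touched.
        mapping[size - 8:size] = b"qidxdemo"
        mapping.flush()
        mapping.close()
        st = os.stat(path)
        print("logical size:", st.st_size)
        # st_blocks is not available on every platform.
        blocks = getattr(st, "st_blocks", None)
        if blocks is not None:
            print("approx. bytes allocated:", blocks * 512)
```

On a filesystem with hole support, the allocated size reported at the end is typically far smaller than the logical size, which is why a large, mostly empty index can stay cheap on disk.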