author | flu0r1ne <flu0r1ne@flu0r1ne.net> | 2022-10-30 19:30:29 -0500 |
---|---|---|
committer | flu0r1ne <flu0r1ne@flu0r1ne.net> | 2022-10-30 19:30:29 -0500 |
commit | 20e52f326cdf1b6c2ca9b2c0b5be07637d9196d2 (patch) | |
tree | 1d957cfd5ca8b9ccd0a91fa5d5415599edd241d1 /README.md | |
Initial commit
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ff14ab3
--- /dev/null
+++ b/README.md
@@ -0,0 +1,22 @@
+Bam Query Index (qidx)
+=====================
+
+> Warning: this is a work in progress
+
+`qidx` is a tool for indexing BAM alignments by query name. While
+samtools can sort data by query name (also called
+the read name), htslib does not provide built-in utilities
+to retrieve alignments by query name. Such retrieval can be advantageous
+for examining multi-mapped alignments.
+
+While a utility, [bri](https://github.com/jts/bri), predated `qidx`
+and provides the same functionality, it reads all alignments into memory,
+which is impractical for most human genome data.
+
+Notes:
+
+- `qidx` creates a disk-backed index using a sparse memory-mapped file. The underlying
+operating system must support `mmap` and file holes
+- `qidx` doesn't currently support compression. It is currently recommended to
+use block-level compression (such as `zfs` `zstd` compression)
+- the BAM file must be sorted by query name before the index is built (`samtools sort -n`)
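The first note in the README describes a disk-backed index built on a sparse memory-mapped file. The following is a minimal sketch of that general mechanism, not the code `qidx` itself uses: the names `create_sparse`, `write_slot`, and `read_slot`, the 8-byte slot layout, and the 64 MiB table size are all hypothetical and chosen only for illustration.

```python
import mmap
import os
import tempfile

SLOT_SIZE = 8           # hypothetical: one 8-byte integer per slot
TABLE_SIZE = 1 << 26    # hypothetical: 64 MiB of file address space


def create_sparse(path):
    """Create a file of TABLE_SIZE bytes without writing any data.

    On filesystems that support holes, no data blocks are allocated
    until a region is actually written.
    """
    with open(path, "wb") as f:
        f.truncate(TABLE_SIZE)


def write_slot(path, slot, value):
    """Store an integer in one slot through a shared, writable mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), TABLE_SIZE) as m:
            off = slot * SLOT_SIZE
            m[off:off + SLOT_SIZE] = value.to_bytes(SLOT_SIZE, "little")
            m.flush()  # push dirty pages back to the file


def read_slot(path, slot):
    """Read one slot; untouched regions (holes) read back as zero bytes."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), TABLE_SIZE, access=mmap.ACCESS_READ) as m:
            off = slot * SLOT_SIZE
            return int.from_bytes(m[off:off + SLOT_SIZE], "little")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "index.bin")
    create_sparse(path)
    write_slot(path, 12345, 42)
    print("slot 12345:", read_slot(path, 12345))
    print("logical size:", os.path.getsize(path))
```

On a filesystem with hole support, the logical size reported by `os.path.getsize` stays at the full table size while the space actually allocated on disk (for example `os.stat(path).st_blocks` on Unix) grows only with the pages that were written. That is why such a design depends on both `mmap` and file holes.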