Package htsjdk.samtools
Class SRAIndex
- java.lang.Object
-
- htsjdk.samtools.SRAIndex
-
- All Implemented Interfaces:
BAMIndex, BrowseableBAMIndex, Closeable, AutoCloseable
public class SRAIndex extends Object implements BrowseableBAMIndex
Emulates a BAM index so that chunks of records can be requested from SRAFileReader.

How it works: SRA supports fast reading of alignments by reference position, so the "file" range for alignments is defined as the combined length of all references. Unaligned reads can then be read quickly by using read positions for lookup and (internally) filtering out aligned fragments. The total SRA "file" range is the sum of all reference lengths plus the number of reads (both aligned and unaligned) in the SRA archive. Chunks can therefore be used to look up both aligned and unaligned fragments.

BAM index bins are emulated by mapping SRA reference positions to bin numbers. Each bin number is then mapped to a list of chunks that represent SRA "file" positions (which are simply reference positions). Only the last level of BAM index bins is emulated; each such bin refers to a portion of the reference that is SRA_BIN_SIZE bases long. For any other bin a RuntimeException is thrown, which is acceptable because only the SRAIndex class creates bins.

The last level of bins was not meant to refer to fragments that only partially overlap a bin's reference positions. For that reason the returned chunk also extends 5000 bases to the left of the bin start, so that the SRA reader can retrieve fragments that start before the bin but still overlap it. Support is planned in the NGS API for reporting the maximum number of bases to go left to retrieve such fragments.

Created by andrii.nikitiuk on 9/4/15.
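The range layout and the left-extended chunk described above can be sketched as follows. This is a minimal, self-contained illustration and not the SRAIndex implementation: the class name, the method names, the value chosen for BIN_SIZE, and the clamping of chunks to the reference boundaries are all assumptions made for the example. Only the 5000-base left extension and the "sum of reference lengths plus number of reads" rule come from the description.

```java
import java.util.Arrays;

final class SraRangeSketch {
    // Hypothetical value for illustration; this page does not state the real constant.
    static final int BIN_SIZE = 16 * 1024;   // stands in for SRAIndex.SRA_BIN_SIZE
    // Bases to step left of a bin start, as stated in the class description.
    static final int LEFT_EXTENSION = 5000;

    /** Total emulated "file" range: sum of all reference lengths plus the number of reads. */
    static long totalRange(long[] referenceLengths, long numberOfReads) {
        return Arrays.stream(referenceLengths).sum() + numberOfReads;
    }

    /** Emulated "file" offset at which the given reference starts. */
    static long referenceStart(long[] referenceLengths, int referenceIndex) {
        long offset = 0;
        for (int i = 0; i < referenceIndex; i++) {
            offset += referenceLengths[i];
        }
        return offset;
    }

    /**
     * Chunk {start, end} (end exclusive) for the n-th last-level bin of a reference.
     * The start is moved LEFT_EXTENSION bases to the left of the bin start.
     * Clamping to the reference boundaries is an assumption of this sketch.
     */
    static long[] chunkForBin(long[] referenceLengths, int referenceIndex, int binInReference) {
        long refStart = referenceStart(referenceLengths, referenceIndex);
        long binStart = (long) binInReference * BIN_SIZE;
        long start = Math.max(0L, binStart - LEFT_EXTENSION);
        long end = Math.min(referenceLengths[referenceIndex], binStart + BIN_SIZE);
        return new long[] {refStart + start, refStart + end};
    }

    public static void main(String[] args) {
        long[] refs = {100_000L, 50_000L};
        System.out.println(totalRange(refs, 1_000L));
        System.out.println(Arrays.toString(chunkForBin(refs, 1, 1)));
    }
}
```

In this layout, offsets below the sum of the reference lengths address aligned fragments, and the remaining offsets (one per read) address reads, from which aligned fragments are filtered out.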
-
-
Field Summary
Fields
Modifier and Type / Field / Description
static int SRA_BIN_SIZE
    Number of reference bases that a bin in the last level can represent.
static int SRA_CHUNK_SIZE
    Chunks of this size are created when using the SRA index.
Fields inherited from interface htsjdk.samtools.BAMIndex
BAI_INDEX_SUFFIX, BAMIndexSuffix, CSI_INDEX_SUFFIX
-
-
Constructor Summary
Constructors
Constructor / Description
SRAIndex(SAMFileHeader header, SRAIterator.RecordRangeInfo recordRangeInfo)
-
Method Summary
Modifier and Type / Method / Description
void close()
    Close the index and release any associated resources.
BinList getBinsOverlapping(int referenceIndex, int startPos, int endPos)
    Provides a list of bins that contain bases at the requested positions.
int getFirstLocusInBin(Bin bin)
    Gets the first locus that this bin can index into.
int getLastLocusInBin(Bin bin)
    Gets the last locus that this bin can index into.
int getLevelForBin(Bin bin)
    SRA only operates on bins from the last level.
int getLevelSize(int levelNumber)
    Gets the size (number of bins in) a given level of a BAM index.
BAMIndexMetaData getMetaData(int reference)
    Gets meta data for the given reference, including information about the number of aligned, unaligned, and noCoordinate records.
BAMFileSpan getSpanOverlapping(int referenceIndex, int startPos, int endPos)
    Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive.
BAMFileSpan getSpanOverlapping(Bin bin)
    Perform an overlapping query of all bins bounding the given location.
long getStartOfLastLinearBin()
    Gets the start of the last linear bin in the index.
-
-
-
Field Detail
-
SRA_BIN_SIZE
public static final int SRA_BIN_SIZE
Number of reference bases that a bin in the last level can represent.
- See Also:
- Constant Field Values
-
SRA_CHUNK_SIZE
public static final int SRA_CHUNK_SIZE
Chunks of this size are created when using the SRA index.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
SRAIndex
public SRAIndex(SAMFileHeader header, SRAIterator.RecordRangeInfo recordRangeInfo)
- Parameters:
header
- SAM header
recordRangeInfo
- info about record ranges within the SRA archive
-
-
Method Detail
-
getLevelSize
public int getLevelSize(int levelNumber)
Gets the size (number of bins in) a given level of a BAM index.
- Specified by:
getLevelSize
in interface BrowseableBAMIndex
- Parameters:
levelNumber
- Level for which to inspect the size.
- Returns:
- Size of the given level.
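For orientation, the level sizes of the standard BAI binning scheme defined in the SAM/BAM specification can be computed as in the sketch below: six levels, each with eight times as many bins as the level above it. This page does not state what SRAIndex returns for each level, so treating it as following the standard scheme is an assumption; the class and method names are hypothetical.

```java
final class BamLevelSketch {
    /** Number of levels in the standard BAI binning scheme. */
    static final int NUM_LEVELS = 6;

    /** Number of bins in a level: 1, 8, 64, 512, 4096, 32768. */
    static int levelSize(int levelNumber) {
        checkLevel(levelNumber);
        return 1 << (3 * levelNumber);
    }

    /** Number of the first bin in a level: 0, 1, 9, 73, 585, 4681. */
    static int levelStart(int levelNumber) {
        checkLevel(levelNumber);
        return ((1 << (3 * levelNumber)) - 1) / 7;
    }

    private static void checkLevel(int levelNumber) {
        if (levelNumber < 0 || levelNumber >= NUM_LEVELS) {
            throw new IllegalArgumentException("Level out of range: " + levelNumber);
        }
    }

    public static void main(String[] args) {
        for (int level = 0; level < NUM_LEVELS; level++) {
            System.out.println(level + ": start=" + levelStart(level) + " size=" + levelSize(level));
        }
    }
}
```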
-
getLevelForBin
public int getLevelForBin(Bin bin)
SRA only operates on bins from the last level.
- Specified by:
getLevelForBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin for which to determine the level.
- Returns:
- bin level
-
getFirstLocusInBin
public int getFirstLocusInBin(Bin bin)
Gets the first locus that this bin can index into.
- Specified by:
getFirstLocusInBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin to test.
- Returns:
- The first position associated with the given bin number.
-
getLastLocusInBin
public int getLastLocusInBin(Bin bin)
Gets the last locus that this bin can index into.
- Specified by:
getLastLocusInBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin to test.
- Returns:
- The last position associated with the given bin number.
-
getBinsOverlapping
public BinList getBinsOverlapping(int referenceIndex, int startPos, int endPos)
Provides a list of bins that contain bases at the requested positions.
- Specified by:
getBinsOverlapping
in interface BrowseableBAMIndex
- Parameters:
referenceIndex
- sequence of desired SAMRecords
startPos
- 1-based start of the desired interval, inclusive
endPos
- 1-based end of the desired interval, inclusive
- Returns:
- a list of bins that contain relevant data
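The mapping from a 1-based, inclusive interval to last-level bin numbers might look like the sketch below. It is an illustration under stated assumptions, not the library's code: BIN_SIZE is a hypothetical stand-in for SRA_BIN_SIZE, LAST_LEVEL_START is the first last-level bin number in the standard BAI scheme, and the class and method names are invented for the example. It returns plain bin numbers rather than a BinList.

```java
import java.util.ArrayList;
import java.util.List;

final class SraBinLookupSketch {
    // Hypothetical value for illustration; this page does not state the real constant.
    static final int BIN_SIZE = 16 * 1024;      // stands in for SRAIndex.SRA_BIN_SIZE
    // First bin number of the last level in the standard BAI binning scheme.
    static final int LAST_LEVEL_START = 4681;

    /** Last-level bin numbers that contain bases in [startPos, endPos] (1-based, inclusive). */
    static List<Integer> binsOverlapping(int startPos, int endPos) {
        if (startPos < 1 || endPos < startPos) {
            throw new IllegalArgumentException("Expected 1 <= startPos <= endPos");
        }
        // Convert to 0-based positions before dividing by the bin size.
        int firstBin = (startPos - 1) / BIN_SIZE;
        int lastBin = (endPos - 1) / BIN_SIZE;
        List<Integer> bins = new ArrayList<>();
        for (int bin = firstBin; bin <= lastBin; bin++) {
            bins.add(LAST_LEVEL_START + bin);
        }
        return bins;
    }

    public static void main(String[] args) {
        System.out.println(binsOverlapping(16_384, 16_385));
    }
}
```

An interval that straddles a bin boundary yields two consecutive bin numbers, while an interval inside a single bin yields one.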
-
getSpanOverlapping
public BAMFileSpan getSpanOverlapping(Bin bin)
Description copied from interface: BrowseableBAMIndex
Perform an overlapping query of all bins bounding the given location.
- Specified by:
getSpanOverlapping
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin over which to perform an overlapping query.
- Returns:
- The file pointers
-
getSpanOverlapping
public BAMFileSpan getSpanOverlapping(int referenceIndex, int startPos, int endPos)
Description copied from interface: BAMIndex
Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive. See the BAM spec for more information on how a chunk is represented.
- Specified by:
getSpanOverlapping
in interface BAMIndex
- Parameters:
referenceIndex
- The contig.
startPos
- Genomic start of query.
endPos
- Genomic end of query.
- Returns:
- A file span listing the chunks in the BAM file.
-
getStartOfLastLinearBin
public long getStartOfLastLinearBin()
Description copied from interface: BAMIndex
Gets the start of the last linear bin in the index.
- Specified by:
getStartOfLastLinearBin
in interface BAMIndex
- Returns:
- a position where aligned fragments end
-
getMetaData
public BAMIndexMetaData getMetaData(int reference)
Description copied from interface: BAMIndex
Gets meta data for the given reference, including information about the number of aligned, unaligned, and noCoordinate records.
- Specified by:
getMetaData
in interface BAMIndex
- Parameters:
reference
- the reference of interest
- Returns:
- meta data for the reference
-
-