Package htsjdk.variant.variantcontext
Class GenotypesContext
- java.lang.Object
-
- htsjdk.variant.variantcontext.GenotypesContext
-
- All Implemented Interfaces:
Serializable
,Iterable<Genotype>
,Collection<Genotype>
,List<Genotype>
- Direct Known Subclasses:
LazyGenotypesContext
public class GenotypesContext extends Object implements List<Genotype>, Serializable
Represents an ordered collection of Genotype objects.
- See Also:
- Serialized Form
-
-
Field Summary
Fields: Modifier and Type, Field, Description
static GenotypesContext
NO_GENOTYPES
Static constant value for an empty GenotypesContext.
protected ArrayList<Genotype>
notToBeDirectlyAccessedGenotypes
An ArrayList of the genotypes contained in this context. WARNING: to enable the lazy version of this class, no methods should directly access this variable.
protected List<String>
sampleNamesInOrder
A list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
protected Map<String,Integer>
sampleNameToOffset
A map optimized for efficient lookup.
static long
serialVersionUID
-
Constructor Summary
Constructors: Modifier, Constructor, Description
protected
GenotypesContext()
Create an empty GenotypesContext.
protected
GenotypesContext(int n)
Create an empty GenotypesContext with initial capacity for n elements.
protected
GenotypesContext(ArrayList<Genotype> genotypes)
Create a GenotypesContext containing genotypes.
protected
GenotypesContext(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
-
Method Summary
Methods: Modifier and Type, Method, Description
void
add(int i, Genotype genotype)
boolean
add(Genotype genotype)
Adds a single genotype to this context.
boolean
addAll(int i, Collection<? extends Genotype> genotypes)
boolean
addAll(Collection<? extends Genotype> genotypes)
Adds all of the genotypes to this context. See add(Genotype) for important information about this function's constraints and performance costs.
void
checkImmutability()
void
clear()
boolean
contains(Object o)
boolean
containsAll(Collection<?> objects)
boolean
containsSample(String sample)
boolean
containsSamples(Collection<String> samples)
static GenotypesContext
copy(GenotypesContext toCopy)
Create a freshly allocated GenotypesContext containing the genotypes in toCopy.
static GenotypesContext
copy(Collection<Genotype> toCopy)
Create a GenotypesContext containing the genotypes in toCopy, in iteration order.
static GenotypesContext
create()
Basic creation routine.
static GenotypesContext
create(int nGenotypes)
Basic creation routine.
static GenotypesContext
create(Genotype... genotypes)
Create a fully resolved GenotypesContext containing genotypes.
static GenotypesContext
create(ArrayList<Genotype> genotypes)
Create a fully resolved GenotypesContext containing genotypes.
static GenotypesContext
create(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
protected void
ensureSampleNameMap()
protected void
ensureSampleOrdering()
Genotype
get(int i)
Genotype
get(String sampleName)
Gets the Genotype associated with this sampleName, or null if none is found.
protected ArrayList<Genotype>
getGenotypes()
int
getMaxPloidy(int defaultPloidy)
Returns the maximum ploidy among all samples, or defaultPloidy if no genotypes are present.
Set<String>
getSampleNames()
List<String>
getSampleNamesOrderedByName()
GenotypesContext
immutable()
int
indexOf(Object o)
protected void
invalidateSampleNameMap()
protected void
invalidateSampleOrdering()
boolean
isEmpty()
boolean
isLazyWithData()
boolean
isMutable()
Iterable<Genotype>
iterateInSampleNameOrder()
Iterate over the Genotypes in this context in sample name order (A, B, C), regardless of the underlying order in the vector of genotypes.
Iterable<Genotype>
iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder)
Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder.
Iterator<Genotype>
iterator()
int
lastIndexOf(Object o)
ListIterator<Genotype>
listIterator()
ListIterator<Genotype>
listIterator(int i)
Genotype
remove(int i)
Note that remove requires us to invalidate our sample -> index cache.
boolean
remove(Object o)
See remove(int) for an important warning.
boolean
removeAll(Collection<?> objects)
Genotype
replace(Genotype genotype)
Replaces the genotype in this context. For efficiency reasons, the genotype is not added if it is not already present.
boolean
retainAll(Collection<?> objects)
Genotype
set(int i, Genotype genotype)
int
size()
List<Genotype>
subList(int i, int i1)
GenotypesContext
subsetToSamples(Set<String> samples)
Return a freshly allocated subcontext of this context containing only the samples listed in samples.
Object[]
toArray()
<T> T[]
toArray(T[] ts)
String
toString()
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.util.Collection
parallelStream, removeIf, stream, toArray
-
Methods inherited from interface java.util.List
equals, hashCode, replaceAll, sort, spliterator
-
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
- See Also:
- Constant Field Values
-
NO_GENOTYPES
public static final GenotypesContext NO_GENOTYPES
static constant value for an empty GenotypesContext. Useful since so many VariantContexts have no genotypes
-
sampleNamesInOrder
protected List<String> sampleNamesInOrder
A list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
-
sampleNameToOffset
protected Map<String,Integer> sampleNameToOffset
a map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
-
-
Constructor Detail
-
GenotypesContext
protected GenotypesContext()
Create an empty GenotypesContext.
-
GenotypesContext
protected GenotypesContext(int n)
Create an empty GenotypesContext with initial capacity for n elements.
-
GenotypesContext
protected GenotypesContext(ArrayList<Genotype> genotypes)
Create a GenotypesContext containing genotypes.
-
GenotypesContext
protected GenotypesContext(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
- Parameters:
genotypes
- our genotypes, in arbitrary order
sampleNameToOffset
- map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
sampleNamesInOrder
- a list of sample names, one for each genotype in genotypes, sorted in alphabetical order
-
-
Method Detail
-
create
public static final GenotypesContext create()
Basic creation routine.
- Returns:
- an empty, mutable GenotypesContext
-
create
public static final GenotypesContext create(int nGenotypes)
Basic creation routine.
- Returns:
- an empty, mutable GenotypesContext with initial capacity for nGenotypes
-
create
public static final GenotypesContext create(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
- Parameters:
genotypes
- our genotypes, in arbitrary order
sampleNameToOffset
- map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
sampleNamesInOrder
- a list of sample names, one for each genotype in genotypes, sorted in alphabetical order
- Returns:
- a mutable GenotypesContext containing genotypes with already present lookup data
-
create
public static final GenotypesContext create(ArrayList<Genotype> genotypes)
Create a fully resolved GenotypesContext containing genotypes.
- Parameters:
genotypes
- our genotypes, in arbitrary order
- Returns:
- a mutable GenotypesContext containing genotypes
-
create
public static final GenotypesContext create(Genotype... genotypes)
Create a fully resolved GenotypesContext containing genotypes.
- Parameters:
genotypes
- our genotypes, in arbitrary order
- Returns:
- a mutable GenotypesContext containing genotypes
-
copy
public static final GenotypesContext copy(GenotypesContext toCopy)
Create a freshly allocated GenotypesContext containing the genotypes in toCopy.
- Parameters:
toCopy
- the GenotypesContext to copy
- Returns:
- a mutable GenotypesContext containing genotypes
-
copy
public static final GenotypesContext copy(Collection<Genotype> toCopy)
Create a GenotypesContext containing the genotypes in toCopy, in iteration order.
- Parameters:
toCopy
- the collection of genotypes
- Returns:
- a mutable GenotypesContext containing genotypes
-
immutable
public final GenotypesContext immutable()
-
isMutable
public boolean isMutable()
-
checkImmutability
public final void checkImmutability() throws UnsupportedOperationException
- Throws:
UnsupportedOperationException
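The relationship between immutable(), isMutable() and checkImmutability() can be illustrated with a small stand-in. This is a minimal sketch, not htsjdk code: the class ImmutableFlagSketch and its String elements are hypothetical, and it assumes a simple flag that mutating methods check before proceeding. Whether the real immutable() returns the same instance is not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in (not htsjdk): a list guarded by an immutability flag. */
final class ImmutableFlagSketch {
    private final List<String> items = new ArrayList<>();
    private boolean immutable = false;

    /** Assumption: freezing flips a flag and returns this instance for chaining. */
    ImmutableFlagSketch immutable() {
        immutable = true;
        return this;
    }

    boolean isMutable() {
        return !immutable;
    }

    /** Throws if the context has been frozen; called by every mutating method. */
    void checkImmutability() {
        if (immutable) {
            throw new UnsupportedOperationException("context is immutable");
        }
    }

    boolean add(String item) {
        checkImmutability();
        return items.add(item);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        ImmutableFlagSketch ctx = new ImmutableFlagSketch();
        ctx.add("sampleA");
        ctx.immutable();
        try {
            ctx.add("sampleB");
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("size=" + ctx.size() + " mutable=" + ctx.isMutable());
    }
}
```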
-
invalidateSampleNameMap
protected void invalidateSampleNameMap()
-
invalidateSampleOrdering
protected void invalidateSampleOrdering()
-
ensureSampleOrdering
protected void ensureSampleOrdering()
-
ensureSampleNameMap
protected void ensureSampleNameMap()
-
isLazyWithData
public boolean isLazyWithData()
-
clear
public void clear()
-
size
public int size()
-
isEmpty
public boolean isEmpty()
-
add
public boolean add(Genotype genotype) throws UnsupportedOperationException
Adds a single genotype to this context. There are several constraints on this input, and important impacts on the performance of other functions provided by this context. First, the sample name of genotype must be unique within this context. This is not enforced in the code itself, though you will violate the contract on this context if you add duplicate samples and are running with CoFoJa enabled. Second, adding genotype also updates the sample name -> index map, so add() followed by containsSample and related functions is an efficient series of operations. Third, adding the genotype invalidates the sorted list of sample names, so add() followed by any of the SampleNamesInOrder operations is inefficient, as each SampleNamesInOrder call must rebuild the sorted list of sample names at an O(n log n) cost.
- Specified by:
add
in interface Collection<Genotype>
- Specified by:
add
in interface List<Genotype>
- Parameters:
genotype
- Throws:
UnsupportedOperationException
- if the context has been made immutable
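The cost asymmetry described for add() can be sketched with two caches: a name -> offset map that add() keeps current, and a sorted name list that add() invalidates. This is a minimal illustration under assumptions, not the htsjdk implementation: SampleCacheSketch, namesOrderedByName and sortCount are hypothetical names, and plain sample-name strings stand in for Genotype objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in (not htsjdk): models the two caches with plain sample names. */
final class SampleCacheSketch {
    private final List<String> samples = new ArrayList<>();            // insertion order
    private final Map<String, Integer> nameToOffset = new HashMap<>(); // kept current by add()
    private List<String> sortedNames = null;                           // null means "invalidated"
    int sortCount = 0;                                                 // times the sorted list was rebuilt

    boolean add(String sample) {
        nameToOffset.put(sample, samples.size()); // O(1): the lookup map stays valid
        samples.add(sample);
        sortedNames = null;                       // the sorted view must be rebuilt on next use
        return true;
    }

    boolean containsSample(String sample) {
        return nameToOffset.containsKey(sample);  // cheap right after add()
    }

    List<String> namesOrderedByName() {
        if (sortedNames == null) {                // lazy O(n log n) rebuild
            sortedNames = new ArrayList<>(samples);
            Collections.sort(sortedNames);
            sortCount++;
        }
        return sortedNames;
    }

    public static void main(String[] args) {
        SampleCacheSketch ctx = new SampleCacheSketch();
        for (String s : new String[] {"C", "A", "B"}) {
            ctx.add(s);
            ctx.namesOrderedByName();             // forces a re-sort after every add
        }
        System.out.println(ctx.namesOrderedByName() + " sorts=" + ctx.sortCount);
    }
}
```

In this sketch, interleaving add() with the sorted view triggers one sort per add, whereas adding everything first and sorting once would cost a single rebuild.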
-
addAll
public boolean addAll(Collection<? extends Genotype> genotypes)
Adds all of the genotypes to this context. See add(Genotype) for important information about this function's constraints and performance costs.
-
addAll
public boolean addAll(int i, Collection<? extends Genotype> genotypes)
-
contains
public boolean contains(Object o)
-
containsAll
public boolean containsAll(Collection<?> objects)
- Specified by:
containsAll
in interface Collection<Genotype>
- Specified by:
containsAll
in interface List<Genotype>
-
getMaxPloidy
public int getMaxPloidy(int defaultPloidy)
Returns the maximum ploidy among all samples, or defaultPloidy if no genotypes are present.
- Parameters:
defaultPloidy
- the default ploidy, if all samples are no-called
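The fallback behaviour can be shown with a small stand-in. This is a sketch under assumptions, not the htsjdk implementation: MaxPloidySketch is a hypothetical class, each sample's ploidy is given directly as an Integer rather than read from a Genotype, and only the "no genotypes present" case from the description above is modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in (not htsjdk): max over per-sample ploidies with a default. */
final class MaxPloidySketch {
    static int maxPloidy(List<Integer> ploidies, int defaultPloidy) {
        int max = -1;                           // sentinel: no sample seen yet
        for (int p : ploidies) {
            max = Math.max(max, p);
        }
        return max == -1 ? defaultPloidy : max; // empty input falls back to the default
    }

    public static void main(String[] args) {
        System.out.println(maxPloidy(Arrays.asList(2, 1, 2), 2));             // diploid and haploid samples
        System.out.println(maxPloidy(Collections.<Integer>emptyList(), 2));   // no genotypes at all
    }
}
```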
-
get
public Genotype get(String sampleName)
Gets the Genotype associated with this sampleName, or null if none is found.
- Parameters:
sampleName
-
lastIndexOf
public int lastIndexOf(Object o)
- Specified by:
lastIndexOf
in interface List<Genotype>
-
listIterator
public ListIterator<Genotype> listIterator()
- Specified by:
listIterator
in interface List<Genotype>
-
listIterator
public ListIterator<Genotype> listIterator(int i)
- Specified by:
listIterator
in interface List<Genotype>
-
remove
public Genotype remove(int i)
Note that remove requires us to invalidate our sample -> index cache. The loop:
    GenotypesContext gc = ...
    for ( sample in samples )
        if ( gc.containsSample(sample) )
            gc.remove(sample)
is extremely inefficient, as each call to remove invalidates the cache and containsSample requires us to rebuild it, an O(n) operation. If you must remove many samples from the GC, use either removeAll or retainAll to avoid this O(n * m) operation.
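The difference between removing one sample at a time and removing in bulk can be made concrete by counting cache rebuilds. The following is a minimal sketch under assumptions, not htsjdk code: RemoveCostSketch, removeSample, removeAllSamples and the rebuilds counter are hypothetical, and sample-name strings stand in for Genotype objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in (not htsjdk): counts how often the name -> offset cache is rebuilt. */
final class RemoveCostSketch {
    private final List<String> samples;
    private Map<String, Integer> nameToOffset = null; // null means "invalidated"
    int rebuilds = 0;

    RemoveCostSketch(List<String> initial) {
        samples = new ArrayList<>(initial);
    }

    private void ensureMap() {
        if (nameToOffset == null) {                   // O(n) rebuild
            nameToOffset = new HashMap<>();
            for (int i = 0; i < samples.size(); i++) {
                nameToOffset.put(samples.get(i), i);
            }
            rebuilds++;
        }
    }

    boolean containsSample(String s) {
        ensureMap();
        return nameToOffset.containsKey(s);
    }

    boolean removeSample(String s) {
        boolean changed = samples.remove(s);
        if (changed) nameToOffset = null;             // offsets shifted, cache is stale
        return changed;
    }

    boolean removeAllSamples(Set<String> toRemove) {
        boolean changed = samples.removeAll(toRemove);
        if (changed) nameToOffset = null;             // invalidated once for the whole batch
        return changed;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("S1", "S2", "S3", "S4");
        List<String> doomed = Arrays.asList("S1", "S2", "S3");

        RemoveCostSketch slow = new RemoveCostSketch(all);
        for (String s : doomed) {
            if (slow.containsSample(s)) slow.removeSample(s); // rebuild, then invalidate, every pass
        }

        RemoveCostSketch fast = new RemoveCostSketch(all);
        fast.removeAllSamples(new HashSet<>(doomed));
        fast.containsSample("S4");                            // a single rebuild afterwards

        System.out.println("one-by-one rebuilds=" + slow.rebuilds + ", batch rebuilds=" + fast.rebuilds);
    }
}
```

With m removals from n samples, the one-by-one loop in this sketch rebuilds the map m times, which is where the O(n * m) cost comes from.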
-
remove
public boolean remove(Object o)
See remove(int) for an important warning.
-
removeAll
public boolean removeAll(Collection<?> objects)
-
retainAll
public boolean retainAll(Collection<?> objects)
-
replace
public Genotype replace(Genotype genotype)
Replaces the genotype in this context. For efficiency reasons, the genotype is not added if it is not already present; in that case the return value is null. This operation preserves the Sample -> Offset map cache but invalidates the sorted list of samples. Using replace within a loop containing any of the SampleNameInOrder operations requires an O(n log n) re-sort after each replace operation.
- Parameters:
genotype
- a non-null genotype to bind in this context
- Returns:
- null if genotype was not added, otherwise the previous genotype
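The replace-only contract (swap in place if the sample exists, otherwise do nothing and return null) can be sketched as follows. This is an illustration under assumptions, not the htsjdk implementation: ReplaceSketch and its nested Call type are hypothetical stand-ins for GenotypesContext and Genotype, and the invalidation of the sorted sample list is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in (not htsjdk): replace-in-place keyed by sample name. */
final class ReplaceSketch {
    /** Hypothetical stand-in for a genotype: a sample name plus an allele string. */
    static final class Call {
        final String sample;
        final String alleles;

        Call(String sample, String alleles) {
            this.sample = sample;
            this.alleles = alleles;
        }
    }

    private final List<Call> calls = new ArrayList<>();
    private final Map<String, Integer> nameToOffset = new HashMap<>();

    void add(Call c) {
        nameToOffset.put(c.sample, calls.size());
        calls.add(c);
    }

    /** Returns the previous call, or null if the sample is absent (nothing is added). */
    Call replace(Call c) {
        Integer offset = nameToOffset.get(c.sample);
        if (offset == null) {
            return null;
        }
        return calls.set(offset, c); // the offset is unchanged, so the map stays valid
    }

    int size() {
        return calls.size();
    }

    public static void main(String[] args) {
        ReplaceSketch ctx = new ReplaceSketch();
        ctx.add(new Call("A", "0/0"));
        Call previous = ctx.replace(new Call("A", "0/1"));
        Call missing = ctx.replace(new Call("B", "1/1"));
        System.out.println("previous=" + previous.alleles + " missing=" + missing + " size=" + ctx.size());
    }
}
```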
-
toArray
public Object[] toArray()
-
toArray
public <T> T[] toArray(T[] ts)
-
iterateInSampleNameOrder
public Iterable<Genotype> iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder)
Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder.
- Parameters:
sampleNamesInOrder
- an Iterable of String, containing exactly one entry for each Genotype sample name in this context
- Returns:
- an Iterable over the genotypes in this context.
-
iterateInSampleNameOrder
public Iterable<Genotype> iterateInSampleNameOrder()
Iterate over the Genotypes in this context in sample name order (A, B, C), regardless of the underlying order in the vector of genotypes.
- Returns:
- an Iterable over the genotypes in this context.
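Both overloads boil down to walking a list of sample names and looking each one up. The following is a minimal sketch under assumptions, not the htsjdk implementation: NameOrderSketch, inOrder and inSampleNameOrder are hypothetical names, allele strings stand in for Genotype objects, and the results are built eagerly as a List where the real methods return an Iterable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical stand-in (not htsjdk): iteration by caller-supplied or sorted sample names. */
final class NameOrderSketch {
    // sample name -> call, kept in insertion order (the "underlying order")
    private final Map<String, String> callsBySample = new LinkedHashMap<>();

    void add(String sample, String call) {
        callsBySample.put(sample, call);
    }

    /** Calls in exactly the order given by sampleNamesInOrder. */
    List<String> inOrder(Iterable<String> sampleNamesInOrder) {
        List<String> out = new ArrayList<>();
        for (String name : sampleNamesInOrder) {
            out.add(callsBySample.get(name));
        }
        return out;
    }

    /** Calls in natural (alphabetical) sample-name order. */
    List<String> inSampleNameOrder() {
        return inOrder(new TreeSet<>(callsBySample.keySet()));
    }

    public static void main(String[] args) {
        NameOrderSketch ctx = new NameOrderSketch();
        ctx.add("C", "0/1");
        ctx.add("A", "1/1");
        ctx.add("B", "0/0");
        System.out.println(ctx.inSampleNameOrder());                 // order A, B, C
        System.out.println(ctx.inOrder(Arrays.asList("B", "C", "A"))); // caller-chosen order
    }
}
```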
-
getSampleNames
public Set<String> getSampleNames()
- Returns:
- The set of sample names for all genotypes in this context, in arbitrary order
-
getSampleNamesOrderedByName
public List<String> getSampleNamesOrderedByName()
- Returns:
- The list of sample names for all genotypes in this context, in their natural ordering (A, B, C)
-
containsSample
public boolean containsSample(String sample)
-
containsSamples
public boolean containsSamples(Collection<String> samples)
-
subsetToSamples
public GenotypesContext subsetToSamples(Set<String> samples)
Return a freshly allocated subcontext of this context containing only the samples listed in samples. Note that samples can contain names not in this context; they will just be ignored.
- Parameters:
samples
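The subsetting rule (keep only listed samples, silently ignore unknown names, return a new collection) can be sketched like this. It is an illustration under assumptions, not the htsjdk implementation: SubsetSketch is a hypothetical class, sample-name strings stand in for Genotype objects, and preserving the original order in the result is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in (not htsjdk): subset a sample list by a set of wanted names. */
final class SubsetSketch {
    static List<String> subsetToSamples(List<String> samplesInContext, Set<String> wanted) {
        List<String> out = new ArrayList<>();   // freshly allocated; the input is untouched
        for (String s : samplesInContext) {
            if (wanted.contains(s)) {
                out.add(s);
            }
        }
        return out;                             // names in wanted but not in the context never appear
    }

    public static void main(String[] args) {
        List<String> context = Arrays.asList("A", "B", "C");
        Set<String> wanted = new HashSet<>(Arrays.asList("C", "A", "not-in-context"));
        System.out.println(subsetToSamples(context, wanted));
    }
}
```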
-
-