Class GenotypesContext

    • Field Detail

      • NO_GENOTYPES

        public static final GenotypesContext NO_GENOTYPES
        Static constant value for an empty GenotypesContext. Useful because so many VariantContexts have no genotypes.
      • sampleNamesInOrder

        protected List<String> sampleNamesInOrder
        A list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
      • sampleNameToOffset

        protected Map<String,Integer> sampleNameToOffset
        A map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes.
      • notToBeDirectlyAccessedGenotypes

        protected ArrayList<Genotype> notToBeDirectlyAccessedGenotypes
        An ArrayList of genotypes contained in this context. WARNING: TO ENABLE THE LAZY VERSION OF THIS CLASS, NO METHODS SHOULD DIRECTLY ACCESS THIS VARIABLE. USE getGenotypes() INSTEAD.
    • Constructor Detail

      • GenotypesContext

        protected GenotypesContext()
        Create an empty GenotypesContext
      • GenotypesContext

        protected GenotypesContext(int n)
        Create an empty GenotypesContext, with initial capacity for n elements
      • GenotypesContext

        protected GenotypesContext(ArrayList<Genotype> genotypes)
        Create a GenotypesContext containing genotypes
      • GenotypesContext

        protected GenotypesContext(ArrayList<Genotype> genotypes,
                                   Map<String,Integer> sampleNameToOffset,
                                   List<String> sampleNamesInOrder)
        Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names
        Parameters:
        genotypes - our genotypes in arbitrary order
        sampleNameToOffset - map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
        sampleNamesInOrder - a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
    • Method Detail

      • create

        public static final GenotypesContext create()
        Basic creation routine
        Returns:
        an empty, mutable GenotypesContext
      • create

        public static final GenotypesContext create(int nGenotypes)
        Basic creation routine
        Returns:
        an empty, mutable GenotypesContext with initial capacity for nGenotypes
      • create

        public static final GenotypesContext create(ArrayList<Genotype> genotypes,
                                                    Map<String,Integer> sampleNameToOffset,
                                                    List<String> sampleNamesInOrder)
        Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names
        Parameters:
        genotypes - our genotypes in arbitrary order
        sampleNameToOffset - map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
        sampleNamesInOrder - a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
        Returns:
        a mutable GenotypesContext containing genotypes with already present lookup data
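        The two lookup structures passed to this overload can be derived from the genotype order. The following is a minimal sketch that uses plain sample-name strings as stand-ins for Genotype objects; the class and method names are hypothetical and not part of htsjdk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not htsjdk code: builds the two lookup structures
// from sample names listed in the same (arbitrary) order as the genotypes.
final class LookupTablesSketch {
    /** Maps each sample name to its offset in the input list. */
    static Map<String, Integer> buildSampleNameToOffset(List<String> sampleNames) {
        Map<String, Integer> offsets = new HashMap<>(sampleNames.size());
        for (int i = 0; i < sampleNames.size(); i++) {
            offsets.put(sampleNames.get(i), i);
        }
        return offsets;
    }

    /** Returns an alphabetically sorted copy; the input list is not modified. */
    static List<String> buildSampleNamesInOrder(List<String> sampleNames) {
        List<String> sorted = new ArrayList<>(sampleNames);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("NA3", "NA1", "NA2");
        System.out.println(buildSampleNameToOffset(names));
        System.out.println(buildSampleNamesInOrder(names));
    }
}
```

        Supplying both structures up front presumably saves the context from computing them later, at the cost of the caller keeping them consistent with genotypes.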
      • create

        public static final GenotypesContext create(ArrayList<Genotype> genotypes)
        Create a fully resolved GenotypesContext containing genotypes
        Parameters:
        genotypes - our genotypes in arbitrary order
        Returns:
        a mutable GenotypesContext containing genotypes
      • create

        public static final GenotypesContext create(Genotype... genotypes)
        Create a fully resolved GenotypesContext containing genotypes
        Parameters:
        genotypes - our genotypes in arbitrary order
        Returns:
        a mutable GenotypesContext containing genotypes
      • copy

        public static final GenotypesContext copy(GenotypesContext toCopy)
        Create a freshly allocated GenotypesContext containing the genotypes in toCopy
        Parameters:
        toCopy - the GenotypesContext to copy
        Returns:
        a mutable GenotypesContext containing genotypes
      • copy

        public static final GenotypesContext copy(Collection<Genotype> toCopy)
        Create a GenotypesContext containing the genotypes in toCopy, in its iteration order
        Parameters:
        toCopy - the collection of genotypes
        Returns:
        a mutable GenotypesContext containing genotypes
      • isMutable

        public boolean isMutable()
      • invalidateSampleNameMap

        protected void invalidateSampleNameMap()
      • invalidateSampleOrdering

        protected void invalidateSampleOrdering()
      • ensureSampleOrdering

        protected void ensureSampleOrdering()
      • ensureSampleNameMap

        protected void ensureSampleNameMap()
      • isLazyWithData

        public boolean isLazyWithData()
      • add

        public boolean add(Genotype genotype)
                    throws UnsupportedOperationException
        Adds a single genotype to this context. There are many constraints on this input, and important impacts on the performance of other functions provided by this context. First, the sample name of genotype must be unique within this context. This is not enforced in the code itself, though you will violate the contract of this context if you add duplicate samples and are running with CoFoJa enabled. Second, adding genotype also updates the sample name -> index map, so add() followed by containsSample and related functions is an efficient series of operations. Third, adding the genotype invalidates the sorted list of sample names, so add() followed by any of the SampleNamesInOrder operations is inefficient, as each SampleNamesInOrder operation must rebuild the sorted list of sample names at an O(n log n) cost.
        Specified by:
        add in interface Collection<Genotype>
        Specified by:
        add in interface List<Genotype>
        Parameters:
        genotype -
        Returns:
        Throws:
        UnsupportedOperationException - if the context has been made immutable
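        The caching behavior described for add can be illustrated with a small stand-in class. This is a sketch only, assuming sample names as plain strings; AddCacheSketch and its sortCount field are hypothetical and do not reflect the actual htsjdk implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the caching behavior described above; not htsjdk code.
final class AddCacheSketch {
    private final List<String> samples = new ArrayList<>();
    private final Map<String, Integer> offsets = new HashMap<>();
    private List<String> sortedNames = null; // null means "invalidated"
    int sortCount = 0;                       // number of times the sorted list was rebuilt

    boolean add(String sample) {
        offsets.put(sample, samples.size()); // map is updated in place
        samples.add(sample);
        sortedNames = null;                  // sorted list must be rebuilt on next use
        return true;
    }

    boolean containsSample(String sample) {
        return offsets.containsKey(sample);  // no rebuild needed after add
    }

    List<String> getSampleNamesOrderedByName() {
        if (sortedNames == null) {
            sortedNames = new ArrayList<>(samples);
            Collections.sort(sortedNames);   // O(n log n) rebuild
            sortCount++;
        }
        return sortedNames;
    }

    public static void main(String[] args) {
        String[] names = {"NA3", "NA1", "NA2"};
        AddCacheSketch interleaved = new AddCacheSketch();
        for (String s : names) {
            interleaved.add(s);
            interleaved.getSampleNamesOrderedByName(); // re-sorts after every add
        }
        AddCacheSketch batched = new AddCacheSketch();
        for (String s : names) {
            batched.add(s);
        }
        batched.getSampleNamesOrderedByName();         // sorts once at the end
        System.out.println(interleaved.sortCount + " sorts vs " + batched.sortCount);
    }
}
```

        In this model, adding all genotypes first and requesting the sorted names afterwards triggers a single sort, while interleaving the two triggers one sort per add.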
      • getMaxPloidy

        public int getMaxPloidy(int defaultPloidy)
        Returns the maximum ploidy among all samples, or defaultPloidy if no genotypes are present.
        Parameters:
        defaultPloidy - the default ploidy, if all samples are no-called
        Returns:
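        The fallback rule can be sketched over a plain collection of per-sample ploidy values. The helper below is hypothetical and only mirrors the behavior stated in the description; it is not the htsjdk implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical helper, not htsjdk code.
final class MaxPloidySketch {
    /** Returns the largest ploidy in the collection, or defaultPloidy when it is empty. */
    static int maxPloidy(Collection<Integer> ploidies, int defaultPloidy) {
        int max = -1; // sentinel: no genotype seen yet
        for (int ploidy : ploidies) {
            max = Math.max(max, ploidy);
        }
        return max == -1 ? defaultPloidy : max;
    }

    public static void main(String[] args) {
        System.out.println(maxPloidy(Arrays.asList(2, 1, 3), 2));
        System.out.println(maxPloidy(Collections.<Integer>emptyList(), 2));
    }
}
```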
      • get

        public Genotype get(String sampleName)
        Gets the genotype associated with sampleName, or null if none is found
        Parameters:
        sampleName -
        Returns:
      • remove

        public Genotype remove(int i)
        Note that remove requires us to invalidate our sample -> index cache. The loop:

            GenotypesContext gc = ...
            for ( sample in samples )
              if ( gc.containsSample(sample) )
                gc.remove(sample)

        is extremely inefficient, as each call to remove invalidates the cache and containsSample requires us to rebuild it, an O(n) operation. If you must remove many samples from the GenotypesContext, use either removeAll or retainAll to avoid this O(n * m) operation.
        Specified by:
        remove in interface List<Genotype>
        Parameters:
        i -
        Returns:
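        To make the cost difference concrete, the sketch below counts cache rebuilds for the one-by-one loop versus a single bulk removal. It is a simplified model with sample names as strings; RemoveCostSketch and its rebuilds counter are hypothetical and not part of htsjdk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical model of a lazily rebuilt sample -> index cache; not htsjdk code.
final class RemoveCostSketch {
    private final List<String> samples;
    private Map<String, Integer> offsets = null; // null means "invalidated"
    int rebuilds = 0;                            // number of O(n) cache rebuilds

    RemoveCostSketch(Collection<String> names) {
        samples = new ArrayList<>(names);
    }

    private void ensureMap() {
        if (offsets == null) {
            offsets = new HashMap<>();
            for (int i = 0; i < samples.size(); i++) {
                offsets.put(samples.get(i), i);
            }
            rebuilds++;
        }
    }

    boolean containsSample(String name) {
        ensureMap();
        return offsets.containsKey(name);
    }

    void remove(String name) {
        samples.remove(name);
        offsets = null; // every removal invalidates the cache
    }

    void removeAll(Collection<String> names) {
        samples.removeAll(new HashSet<>(names));
        offsets = null; // invalidated once for the whole batch
    }

    int size() {
        return samples.size();
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("A", "B", "C", "D");
        List<String> toRemove = Arrays.asList("A", "C", "X");

        RemoveCostSketch oneByOne = new RemoveCostSketch(all);
        for (String s : toRemove) {
            if (oneByOne.containsSample(s)) { // rebuilds the cache each time
                oneByOne.remove(s);
            }
        }

        RemoveCostSketch bulk = new RemoveCostSketch(all);
        bulk.removeAll(toRemove);
        bulk.containsSample("B");             // a single rebuild afterwards

        System.out.println(oneByOne.rebuilds + " rebuilds vs " + bulk.rebuilds);
    }
}
```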
      • replace

        public Genotype replace(Genotype genotype)
        Replaces the genotype in this context. Note that, for efficiency reasons, the genotype is not added if it is not already present; a null return value indicates that this happened. This operation preserves the Sample -> Offset map cache but invalidates the sorted list of samples. Using replace within a loop containing any of the SampleNamesInOrder operations requires an O(n log n) re-sort after each replace operation.
        Parameters:
        genotype - a non null genotype to bind in this context
        Returns:
        null if genotype was not added, otherwise returns the previous genotype
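        The replace-only-if-present contract can be sketched with a position list and an offset map. This is an illustrative model in which genotype calls are plain strings; ReplaceSketch is a hypothetical class, not htsjdk code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of replace semantics; not htsjdk code.
final class ReplaceSketch {
    private final List<String> calls = new ArrayList<>();          // calls by position
    private final Map<String, Integer> offsets = new HashMap<>();  // sample -> position

    void add(String sample, String call) {
        offsets.put(sample, calls.size());
        calls.add(call);
    }

    /** Replaces the call of an existing sample and returns the previous call,
     *  or returns null without adding anything when the sample is unknown. */
    String replace(String sample, String call) {
        Integer offset = offsets.get(sample);
        if (offset == null) {
            return null;
        }
        return calls.set(offset, call); // offsets stay valid: positions do not change
    }

    int size() {
        return calls.size();
    }

    public static void main(String[] args) {
        ReplaceSketch gc = new ReplaceSketch();
        gc.add("NA1", "0/1");
        System.out.println(gc.replace("NA1", "1/1"));
        System.out.println(gc.replace("NA9", "0/0"));
    }
}
```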
      • iterateInSampleNameOrder

        public Iterable<Genotype> iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder)
        Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder
        Parameters:
        sampleNamesInOrder - an Iterable of String, containing exactly one entry for each Genotype sample name in this context
        Returns:
        an Iterable over the genotypes in this context.
      • iterateInSampleNameOrder

        public Iterable<Genotype> iterateInSampleNameOrder()
        Iterate over the Genotypes in this context in their sample name order (A, B, C) regardless of the underlying order in the vector of genotypes
        Returns:
        an Iterable over the genotypes in this context.
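        Iterating in sample name order while the genotypes are stored in arbitrary order amounts to walking the sorted names and resolving each through the offset map. A minimal sketch, assuming calls are plain strings stored in parallel with the sample names; the class and method are hypothetical and not htsjdk code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not htsjdk code.
final class SampleOrderIterationSketch {
    /** Returns the calls reordered by alphabetical sample name.
     *  samples.get(i) is the sample that owns calls.get(i). */
    static List<String> inSampleNameOrder(List<String> samples, List<String> calls) {
        Map<String, Integer> offsets = new HashMap<>();
        for (int i = 0; i < samples.size(); i++) {
            offsets.put(samples.get(i), i);
        }
        List<String> sortedNames = new ArrayList<>(samples);
        Collections.sort(sortedNames);

        List<String> ordered = new ArrayList<>(calls.size());
        for (String name : sortedNames) {
            ordered.add(calls.get(offsets.get(name))); // resolve name -> stored position
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("C", "A", "B");
        List<String> calls = Arrays.asList("1/1", "0/0", "0/1");
        System.out.println(inSampleNameOrder(samples, calls));
    }
}
```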
      • getSampleNames

        public Set<String> getSampleNames()
        Returns:
        The set of sample names for all genotypes in this context, in arbitrary order
      • getSampleNamesOrderedByName

        public List<String> getSampleNamesOrderedByName()
        Returns:
        The list of sample names for all genotypes in this context, in their natural ordering (A, B, C)
      • containsSample

        public boolean containsSample(String sample)
      • containsSamples

        public boolean containsSamples(Collection<String> samples)
      • subsetToSamples

        public GenotypesContext subsetToSamples(Set<String> samples)
        Return a freshly allocated subcontext of this context containing only the samples listed in samples. Note that samples can contain names not in this context; they will just be ignored.
        Parameters:
        samples -
        Returns:
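        The subsetting rule (keep listed samples, silently ignore unknown names, return a new object) can be sketched over a map from sample name to call. The helper below is a hypothetical illustration with string calls and is not the htsjdk implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not htsjdk code.
final class SubsetSketch {
    /** Returns a new map holding only the entries whose sample name is in samples.
     *  Names in samples that are absent from callsBySample are ignored. */
    static Map<String, String> subsetToSamples(Map<String, String> callsBySample,
                                               Set<String> samples) {
        Map<String, String> subset = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : callsBySample.entrySet()) {
            if (samples.contains(entry.getKey())) {
                subset.put(entry.getKey(), entry.getValue());
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        Map<String, String> calls = new LinkedHashMap<>();
        calls.put("NA1", "0/1");
        calls.put("NA2", "1/1");
        calls.put("NA3", "0/0");
        Set<String> wanted = new HashSet<>(Arrays.asList("NA1", "NA3", "NA9"));
        System.out.println(subsetToSamples(calls, wanted));
    }
}
```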