public final class VcfRecord extends java.lang.Object implements VcfEmission
Class VcfRecord represents a VCF record.
Instances of class VcfRecord are immutable.
Modifier and Type | Field and Description |
---|---|
static java.lang.String | GL_FORMAT The VCF FORMAT code for log-scaled genotype likelihood data: "GL". |
static java.lang.String | PL_FORMAT The VCF FORMAT code for phred-scaled genotype likelihood data: "PL". |
Modifier and Type | Method and Description |
---|---|
int | allele(int hap) Returns the allele on the specified haplotype or -1 if the allele is missing. |
int | allele1(int sample) Returns the first allele for the specified sample or -1 if the allele is missing. |
int | allele2(int sample) Returns the second allele for the specified sample or -1 if the allele is missing. |
int | alleleCount(int allele) Returns the number of haplotypes that carry the specified allele. |
java.lang.String | filter() Returns the FILTER field. |
java.lang.String | format() Returns the FORMAT field. |
java.lang.String[] | formatData(java.lang.String formatCode) Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample. |
int | formatIndex(java.lang.String formatCode) Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise. |
java.lang.String | formatSubfield(int subfieldIndex) Returns the specified FORMAT subfield. |
static VcfRecord | fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data. |
static VcfRecord | fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord) Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data. |
static VcfRecord | fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data. |
float | gl(int sample, int allele1, int allele2) Returns the probability of the observed data if the specified pair of ordered alleles is the true genotype in the specified sample. |
static int | gtIndex(int a1, int a2) Returns the VCF genotype index for the specified pair of alleles. |
int | hapIndex(int allele, int copy) Returns the index of the haplotype that carries the specified copy of the specified allele. |
boolean | hasFormat(java.lang.String formatCode) Returns true if the specified FORMAT subfield is present, and returns false otherwise. |
java.lang.String | info() Returns the INFO field. |
boolean | isPhased(int sample) Returns true if the genotype emission probabilities for the specified sample are determined by a phased, nonmissing genotype, and returns false otherwise. |
boolean | isRefData() Returns true if the genotype emission probabilities for each sample are determined by a phased called genotype that has no missing alleles, and returns false otherwise. |
int | majorAllele() Returns the index of the major allele. |
Marker | marker() Returns the marker. |
int | nAlleles() Returns the number of marker alleles. |
int | nFormatSubfields() Returns the number of FORMAT subfields. |
int | nHapPairs() Returns the number of haplotype pairs. |
int | nHaps() Returns the number of haplotypes. |
int | nSamples() Returns the number of samples. |
java.lang.String | qual() Returns the QUAL field. |
java.lang.String | sampleData(int sample) Returns the data for the specified sample. |
java.lang.String | sampleData(int sample, int subfieldIndex) Returns the data in the FORMAT subfield with the specified index for the specified sample. |
java.lang.String | sampleData(int sample, java.lang.String formatCode) Returns the data in the FORMAT subfield with the specified code for the specified sample. |
Samples | samples() Returns the list of samples. |
boolean | storesNonMajorIndices() Returns true if this instance stores the indices of haplotypes that carry non-major alleles, and returns false otherwise. |
java.lang.String | toString() Returns the VCF record. |
VcfHeader | vcfHeader() Returns the VCF meta-information lines and the VCF header line. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from implemented interfaces: toVcfRec
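public static final java.lang.String GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
public static final java.lang.String PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".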
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
Parameters:
a1 - the first allele
a2 - the second allele
Throws:
java.lang.IllegalArgumentException - if a1 < 0 || a2 < 0
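The genotype index can be illustrated with the ordering given in the VCF specification, where the unordered diploid genotype j/k with j <= k has index k*(k+1)/2 + j. The sketch below is a stand-alone illustration of that formula, not the VcfRecord source. The class name GtIndexSketch is hypothetical, and treating the two alleles as an unordered pair is an assumption that this page does not state.

```java
final class GtIndexSketch {

    /**
     * Genotype index from the VCF specification ordering: k*(k+1)/2 + j,
     * where j is the smaller allele index and k is the larger one.
     * Assumption: the pair is unordered, so (a1, a2) and (a2, a1) agree.
     */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele: " + a1 + ", " + a2);
        }
        int lo = Math.min(a1, a2);
        int hi = Math.max(a1, a2);
        return hi * (hi + 1) / 2 + lo;
    }

    public static void main(String[] args) {
        // Lists the genotypes of a marker with three alleles in index order.
        for (int a2 = 0; a2 < 3; ++a2) {
            for (int a1 = 0; a1 <= a2; ++a1) {
                System.out.println(a1 + "/" + a2 + " has index " + gtIndex(a1, a2));
            }
        }
    }
}
```

With this ordering, a marker with n alleles has n*(n+1)/2 unordered genotypes, which is also the number of comma-separated values expected in a diploid GL or PL subfield.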
public static VcfRecord fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
public static VcfRecord fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format field will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
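The normalization described for fromGL can be sketched as follows. This is an illustration only, not the VcfRecord implementation: the class GlNormalizeSketch and its methods are hypothetical, and the cutoff is passed in directly as threshold because this page does not say how lrThreshold is derived from maxLR. The sketch relies on the VCF definitions of GL (log10-scaled likelihood) and PL (phred-scaled likelihood, that is, -10 times the log10 likelihood).

```java
final class GlNormalizeSketch {

    /**
     * Converts log10-scaled GL values to likelihoods rescaled so that the
     * largest value is 1.0, then sets any rescaled likelihood that is below
     * the threshold to 0. The threshold parameter is an assumption.
     */
    static float[] fromGL(double[] gl, float threshold) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : gl) {
            max = Math.max(max, v);
        }
        float[] like = new float[gl.length];
        for (int j = 0; j < gl.length; ++j) {
            float scaled = (float) Math.pow(10.0, gl[j] - max);
            like[j] = (scaled < threshold) ? 0f : scaled;
        }
        return like;
    }

    /** Converts phred-scaled PL values to log10 scale and reuses fromGL. */
    static float[] fromPL(int[] pl, float threshold) {
        double[] gl = new double[pl.length];
        for (int j = 0; j < pl.length; ++j) {
            gl[j] = -pl[j] / 10.0;
        }
        return fromGL(gl, threshold);
    }

    public static void main(String[] args) {
        float[] like = fromGL(new double[] {-0.05, -1.05, -5.05}, 0.001f);
        System.out.println(java.util.Arrays.toString(like));
    }
}
```

Dividing by the largest likelihood before thresholding makes the cutoff a likelihood ratio relative to the best-supported genotype, which is one plausible reading of the maxLR parameter name.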
public static VcfRecord fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT, a GL or a PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
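The subfield precedence described for fromGTGL (GT when present and non-missing, otherwise GL, otherwise PL) can be sketched for a single sample as below. The class SubfieldChoiceSketch and its methods are hypothetical and are not part of this API. Treating a genotype with any missing allele as missing is an assumption, since this page does not define "non-missing" for partially missing genotypes.

```java
import java.util.Arrays;
import java.util.List;

final class SubfieldChoiceSketch {

    /**
     * Returns the FORMAT code ("GT", "GL", or "PL") that would supply the
     * genotype likelihoods for one sample, following the documented precedence.
     */
    static String chooseSource(String format, String sampleData) {
        List<String> codes = Arrays.asList(format.split(":"));
        String[] values = sampleData.split(":");
        int gt = codes.indexOf("GT");
        if (gt >= 0 && gt < values.length && !isMissingGT(values[gt])) {
            return "GT";
        }
        if (codes.contains("GL")) {
            return "GL";    // GL takes precedence over PL when both are present
        }
        if (codes.contains("PL")) {
            return "PL";
        }
        throw new IllegalArgumentException("no GT, GL, or PL subfield: " + format);
    }

    /** Assumption: a genotype is missing if any of its alleles is ".". */
    static boolean isMissingGT(String gt) {
        return gt.indexOf('.') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(chooseSource("GT:GL", "0/1:-0.1,-1,-5"));
        System.out.println(chooseSource("GT:GL:PL", "./.:-0.1,-1,-5:1,10,50"));
    }
}
```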
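public java.lang.String qual()
Returns the QUAL field.
public java.lang.String filter()
Returns the FILTER field.
public java.lang.String info()
Returns the INFO field.
public java.lang.String format()
Returns the FORMAT field.
public int nFormatSubfields()
Returns the number of FORMAT subfields.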
public java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
Parameters:
subfieldIndex - a FORMAT subfield index
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
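public boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
Parameters:
formatCode - a FORMAT subfield code
Returns:
true if the specified FORMAT subfield is present
public int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
Parameters:
formatCode - the format subfield code
Returns:
the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
public java.lang.String sampleData(int sample)
Returns the data for the specified sample.
Parameters:
sample - a sample index
Throws:
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()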
public java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the data in the FORMAT subfield with the specified code for the specified sample.
Parameters:
sample - a sample index
formatCode - a FORMAT subfield code
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
public java.lang.String sampleData(int sample, int subfieldIndex)
Returns the data in the FORMAT subfield with the specified index for the specified sample.
Parameters:
sample - a sample index
subfieldIndex - a FORMAT subfield index
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
public java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
Parameters:
formatCode - a format subfield code
Returns:
an array of length this.nSamples() containing the specified FORMAT subfield data for each sample
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
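The relationship between the FORMAT accessors above can be sketched with a small stand-alone class that splits the colon-delimited FORMAT field and the per-sample columns. FormatFieldSketch is a hypothetical class written for illustration and says nothing about how VcfRecord stores its data. It assumes that every sample column lists a value for every FORMAT subfield, although VCF allows trailing subfields to be omitted.

```java
final class FormatFieldSketch {

    private final String[] codes;   // FORMAT subfield codes, for example {"GT", "GL"}
    private final String[][] data;  // data[sample][subfieldIndex]

    FormatFieldSketch(String format, String... sampleColumns) {
        this.codes = format.split(":");
        this.data = new String[sampleColumns.length][];
        for (int s = 0; s < sampleColumns.length; ++s) {
            data[s] = sampleColumns[s].split(":");
        }
    }

    int nSamples() {
        return data.length;
    }

    int nFormatSubfields() {
        return codes.length;
    }

    /** Returns the index of the subfield code, or -1 if it is not defined. */
    int formatIndex(String formatCode) {
        for (int j = 0; j < codes.length; ++j) {
            if (codes[j].equals(formatCode)) {
                return j;
            }
        }
        return -1;
    }

    boolean hasFormat(String formatCode) {
        return formatIndex(formatCode) >= 0;
    }

    String sampleData(int sample, String formatCode) {
        int index = formatIndex(formatCode);
        if (index < 0) {
            throw new IllegalArgumentException("missing FORMAT subfield: " + formatCode);
        }
        return data[sample][index];  // out-of-range sample throws IndexOutOfBoundsException
    }

    /** The k-th element is the subfield data for the k-th sample. */
    String[] formatData(String formatCode) {
        String[] result = new String[data.length];
        for (int s = 0; s < data.length; ++s) {
            result[s] = sampleData(s, formatCode);
        }
        return result;
    }

    public static void main(String[] args) {
        FormatFieldSketch rec = new FormatFieldSketch("GT:GL", "0|1:-0.1,-1,-5", "1|1:-5,-1,-0.1");
        System.out.println(String.join(" ", rec.formatData("GT")));
    }
}
```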
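public Samples samples()
Description copied from interface: VcfEmission
Returns the list of samples.
Specified by:
samples in interface VcfEmission
public int nSamples()
Description copied from interface: VcfEmission
Returns the number of samples.
Specified by:
nSamples in interface VcfEmission
public VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
public Marker marker()
Description copied from interface: HapsMarker
Returns the marker.
Specified by:
marker in interface HapsMarker
marker in interface MarkerContainer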
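public int allele1(int sample)
Description copied from interface: VcfEmission
Returns the first allele for the specified sample or -1 if the allele is missing.
Specified by:
allele1 in interface HapsMarker
allele1 in interface VcfEmission
Parameters:
sample - the sample index
public int allele2(int sample)
Description copied from interface: VcfEmission
Returns the second allele for the specified sample or -1 if the allele is missing.
Specified by:
allele2 in interface HapsMarker
allele2 in interface VcfEmission
Parameters:
sample - the sample index
public boolean isPhased(int sample)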
Description copied from interface: VcfEmission
Returns true if the genotype emission probabilities for the specified sample are determined by a phased, nonmissing genotype, and returns false otherwise.
Specified by:
isPhased in interface VcfEmission
Parameters:
sample - the sample index
Returns:
true if the genotype emission probabilities for the specified sample are determined by a phased, nonmissing genotype
public boolean isRefData()
Description copied from interface: VcfEmission
Returns true if the genotype emission probabilities for each sample are determined by a phased called genotype that has no missing alleles, and returns false otherwise.
Specified by:
isRefData in interface VcfEmission
Returns:
true if the genotype emission probabilities for each sample are determined by a phased called genotype that has no missing alleles
public float gl(int sample, int allele1, int allele2)
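Description copied from interface: VcfEmission
Returns the probability of the observed data if the specified pair of ordered alleles is the true genotype in the specified sample.
Specified by:
gl in interface VcfEmission
Parameters:
sample - the sample index
allele1 - the first allele index
allele2 - the second allele index
public int allele(int hap)
Description copied from interface: VcfEmission
Returns the allele on the specified haplotype or -1 if the allele is missing.
Specified by:
allele in interface HapsMarker
allele in interface VcfEmission
Parameters:
hap - the haplotype index
public int nAlleles()
Description copied from interface: VcfEmission
Returns the number of marker alleles.
Specified by:
nAlleles in interface VcfEmission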
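The conventions shared by allele1, allele2, isPhased, and isRefData (a missing allele is reported as -1, and a genotype counts as phased only when it has no missing allele) can be sketched by parsing a diploid GT string. GtCallSketch is a hypothetical class for illustration. It relies on the VCF convention that "|" separates phased alleles, "/" separates unphased alleles, and "." marks a missing allele. Haploid genotypes are not handled.

```java
final class GtCallSketch {

    final int allele1;     // first allele index, or -1 if missing
    final int allele2;     // second allele index, or -1 if missing
    final boolean phased;  // true only for a '|' separator with no missing allele

    GtCallSketch(String gt) {
        int sep = Math.max(gt.indexOf('|'), gt.indexOf('/'));
        if (sep < 0) {
            throw new IllegalArgumentException("not a diploid genotype: " + gt);
        }
        this.allele1 = parseAllele(gt.substring(0, sep));
        this.allele2 = parseAllele(gt.substring(sep + 1));
        this.phased = gt.charAt(sep) == '|' && allele1 >= 0 && allele2 >= 0;
    }

    private static int parseAllele(String s) {
        return s.equals(".") ? -1 : Integer.parseInt(s);
    }

    /** Analogue of isRefData: every sample has a phased genotype with no missing alleles. */
    static boolean isRefData(GtCallSketch[] calls) {
        for (GtCallSketch call : calls) {
            if (!call.phased) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        GtCallSketch call = new GtCallSketch(".|1");
        System.out.println(call.allele1 + " " + call.allele2 + " " + call.phased);
    }
}
```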
public boolean storesNonMajorIndices()
Description copied from interface: VcfEmission
Returns true if this instance stores the indices of haplotypes that carry non-major alleles, and returns false otherwise.
Specified by:
storesNonMajorIndices in interface VcfEmission
Returns:
true if this instance stores the indices of haplotypes that carry non-major alleles, and false otherwise
public int majorAllele()
Description copied from interface: VcfEmission
Returns the index of the major allele.
Specified by:
majorAllele in interface VcfEmission
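public int alleleCount(int allele)
Description copied from interface: VcfEmission
Returns the number of haplotypes that carry the specified allele.
Specified by:
alleleCount in interface VcfEmission
Parameters:
allele - an allele index
public int hapIndex(int allele, int copy)
Description copied from interface: VcfEmission
Returns the index of the haplotype that carries the specified copy of the specified allele.
Specified by:
hapIndex in interface VcfEmission
Parameters:
allele - an allele index
copy - a copy index
public int nHaps()
Description copied from interface: HapsMarker
Returns the number of haplotypes, which is equal to 2*this.nHapPairs().
Specified by:
nHaps in interface HapsMarker
public int nHapPairs()
Description copied from interface: HapsMarker
Returns the number of haplotype pairs, which is equal to this.nHaps()/2.
Specified by:
nHapPairs in interface HapsMarker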
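The haplotype-level accessors above (allele, alleleCount, majorAllele, hapIndex, nHaps, nHapPairs) can be related to each other with a small array-backed sketch. HapAllelesSketch is hypothetical and is not how VcfRecord is implemented. It assumes that sample s owns haplotypes 2*s and 2*s + 1, that no allele is missing, and that ties for the major allele go to the lowest allele index. None of these details is stated on this page.

```java
final class HapAllelesSketch {

    private final int nAlleles;
    private final int[] hapToAllele;  // allele index carried by each haplotype

    HapAllelesSketch(int nAlleles, int... hapToAllele) {
        if (nAlleles < 1 || hapToAllele.length % 2 != 0) {
            throw new IllegalArgumentException("need at least one allele and an even number of haplotypes");
        }
        for (int a : hapToAllele) {
            if (a < 0 || a >= nAlleles) {
                throw new IllegalArgumentException("allele out of range: " + a);
            }
        }
        this.nAlleles = nAlleles;
        this.hapToAllele = hapToAllele.clone();
    }

    int nHaps() { return hapToAllele.length; }

    int nHapPairs() { return hapToAllele.length / 2; }

    int allele(int hap) { return hapToAllele[hap]; }

    // Assumption: haplotypes 2*sample and 2*sample + 1 belong to the sample.
    int allele1(int sample) { return hapToAllele[2 * sample]; }

    int allele2(int sample) { return hapToAllele[2 * sample + 1]; }

    /** Number of haplotypes that carry the allele. */
    int alleleCount(int allele) {
        int count = 0;
        for (int a : hapToAllele) {
            if (a == allele) {
                ++count;
            }
        }
        return count;
    }

    /** Allele with the largest count; ties go to the lowest index (assumption). */
    int majorAllele() {
        int best = 0;
        for (int a = 1; a < nAlleles; ++a) {
            if (alleleCount(a) > alleleCount(best)) {
                best = a;
            }
        }
        return best;
    }

    /** Index of the haplotype carrying the copy-th occurrence (0-based) of the allele. */
    int hapIndex(int allele, int copy) {
        int seen = 0;
        for (int h = 0; h < hapToAllele.length; ++h) {
            if (hapToAllele[h] == allele) {
                if (seen == copy) {
                    return h;
                }
                ++seen;
            }
        }
        throw new IndexOutOfBoundsException("copy: " + copy);
    }

    public static void main(String[] args) {
        HapAllelesSketch haps = new HapAllelesSketch(2, 0, 1, 0, 0, 1, 0);
        for (int copy = 0; copy < haps.alleleCount(1); ++copy) {
            System.out.println("allele 1, copy " + copy + ": haplotype " + haps.hapIndex(1, copy));
        }
    }
}
```

Iterating copy from 0 to alleleCount(allele) - 1 visits every haplotype that carries the allele, which is the access pattern the alleleCount and hapIndex pair appears designed for.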
public java.lang.String toString()
Returns the VCF record.
Overrides:
toString in class java.lang.Object