TASSELpy.net.maizegenetics.analysis.popgen package

Submodules

TASSELpy.net.maizegenetics.analysis.popgen.LDResult module

class TASSELpy.net.maizegenetics.analysis.popgen.LDResult.LDResult(*args, **kwargs)

Bases: TASSELpy.java.lang.Object.Object

Methods

Builder
castTo(pyType) Casts this object to another java/python type
clone(*args) Creates and returns a copy of this object
dPrime(*args) Gets the D’ value
equals(*args) Indicates whether some other object is “equal to” this one
getArray(size) Gets an empty wrapped java array that can accept the type of the wrapped
getClass(*args) Returns the runtime class of this Object.
getDblArray(rows[, cols]) Gets an empty wrapped java array that can accept the type of other wrapped java arrays: i.e.
hashCode(*args) Returns a hash code value for the object
n(*args) Gets the number of individuals used to calculate LD
p(*args) Gets the p-value for r2
r2(*args) Gets the r2 value
site1(*args) Gets the first site
site2(*args) Gets the second site
toString(*args) Returns a string representation of the object
wrap_existing_array(arr_instance) Wraps a java array of this class’s type
class Builder(*args, **kwargs)

Bases: TASSELpy.java.lang.Object.Object

Methods

build
castTo
clone
dprime
equals
getArray
getClass
getDblArray
hashCode
n
p
r2
toString
wrap_existing_array
__init__(*args, **kwargs)

Constructs an LDResult builder

Signature:

Builder (int site1, int site2)

Parameters:
  • site1 (int) – Index of the first site
  • site2 (int) – Index of the second site
build(*args)

Builds the LDResult

Signature: build ()
Returns: The LDResult
Return type: LDResult
dprime(*args)

Adds the dprime value

Signature: dprime (float value)
Parameters: value (float) – The dprime value
Returns: The builder with the dprime value included
Return type: Builder
n(*args)

Adds the n value

Signature: n (int value)
Parameters: value (int) – The n value
Returns: The builder with the n value included
Return type: Builder
p(*args)

Adds the p value

Signature: p (float value)
Parameters: value (float) – The p value
Returns: The builder with the p value included
Return type: Builder
r2(*args)

Adds the r2 value

Signature: r2 (float value)
Parameters: value (float) – The r2 value
Returns: The builder with the r2 value included
Return type: Builder
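
The Builder methods documented above each return the builder, so calls can be chained and finished with build(). The pure-Python sketch below only mirrors the documented method names to show the call shape. SketchLDResult and SketchBuilder are hypothetical stand-ins, not TASSELpy classes; they do not start a JVM, and the NaN/0 defaults for unset fields are an assumption of this sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for LDResult: holds the six documented fields.
SketchLDResult = namedtuple("SketchLDResult", "site1 site2 r2 dprime p n")


class SketchBuilder(object):
    """Hypothetical stand-in that mirrors the documented Builder methods."""

    def __init__(self, site1, site2):
        # Defaults for unset values are an assumption of this sketch.
        self._fields = {"site1": site1, "site2": site2,
                        "r2": float("nan"), "dprime": float("nan"),
                        "p": float("nan"), "n": 0}

    def _set(self, name, value):
        self._fields[name] = value
        return self  # returning the builder is what allows chaining

    def r2(self, value):
        return self._set("r2", float(value))

    def dprime(self, value):
        return self._set("dprime", float(value))

    def p(self, value):
        return self._set("p", float(value))

    def n(self, value):
        return self._set("n", int(value))

    def build(self):
        return SketchLDResult(**self._fields)


# Chain the setters, then build the immutable result.
result = SketchBuilder(3, 7).r2(0.42).dprime(0.9).p(0.01).n(120).build()
```

With the real classes the chain would presumably look the same, but the objects returned would be wrapped Java instances rather than a namedtuple.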
LDResult.__init__(*args, **kwargs)

Constructs an LDResult

Signature: LDResult (int site1, int site2, float r2, float dprime, float p, int n)
LDResult.dPrime(*args)

Gets the D’ value

Signature: dPrime ()
Returns: The D’ value
Return type: float
LDResult.n(*args)

Gets the number of individuals used to calculate LD

Signature: n ()
Returns: The number of individuals used to calculate LD
Return type: int
LDResult.p(*args)

Gets the p-value for r2

Signature: p ()
Returns: The p value
Return type: float
LDResult.r2(*args)

Gets the r2 value

Signature: r2 ()
Returns: The r2 value
Return type: float
LDResult.site1(*args)

Gets the first site

Signature: site1 ()
Returns: Index of the first site
Return type: int
LDResult.site2(*args)

Gets the second site

Signature: site2 ()
Returns: Index of the second site
Return type: int

TASSELpy.net.maizegenetics.analysis.popgen.LinkageDisequilibrium module

class TASSELpy.net.maizegenetics.analysis.popgen.LinkageDisequilibrium.LinkageDisequilibrium(*args, **kwargs)

Bases: TASSELpy.java.lang.Thread.Thread, TASSELpy.net.maizegenetics.util.TableReport.TableReport

This class calculates D’ and r^2 estimates of linkage disequilibrium. It also calculates the significance of the LD using either the Fisher Exact test or the multinomial permutation test. The class can work with either normal alignments or annotated alignments. The alignments should be stripped of invariant sites.

testDesign sets the matrix design for the LD calculation: all by all, sliding window, site by all, or site list.
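
To make the four designs concrete, the sketch below enumerates the site pairs each one might test. The exact pairing rules are not spelled out here, so the semantics in the code are assumptions: All tests every unordered pair, SlidingWindow pairs each site with the next window_size sites, SiteByAll pairs one test site with every other site, and SiteList does the same for each listed site. site_pairs is a hypothetical helper, not part of TASSELpy.

```python
def site_pairs(design, num_sites, window_size=0, test_site=0, sites_list=()):
    """Return the (site1, site2) pairs a design would test (assumed semantics)."""
    if design == "All":
        # Every unordered pair of distinct sites.
        return [(i, j) for i in range(num_sites)
                for j in range(i + 1, num_sites)]
    if design == "SlidingWindow":
        # Each site against the following window_size sites.
        return [(i, j) for i in range(num_sites)
                for j in range(i + 1, min(i + window_size, num_sites - 1) + 1)]
    if design == "SiteByAll":
        # One chosen site against every other site.
        return [(test_site, j) for j in range(num_sites) if j != test_site]
    if design == "SiteList":
        # Each listed site against every other site.
        return [(s, j) for s in sites_list
                for j in range(num_sites) if j != s]
    raise ValueError("unknown design: %r" % (design,))


all_pairs = site_pairs("All", 4)
window_pairs = site_pairs("SlidingWindow", 4, window_size=1)
```

The number of pairs grows quadratically for All but only linearly for SlidingWindow, which is the usual reason to pick a window on large alignments.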

There are multiple approaches for dealing with heterozygous sites, and HetTreatment sets how they are treated. Haplotype assumes fully phased heterozygous sites (any hets are double counted); this is the fastest approach when the data are fully phased. Homozygous converts all hets to missing. Genotype does a 3x3 genotype analysis (to be implemented).
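
As an illustration of the Homozygous treatment, the sketch below replaces heterozygous calls with a missing marker before any counting takes place. The two-character string encoding ("AA", "AG") and the helper name hets_to_missing are assumptions made for this example; TASSEL stores genotypes differently.

```python
def hets_to_missing(calls, missing=None):
    """Replace heterozygous diploid calls with a missing marker.

    Each call is a two-character string such as "AA" or "AG" (an assumed
    encoding for this sketch). Values already missing pass through.
    """
    cleaned = []
    for call in calls:
        if call is missing or call[0] != call[1]:
            cleaned.append(missing)  # het or already missing
        else:
            cleaned.append(call)     # homozygous call is kept
    return cleaned


site_calls = ["AA", "AG", None, "GG"]
homozygous_only = hets_to_missing(site_calls)
```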

Two-state estimates of D’ and r^2 are reviewed and discussed in Weir 1996.

Multi-state loci (>=3) require an averaging approach. In TASSEL 3 in 2010, Buckler removed these approaches because their relative magnitudes and meaningfulness have never been clear. Additionally, with the move away from SSRs to SNPs, these methods are less relevant. Researchers should convert to biallelic, either by ignoring rarer classes or by collapsing rarer states.
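
The two conversion options can be sketched as follows. This is one possible reading, not TASSEL code: "ignore" sets every allele other than the two most common to missing, and "collapse" relabels those rarer alleles as the second most common allele. to_biallelic and its mode names are hypothetical.

```python
from collections import Counter


def to_biallelic(alleles, mode="collapse", missing=None):
    """Reduce a multi-state site to two states (assumed interpretation)."""
    if mode not in ("collapse", "ignore"):
        raise ValueError("mode must be 'collapse' or 'ignore'")
    counts = Counter(a for a in alleles if a is not missing)
    ranked = [allele for allele, _ in counts.most_common()]
    if len(ranked) <= 2:
        return list(alleles)  # already biallelic (or monomorphic)
    major, minor = ranked[0], ranked[1]
    converted = []
    for a in alleles:
        if a is missing or a == major or a == minor:
            converted.append(a)
        elif mode == "collapse":
            converted.append(minor)    # fold rarer states into the minor class
        else:
            converted.append(missing)  # drop rarer states
    return converted


site = ["A"] * 5 + ["C"] * 3 + ["G"] + [None]
collapsed = to_biallelic(site, mode="collapse")
ignored = to_biallelic(site, mode="ignore")
```

Collapsing keeps the sample size intact at the cost of merging distinct alleles, while ignoring keeps the alleles clean but reduces the number of usable observations.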

Methods

calculateBitLDForHaplotype(*args) Calculates the Bit LD between two sites
calculateDPrime(*args) Calculates the normalized D’ from Weir Genetic Analysis II 1986 pg. 120
calculateRSqr(*args) Calculates the normalized r2 from Awadalla Science 1999 286:2524
castTo(pyType) Casts this object to another java/python type
clone(*args) Creates and returns a copy of this object
equals(*args) Indicates whether some other object is “equal to” this one
getAlignment(*args) Returns an annotated alignment if one was used for this LD.
getArray(size) Gets an empty wrapped java array that can accept the type of the wrapped
getClass(*args) Returns the runtime class of this Object.
getColumnCount(*args) Gets the number of columns
getDPrime(*args) Gets the D’ estimate for a given pair of sites
getDblArray(rows[, cols]) Gets an empty wrapped java array that can accept the type of other wrapped java arrays: i.e.
getElementCount(*args) Gets the total number of elements in the dataset
getLDForSitePair(*args) Method for estimating LD between a pair of bit sets.
getPval(*args) Returns P-value estimate for a given pair of sites.
getRSqr(*args) Gets the r2 estimate for a given pair of sites
getRow(*args) Returns specified row
getRowCount(*args) Gets the number of rows
getSampleSize(*args) Get number of gametes included in LD calculations (after missing data was excluded)
getSiteCount(*args) Gets the number of sites in the alignment
getTableColumnNames(*args) Gets the names of the columns
getTableTitle(*args) Gets the title of the table
getValueAt(*args) Returns value at given row and column
getX(*args) Signature: getX (int row)
getY(*args) Signature: getY (int row)
hashCode(*args) Returns a hash code value for the object
run(*args) When an object implementing interface Runnable is used to create a thread, starting the
toDict() Outputs the table as a dictionary
toString(*args) Returns a string representation of the object
wrap_existing_array(arr_instance) Wraps a java array of this class’s type
HetTreatment = <Haplotype: 0, Homozygous: 1, Genotype: 2>
__init__(*args, **kwargs)

Constructor for doing LD analysis

Signature:

LinkageDisequilibrium (GenotypeTable alignment, int windowSize, testDesign LDType, int testSite, ProgressListener listener, boolean isAccumulativeReport, int numAccumulateIntervals, int[] sitesList, HetTreatment hetTreatment)

Parameters:
  • alignment (GenotypeTable) – Input alignment with segregating sites
  • windowSize (int) – Size of sliding window
  • LDType (testDesign) – One of the testDesign enum types (All, SlidingWindow, SiteByAll, SiteList)
  • testSite (int)
  • listener (ProgressListener)
  • isAccumulativeReport (boolean)
  • numAccumulateIntervals (int)
  • sitesList (int[])
  • hetTreatment (HetTreatment)
static calculateBitLDForHaplotype(*args)

Calculates the Bit LD between two sites

Signature:

calculateBitLDForHaplotype (boolean ignoreHets, int minTaxaForEstimate, GenotypeTable alignment, int site1, int site2)

Parameters:
  • ignoreHets (boolean) – Whether to ignore heterozygous sites
  • minTaxaForEstimate (int) – The minimum number of taxa required to estimate LD
  • alignment (GenotypeTable) – A GenotypeTable containing the sites
  • site1 (int) – The index of the first site
  • site2 (int) – The index of the second site
Returns:

LDResult containing the LD info

Return type:

LDResult

static calculateDPrime(*args)

Calculates the normalized D’ from Weir Genetic Analysis II 1986 pg. 120

Signature:

calculateDPrime (int countAB, int countAb, int countaB, int countab, int minTaxaForEstimate)

Parameters:
  • countAB (int) – Count of AB alleles
  • countAb (int) – Count of Ab alleles
  • countaB (int) – Count of aB alleles
  • countab (int) – Count of ab alleles
  • minTaxaForEstimate (int) – The minimum number of taxa required for the estimate
Returns:

Value of D’

Return type:

double
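
For readers who want to see the arithmetic, here is a minimal pure-Python sketch of the textbook normalized D’ computed from the four haplotype counts: D = p(AB) - p(A)p(B), divided by the largest magnitude D can take given the allele frequencies. It is not the TASSEL implementation. Returning NaN when the sample is too small or a site is monomorphic, reporting the absolute value, and comparing the total count against the minimum are all assumptions of this sketch.

```python
def d_prime(count_AB, count_Ab, count_aB, count_ab, min_taxa=1):
    """Textbook normalized D' from four haplotype counts (sketch)."""
    n = count_AB + count_Ab + count_aB + count_ab
    if n == 0 or n < min_taxa:
        return float("nan")  # too few observations (assumed behaviour)
    p_A = (count_AB + count_Ab) / float(n)
    p_B = (count_AB + count_aB) / float(n)
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        return float("nan")  # monomorphic site: D' is undefined
    d = count_AB / float(n) - p_A * p_B
    if d < 0:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    else:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    return abs(d) / d_max


complete_ld = d_prime(10, 0, 0, 10)
partial_ld = d_prime(4, 1, 1, 4)
```

With only two of the four haplotypes present, D’ reaches 1; with all four equally frequent it is 0.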

static calculateRSqr(*args)

Calculates the normalized r2 from Awadalla Science 1999 286:2524

Signature:

calculateRSqr (int countAB, int countAb, int countaB, int countab, int minTaxaForEstimate)

Parameters:
  • countAB (int) – Count of AB alleles
  • countAb (int) – Count of Ab alleles
  • countaB (int) – Count of aB alleles
  • countab (int) – Count of ab alleles
  • minTaxaForEstimate (int) – The minimum number of taxa required for the estimate
Returns:

Value of r2

Return type:

double
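
The same four counts give r2 through the standard formula r2 = D^2 / (p(A) p(a) p(B) p(b)). The sketch below is a plain-Python illustration of that formula, not the TASSEL code; returning NaN for small samples and monomorphic sites is an assumption.

```python
def r_squared(count_AB, count_Ab, count_aB, count_ab, min_taxa=1):
    """Standard r^2 from four haplotype counts (sketch)."""
    n = count_AB + count_Ab + count_aB + count_ab
    if n == 0 or n < min_taxa:
        return float("nan")  # too few observations (assumed behaviour)
    p_A = (count_AB + count_Ab) / float(n)
    p_B = (count_AB + count_aB) / float(n)
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        return float("nan")  # at least one site is monomorphic
    d = count_AB / float(n) - p_A * p_B
    return d * d / denominator


perfect = r_squared(10, 0, 0, 10)
moderate = r_squared(4, 1, 1, 4)
```

Unlike D’, r2 only reaches 1 when the two sites have matching allele frequencies, so the two statistics can disagree sharply for rare alleles.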

getAlignment(*args)

Returns an annotated alignment if one was used for this LD. This can be used to access locus position information.

Signatures:

GenotypeTable getAlignment

Returns:

The GenotypeTable

getDPrime(*args)

Gets the D’ estimate for a given pair of sites

Signature:

getDPrime (int r, int c)

Parameters:
  • r (int) – site 1
  • c (int) – site 2
Returns:

D’

Return type:

float

static getLDForSitePair(*args)

Method for estimating LD between a pair of bit sets. Since there can be tremendous missing data, minimum minor and minimum site counts ensure that meaningful results are estimated. Site indices are merely there for annotating the LDResult.

Signature:

getLDForSitePair (BitSet rMj, BitSet rMn, BitSet cMj, BitSet cMn, int minMinorCnt, int minCnt, float minR2, FisherExact myFisherExact, int site1Index, int site2Index)

Parameters:
  • rMj (BitSet) – site 1 major alleles
  • rMn (BitSet) – site 1 minor alleles
  • cMj (BitSet) – site 2 major alleles
  • cMn (BitSet) – site 2 minor alleles
  • minMinorCnt (int) – minimum minor allele count after intersection
  • minCnt (int) – minimum count after intersection
  • minR2 (float) – results below this r2 are ignored for p-value calculation (save time)
  • myFisherExact (FisherExact) – Instance of FisherExact to do Fisher Exact test
  • site1Index (int) – Annotation of LDResult with site indices
  • site2Index (int) – Annotation of LDResult with site indices
Returns:

An LDResult for the pair of sites

Return type:

LDResult
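
The four bit sets lend themselves to fast counting: intersecting a site-1 mask with a site-2 mask and counting the set bits gives one cell of the 2x2 haplotype table, and taxa with missing data at either site drop out because they have no bit set. The sketch below uses Python integers as bit sets, with bit i standing for taxon i. popcount and haplotype_counts are hypothetical helpers, not TASSELpy functions.

```python
def popcount(bits):
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def haplotype_counts(r_mj, r_mn, c_mj, c_mn):
    """Return (countAB, countAb, countaB, countab) from four bit masks."""
    return (popcount(r_mj & c_mj),   # major at site 1, major at site 2
            popcount(r_mj & c_mn),   # major at site 1, minor at site 2
            popcount(r_mn & c_mj),   # minor at site 1, major at site 2
            popcount(r_mn & c_mn))   # minor at site 1, minor at site 2


# Six taxa; taxon 5 is missing at site 1, so it has no bit in either mask.
r_mj, r_mn = 0b000111, 0b011000   # site 1: major = taxa 0-2, minor = taxa 3-4
c_mj, c_mn = 0b001011, 0b110100   # site 2: major = taxa 0,1,3, minor = taxa 2,4,5
counts = haplotype_counts(r_mj, r_mn, c_mj, c_mn)
```

The sum of the four counts is the sample size after intersection, which is the quantity a minimum-count threshold would be checked against.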

getPval(*args)

Returns P-value estimate for a given pair of sites. If there were only 2 alleles at each locus, then the Fisher Exact P-value (one-tail) is returned. If more states, then the permuted Monte Carlo test is used

Signature:

getPVal (int r, int c)

Parameters:
  • r (int) – site 1
  • c (int) – site 2
Returns:

P-value

Return type:

double
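
A one-tailed Fisher exact p-value for a 2x2 haplotype table can be computed directly from the hypergeometric distribution. The sketch below sums the probabilities of tables with at least as many AB haplotypes as observed, with the margins held fixed. Which tail TASSEL reports is not stated here, so the right tail is an assumption, and fisher_exact_one_tail is a hypothetical helper rather than the FisherExact class.

```python
from math import exp, lgamma


def _log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_one_tail(count_AB, count_Ab, count_aB, count_ab):
    """Right-tail p-value: P(AB count >= observed) given fixed margins."""
    row1 = count_AB + count_Ab          # haplotypes carrying A
    col1 = count_AB + count_aB          # haplotypes carrying B
    n = row1 + count_aB + count_ab      # total haplotypes
    log_total = _log_choose(n, col1)
    p_value = 0.0
    for k in range(count_AB, min(row1, col1) + 1):
        # Hypergeometric probability of seeing exactly k AB haplotypes.
        p_value += exp(_log_choose(row1, k)
                       + _log_choose(n - row1, col1 - k)
                       - log_total)
    return min(p_value, 1.0)


p_weak = fisher_exact_one_tail(3, 1, 1, 3)
p_strong = fisher_exact_one_tail(5, 0, 0, 5)
```

Working in log space with lgamma avoids overflow in the binomial coefficients when the counts are large.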

getRSqr(*args)

Gets the r2 estimate for a given pair of sites

Signature:

getRSqr (int r, int c)

Parameters:
  • r (int) – site 1
  • c (int) – site 2
Returns:

r2

Return type:

float

getSampleSize(*args)

Get number of gametes included in LD calculations (after missing data was excluded)

Signature:

getSampleSize (int r, int c)

Parameters:
  • r (int) – site 1
  • c (int) – site 2
Returns:

Number of gametes

Return type:

int

getSiteCount(*args)

Gets the number of sites in the alignment

Signature: getSiteCount ()
Returns: The number of sites in the alignment
Return type: int
getX(*args)
Signature: getX (int row)
Parameters: row (int) – row
Returns: X
Return type: int
getY(*args)
Signature: getY (int row)
Parameters: row (int) – row
Returns: Y
Return type: int
testDesign = <All: 0, SlidingWindow: 1, SiteByAll: 2, SiteList: 3>

Module contents