Boundary Aligner

pytadbit.boundary_aligner.aligner.align(sequences, method='reciprocal', **kwargs)

Align Topologically Associating Domains (TADs). Multiple alignment is supported by building a consensus TAD and aligning each TAD to it. Note: because multiple alignments are built iteratively, the order of the sequences is relevant. To reduce this effect, experiments are sorted according to the value of the first boundary found.

Parameters:
  • method (reciprocal) – method used to align

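The ordering step described above can be illustrated with a minimal sketch. This is not the pytadbit implementation: the helper name `order_by_first_boundary` is hypothetical, and ascending order is an assumption, since the text only says that experiments are sorted by their first boundary.

```python
def order_by_first_boundary(sequences):
    """Sort boundary lists by the position of their first boundary.

    Hypothetical helper (not part of pytadbit). Ascending order is an
    assumption; empty lists are placed last.
    """
    return sorted(
        sequences,
        key=lambda bds: bds[0] if bds else float("inf"),
    )


experiments = [[300000, 700000], [100000, 400000], [200000]]
ordered = order_by_first_boundary(experiments)
# ordered == [[100000, 400000], [200000], [300000, 700000]]
```

Fixing the order this way makes an iterative multiple alignment reproducible for a given set of inputs, although a different ordering could still lead to a different consensus.
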
pytadbit.boundary_aligner.locally.smith_waterma(tads1, tads2)

WARNING: not working!

pytadbit.boundary_aligner.reciprocally.reciprocal(tads1, tads2, penalty=None, verbose=False, max_dist=100000)

Method based on reciprocal closest boundaries (bd). bd1 is aligned with bd2 (the boundary closest to bd1) if and only if bd1 is also the boundary closest to bd2, and the distance between bd1 and bd2 is lower than max_dist.

Parameters:
  • tads1 – list of boundaries
  • tads2 – list of boundaries
  • penalty (None) – if None, penalty will be two times max_dist
  • verbose – print alignment
  • max_dist (100000) – distance threshold at or beyond which two boundaries cannot be aligned together
Returns:

the alignment and a score between 0 and 1 (0: bad, 1: good).
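
The reciprocal-closest rule can be sketched as follows. This is an illustrative sketch under stated assumptions, not the pytadbit code: the function name `reciprocal_pairs` is hypothetical, boundaries and `max_dist` are assumed to be in the same units, ties are resolved in favour of the first candidate, and neither the penalty nor the 0 to 1 score is modelled.

```python
def reciprocal_pairs(bds1, bds2, max_dist=100000):
    """Pair boundaries that are each other's closest neighbour.

    Hypothetical helper (not part of pytadbit). Returns a list of
    (bd1, bd2) tuples; unpaired boundaries are simply left out.
    """
    if not bds1 or not bds2:
        return []

    def closest(value, candidates):
        # Index of the candidate nearest to `value` (first one wins on ties).
        return min(range(len(candidates)),
                   key=lambda i: abs(candidates[i] - value))

    pairs = []
    for i, bd1 in enumerate(bds1):
        j = closest(bd1, bds2)
        # Keep the pair only if the relation holds in both directions
        # and the two boundaries are closer than max_dist.
        if closest(bds2[j], bds1) == i and abs(bd1 - bds2[j]) < max_dist:
            pairs.append((bd1, bds2[j]))
    return pairs


pairs = reciprocal_pairs([100000, 500000, 900000],
                         [120000, 480000, 1500000])
# pairs == [(100000, 120000), (500000, 480000)]
# 900000 is closest to 480000, but 480000 is closer to 500000,
# so 900000 stays unaligned.
```
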

pytadbit.boundary_aligner.globally.needleman_wunsch(tads1, tads2, penalty=-6.0, ext_pen=-5.6, max_dist=500000, verbose=False)

Align two lists of TAD boundaries.

Parameters:
  • tads1 – list of boundaries for one chromosome under one condition
  • tads2 – list of boundaries for the same chromosome under another condition
  • penalty (-6.0) – penalty to open a gap in the alignment of boundaries
  • max_dist (500000) – distance at or beyond which matches are denied. With a bin size of 20 kb, 0.5 Mb corresponds to 25 bins
  • verbose (False) – print the Needleman-Wunsch score matrix, and the alignment of boundaries
Returns:

the max score in the Needleman-Wunsch score matrix.
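
To make the role of the gap penalty and of max_dist concrete, here is a minimal Needleman-Wunsch sketch over two boundary lists. It is a toy version under explicit assumptions, not the pytadbit implementation: the name `nw_score` and the linear-decay match score are hypothetical, a single linear gap penalty replaces the separate penalty and ext_pen terms, and the function returns the final cell of the matrix rather than its maximum.

```python
def nw_score(bds1, bds2, gap=-1.0, max_dist=500000):
    """Global alignment score of two sorted boundary lists (toy scoring)."""

    def match(a, b):
        # Hypothetical scoring: 1 for identical positions, decaying
        # linearly towards 0 at max_dist; pairs at or beyond max_dist
        # are denied by giving them a score of minus infinity.
        dist = abs(a - b)
        return 1.0 - dist / max_dist if dist < max_dist else float("-inf")

    n, m = len(bds1), len(bds2)
    # score[i][j] = best score for aligning bds1[:i] with bds2[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + match(bds1[i - 1], bds2[j - 1]),
                score[i - 1][j] + gap,   # gap in bds2
                score[i][j - 1] + gap,   # gap in bds1
            )
    return score[n][m]


# The bin arithmetic quoted for max_dist: 0.5 Mb at 20 kb per bin.
max_dist_bins = 500000 // 20000  # 25

identical = nw_score([100000, 500000], [100000, 500000])  # 2.0
too_far = nw_score([100000], [900000])                    # -2.0 (two gaps)
```
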