LocARNA-2.0.0
LocARNA::AnchorConstraints Class Reference

Represents anchor constraints between two sequences.

#include <anchor_constraints.hh>

Public Types

typedef size_t size_type
 size type
 
typedef std::pair< size_type, size_type > size_pair_t
 size pair
 
typedef size_pair_t range_t
 type of range
 

Public Member Functions

 AnchorConstraints (size_type lenA, std::vector< std::string > seqCA, size_type lenB, std::vector< std::string > seqCB, bool strict)
 Construct from sequence lengths and anchor names.
 
 AnchorConstraints (size_type lenA, const std::string &seqCA, size_type lenB, const std::string &seqCB, bool strict)
 Construct from sequence lengths and anchor names.
 
bool allowed_match (size_type i, size_type j) const
 is match allowed
 
bool allowed_del_unopt (size_type i, size_type j) const
 is deletion allowed? (unoptimized)
 
bool allowed_del (size_type i, size_type j) const
 is deletion allowed?
 
bool allowed_ins_unopt (size_type i, size_type j) const
 is insertion allowed? (unoptimized)
 
bool allowed_ins (size_type i, size_type j) const
 is insertion allowed?
 
std::string get_name_a (size_type i) const
 get the name of position i in A
 
std::string get_name_b (size_type j) const
 get the name of position j in B
 
size_type name_size () const
 returns length/size of the names
 
bool empty () const
 is the constraint declaration empty
 
size_pair_t rightmost_anchor () const
 Get rightmost anchor.
 
size_pair_t leftmost_anchor () const
 Get leftmost anchor.
 
bool is_anchored_a (size_type i) const
 Is position in A anchored?
 
bool is_anchored_b (size_type i) const
 Is position in B anchored?
 
bool is_named_a (size_type i) const
 Is position in A named?
 
bool is_named_b (size_type i) const
 Is position in B named?
 
void print_debug ()
 write some debug information to stderr
 

Detailed Description

Represents anchor constraints between two sequences.

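Maintains the constraints on (non-structural) alignment edges that have to be satisfied during the alignment.

Alignment algorithms can query whether a match, deletion, or insertion is allowed (see allowed_match(), allowed_del(), allowed_ins()) and ask for information about sequence names.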

SEMANTIC OF ANCHOR CONSTRAINTS

Generally, an anchor constraint (i,j) enforces that position i in A and position j in B are matched; neither i nor j is deleted (for local alignment, this implies that both positions occur in the local alignment). The class supports two semantics of anchor constraints. The relaxed semantics can drop constraints and produce inconsistencies during multiple alignment when some names occur only in a subset of the sequences. The strict semantics avoids such problems by introducing additional (order) dependencies between different names; consequently, the constraint specification is somewhat less flexible.

Relaxed semantics (originally, the only implemented semantics):

a) Positions with equal names must be matched (aligned to each other). Consequently, positions with names that also occur in the other sequence cannot be deleted.
b) Names that occur in only one sequence do not impose any constraints. Therefore, names can occur in arbitrary order.

Strict (ordered) semantics:

a) Names must be strictly lexicographically ordered in the annotation of each sequence.
b) Positions of equal names must be matched.
c) Alignment columns must not violate the lexicographic order, in the following sense: each alignment column where at least one position is named receives this name, and the names of alignment columns must be lexicographically ordered (in the order of the columns).
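Condition a) of the strict semantics can be checked independently for each sequence. The following is a minimal sketch, not LocARNA code: the function `names_strictly_ordered` is hypothetical, and it assumes that the names of one sequence are given as one string per position, with an empty string marking an unnamed position.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of LocARNA): check condition a) of the
// strict semantics for one sequence. names[p] is the name of a position;
// an empty string marks an unnamed position (assumption of this sketch).
bool names_strictly_ordered(const std::vector<std::string> &names) {
    const std::string *prev = nullptr; // last name seen so far
    for (const std::string &n : names) {
        if (n.empty())
            continue; // unnamed positions do not take part in the order
        if (prev != nullptr && !(*prev < n))
            return false; // equal or descending names violate strictness
        prev = &n;
    }
    return true;
}
```

Because the comparison is strict, a name that is repeated within one sequence is rejected as well.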

Constructor & Destructor Documentation

◆ AnchorConstraints() [1/2]

LocARNA::AnchorConstraints::AnchorConstraints ( size_type  lenA,
std::vector< std::string >  seqCA,
size_type  lenB,
std::vector< std::string >  seqCB,
bool  strict 
)

Construct from sequence lengths and anchor names.

Parameters
lenA    length of sequence A
seqCA   vector of anchor strings for sequence A
lenB    length of sequence B
seqCB   vector of anchor strings for sequence B
strict  use strict semantics

The constraints (= alignment edges that have to be satisfied) are encoded as follows: equal symbols in the sequences for A and B form an edge.

In order to specify an arbitrary number of sequences, the strings can consist of several lines; a symbol then consists of all characters of the column. '.' and ' ' are neutral characters, in the sense that columns consisting only of neutral characters do not specify names that have to match. However, neutral characters are not identified in names that contain at least one non-neutral character!

Example 1: seqCA={"..123...."} seqCB={"...12.3...."} specifies the edges (3,4), (4,5), and (5,7).

Example 2: seqCA={"..AAB....", "..121...."} seqCB={"...AA.B....", "...12.1...."} specifies the same constraints, allowing a larger name space for constraints.
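The encoding can be illustrated with a small stand-alone sketch that derives the edges from the anchor strings. It is not the LocARNA implementation: `column_names` and `anchor_edges` are hypothetical helpers, positions are 1-based as in the examples, and the sketch assumes that every name occurs at most once per sequence.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of LocARNA): read the name of each column
// of a multi-line anchor annotation. Entry p-1 holds the name of position p;
// an empty string marks a column made up of neutral characters only.
std::vector<std::string> column_names(const std::vector<std::string> &lines) {
    std::size_t len = lines.empty() ? 0 : lines[0].size();
    std::vector<std::string> names(len);
    for (std::size_t col = 0; col < len; ++col) {
        std::string name;
        bool neutral_only = true;
        for (const std::string &line : lines) {
            char c = col < line.size() ? line[col] : '.';
            name.push_back(c);
            if (c != '.' && c != ' ')
                neutral_only = false;
        }
        if (!neutral_only)
            names[col] = name; // the name is the whole column
    }
    return names;
}

// Hypothetical helper: pair up positions of A and B that carry equal names.
// Assumes each name occurs at most once per sequence.
std::vector<std::pair<std::size_t, std::size_t>>
anchor_edges(const std::vector<std::string> &seqCA,
             const std::vector<std::string> &seqCB) {
    std::vector<std::string> namesA = column_names(seqCA);
    std::vector<std::string> namesB = column_names(seqCB);
    std::map<std::string, std::size_t> posB; // name -> 1-based position in B
    for (std::size_t j = 0; j < namesB.size(); ++j)
        if (!namesB[j].empty())
            posB[namesB[j]] = j + 1;
    std::vector<std::pair<std::size_t, std::size_t>> edges;
    for (std::size_t i = 0; i < namesA.size(); ++i) {
        if (namesA[i].empty())
            continue;
        auto it = posB.find(namesA[i]);
        if (it != posB.end())
            edges.emplace_back(i + 1, it->second);
    }
    return edges;
}
```

Applied to either of the two examples, the sketch yields the three edges (3,4), (4,5), and (5,7); in Example 2 the column names are "A1", "A2", and "B1".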

◆ AnchorConstraints() [2/2]

LocARNA::AnchorConstraints::AnchorConstraints ( size_type  lenA,
const std::string &  seqCA,
size_type  lenB,
const std::string &  seqCB,
bool  strict 
)

Construct from sequence lengths and anchor names.

Parameters
lenA    length of sequence A
seqCA   concatenated anchor strings for sequence A (separated by '#')
lenB    length of sequence B
seqCB   concatenated anchor strings for sequence B (separated by '#')
strict  use strict semantics

For the semantics of anchor strings, see the first constructor.
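The relation between the two constructors can be pictured as splitting the concatenated string at '#' into the vector of anchor strings. The following is a minimal sketch under that assumption; `split_anchor_lines` is a hypothetical name and is not taken from LocARNA.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of LocARNA): split a concatenated anchor
// string at '#' into its individual lines.
std::vector<std::string> split_anchor_lines(const std::string &s) {
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type pos = s.find('#', start);
        if (pos == std::string::npos) {
            lines.push_back(s.substr(start)); // last (or only) line
            break;
        }
        lines.push_back(s.substr(start, pos - start));
        start = pos + 1; // continue after the separator
    }
    return lines;
}
```

With this reading, the string "..AAB....#..121...." corresponds to the vector {"..AAB....", "..121...."} from Example 2 of the first constructor. How the real constructor treats empty input or a trailing '#' is not stated here; the sketch simply returns empty lines in those cases.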

Member Function Documentation

◆ allowed_del()

bool LocARNA::AnchorConstraints::allowed_del ( size_type  i,
size_type  j 
) const
inline

is deletion allowed?

Parameters
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns
whether it is allowed to delete i immediately right of j
See also
allowed_del_unopt()

◆ allowed_del_unopt()

bool LocARNA::AnchorConstraints::allowed_del_unopt ( size_type  i,
size_type  j 
) const
inline

is deletion allowed? (unoptimized)

Parameters
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns
whether it is allowed to delete i immediately right of j
See also
allowed_match(), allowed_ins()

Definition (strict semantics): allowed_del(i,j) iff ( !is_anchored(i) && names_a_[ max { i'<=i | named(i') } ] < names_b_[ min { j'>=j+1 | named(j') } ] && names_a_[ min { i'>=i | named(i') } ] > names_b_[ max { j'<=j | named(j') } ] )

Definition (relaxed semantics): allowed_del(i,j) iff i~"j+0.5" does not cross (or touch) any edge i'~j', where names_a_[i']=names_b_[j']
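The relaxed definition can be read geometrically: deleting i immediately right of j behaves like a virtual edge from i to the gap between j and j+1. The sketch below spells out that reading over an explicit edge list. It is an interpretation of the definition, not the optimized LocARNA code; `Edge` and `allowed_del_relaxed` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>; // anchor edge (i', j')

// Hypothetical sketch of the relaxed rule: position i of A is deleted in the
// gap between positions j and j+1 of B, i.e. the virtual edge i~"j+0.5".
bool allowed_del_relaxed(std::size_t i, std::size_t j,
                         const std::vector<Edge> &edges) {
    for (const Edge &e : edges) {
        if (e.first == i)
            return false; // touches: i itself is anchored
        if (e.first < i && e.second > j)
            return false; // crosses: edge starts left of i, ends right of the gap
        if (e.first > i && e.second <= j)
            return false; // crosses: edge starts right of i, ends left of the gap
    }
    return true;
}
```

For the edges (3,4), (4,5), and (5,7) of the constructor example, deleting position 6 of A right of position 7 of B is allowed, while deleting it right of position 5 is not, because that would cross the edge (5,7).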

◆ allowed_ins()

bool LocARNA::AnchorConstraints::allowed_ins ( size_type  i,
size_type  j 
) const
inline

is insertion allowed?

Parameters
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns
whether it is allowed to insert j immediately right of i
See also
allowed_match(), allowed_del()

◆ allowed_ins_unopt()

bool LocARNA::AnchorConstraints::allowed_ins_unopt ( size_type  i,
size_type  j 
) const
inline

is insertion allowed? (unoptimized)

Parameters
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns
whether it is allowed to insert j immediately right of i
See also
allowed_match(), allowed_del()

◆ allowed_match()

bool LocARNA::AnchorConstraints::allowed_match ( size_type  i,
size_type  j 
) const
inline

is match allowed

Parameters
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns
whether i~j is an allowed match

Test whether the alignment edge i~j (i.e. the match of i and j) is allowed. An alignment edge is allowed iff it is not in conflict with any anchor constraint.

Definition (strict semantics): allowed_match(i,j) iff ( names_a_[ max { i'<=i | named(i') } ] <= names_b_[ min { j'>=j | named(j') } ] && names_a_[ min { i'>=i | named(i') } ] >= names_b_[ max { j'<=j | named(j') } ] )

Definition (relaxed semantics): allowed_match(i,j) iff i~j does not cross (or touch) any edge i'~j' != i~j, where names_a_[i']=names_b_[j']
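Under the relaxed definition, a match i~j is compatible with another anchor edge only if that edge lies strictly to the left or strictly to the right in both sequences. The following sketch checks this against an explicit edge list. It is one reading of the definition and not the LocARNA implementation; `Edge` and `allowed_match_relaxed` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>; // anchor edge (i', j')

// Hypothetical sketch of the relaxed rule: i~j is allowed iff every other
// anchor edge lies strictly left or strictly right of it in both sequences.
bool allowed_match_relaxed(std::size_t i, std::size_t j,
                           const std::vector<Edge> &edges) {
    for (const Edge &e : edges) {
        if (e.first == i && e.second == j)
            continue; // the anchor edge itself is always allowed
        bool strictly_left = e.first < i && e.second < j;
        bool strictly_right = e.first > i && e.second > j;
        if (!strictly_left && !strictly_right)
            return false; // the edges cross or share a position
    }
    return true;
}
```

With the edges (3,4), (4,5), and (5,7), the match 3~4 is allowed because it is itself an anchor edge, whereas 3~5 is rejected because it shares position 3 with the edge (3,4).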

◆ is_anchored_a()

bool LocARNA::AnchorConstraints::is_anchored_a ( size_type  i) const
inline

Is position in A anchored?

Parameters
i  position in A
Note
defined only for positions i in 0..lenA_+1

◆ is_anchored_b()

bool LocARNA::AnchorConstraints::is_anchored_b ( size_type  i) const
inline

Is position in B anchored?

See also
is_anchored_a

◆ is_named_a()

bool LocARNA::AnchorConstraints::is_named_a ( size_type  i) const
inline

Is position in A named?

Parameters
i  position in A

◆ is_named_b()

bool LocARNA::AnchorConstraints::is_named_b ( size_type  i) const
inline

Is position in B named?

See also
is_named_a

◆ leftmost_anchor()

size_pair_t LocARNA::AnchorConstraints::leftmost_anchor ( ) const
inline

Get leftmost anchor.

Returns
the positions (i,j) of the leftmost anchor constraint
Note
if there are no anchors, return (lenA+1,lenB+1)

◆ rightmost_anchor()

size_pair_t LocARNA::AnchorConstraints::rightmost_anchor ( ) const
inline

Get rightmost anchor.

Returns
the positions (i,j) of the rightmost anchor constraint
Note
if there are no anchors, return (0,0)
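The return values documented for leftmost_anchor() and rightmost_anchor(), including the values for the case without anchors, can be mimicked over an explicit edge list. This is a sketch for illustration only: the names `leftmost_anchor_sketch` and `rightmost_anchor_sketch` and the edge-list representation are assumptions, and the class itself may store this information differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>; // anchor edge (i, j)

// Hypothetical sketch: leftmost anchor, or (lenA+1, lenB+1) without anchors.
Edge leftmost_anchor_sketch(const std::vector<Edge> &edges,
                            std::size_t lenA, std::size_t lenB) {
    if (edges.empty())
        return Edge(lenA + 1, lenB + 1);
    // Pairs compare lexicographically, so the minimum has the smallest i.
    return *std::min_element(edges.begin(), edges.end());
}

// Hypothetical sketch: rightmost anchor, or (0, 0) without anchors.
Edge rightmost_anchor_sketch(const std::vector<Edge> &edges) {
    if (edges.empty())
        return Edge(0, 0);
    return *std::max_element(edges.begin(), edges.end());
}
```

The values returned without anchors lie just outside the sequences, so a caller can treat them as boundaries without testing for the empty case separately.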

The documentation for this class was generated from the following files: