LocARNA-2.0.0
Represents a multiple alignment. More...
#include <multiple_alignment.hh>
Classes | |
class | AliColumn |
read only proxy class representing a column of the alignment More... | |
class | SeqEntry |
A row in a multiple alignment. More... | |
Public Types | |
enum class | FormatType { STOCKHOLM , PP , CLUSTAL , FASTA } |
file format type for multiple alignments More... | |
enum class | AnnoType { consensus_structure , structure , fixed_structure , anchors } |
type of sequence annotation. enumerates legal annotation types More... | |
typedef std::vector< SeqEntry >::const_iterator | const_iterator |
const iterator of sequence entries | |
typedef std::vector< SeqEntry >::iterator | iterator |
iterator of sequence entries | |
using | value_type = SeqEntry |
Public Member Functions | |
MultipleAlignment () | |
Construct empty. | |
MultipleAlignment (const MultipleAlignment &ma)=default | |
Copy construct. | |
MultipleAlignment (MultipleAlignment &&ma)=default | |
Move construct. | |
MultipleAlignment & | operator= (const MultipleAlignment &ma)=default |
Copy assignment. | |
MultipleAlignment & | operator= (MultipleAlignment &&ma)=default |
Move assignment. | |
virtual | ~MultipleAlignment () |
virtual destructor | |
MultipleAlignment (const std::string &file, FormatType format=FormatType::CLUSTAL) | |
Construct from file. More... | |
MultipleAlignment (std::istream &in, FormatType format=FormatType::CLUSTAL) | |
Construct from stream. More... | |
MultipleAlignment (const std::string &name, const std::string &sequence) | |
Construct as degenerate alignment of one sequence. More... | |
MultipleAlignment (const std::string &nameA, const std::string &nameB, const std::string &alistringA, const std::string &alistringB) | |
Construct as pairwise alignment from names and alignment strings. More... | |
MultipleAlignment (const Alignment &alignment, bool only_local=false, bool special_gap_symbols=false) | |
Construct from Alignment object. More... | |
MultipleAlignment (const AlignmentEdges &edges, const Sequence &seqA, const Sequence &seqB) | |
Construct from alignment edges and sequences. More... | |
void | normalize_rna_symbols () |
normalize rna symbols More... | |
size_type | num_of_rows () const |
Number of rows of multiple alignment. More... | |
bool | empty () const |
Emptiness check. More... | |
const SequenceAnnotation & | annotation (const AnnoType &annotype) const |
Read access of annotation by type. More... | |
void | set_annotation (const AnnoType &annotype, const SequenceAnnotation &annotation) |
Write access to annotation. More... | |
bool | has_annotation (const AnnoType &annotype) const |
bool | is_proper () const |
Test whether alignment is proper. More... | |
pos_type | length () const |
Length of multiple alignment. More... | |
const_iterator | begin () const |
Begin for read-only traversal of name/sequence pairs. More... | |
const_iterator | end () const |
End for read-only traversal of name/sequence pairs. More... | |
bool | contains (const std::string &name) const |
Test whether name exists. More... | |
size_type | index (const std::string &name) const |
Access index by name. More... | |
const SeqEntry & | seqentry (size_type index) const |
Access name/sequence pair by index. More... | |
const SeqEntry & | seqentry (const std::string &name) const |
Access name/sequence pair by name. More... | |
size_type | deviation (const MultipleAlignment &ma) const |
Deviation of a multiple alignment from a reference alignment. More... | |
double | sps (const MultipleAlignment &ma, bool compalign=true) const |
Sum-of-pairs score between a multiple alignment and a reference alignment. More... | |
double | cmfinder_realignment_score (const MultipleAlignment &ma) const |
Cmfinder realignment score of a multiple alignment to a reference alignment. More... | |
double | avg_deviation_score (const MultipleAlignment &ma) const |
Average deviation score. More... | |
std::string | consensus_sequence () const |
Consensus sequence of multiple alignment. More... | |
AliColumn | column (size_type col_index) const |
Access alignment column. More... | |
void | append (const SeqEntry &seqentry) |
Append sequence entry. More... | |
void | prepend (const SeqEntry &seqentry) |
Prepend sequence entry. More... | |
void | operator+= (const AliColumn &c) |
Append a column. More... | |
void | operator+= (char c) |
Append the same character to each row. More... | |
void | reverse () |
reverse the multiple alignment | |
std::ostream & | write (std::ostream &out, FormatType format=MultipleAlignment::FormatType::CLUSTAL) const |
Write alignment to stream. More... | |
std::ostream & | write (std::ostream &out, size_t width, FormatType format=MultipleAlignment::FormatType::CLUSTAL) const |
Write alignment to stream (wrapped) More... | |
std::ostream & | write_name_sequence_line (std::ostream &out, const std::string &name, const std::string &sequence, size_t namewidth) const |
Write formatted line of name and sequence. More... | |
std::ostream & | write (std::ostream &out, size_type start, size_type end, FormatType format=MultipleAlignment::FormatType::CLUSTAL) const |
Write sub-alignment to stream. More... | |
template<size_t N> | |
bool | checkAlphabet (const Alphabet< char, N > &alphabet) const |
check character constraints More... | |
void | write_debug (std::ostream &out=std::cout) const |
Print contents of object to stream. More... | |
Static Public Member Functions | |
static size_t | num_of_annotypes () |
number of annotation types More... | |
Static Public Attributes | |
static const std::vector< FormatType > | FormatTypes |
collection of the format types More... | |
static const std::vector< AnnoType > | AnnoTypes |
collection of the annotation types More... | |
Protected Member Functions | |
void | init (const AlignmentEdges &edges, const Sequence &seqA, const Sequence &seqB, bool special_gap_symbols) |
Initialize from alignment edges and sequences. More... | |
Represents a multiple alignment.
The multiple alignment is implemented as a vector of name/sequence pairs.
Supports traversal of name/sequence pairs. The sequence entries support mapping from columns to positions and back.
Names are unique in a multiple alignment object.
Sequence positions and column indices are 1-based (1..len).
MultipleAlignment can have anchor and structure annotation and can read and write them.
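The mapping between alignment columns and sequence positions mentioned above can be illustrated with a minimal, self-contained sketch. It does not use LocARNA: the helper names `col_to_pos` and `pos_to_col`, the use of `0` as a "no position" value, and the treatment of `-` as the only gap symbol are assumptions made for illustration. Columns and positions are both 1-based, matching the convention stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of LocARNA): map a 1-based alignment column
// to the 1-based sequence position in a gapped row.
// Returns 0 if the column holds a gap. Only '-' is treated as a gap here.
std::size_t col_to_pos(const std::string &row, std::size_t col) {
    assert(col >= 1 && col <= row.size());
    if (row[col - 1] == '-') return 0;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < col; ++i) {
        if (row[i] != '-') ++pos;  // count residues up to and including col
    }
    return pos;
}

// Hypothetical helper: map a 1-based sequence position back to its 1-based
// column. Returns 0 if the row has fewer than `pos` residues.
std::size_t pos_to_col(const std::string &row, std::size_t pos) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (row[i] != '-' && ++seen == pos) return i + 1;
    }
    return 0;
}
```

For the row `AC--GU`, column 5 holds the third residue, so the two helpers are inverse to each other on non-gap columns.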
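enum class LocARNA::MultipleAlignment::AnnoType |
strong |
type of sequence annotation. enumerates legal annotation types
enum class LocARNA::MultipleAlignment::FormatType |
strong |
file format type for multiple alignments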
LocARNA::MultipleAlignment::MultipleAlignment | ( | const std::string & | file, |
FormatType | format = FormatType::CLUSTAL |
||
) |
Construct from file.
file | name of input file |
format | file format ( |
failure | on read problems |
LocARNA::MultipleAlignment::MultipleAlignment | ( | std::istream & | in, |
FormatType | format = FormatType::CLUSTAL |
||
) |
Construct from stream.
in | input stream with alignment in clustalW-like format |
format | file format ( |
failure | on read errors |
LocARNA::MultipleAlignment::MultipleAlignment | ( | const std::string & | name, |
const std::string & | sequence | ||
) |
Construct as degenerate alignment of one sequence.
name | name of sequence |
sequence | sequence string |
LocARNA::MultipleAlignment::MultipleAlignment | ( | const std::string & | nameA, |
const std::string & | nameB, | ||
const std::string & | alistringA, | ||
const std::string & | alistringB | ||
) |
Construct as pairwise alignment from names and alignment strings.
nameA | name of sequence A |
nameB | name of sequence B |
alistringA | alignment string of sequence A |
alistringB | alignment string of sequence B |
LocARNA::MultipleAlignment::MultipleAlignment | ( | const Alignment & | alignment, |
bool | only_local = false , |
||
bool | special_gap_symbols = false |
||
) |
Construct from Alignment object.
alignment | object of type Alignment |
only_local | if true, construct only local alignment |
special_gap_symbols | if true, use special distinct gap symbols for gaps due to loop deletion '_' or sparsification '~' |
LocARNA::MultipleAlignment::MultipleAlignment | ( | const AlignmentEdges & | edges, |
const Sequence & | seqA, | ||
const Sequence & | seqB | ||
) |
Construct from alignment edges and sequences.
edges | alignment edges |
seqA | sequence A |
seqB | sequence B |
const SequenceAnnotation & LocARNA::MultipleAlignment::annotation | ( | const AnnoType & | annotype | ) | const |
Read access of annotation by type.
annotype | type of annotation |
void LocARNA::MultipleAlignment::append | ( | const SeqEntry & | seqentry | ) |
Append sequence entry.
seqentry | new sequence entry |
double LocARNA::MultipleAlignment::avg_deviation_score | ( | const MultipleAlignment & | ma | ) | const |
Average deviation score.
ma | multiple alignment |
const_iterator LocARNA::MultipleAlignment::begin | ( | ) | const |
inline |
Begin for read-only traversal of name/sequence pairs.
bool LocARNA::MultipleAlignment::checkAlphabet | ( | const Alphabet< char, N > & | alphabet | ) | const |
check character constraints
Check whether the alignment contains characters from the given alphabet only
alphabet | alphabet of admissible characters |
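The idea of such a character check can be sketched without LocARNA. The function below is a hypothetical stand-in, not the library's implementation: it takes the admissible characters as a plain `std::string` instead of an `Alphabet<char, N>`, and it assumes that gap symbols must be listed among the admissible characters as well.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an alphabet check (not LocARNA code):
// returns true if every character of every row occurs in `allowed`.
// Gap symbols are ordinary characters here and must be part of `allowed`.
bool rows_use_only(const std::vector<std::string> &rows,
                   const std::string &allowed) {
    for (const std::string &row : rows) {
        // find_first_not_of yields npos when no foreign character exists
        if (row.find_first_not_of(allowed) != std::string::npos) return false;
    }
    return true;
}
```

With the admissible characters `ACGU-`, a row containing `T` would make the check fail.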
double LocARNA::MultipleAlignment::cmfinder_realignment_score | ( | const MultipleAlignment & | ma | ) | const |
Cmfinder realignment score of a multiple alignment to a reference alignment.
ma | multiple alignment |
AliColumn LocARNA::MultipleAlignment::column | ( | size_type | col_index | ) | const |
Access alignment column.
col_index | column index |
std::string LocARNA::MultipleAlignment::consensus_sequence | ( | ) | const |
Consensus sequence of multiple alignment.
Computes the consensus sequence by simple majority in each column. Assumes that only ASCII characters with codes below 127 occur.
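A simple-majority consensus can be sketched as follows. This is a standalone illustration under stated assumptions, not the code behind `consensus_sequence()`: the name `majority_consensus` is hypothetical, gap characters are counted like any other symbol, and ties are resolved in favor of the smallest character code. How the library treats gaps and ties is not stated on this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical standalone sketch (not LocARNA code): pick the most frequent
// character in each column. Assumptions: all rows have equal length, only
// 7-bit ASCII occurs, gaps count as characters, ties go to the smallest code.
std::string majority_consensus(const std::vector<std::string> &rows) {
    if (rows.empty()) return "";
    const std::size_t len = rows[0].size();
    std::string cons;
    cons.reserve(len);
    for (std::size_t col = 0; col < len; ++col) {
        std::array<std::size_t, 128> count{};  // one counter per ASCII code
        for (const std::string &row : rows) {
            assert(row.size() == len);
            const unsigned char c = static_cast<unsigned char>(row[col]);
            assert(c < 128);
            ++count[c];
        }
        std::size_t best = 0;
        for (std::size_t c = 1; c < count.size(); ++c) {
            if (count[c] > count[best]) best = c;  // strict: first maximum wins
        }
        cons.push_back(static_cast<char>(best));
    }
    return cons;
}
```

Using a fixed array of 128 counters is what makes the ASCII restriction matter: a character code outside that range would index past the array.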
bool LocARNA::MultipleAlignment::contains | ( | const std::string & | name | ) | const |
Test whether name exists.
name | name of a sequence |
size_type LocARNA::MultipleAlignment::deviation | ( | const MultipleAlignment & | ma | ) | const |
Deviation of a multiple alignment from a reference alignment.
ma | multiple alignment |
bool LocARNA::MultipleAlignment::empty | ( | ) | const |
inline |
Emptiness check.
const_iterator LocARNA::MultipleAlignment::end | ( | ) | const |
inline |
End for read-only traversal of name/sequence pairs.
bool LocARNA::MultipleAlignment::has_annotation | ( | const AnnoType & | annotype | ) | const |
inline |
Annotation availability
annotype | annotation type |
size_type LocARNA::MultipleAlignment::index | ( | const std::string & | name | ) | const |
inline |
Access index by name.
name | name of a sequence |
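void LocARNA::MultipleAlignment::init | ( | const AlignmentEdges & | edges, |
const Sequence & | seqA, | ||
const Sequence & | seqB, | ||
bool | special_gap_symbols | ||
) |
protected |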
Initialize from alignment edges and sequences.
edges | alignment edges |
seqA | sequence A |
seqB | sequence B |
special_gap_symbols | if true, use special distinct gap symbols for gaps due to loop deletion '_' or sparsification '~' |
bool LocARNA::MultipleAlignment::is_proper | ( | ) | const |
Test whether alignment is proper.
pos_type LocARNA::MultipleAlignment::length | ( | ) | const |
inline |
Length of multiple alignment.
void LocARNA::MultipleAlignment::normalize_rna_symbols | ( | ) |
normalize rna symbols
Normalize the symbols in all aligned sequences assuming that they code for RNA
static size_t LocARNA::MultipleAlignment::num_of_annotypes | ( | ) |
inline static |
number of annotation types
size_type LocARNA::MultipleAlignment::num_of_rows | ( | ) | const |
inline |
Number of rows of multiple alignment.
void LocARNA::MultipleAlignment::operator+= | ( | char | c | ) |
Append the same character to each row.
c | character that is appended |
void LocARNA::MultipleAlignment::operator+= | ( | const AliColumn & | c | ) |
Append a column.
c | column that is appended |
void LocARNA::MultipleAlignment::prepend | ( | const SeqEntry & | seqentry | ) |
Prepend sequence entry.
seqentry | new sequence entry |
const SeqEntry & LocARNA::MultipleAlignment::seqentry | ( | const std::string & | name | ) | const |
inline |
Access name/sequence pair by name.
name | name of name/sequence pair |
const SeqEntry & LocARNA::MultipleAlignment::seqentry | ( | size_type | index | ) | const |
Access name/sequence pair by index.
index | index of name/sequence pair (0-based) |
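void LocARNA::MultipleAlignment::set_annotation | ( | const AnnoType & | annotype, |
const SequenceAnnotation & | annotation | ||
) |
inline |
Write access to annotation.
annotype | annotation type |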
annotation | sequence annotation |
double LocARNA::MultipleAlignment::sps | ( | const MultipleAlignment & | ma, |
bool | compalign = true |
||
) | const |
Sum-of-pairs score between a multiple alignment and a reference alignment.
ma | multiple alignment |
compalign | whether to compute score like compalign |
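One common definition of a sum-of-pairs score is the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch below implements that definition only. Whether `sps()` computes exactly this, and what the `compalign` flag changes, is not stated on this page, so the code should be read as an assumption-laden illustration. The names `aligned_pairs` and `sum_of_pairs_score` are hypothetical, rows are matched by index, `-` is the only gap symbol, and an empty reference yields 1.0 by a convention chosen here.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

using Rows = std::vector<std::string>;
// (row a, row b, 1-based position in a, 1-based position in b)
using AlignedPair =
    std::tuple<std::size_t, std::size_t, std::size_t, std::size_t>;

// Collect all residue pairs that share a column (hypothetical helper).
std::set<AlignedPair> aligned_pairs(const Rows &rows) {
    std::set<AlignedPair> pairs;
    if (rows.empty()) return pairs;
    const std::size_t len = rows[0].size();
    std::vector<std::size_t> pos(rows.size(), 0);  // residues seen per row
    for (std::size_t col = 0; col < len; ++col) {
        for (std::size_t r = 0; r < rows.size(); ++r) {
            assert(rows[r].size() == len);
            if (rows[r][col] != '-') ++pos[r];
        }
        for (std::size_t a = 0; a < rows.size(); ++a) {
            if (rows[a][col] == '-') continue;
            for (std::size_t b = a + 1; b < rows.size(); ++b) {
                if (rows[b][col] != '-') pairs.emplace(a, b, pos[a], pos[b]);
            }
        }
    }
    return pairs;
}

// Fraction of reference pairs that the test alignment reproduces.
double sum_of_pairs_score(const Rows &test, const Rows &reference) {
    const std::set<AlignedPair> ref_pairs = aligned_pairs(reference);
    if (ref_pairs.empty()) return 1.0;  // convention chosen for this sketch
    const std::set<AlignedPair> test_pairs = aligned_pairs(test);
    std::size_t shared = 0;
    for (const AlignedPair &p : ref_pairs) shared += test_pairs.count(p);
    return static_cast<double>(shared) /
           static_cast<double>(ref_pairs.size());
}
```

Pairs are keyed by sequence positions rather than column indices, so two alignments that place the same residues together score as equal even if their gap columns differ.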
std::ostream & LocARNA::MultipleAlignment::write | ( | std::ostream & | out, |
FormatType | format = MultipleAlignment::FormatType::CLUSTAL |
||
) | const |
Write alignment to stream.
out | output stream |
format | alignment format; only CLUSTAL or STOCKHOLM; default: CLUSTAL ( |
Writes one line "<name> <seq>" for each sequence; moreover, writes annotations.
std::ostream & LocARNA::MultipleAlignment::write | ( | std::ostream & | out, |
size_t | width, | ||
FormatType | format = MultipleAlignment::FormatType::CLUSTAL |
||
) | const |
Write alignment to stream (wrapped)
out | output stream |
width | line width at which the output is wrapped |
format | alignment format; only CLUSTAL or STOCKHOLM; default: CLUSTAL ( |
Writes one line "<name> <seq>" per sequence and wraps lines at width.
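The wrapping behavior can be pictured with a small standalone sketch. It is not the library's `write` method: the names `write_wrapped` and `wrapped_to_string`, the padding of names to a common width, and the blank line between blocks are assumptions made for illustration, and the sketch ignores annotations and format-specific headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using NamedRows = std::vector<std::pair<std::string, std::string>>;

// Hypothetical sketch (not LocARNA code): write "<name> <seq>" lines in
// blocks of at most `width` alignment columns.
std::ostream &write_wrapped(std::ostream &out, const NamedRows &rows,
                            std::size_t width) {
    if (rows.empty() || width == 0) return out;
    std::size_t namewidth = 0;
    for (const auto &row : rows) {
        namewidth = std::max(namewidth, row.first.size());
    }
    const std::size_t len = rows.front().second.size();
    for (std::size_t start = 0; start < len; start += width) {
        if (start > 0) out << '\n';  // blank line separates blocks
        for (const auto &row : rows) {
            assert(row.second.size() == len);
            // pad the name so that all sequence chunks start in one column
            const std::string pad(namewidth - row.first.size() + 1, ' ');
            out << row.first << pad << row.second.substr(start, width) << '\n';
        }
    }
    return out;
}

// Convenience wrapper that collects the output in a string.
std::string wrapped_to_string(const NamedRows &rows, std::size_t width) {
    std::ostringstream out;
    write_wrapped(out, rows, width);
    return out.str();
}
```

Returning the stream from `write_wrapped` mirrors the `std::ostream &` return type of the documented `write` overloads and allows chaining.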
std::ostream & LocARNA::MultipleAlignment::write | ( | std::ostream & | out, |
size_type | start, | ||
size_type | end, | ||
FormatType | format = MultipleAlignment::FormatType::CLUSTAL |
||
) | const |
Write sub-alignment to stream.
Write from position start to position end to output stream out; write lines "<name> <seq>"
out | output stream |
start | start column (1-based) |
end | end column (1-based) |
format | alignment format; default: CLUSTAL ( |
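Because start and end are 1-based column indices, extracting the corresponding slice of a row needs an offset of one. The helper below is a hypothetical illustration, not LocARNA code, and it assumes that the end column is inclusive, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of LocARNA): columns start..end of a row,
// both 1-based; the end column is assumed to be inclusive.
std::string sub_row(const std::string &row, std::size_t start,
                    std::size_t end) {
    assert(start >= 1 && start <= end && end <= row.size());
    // 1-based start maps to 0-based offset start - 1
    return row.substr(start - 1, end - start + 1);
}
```

For the row `ACGU-A`, columns 2 to 4 give `CGU` under this convention.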
void LocARNA::MultipleAlignment::write_debug | ( | std::ostream & | out = std::cout | ) | const |
Print contents of object to stream.
out | output stream |
std::ostream & LocARNA::MultipleAlignment::write_name_sequence_line | ( | std::ostream & | out, |
const std::string & | name, | ||
const std::string & | sequence, | ||
size_t | namewidth | ||
) | const |
Write formatted line of name and sequence.
The line is formatted such that it fits the output of the write methods.
out | output stream |
name | name string |
sequence | sequence string |
const std::vector< AnnoType > LocARNA::MultipleAlignment::AnnoTypes |
static |
collection of the annotation types
const std::vector< FormatType > LocARNA::MultipleAlignment::FormatTypes |
static |
collection of the format types