# 5.2 Glossary

## 5.2.1 Pairwise alignment (noun)

A hypothesis about which bases or amino acids in two biological sequences are derived from a common ancestral base or amino acid. By definition, the aligned sequences are of equal length, with gaps (usually denoted with -, or . for terminal gaps) indicating hypothesized insertion or deletion events. A pairwise alignment may be represented as follows:

ACC---GTAC
CCCATCGTAG
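
The equal-length property and the gap convention described above can be checked mechanically. The following is a minimal sketch in plain Python, not a scikit-bio feature: the function name `describe_alignment` and the choice to treat both `-` and `.` as gap characters are assumptions made for illustration.

```python
def describe_alignment(seq1, seq2, gap_chars="-."):
    """Classify each column of a pairwise alignment (hypothetical helper)."""
    # Aligned sequences must be the same length by definition.
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must be of equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(seq1, seq2):
        if a in gap_chars or b in gap_chars:
            # A gap in either sequence marks a hypothesized insertion or deletion.
            counts["gap"] += 1
        elif a == b:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts


summary = describe_alignment("ACC---GTAC", "CCCATCGTAG")
print(summary)  # {'match': 5, 'mismatch': 2, 'gap': 3}
```

Each column is counted exactly once, so the three counts always sum to the alignment length (ten here).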

## 5.2.2 kmer (noun)

A kmer is a word (a run of adjacent characters) of length k in a sequence. For example, the overlapping kmers of length five in the sequence ACCGTGACCAGTTACCAGTTTGACCAA are as follows:

In [1]:
import skbio
skbio.DNA('ACCGTGACCAGTTACCAGTTTGACCAA').kmer_frequencies(k=5, overlap=True)

Out[1]:
{'ACCAA': 1,
'ACCAG': 2,
'ACCGT': 1,
'AGTTA': 1,
'AGTTT': 1,
'CAGTT': 2,
'CCAGT': 2,
'CCGTG': 1,
'CGTGA': 1,
'GACCA': 2,
'GTGAC': 1,
'GTTAC': 1,
'GTTTG': 1,
'TACCA': 1,
'TGACC': 2,
'TTACC': 1,
'TTGAC': 1,
'TTTGA': 1}
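
The counts above can also be reproduced without scikit-bio, which may make the definition more concrete. This is a minimal sketch using only the standard library. The helper name `count_kmers` is hypothetical, and the sketch is not meant to describe how `kmer_frequencies` is implemented.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count overlapping kmers in a string (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Slide a window of width k one position at a time; a sequence of
    # length n contains n - k + 1 overlapping kmers.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


counts = count_kmers("ACCGTGACCAGTTACCAGTTTGACCAA", k=5)
print(len(counts))           # 18 distinct kmers
print(sum(counts.values()))  # 23 kmers in total (27 - 5 + 1)
print(counts["ACCAG"])       # 2
```
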

It is common for bioinformaticians to substitute the value of k for the letter k in the word kmer. For example, you might hear someone say "we identified all seven-mers in our sequence" to mean that they identified all kmers of length seven.