Lineage-specific subsittuiton rate acceleration in primates

2018-08-10

Introduction

Here we show how to use to annotate an alignment with respect to accelerated substitution rates in (any combination of) five primates.

Data

The package provide an example alignment and phylogenetic tree model:

library(linACC)
library(rphast)

ali_file = system.file("extdata", "ali.maf", package="linACC")
ali      = read.msa(ali_file)
ali_file = system.file("extdata", "ali_fast.maf", package="linACC")
ali_fast = read.msa(ali_file)


mod_file = system.file("extdata", "hg19.100way.phastCons.mammals.mod", package="linACC")
mod      = read.tm(mod_file)

Nested model structure

After reading in the alignemnt and phylogenetic model, we set up the nested model structure within the primates:

##- what we looking at
#====================
NBITS    = 10
specs    = c("hg19","panTro4","gorGor3","ponAbe2","nomLeu3")
specs    = as.vector(rbind(specs,specs))
specs = paste(specs,c(".sel",".bgc"),sep="")
PCUT     = 0.01 #- stop traversing if all over sig-thresh


#- enumerate the models for five leaf branches
#=============================================
mod.vecs = int2bitVec(0:(2^NBITS-1),NBITS)
mod.strs = int2bitStr(0:(2^NBITS-1),NBITS)
mod.vecs = t(apply(mod.vecs,1,as.logical))

#- reorder by numbers of parameters
ord            = order(apply(mod.vecs,2,sum))
mod.vecs       = mod.vecs[,ord]
mod.strs       = mod.strs[ord]
rownames(mod.vecs) = specs
colnames(mod.vecs) = mod.strs

#- make the adjacency matrix (nested *with one extra par*)
#=========================================================
amat = getAdjacency(mod.vecs)

Model fitting

Now we are ready to annotate our alignment to a model class:

#- Alignment gets classified as background:
ftd = fitNested(ali,mod.vecs,mod,amat)
cla = classifyNested(ftd,mod.vecs,pcut=.01)

#- Alignment gets classified as accelerated in gibbon
ftd = fitNested(ali_fast,mod.vecs,mod,amat)
cla = classifyNested(ftd,mod.vecs,pcut=.01)