Sequence alignment

retinol-binding protein (NP_006735) vs β-lactoglobulin (P02754)

Central idea: find the similarity that results from homology among similarities that arise by chance

  1 MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEG  50 RBP
          |||  |         |         |     ||||  |
  1 ...MKCLLLALALTCGAQALIVT..QTMKGLDIQKVAGTWYSLAMAASD.  44 lactoglobulin

 51 LFLQDNIVAEFSVDETGQMSATAKGRVR.LLNNWD..VCADMVGTFTDTE  97 RBP
      | |   |   |        |  |    ||  |    ||        |
 45 ISLLDAQSAPLRV.YVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTK  93 lactoglobulin

 98 DPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAV...........QYSC 136 RBP
     || ||         |          ||||  |                |
 94 IPAVFKIDALNENKVL........VLDTDYKKYLLFCMENSAEPEQSLAC 135 lactoglobulin

137 RLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQ.EELCLARQYRLIV 185 RBP
      |       |     |      ||          | || |
136 QCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI....... 178 lactoglobulin

(| = identical residue; a dot within a sequence = gap)

Outline
• Basic concepts of sequence alignment
• The Dayhoff model and BLOSUM
• Alignment algorithms: local alignment and global alignment
• Statistical testing of alignment results

Uses of sequence alignment
• Evolution of biological macromolecules
• Searching for similar sequences
• It is used to identify domains or motifs that are shared between proteins
• It is the basis of database searching
• Structure prediction and gene prediction

Protein sequence alignment is more informative than DNA alignment
• protein is more informative (20 vs 4 characters); many amino acids share related biophysical properties
• codons are degenerate: changes in the third position often do not alter the amino acid that is specified
• protein sequences offer a longer "look-back" time
• DNA sequences can be translated into protein, and then used in pairwise alignments

Query: 181 catcaactacaactccaaagacacccttacacccactaggatatcaacaaacctacccac 240
           |||||||| |||| |||||| ||||| | |||||||||||||||||||||||||||||||
Sbjct: 189 catcaactgcaaccccaaagccacccct-cacccactaggatatcaacaaacctacccac 247

Special uses of DNA alignment
• Many times, DNA alignments are appropriate
-- to confirm the identity of a cDNA
-- to study noncoding regions of DNA
-- to study DNA polymorphisms
-- example: Neanderthal vs modern human DNA

Some basic concepts of sequence alignment

Pairwise alignment of retinol-binding protein and β-lactoglobulin (the alignment shown on the first slide)
• Identity (bar): a vertical bar marks a column in which the two sequences have the same residue
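
The identity bars in the RBP/β-lactoglobulin alignment can be recomputed directly from the two gapped rows. The Python below is a minimal sketch of that idea, not a description of how the slide figure was produced. The function names `match_line` and `percent_identity` are made up for illustration. The rows are copied from the alignment, with "." as the gap symbol. The denominator used for percent identity (columns with no gap in either sequence) is an assumption; other conventions divide by the full alignment length or by the length of the shorter sequence.

```python
# Gapped rows of the RBP / beta-lactoglobulin alignment, one string per block.
RBP = [
    "MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEG",
    "LFLQDNIVAEFSVDETGQMSATAKGRVR.LLNNWD..VCADMVGTFTDTE",
    "DPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAV...........QYSC",
    "RLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQ.EELCLARQYRLIV",
]
BLG = [
    "...MKCLLLALALTCGAQALIVT..QTMKGLDIQKVAGTWYSLAMAASD.",
    "ISLLDAQSAPLRV.YVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTK",
    "IPAVFKIDALNENKVL........VLDTDYKKYLLFCMENSAEPEQSLAC",
    "QCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI.......",
]
GAP = "."


def match_line(row_a, row_b, gap=GAP):
    """Return '|' under every column where both rows hold the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b and a != gap else " " for a, b in zip(row_a, row_b)
    )


def percent_identity(rows_a, rows_b, gap=GAP):
    """Return (identical columns, ungapped columns, percent identity)."""
    identical = 0
    ungapped = 0
    for row_a, row_b in zip(rows_a, rows_b):
        for a, b in zip(row_a, row_b):
            if a == gap or b == gap:
                continue  # assumption: gapped columns are not counted
            ungapped += 1
            if a == b:
                identical += 1
    return identical, ungapped, 100.0 * identical / ungapped


# Print each block with its recomputed identity line.
for rbp_row, blg_row in zip(RBP, BLG):
    print(rbp_row)
    print(match_line(rbp_row, blg_row))
    print(blg_row)
    print()

identical, ungapped, pct = percent_identity(RBP, BLG)
print(f"{identical} identical residues in {ungapped} ungapped columns "
      f"({pct:.1f}% identity)")
```

On these rows the sketch counts 43 identical residues in 163 ungapped columns, roughly 26% identity. The similarity marks (":" and ".") of the original figure are not reproduced, because they depend on a scoring matrix that the slides do not specify at this point.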