Take the amino acid sequence of the myb protein and search it
against a protein database with BLAST.
-
Concatenate domains
We want to obtain an HMM for myb-domains to search against a database.
Keep that in mind while
screening the hits of the BLAST search.
Select some myb-domains and copy the corresponding
parts of the sequences to a file in fasta format
(fasta format is described in the exercises section of Pairwise Alignments).
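As a rough illustration of what such a file looks like, the sketch below writes two records to a fasta file and counts them. The file name myb_domains.fasta, the record names and the residue strings are placeholders made up for this example, not real myb-domains; replace them with the regions you selected from the BLAST hits.

```shell
# Write two placeholder records (NOT real myb sequences) to a fasta file.
# Each record is a '>' header line followed by the residues.
cat > myb_domains.fasta <<'EOF'
>placeholder_domain_1
ACDEFGHIKLMNPQRSTVWY
>placeholder_domain_2
ACDEFGHIKLMNPQRSTVWY
EOF

# Count the records: every record starts with a '>' header line.
n_records=$(grep -c '^>' myb_domains.fasta)
echo "records: $n_records"
```

Counting the header lines is a cheap way to confirm that every copied domain ended up as its own record.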
-
Multiply align domains
Download ClustalW (or ClustalX) from the EBI ftp server:
-> ftp.ebi.ac.uk
-> directory /pub/software/unix
-> clustalw.tar.Z
decompress and detar with
$ gtar xvzf clustalw.tar.Z
$ cd clustalw1.7
compile the source code distribution:
$ make
Multiply align the myb-domains with clustalw.
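A possible command line is sketched below. It assumes that the domains were saved as myb_domains.fasta and that the alignment should go to myb_domains.aln (both names are chosen here for illustration), and that the compiled clustalw binary is in your PATH. The options -infile, -align and -outfile should be checked against the help output of your ClustalW version.

```shell
domains=myb_domains.fasta      # assumed name of the fasta file with the domains
alignment=myb_domains.aln      # assumed name for the alignment output

if command -v clustalw >/dev/null 2>&1; then
    # Run a full multiple alignment non-interactively.
    clustalw -infile="$domains" -align -outfile="$alignment"
else
    echo "clustalw not found in PATH; add its directory to PATH first" >&2
fi
```

Without any options clustalw starts its interactive menu, which is an equally valid way to do this step.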
-
HMMER
Install HMMER 2.1.1
Download from
-> ftp.genetics.wustl.edu
-> directory /pub/eddy/hmmer
-> hmmer-2.1.1.tar.Z
decompress, detar (see above)
and install source code distribution (see file INSTALL):
$ cd hmmer-2.1.1
$ ./configure
$ make
The binaries are now in the directory binaries.
-
Build HMM of myb-domains
Build an HMM of the myb-domains with hmmbuild,
then calibrate it with hmmcalibrate.
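The two calls might look like the sketch below. It assumes the alignment is stored in myb_domains.aln (a name chosen for illustration), that your HMMER version can read the alignment format written by clustalw, and that the HMMER binaries directory has been added to your PATH.

```shell
alignment=myb_domains.aln   # assumed name of the clustalw alignment
hmm=myb_domains.hmm         # HMM file, named as in the hmmsearch step

if command -v hmmbuild >/dev/null 2>&1 && command -v hmmcalibrate >/dev/null 2>&1; then
    # hmmbuild takes the output HMM file first, then the alignment file.
    hmmbuild "$hmm" "$alignment"
    # hmmcalibrate scores random sequences and stores the fitted
    # E-value parameters back in the same HMM file.
    hmmcalibrate "$hmm"
else
    echo "hmmbuild/hmmcalibrate not found in PATH" >&2
fi
```

Calibration can take a while; it is what makes the E-values reported by hmmsearch more reliable.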
-
Search a database for homologues
We want to search a database that consists of concatenated sequences
in fasta format
(fasta format is described in the exercises section of Pairwise Alignments).
First create a directory for the database and change into it:
$ mkdir database
$ cd database
Download, e.g., the peptides of the Drosophila genome in fasta format from NCBI (~7.5 MB):
-> ftp ncbi.nlm.nih.gov (ftp-Server at NCBI)
-> directory /genbank/genomes/D_melanogaster/Scaffolds/LARGE
-> download all files ending with '.faa'
concatenate the peptides of all scaffolds into one file:
$ cat *.faa > drosophila.fasta
($ rm *.faa)
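If one of the downloaded files does not end with a newline, cat glues the next header onto the last sequence line and a sequence is silently lost. A simple check, best done before removing the .faa files, is to compare the number of header lines before and after concatenation. The sketch below demonstrates this on two toy files in a temporary directory; the file names and records are placeholders, not NCBI data.

```shell
# Toy demonstration in a temporary directory; the records are placeholders.
tmp=$(mktemp -d)
printf '>pep1\nACDEFGHIK\n' > "$tmp/scaffold1.faa"
printf '>pep2\nLMNPQRSTV\n' > "$tmp/scaffold2.faa"

cat "$tmp"/*.faa > "$tmp/all.fasta"

# Sum the header lines of the individual files (grep prints file:count) ...
n_in=$(grep -c '^>' "$tmp"/*.faa | awk -F: '{s += $NF} END {print s}')
# ... and count the header lines of the concatenated file.
n_out=$(grep -c '^>' "$tmp/all.fasta")

if [ "$n_in" -eq "$n_out" ]; then
    echo "ok: $n_out sequences"
else
    echo "mismatch: $n_in headers before, $n_out after" >&2
fi
rm -r "$tmp"
```

For the real data, run the two grep commands on *.faa and on drosophila.fasta in the database directory.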
Use hmmsearch to search the database with the HMM of the
myb-domains.
Use '>' to redirect standard output to a file.
The command line may look like this:
$ hmmsearch myb_domains.hmm ~/database/drosophila.fasta > myb_hmm_drome.log
-
Iterate search
Screen the hits, build a new HMM that includes selected hits, and run hmmsearch again.
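One round of this iteration could be scripted roughly as follows. All file names containing round2 are hypothetical, and the sketch assumes that you have already copied the old domains plus the selected hit regions into the new fasta file by hand. The clustalw options should be checked against your version.

```shell
round=2
domains=myb_domains_round${round}.fasta    # hypothetical: old domains plus selected hits
alignment=myb_domains_round${round}.aln
hmm=myb_domains_round${round}.hmm
db=~/database/drosophila.fasta
log=myb_hmm_drome_round${round}.log

if [ -f "$domains" ] && command -v clustalw >/dev/null 2>&1 \
        && command -v hmmbuild >/dev/null 2>&1; then
    clustalw -infile="$domains" -align -outfile="$alignment"   # realign
    hmmbuild "$hmm" "$alignment"                               # rebuild
    hmmcalibrate "$hmm"                                        # recalibrate
    hmmsearch "$hmm" "$db" > "$log"                            # search again
else
    echo "skipping round $round: input file or programs missing" >&2
fi
```

Using a new set of file names per round keeps the earlier HMMs and logs, so you can compare which hits each round gained or lost.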
Comments are very welcome.
luz@molgen.mpg.de