SMART (Simple Modular Architecture Research Tool) is an online tool for identifying and annotating protein domains. Its data is synchronized with the UniProt, Ensembl and STRING databases, and more than 1,300 protein domains have been manually annotated. First released in 1997, it is still widely used more than 20 years later.
Tool URL: http://smart.embl-heidelberg.de/
SMART has two modes: normal and genomic. They differ mainly in the underlying database. Normal mode draws on a protein database that contains redundant entries, while genomic mode uses only proteomes from completely sequenced genomes.
The two modes use different colors, but their interfaces are similar. You can switch between them by clicking the corresponding mode, for example to enter genomic mode.
You can search for protein domains by entering either a UniProt/Ensembl protein identifier (ID or accession number, ACC) or the protein sequence itself.
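Before pasting a query, it can help to check which of these input types you are holding. The following is a minimal sketch, not part of SMART itself: the function name `classify_query` is made up for illustration, the UniProt accession pattern follows the format published by UniProt, and the Ensembl and entry-name patterns are simplified assumptions that may not cover every case.

```python
import re

# Ensembl protein stable IDs: "ENS", optional species prefix, "P", 11 digits,
# optional version suffix (simplified assumption).
ENSEMBL_PROTEIN = re.compile(r"ENS[A-Z]*P\d{11}(\.\d+)?")
# UniProt accession format as documented by UniProt (6 or 10 characters).
UNIPROT_ACC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# UniProt entry names such as P53_HUMAN (simplified assumption).
UNIPROT_ID = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")
# The 20 standard amino acids plus X for "unknown residue".
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYX")


def classify_query(text):
    """Guess whether text is an identifier or a raw protein sequence."""
    token = text.strip()
    if ENSEMBL_PROTEIN.fullmatch(token):
        return "ensembl_protein"
    if UNIPROT_ACC.fullmatch(token):
        return "uniprot_acc"
    if UNIPROT_ID.fullmatch(token):
        return "uniprot_id"
    # Otherwise treat the input as a sequence, dropping a FASTA header line
    # if one is present and ignoring whitespace and line breaks.
    lines = token.splitlines()
    if lines and lines[0].startswith(">"):
        lines = lines[1:]
    residues = "".join("".join(lines).split()).upper()
    if residues and set(residues) <= AMINO_ACIDS:
        return "sequence"
    return "unknown"
```

For instance, `classify_query("P53_HUMAN")` falls into the entry-name branch, while a plain string of residue letters falls through to the sequence branch. The check is only a heuristic for catching obvious typos before submitting.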
Click the Sequence SMART button to submit the task. After a few seconds the prediction result is displayed, as in the figure below. The domain architecture diagram on the page is interactive, and it can also be saved as a vector graphic in SVG format for use in a paper.
Below the diagram is a list of the predicted domains, including both those drawn in the diagram and those left out of it.
If you have many sequences to analyze, you can use SMART's batch processing page instead. As shown in the figure below, the link is behind the question mark button in the Sequence analysis section. URL: http://smart.embl-heidelberg.de/smart/batch.pl
Prepare the UniProt/Ensembl protein identifiers (ID or ACC) or the protein sequences (in FASTA format) as required on the page. You can submit up to 10,000 at a time.
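If your FASTA file holds more than 10,000 sequences, it has to be split before submission. The following is a minimal sketch of one way to do that with only the Python standard library. The function names and the 60-character line width are illustrative choices, and the limit is simply the figure quoted above; it does not talk to SMART at all.

```python
from typing import Iterable, Iterator, List, Tuple

MAX_PER_BATCH = 10_000  # submission limit quoted in the text above

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence line of the current record
    if header is not None:
        yield header, "".join(chunks)


def split_batches(records: Iterable[Record],
                  size: int = MAX_PER_BATCH) -> Iterator[List[Record]]:
    """Group records into lists of at most `size` entries each."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter batch


def to_fasta(batch: Iterable[Record], width: int = 60) -> str:
    """Serialize records back to FASTA text, wrapping sequence lines."""
    out: List[str] = []
    for header, seq in batch:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

A typical use would be to open the input file, pass the file object to `read_fasta`, and write each result of `split_batches` to its own file with `to_fasta`, then submit those files one by one on the batch page.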
This is only a brief introduction. If you are interested, go ahead and try the other functions.
PDB protein three-dimensional structure http://www.rcsb.org/pdb
SWISS-PROT protein sequence database http://kr.expasy.org/sprot/
PIR protein sequence database http://pir.georgetown.edu/
OWL non-redundant protein sequence http://www.bioinf.man.ac.uk/dbbrowser/OWL/
EMBL Nucleic Acid Sequence Database http://www.embl-heidelberg.de/
TrEMBL translated EMBL nucleotide sequence database http://kr.expasy.org/sprot/
GenBank nucleic acid sequence database http://www.ncbi.nih.gov/Genbank/
PROSITE protein functional site http://kr.expasy.org/prosite/
SWISS-MODEL protein structure modelling from sequence http://www.expasy.org/swissmod/SWISS-MODEL.html
SWISS-3DIMAGE three-dimensional structure diagram http://us.expasy.org/sw3d/
DSSP protein secondary structure parameters http://www.cmbi.kun.nl/gv/dssp/
FSSP protein family with known spatial structure http://www.ebi.ac.uk/dali/fssp/fssp.html
SCOP protein classification database http://scop.mrc-lmb.cam.ac.uk/scop/
CATH protein classification database http://www.biochem.ucl.ac.uk/bsm/cath/
Pfam protein family and domain http://pfam.wustl.edu/