
Numerical characterization of protein sequences based on the generalized Chou’s pseudo amino acid composition

Journal Article


Abstract


  • The technique of comparison and analysis of biological sequences is playing an increasingly important role in the field of Computational Biology and Bioinformatics. One of the key steps in developing the technique is to identify an appropriate manner to represent a biological sequence. In this paper, on the basis of three physical–chemical properties of amino acids, a protein primary sequence is reduced into a six-letter sequence, and then a set of elements which reflect the global and local sequence-order information is extracted. Combining these elements with the frequencies of 20 native amino acids, a (21+λ) dimensional vector is constructed to characterize the protein sequence. The utility of the proposed approach is illustrated by phylogenetic analysis and identification of DNA-binding proteins.
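The first two ingredients named in the abstract, the frequencies of the 20 native amino acids and the reduction of a protein sequence to a six-letter alphabet, can be sketched as follows. This is only a minimal illustration: the six classes in `HYPOTHETICAL_GROUPS` are placeholders chosen for the example, since the abstract does not state the grouping the authors derive from the three physical-chemical properties, and the function names are likewise assumptions. The remaining 1+λ sequence-order elements of the (21+λ)-dimensional vector are not reproduced here because the abstract does not define them.

```python
from collections import Counter

# The 20 native amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# HYPOTHETICAL six-class grouping, for illustration only. The paper's real
# classes come from three physical-chemical properties that the abstract
# does not specify.
HYPOTHETICAL_GROUPS = {
    "a": "AVLIMC",
    "b": "FWYH",
    "c": "STNQ",
    "d": "KR",
    "e": "DE",
    "f": "GP",
}

# Invert the grouping into a residue -> class-letter lookup table.
RESIDUE_TO_LETTER = {
    aa: letter
    for letter, members in HYPOTHETICAL_GROUPS.items()
    for aa in members
}


def amino_acid_frequencies(seq):
    """Return the 20 amino acid frequencies of seq, in AMINO_ACIDS order."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def reduce_sequence(seq):
    """Map each residue to its (hypothetical) class letter."""
    return "".join(RESIDUE_TO_LETTER[aa] for aa in seq)
```

For example, `reduce_sequence("AKDG")` yields a four-letter string over the reduced alphabet, and `amino_acid_frequencies("AKDG")` yields a list of 20 values that sum to 1. A full feature vector in the spirit of the abstract would append the sequence-order elements computed from the reduced sequence to these 20 frequencies.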

Authors


  •   Li, Chun (external author)
  •   Li, Xueqin (external author)
  •   Lin, Yan-Xia

Publication Date


  • 2016

Citation


  • Li, C., Li, X. & Lin, Y. (2016). Numerical characterization of protein sequences based on the generalized Chou’s pseudo amino acid composition. Applied Sciences, 6 (12), 406-1-406-16.

Scopus Eid


  • 2-s2.0-85007453979

Ro Full-text Url


  • http://ro.uow.edu.au/cgi/viewcontent.cgi?article=7281&context=eispapers

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers/6251

Start Page


  • 406-1

End Page


  • 406-16

Volume


  • 6

Issue


  • 12
