Correlations of Length Distributions between Non-coding and Coding Sequences of Arabidopsis Thaliana

Conference Paper


Abstract


  • Gene length and organization are important attributes of genomics. With a large amount of sequence data becoming available, statistical analyses can be applied to this data and will offer beneficial output to research communities. Previous work in this field has focused on protein length and the coding region; we also investigate the non-coding regions and look for any correlation that may exist between the regions. Analysis of Arabidopsis thaliana found a strong correlation between the coding sequence length and the 3′ UTR region, conditional on the 5′ UTR ratios. These results seemed consistent over all chromosomes and over data that either contained or lacked introns. Classification of proteins into functional classes presented several interesting results. It was found that the number of proteins in a specific functional category decreased as the 5′ UTR ratio length, separated into eight subsets, increased. This work has revealed some possible correlations between different gene regions.
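
The conditional analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions: the abstract does not define the 5′ UTR ratio, so it is taken here to be 5′ UTR length divided by coding sequence length; the eight subsets are assumed to be equal-count groups ordered by that ratio; and the correlation is assumed to be Pearson's. The function names and the `(cds_len, utr5_len, utr3_len)` tuple layout are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # one variable is constant, so the correlation is undefined
    return sxy / sqrt(sxx * syy)


def correlation_by_ratio_subset(genes, n_subsets=8):
    """Correlate CDS length with 3' UTR length within 5' UTR ratio subsets.

    genes: list of (cds_len, utr5_len, utr3_len) tuples with cds_len > 0.
    ASSUMPTION: ratio = utr5_len / cds_len; subsets are equal-count groups.
    Returns a list of (subset_size, correlation) pairs, lowest ratio first.
    """
    ranked = sorted(genes, key=lambda g: g[1] / g[0])
    size, extra = divmod(len(ranked), n_subsets)
    results = []
    start = 0
    for i in range(n_subsets):
        # The first `extra` subsets take one more gene so every gene is used.
        end = start + size + (1 if i < extra else 0)
        subset = ranked[start:end]
        r = pearson([g[0] for g in subset], [g[2] for g in subset])
        results.append((len(subset), r))
        start = end
    return results
```

Applied to per-gene lengths taken from a genome annotation, the function returns one correlation per subset, which is one way to check whether the relationship between coding sequence length and 3′ UTR length holds across the range of 5′ UTR ratios.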

Publication Date


  • 2008

Citation


  • Caldwell, R., Lin, Y. & Zhang, R. (2008). Correlations of Length Distributions between Non-coding and Coding Sequences of Arabidopsis Thaliana. IEEE International Conference on Bioinformatics and Biomedicine (pp. 72-77). Los Alamitos, California, Washington, Tokyo: IEEE Computer Society.

Scopus Eid


  • 2-s2.0-58049185289

Ro Full-text Url


  • http://ro.uow.edu.au/cgi/viewcontent.cgi?article=2429&context=infopapers

Ro Metadata Url


  • http://ro.uow.edu.au/infopapers/1409

Start Page


  • 72

End Page


  • 77

Place Of Publication


  • Los Alamitos, California, Washington, Tokyo
