Compositional searching of CpG islands in the human genome

Pedro Luis Luque-Escamilla, José Martínez-Aroza, José L. Oliver, Juan Francisco Gómez-Lopera, and Ramón Román-Roldán
Phys. Rev. E 71, 061925 – Published 29 June 2005

Abstract

We report on an entropic edge detector based on the local calculation of the Jensen-Shannon divergence and apply it to the search for CpG islands. CpG islands are regions of the genome related to gene expression and cell differentiation, and thus to cancer formation. Searching for these islands is a major task in genetics and bioinformatics. Some algorithms proposed in the literature rely on moving statistics computed in a sliding window, but the window size may greatly influence the results. The local use of the Jensen-Shannon divergence is a completely different strategy: because the nucleotide composition inside the islands differs from that of their surroundings, a statistical distance (the Jensen-Shannon divergence) between the compositions of two adjacent windows can serve as a measure of their dissimilarity. Sliding this double window over the entire sequence allows us to segment it compositionally. The resulting segments must then be fused into larger ones that satisfy certain identification criteria in order to obtain the definitive results. We find that, compared with other algorithms in the literature, the local use of the Jensen-Shannon divergence is well suited to processing DNA sequences in search of compositionally distinct structures such as CpG islands.
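
The double-window idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the equal weighting of the two windows, the single-nucleotide composition, the half-window width, and the toy sequence are all assumptions chosen for illustration, and the later fusion of segments is not shown because the abstract does not state its criteria.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: sequence contains only these four symbols


def composition(window):
    """Relative frequency of each nucleotide in a window."""
    counts = Counter(window)
    n = len(window)
    return [counts[base] / n for base in ALPHABET]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence of two distributions with equal weights."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


def js_profile(seq, half_width):
    """Slide two adjacent windows along seq.

    Returns (position, divergence) pairs, where position is the index
    of the boundary between the left and right windows.
    """
    profile = []
    for i in range(half_width, len(seq) - half_width + 1):
        left = seq[i - half_width:i]
        right = seq[i:i + half_width]
        profile.append((i, js_divergence(composition(left), composition(right))))
    return profile


# Toy sequence (hypothetical): an AT-rich stretch followed by a CG-rich one.
seq = "ATATATATATAT" + "CGCGCGCGCGCG"
profile = js_profile(seq, half_width=6)
best_pos, best_js = max(profile, key=lambda item: item[1])
print(best_pos, round(best_js, 3))  # prints: 12 1.0
```

In this toy case the divergence peaks exactly at the boundary between the two stretches, which is the sense in which the measure acts as an edge detector. Choosing the window width and deciding which peaks count as segment borders are design decisions that this sketch leaves open.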

  • Received 21 October 2004

DOI: https://doi.org/10.1103/PhysRevE.71.061925

©2005 American Physical Society

Authors & Affiliations

Pedro Luis Luque-Escamilla

  • Department of Engineering and Mining Mechanics, University of Jaén, Escuela Politécnica Superior, Campus Las Lagunillas s/n, 23071 Jaén, Spain

José Martínez-Aroza*

  • Department of Applied Mathematics, University of Granada, Facultad de Ciencias, Avenida Fuentenueva s/n, 18071 Granada, Spain

José L. Oliver

  • Department of Genetics, University of Granada, Facultad de Ciencias, Avenida Fuentenueva s/n, 18071 Granada, Spain

Juan Francisco Gómez-Lopera and Ramón Román-Roldán

  • Department of Applied Physics, University of Granada, Facultad de Ciencias, Avenida Fuentenueva s/n, 18071 Granada, Spain

  • *Corresponding author. Mailing address: Departamento de Matemática Aplicada, Facultad de Ciencias, 18071 Granada, Spain. FAX: +34 58 24 29 40. Electronic address: jmaroza@ugr.es

Issue

Vol. 71, Iss. 6 — June 2005
