Abstract

Recent studies show that DNA sequences of the letters A, C, G and T exhibit a frequency spectrum of the inverse power-law form 1/*f*^{a}, where *f* is the frequency and *a* the exponent^{1-5}. The inverse power-law form of the power spectra of *fractal* space-time fluctuations is generic to dynamical systems in nature and is identified as *self-organized criticality*^{6-9}. This study shows that the power spectra of the frequency distributions of the bases A, C, G and T in Human chromosome 1 DNA exhibit *self-organized criticality*. DNA is a quasicrystal possessing maximum packing efficiency^{10} in a hierarchy of spirals or loops. *Self-organized criticality* implies that non-coding *introns* may not be redundant, but may serve to organize the effective functioning of the coding *exons* in the DNA molecule as a complete unit.

DNA topology is of fundamental importance for a wide range of biological processes^{11}. Since the topological state of genomic DNA is important for its replication, recombination and transcription, there is immediate interest in obtaining information about the supercoiled state from sequence periodicities^{12, 13}. Identifying the dominant periodicities in a DNA sequence will help in understanding the role of coherent structures in genome sequence organization^{14, 15}. Li^{16} has discussed meaningful applications of spectral analyses in DNA sequence studies. Recent studies indicate that the DNA sequence of letters A, C, G and T exhibits a frequency spectrum of the inverse power-law form 1/*f*^{a}, where *f* is the frequency and *a* the exponent. It is possible, therefore, that the sequences have long-range order^{1-3, 17-19}. Power spectra of *fractal* space-time fluctuations of dynamical systems such as fluid flows, stock market price fluctuations and heart beat patterns exhibit an inverse power-law form identified as *self-organized criticality*^{6} and represent a self-similar eddy continuum. A general systems theory^{7-9} developed by the author shows that such an eddy continuum can be visualised as a hierarchy of successively larger-scale eddies enclosing smaller-scale eddies. Since the large eddy is the integrated mean of the enclosed smaller eddies, the eddy energy (variance) spectrum follows the statistical normal distribution according to the *Central Limit Theorem*^{20}. Hence the additive amplitudes of the eddies, when squared, represent the probabilities, which is also an observed feature of the subatomic dynamics of quantum systems such as the electron or photon^{21-23}.
The long-range correlations intrinsic to self-organized criticality in dynamical systems are signatures of quantumlike chaos, which is associated with the following characteristics:

(a) The *fractal* fluctuations result from an overall logarithmic spiral trajectory with the quasiperiodic *Penrose* tiling pattern^{7-9} for the internal structure.

(b) Conventional continuous periodogram power spectral analyses of such spiral trajectories reveal a continuum of wavelengths with a progressive increase in phase.

(c) The broadband power spectrum has embedded dominant wavebands whose bandwidth increases with wavelength and whose wavelengths are functions of the *golden mean*. The first *13* values of the model-predicted^{7-9} dominant peak wavelengths are *2.2, 3.6, 5.8, 9.5, 15.3, 24.8, 40.1, 64.9, 105.0, 167.0, 275, 445.0 and 720*, in units of the block length of *10* bp (base pairs) used in the present study. Wavelengths (or periodicities) close to the model-predicted values have been reported in weather and climate variability^{8}, prime number distribution^{24}, the distribution of the non-trivial Riemann zeta zeros^{25} and stock market economics^{26}.

(d) The conventional power spectrum, plotted as variance versus frequency on a log-log scale, now represents the eddy probability density on a logarithmic scale versus the standard deviation of the eddy fluctuations on a linear scale, since the logarithm of the eddy wavelength represents the standard deviation, i.e. the r.m.s. (root mean square) value, of the eddy fluctuations.

The r.m.s. value of the eddy fluctuations can be represented in terms of the statistical normal distribution as follows. A normalized standard deviation *t=0* corresponds to a cumulative percentage probability density of *50* for the mean value of the distribution. For the overall logarithmic spiral circulation, the logarithm of the wavelength represents the r.m.s. value of the eddy fluctuations, and the normalized standard deviation *t* is defined for the eddy energy as

$$ t = \frac{\log L}{\log T_{50}} - 1 \tag{1} $$

The parameter *L* in Eq. 1 is the wavelength and *T*_{50} is the wavelength up to which the cumulative percentage contribution to total variance is equal to *50*.
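The statement in (c) that the dominant peak wavelengths are functions of the *golden mean* can be checked numerically. The sketch below is illustrative only and is not the author's procedure: it takes the 13 values exactly as listed in the text and assumes that "functions of the golden mean" can be read as "successive peaks differ by a factor close to the golden mean". The names `PEAKS`, `successive_ratios` and `max_deviation_from_phi` are hypothetical.

```python
import math

# Golden mean (golden ratio), approximately 1.618.
PHI = (1 + math.sqrt(5)) / 2

# Model-predicted dominant peak wavelengths, copied from the text,
# in units of the 10 bp block length.
PEAKS = [2.2, 3.6, 5.8, 9.5, 15.3, 24.8, 40.1,
         64.9, 105.0, 167.0, 275.0, 445.0, 720.0]


def successive_ratios(values):
    """Return the ratio of each value to the one before it."""
    return [b / a for a, b in zip(values, values[1:])]


def max_deviation_from_phi(values):
    """Largest absolute difference between a successive ratio and PHI."""
    return max(abs(r - PHI) for r in successive_ratios(values))


if __name__ == "__main__":
    for lower, ratio in zip(PEAKS, successive_ratios(PEAKS)):
        print(f"{lower:7.1f} -> ratio to next peak: {ratio:.3f}")
    print(f"golden mean: {PHI:.3f}")
```

Multiplying any listed value by 10 converts it to base pairs, since the text gives the wavelengths in units of the 10 bp block length.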

The Human chromosome 1 DNA base sequence was obtained from the Entrez databases, *Homo sapiens* genome (build 30), at http://www.ncbi.nlm.nih.gov/entrez. The first

The power spectra of frequency distribution of bases were computed accurately by an elementary, but very powerful method of analysis developed by Jenkinson (1977)

- Figure 1: Average variance spectra for the four bases in Human chromosome 1 DNA. Continuous lines are for the variance spectra and open circles give the statistical normal distribution. The mean and standard deviation of the wavelengths T50 up to which the cumulative percentage contribution to total variance is equal to 50 are also given in the figure.

- Figure 2: Average wavelength class interval-wise percentage distribution of dominant (normalized variance greater than 1) wavelengths. Line + open circle give the average and dotted lines denote one standard deviation on either side of the mean. The computed percentage contribution to the total variance for each class interval is given by line + star.

The variance spectra for almost all of the *280* data sets exhibit the universal inverse power-law form 1/*f*^{a} of the statistical normal distribution (Fig. 1), where *f* is the frequency and the spectral slope *a* decreases with increasing wavelength and approaches *1* for long wavelengths. The same result is seen in Fig. 2, where the wavelength class interval-wise percentage frequency distribution of dominant wavelengths closely follows the corresponding computed percentage contribution to the total variance given by the statistical normal distribution. An inverse power-law form for the power spectra implies long-range spatial correlations in the frequency distributions of the bases in DNA. Microscopic-scale quantum systems such as the electron or photon exhibit non-local connections, or long-range correlations, which are visualized as resulting from the superimposition of a continuum of eddies. By analogy, therefore, the observed fractal fluctuations of the frequency distributions of the bases exhibit quantumlike chaos in Human chromosome 1 DNA. The eddy continuum acts as a robust, unified whole: a fuzzy logic network with a global response to local perturbations. Artificial modification of the base sequence structure at any location may therefore have a significant, noticeable effect on the function of the DNA molecule as a whole. Further, the presence of *introns*, which do not have meaningful code, may not be redundant, but may serve to organize the effective functioning of the coding *exons* in the DNA molecule as a complete unit^{2}.
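The comparison between a variance spectrum and the statistical normal distribution can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: it assumes that the cumulative variance is accumulated from the shortest wavelength upward, and that the normalized standard deviation takes the form t = log *L* / log *T*_{50} - 1, which satisfies the stated condition *t=0* at *L* = *T*_{50}. The function names are hypothetical.

```python
import math


def t50(wavelengths, variance):
    """Wavelength at which cumulative variance first reaches 50 percent.

    Assumption: accumulation starts from the shortest wavelength.
    """
    pairs = sorted(zip(wavelengths, variance))
    if not pairs:
        raise ValueError("empty spectrum")
    total = sum(v for _, v in pairs)
    running = 0.0
    for wavelength, v in pairs:
        running += v
        if running >= 0.5 * total:
            return wavelength
    return pairs[-1][0]


def normalized_t(wavelength, t50_value):
    """Assumed form of the normalized standard deviation.

    Requires t50_value > 1 so that its logarithm is positive.
    """
    return math.log(wavelength) / math.log(t50_value) - 1.0


def normal_cumulative_percent(t):
    """Cumulative percentage probability of the standard normal at t."""
    return 50.0 * (1.0 + math.erf(t / math.sqrt(2.0)))


if __name__ == "__main__":
    # Toy spectrum with equal variance at four wavelengths.
    wl = [2.0, 4.0, 8.0, 16.0]
    var = [1.0, 1.0, 1.0, 1.0]
    mid = t50(wl, var)
    for L in wl:
        t = normalized_t(L, mid)
        print(L, round(t, 3), round(normal_cumulative_percent(t), 2))
```

For a real spectrum, the observed cumulative percentage contribution at each wavelength would be set beside `normal_cumulative_percent(t)` to judge how closely the two curves agree.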

The results imply that the DNA base sequence self-organizes spontaneously to generate the robust geometry of a logarithmic spiral with the quasiperiodic *Penrose* tiling pattern for the internal structure. The space-filling geometric figure of the *Penrose* tiling pattern has intrinsic local five-fold symmetry^{28} and ten-fold symmetry. Deoxyribose, one of the three basic components of DNA, is a five-carbon sugar and may represent the local five-fold symmetry of the quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern of the DNA molecule as a whole. The DNA molecule shows ten-fold symmetry in the arrangement of *10* base pairs per turn of the double helix. The study of plant phyllotaxis in *Botany* shows that quasicrystalline structure provides maximum packing efficiency for seeds, florets, leaves, etc.^{10, 29, 30}. The quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern may be the geometrical structure underlying the packing of *10^{3}* to