
The Nested Periodic Subspaces: Extensions of Ramanujan Sums for Period Estimation


Academic year: 2023


Full text

Introduction

A Motivating Example

Unfortunately, the periods detected by this approach can only be divisors of the DFT length, which is 100 in this case. Note that the right periods 3, 7 and 11 are much more clearly visible in this graph.

Figure 1.1: Examples of signals with periodicity. (a) repeats in Protein Molecules (PDB 1n11), (b) DNA microsatellites used in forensics, (c) EEG waveform during an episode of absence seizure, (d) periodicity in art: (left) Tessellations by M

Outline and Contributions of This Thesis

A consequence of this is that the columns of the DFT matrix in the previous subsection also span the Ramanujan subspaces. Note that the LCM property correctly predicts the period of the signal as 8 in each case.

Figure 1.7: The Ramanujan Filter Bank can be used to track time varying periodicity.

Ramanujan Sums And Subspaces

Ramanujan Sums

In 1918, the Indian mathematician Srinivasa Ramanujan introduced a set of sequences known as Ramanujan Sums [8]. In his original application, Ramanujan showed that many arithmetic functions in number theory can be expanded as a series using Ramanujan sums.
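As a minimal sketch, the q-th Ramanujan sum can be computed from its standard definition, c_q(n) = Σ e^{j2πkn/q}, where the sum runs over 1 ≤ k ≤ q with gcd(k, q) = 1. The function names `ramanujan_sum` and `one_period` are illustrative and do not come from the thesis.

```python
import cmath
from math import gcd

def ramanujan_sum(q: int, n: int) -> int:
    """c_q(n): sum of exp(j*2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * k * n / q)
        for k in range(1, q + 1)
        if gcd(k, q) == 1
    )
    # The sum is always a real integer; rounding removes floating-point error.
    return round(total.real)

def one_period(q: int) -> list:
    """Samples c_q(0), ..., c_q(q - 1); the sequence repeats with period q."""
    return [ramanujan_sum(q, n) for n in range(q)]

for q in (1, 2, 3, 4, 8):
    print(q, one_period(q))
```

Each sequence is integer valued and periodic with period q, and c_q(0) equals the Euler totient φ(q).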

Early Use of Ramanujan Sums in DSP

Although this is clearly an 8-period series, it can be shown that c8(n−1) is orthogonal to cq(n) for every q, including q = 8.
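The orthogonality claim can be checked numerically. The sketch below assumes the inner product is taken over one common period, lcm(8, q), and uses the standard definition of the Ramanujan sum; the helper names are illustrative.

```python
import cmath
from math import gcd

def ramanujan_sum(q: int, n: int) -> int:
    """c_q(n): sum of exp(j*2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * k * n / q)
        for k in range(1, q + 1)
        if gcd(k, q) == 1
    )
    return round(total.real)

def inner_product_with_shifted_c8(q: int) -> int:
    """Inner product of c_8(n - 1) and c_q(n) over one common period lcm(8, q)."""
    common_period = 8 * q // gcd(8, q)
    return sum(
        ramanujan_sum(8, n - 1) * ramanujan_sum(q, n)
        for n in range(common_period)
    )

shifted_c8 = [ramanujan_sum(8, n - 1) for n in range(8)]
print(shifted_c8)
print([inner_product_with_shifted_c8(q) for q in range(1, 17)])
```

Every inner product is zero, so a method that correlates the signal only against the unshifted sums cq(n) cannot see this period-8 signal.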

Generalizing Ramanujan Sums to Ramanujan Subspaces

The Ramanujan Periodicity Transform (RPT) Matrix

PM are therefore the periods of those columns of A that are multiplied by the non-zero entries of y. Note that in Theorem 2.4.1 we assumed that the signal is known a priori to lie in a particular VP and then found its exact period using (2.13).

Estimating the period beyond a particular VP

When the data length is less than L, we can still try to plot the projection energies of the signal along the various Ramanujan subspaces. We empirically found in all our simulations that the LCM property could still be used to provide the correct period estimates unless the data length is too small.
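A minimal sketch of the projection-energy idea follows. It assumes the easy case where the data length N is a multiple of every candidate period, so that the Ramanujan subspace S_q is spanned by the DFT columns at bins kN/q with gcd(k, q) = 1 and the projection energy can be read off those bins. The names `subspace_energy` and `estimate_period`, the threshold `tol`, and the test signal are illustrative choices, not the procedure used in the thesis for short data lengths.

```python
import cmath
from math import gcd

def subspace_energy(x, q):
    """Energy of the projection of x onto S_q (requires len(x) % q == 0)."""
    N = len(x)
    if N % q != 0:
        raise ValueError("data length must be a multiple of q in this sketch")
    energy = 0.0
    for k in range(1, q + 1):
        if gcd(k, q) == 1:
            b = (k * N // q) % N  # DFT bin holding frequency 2*pi*k/q
            X = sum(v * cmath.exp(-2j * cmath.pi * b * n / N)
                    for n, v in enumerate(x))
            energy += abs(X) ** 2 / N
    return energy

def estimate_period(x, max_period, tol=1e-9):
    """LCM of all q <= max_period whose subspace carries non-negligible energy."""
    total = sum(abs(v) ** 2 for v in x)
    period = 1
    for q in range(1, max_period + 1):
        if subspace_energy(x, q) > tol * total:
            period = period * q // gcd(period, q)
    return period

# Zero-mean period-2 component plus zero-mean period-3 component, 60 samples.
N = 60
x = [(1 if n % 2 == 0 else -1) + (2 if n % 3 == 0 else -1) for n in range(N)]
print([round(subspace_energy(x, q), 6) for q in range(1, 7)])
print(estimate_period(x, 6))
```

Only S_2 and S_3 carry energy for this signal, and the LCM of 2 and 3 gives the period 6.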

Conclusion

Using the LCM property of NPDs, we can conclude that the signal has period 6. The LCM property of Ramanujan subspaces applied to the vertices of (6.19), gives the period.

Figure 2.3: (Top) 100 samples of a noise-less period 10 signal. (Middle) Noise corrupted input (SNR = 5dB)

Nested Periodic Matrices And Dictionaries

Nested Periodicity Matrix: Definition

Since A has full rank, its columns form a basis for C^P, so by periodically extending the columns of A we can obtain a basis for VP (from Eq. 2.8). Note that the Ramanujan Periodicity Transform matrix in Chapter 2 is an example of an NPM.

NPM Properties

If we expand it in terms of the columns of the 6×6 identity matrix, in general each of the 6 basis vectors would require non-zero coefficients. But if we use any of the NPMs for V6, then only the period 3 and period 1 columns of those matrices will be included in the signal expansion.

Examples of NPMs

Since there are φ(di) such values of ki, it follows that there are φ(di) signals of the form W_P^{kn} with period exactly equal to di. The columns of the P×P DFT matrix can be divided into K classes, one for each divisor di|P.
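This classification can be verified by direct counting. The sketch below uses the fact that column k of the P×P DFT matrix, W_P^{kn}, has period P / gcd(k, P); the function names are illustrative.

```python
from collections import Counter
from math import gcd

def totient(d: int) -> int:
    """Euler's totient: number of 1 <= k <= d with gcd(k, d) == 1."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)

def dft_column_periods(P: int) -> Counter:
    """Map each period d to the number of DFT columns with exactly that period.

    Column k (0 <= k < P) is W_P^{kn}; its period is P // gcd(k, P).
    """
    return Counter(P // gcd(k, P) for k in range(P))

P = 12
classes = dft_column_periods(P)
for d in sorted(classes):
    print(d, classes[d], totient(d))
```

For P = 12 the classes have sizes φ(1), φ(2), φ(3), φ(4), φ(6), φ(12), which add up to 12.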

Nested Periodic Dictionaries

Parts (b), (c), (d) and (e) – The strength vs period plots for the solutions of the convex program (3.24) using Ramanujan, DFT, Random and Natural Basis dictionaries respectively. The effect of shifting the signal on the strength vs period plots of the natural basis dictionary is shown in Fig.

Figure 3.2: Part (a) and (b) - Strength vs period plots for a period 70 signal using 70 × 70 Natural Basis and Ramanujan Periodicity Transform matrices respectively.

Real World Examples

Parts (d) and (e) show the strength vs period plots for the signals in parts (b) and (c) respectively, using the convex program (3.24) and the Ramanujan dictionary. The first of them is related to the secondary structure while the second is related to the molecular size and volume of the amino acids.

Figure 3.9: Part (a) – A period 70 signal generated as a sum of period 7 and period 10 signals

Conclusion

What are the required dimensions of the various subspaces representing hidden periods in the dictionary? The performance of the RFB is compared with the following four popular techniques mentioned in the previous section: (i) RADAR [48], (ii) REPwin [50], (iii) the STFT-based FTwin of [50] and (iv) the wavelet technique of [53].

Figure 3.11: Part (a) - ECG waveform of a 23 year old female patient. Part (b) - DFT coefficients for positive frequencies.

A Unified Theory of Union-of-Subspaces Representations of

Introduction

Furthermore, there are a number of unanswered questions related to dictionaries constructed using such subspaces. In the EPS theory, the dimensions of the periodic subspaces are not properly explained or specified.

Review of Existing Subspace Methods

This same problem, caused by VP ⊂ VNP, is encountered in the period estimation algorithms proposed by Sethares and Staley in [13]. Independently of the ML framework, Sethares and Staley in [13] proposed several methods comparing the projection energies of the signal onto the VP's.

Figure 4.2: An example to illustrate that the projection energies obtained on the different VP's depend on the order of projections

Insights into the Relationships Between Various Techniques

From the direct sum property of Ramanujan sums, it follows that y(n) ⊥ Vd for every proper divisor d of P. If {T1, …, TPmax} is a set of nested periodic subspaces such that Ti ⊥ Tj whenever i ≠ j, then for each i, Ti must necessarily be the Ramanujan subspace.

Fundamental Properties of Subspaces That Admit Unique Periodic

In [23], it was shown that the Eulerian structure of NPMs leads to their nested fundamental property (Section II-E). RPmax} is a linearly independent set due to the linear independence condition in Definition 4.4.1 of LIPS.

From Subspaces to Dictionaries

Note that the signals in a periodic dictionary are infinitely long, defined over all values of the time index n. Doing so greatly reduces the number of atoms in the dictionary, as many of the DFT matrices of different sizes give rise to the same columns.

Figure 4.5: Parts (a) and (b): Two different decompositions of a period 8 signal onto {V 1 , V 2 ,

Fundamental Properties of Periodic Dictionaries

The signals in B must be able to span the complex exponentials in the Farey dictionary. So B must contain at least as many linearly independent signals as the Farey dictionary, namely Σ_{P=1}^{Pmax} φ(P).
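The atom count of a Farey dictionary can be checked by enumeration. The sketch below assumes the usual construction, in which the atoms are complex exponentials with frequencies 2πk/q for 1 ≤ q ≤ Pmax and gcd(k, q) = 1; the function names are illustrative. Collecting k/q for all k and letting `Fraction` reduce them also shows how DFT matrices of different sizes contribute the same columns.

```python
from fractions import Fraction
from math import gcd

def farey_frequencies(p_max: int) -> list:
    """Distinct normalized frequencies k/q in [0, 1) with 1 <= q <= p_max.

    Fraction reduces k/q automatically, so duplicates such as 1/2 and 2/4
    collapse into a single atom.
    """
    return sorted({Fraction(k, q) for q in range(1, p_max + 1) for k in range(q)})

def totient(q: int) -> int:
    """Euler's totient: number of 1 <= k <= q with gcd(k, q) == 1."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)

p_max = 6
atoms = farey_frequencies(p_max)
stacked_dft_columns = sum(range(1, p_max + 1))  # 1 + 2 + ... + p_max columns
print(len(atoms), stacked_dft_columns)
print([str(f) for f in atoms])
```

For Pmax = 6 the 21 columns of the stacked DFT matrices reduce to 12 distinct atoms, which matches φ(1) + φ(2) + … + φ(6).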

Effect of Dictionary Redundancy on Period Estimation: A Numerical

For example, a signal whose period is known to belong to the set 1 ≤ P ≤ Pmax lies in the union of the subspaces V1, V2, …, VPmax. Increasing T to about 0.5 does not help much, and results in the recovered signal shown in Fig.

Figure 4.6: Effect of Redundancy on Signal Recovery: Part (a) - A randomly generated, noiseless, period 6 signal of length 56 samples

The Case of Mixtures of Periodic Signals

The importance of less redundant dictionaries is magnified in the context of periodic signal mixtures. The presence of spurious peaks due to the linear dependence between the underlying subspaces, such as those in Fig.

Conclusion

Although this process is intuitive, a careful formulation of the estimation of hidden periods needs to be presented in a future work, as it requires a much deeper discussion.

Chapter Appendix

This is exactly what happens in the RFB as the length of the filters tends to infinity. In the following, we test the sensitivity of the RFB in detecting absence seizures.

The Ramanujan Filter Bank and Its Applications

The Ramanujan Filter Bank

The two discontinuities in the pentapeptide repeats, in the form of insertion loops in the crystal structure (Fig. 5.6), can in fact be predicted directly from the plot of RFB period versus site. In all our examples, we observed that the largest periods appearing on the RFB time period plot correspond to the most prominent repetitions.

Figure 5.1: Part (a) - Block diagram of the Ramanujan Filter Bank. Part (b) - An example of the impulse response of an RFB filter, h10(n)

Detecting Tandem Repeats in DNA

In simpler terms, the TRF algorithm first finds all words of a small size that are repeated in the sequence, and then uses statistical tests to determine whether these words are part of larger consecutive repetitions. Our results in the following sections show that RFB is able to detect a number of repetitions that the TRF algorithm (with its default settings) cannot find.

Figure 5.10: Protein repeats that were applied as inputs to FTwin [50], the wavelet based method of [53], RADAR [48], REPwin [50], and the RFB

Adapting the Ramanujan Filter Bank to Detect Tandem Repeats in

In our experiments, we found that choosing T1 as 20% of the maximum value on the Time vs Period plane gave good results. The portion of the Time vs Period plane containing this repetition is shown in Fig.

Figure 5.11: The section of the Time vs Period plane containing the tandem repeat of Table 5.3

Absence Seizure Detection Using Ramanujan Filter Banks

We will then demonstrate the period estimation capabilities of the new filter banks using simulations. In the first, we study the accuracy of the period estimate as a function of the number of samples of y(n) used for the calculation of σ_y^2.

Figure 5.12: An example of the 3Hz spike and wave discharge pattern in the EEG during an absence seizure.

Unique Representation Filter Banks: Removing Redundancy in the

Conclusion

If we try to solve the following system of linear equations as in the proof of Theorem 7.4.1, … Rather, the precise data length depends only on the number of expected harmonics in the signal.

Figure 5.16: Demonstrating a URFB for Table 5.7. See Sec. 5.8 for details.

MUSIC and Periodicity: An Overview of Prior Works

This means that the complex exponentials in (6.3) turn out to be orthogonal to the noise eigenspace. As long as N > K, it can be shown that the only complex exponentials orthogonal to the noise eigenspace are those in (6.3).

The Proposed Methods

This is because the number of ways the Kl's in (6.17) can be chosen to sum to the total signal space dimension K increases exponentially with Q. In contrast, the proposed iMUSIC only needs to compute (6.19) regardless of the number of hidden periodic components.

Generalizing iMUSIC From Farey Atoms to Other NPS Bases

So it is quite possible that some of the bar columns are not orthogonal to Ue. For the natural basis and randomly generated NPSs, we can see spurious peaks at multiples of the true period, which is 10.

Figure 6.5: Demonstrating the NPS based iMUSIC methods on the signal shown in Fig. 6.4 using (a) a Ramanujan dictionary, (b) a natural basis dictionary and (c) a randomly generated NPS dictionary.

Experiments

Once again, both of these methods give close to 0 probability of correct estimation when their frequency grid sizes are comparable to the Farey dictionary. MUSIC and HMUSIC were implemented with a uniform frequency grid of the same size as the Farey grid.

Figure 6.7: Probability of Estimating both the component periods exactly. Comparison of the proposed Farey-MUSIC with other techniques

Conclusion

They are also associated with several genetic disorders such as fragile X syndrome, myotonic dystrophy, Huntington's disease, and Friedreich's ataxia [69]. Adapting them for specific applications such as DNA and protein replication will be part of our future work.

Chapter Appendix

If we try to solve the following system of linear equations as in the proof of Theorem 7.2.3, … On the other hand, the technique used in the proof of Theorem 7.2.3 can reliably identify the period from only 12 samples.

Minimum Datalength for Integer Period Estimation

Introduction

Given a sequence x(n) with an integer period, what is the absolute lower bound on the data length required to identify its period? However, the minimum data length required by the dictionary-based algorithms in [23] has not been previously reported.

Figure 7.1: Part (a) - An arbitrary line spectrum; Part (b) - The harmonic line spectrum of a periodic signal

Minimum Datalength for the Single Period Case

So any T consecutive samples of x(n) are sufficient to retrieve, and thus to estimate the period. This shows that at least T samples are needed to estimate the period from the set {P1, P2}.

Figure 7.2: An example illustrating the necessity of Lmin (Eq. (7.5)) samples. Part (a) - A period 6 signal; Part (b) - A period 15 signal

Mixtures of Periodic Signals

There are no hidden periods other than 6 itself, since this signal can never be written as a sum of signals whose periods are all strictly smaller than 6. If PNPS is the set of periods of all the non-zero components in the decomposition, then the extracted M-set of PNPS is intuitively interpreted as the hidden periods in those works.

Figure 7.3: Error Rate vs Data-Length for a fixed SNR. "Th" refers to the threshold.

Minimum Datalength for Estimating The Hidden Periods

We state the following result without proof, since the idea is similar to that in the proof of Theorem 7.4.2. We had allowed the number of columns in the translation matrix P in Definition 8.0.1 to be any natural number N.

Figure 7.5: DFT spectra for the sequences in Fig. 7.4. The periods (2π / ω values) corresponding to the peak frequencies are shown in parenthesis

Connection to Dictionaries Spanning Periodic Signals

Non-Integer Periodicity and Connections to Caratheodory’s Results

One of the most important results of Theorem 7.6.1 is the following: For a signal satisfying (7.46), the minimum required data length to estimate the basis is not actually twice the largest expected period. Once again, it can be shown that the minimum necessary and sufficient data length will depend only on the total number of lines in the spectrum.

Minimal Non-Contiguous Sampling For Period Estimation

Since there are G such outputs, (P1/G) × G = P1 samples of x(n) are sufficient to determine whether the period is P1 or P2. We will now prove that when P1|P2, we need at least P2 samples to estimate the period.

Figure 7.7: Finding the period of x(n) when (P 1 , P 2 ) = G , P 1 . See text for details.

Simulations Under Noise

In our second experiment, we fixed the number of samples to be P1 and plotted the error rate for different values of SNR. Once again, consistent with intuition, we observed that the error rate decreases as the SNR increases.

Figure 7.9: Error rate vs. Number of Samples. P1 = 9, P2 = 13, SNR = 0dB. See Sec. 7.9 for details.

Concluding Remarks

In other words, the period is defined as the smallest parallelepiped that is a repetition region for the signal. So from the above result, the number of points in any repetition region with translation matrix P is |det(P)|.
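The |det(P)| count can be checked by brute force in two dimensions. The sketch below assumes a nonsingular 2×2 integer matrix P and counts the integer points n for which P⁻¹n lies in [0, 1)², that is, the points inside the half-open parallelepiped spanned by the columns of P. The function name and the example matrices are illustrative and are not taken from the thesis.

```python
from itertools import product

def count_points_in_parallelepiped(P) -> int:
    """Count integer points n with P^{-1} n in [0, 1)^2 for a 2x2 integer P."""
    (a, b), (c, d) = P  # rows of P; its columns are (a, c) and (b, d)
    det = a * d - b * c
    if det == 0:
        raise ValueError("P must be nonsingular")
    sign = 1 if det > 0 else -1
    # Bounding box of the corners 0, p1, p2 and p1 + p2.
    xs = [0, a, b, a + b]
    ys = [0, c, d, c + d]
    count = 0
    for x, y in product(range(min(xs), max(xs) + 1),
                        range(min(ys), max(ys) + 1)):
        # adj(P) @ n equals det * P^{-1} n, so the test stays in integers.
        t1 = sign * (d * x - b * y)
        t2 = sign * (-c * x + a * y)
        if 0 <= t1 < abs(det) and 0 <= t2 < abs(det):
            count += 1
    return count

for P in ([[3, 1], [1, 2]], [[2, 0], [0, 3]], [[1, 2], [3, 1]]):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    print(P, count_points_in_parallelepiped(P), abs(det))
```

In each case the number of lattice points equals |det(P)|, including the third matrix, whose determinant is negative.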

Figure 8.1: Part (a) - A two dimensional periodic signal according to the definition in (8.1), whose period is represented by the matrix in (8.2)

Chapter Appendix

Arbitrarily Shaped Periods in Multi-Dimensional Periodicity

A Note on Translational Matrices

On the other hand, if P has full rank but has N > D, then it can be shown that the set of all copies of Sovereign obtained by translations along the columns of P would be the same set as that obtained by translation along the D directions represented by the. A left divisor of P, call it L0, which satisfies the property that every left divisor of P is a left divisor of L0, is known as the greatest left divisor (gld) of P. It can be shown that a gld of P exists if it has rank D. Note that glds are not unique, since if L0 is a gld, so is L0U for any unimodular integer matrix U.

Figure 8.4: A parallelepiped repetition region always exists for any signal following Definition 8.0.1

Conclusion

Concluding Remarks

Comparison of CPU times for solving (3.32) and (3.24)

Period Estimation using STFT and the RFB

Example 1: Human Frataxin Gene

Example 2: Human Genome Sequence AC010136

Ramanujan Filter Bank: Period to Filters Map

Tables Mapping Periods To Unique Sets of Filters. The Table on the

A URFB for periods 1 to 4

Comparison between (7.30) and (7.41) for Pmax = 20

Figures

Figure 1.2: 100 samples of a signal that is a mixture of randomly generated period 3, period 7 and period 11 signals.
Figure 1.3: The spectra of periodic signals have a regular harmonic structure as shown.
Figure 1.4: The 100 point DFT magnitude spectrum for the signal shown in Fig. 1.2.
