HK1140261B - Systems and methods for determining cross-talk coefficients in pcr and other data sets - Google Patents
Description
Technical Field
The present invention relates generally to systems and methods for processing data representing sigmoid or growth curves, such as Polymerase Chain Reaction (PCR) curves, and more particularly to systems and methods for determining crosstalk characteristics of PCR detection systems.
Background
Polymerase Chain Reaction (PCR) is an in vitro method for enzymatic synthesis or amplification of defined nucleic acid sequences. This reaction typically uses two oligonucleotide primers that hybridize to opposite strands and flank the template or target DNA sequence to be amplified. Primer extension (elongation) is catalyzed by a thermostable DNA polymerase. A series of repeated cycles comprising template denaturation, primer annealing, and extension of the annealed primers by polymerase (extension) results in exponential accumulation of a particular DNA fragment. Fluorescent probes are typically used in this process to facilitate detection and quantification of the amplification process.
A typical set of real-time PCR curves is shown in fig. 1a, where the fluorescence intensity values are plotted against the cycle number of a typical PCR process. In this case, the formation of PCR products is monitored in each cycle of the PCR process. The amplification is typically measured in a temperature cycler that includes components and devices for measuring fluorescent signals during the amplification reaction. An example of such a temperature cycler is the Roche Diagnostics LightCycler (Cat. No. 20110468). The amplification products are detected, for example, by means of fluorescently labeled hybridization probes that emit a fluorescent signal when they bind to the target nucleic acid, or in some cases also by means of a fluorescent dye that binds to double-stranded DNA. As can be seen in fig. 1a, the PCR curve comprises a baseline region 5 and a plateau region 6. The region between the baseline region 5 and the plateau region 6 is generally referred to as the growth region.
A typical PCR detection system for analyzing radiation emissions from a PCR experiment includes two or more filters, each of which is operable to isolate a range of wavelengths for further analysis. For example, each optical filter generally allows substantially all radiation in a defined wavelength range to pass through. However, the probes or labels typically emit in partially overlapping wavelength bands, and the bandpass of the filter typically includes this overlapping region so that each detection channel will typically receive signals emitted from other probes. Such crosstalk signals tend to affect the true signal of interest. It is therefore desirable to correct for such crosstalk signals in each detection channel. One conventional way to do this is to determine quantitative crosstalk coefficients that can be used to correct the crosstalk signal in each of the detection channels.
In current crosstalk methods, the crosstalk coefficient is typically calculated as the ratio of the average plateau values of the base and crosstalk signals; conventional methods thus rely solely on a plateau region containing less than 10% of the data. Moreover, during PCR the plateau region signal is generated when the chemistry is in an unstable state, which is why baseline signal thresholds are typically used for target recognition. Conventional methods therefore use noisy signals to determine crosstalk coefficients, with limited information that does not include data from the true signal acquisition region of the curve. It has also been found that incorrect assumptions used in conventional crosstalk models can cause errors as a function of the data acquisition curve. Conventional methods of calculating crosstalk coefficients may thus be satisfactory provided that (1) there is a plateau, (2) the plateau is flat, and (3) there is minimal noise in the plateau. However, this is not the case with many data sets.
It is therefore desirable to provide systems and methods for determining crosstalk coefficients in curves, such as sigmoid or growth curves and in particular PCR curves, which overcome the above-referenced problems and others.
Disclosure of Invention
The present invention provides systems and methods for determining crosstalk coefficients in curves such as sigmoid or growth curves and in particular PCR curves. The present invention also provides systems and methods for applying crosstalk coefficients using a linear subtractive model to produce a crosstalk-corrected data set.
According to various embodiments, the crosstalk signal coefficients are determined by minimizing the sum of the squares of the differences between the crosstalk signal and the base signal (multiplied by a gain and optionally added to a linear term). This technique has been shown to be superior to conventional techniques that use the ratio of the mean plateau values of the base and crosstalk signals. In addition, this technique analyzes the data over the entire signal acquisition range to determine crosstalk coefficients. For example, all data over the acquisition range may be used, or a portion of the data over the entire acquisition range may be used. Analyzing all signal curve data provides more robust crosstalk correction over the entire data acquisition range. Furthermore, the conventional method assumes a linear additive model for the signals measured from all sources and assumes that all signals are resolved between detectors; this is inaccurate and results in over-correction and under-correction of the resulting signal. The technique of the present invention instead uses a linear subtractive model that overcomes this problem and better models the actual detection system. These new techniques will find their greatest utility in examples where the crosstalk coefficient is in the range of 2% or more.
According to one aspect of the present invention, a method of determining crosstalk coefficients for a Polymerase Chain Reaction (PCR) optical detection system having at least two optical elements, each optical element operable to isolate a different specific range of electromagnetic wavelengths is provided. The method generally includes: acquiring a PCR data set over an acquisition range of a PCR growth process for each optical element; and simultaneously acquiring a crosstalk data set for each of the other optical elements over the acquisition range. The method also generally includes determining crosstalk coefficients using the PCR and the crosstalk data set. In certain aspects, the acquisition range includes a baseline region, a growth region, and a plateau region. In certain aspects, determining the crosstalk coefficient includes minimizing a sum of squares between each PCR data set and the crosstalk data set over the acquisition range. In certain aspects, crosstalk coefficients are applied to the PCR data set to produce a crosstalk-corrected PCR data set. In certain aspects, a linear subtractive model is used to apply the crosstalk coefficients.
According to another aspect of the invention, there is provided a computer readable medium comprising or storing code for controlling a processor to determine cross-talk coefficients for a Polymerase Chain Reaction (PCR) optical detection system having at least two optical elements, each optical element operable to isolate a different specific range of electromagnetic wavelengths. The code generally includes instructions for: receiving, for each optical element, a PCR data set acquired over an acquisition range of a PCR growth process; and simultaneously receiving, for each other optical element, a crosstalk data set for each filter acquired over the acquisition range. The code also generally includes instructions for determining crosstalk coefficients using the PCR and the crosstalk data set. In certain aspects, the acquisition range includes a baseline region, a growth region, and a plateau region. In certain aspects, the code for determining the crosstalk coefficient includes code for determining the crosstalk coefficient by minimizing a sum of squares between each PCR data set and the crosstalk data set over the acquisition range.
According to yet another aspect of the present invention, there is provided a kinetic Polymerase Chain Reaction (PCR) system generally comprising an optical detection module having at least two optical elements, each optical element operable to isolate a different specific electromagnetic wavelength range, wherein the optical detection module is generally adapted to: acquiring a PCR data set over an acquisition range of a PCR growth process for each optical element; and simultaneously acquiring a crosstalk data set for each of the other optical elements over the acquisition range. The system also generally includes an intelligence module adapted to process the acquired PCR data set and the crosstalk data set to determine crosstalk coefficients. In certain aspects, the acquisition range includes a baseline region, a growth region, and a plateau region. In certain aspects, the intelligence module determines the crosstalk coefficient by minimizing a sum of squares between each PCR data set and the crosstalk data set over the acquisition range.
According to yet another aspect of the present invention, a nucleic acid melting analysis system is provided that generally includes an optical detection module having at least two optical elements, each optical element operable to isolate a different specific electromagnetic wavelength range. The optical detection module is generally adapted to acquire a melting data set over a temperature acquisition range for each optical element and simultaneously acquire a crosstalk data set over the temperature acquisition range for each other optical element. The optical detection module is typically further adapted to determine the crosstalk coefficient by minimizing the sum of squares between each melting data set and crosstalk data set over the acquisition range.
According to yet another aspect of the present invention, a method of determining a crosstalk coefficient for an optical detection system having at least two optical elements, each optical element operable to isolate a different specific range of electromagnetic wavelengths, is provided. The method generally includes: acquiring a first data set over an acquisition range of the growth process for each optical element; simultaneously collecting a crosstalk data set over the collection range for each of the other optical elements; and determining a crosstalk coefficient using the first data set and the crosstalk data set over the acquisition range. In certain aspects, the growth process is one of a PCR process, a bacterial process, an enzymatic process, and a binding process.
Other features and advantages of the invention will be apparent from the remainder of the specification, including the drawings and claims. Further features and advantages of the invention, as well as the structure and operation of various embodiments of the invention, are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.
Drawings
FIG. 1a shows a typical set of real-time PCR curves in which the fluorescence intensity values are plotted against the cycle number of a typical PCR process.
Fig. 1b shows a process for determining crosstalk coefficients for a detection system that uses two or more detection channels to analyze a PCR amplification process.
Fig. 2 shows a specific dual channel case: the FAM signal and its crosstalk to the HEX channel of the dual channel detection system.
Fig. 3 shows the overlap of the two dye spectra in the two filter ranges.
Fig. 4 shows twenty-four separate residual plots obtained when testing the FAM and HEX channels using conventional methods, with the target in the FAM channel and no target in the HEX channel.
Fig. 5 shows the twenty-four plots of fig. 4 placed end-to-end.
Fig. 6 shows a superposition of all twenty-four plots of fig. 4.
Fig. 7-9 show end-to-end residual plots of a data set processed according to an embodiment of the present invention.
Fig. 10-12 show overlay residual plots corresponding to the plots shown in fig. 7-9, respectively.
Fig. 13 shows a data set of a PCR experiment for an HIV assay, where the target is in the FAM filter and no target is in the HEX filter.
Fig. 14 shows a cross-talk data set for the HEX channel corresponding to the HIV assay of fig. 13.
Fig. 15, 16 and 17 show crosstalk correction signals in HEX channels using a conventional method and two embodiments of the present invention, respectively.
FIG. 18 shows a general block diagram depicting the relationship between software and hardware resources.
Detailed Description
The present invention provides systems and methods for determining crosstalk coefficients, particularly for PCR detection systems and for PCR and nucleic acid melting data sets, and for generating crosstalk-corrected data sets using the crosstalk coefficients. The present invention also provides systems and methods for applying crosstalk coefficients using a linear subtractive model to produce a crosstalk-corrected data set.
While the remainder of this document will discuss embodiments and aspects of the invention in terms of its applicability to PCR data sets, it should be understood that the invention is applicable to data sets relating to other processes. Examples of other processes that may provide sigmoid or growth-type curves, or that may otherwise be processed according to the techniques of the present invention, include bacterial processes, melting processes, microbial growth processes, enzymatic processes (e.g., enzymatic kinetic reactions), and binding processes. For example, the techniques of the present invention may also be applied to analysis of data from nucleic acid melting processes and the like.
As shown in fig. 1a, the data for a typical PCR growth curve can be represented, for example, in a two-dimensional coordinate system, where the number of PCR cycles defines the x-axis and the indicator of accumulated polynucleotide growth defines the y-axis. Typically, as shown in FIG. 1a, the indicator of cumulative growth is the fluorescence intensity value, since the use of fluorescent markers is perhaps the most widely used labeling scheme. However, it should be understood that other indicators may be used depending on the particular labeling and/or detection scheme used. Examples of other useful indicators of cumulative signal growth include luminous intensity, chemiluminescent intensity, bioluminescent intensity, phosphorescent intensity, charge transfer, voltage, current, power, energy, temperature, viscosity, light scattering, radioactivity intensity, reflectivity, transmissivity, and absorptivity. The definition of cycle can also include time, process cycle, unit duty cycle, and regeneration cycle.
General procedure overview
One embodiment of a process 100 according to the present invention, for determining crosstalk coefficients for a detection system that analyzes a PCR amplification process using two or more detection channels, can be described with brief reference to fig. 1b. In certain aspects, the PCR detection system includes a detector for analyzing radiation emissions from the PCR experiment. The detection system includes two or more optical elements, each of which is operable to isolate a range of wavelengths for further analysis. These optical elements are typically selected to match the emission characteristics of the fluorescent probes or labels used in the PCR amplification process, e.g., to isolate radiation within the emission band of a particular dye. For example, in certain aspects, the optical element includes one or more optical filters that allow substantially all radiation within a defined wavelength range to pass through. Other optical elements may include diffraction gratings. Each optical element defines a detection channel, e.g., a wavelength range, which is received by the detector element.
In an exemplary embodiment of the invention, the method may be implemented using a conventional personal computer system including, but not limited to: an input device for inputting a data set, such as a keyboard, mouse, or the like; a display device, such as a monitor, for representing a particular point of interest in the curve region; a processing device, such as a CPU, required to perform each step in the method; a network interface, such as a modem; a data storage device for storing a data set; computer code running on a processor; or the like. Furthermore, the method may also be implemented in a PCR device.
In step 110, for each detection channel, an experimental data set representing one or more PCR curves is received or otherwise acquired. In certain aspects, data is acquired over the entire acquisition range of the detection system, e.g., over a baseline region, a transition region, and a plateau region. An example of a plotted PCR data set (one or more sets of PCR data curves) is shown in fig. 2A, where the y-axis and x-axis represent the fluorescence intensity and cycle number, respectively, of the PCR curve. In some aspects, the data set should include data equally spaced along the axis. However, there may be one or more missing data points. In step 120, for each other channel, a crosstalk data set is acquired simultaneously with step 110 over an acquisition range. An example of a crosstalk data set is shown in fig. 2B. In step 130, the data set is processed to determine crosstalk coefficients, as will be described in more detail below. In certain aspects, the sum of squares between each PCR data set and the crosstalk data set is minimized, as will be described below. In other aspects, a sum of absolute values of differences between the PCR data set and the crosstalk data set is minimized.
Where the process 100 is implemented in an intelligent module (e.g., a processor executing instructions) residing in a PCR data acquisition device, such as a temperature cycler, the data set may be provided to the intelligent module in real-time as the data is collected, or the data set may be stored in a memory unit or buffer and provided to the intelligent module after the experiment is completed. Similarly, the data set may be provided to a separate system (such as a desktop computer system or other computer system) via a direct connection (e.g., USB or other direct wired or wireless connection) or a network connection (e.g., LAN, VPN, intranet, internet, etc.) to the acquisition device, or the data set may be provided on a portable medium (such as a CD, DVD, floppy disk, etc.).
FIG. 18 shows a general block diagram explaining the relationship between software and hardware resources. The system includes a kinetic PCR analysis module, which may be located in a thermocycler device, and an intelligent module that is part of a computer system. The data sets (PCR data sets) are transferred from the analysis module to the intelligent module and vice versa via a network connection or a direct connection. The data set is processed according to the method shown in fig. 1b by computer code that is stored on the storage means of the intelligent module and runs on its processor. After processing, the data set is transmitted back to the storage means of the analysis module, where the modified data can be displayed on the display means.
In certain aspects, the data set includes data points having a pair of coordinate values (or a two-dimensional vector). For PCR data, the pair of coordinate values generally represents cycle number and fluorescence intensity values.
After the crosstalk coefficients have been determined in step 130, the determined coefficients may be applied in step 140 to the original or a new PCR data set to produce a crosstalk-corrected data set. In step 150, the crosstalk coefficients and/or the crosstalk-corrected PCR data set or other data may be stored to a memory unit, provided to a different system, e.g., over a network connection or via a portable storage medium, or displayed on a display device (such as a monitor or printer).
Crosstalk coefficient determination
In the conventional crosstalk model, the crosstalk coefficient is generally calculated as follows:
Assume that a single sample contains two visible dyes with distinct but overlapping spectra. The detection system consists of two distinct spectral filters, each filter passing about 95% of one dye spectrum and about 5% of the other dye spectrum. Each filter is typically optimized to pass light from only one dye. The light passed by each filter is treated as channel 1 and channel 2, respectively.
(1) Take the average of five data points in the plateau signal region for channels 1 and 2:
PLAvg1 = average(five plateau points, channel 1)
PLAvg2 = average(five plateau points, channel 2)
(2) Calculate the crosstalk coefficient of the sample:
XT-DYE 2-CHANNEL 1 = PLAvg2/PLAvg1
(3) Now increase the sample size to a 96-well microwell plate and calculate XT for channel 1 -> channel 2:
XT(1->2) = average(XT1, XT2, ..., XT96)
This conventional method of calculating crosstalk is satisfactory provided that (1) there is a plateau, (2) the plateau is flat, and (3) there is minimal noise in the plateau. However, this is not the case with many data sets.
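The three steps of the conventional calculation can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the data are synthetic, and taking the last five readings of each curve as the plateau points is an assumption, since the text does not state which five points are used.

```python
from statistics import mean

def plateau_average(curve, n_points=5):
    """Mean of the last n_points readings (assumed here to lie in the plateau)."""
    return mean(curve[-n_points:])

def sample_crosstalk(channel1, channel2, n_points=5):
    """Per-sample coefficient: PLAvg2 / PLAvg1."""
    return plateau_average(channel2, n_points) / plateau_average(channel1, n_points)

def plate_crosstalk(wells, n_points=5):
    """Average the per-sample coefficients over all wells.

    wells is a list of (channel1_curve, channel2_curve) pairs.
    """
    return mean(sample_crosstalk(c1, c2, n_points) for c1, c2 in wells)

# Synthetic two-well example (a real plate would contribute 96 pairs).
well_a = ([0.0, 0.1, 1.0, 4.0, 8.0, 10.0, 10.0, 10.0, 10.0, 10.0],
          [0.0, 0.0, 0.05, 0.2, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5])
well_b = ([0.0, 0.2, 2.0, 8.0, 16.0, 20.0, 20.0, 20.0, 20.0, 20.0],
          [0.0, 0.0, 0.1, 0.6, 1.1, 1.4, 1.4, 1.4, 1.4, 1.4])
xt_1_to_2 = plate_crosstalk([well_a, well_b])
```

Because only the plateau readings enter the ratio, noise or drift in those few points carries directly into the coefficient, which is the weakness discussed in the surrounding text.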
According to various embodiments of the present invention, methods for improving the accuracy of the calculation of crosstalk coefficients are provided. According to certain aspects, the crosstalk coefficients are determined by using an optimization technique similar to linear regression. Defining "Signal" as the base Signal and "XTSignal" as the crosstalk Signal, subscript i as the number of cycles, q as the multiplicative gain, r and s as the offset and slope, three exemplary embodiments can be described by the following equations:
(1) A simple gain "q" is used to minimize the sum of the squares between each fluorescence signal and the crosstalk signal, as shown in equation (1) below.
$$\min_{q} \sum_i \left( \mathrm{XTSignal}_i - q \cdot \mathrm{Signal}_i \right)^2 \qquad (1)$$
(2) A common offset "r" and slope "s", but an individual simple gain "q" for each channel, are used to minimize the sum of squares between each fluorescence signal and crosstalk signal, as shown in equation (2) below. This will result in a crosstalk coefficient for each channel and a common linear term for all channels.
(3) The offset "r", slope "s" and simple gain "q" are used to minimize the sum of squares between each fluorescence signal and crosstalk signal, as shown in equation (3) below. This will result in a crosstalk coefficient and a linear term for each channel.
$$\min_{r,s,q} \sum_i \left( \mathrm{XTSignal}_i - \left( r + s \cdot i + q \cdot \mathrm{Signal}_i \right) \right)^2 \qquad (3)$$
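As an illustration, the minimizations in equations (1) and (3) are ordinary linear least-squares problems and can be solved in closed form. The sketch below uses only the Python standard library. The function names, the synthetic sigmoid data, and the choice of cycle numbers starting at 1 are assumptions made for the example and are not taken from the specification.

```python
import math

def fit_gain(signal, xt_signal):
    """Equation (1): gain q minimizing sum((xt_i - q * signal_i)**2)."""
    num = sum(x * s for x, s in zip(xt_signal, signal))
    den = sum(s * s for s in signal)
    return num / den

def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row_idx: abs(m[row_idx][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def fit_offset_slope_gain(signal, xt_signal):
    """Equation (3): (r, s, q) minimizing sum((xt_i - (r + s*i + q*signal_i))**2)."""
    n = len(signal)
    # Design-matrix columns: constant, cycle number (assumed 1-based), base signal.
    cols = [[1.0] * n, [float(i) for i in range(1, n + 1)], list(signal)]
    # Normal equations (A^T A) p = A^T y for p = [r, s, q].
    ata = [[sum(u * v for u, v in zip(cu, cv)) for cv in cols] for cu in cols]
    aty = [sum(u * y for u, y in zip(cu, xt_signal)) for cu in cols]
    r, s, q = solve_linear(ata, aty)
    return r, s, q

# Synthetic data: a sigmoid base signal and a crosstalk signal that is 5% of
# the base signal plus a small linear term.
cycles = range(1, 41)
signal = [10.0 / (1.0 + math.exp(-(i - 20) / 2.0)) for i in cycles]
xt_signal = [0.02 + 0.001 * i + 0.05 * v for i, v in zip(cycles, signal)]

q_only = fit_gain(signal, xt_signal)
r_fit, s_fit, q_fit = fit_offset_slope_gain(signal, xt_signal)
```

On this synthetic data the three-parameter fit recovers the gain of 0.05, whereas the gain-only fit absorbs part of the linear term into q, which suggests why a linear term may be worth including when the crosstalk channel drifts.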
In other aspects, a sum of absolute values of differences between the PCR data set and the crosstalk data set is minimized. Other minimization methods may be used, such as, for example, the Levenberg-Marquardt method, the linear programming method, the Nelder-Mead method, the gradient descent method, the sub-gradient method, the simplex method, the ellipsoid method, the bundle method, Newton's method, the quasi-Newton method, the interior point method, and others that will be apparent to those skilled in the art.
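For the gain-only case, the sum of absolute differences can be minimized without an iterative solver, because the minimizing q is a weighted median of the ratios XTSignal_i / Signal_i with weights |Signal_i|. The following Python sketch illustrates this; the function name and data are hypothetical, and the weighted-median approach is one possible choice rather than a method named in the specification.

```python
def fit_gain_abs(signal, xt_signal):
    """Gain q minimizing sum(|xt_i - q * signal_i|) via a weighted median.

    Points with signal_i == 0 do not depend on q and are skipped.
    """
    pairs = sorted((x / s, abs(s)) for s, x in zip(signal, xt_signal) if s != 0)
    half = sum(weight for _, weight in pairs) / 2.0
    running = 0.0
    for ratio, weight in pairs:
        running += weight
        if running >= half:
            return ratio
    raise ValueError("signal must contain at least one nonzero value")

# Synthetic example: 5% crosstalk with a single spike in the crosstalk channel.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
xt = [0.05 * v for v in base]
xt[2] = 5.0  # outlier
q_abs = fit_gain_abs(base, xt)
```

In this example the spike leaves the absolute-value estimate at 0.05, while a least-squares gain on the same data would be pulled well above it.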
One advantage of these various embodiments is that data over the entire signal acquisition range is used to determine the crosstalk coefficients. For example, all data over the acquisition range may be used, or a portion of the data over the entire acquisition range may be used. This provides a crosstalk coefficient that averages out the systematic data acquisition error. In contrast, conventional methods rely solely on a plateau region containing less than 10% of the data. Embodiments of the present invention also provide additional advantages for assays such as PCR. During PCR, the plateau region signal is generated when most of the probe is consumed. For this reason, baseline signal thresholds are typically used for target recognition. Thus, conventional methods use noisy signals to determine crosstalk coefficients, with limited information that does not include data from real signal acquisition points on the curve. Moreover, it has been found that incorrect assumptions used in conventional crosstalk models can cause errors as a function of the data acquisition curve. Using the entire curve for crosstalk calculation removes some of these false assumptions. Furthermore, use of the correct model also virtually eliminates crosstalk correction errors.
According to certain aspects, background and/or baseline subtraction is performed on the data sets representing all signals (base and crosstalk signals) prior to determining the crosstalk coefficients. Background subtraction is typically done by subtracting a buffer signal that is unique to each channel. Baseline subtraction is generally accomplished by defining a baseline (e.g., slope and intercept) and subtracting this baseline from all signal values (base signal and crosstalk signal). The baseline can be defined by specifying baseline start and end values and performing a linear regression between these end points, or by curve fitting a function (such as a double sigmoid function) and using the slope and intercept parameters from this function as the baseline.
In certain aspects, determining the baseline includes performing a linear regression on the data set between the defined baseline start position and the baseline end position. In other aspects, determining the baseline includes curve fitting a double sigmoid function to identify slope and intercept values.
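The linear-regression variant of baseline subtraction can be sketched as follows in Python. The function names, the 1-based cycle numbering, and the synthetic data are assumptions made for illustration; the double sigmoid variant is not shown.

```python
def fit_baseline(values, start, end):
    """Least-squares line through cycles start..end (1-based, inclusive).

    Returns (intercept, slope).
    """
    xs = list(range(start, end + 1))
    ys = values[start - 1:end]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def subtract_baseline(values, start, end):
    """Subtract the fitted baseline from every point of the curve."""
    intercept, slope = fit_baseline(values, start, end)
    return [v - (intercept + slope * i) for i, v in enumerate(values, start=1)]

# Synthetic curve: drifting baseline for cycles 1..10, then a step of 5.0.
raw = [1.0 + 0.1 * i + (5.0 if i > 10 else 0.0) for i in range(1, 16)]
corrected = subtract_baseline(raw, start=1, end=10)
```

The same routine would be applied to both the base signal and the crosstalk signal before the coefficients are fitted, so that both curves start from a common zero level.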
In a particular aspect, the method may include removing outliers or "spikes" from one or more of the PCR data set (signal) and the crosstalk data set prior to determining the crosstalk coefficients. U.S. patent application Ser. No. 11/316,315 entitled "Levenberg-Marquardt Outlier Spike Removal Method" and U.S. patent application Ser. No. 11/349,550 entitled "PCR Elbow Determination By Use of a Double Sigmoid Function Curve Fit With the Levenberg-Marquardt Algorithm and Normalization" disclose such techniques for fitting a double sigmoid function to determine the slope and intercept parameters (among other parameters) of a PCR curve and also for identifying and removing outliers or "spikes" in a PCR data set.
In some aspects of the method, the optical element includes one or more filters, each filter allowing light of a different specific wavelength range to pass through. In a particular embodiment, the system includes at least four filters.
In other aspects, the method includes displaying or outputting the determined crosstalk coefficients and/or storing the determined crosstalk coefficients to a memory module for later use in generating a crosstalk-corrected data set.
The present invention also relates to a computer-readable medium comprising code for controlling a processor to determine crosstalk coefficients for a Polymerase Chain Reaction (PCR) optical detection system having at least two optical elements, each optical element operable to isolate a different specific range of electromagnetic wavelengths, the code comprising instructions for: receiving, for each optical element, a PCR data set acquired over an acquisition range of a PCR growth process, the acquisition range including a baseline region, a growth region, and a plateau region; simultaneously receiving, for each other optical element, a crosstalk dataset for each filter acquired over the acquisition range; and determining crosstalk coefficients using the PCR and crosstalk data sets over the acquisition range. In certain aspects, the optical element includes one or more optical filters, each of which allows light of a different specific wavelength range to pass through.
In another aspect of the computer readable medium, the data acquisition range represents a plurality of PCR cycles, wherein determining the crosstalk coefficient includes minimizing a sum of squares between each PCR data set and the crosstalk data set over the acquisition range.
In a particular embodiment, minimizing the sum of squares includes using an equation of the form:
$$\min_{q} \sum_i \left( \mathrm{XTSignal}_i - q \cdot \mathrm{Signal}_i \right)^2,$$
where i is the PCR cycle number, where Signal_i is the PCR data set for the optical element, where XTSignal_i is the crosstalk data set for the optical element, and where q is a multiplicative gain factor.
In other embodiments, minimizing the sum of squares includes using an equation of the form:
where i is the PCR cycle number, where Signal_i is the PCR data set for the optical element, where XTSignal_i is the crosstalk data set for the optical element, where r is the common offset, where s is the common slope, and where q1, q2, and q3 are multiplicative gain factors.
In yet another embodiment, minimizing the sum of squares includes using an equation of the form:
$$\min_{r,s,q} \sum_i \left( \mathrm{XTSignal}_i - \left( r + s \cdot i + q \cdot \mathrm{Signal}_i \right) \right)^2,$$
where i is the PCR cycle number, where Signal_i is the PCR data set for the optical element, where XTSignal_i is the crosstalk data set for the optical element, where r is the offset, where s is the slope, and where q is the multiplicative gain factor.
In certain aspects, the code further includes instructions to: determining a baseline for each PCR data set and each crosstalk data set; and subtracting the respective baseline from each data set prior to determining the crosstalk coefficients. Here, the instructions for determining the baseline may include instructions for performing a linear regression on the data set between the defined baseline start position and the baseline end position. In another aspect, the instructions for determining the baseline may include instructions for curve fitting a double sigmoid function to identify slope and intercept values.
In other aspects, the code further includes instructions for removing outliers from one or more of the PCR data set and the crosstalk data set prior to determining the crosstalk coefficients. The code may also include instructions for displaying or outputting the determined crosstalk coefficients. In other embodiments, the code may further include instructions for storing the determined crosstalk coefficients to a memory module for later use in generating a crosstalk-corrected data set.
The code may also include instructions for applying the determined crosstalk coefficients to a PCR data set using a linear subtractive model to produce a crosstalk-corrected data set. Here, the linear subtraction model may include an equation of the form:
wherein f_i is the signal measured in channel (i) and f_iC is the signal corrected for crosstalk in channel (i), and wherein the coefficient a_ij represents the crosstalk coefficient from channel (j) to channel (i).
In another aspect, the linear subtraction model may include an equation of the form:
wherein f_i is the signal measured in channel (i) and f_iC is the signal corrected for crosstalk in channel (i), wherein the coefficient a_ij represents the crosstalk coefficient from channel (j) to channel (i), and wherein r and s are the gain and linear terms, respectively, common to all channels (i).
In yet another aspect, the linear subtraction model may include an equation of the form:
wherein f_i is the signal measured in channel (i) and f_iC is the signal corrected for crosstalk in channel (i), wherein the coefficient a_ij represents the crosstalk coefficient from channel (j) to channel (i), and wherein r_i and s_i are, respectively, a gain and a linear term that are different for each channel (i).
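The equations of the linear subtractive model are not reproduced in this text. Purely as an illustration, the Python sketch below assumes the simplest form consistent with the symbol definitions above, namely f_iC = f_i - sum over j != i of a_ij * f_j, with no gain or linear terms. The function name and data are hypothetical, and the actual equations of the specification may differ.

```python
def correct_crosstalk(measured, coeffs):
    """Assumed subtractive correction: f_iC = f_i - sum_{j != i} a_ij * f_j.

    measured[i][k] is the signal measured in channel i at cycle k.
    coeffs[i][j] is the crosstalk coefficient from channel j into channel i.
    """
    n = len(measured)
    corrected = []
    for i in range(n):
        row = []
        for k, value in enumerate(measured[i]):
            leak = sum(coeffs[i][j] * measured[j][k] for j in range(n) if j != i)
            row.append(value - leak)
        corrected.append(row)
    return corrected

# Synthetic two-channel example: target in FAM only, 5% leakage into HEX.
fam = [0.0, 1.0, 5.0, 10.0]
hex_measured = [0.05 * v for v in fam]
a = [[0.0, 0.0],    # nothing leaks from HEX into FAM in this example
     [0.05, 0.0]]   # 5% of FAM appears in HEX
fam_corrected, hex_corrected = correct_crosstalk([fam, hex_measured], a)
```

Under this assumed form, each corrected channel is computed directly from the measured signals, so no matrix inversion is needed.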
The present invention also relates to a kinetic Polymerase Chain Reaction (PCR) system comprising: an optical detection module having at least two optical elements, each optical element being operable to isolate a different specific electromagnetic wavelength range, wherein the optical detection module is adapted to: acquiring a PCR dataset for each optical element over an acquisition range of a PCR growth process, the acquisition range comprising a baseline region, a growth region, and a plateau region; and simultaneously acquiring a crosstalk data set over the acquisition range for each other optical element; and an intelligence module adapted to process the acquired PCR data sets and crosstalk data sets to determine crosstalk coefficients using each PCR and crosstalk data set over an acquisition range. In certain aspects of the system, the data acquisition range represents a plurality of PCR cycles and the intelligence module determines the crosstalk coefficient by minimizing a sum of squares between each PCR data set and the crosstalk data set over the acquisition range. Minimizing the sum of squares may include using an equation of the form:
$$\min\left[\sum_i \left(\mathrm{XTSignal}_i - q \cdot \mathrm{Signal}_i\right)^2\right],$$
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, and where q is a multiplicative gain factor.
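Because the single-gain objective above is quadratic in q, setting its derivative with respect to q to zero gives the closed form q = Σ XTSignal_i · Signal_i / Σ Signal_i². A minimal Python sketch of that calculation follows; the function name `crosstalk_gain` and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
def crosstalk_gain(signal, xt_signal):
    """Least-squares q minimizing sum_i (xt_signal[i] - q * signal[i])**2."""
    numerator = sum(x * s for x, s in zip(xt_signal, signal))
    denominator = sum(s * s for s in signal)
    return numerator / denominator


# Synthetic data: the crosstalk trace is exactly 1.5% of the pure signal.
signal = [0.1, 0.1, 0.3, 1.2, 3.0, 4.5, 5.0, 5.1]
xt_signal = [0.015 * s for s in signal]
q = crosstalk_gain(signal, xt_signal)
```

Unlike a plateau-only ratio, every cycle in the acquisition range contributes to the estimate, weighted by the size of the pure signal at that cycle.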
In other embodiments, minimizing the sum of squares may include using an equation of the form:
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, where r is a common offset, where s is a common slope, and where q1, q2, and q3 are multiplicative gain factors.
In yet other embodiments, minimizing the sum of squares includes using an equation of the form:
$$\min\left[\sum_i \left(\mathrm{XTSignal}_i - \left(r + s \cdot i + q \cdot \mathrm{Signal}_i\right)\right)^2\right],$$
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, where r is an offset, where s is a slope, and where q is a multiplicative gain factor.
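The offset, slope and gain form is an ordinary linear least-squares problem in the three unknowns r, s and q, with regressor columns [1, i, Signal_i]. The sketch below solves the corresponding normal equations in plain Python. The helper names `solve_linear` and `fit_offset_slope_gain`, the use of normal equations rather than another solver, and the synthetic trace are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [x - factor * y for x, y in zip(m[k], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


def fit_offset_slope_gain(signal, xt_signal):
    """Return (r, s, q) minimizing
    sum_i (xt_signal_i - (r + s*i + q*signal_i))**2 for cycles i = 1, 2, ..."""
    rows = [[1.0, float(i), sig] for i, sig in enumerate(signal, start=1)]
    # Normal equations (X^T X) beta = X^T y for the columns [1, i, signal_i].
    xtx = [[sum(row[p] * row[c] for row in rows) for c in range(3)]
           for p in range(3)]
    xty = [sum(row[p] * y for row, y in zip(rows, xt_signal)) for p in range(3)]
    r, s, q = solve_linear(xtx, xty)
    return r, s, q


# Synthetic check: build a crosstalk trace from known r, s, q and recover them.
signal = [0.1, 0.1, 0.1, 0.2, 0.6, 1.8, 3.6, 4.6, 4.9, 5.0]
xt_signal = [0.03 + 0.0004 * i + 0.014 * sig
             for i, sig in enumerate(signal, start=1)]
r, s, q = fit_offset_slope_gain(signal, xt_signal)
```

The fit is only well determined when the pure signal is not itself a straight line in the cycle number, which holds for a sigmoid growth curve.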
Within the system, the intelligence module may be further adapted to determine a baseline for each PCR data set and each crosstalk data set and subtract the respective baseline from each data set prior to determining the crosstalk coefficients. In particular embodiments, the intelligence module determines the baseline by performing a linear regression on the data set between the defined baseline start position and the baseline end position. In another particular embodiment, the intelligence module determines the baseline by curve fitting a double sigmoid function to identify slope and intercept values.
In certain embodiments of the system, the intelligence module is further adapted to remove outliers from one or more of the PCR data set and the crosstalk data set prior to determining the crosstalk coefficients. In other embodiments of the system, the optical element comprises one or more filters, each filter allowing light of a different specific wavelength range to pass through.
Within the system, the intelligence module may be further adapted to apply the determined crosstalk coefficients to the PCR data set using a linear subtraction model to produce a crosstalk-corrected data set. In certain embodiments, the linear subtraction model includes an equation of the form:
$$f_{iC} = f_i - \sum_{j \neq i} a_{ij} f_j,$$
wherein $f_i$ is the signal measured in channel (i) and $f_{iC}$ is the signal corrected for crosstalk in channel (i), and wherein the coefficient $a_{ij}$ represents the crosstalk coefficient from channel (j) to channel (i). In another embodiment, the linear subtraction model includes an equation of the form:
wherein $f_i$ is the signal measured in channel (i) and $f_{iC}$ is the signal corrected for crosstalk in channel (i), where the coefficient $a_{ij}$ represents the crosstalk coefficient from channel (j) to channel (i), and where r and s are the gain and linear terms, respectively, common to all channels (i). In yet another embodiment, the linear subtraction model includes an equation of the form:
wherein $f_i$ is the signal measured in channel (i) and $f_{iC}$ is the signal corrected for crosstalk in channel (i), where the coefficient $a_{ij}$ represents the crosstalk coefficient from channel (j) to channel (i), and where $r_i$ and $s_i$ are, respectively, a gain and a linear term that are different for each channel (i).
The system may further include a display module in some aspects, wherein the intelligence module is further adapted to provide data to the display module to display the determined crosstalk coefficient or one of the crosstalk-corrected PCR data sets. In a particular embodiment, the system may further comprise a memory module, wherein the intelligence module is further adapted to store the determined crosstalk coefficients to the memory module for later use in generating the crosstalk-corrected data set.
Example using an HPV calibration assay
Consider a specific dual-channel scenario: FAM signals in an HPV calibration assay and their crosstalk into the HEX channel, as shown in fig. 2A and 2B. FAM and HEX are well known fluorescent dyes with different excitation and emission characteristics. The signals in fig. 2A and 2B are almost ideal because there is a clear, flat plateau with little noise. Thus, the crosstalk coefficients calculated by the conventional method and by the methods of equations (1) to (3) can be expected to be almost equal.
Analysis using conventional methods of determining crosstalk coefficients
The average of five points in the plateau from the crosstalk data is taken and divided by the average of five points in the plateau from the pure signal. The crosstalk coefficient is then defined as the average of the crosstalk coefficients of all wells in the thermal cycler. The result is (shown via Mathematica code):
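(* FC1[[i,j]] is the pure signal and FC1[[60+i,j]] the crosstalk signal at row i for well j *)
TableMean=0*Range[24]; (* one plateau ratio per well, 24 wells *)
For[j=1,j≤24,j++,
(* ratio of the mean crosstalk plateau to the mean pure-signal plateau in well j *)
TableMean[[j]]=Mean[Table[FC1[[60+i,j]],{i,50,55}]]/
Mean[Table[FC1[[i,j]],{i,50,55}]];
]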
TableMean
{0.0173338,0.0171016,0.016073,0.015443,0.0165934,0.0181777,0.0170146,0.0155421,0.0166653,0.0145379,0.0150514,0.014366,0.0160809,0.0152727,0.0147269,0.0134197,0.0135746,0.0133044,0.0158858,0.0163226,0.0159318,0.0148446,0.0139202,0.0146015}
MTM=Mean[TableMean] (* crosstalk coefficient: mean of the per-well ratios *)
0.0154911
Thus, in this example, the FAM-to-HEX crosstalk coefficient is determined to be 0.01549. For the methods of equations (1) - (3), a summary of the FAM-to-HEX crosstalk coefficients is shown in Table 1 below:
TABLE 1
| Calculation method | Crosstalk coefficient | Offset | Linear term |
|---|---|---|---|
| Conventional | 0.01549 | -- | -- |
| Equation 1 | 0.01572 | -- | -- |
| Equation 2 | 0.01470 | 0.03947 | -0.00011043 |
| Equation 3 | 0.01433 | 0.03132 | 0.00040327 |
Equation 1 is perhaps the most similar to the conventional approach and produces a nearly identical crosstalk coefficient, while the coefficients from equations 2 and 3 differ because those equations include offset and linear terms.
Comparison of crosstalk matrices: conventional method and equation (1)
Consider the specific four-channel case. It is instructive to compare all crosstalk coefficients of FAM, HEX, JA270 and CY5.5 in the HPV calibration assay. The crosstalk coefficients calculated using the conventional method are shown in Table 2 below, and the coefficients calculated using equation (1) are shown in Table 3. Equation (1) does not use the diagonal elements ($a_{11}$, $a_{22}$, $a_{33}$, $a_{44}$), so these cells are marked "-". As discussed above, because this HPV calibration set has a clear, flat plateau with little noise, the two sets of crosstalk coefficients are expected to be almost identical.
Table 2: Crosstalk matrix calculated using the conventional method:
Table 3: Crosstalk matrix calculated using equation 1:
Thus, in this particular example, any differences between applying the conventional method and applying the methods of equations (1) - (3) are due to the mathematical application of the crosstalk coefficients rather than to the coefficients themselves. It should be noted, however, that there are many examples where the crosstalk coefficients differ substantially between the conventional method and equations (1) - (3).
Applying crosstalk coefficients to produce crosstalk-corrected data
The conventional method of applying the crosstalk coefficients assumes an additive linear model, shown in equation (4) below:
$$\begin{aligned}
f_1 &= a_{11}c_1 + a_{12}c_2 + a_{13}c_3 + a_{14}c_4\\
f_2 &= a_{21}c_1 + a_{22}c_2 + a_{23}c_3 + a_{24}c_4\\
f_3 &= a_{31}c_1 + a_{32}c_2 + a_{33}c_3 + a_{34}c_4\\
f_4 &= a_{41}c_1 + a_{42}c_2 + a_{43}c_3 + a_{44}c_4
\end{aligned} \tag{4}$$
wherein $f_i$ is the measured signal, $c_i$ is the fluorescent dye signal, and $a_{IJ}$ is the crosstalk from channel J to channel I.
These crosstalk coefficients also have the following properties:
Equation set (4) can be solved by matrix inversion to obtain the dye signals $c_i$; the dye signal $c_i$ is defined as the crosstalk-corrected signal. One problem with this approach is that it assumes that all of the signal from channel J is apportioned (parsed) among channels (1, 2, 3, 4). This is generally not correct.
According to one embodiment, the crosstalk coefficients are applied using a subtractive linear model to produce a crosstalk-corrected data set. The linear subtraction model of equation (1) is shown in equation (6) below:
$$\begin{aligned}
f_{1C} &= f_1 - (a_{12}f_2 + a_{13}f_3 + a_{14}f_4)\\
f_{2C} &= f_2 - (a_{21}f_1 + a_{23}f_3 + a_{24}f_4)\\
f_{3C} &= f_3 - (a_{31}f_1 + a_{32}f_2 + a_{34}f_4)\\
f_{4C} &= f_4 - (a_{41}f_1 + a_{42}f_2 + a_{43}f_3)
\end{aligned} \tag{6}$$
wherein $f_i$ is the fluorescence measured in channel (i) and $f_{iC}$ is the crosstalk-corrected signal in channel (i). The coefficient $a_{IJ}$ represents the crosstalk coefficient from channel (J) to channel (I). This model makes no assumption that the underlying signal is apportioned among the different channels.
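The subtractive correction of equation (6) amounts to one weighted sum per channel, with the diagonal of the coefficient matrix ignored. The following Python sketch applies it to the signals measured at a single cycle; the function name `apply_subtractive_model`, the list-of-lists layout `a[i][j]` (crosstalk from channel j into channel i, zero-based) and the coefficient values are illustrative assumptions, not values from the tables.

```python
def apply_subtractive_model(measured, a):
    """Crosstalk-correct one cycle of multi-channel data.

    measured[i] is the signal in channel i; a[i][j] is the crosstalk
    coefficient from channel j into channel i. Diagonal entries are unused.
    """
    n = len(measured)
    return [
        measured[i] - sum(a[i][j] * measured[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical four-channel example: dye only in channel 0, leaking 1.5%
# into channel 1; channel 1 is assumed to leak 0.2% back into channel 0.
a = [
    [0.0, 0.002, 0.0, 0.0],
    [0.015, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
measured = [5.0, 0.075, 0.0, 0.0]
corrected = apply_subtractive_model(measured, a)
```

Each corrected value uses only the measured signals, so no linear system has to be solved.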
The linear subtraction model of equation (2) is shown in equation (7) below:
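$$\begin{aligned}
f_{1C} &= f_1 - (a_{12}f_2 + a_{13}f_3 + a_{14}f_4) - (r + s \cdot i)\\
f_{2C} &= f_2 - (a_{21}f_1 + a_{23}f_3 + a_{24}f_4) - (r + s \cdot i)\\
f_{3C} &= f_3 - (a_{31}f_1 + a_{32}f_2 + a_{34}f_4) - (r + s \cdot i)\\
f_{4C} &= f_4 - (a_{41}f_1 + a_{42}f_2 + a_{43}f_3) - (r + s \cdot i)
\end{aligned} \tag{7}$$
wherein $f_i$ is the fluorescence measured in channel (i) and $f_{iC}$ is the crosstalk-corrected signal in channel (i). The coefficient $a_{IJ}$ represents the crosstalk coefficient from channel (J) to channel (I). In the term $s \cdot i$, i denotes the cycle number, as in equation (2). Equation (7) uses gain and linear terms r and s that are common to all channels.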
The linear subtraction model of equation (3) is shown in equation (8) below:
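$$\begin{aligned}
f_{1C} &= f_1 - (a_{12}f_2 + a_{13}f_3 + a_{14}f_4) - (r_1 + s_1 \cdot i)\\
f_{2C} &= f_2 - (a_{21}f_1 + a_{23}f_3 + a_{24}f_4) - (r_2 + s_2 \cdot i)\\
f_{3C} &= f_3 - (a_{31}f_1 + a_{32}f_2 + a_{34}f_4) - (r_3 + s_3 \cdot i)\\
f_{4C} &= f_4 - (a_{41}f_1 + a_{42}f_2 + a_{43}f_3) - (r_4 + s_4 \cdot i)
\end{aligned} \tag{8}$$
wherein $f_i$ is the fluorescence measured in channel (i) and $f_{iC}$ is the crosstalk-corrected signal in channel (i). The coefficient $a_{IJ}$ represents the crosstalk coefficient from channel (J) to channel (I). In the term $s_k \cdot i$, i denotes the cycle number, as in equation (3). Equation (8) uses gain and linear terms r and s that are different in each channel.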
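Equations (7) and (8) add a per-cycle offset and linear term to the same subtraction. A hedged Python sketch is shown below; the function name `apply_subtractive_model_with_trend`, the zero-based channel layout, and the two-channel values are illustrative assumptions. Passing the same value for every entry of `r` and of `s` would give the common-term case of equation (7).

```python
def apply_subtractive_model_with_trend(measured, a, r, s, cycle):
    """Crosstalk-correct one cycle using per-channel offset and linear terms.

    measured[k] is the signal in channel k at the given cycle number;
    a[k][j] is the crosstalk coefficient from channel j into channel k;
    r[k] and s[k] are the offset and linear term for channel k.
    """
    n = len(measured)
    return [
        measured[k]
        - sum(a[k][j] * measured[j] for j in range(n) if j != k)
        - (r[k] + s[k] * cycle)
        for k in range(n)
    ]


# Hypothetical two-channel example at cycle 30.
a = [[0.0, 0.0], [0.0143, 0.0]]
r = [0.0, 0.0313]
s = [0.0, 0.0004]
measured = [4.0, 0.1]
corrected = apply_subtractive_model_with_trend(measured, a, r, s, cycle=30)
```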
It is further noted that the calculation of these crosstalk correction signals advantageously does not require matrix inversion. Also, in certain aspects, equations (6) - (8) may be modified by first subtracting the background or baseline from the base and crosstalk signals.
Comparison of crosstalk application in the dual-dye case
Consider the dye spectrum shown in fig. 3, where spectrum 10 represents dye 1(FAM) and spectrum 20 represents dye 2 (HEX). It is desirable to remove the crosstalk represented by the overlapping regions in the FAM and HEX dyes.
In the conventional method, solving equations (4) and (5) above for crosstalk correction signals of FAM as observed in filter 1 and HEX as observed in filter 2 gives the results shown in equations (9) and (10) below, respectively.
The crosstalk correction signal for this system using equation (1) is given as the following equations (11) and (12):
$$f_{1C} = f_1 - a_{12} f_2 \tag{11}$$
$$f_{2C} = f_2 - a_{21} f_1 \tag{12}$$
Equation (11) overcomes two problems associated with equation (9): the multiplier $(1 - a_{12})$ applied to $f_1$ (which causes crosstalk overcompensation) is no longer present, and the incorrect denominator is no longer present.
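To make the contrast concrete, the two-dye conventional solution can be worked out under one stated assumption: that the coefficients of each dye sum to one across the channels, that is, $a_{11} + a_{21} = 1$ and $a_{12} + a_{22} = 1$. This normalization is inferred from the description of the conventional model and is not quoted from the source. Under that assumption, inverting the 2x2 form of equation (4) gives:

```latex
\begin{aligned}
f_1 &= (1 - a_{21})\,c_1 + a_{12}\,c_2, \qquad
f_2 = a_{21}\,c_1 + (1 - a_{12})\,c_2,\\[4pt]
\det &= (1 - a_{21})(1 - a_{12}) - a_{12}a_{21} = 1 - a_{12} - a_{21},\\[4pt]
c_1 &= \frac{(1 - a_{12})\,f_1 - a_{12}\,f_2}{1 - a_{12} - a_{21}}, \qquad
c_2 = \frac{(1 - a_{21})\,f_2 - a_{21}\,f_1}{1 - a_{12} - a_{21}}.
\end{aligned}
```

In this form the measured signal $f_1$ is scaled by $(1 - a_{12})$ and the whole expression is divided by $1 - a_{12} - a_{21}$, whereas equations (11) and (12) subtract only the crosstalk term.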
Comparison of conventional and novel methods against HPV datasets
The existing methods and equations (1) - (3) were used to test HPV datasets for only FAM and HEX channels. This data contains the target in the FAM channel and no target in the HEX channel. These methods are applied to the crosstalk signals in the HEX channel and the resulting residual map is examined. Ideally, these residuals would be centered at zero intercept, with a slope of zero.
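The residual check described above can be sketched as follows: subtract the modeled crosstalk from the measured crosstalk trace, then fit a straight line to the residuals and inspect its slope and intercept. The function names `residuals` and `trend`, the fixed coefficient values and the synthetic trace are assumptions for illustration only.

```python
def residuals(signal, xt_signal, q, r=0.0, s=0.0):
    """Residual of the crosstalk model at each cycle i = 1, 2, ..."""
    return [x - (r + s * i + q * sig)
            for i, (sig, x) in enumerate(zip(signal, xt_signal), start=1)]


def trend(values):
    """Least-squares (slope, intercept) of values against cycle number."""
    n = len(values)
    cycles = range(1, n + 1)
    mean_c = sum(cycles) / n
    mean_v = sum(values) / n
    slope = (sum((c - mean_c) * (v - mean_v) for c, v in zip(cycles, values))
             / sum((c - mean_c) ** 2 for c in cycles))
    return slope, mean_v - slope * mean_c


# Synthetic trace with a small upward drift that a gain-only model cannot absorb.
signal = [0.1, 0.1, 0.3, 1.2, 3.0, 4.5, 5.0, 5.1]
xt_signal = [0.015 * sig + 0.0002 * i for i, sig in enumerate(signal, start=1)]

gain_only = trend(residuals(signal, xt_signal, q=0.015))
with_slope = trend(residuals(signal, xt_signal, q=0.015, s=0.0002))
```

In this synthetic case the gain-only residuals keep the drift as a nonzero slope, while the model with a linear term leaves residuals with slope and intercept near zero.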
a) Residual map using conventional method
Fig. 4 shows twenty-four individual residual plots, while fig. 5 shows these twenty-four plots end-to-end. Fig. 6 shows the superposition of all twenty-four plots. With an optimal correction, these residuals would be expected to be centered on the x-axis, with a slope and intercept of zero. However, in fig. 4, most of the plots show residuals that decrease significantly from cycle 1 to cycle 60, indicating a non-optimal crosstalk correction.
Upon observing fig. 6 (a superposition of all twenty-four plots of fig. 4), it is clear that there is a significant downward trend from cycle 1 to cycle 60 when using the conventional crosstalk method.
b) Residual map using equations (1) - (3)
Figs. 7-9 show end-to-end residual plots for equations (1) - (3), respectively. Similarly, figs. 10-12 show superimposed residual plots for equations (1) - (3). Comparing fig. 6 with figs. 10-12, it is apparent that the range of the residuals using the inventive technique is advantageously smaller than with the conventional approach. In addition, using the techniques of the present invention, the trend lines of the overlays have a slope and intercept closer to the target of zero. In this example, the crosstalk between FAM and HEX is approximately 2%. Thus, these figures show a more nearly optimal correction, indicating the robustness of the inventive technique. The superiority of the present technique will be more apparent in examples where there are more dyes and the crosstalk is in the range of 2-20%.
Comparison of conventional and novel methods against HIV datasets
Figs. 13 and 14 show a PCR experiment for an HIV assay in the FAM and HEX filters, respectively, where the target is in the FAM filter and there is no target in the HEX filter. Using the data in figs. 13 and 14, the crosstalk coefficient ($a_{21}$) for FAM-to-HEX crosstalk is calculated, for both the conventional method and equation (1), as $a_{21} = 0.051$.
Fig. 15, 16 and 17 show crosstalk correction signals in the HEX channel using the conventional method, equation (1) and equation (3), respectively. It can be seen that the use of either of the latter two methods greatly reduces signal overcorrection.
It should be appreciated that the crosstalk coefficient determination process described herein, including the crosstalk correction process, may be implemented in computer code running on a processor of a computer system. The code includes instructions for controlling a processor to implement various aspects and steps of the process. The code is typically stored on a hard disk, RAM or a portable medium such as a CD, DVD, etc. Similarly, these processes may be implemented in a PCR device, such as a temperature cycler, that includes a processor that executes instructions stored in a memory unit coupled to the processor. Code comprising such instructions may be downloaded to the PCR device memory unit via a network connection or direct connection to a code source or using a portable medium as is well known.
It will be appreciated by those skilled in the art that the crosstalk coefficient determination and crosstalk correction processes of the present invention can be implemented using a variety of programming languages (such as C, C++, C#, Fortran, VisualBasic, etc.), as well as applications such as Mathematica, which provide pre-packaged routines, functions and procedures useful for data visualization and analysis. Another example of the latter is MATLAB.
While the invention has been described by way of example and in terms of specific embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements as would be apparent to those skilled in the art. Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Claims (15)
1. A method of determining crosstalk coefficients for a Polymerase Chain Reaction (PCR) optical detection system having at least two optical elements, each optical element operable to isolate a different specific electromagnetic wavelength range, the method comprising:
-acquiring a PCR data set over an acquisition range of the PCR growth process for each optical element;
-simultaneously acquiring for each other optical element a crosstalk data set resulting from the overlap of at least two dye spectra in at least two filter ranges, the crosstalk data set being acquired over the acquisition range; and
-using the PCR data set and the crosstalk data set over the acquisition range to determine crosstalk coefficients,
wherein determining the crosstalk coefficient comprises minimizing a sum of squares of differences between the PCR data set and the crosstalk data set over the acquisition range.
2. The method of claim 1, wherein the data acquisition range represents a plurality of PCR cycles, and wherein determining the crosstalk coefficient comprises minimizing a sum of squares of differences between each PCR data set and the crosstalk data set over the acquisition range.
3. The method of claim 1 or 2, wherein minimizing the sum of squares comprises using an equation of the form:
$$\min\left[\sum_i \left(\mathrm{XTSignal}_i - q \cdot \mathrm{Signal}_i\right)^2\right],$$
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, and where q is a multiplicative gain factor.
4. The method of claim 1 or 2, wherein minimizing the sum of squares comprises using an equation of the form:
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, where r is a common offset, where s is a common slope, and where q1, q2, and q3 are multiplicative gain factors.
5. The method of claim 1 or 2, wherein minimizing the sum of squares comprises using an equation of the form:
$$\min\left[\sum_i \left(\mathrm{XTSignal}_i - \left(r + s \cdot i + q \cdot \mathrm{Signal}_i\right)\right)^2\right],$$
wherein i is the PCR cycle number, wherein $\mathrm{Signal}_i$ is the PCR data set of an optical element, wherein $\mathrm{XTSignal}_i$ is the crosstalk data set for an optical element, where r is an offset, where s is a slope, and where q is a multiplicative gain factor.
6. The method of claim 1, further comprising:
-determining a baseline for each PCR data set and each crosstalk data set; and
-subtracting the respective baseline from each data set before determining the crosstalk coefficients.
7. The method of claim 1, further comprising:
-applying the determined crosstalk coefficients to the PCR data set using a linear subtractive model to produce a crosstalk corrected data set.
8. The method of claim 7, wherein the linear subtraction model comprises an equation of the form:
$$f_{iC} = f_i - \sum_{j \neq i} a_{ij} f_j,$$
wherein $f_i$ is the signal measured in channel i and $f_{iC}$ is the signal corrected for crosstalk in channel i, and where the coefficient $a_{ij}$ represents the crosstalk coefficient from channel j to channel i.
9. The method of claim 7, wherein the linear subtraction model comprises an equation of the form:
wherein $f_i$ is the signal measured in channel i and $f_{iC}$ is the signal corrected for crosstalk in channel i, where the coefficient $a_{ij}$ represents the crosstalk coefficient from channel j to channel i, and where r and s are the gain and linear terms, respectively, common to all channels i.
10. The method of claim 7, wherein the linear subtraction model comprises an equation of the form:
wherein $f_i$ is the signal measured in channel i and $f_{iC}$ is the signal corrected for crosstalk in channel i, where the coefficient $a_{ij}$ represents the crosstalk coefficient from channel j to channel i, and where $r_i$ and $s_i$ are, respectively, a gain and a linear term that are different for each channel i.
11. An apparatus for determining crosstalk coefficients for a Polymerase Chain Reaction (PCR) optical detection system having at least two optical elements, each optical element operable to isolate a different specific electromagnetic wavelength range, the apparatus comprising:
-means for acquiring a PCR data set for each optical element over an acquisition range of a PCR growth process;
-means for simultaneously acquiring for each other optical element a crosstalk data set resulting from the overlap of at least two dye spectra in at least two filter ranges, the crosstalk data set being acquired over the acquisition range; and
-means for determining crosstalk coefficients using the PCR data sets and the crosstalk data sets over the acquisition range,
wherein the means for determining the crosstalk coefficient comprises means for minimizing a sum of squares of differences between the PCR data set and the crosstalk data set over the acquisition range.
12. A kinetic Polymerase Chain Reaction (PCR) system comprising:
-an optical detection module having at least two optical elements, each optical element being operable to isolate a different specific electromagnetic wavelength range, wherein the optical detection module is adapted to:
-acquiring a PCR data set for each optical element over an acquisition range of a PCR growth process, the acquisition range comprising a baseline region, a growth region and a plateau region; and
-simultaneously acquiring for each other optical element a crosstalk data set resulting from the overlap of at least two dye spectra in at least two filter ranges, the crosstalk data set being acquired over the acquisition range; and
an intelligence module adapted to process the acquired PCR data sets and crosstalk data sets to determine crosstalk coefficients using the PCR data sets and crosstalk data sets over the acquisition range,
wherein the intelligence module determines the crosstalk coefficient by minimizing a sum of squares of differences between each PCR data set and the crosstalk data set over the acquisition range.
13. A nucleic acid melting analysis system comprising:
-an optical detection module having at least two optical elements, each optical element being operable to isolate a different specific electromagnetic wavelength range, wherein the optical detection module is adapted to:
-for each optical element, acquiring a melting data set over a temperature acquisition range, and simultaneously for each other optical element, acquiring a crosstalk data set resulting from an overlap of at least two dye spectra in at least two filter ranges, the crosstalk data set being acquired over the temperature acquisition range; and
-determining a crosstalk coefficient by minimizing the sum of the squares of the differences between each melting data set and crosstalk data set over the acquisition range.
14. A method of determining a crosstalk coefficient for an optical detection system having at least two optical elements, each optical element operable to isolate a different specific range of electromagnetic wavelengths, the method comprising:
-acquiring a first data set over an acquisition range of the growth process for each optical element;
-simultaneously acquiring for each other optical element a crosstalk data set resulting from the overlap of at least two dye spectra in at least two filter ranges, the crosstalk data set being acquired over the acquisition range; and
-determining a crosstalk coefficient using the first data set and a crosstalk data set over the acquisition range,
wherein determining the crosstalk coefficient comprises minimizing a sum of squares of differences between the PCR data set and the crosstalk data set over the acquisition range.
15. The method of claim 14, wherein the data acquisition range represents a plurality of PCR cycles, and wherein determining the crosstalk coefficient comprises minimizing a sum of absolute values of differences between each PCR data set and the crosstalk data set over the acquisition range.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US94706507P | 2007-06-29 | 2007-06-29 | |
| US60/947,065 | 2007-06-29 | ||
| PCT/EP2008/005255 WO2009003645A1 (en) | 2007-06-29 | 2008-06-27 | Systems and methods for determining cross-talk coefficients in pcr and other data sets |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1140261A1 | 2010-10-08 |
| HK1140261B | 2012-09-28 |