
NL2036275B1 - Systems and methods for morphometric analysis - Google Patents

Systems and methods for morphometric analysis

Info

Publication number
NL2036275B1
Authority
NL
Netherlands
Prior art keywords
cell
cells
morphometric
model
features
Prior art date
Application number
NL2036275A
Other languages
Dutch (nl)
Inventor
L Luengo Hendriks Christian
Zhang Senzeyu
C Carelli Ryan
Original Assignee
Deepcell Inc
Priority date
Filing date
Publication date
Application filed by Deepcell Inc
Priority to PCT/US2024/053235 (WO2025096338A1)
Application granted
Publication of NL2036275B1

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/69Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

Examples of the present disclosure provide systems and methods for assessing image data that include extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; and generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other.

Description

SYSTEMS AND METHODS FOR MORPHOMETRIC ANALYSIS
BACKGROUND
[0001] Analysis of a cell (e.g., determination of a type or a state of the cell) can be accomplished by examining, for example, one or more images of the cell that is tagged (e.g., stained with a polypeptide, such as an antibody, against a target protein of interest within the cell; with a polynucleotide against a target gene of interest within the cell; with probes to analyze the gene expression profile of the cell via polymerase chain reaction; or with a small molecule substrate that is modified by the target protein) or sequencing data of the cell (e.g., gene fragment analysis, whole-genome sequencing, whole-exome sequencing, RNA-seq, etc.). Such methods can be used to identify cell type (e.g., stem cell or differentiated cell) or cell state (e.g., healthy or disease state). Such methods can require treatment of the cell (e.g., antibody staining, cell lysis or sequencing, etc.) that can be time-consuming and/or costly.
SUMMARY
[0002] In view of the foregoing, disclosed herein are alternative methods and systems for analyzing cells (e.g., previously uncharacterized or unknown cells). For example, recognized herein is a need for a method of analyzing cells without pretreatment of the cells to, e.g., tag a target protein or gene of interest in the cells, obtain sequencing data of the cells, etc.
[0003] Example 1. A method of processing, comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
[0004] Example 2. The method of example 1, wherein the generating the plurality of morphometric predictive embeddings comprises:
generating, using the computer vision model and the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features.
[0005] Example 3. The method of example 1, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or combinations thereof.
[0006] Example 4. The method of example 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient greater than approximately 0.9.
[0007] Example 5. The method of example 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient ranging between approximately 0.85 and approximately 0.95.
[0008] Example 6. The method of example 1, wherein the set of cell morphometric features comprises a plurality of blob features.
[0009] Example 7. The method of example 1, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting.
[0010] Example 8. The method of example 1, wherein the plurality of DL embeddings comprises cell morphology information independent from a fixed set of rule-based morphometric features.
[0011] Example 9. The method of example 1, wherein the plurality of DL embeddings comprises cell morphology data orthogonal to a fixed set of rule-based morphometric features.
[0012] Example 10. The method of example 1, further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
[0013] Example 11. The method of example 1, wherein the DL model is a self-supervised machine learning (SSL) system, the method further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model.
[0014] Example 12. The method of example 1, wherein the cell image comprises a label free image.
[0015] Example 13. The method of example 1, further comprising: hosting the DL model and the computer vision model in a cloud computing environment.
[0016] Example 14. The method of example 1, wherein the method is performed in a cloud computing environment.
[0017] Example 15. The method of example 1, further comprising: generating an instruction to sort a cell of the cell image based on the plurality of morphometric predictive embeddings.
[0018] Example 16. The method of example 1, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and feeding data from the sorting back to the DL model in order to train the DL model for future generating of the plurality of DL embeddings.
[0019] Example 17. A method for assessing image data, comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; and generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other.
[0020] Example 18. The method of example 17, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or a combination thereof. [0021] Example 19. The method of example 17, further comprising: generating, using the
DL model, a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
[0022] Example 20. The method of example 19, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features. [0023] Example 21. The method of example 19, further comprising: predicting, using the DL model and at least the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation ranging between approximately 0.85 and approximately 0.95.
[0024] Example 22. The method of example 17, wherein the plurality of DL embeddings comprise cell morphology information orthogonal to the set of morphometric features, and wherein the set of morphometric features are determined using at least a fixed set of rules.
[0025] Example 23. The method of example 17, further comprising: separating, using the
DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
[0026] Example 24. The method of example 17, wherein the DL model is a self-supervised machine learning (SSL) system, the method comprising: de-correlating, using the DL model, the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
[0027] Example 25. The method of example 17, wherein the image data comprises a label free image of each cell of the plurality of cells.
[0028] Example 26. The method of example 17, further comprising: hosting the DL model and the computer vision model in a cloud computing environment.
[0029] Example 27. The method of example 17, further comprising: generating an instruction to sort the plurality of cells using the plurality of DL embeddings.
[0030] Example 28. The method of example 17, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and feeding data from the sorting to the DL model in order to train the DL model for future generating of the plurality of DL embeddings.
[0032] Example 29. A system for analyzing image data, the system comprising: at least one processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using at least the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
[0033] Example 30. The system of Claim 29, wherein the set of cell morphometric features comprises a plurality of blob features.
[0034] Example 31. The system of Claim 29, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting. [0035] Example 32. The system of Claim 29, wherein the plurality of DL embeddings comprise cell morphology data orthogonal to a fixed set of rule-based morphometric features.
[0036] Example 33. The system of Claim 29, the operations further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
[0037] Example 34. The system of Claim 29, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
[0038] Example 35. The system of Claim 29, wherein the at least one processor is in a cloud computing environment.
[0039] Example 36. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations for analyzing image data of a cell image, the operations comprising: extracting, using a trained Deep Learning (DL) model and from image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted by a trained computer vision model; generating, using the trained DL model and the ML-based features, a plurality of DL embeddings orthogonal to each other; and extracting, using the trained computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings.
[0040] Example 37. A cloud-based computing system, the system comprising: at least one cloud-based processor to execute the instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image;
generating, using the DL model and using at least the set of ML-based features, a plurality of
DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
[0041] Example 38. The system of Claim 37, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features.
[0042] Example 39. The system of Claim 37, wherein the DL model is trained using a loss function comprising one or more of invariance, variance, covariance, and morphometric decorrelation terms.
[0043] Example 40. The system of Claim 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient greater than approximately 0.9.
[0044] Example 41. The system of Claim 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient ranging between approximately 0.85 and approximately 0.95.
[0045] Example 42. The system of Claim 37, the operations further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
[0046] Example 43. The system of Claim 37, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model.
[0047] Example 44. A method for cell sorting, comprising:
transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the one or more images, the cell morphometric features being orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of morphometric predictive embeddings.
[0048] Example 45. A method for cell sorting, comprising: transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of DL embeddings.
[0049] Example 46. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image;
generating, using the DL model and using at least the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and enabling sorting of the cell using at least one of the plurality of morphometric predictive embeddings, the plurality of DL embeddings, and the set of cell morphometric features.
[0050] Example 47. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and enabling sorting of the cell using at least one of the plurality of DL embeddings and the set of cell morphometric features.
[0051] Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
[0052] The novel features of the disclosure are set forth with particularity in the appended claims.
A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings (also “Figure” and “FIG.” herein), of which:
[0053] FIG. 1 illustrates an example workflow of extracting features associated with cell morphology from cell images using the human foundation model, in accordance with some examples of the present disclosure.
[0054] FIG. 2A illustrates an example interaction between a microfluidics platform, the human foundation model, and the data suite, in accordance with some examples of the present disclosure.
[0055] FIG. 2B illustrates an example workflow from high-throughput imaging to cell characterization, classification and sorting based on cell morphology analysis, in accordance with some examples of the present disclosure.
[0056] FIG. 3 schematically illustrates an example method for classifying a cell.
[0057] FIG. 4 schematically illustrates, in one example, different ways of representing analysis data of image data of cells.
[0058] FIG. 5 schematically illustrates, in one example, different representations of analysis of image data of a population of cells.
[0059] FIG. 6 schematically illustrates, in one example, a method for a user to interact with a method for analyzing image data of cells.
[0060] FIG. 7 schematically illustrates, in one example, a cell analysis platform for analyzing image data of one or more cells.
[0061] FIGS. 8A-8B schematically illustrate, in one example, an example microfluidic system for sorting one or more cells.
[0062] FIG. 9 illustrates an example training architecture of the human foundation model.
[0063] FIG. 10 illustrates another example of the training architecture of the human foundation model shown in FIG. 9.
[0064] FIGS. 11A and 11B show examples of morphometric features of cellular images.
[0065] FIG. 12A illustrates cell classes, numbers of images used as training dataset to train the human foundation model, numbers of images processed by the human foundation model as test dataset, and corresponding representative cell images, in accordance with some examples of the present disclosure.
[0066] FIG. 12B illustrates an example confusion matrix between predicted cell classes classified by the human foundation model and actual cell classes, in accordance with some examples of the present disclosure.
[0067] FIG. 13, with views (a) to (f), schematically illustrates an example system for classifying and sorting one or more cells.
[0068] FIG. 14, with views (a) to (e), schematically illustrates operations that can be performed in an example method.
[0069] FIG. 15 shows, in one example, a computer system that is programmed or otherwise configured to implement methods provided herein.
[0070] FIG. 16 illustrates an example flow of operations in a method of processing images.
[0071] FIG. 17 illustrates an example flow of operations in a method of processing images.
INCORPORATION BY REFERENCE
[0072] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference in their entirety, to the same extent as if each individual publication, patent, or patent application is specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the disclosure contained in the specification, the specification is intended to supersede and/or take precedence over any such contradictory material.
DETAILED DESCRIPTION
[0073] While various examples provided herein have been shown and described herein, it will be obvious to those skilled in the art that such examples are provided by way of illustration only.
Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the present disclosure. It should be understood that various alternatives to the examples of the present disclosure described herein can be employed.
[0074] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present disclosure belongs. In case of conflict, the present application including the definitions will control. Also, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
[0075] Whenever the term “at least,” “greater than,” or “greater than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “at least,” “greater than,” or “greater than or equal to” applies to each of the numerical values in that series of numerical values.
For example, greater than or equal to about 1, about 2, or about 3 is equivalent to greater than or equal to about 1, greater than or equal to about 2, or greater than or equal to about 3.
[0076] Whenever the term “no more than,” “less than,” or “less than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “no more than,” “less than,” or “less than or equal to” applies to each of the numerical values in that series of numerical values. For example, less than or equal to about 3, about 2, or about 1 is equivalent to less than or equal to about 3, less than or equal to about 2, or less than or equal to about 1.
[0077] The term “morphology” or “morphological characteristic” or “morphological feature” of a cell as used herein generally refers to the form, structure, and/or configuration of the cell. The term “morphometric feature” is intended to refer to a quantitative representation of a morphological feature of a cell, and in some cases the terms “morphological feature” and “morphometric feature” are used interchangeably herein. The morphology of a cell can comprise one or more aspects of a cell's appearance, such as, for example, shape, size, arrangement, form, structure, pattern(s) of one or more internal and/or external parts of the cell, or shade (e.g., color, greyscale, etc.). Non-limiting examples of a shape of a cell can include, but are not limited to, circular, elliptic, shmoo-like, dumbbell, star-like, flat, scale-like, columnar, invaginated, having one or more concavely formed walls, having one or more convexly formed walls, elongated, having appendages, having cilia, having angle(s), having corner(s), etc. A morphological feature of a cell can be visible with treatment of the cell (e.g., small molecule or antibody staining). In another example, the morphological feature of the cell need not require any treatment to be visualized in an image or video.
[0078] The terms “unstructured” or “unsorted,” as used interchangeably herein, generally refer to a mixture of cells (e.g., an initial mixture of cells) that is not substantially sorted (or rearranged) into separate partitions. An unstructured population of cells can comprise at least two types of cells that can be distinguished by exhibiting different properties (e.g., one or more physical properties, such as one or more different morphological characteristics as disclosed herein). The unstructured population of cells can be a random (or randomized) mixture of the at least two types of cells. The cells as disclosed herein can be viable cells. A viable cell, as disclosed herein, can be a cell that is not undergoing necrosis or a cell that is not in an early or late apoptotic state. Assays for determining cell viability can include, e.g., propidium iodide (PI) staining, which can be detected by flow cytometry. In another example, the cells need not be viable (e.g., fixed cells).
[0079] A “viable cell” as disclosed herein can be characterized by exhibiting one or more characteristics (e.g., morphology, one or more gene expression profiles, etc.) that is substantially unaltered (or that is not substantially impacted) by any operation or process of the methods disclosed herein (e.g., partitioning). In some examples, a characteristic of a viable cell can be a gene transcript accumulation rate, which can be characterized by a change in transcript levels of a same gene (e.g., a same endogenous gene) between mother and daughter cells over the time between cell divisions, as ascertained by single cell sequencing, polymerase chain reaction (PCR), etc.
[0080] The term “high throughput,” when referring to a platform, system, model, and the like, means that such a platform, system, model, etc., is capable of generating an embedding for at least one image within a desired time, such as but not limited to approximately 5 ms to 30 ms. In some aspects, a high-throughput setting can also include components to process approximately 10,000 frames/sec and/or approximately 1,000 images/sec while being configured to correct per-pixel variation in background offset, camera gain, and/or illumination for the processed frames. As used herein, “high-throughput” systems can require relatively low latency.
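As one hypothetical, non-limiting illustration of the per-pixel correction mentioned above, the following Python sketch subtracts a dark-frame background offset and normalizes by a flat field to compensate camera gain and illumination variation; the calibration frames and function names are assumptions for illustration only and do not describe a specific implementation of the present disclosure.

```python
# Hypothetical sketch of per-pixel correction in a high-throughput imaging path.
# The calibration frames (dark, flat) are assumed to be acquired beforehand.
import numpy as np

def correct_frame(raw: np.ndarray, dark: np.ndarray, flat: np.ndarray) -> np.ndarray:
    """Correct per-pixel background offset, camera gain, and illumination."""
    # Subtract the per-pixel background offset (dark frame), then divide by the
    # normalized flat field to compensate gain and illumination falloff.
    flat_norm = flat / flat.mean()
    corrected = (raw.astype(np.float32) - dark) / np.clip(flat_norm, 1e-6, None)
    return np.clip(corrected, 0, None)

# Usage (assumed): for each frame f in a camera stream,
# corrected = correct_frame(f, dark_frame, flat_frame)
```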
[0081] Relative terms, such as “about,” “substantially,” or “approximately” are used to include small variations from specific numerical values (e.g., +/- x%), as well as the situation of no variation (+/- 0%). In some examples, the numerical value x is less than or equal to 10 — e.g., less than or equal to 5, to 2, to 1, or smaller.
[0082] The term “real time” or “real-time,” as used interchangeably herein, generally refers to an event (e.g., an operation, a process, a method, a technique, a computation, a calculation, an analysis, an optimization, etc.) that is performed using recently obtained (e.g., collected or received) data.
Examples of the event may include, but are not limited to, analysis of one or more images of a cell to classify the cell, updating one or more deep learning algorithms (e.g., neural networks) for classification and sorting, controlling actuation of one or more valves at a sorting bifurcation, etc.
In some examples, a real time event can be performed almost immediately or within a short enough time span, such as within at least about 0.0001 ms, at least about 0.0005 ms, at least about 0.001 ms, at least about 0.005 ms, at least about 0.01 ms, at least about 0.05 ms, at least about 0.1 ms, at least about 0.5 ms, at least about 1 ms, at least about 5 ms, at least about 0.01 seconds, at least about 0.05 seconds, at least about 0.1 seconds, at least about 0.5 seconds, at least about 1 second, or more. In some examples, a real time event can be performed almost immediately or within a short enough time span, such as within at most about 1 second, at most about 0.5 seconds, at most about 0.1 seconds, at most about 0.05 seconds, at most about 0.01 seconds, at most about 5 ms, at most about 1 ms, at most about 0.5 ms, at most about 0.1 ms, at most about 0.05 ms, at most about 0.01 ms, at most about 0.005 ms, at most about 0.001 ms, at most about 0.0005 ms, at most about 0.0001 ms, or less. In some examples, any of the operations of a computer processor as provided herein can be performed (e.g., automatically performed) in real-time.
[0083] As used herein, an “encoder” refers to a type of deep learning model that transforms or “encodes” an image into a vector.
[0084] Morphology is an important cell property associated with identity, state, and function, but in some instances it is characterized crudely in a few standard dimensions such as diameter, perimeter,
or area, or with subjective qualitative descriptions. In comparison, the present disclosure provides a method of processing that includes using a machine learning encoder to extract a set of ML-based features from a cell image, using a computer vision encoder to extract a set of cell morphometric features from the cell image, and using the set of ML-based features and the set of cell morphometric features to generate a feature vector that represents the morphology of the cell. The feature vector can be used in a variety of practical applications, e.g., in a manner such as described in the nonlimiting examples provided herein.
[0085] In one example, tumors are composed of heterogeneous assortments of cells with distinct genetic and phenotypic characteristics that may drive therapeutic resistance, immune evasion, and disease progression. The advent of single cell technologies has enabled deep profiling of individual cells within a tumor microenvironment, leading to a better understanding of tumor biology and subsequently more effective cancer treatment strategies. While profiling technologies such as flow cytometry and single cell sequencing yield insight on tumor composition, cells are sometimes no longer amenable to additional downstream studies after being subjected to antibody staining or destructive analytical processes such as cell lysis. Current sorting methods such as fluorescence-activated cell sorting (FACS) rely on a limited set of biomarkers, which cannot cover the full extent of, or be readily available for, all distinct cell properties. Additionally, dependence on antibodies, dyes/stains, and biomarkers to denote cell identity may inadvertently create sampling bias by depleting biomarker-negative but potentially biologically interesting cell populations.
[0086] Cell morphology information has historically been used for cell and disease characterization but has been difficult to quantify objectively and reproducibly. Cell morphology is in many instances studied qualitatively through microscopes, which can be inherently slow, difficult to scale, and reliant on human interpretation.
[0087] The present disclosure provides multi-dimensional morphology analysis (e.g., profiling) enabled by machine learning and computer vision morphometrics. The present disclosure has the benefit of enabling higher resolution and biological insight while reducing labor-intensive cell processing manipulations. The multi-dimensional morphology profiling and sorting of unlabeled single cells using machine learning, advanced imaging, and microfluidics can be used to assess population heterogeneity beyond biomarkers.
[0088] In some examples, the present disclosure provides a method for cell morphology analysis.
In some examples, the method may combine deep learning and computer vision methods to extract features from cell images. A deep learning model used in the method may provide quantitative descriptions of cell features using one or more neural networks. A computer vision model used in the method may provide a quantitative assessment of cell and biological features using discrete image analysis algorithms. The method as described herein may allow for extracting and interpreting cell morphology features with a multidimensional, unbounded, and quantitative assessment.
[0089] In some examples, the present disclosure provides a system for cell morphology analysis.
In some examples, the system may comprise a benchtop single-cell imaging and sorting system for high-dimensional morphology analysis. The system may combine label-free imaging, deep learning, computer vision morphometrics, and gentle cell sorting to leverage multidimensional single cell morphology as a quantitative readout. The system may capture high-resolution brightfield cell images, from which features (e.g., dimensional embedding vectors) can be extracted representing the morphology of the cells.
[0090] The system and method may combine label-free imaging, deep learning, computer vision morphometrics, and gentle cell sorting to harness multi-dimensional single cell morphology as a quantitative biological readout. The systems and methods disclosed herein have a variety of potential uses. For example, the combination of deep learning and computer vision morphometrics may allow cell characterization and sorting based on multi-dimensional morphometric and deep learning derived features, which can be used to identify and enrich cancer cells in heterogeneous populations.
Quantitative multi-dimensional morphology information at the single cell level may provide additional information to resolve cancer heterogeneity. The system and method may extract pigmentation features from the morphology profiling and, based on these features, assess melanoma cells. Cell populations characterized by specific morphological profiles can have distinct molecular profiles, and morphologically distinct cells (e.g., normal vs. tumor) can be distinguished from one another.
[0091] The system and method may provide a relatively fast workflow for cell morphology analysis. For example, it may only take a few hours from preparing cell samples to generating publishable figures representing cell morphology. The systems and methods can be used in a variety of applications including but not limited to cancer research, developmental biology, cell and gene therapies, and drug and functional screening. In some examples, the systems and methods as described herein can be used in drug and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) perturbation screening, using cell morphology as a novel biomarker for the screening. The systems and methods as described herein can be used in sample-level profiling, including but not limited to heterogeneous sample evaluation and characterization, disease detection and enrichment, and sample clean-up. In other examples, the systems and methods as described herein can be used in cell-level phenotyping, including cell health status, cell state characterization, and multi-omic integration.
Human Foundation Model (HFM)
[0092] In some examples, the present disclosure may provide a human foundation model (“HFM”) for cell morphology analysis (e.g., profiling). The model may combine a deep learning model and a computer vision model and extract cell features from cell images. In some examples, the deep learning model may process cell images as input and provide quantitative descriptions of cell features. The deep learning model may extract deep learning features that are information-rich metrics of cell morphology with powerful discriminative capabilities. The deep learning features may not be human-interpretable. The computer vision model may process cell images as input and provide morphometric features that are human-interpretable, quantitative metrics of cell morphology including cell size, shape, texture, and intensity. The morphometrics can be computationally generated using discrete computer vision algorithms. When some of the morphometrics are too computationally intensive to compute in real time, the deep learning model may overcome the limitation of the computer vision model by imputing the most computationally intensive morphometrics into the human foundation model. By combining deep learning and morphometric features, the human foundation model as described herein may provide both accuracy and interpretability in real-time feature extraction, cell classification and sorting. The human foundation model may also have strong generalization capabilities that enable hypothesis-free sample exploration and efficient generation of application-specific models.
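As one hypothetical, non-limiting sketch of how computationally intensive morphometrics might be imputed from deep learning features, the following Python example fits a simple regression head from per-cell embeddings to "blob"-type features and reports a per-feature correlation; the synthetic data, the Ridge regressor, and the train/test split are assumptions for illustration only.

```python
# Hedged sketch: impute expensive morphometric ("blob") features from
# deep-learning embeddings so they need not be recomputed in real time.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
dl_embeddings = rng.normal(size=(1000, 64))                       # per-cell DL embeddings
blob_features = dl_embeddings[:, :3] @ rng.normal(size=(3, 5))    # stand-in targets

# Fit a simple linear head mapping embeddings -> expensive morphometrics.
head = Ridge(alpha=1.0).fit(dl_embeddings[:800], blob_features[:800])
pred = head.predict(dl_embeddings[800:])

# Report per-feature Pearson correlation between predicted and computed values
# (the disclosure discusses correlations around 0.85-0.95 or above).
for j in range(pred.shape[1]):
    r = np.corrcoef(pred[:, j], blob_features[800:, j])[0, 1]
    print(f"blob feature {j}: r = {r:.2f}")
```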
[0093] FIG. 1 illustrates an example workflow of extracting features associated with cell morphology from cell images using the human foundation model, in accordance with some examples of the present disclosure. The human foundation model may process cell images 110 and generate features therefrom. In some examples, cells that are under analysis can be unstained, and the cell images 110 can be brightfield cell images.
[0094] In some examples, the human foundation model may comprise a deep learning model 120 and a computer vision model 130. The deep learning model 120 may comprise a deep learning encoder, for example, a convolutional neural network. The deep learning model 120 may process cell images 110 as input and extract artificial intelligence (AI) features 140 therefrom. In some examples, the AI features 140 may comprise deep learning features 160, e.g., features that are extracted using a deep learning algorithm, such as a convolutional neural network, with other nonlimiting examples being provided elsewhere herein. In some examples, the dimensions of the deep learning features can be in a range of between about 1 and about 10, between about 1 and about 20, between about 1 and about 50, between about 1 and about 80, between about 1 and about 100, between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, or between about 1 and about 100,000, or any value between any of the aforementioned numbers. Any range of numbers of the deep learning features can be contemplated; for example, the number may be at least about 1 feature — e.g., at least about 5, at least about 10, at least about 50, at least about 100, at least about 500, at least about 1,000, at least about 5,000, at least about 10,000, at least about 50,000, at least about 100,000, or more, features. In another example, the number may be up to about 100,000 features — e.g., up to about 50,000 features, up to about 10,000, up to about 5,000, up to about 1,000, up to about 500, up to about 100, up to about 50, up to about 10, up to about 5, up to about 1, or smaller, features. Other suitable numbers are also possible. As one example, the deep learning model 120 may extract between about 5 and about 1000 deep learning features, e.g., between about 10 and about 500 deep learning features, e.g., between about 50 and about 100 deep learning features, from each cell image. In some examples, in a data set comprising a plurality of deep learning features of the cell(s), each feature can be referred to as a dimension (e.g., a deep learning dimension). Any range of dimensions of the deep learning features can be contemplated, for example from 1 through any number greater than about 100,000. As illustrated in FIG. 1, as one nonlimiting example, the deep learning model 120 generates about 64-dimensional deep learning features 160.
[0095] In some examples, the computer vision model 130 may comprise a computer vision encoder including human-constructed algorithms, which in some cases can be referred to as “rule-based morphometrics.” The computer vision model 130 may process cell images 110 as input and extract cell features 150 therefrom. In some examples, the cell features 150 may comprise cell position, cell shape, pixel intensity, texture, focus, or combinations thereof. The cell features 150 may comprise morphometric features 170. Nonlimiting examples of morphometric features 170 are provided below in Table 1. The dimensions of the morphometric features 170 can be in a range of between about 1 and about 10, between about 1 and about 20, between about 1 and about 50, between about 1 and about 80, between about 1 and about 100, between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, or between about 1 and about 100,000, or any value between any of the aforementioned numbers. The cell features 150 may include any suitable number of morphometric features 170, for example, at least about 1 feature, at least about 5 features, at least about 10 features, at least about 50 features, at least about 100 features, at least about 500 features, at least about 1,000 features, at least about 5,000 features, at least about 10,000 features, at least about 50,000 features, or at least about 100,000 features. In some cases, in a data set comprising a plurality of computer vision features of the cell(s), each feature can be referred to as a dimension (e.g., a computer vision-based dimension). Any range of dimensions of the morphometric features can be contemplated, for example from 1 through any number greater than 100,000. As one example, the computer vision model 130 may extract between about 5 and about 1000 morphometric features, e.g., between about 10 and about 500 morphometric features, e.g., between about 50 and about 100 features, and any values in between any of the aforementioned ranges, from each cell image. As illustrated in FIG. 1, in one nonlimiting example, the computer vision model generates about 51-dimensional morphometric features 170.
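As one hypothetical, non-limiting sketch of rule-based morphometrics of the kind computed by a computer vision encoder, the following Python example uses scikit-image to segment a single-cell brightfield image and report a few interpretable size, shape, and intensity features; the Otsu thresholding step and the specific properties chosen are assumptions for illustration only and do not represent the computer vision model 130 itself.

```python
# Hedged sketch of rule-based morphometrics with scikit-image.
import numpy as np
from skimage import filters, measure

def morphometric_features(cell_image: np.ndarray) -> dict:
    """Extract a few interpretable size/shape/intensity features from a
    single-cell image (2D grayscale array)."""
    mask = cell_image > filters.threshold_otsu(cell_image)    # crude segmentation
    labels = measure.label(mask)
    props = max(measure.regionprops(labels, intensity_image=cell_image),
                key=lambda p: p.area)                          # largest blob = cell
    return {
        "area": props.area,
        "perimeter": props.perimeter,
        "eccentricity": props.eccentricity,
        "mean_intensity": props.mean_intensity,
    }
```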
[0096] In some examples, the human foundation model may encode the deep learning features 160 and morphometric features 170 into multidimensional numerical vectors representing the cell morphology. In one nonlimiting example, the 64-dimensional deep learning features 160 and 51-dimensional morphometric features 170 can be encoded into 115-dimensional embedding vectors representing the cell morphology. Further details regarding example data structures (e.g., multi-dimensional vectors) for encoding morphometric features 170 are provided further below.
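As one hypothetical, non-limiting sketch of such an encoding, the following Python example concatenates per-cell deep learning features and morphometric features into a single 115-dimensional embedding vector; the random placeholder data and the standard-scaling step are assumptions for illustration only.

```python
# Minimal sketch: concatenate 64 DL dimensions + 51 morphometric dimensions = 115.
import numpy as np
from sklearn.preprocessing import StandardScaler

dl_features = np.random.rand(500, 64)        # from the deep learning encoder
morpho_features = np.random.rand(500, 51)    # from the computer vision model

embedding = np.concatenate(
    [dl_features, StandardScaler().fit_transform(morpho_features)], axis=1
)
assert embedding.shape == (500, 115)         # one 115-dimensional vector per cell
```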
[0097] In some examples, the human foundation model may generate one or more morphology maps based on deep learning features 160, morphometric features 170, or combinations thereof, and in some examples based on a plurality of deep learning features and a plurality of morphometric features (e.g., based on a multi-dimensional vector that represents morphology of a cell). A cell morphology map can be a visual (e.g., graphical) representation of one or more clusters of datapoints. The cell morphology map can be a 1-dimensional (1D) representation (e.g., based on one morphological property as one parameter or dimension) or a multi-dimensional representation, such as a 2-dimensional (2D) representation (e.g., based on two morphological properties as two parameters or dimensions), a 3-dimensional (3D) representation (e.g., based on three morphological properties as three parameters or dimensions), a 4-dimensional (4D) representation, etc. In some examples, one morphological property of a plurality of morphological properties used for plotting the cell morphology map can be represented as a non-axial parameter (e.g., not the x, y, or z axis), such as distinguishable colors (e.g., a heatmap), numbers, letters (e.g., texts of one or more languages), and/or symbols (e.g., a square, oval, triangle, etc.). For example, a heatmap can be used as a colorimetric scale to represent the classifier prediction percentages for each cell against a cell class, cell type, or cell state.
[0098] The cell morphology map can be generated based on one or more morphological features (e.g., characteristics, profiles, fingerprints, etc.) from the processed image data. Non-limiting examples of one or more morphological properties of a cell, as disclosed herein, that can be extracted from one or more images of the cell may include, but are not limited to, (i) shape, curvature, size (e.g., diameter, length, width, circumference), area, volume, texture, thickness, roundness, etc. of the cell or one or more components of the cell (e.g., cell membrane, nucleus, mitochondria, etc.), (ii) number or positioning of one or more contents (e.g., nucleus, mitochondria, etc.) of the cell within the cell (e.g., center, off-centered, etc.), and (iii) optical characteristics of a region of the image(s) (e.g., unique groups of pixels within the image(s)) that correspond to the cell or a portion thereof (e.g., light emission, transmission, reflectance, absorbance, fluorescence, luminescence, etc.).
[0099] One or more dimensions of the cell morphology map can be represented by various approaches (e.g., dimensionality reduction approaches), such as, for example, principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP). For example, UMAP can be a machine learning technique for dimension reduction. UMAP can be constructed from a theoretical framework based in Riemannian geometry and algebraic topology. UMAP can be utilized as a practical, scalable algorithm that applies to real world data, such as morphological properties of one or more cells.
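As one hypothetical, non-limiting sketch of generating a 2D cell morphology map from per-cell embedding vectors, the following Python example uses the umap-learn package and matplotlib; the parameter values and placeholder data are assumptions for illustration only.

```python
# Hedged sketch of a 2D morphology map built with UMAP.
import numpy as np
import umap
import matplotlib.pyplot as plt

embeddings = np.random.rand(2000, 115)               # per-cell embedding vectors
coords = umap.UMAP(n_neighbors=15, min_dist=0.1,
                   random_state=42).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], s=2)
plt.xlabel("UMAP 1")
plt.ylabel("UMAP 2")
plt.title("Cell morphology map")
plt.show()
```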
Training of Deep Learning Model of HFM
[0100] The deep learning model of the human foundation model can be trained using a plurality of cell images from different types of biological samples and thus be able to detect differences in cell morphology without labeled training data. In some examples, the deep learning model 120 of the human foundation model may be trained using any suitable number of images of cells, for example between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, between about 1 and about 100,000, between about 1 and about 200,000, between about 1 and about 500,000, between about 1 and about 800,000, between about 1 and about 1,000,000, between about 1 and about 2,000,000, between about 1 and about 5,000,000, between about 1 and about 8,000,000, or between about 1 and about 10,000,000 images of cells. Any range of the number of cell images as training dataset can be contemplated, for example from about 1 through any number greater than about 10,000,000. As one example, the deep learning model 120 of the human foundation model is trained using a training dataset that includes at least about 10,000 images of cells — e.g., at least about 100,000 images of cells, at least about 1,000,000 images of cells, at least about 5,000,000 images of cells, at least about 10,000,000 images of cells, at least about 100,000,000 images of cells, at least about 1 billion, or more, images of cells. For example, the deep learning model 120 can be trained using between about 5,000,000 and about 1 billion images of cells. The training set may include, may consist essentially of, or in some examples may consist of, images of cells that are not physically stained and that are not computationally labeled in any manner. As such, in some examples the deep learning model 120 learns to recognize features from the cell images in a self-supervised manner.
[0101] In some examples, the human foundation model may comprise parameters in a range of between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about
20,000, between about 1 and about 50,000, between about 1 and about 80,000, between about 1 and about 100,000, between about 1 and about 200,000, between about 1 and about 500,000, between about 1 and about 800,000, between about 1 and about 1,000,000, between about 1 and about 2,000,000, between about 1 and about 5,000,000, between about 1 and about 8,000,000, between about 1 and about 10,000,000, between about 1 and about 20,000,000, between about 1 and about 50,000,000, between about 1 and about 80,000,000, between about 1 and about 100,000,000, or between about 1 and about 500,000,000. Any range of the number of parameters can be contemplated, for example from 1 through any number greater than 500,000,000. Some or all of the parameters can be optimized during the training process. For example, a neural network may include millions or billions of floating-point numbers connected by mathematical operations. These numbers in some instances can be called "parameters" or “weights”. In some examples, parameters are adjusted ("trained") to transform an image of a cell into a vector (for example classification probabilities, or feature vector, depending on the use-case of the neural network). In some examples, a neural network for computer vision applications such as provided herein may have a number of parameters ranging from 1 million to upwards of 10 billion.
[0102] In some examples, the deep learning model (e.g., backbone model) of the human foundation model, which extracts image features, can be based on a convolutional neural network architecture, a vision transformer architecture, or both. The training process may apply a self-supervised learning approach that learns image features without labels and generates deep learning embeddings (vectors) that are orthogonal to each other and orthogonal to morphometric features. As used herein, embeddings that are “orthogonal” can be perpendicular to another embedding vector or set of embedding vectors. For example, vectors are considered to be orthogonal to each other if they are at right angles in n-dimensional space, where n is the size or number of elements in each vector. In some examples, “orthogonal” embeddings can have a covariance of about 0 and can be perfectly or completely orthogonal (e.g., have exactly a covariance of 0) or be substantially orthogonal with a covariance that is greater than but close to 0. In some examples, “orthogonal” embeddings include features that are “independent” of one another, meaning that the presence or absence of one feature does not affect the presence or absence of any other feature. For example, a vector is orthogonal to another vector if their dot product is zero.
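As one hypothetical, non-limiting sketch of a self-supervised loss with invariance, variance, covariance, and morphometric decorrelation terms (in the spirit of VICReg-style objectives), the following PyTorch example is an assumption-laden illustration; the exact terms, weights, and architecture used in the disclosure are not specified here.

```python
# Hedged PyTorch sketch: VICReg-style loss extended with a cross-covariance
# term that pushes DL embeddings to be decorrelated from rule-based
# morphometrics. Weights and terms are illustrative assumptions.
import torch
import torch.nn.functional as F

def ssl_loss(z_a, z_b, morpho, eps=1e-4, w=(25.0, 25.0, 1.0, 1.0)):
    n, d = z_a.shape
    invariance = F.mse_loss(z_a, z_b)                      # same cell, two augmented views

    std_a = torch.sqrt(z_a.var(dim=0) + eps)               # keep per-dimension variance up
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    variance = F.relu(1 - std_a).mean() + F.relu(1 - std_b).mean()

    z_c = z_a - z_a.mean(dim=0)                            # decorrelate embedding dimensions
    cov = (z_c.T @ z_c) / (n - 1)
    covariance = (cov - torch.diag(torch.diag(cov))).pow(2).sum() / d

    m_c = morpho - morpho.mean(dim=0)                       # decorrelate from morphometrics
    cross = (z_c.T @ m_c) / (n - 1)
    morpho_decorr = cross.pow(2).mean()

    return w[0]*invariance + w[1]*variance + w[2]*covariance + w[3]*morpho_decorr
```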
Examples of Machine Learning Models for Feature Extraction
[0103] In some examples, analysis of imaging data as disclosed herein (e.g., particle imaging data, such as cell imaging data) can be performed using artificial intelligence, such as one or more machine learning algorithms.
[0104] The machine learning model (e.g., a metamodel) can be trained by using a learning model and applying learning algorithms (e.g., machine learning algorithms) on a training dataset (e.g., a dataset comprising unlabeled cell images). In some examples, given a set of training examples/cases, each marked for belonging to a specific class (e.g., a specific cell type or class), a training algorithm may build a machine learning model capable of assigning features within images of cells into one category or the other, e.g., to make the model a non-probabilistic machine learning model. In some examples, the machine learning model can be used to create a new category and assign new examples/cases into the new category. In some examples, a machine learning model can be the actual trained model that is generated based on the training model.
[0105] The machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell. The machine learning algorithm may form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell. In some examples, replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data. In some examples, processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training. In some examples, any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user. When a cell analysis platform as disclosed herein is operatively coupled to one or more databases comprising non-morphological data of the processed cells (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such a selected population of cells.
[0106] Non-limiting examples of machine learning algorithms for training a machine learning model may include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning (also referred to as self-supervised learning), feature learning,
anomaly detection, association rules, etc. In some examples, a machine learning model can be trained by using one or more learning models on such training dataset. Non-limiting examples of learning models may include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural networks, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random fields, random forests, ensembles of machine learning models, minimum complexity machines (MCM), probably approximately correct (PAC) learning, etc.
[0107] In some examples, the neural networks are designed by modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks. In some examples, the enhanced neural networks are designed by modification of ResNet (e.g., ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or Inception networks. In some examples, the modification comprises a series of network surgery operations that are carried out mainly to improve inference time and/or inference accuracy.
[0108] The neural network can be used together with a Vision Transformer. Vision Transformers and their use in encoding images are described in Dosovitskiy et al., “An image is worth 16x16 words:
Transformers for image recognition at scale,” International Conference on Learning Representations (ICLR) (2021) (21 pages available at arxiv.org/abs/2010.11929), the entire contents of which are incorporated by reference herein.
[0109] The machine learning algorithm as disclosed herein may utilize one or more clustering algorithms to determine that objects (e.g., features) in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters. Non-limiting examples of the clustering algorithms may include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g. K-means algorithm), distribution models (e.g., expectation- maximization algorithm), density models (e.g., density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS)), subspace models (e.g., biclustering), group models, graph-based models (e.g., highly connected subgraphs (HCS) clustering algorithms), single graph models, and neural models (e.g., using unsupervised neural network). The machine learning algorithm may utilize a plurality of models, e.g., in equal weights or in different weights. In some examples, the graph-based models may include graph-based clustering algorithms that use modularity, e.g., such as described in the following references, the entire contents of each of which are incorporated by reference herein: Blondel et al., “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment 2008: P10008 (2008); and Traag et al. “From Louvain to Leiden: guaranteeing well-connected communities,” Scientific
Reports 9: 5233, 12 pages, (2019).
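As one illustrative sketch (not the platform's implementation), the centroid-model family listed above can be exercised with scikit-learn's KMeans on hypothetical embedding vectors; the embedding matrix, cluster count, and variable names are assumptions made for the example:

```python
# Illustrative sketch: clustering cell embeddings with a centroid model (K-means).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
embeddings = rng.normal(size=(500, 64))          # 500 cells, 64-dimensional embeddings (synthetic)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(embeddings)  # one cluster label per cell

print(np.bincount(cluster_labels))               # number of cells assigned to each cluster
```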
[0110] In some examples, unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells (extracting features from cells). In the unsupervised example, an embedding for a cell image can be generated. For example, the embedding can be a representation of the image in a space with fewer dimensions than the original image data. Such embeddings can be used to cluster images that are similar to one another. Thus, the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
[0111] In some examples of self-supervised learning, additional meta information (e.g., additional non-morphological information) about the sample (e.g., what disease is known or associated with the patient who provided the sample) can be used for labeling of image data of cells.
[0112] In some examples, embedding generation may use a neural net trained on predefined cell types. To generate the embeddings described herein, an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used.
[0113] In some examples, by providing enough diversity in image data or sample data to the trained model, this method may have a benefit of providing an accurate way to cluster future cells.
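A minimal sketch of the intermediate-layer embedding approach described above, assuming a PyTorch/torchvision ResNet-18 backbone (torchvision 0.13+ API) as a stand-in for a network trained on predefined cell types; the weights, image size, and batch are placeholders:

```python
# Hedged sketch: read embeddings from an intermediate layer (pooled features before the
# final classification layer) of a backbone network.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=None)          # in practice, weights trained on cell images would be loaded
feature_extractor = nn.Sequential(*list(backbone.children())[:-1])  # drop the classification head
feature_extractor.eval()

with torch.no_grad():
    cell_images = torch.rand(8, 3, 224, 224)      # a batch of 8 illustrative cell images
    embeddings = feature_extractor(cell_images).flatten(1)  # shape: (8, 512)
print(embeddings.shape)
```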
[0114] In some examples, embedding generation may use neural nets trained for different tasks. To generate the embeddings described herein, an intermediate layer of a neural net that is trained for a different task (e.g., a neural net that is trained on a canonical dataset such as ImageNet) can be used. Without wishing to be bound by any particular theory, this can allow the system to focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
[0115] In some examples, autoencoders can be used for embedding generation. To generate the embeddings described herein, autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings. The squeeze layer may force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
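The following is a hedged sketch of an autoencoder whose bottleneck ("squeeze") layer yields the embedding; the layer sizes, the 64x64 grayscale input, and the class name CellAutoencoder are assumptions for illustration only:

```python
# Illustrative autoencoder: the encoder's final (squeeze) layer produces a compact embedding,
# and the decoder reconstructs an approximation of the input image from that embedding.
import torch
import torch.nn as nn

class CellAutoencoder(nn.Module):
    def __init__(self, embedding_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 64, 512), nn.ReLU(),
            nn.Linear(512, embedding_dim),          # squeeze layer: compact representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(embedding_dim, 512), nn.ReLU(),
            nn.Linear(512, 64 * 64),
            nn.Unflatten(1, (1, 64, 64)),
        )

    def forward(self, x):
        embedding = self.encoder(x)
        reconstruction = self.decoder(embedding)    # output approximates the input image
        return reconstruction, embedding

model = CellAutoencoder()
images = torch.rand(4, 1, 64, 64)
reconstruction, embedding = model(images)
print(embedding.shape)  # torch.Size([4, 32])
```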
[0116] In some examples, for clustering-based labeling of image data or cells, as disclosed herein, an expanding training data set can be used. With the expanding training data set, one or more revisions of labeling (e.g., manual relabeling) can be needed to, for example, avoid the degradation of model performance due to the accumulated effect of mislabeled images. Such manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data. Thus, in some examples, to systematically surface images for potential relabeling, similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
[0117] In some examples, adaptive image augmentation can be used. In order to make the models disclosed herein more robust to artifacts in the image data, (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model). Identifying the image(s) with artifacts may comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (2a) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (3a) flagging/labeling one or more images that have such change in the characteristic(s). Adding the identified images to the training pipeline may comprise: (1b) taking the one or more images that have been flagged/labeled for augmentation and first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median for each pixel of the image); and (3b) adding the delta image to the training pipeline.
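A hedged sketch of the delta-image steps above, assuming NumPy arrays for frames; the crop location, expected brightness, tolerance, and function names are illustrative assumptions rather than the disclosed pipeline:

```python
# Illustrative delta-image augmentation: flag frames whose cell-free background deviates from
# the expected brightness, then subtract the background median from the flagged image.
import numpy as np

def has_background_artifact(background_crop: np.ndarray, expected_brightness: float,
                            tolerance: float = 5.0) -> bool:
    """Flag a frame whose cell-free region deviates from the expected brightness."""
    return abs(float(np.mean(background_crop)) - expected_brightness) > tolerance

def make_delta_image(flagged_image: np.ndarray, background_crop: np.ndarray) -> np.ndarray:
    """Subtract the background median from every pixel of the flagged image."""
    background_median = np.median(background_crop, axis=(0, 1))
    return flagged_image.astype(np.float32) - background_median

frame = np.random.default_rng(1).integers(0, 255, size=(128, 128), dtype=np.uint8)
background = frame[:16, :16]                      # a crop expected to contain no cell
if has_background_artifact(background, expected_brightness=120.0):
    augmented = make_delta_image(frame, background)  # delta image added to the training pipeline
```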
[0118] In any of the methods or platforms disclosed herein, the model(s) can be validated (e.g., for the ability to demonstrate accurate cell classification performance). Non-limiting examples of validation metrics that can be utilized may include, but are not limited to, threshold metrics (e.g., accuracy, F-measure, Kappa, Macro-Average Accuracy, Mean-Class-Weighted Accuracy, Optimized
Precision, Adjusted Geometric Mean, Balanced Accuracy, etc.), ranking methods and metrics (e.g., receiver operating characteristics (ROC) analysis or “ROC area under the curve (ROC AUC)”), and probabilistic metrics (e.g., root-mean-squared error). For example, the model(s) can be determined to be balanced or accurate when the ROC AUC is greater than about 0.5 - e.g., greater than about 0.55, greater than about 0.6, greater than about 0.65, greater than about 0.7, greater than about 0.75, greater than about 0.8, greater than about 0.85, greater than about 0.9, greater than about 0.91, greater than about 0.92, greater than about 0.93, greater than about 0.94, greater than about 0.95, greater than about 0.96, greater than about 0.97, greater than about 0.98, greater than about 0.99, or higher.
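As an illustrative sketch of such validation, scikit-learn's roc_auc_score and accuracy_score can be applied to synthetic labels and scores; the data are placeholders, not results from the disclosed models:

```python
# Illustrative validation sketch: a threshold-free ranking metric (ROC AUC) and a threshold metric (accuracy).
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(7)
true_labels = rng.integers(0, 2, size=200)                         # 0/1 ground-truth cell classes (synthetic)
predicted_scores = np.clip(true_labels * 0.6 + rng.normal(0.2, 0.25, size=200), 0, 1)

auc = roc_auc_score(true_labels, predicted_scores)                 # ROC AUC > 0.5 indicates better-than-chance ranking
accuracy = accuracy_score(true_labels, predicted_scores > 0.5)     # accuracy at a 0.5 decision threshold
print(f"ROC AUC: {auc:.3f}, accuracy: {accuracy:.3f}")
```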
[0119] As noted further above, the output of the machine learning encoder (model) may include, may consist of, or may consist essentially of, at least one multidimensional vector (which may also be referred to herein as an embedding). Elements of the vector(s) for a given image may correspond to the values of respective features that the machine learning encoder extracted from that image. Table 1 below describes example machine learning dimensions (for example, deep learning dimensions), which correspond to different features that the machine learning encoder extracts from images. In some examples, the machine learning encoder extracts n ML-based features from each image (where n is a positive integer), and outputs an array of length n, which array can be considered to be an n-dimensional vector. For the illustrative machine learning dimensions listed in Table 1, the output of the deep learning encoder may have the format: [V1 V2 V3 ... Vn], where the subscripts 1...n correspond to the respective deep learning dimension numbers, and wherein the letter V represents the value of the feature in that image that the deep learning encoder calculated.
The value of n can be in any suitable range, e.g., can be between about 5 and about 1000 - e.g., between about 10 and about 500, between about 50 and about 100, or range in between any of the aforementioned values. In the nonlimiting example shown in Table 1, n is equal to 64.
TABLE 1. EXAMPLE DEEP LEARNING DERIVED FEATURES GENERATED USING THE HUMAN FOUNDATION MODEL. (The table enumerates the deep learning dimension numbers, 1 through 64 in this nonlimiting example.)
[0120] In one example, the ML-based features are not human-interpretable. In one example, because the ML-based features are identified using machine learning, Al, or both, the features are not human-interpretable. For example, the elements of the vector generated by the machine learning encoder may have numeric values, such as [0.1 4 2.3 … 10], that correspond to the quantitative “amount” of certain features that the machine learning encoder has identified as being present or not in a given image. However, in some examples it may not be possible to identify the meaning of these features, for example as they may correspond to features that are identified by neurons of a convolutional neural network and it is not possible in these examples to reconstruct what those neurons considered to be a feature.
Cell Classification
[0121] In some examples, analysis of imaging data as disclosed herein (e.g., particle imaging data, such as cell imaging data) can be performed using artificial intelligence, such as one or more machine learning algorithms. In some examples, one or more machine learning models can be used to automatically sort or categorize particles (e.g., cells) in the imaging data into one or more classes (e.g., one or more physical characteristics or morphological features, as used interchangeably herein). In some examples, cell imaging data can be analyzed using the machine learning algorithm(s) to classify (e.g., sort) a cell (e.g., a single cell) in a cell image or video. In some examples, cell imaging data can be analyzed using the machine learning algorithm(s) to determine a focus score of a cell (e.g., a single cell) in a cell image or video. In some examples, cell imaging data can be analyzed using the machine learning algorithm(s) to determine a relative distance between (i) a first plane of cells exhibiting first similar physical characteristic(s) and (ii) a second plane of cells exhibiting second similar physical characteristic(s), which first and second planes denote fluid streams flowing substantially parallel to each other in a channel. In some examples, one or more cell morphology maps as disclosed herein can be used to train one or more machine learning models (e.g., at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more machine learning models) as disclosed herein. Each machine learning model can be trained to analyze one or more images of a cell (e.g., to extract one or more morphological features of the cell) and categorize (or classify) the cell into one or more determined classes or categories (e.g., based on a type or state of the cell). In another example, the machine learning model can be trained to create a new category and to categorize (or classify) the cell into the new category, e.g., when determining that the cell is morphologically distinct from any pre-existing categories of other cells.
[0122] In some examples, the entire process of cell focusing as disclosed herein (e.g., partitioning of cells into one or more planar currents flowing through the channel) can be accomplished based on de novo AI-mediated analysis of each cell (e.g., using analysis of one or more images of each cell using a machine learning algorithm). This can be a complete AI or a full AI approach for cell sorting and analysis. In another example, a hybrid approach can be utilized, wherein AI-mediated analysis may analyze cells and one or more heterologous markers that are co-partitioned with the cells (e.g., into the same planar current flowing through the channel), confirm or determine the co-partitioning, after which a more conventional approach (e.g., imaging to detect presence of the heterologous markers, such as fluorescent imaging) can be utilized to sort a subsequent population of cells and the heterologous markers that are co-partitioned into the same planar current.
[0123] The machine learning model (e.g., a metamodel) can be trained by using a learning model and applying learning algorithms (e.g., machine learning algorithms) on a training dataset (e.g., a dataset comprising examples of specific classes). In some examples, given a set of training examples/cases, each marked for belonging to a specific class (e.g., specific cell type or class), a training algorithm may build a machine learning model capable of assigning new examples/cases (e.g., new datapoints of a cell or a group of cells) into one category or the other, e.g., to make the model a non-probabilistic machine learning model. In some examples, the machine learning model can be capable of creating a new category to assign new examples/cases into the new category. In some examples, a machine learning model can be the actual trained model that is generated based on the training model.
[0124] The machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell. The machine learning algorithm may form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell. In some examples, replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data. In some examples, processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training. In some examples, any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user. As a cell analysis platform as disclosed herein can be operatively coupled to one or more databases comprising non-morphological data of the cells processed (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such selected population of cells.
[0125] In some examples, a machine learning model can be trained by applying machine learning algorithms on at least a portion of one or more cell morphology maps as disclosed herein as a training dataset. Non-limiting examples of machine learning algorithms for training a machine learning model may include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning, feature learning, anomaly detection, association rules, etc. In some examples, a machine learning model can be trained by using one or more learning models on such training dataset. Non-limiting examples of learning models may include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural networks, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random fields, random forests, ensembles of machine learning models, minimum complexity machines (MCM), probably approximately correct (PAC) learning, etc.
[0126] In some examples, the neural networks are designed by modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks. In some examples, the enhanced neural networks are designed by modification of ResNet (e.g., ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or Inception networks. In some examples, the modification comprises a series of network surgery operations that are carried out mainly to improve inference time and/or inference accuracy.
[0127] The machine learning algorithm as disclosed herein may utilize one or more clustering algorithms to determine that objects (e.g., cells) in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters. Non-limiting examples of the clustering algorithms may include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g., K-means algorithm), distribution models (e.g., expectation-maximization algorithm), density models (e.g., density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS)), subspace models (e.g., biclustering), group models, graph-based models (e.g., highly connected subgraphs (HCS) clustering algorithms), single graph models, and neural models (e.g., using unsupervised neural networks). The machine learning algorithm may utilize a plurality of models, e.g., in equal weights or in different weights.
[0128] In some examples, unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells. In the unsupervised example, an embedding for a cell image can be generated. For example, the embedding can be a representation of the image in a space with fewer dimensions than the original image data. Such embeddings can be used to cluster images that are similar to one another. Thus, the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
[0129] In some examples of self-supervised learning, additional meta information (e.g., additional non-morphological information) about the sample (e.g., what disease is known or associated with the patient who provided the sample) can be used for labeling of image data of cells.
[0130] In some examples, embedding generation (e.g., see FIGS. 9-10 and 16-17) may use a neural net trained on predefined cell types. To generate the embeddings described herein, an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used. By providing enough diversity in image data/sample data to the trained model, this method may provide an accurate way to cluster future cells.
[0131] In some examples, embedding generation may use neural nets trained for different tasks. To generate the embeddings described herein, an intermediate layer of a neural net that is trained for a different task (e.g., a neural net that is trained on a canonical dataset such as ImageNet) can be used. Without wishing to be bound by any particular theory, this may allow the system to focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
[0132] In some examples, autoencoders can be used for embedding generation. To generate the embeddings described herein, autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings. The squeeze layer may force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
[0133] In some examples, for clustering-based labeling of image data or cells, as disclosed herein, an expanding training data set can be used. With the expanding training data set, one or more revisions of labeling (e.g., manual relabeling) can be needed to, e.g., avoid the degradation of model performance due to the accumulated effect of mislabeled images. Such manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data. Thus, to systematically surface images for potential relabeling, for example, similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
[0134] In some examples, adaptive image augmentation can be used. In order to make the models disclosed herein more robust to artifacts in the image data, (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model). Identifying the image(s) with artifacts may comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (2a) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (3a) flagging/labeling one or more images that have such change in the characteristic(s). Adding the identified images to the training pipeline may comprise: (1b) taking the one or more images that have been flagged/labeled for augmentation and first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median for each pixel of the image); and (3b) adding the delta image to the training pipeline.
[0135] Because the ML-based features are identified using machine learning, the features may not be human-interpretable. For example, the elements of the vector generated using the machine learning encoder may have numeric values, such as [0.1 4 2.3 … 10], that correspond to the quantitative “amount” of certain features that the machine learning encoder has identified as being present or not in a given image. However, it may not necessarily be possible to identify the meaning of these features, for example as they may correspond to features that are identified by neurons of a convolutional neural network and it is not possible to reconstruct what those neurons considered to be a feature.
Computer Vision model of HFM
[0136] The computer vision model of the human foundation model may include a set of rules to identify cell morphometric features within an image, and to encode those features into a multidimensional vector. In some examples, the rules can be human defined, and may correspond to features that can be understood by a human.
[0137] As noted further above, the output of the computer vision encoder (model) may include, may consist of, or may consist essentially of, at least one multidimensional vector (which may also be referred to herein as an embedding). Elements of the vector(s) for a given image may correspond to the values of respective features that the computer vision encoder extracted from that image. Because the features are human defined, the features can be human-interpretable. Table 2 below describes example computer vision dimensions (morphometric features), which correspond to different features that the computer vision encoder may extract from images.
Table 2. Example morphometric features generated using the human foundation model. ‘*’ denotes metrics that may also be referred to as blobs or granules.
Morphometric Dimension # | Name | Description (grouped by Morphometric Category; the Schematic column of the original table contains illustrative drawings)

Position Features
HFMv1:MD001 | Centroid X axis (µm) | X axis position of the cell relative to the camera’s field of view
HFMv1:MD002 | Centroid Y axis (µm) | Y axis position of the cell relative to the camera’s field of view

Cell Shape Features
HFMv1:MD003 | Area (µm²) | Cell area
HFMv1:MD004 | Perimeter (µm) | Length of the cell outline
HFMv1:MD005 | Maximum caliper distance (µm) | Width of widest possible box around the cell
HFMv1:MD006 | Minimum caliper distance (µm) | Width of narrowest possible box around the cell
HFMv1:MD007 | Maximum radius (µm) | Largest radius from center of cell to cell border
HFMv1:MD008 | Minimum radius (µm) | Shortest radius from center of cell to cell border
HFMv1:MD009 | Long ellipse axis (µm) | Long axis of best fit ellipse
HFMv1:MD010 | Short ellipse axis (µm) | Short axis of best fit ellipse
HFMv1:MD011 | Ellipse elongation (unitless) | Aspect ratio of best fit ellipse. Metric: 0 indicates a circle, 1 indicates a line
HFMv1:MD012 | Ellipse similarity (unitless) | Deviation from an elliptical shape. Metric: 0 indicates a perfectly elliptical shape
HFMv1:MD013 | Roundness (unitless) | Roundness is a measure for circularity or compactness of the shape. Metric: Range is 0 to 1, where 1 is a perfect circle
HFMv1:MD014 | Circle similarity (unitless) | Deviation from a circular shape. Metric: 0 indicates a perfectly circular shape
HFMv1:MD015 | Convex shape (unitless) | Ratio of the area of the convex hull of the cell to the total area of the cell

Pixel Intensity Features
HFMv1:MD016 | Mean pixel intensity (arbitrary units) | Mean pixel grayscale value; refers to how much light a cell absorbs and/or scatters. Metric: Range is -1 to 1
HFMv1:MD017 | Standard deviation of pixel intensity | Standard deviation of pixel grayscale values; gives an indication of the uniformity of pixel intensity within the cell
HFMv1:MD018 | Pixel intensity 25th percentile | 25th percentile of pixel grayscale values
HFMv1:MD019 | Pixel intensity 75th percentile | 75th percentile of pixel grayscale values
HFMv1:MD020 | Positive fraction | Fraction of pixels with a grayscale value significantly above background
HFMv1:MD021 | Negative fraction | Fraction of pixels with a grayscale value significantly below background

Texture Features
HFMv1:MD022 | Small set of connected bright pixels*, integral | Sum of pixel intensities in regions identified as being small bright structures. Small is described as ~8 pixels, but smaller sets could be detected
HFMv1:MD023 | Small set of connected dark pixels*, integral | Sum of pixel intensities in regions identified as being small dark structures. Small is defined as ~8 pixels, but smaller sets could be detected
HFMv1:MD024 | Large set of connected bright pixels*, integral | Sum of pixel intensities in regions identified as being large bright structures. Large is defined as ~32 pixels, but larger sets could be detected
HFMv1:MD025 | Large set of connected dark pixels*, integral | Sum of pixel intensities in regions identified as being large dark structures. Large is defined as ~32 pixels, but larger sets could be detected
HFMv1:MD026 - HFMv1:MD027 | Image moments (2 moments) | Two values that describe the weighted average or distribution of the image pixel intensities within the cell in a rotation- and scale-invariant manner
HFMv1:MD028 - HFMv1:MD037 | Local binary patterns - center (10 patterns) | The 10 Local Binary Pattern (LBP) - Center features determine the texture inside the cell and describe the appearance near the center of the cell
HFMv1:MD038 - HFMv1:MD047 | Local binary patterns - periphery (10 patterns) | The 10 Local Binary Pattern (LBP) - Periphery features determine the texture at the periphery of the cell and describe the appearance near the edge of the cell

Focus Features
HFMv1:MD048 | Image sharpness (arbitrary units) | A measure of how sharp or smooth the image is. Typically, an out-of-focus image is less sharp than an in-focus image
HFMv1:MD049 | Image focus (µm) | Estimate of the distance of the cell to the focal plane of the microscope
HFMv1:MD050 | Ring width (arbitrary units) | The imaging modality creates a dark or bright ring around the cell, which is larger and more intense the more out of focus the cell is. This feature estimates the width of the ring
HFMv1:MD051 | Ring intensity (arbitrary units) | The imaging modality creates a dark or bright ring around the cell, which is larger and more intense the more out of focus the cell is. This feature estimates the intensity of the ring
[0138] From Table 2, it can be understood that morphometric features can be categorized into different groups. For example, cell morphometric features can be selected from the group consisting of position features, cell shape features, pixel intensity features, texture features, and focus features.
In some examples, position features can be selected from the group consisting of: centroid X axis and centroid Y axis, where Table 2 provides respective descriptions for such features. In some examples, cell shape features can be selected from the group consisting of: area, perimeter, maximum caliper distance, minimum caliper distance, maximum radius, minimum radius, long ellipse axis, short ellipse axis, ellipse elongation, ellipse similarity, roundness, circle similarity, and convex shape, where Table 2 provides respective example descriptions for such features. In some examples, pixel intensity features are selected from the group consisting of: mean pixel intensity, standard deviation of pixel intensity, pixel intensity 25th percentile, pixel intensity 75th percentile, positive fraction, and negative fraction, where Table 2 provides respective example descriptions for such features. In some examples, texture features can be selected from the group consisting of: small set of connected bright pixels, integral; small set of connected dark pixels, integral; large set of connected bright pixels, integral; large set of connected dark pixels, integral; image moments; local binary patterns - center; and local binary patterns - periphery, where Table 2 provides respective example descriptions for such features. In some examples, focus features can be selected from the group consisting of: image sharpness; image focus; ring width; and ring intensity.
[0139] In some examples, the computer vision encoder extracts m morphometric features from each image (where m is a positive integer), and outputs an array of length m, which array can be considered to be an m-dimensional vector. For the illustrative computer vision dimensions listed in Table 2, the output of the computer vision encoder may have the format: [W1 W2 W3 ... Wm], where the subscripts 1...m correspond to the respective computer vision dimension numbers, and wherein the letter W represents the value of the feature in that image that the computer vision encoder calculated. The value of m can be in any suitable range, e.g., can be between about 5 and about 1000, e.g., between about 10 and about 500, e.g., between about 50 and about 100. In the nonlimiting example shown in Table 2, m is equal to 51.
[0140] Because the morphometric features represent features that are visible by both human and computer vision, the features can be human-interpretable. For example, the elements of the vector generated using the computer vision encoder may have numeric values, such as [5 0.8 1.4 ... 3.7], that correspond to the quantitative “amount” of certain features that the computer vision encoder has identified as being present or not in a given image. The meanings of these features also can be understood by a human. For example, based on Table 2 it can be understood that the value of the first element (e.g., 5) is the centroid X axis in µm of the image (meaning the X axis position of the cell relative to the camera’s field of view), the value of the third element (e.g., 1.4) is the cell area in µm², and so on.
[0141] The computer vision encoder can be implemented using any suitable combination of hardware and software. As one example, the system component which is implementing the HFM may include a processor and a non-volatile computer-readable medium that includes instructions for causing the processor to respectively process cell images using a computer vision encoder. The computer vision encoder can be configured to quantify the characteristics (e.g., to measure dimensions or intensities) of different features within respective cell images, and to output a vector the dimensions (elements) of which correspond to the measured values of those respective characteristics.
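One possible sketch of such a rule-based extraction uses scikit-image's regionprops as a stand-in for the computer vision encoder; the selected features loosely mirror a few Table 2 entries (centroid, area, perimeter, roundness, convex shape), but the exact definitions, units, and calibration used by the human foundation model are not reproduced here:

```python
# Hedged sketch: compute a small, human-interpretable morphometric vector from a binary cell mask.
import numpy as np
from skimage.measure import label, regionprops

def morphometric_vector(cell_mask: np.ndarray, pixels_per_um: float = 1.0) -> np.ndarray:
    """Return a few rule-based morphometric features for the largest labeled object."""
    regions = regionprops(label(cell_mask.astype(np.uint8)))
    r = max(regions, key=lambda p: p.area)
    centroid_y, centroid_x = r.centroid                           # position features
    area_um2 = r.area / pixels_per_um**2                          # cell area
    perimeter_um = r.perimeter / pixels_per_um                    # length of the cell outline
    roundness = 4.0 * np.pi * r.area / (r.perimeter**2 + 1e-9)    # 1.0 for a perfect circle
    convex_ratio = r.area / r.convex_area                         # convex-shape feature
    return np.array([centroid_x, centroid_y, area_um2, perimeter_um, roundness, convex_ratio])

# Synthetic circular "cell" mask used only to exercise the sketch.
mask = np.zeros((64, 64), dtype=np.uint8)
yy, xx = np.ogrid[:64, :64]
mask[(yy - 32) ** 2 + (xx - 32) ** 2 <= 15 ** 2] = 1
print(morphometric_vector(mask))
```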
Encoding ML-Based Features and Cell Morphometric Features
[0142] As noted elsewhere herein, the set of ML-based features extracted using the machine learning encoder and the set of cell morphometric features extracted using the computer vision encoder can be used to respectively encode the set of ML-based features and the set of cell morphometric features into a plurality of multi-dimensional vectors that represent morphology of a cell in a cell image. In examples in which the machine learning encoder extracts n ML-based features, and the computer vision encoder extracts m cell morphological features, the multi-dimensional vectors may have n+m dimensions, where n and m are positive integers. Within each of the multi-dimensional vectors, each dimension of the n+m dimensions can be an element of that multi-dimensional vector, e.g., a numeric value. As one example, continuing with the example provided above, the ML-based features and the cell morphometric features can be concatenated to generate a multi-dimensional vector having the format: [V1 V2 V3 ... Vn W1 W2 W3 ... Wm], where, similarly as above, the subscripts 1...n correspond to the respective deep learning dimension numbers, the letter V represents the value of the feature in that image that the deep learning encoder calculated, the subscripts 1...m correspond to the respective computer vision dimension numbers, and the letter W represents the value of the feature in that image that the computer vision encoder calculated. Note that the value of m can be the same as the value of n, in which example the plurality of multi-dimensional vectors extracted using the machine learning encoder and the computer vision encoder may include a same number of each of the ML-based features and the cell morphological features. In another example, the value of m can be different than the value of n, in which example the plurality of multi-dimensional vectors extracted using the machine learning encoder and the computer vision encoder may include a different number of each of the ML-based features and the cell morphological features. It will further be appreciated that in some examples, an array of length (n+m), or a vector of length (n+m), can be interpreted as being a plurality of multi-dimensional vectors, each such vector having one or more elements.
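A minimal sketch of the concatenation described above, assuming NumPy arrays and the illustrative sizes n = 64 and m = 51 from Tables 1 and 2; the feature values themselves are random placeholders:

```python
# Illustrative concatenation of the n ML-based features and the m morphometric features
# into a single (n + m)-dimensional vector for one cell image.
import numpy as np

n, m = 64, 51
ml_features = np.random.default_rng(0).normal(size=n)              # [V1 ... Vn] from the ML encoder
morphometric_features = np.random.default_rng(1).normal(size=m)    # [W1 ... Wm] from the CV encoder

combined_vector = np.concatenate([ml_features, morphometric_features])
print(combined_vector.shape)  # (115,) i.e. an (n + m)-dimensional representation of the cell
```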
[0143] In one example, in a manner such as noted further above, the ML-based features can be orthogonal to one another as explained more particularly in FIGs. 9-10. By this, it is meant that the
ML-based features may all be different than one another, and may all be uncorrelated to one another.
For example, ML-based feature V1 can be different than, and uncorrelated to, each of ML-based features V2...Vn. In some other examples, the ML-based features can be orthogonal to the cell morphological features.
System for Cell Morphology Analysis
[0144] Cell morphology can be highly indicative of a cell’s phenotype and function, but it is also highly dynamic and complex. Traditional analysis of cell morphology by human eyes has significant limitations. Other methods of assessing and characterizing cell morphology are also limited to imaging or sorting with cell labels.
[0145] Some examples of the present disclosure provide a quantitative, high-dimensional, unbiased platform to assess cell morphology and magnify insights into a cell’s phenotype and functions. In some examples, the system as described herein may provide imaging of single cells and label-free sorting in one platform. For example, the system may directly capture high-resolution brightfield images of cells in real time. The system may also enable cell sorting based on their morphology without involving any cell labels. The cells may remain viable and minimally perturbed after the sorting process. In addition, the system may allow collection of sorted cells for downstream analysis, for example, single-cell RNA sequencing.
[0146] The system may comprise, or be compatible with, the human foundation model for high-dimensional morphological feature analysis. In some examples, the system may comprise or be compatible with a data suite that may allow users to store, visualize, and analyze images and high-dimensional data. Hence, the system may enable the end-to-end process including cell imaging, morphology analysis, sorting, and classification. In some examples, the system may comprise a microfluidic platform. When cells flow through the microfluidic platform, the system may capture high-resolution brightfield images of each individual cell. The images can be processed by the human foundation model for extracting high-dimensional features corresponding to the cells. The system may sort the cells into different categories, based on the distinct morphological features. The imaging, single-cell morphology analysis, sorting, and classification may occur in real time.
[0147] FIG. 2A illustrates system 100 which includes, and illustrates the interaction between, a microfluidics platform 20 (e.g., corresponding to any microfluidics platform of this disclosure), the human foundation model 60 (e.g., a deep learning model), and a data suite 40, in accordance with some examples of the present disclosure. The system 100 for cell morphology analysis may include microfluidics platform 20, which may include or be compatible with the human foundation model 60 and the data suite 40. Example interactions between the microfluidics platform 20, the human foundation model 60, and the data suite 40 will be described in further detail elsewhere herein, including below in accordance with FIG. 2B. In some aspects, brightfield images of single cells are analyzed in real-time by model 60 to generate quantitative AI embeddings that can include reproducible high-dimensional descriptions of cell morphology. In some aspects, morphologically distinct cell groups can be sorted in real-time by system 100 for downstream analysis, including up to approximately 6 populations per run.
[0148] FIG. 2B illustrates an example workflow from high-throughput imaging to cell characterization, classification and sorting based on cell morphology analysis, in accordance with some examples of the present disclosure. The system for cell morphology profiling as described herein may include a benchtop microfluidics platform that captures high-resolution brightfield images of single cells and sorts cells in a label-free manner. The microfluidics platform may comprise or be compatible with the human foundation model and a data suite that may allow users to store, visualize, and analyze images and high-dimensional data. The workflow 200 can be streamlined, starting from preparing and loading cells onto the microfluidics platform (operation 210). In some examples of operation 210, samples from established human cell lines or dissociated tissue biopsies in single cell suspension are loaded onto a microfluidic chip. In some examples, the preparation of samples may comprise dissociation of cells into a single-cell suspension and loading the suspension onto the microfluidics platform. Subsequently, the system may capture images of the cells, and the human foundation model may characterize the cells in real time as they flow through the microfluidic chip (operation 220). In some examples of operation 220, images of single cells are captured and analyzed in real-time by the human foundation model to generate multi-dimensional quantitative morphological profiles (operation 230). The human foundation model may process the images of the cells and generate high-dimensional features reflecting the cell morphology. The images and extracted features can be stored in the data suite (operation 240). The human foundation model also visualizes the cell morphology data by, for example, generating user-defined cell clusters based on cell types (also operation 240). The data suite may also provide in-depth data analysis, including selecting cell populations of interest to sort on the microfluidics platform. The system may recover sorted cells in a plurality of collection wells (operation 250), which can be used for downstream analyses. In addition, the collected morphology data (referred to as embeddings) can be further analyzed as a unique modality, and users can continuously train customized models for specific applications.
[0149] FIG. 3 illustrates an example system for cell morphology analysis, in accordance with some examples of the present disclosure. The system 300 may comprise a benchtop microfluidics platform 310 that captures high-resolution brightfield images of single cells and sorts cells in a label-free manner.
System 300 also may include data suite 330, which can be implemented using (and integrated with) microfluidics platform 310, or can be implemented using a separate device. Tables 3 and 4, provided further below, list example parameters, specifications, and components of system 300.
[0150] FIG. 3 schematically illustrates an example method for classifying a cell. The method can comprise processing image data 310 comprising tag-free images/videos of single cells (e.g., image data 310 consisting of tag-free images/videos of single cells). Various clustering analysis models 320 as disclosed herein can be used to process the image data 310 to extract one or more morphological properties of the cells from the image data 310, and generate a cell morphology map 330A based on the extracted one or more morphological properties. For example, the cell morphology map 330A can be generated based on two morphological properties as dimension 1 and dimension 2. The cell morphology map 330A can comprise one or more clusters (e.g., clusters A, B, and C) of datapoints, each datapoint representing an individual cell from the image data 310. The cell morphology map 330A and the clusters A-C therein can be used to train classifier(s) 350. Subsequently, a new image 340 of a new cell can be obtained and processed by the trained classifier(s) 350 to automatically extract and analyze one or more morphological features from the cellular image 340 and plot it as a datapoint on the cell morphology map 330A. Based on its proximity, correlation, or commonality with one or more of the morphologically-distinct clusters A-C on the cell morphology map 330A, the classifier(s) 350 can automatically classify the new cell. The classifier(s) 350 can determine a probability that the cell in the new image data 340 belongs to cluster C (e.g., the likelihood for the cell in the new image data 340 to share one or more commonalities and/or characteristics with cluster C more than with other clusters A/B). For example, the classifier(s) 350 can determine and report that the cell in the new image data 340 has a 95% probability of belonging to cluster C, 1% probability of belonging to cluster B, and 4% probability of belonging to cluster A, solely based on analysis of the tag-free image 340 and one or more morphological features of the cell extracted therefrom.
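As an illustrative sketch of such probabilistic classification (not the disclosed classifier(s) 350), a scikit-learn logistic regression trained on two morphological dimensions can report per-cluster probabilities for a new cell; the data, labels, and cluster names are synthetic assumptions:

```python
# Illustrative sketch: a probabilistic classifier reports the probability that a new cell
# belongs to each cluster, analogous to the 95%/1%/4% example above.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
features = rng.normal(size=(300, 2))                      # dimension 1 and dimension 2 per cell (synthetic)
clusters = rng.integers(0, 3, size=300)                   # labels standing in for clusters A, B, C

classifier = LogisticRegression(max_iter=1000).fit(features, clusters)

new_cell = rng.normal(size=(1, 2))                        # features extracted from a new image
probabilities = classifier.predict_proba(new_cell)[0]
for name, p in zip("ABC", probabilities):
    print(f"P(cluster {name}) = {p:.2%}")
```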
[0151] An image and/or video (e.g., a plurality of images and/or videos) of one or more cells as disclosed herein (e.g., that of image data 310 in FIG. 3) can be captured while the cell(s) is suspended in a fluid (e.g., an aqueous liquid, such as a buffer) and/or while the cell(s) is moving (e.g., transported across a microfluidic channel). For example, the cell need not be suspended in a gel-like or solid-like medium. The fluid can comprise a liquid that is heterologous to the cell(s)’s natural environment. For example, cells from a subject’s blood can be suspended in a fluid that comprises (i) at least a portion of the blood and (ii) a buffer that is heterologous to the blood. The cell(s) may be not immobilized (e.g., embedded in a solid tissue or affixed to a microscope slide, such as a glass slide, for histology) or adhered to a substrate. The cell(s) can be isolated from the natural environment or niche (e.g., a part of the tissue the cell(s) would be in if not retrieved from a subject by human intervention) when the image and/or video of the cell(s) is captured. For example, the image and/or video need not be from a histological imaging. The cell(s) need not be sliced or sectioned prior to obtaining the image and/or video of the cell, and, as such, the cell(s) may remain substantially intact as a whole during capturing of the image and/or video.
[0152] When the image data is processed, e.g., to extract one or more morphological features of a cell, each cell image can be annotated with the extracted one or more morphological features and/or with information that the cell image belongs to a particular cluster (e.g., a probability).
[0153] The cell morphology map can be a visual (e.g., graphical) representation of one or more clusters of datapoints. The cell morphology map can be a 1-dimensional (1D) representation (e.g., based on one morphological property as one parameter or dimension) or a multi-dimensional representation, such as a 2-dimensional (2D) representation (e.g., based on two morphological properties as two parameters or dimensions), a 3-dimensional (3D) representation (e.g., based on three morphological properties as three parameters or dimensions), a 4-dimensional (4D) representation, etc. In some examples, one morphological property of a plurality of morphological properties used for plotting the cell morphology map can be represented as a non-axial parameter (e.g., non-x, y, or z axis), such as distinguishable colors (e.g., heatmap), numbers, letters (e.g., texts of one or more languages), and/or symbols (e.g., a square, oval, triangle, etc.). For example, a heatmap can be used as a colorimetric scale to represent the classifier prediction percentages for each cell against a cell class, cell type, or cell state.
[0154] The cell morphology map can be generated based on one or more morphological features (e.g., characteristics, profiles, fingerprints, etc.) from the processed image data. Non-limiting examples of one or more morphological properties of a cell, as disclosed herein, that can be extracted from one or more images of the cell can include, but are not limited to, (i) shape, curvature, size (e.g., diameter, length, width, circumference), area, volume, texture, thickness, roundness, etc. of the cell or one or more components of the cell (e.g., cell membrane, nucleus, mitochondria, etc.), (ii) number or positioning of one or more contents (e.g., nucleus, mitochondria, etc.) of the cell within the cell (e.g., center, off-centered, etc.), and (iii) optical characteristics of a region of the image(s) (e.g., unique groups of pixels within the image(s)) that correspond to the cell or a portion thereof (e.g., light emission, transmission, reflectance, absorbance, fluorescence, luminescence, etc.).
[0155] Non-limiting examples of clustering as disclosed herein can be hard clustering (e.g., determining whether a cell belongs to a cluster or not), soft clustering (e.g., determining a likelihood that a cell belongs to each cluster to a certain degree), strict partitioning clustering (e.g., determining whether each cell belongs to exactly one cluster), strict partitioning clustering with outliers (e.g., determining whether a cell can also belong to no cluster), overlapping clustering (e.g., determining whether a cell can belong to more than one cluster), hierarchical clustering (e.g., determining whether cells that belong to a child cluster can also belong to a parent cluster), and subspace clustering (e.g., determining whether clusters are not expected to overlap).
[0156] Cell clustering and/or generation of the cell morphology map, as disclosed herein, can be based on a single morphological property of the cells. In another example, cell clustering and/or generation of the cell morphology map can be based on a plurality of different morphological properties of the cells. In some examples, the plurality of different morphological properties of the cells can have the same weight or different weights. A weight can be a value indicative of the importance or influence of each morphological property relative to one another in training the classifier or using the classifier to (i) generate one or more cell clusters, (ii) generate the cell morphology map, or (iii) analyze a new cellular image to classify the cellular image as disclosed herein. For example, cell clustering can be performed by having 50% weight on cell shape, 40% weight on cell area, and 10% weight on texture (e.g., roughness) of the cell membrane. In some examples, the classifier as disclosed herein can be configured to adjust the weights of the plurality of different morphological properties of the cells during analysis of new cellular image data, thereby yielding an optimal cell clustering and cell morphology map. The plurality of different morphological properties with different weights can be utilized during the same analysis operation for cell clustering and/or generation of the cell morphology map.
[0157] The plurality of different morphological properties can be analyzed hierarchically. In some examples, a first morphological property can be used as a parameter to analyze image data of a plurality of cells to generate an initial set of clusters. Subsequently, a second and different morphological property can be used as a second parameter to (i) modify the initial set of clusters (e.g., optimize arrangement among the initial set of clusters, re-group some clusters of the initial set of clusters, etc.) and/or (ii) generate a plurality of sub-clusters within a cluster of the initial set of clusters.
In some examples, a first morphological property can be used as a parameter to analyze image data of a plurality of cells to generate an initial set of clusters, to generate a 1D cell morphology map.
Subsequently, a second morphological property can be used as a parameter to further analyze the clusters of the 1D cell morphology map, to modify the clusters and generate a 2D cell morphology map (e.g., a first axis parameter based on the first morphological property and a second axis parameter based on the second morphological property).
[0158] In some examples of the hierarchical clustering as disclosed herein, an initial set of clusters can be generated based on an initial morphological feature that is extracted from the image data, and one or more clusters of the initial set of clusters can comprise a plurality of sub-clusters based on second morphological features or sub-features of the initial morphological feature. For example, the initial morphological feature can be cell type, such as stem cells (or not), and the sub-features can be different types of stem cells (e.g., embryonic stem cells, induced pluripotent stem cells, mesenchymal stem cells, muscle stem cells, etc.). In another example, the initial morphological feature can be cancer cells (or not), and the sub-feature can be different types of cancer cells (e.g., sarcoma cells, leukemia cells, lymphoma cells, multiple myeloma cells, melanoma cells, etc.). In a different example, the initial morphological feature can be cancer cells (or not), and the sub-feature can be different stages of the cancer cell (e.g., quiescent, proliferative, apoptotic, etc.).
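A hedged sketch of weighted and hierarchical clustering along the lines of [0156]-[0158], assuming scikit-learn's KMeans and synthetic feature columns; the weights, cluster counts, and the choice of property used for sub-clustering are illustrative assumptions:

```python
# Illustrative sketch: an initial clustering on weighted morphological properties, followed by
# sub-clustering within one initial cluster using a second property on its own.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
cells = rng.normal(size=(600, 3))                 # columns: cell shape, cell area, membrane texture (synthetic)
weights = np.array([0.5, 0.4, 0.1])               # relative importance of each morphological property
weighted = cells * weights

initial = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(weighted)

# Sub-cluster the cells of initial cluster 0 using a second morphological property (cell area) alone.
members = cells[initial == 0]
sub_clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(members[:, [1]])
print(np.bincount(initial), np.bincount(sub_clusters))
```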
[0159] Each datapoint can represent an individual cell or a collection of a plurality of cells (e.g., at least about 2 - e.g., at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cells, or more). Each datapoint can represent an individual image (e.g., of a single cell or a plurality of cells) or a collection of a plurality of images (e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 images of the same single cell or different cells, or more).
[0160] The cell morphology map can comprise at least about 1, or at least about 2, or at least about 3, or at least about 4, or at least about 5, or at least about 6, or at least about 7, or at least about 8, or at least about 9, or at least about 10, or at least about 15, or at least about 20, or at least about 30, or at least about 40, or at least about 50, or at least about 60, or at least about 70, or at least about 80, or at least about 90, or at least about 100, or at least about 150, or at least about 200, or at least about 300, or at least about 400, or at least about 500 clusters, or more.
[0161] Each cluster as disclosed herein can comprise a plurality of sub-clusters, e.g., at least about 2, or at least about 3, or at least about 4, or at least about 5, or at least about 6, or at least about 7, or at least about 8, or at least about 9, or at least about 10, or at least about 15, or at least about 20, or at least about 30, or at least about 40, or at least about 50, or at least about 60, or at least about 70, or at least about 80, or at least about 90, or at least about 100, or at least about 150, or at least about 200, or at least about 300, or at least about 400, or at least about 500 sub-clusters.
[0162] A cluster (or sub-cluster) can comprise datapoints representing cells of the same type/state.
In another example, a cluster (or sub-cluster) can comprise datapoints representing cells of different types/states.
[0163] A cluster (or sub-cluster) can comprise at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 15, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 150, at least about 200, at least about 300, at least about 400, at least about 500, at least about 1,000, at least about 2,000, at least about 3,000, at least about 4,000, at least about 5,000, at least about 10,000, at least about 50,000, or at least about 100,000 datapoints.
[0164] Two or more clusters may overlap in a cell morphology map. In another example, no clusters may overlap in a cell morphology map. In some examples, an allowable degree of overlapping between two or more clusters can be adjustable (e.g., manually or automatically by a machine learning algorithm) depending on the quality, condition, or size of data in the image data being processed.
[0165] A cluster (or sub-cluster) as disclosed herein can be represented with a boundary (e.g., a solid line or a dashed line). In another example, a cluster or sub-cluster need not be represented with a boundary, and can be distinguishable from other cluster(s) or sub-cluster(s) based on their proximity to one another.
[0166] A cluster (or sub-cluster) or a data comprising information about the cluster can be annotated based on one or more annotation schema (e.g., predefined annotation schema). Such annotation can be manual (e.g., by a user of the method or system disclosed herein) or automatic (e.g., by any of the machine learning algorithms disclosed herein). The annotation of the clustering can be related to the one or more morphological properties of the cells that have been analyzed (e.g., cell shape, cell area, optical characteristic(s), etc.) to generate the cluster or assign one or more datapoints to the cluster. In another example, the annotation of the clustering can be related to information that has not been used or analyzed to generate the cluster or assign one or more datapoints to the cluster (e.g., genomics, transcriptomics, or proteomics, etc.). In such example, the annotation can be utilized to add additional "layers" of information to each cluster.
[0167] In some examples, an interactive annotation tool can be provided that permits one or more users to modify any process of the method described herein. For example, the interactive annotation tool can allow a user to curate, verify, edit, and/or annotate the morphologically-distinct clusters. In another example, the interactive annotation tool can process the image data, extract one or more morphological features from the image data, and allow the user to select one or more of the extracted morphological features to be used as a basis to generate the clusters and/or the cell morphology map.
After the generation of the clusters and/or the cell morphology map, the interactive annotation tool can allow the user to annotate each cluster and/or the cell morphology map using (i) a predefined annotation schema or (ii) a new, user-defined annotation schema. In another example, the interactive annotation tool can allow the user to assign different weights to different morphological features for the clustering and/or map plotting. In another example, the interactive annotation tool can allow the user to select which imaging data (or which cells) to be used and/or which imaging data (or which cells, cell clumps, artifacts, or debris) to be discarded, for the clustering and/or map plotting. A user can manually identify incorrectly clustered cells, or the machine learning algorithm can provide a probability or correlation value of cells within each cluster and identify any outlier (e.g., a datapoint that would change the outcome of the probability/correlation value of the cluster(s) by a certain percentage value).
Thus, the user can choose to move the outliers using the interactive annotation tool to further tune the cell morphology map, e.g., to yield a “higher resolution” map.
[0168] One or more cell morphology maps as disclosed herein can be used to train one or more classifiers (e.g., at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more classifiers) as disclosed herein. Each classifier can be trained to analyze one or more images of a cell (e.g., to extract one or more morphological features of the cell) and categorize (or classify) the cell into one or more determined classes or categories of a cell (e.g., based on a type or state of the cell). In another example, the classifier can be trained to create a new category to categorize (or classify) the cell into the new category, e.g., when determining that the cell is morphologically distinct from any pre-existing categories of other cells.
[0169] The machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell. The machine learning algorithm can form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell. In some examples, replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data. In some examples, processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training. In some examples, any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user. As the cell analysis platform as disclosed herein can be operatively coupled to one or more databases comprising non-morphological data of cells processed (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such selected population of cells.
[0170] In some examples, a classifier can be trained by applying machine learning algorithms on at least a portion of one or more cell morphology maps as disclosed herein as a training dataset. Non- limiting examples of machine learning algorithms for training a classifier can include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning, feature learning, anomaly detection, association rules, etc. In some examples, a classifier can be trained by using one or more learning models on such training dataset. Non-limiting examples of learning models can include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural network, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random field, random forest, ensembles of classifiers, minimum complexity machines (MCM), probably approximately correct learning (PACT), etc.
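As a non-limiting, hedged sketch (synthetic feature vectors and labels; a random forest is merely one of the learning models listed above, not a required choice), such classifier training could proceed as follows:

```python
# Illustrative sketch: training one possible classifier (a random forest) on per-cell
# morphological feature vectors labeled by cluster membership in a cell morphology map.
# The data here are synthetic placeholders, not outputs of the disclosed platform.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 32))      # per-cell morphological feature vectors (assumed)
y = rng.integers(0, 3, size=1000)    # cluster/class labels from a cell morphology map (assumed)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```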
[0171] In some examples, the neural networks are designed by the modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks. In some examples, the enhanced neural networks are designed by modification of ResNet (e.g., ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or Inception networks. In some examples, the modification comprises a series of network surgery operations that are mainly carried out to improve inference time and/or inference accuracy.
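One minimal, illustrative form of such "network surgery" (an assumption for illustration, not the specific modification used in this disclosure; assumes the torchvision package is available) is to replace the final fully connected layer of a ResNet so that it predicts a chosen number of cell classes:

```python
# Hedged sketch: replace the classification head of a ResNet 50 backbone with a new
# fully connected layer sized for a chosen number of cell classes.
import torch
import torchvision.models as models

num_cell_classes = 8                                   # assumed number of cell classes
net = models.resnet50(weights=None)                    # ResNet 50 backbone (weights optional)
net.fc = torch.nn.Linear(net.fc.in_features, num_cell_classes)

dummy_cells = torch.randn(4, 3, 224, 224)              # placeholder single-cell image crops
print(net(dummy_cells).shape)                          # torch.Size([4, 8])
```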
[0172] The machine learning algorithm as disclosed herein can utilize one or more clustering algorithms to determine that objects in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters. Non-limiting examples of the clustering algorithms can include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g. K-means algorithm), distribution models (e.g., expectation-
maximization algorithm), density models (e.g., density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS)), subspace models (e.g., biclustering), group models, graph-based models (e.g., highly connected subgraphs (HCS) clustering algorithms), single graph models, and neural models (e.g., using unsupervised neural network). The machine learning algorithm can utilize a plurality of models, e.g., in equal weights or in different weights.
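By way of a non-limiting sketch (synthetic feature vectors; parameter values are arbitrary assumptions), two of the clustering model families named above, a centroid model (K-means) and a density model (DBSCAN), could be applied to the same morphological feature vectors as follows:

```python
# Illustrative comparison of a centroid model (K-means) and a density model (DBSCAN)
# on the same synthetic morphological feature matrix.
import numpy as np
from sklearn.cluster import KMeans, DBSCAN

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 10))                         # 500 cells x 10 morphological features

kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)   # -1 marks noise points
print(np.unique(kmeans_labels), np.unique(dbscan_labels))
```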
[0173] In some examples, unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells. In the unsupervised example, an embedding for a cell image can be generated. For example, the embedding can be a representation of the image in a space with fewer dimensions than the original image data. Such embeddings can be used to cluster images that are similar to one another. Thus, the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
[0174] In some examples, in the case of self-supervised learning, additional meta information (e.g., additional non-morphological information) about the sample (e.g., what disease is known or associated with the patient who provided the sample) can be used for labeling of image data of cells.
[0175] In some examples, embedding generation can use a neural net trained on predefined cell types. To generate the embeddings described herein, an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used.
By providing enough diversity in image data/sample data to the trained model/classifier, this method can provide an accurate way to cluster future cells.
[0176] In some examples, embedding generation can use neural nets trained for different tasks. To generate the embeddings described herein, an intermediate layer of a neural net that is trained for a different task (e.g., a neural net that is trained on a canonical dataset such as ImageNet) can be used. Without wishing to be bound by any particular theory, this can allow the model to focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
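A hedged sketch of such embedding generation from an intermediate layer of a network trained for a different task (ImageNet classification) is shown below; it assumes torchvision pretrained weights can be downloaded at runtime, and the input images are placeholders:

```python
# Illustrative sketch: drop the classification head of an ImageNet-trained ResNet and
# use the remaining layers as an embedding extractor for cell images.
import torch
import torchvision.models as models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
embedder = torch.nn.Sequential(*list(backbone.children())[:-1])   # everything up to the final FC layer
embedder.eval()

with torch.no_grad():
    cell_images = torch.randn(16, 3, 224, 224)          # placeholder single-cell crops
    embeddings = embedder(cell_images).flatten(1)        # 16 x 512 embedding vectors
print(embeddings.shape)
```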
[0177] In some examples, autoencoders can be used for embedding generation. To generate the embeddings described herein, autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings. The squeeze layer can force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
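An illustrative autoencoder sketch is given below; the layer sizes are arbitrary assumptions rather than the disclosed architecture, and the narrow "squeeze" layer is read out as the embedding:

```python
# Hedged autoencoder sketch: the squeeze layer output is the embedding, and the decoder
# attempts to recreate the input image from that smaller representation.
import torch
import torch.nn as nn

class CellAutoencoder(nn.Module):
    def __init__(self, embed_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 64, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),                   # squeeze layer -> embedding
        )
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, 64 * 64),
            nn.Unflatten(1, (1, 64, 64)),
        )

    def forward(self, x):
        z = self.encoder(x)                              # embedding used for clustering
        return self.decoder(z), z                        # reconstruction approximates the input

model = CellAutoencoder()
reconstruction, embedding = model(torch.randn(8, 1, 64, 64))     # 8 single-cell crops
print(reconstruction.shape, embedding.shape)
```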
[0178] In some examples, for clustering-based labeling of image data or cells, as disclosed herein, an expanding training data set can be used. With the expanding training data set, one or more revisions of labeling (e.g., manual relabeling) can be needed to, e.g., avoid the degradation of model performance due to the accumulated effect of mislabeled images. Such manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data. Thus, to systematically surface images for potential relabeling, for example, similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
[0179] In some examples, adaptive image augmentation can be used. In order to make the models and classifiers disclosed herein more robust to artifacts in the image data, (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model/classifier). Identifying the image(s) with artifacts can comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (2a) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (3a) flagging/labeling one or more images that have such change in the characteristic(s). Adding the identified images to the training pipeline can comprise: (1b) adding the one or more images that have been flagged/labeled as augmentation by first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median for each pixel of the image); and (3b) adding the delta image to the training pipeline.
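A minimal sketch of the delta-image step outlined above is shown below, under the assumption that a background crop containing no cell is available alongside each cell image; the arrays are synthetic:

```python
# Illustrative delta-image augmentation: subtract the background median from each pixel.
import numpy as np

rng = np.random.default_rng(3)
cell_image = rng.integers(0, 255, size=(64, 64)).astype(np.float32)
background_crop = rng.integers(90, 110, size=(64, 64)).astype(np.float32)

background_median = np.median(background_crop)           # average feature of the changed characteristic
delta_image = cell_image - background_median             # subtract the median from each pixel
# The delta image would then be added to the training pipeline as an augmented sample.
print(float(background_median), float(delta_image.mean()))
```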
[0180] One or more dimension of the cell morphology map can be represented by various approaches (e.g., dimensionality reduction approaches), such as, for example, principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-
SNE), and uniform manifold approximation and projection (UMAP). For example, UMAP can be a machine learning technique for dimension reduction. UMAP can be constructed from a theoretical framework based in Riemannian geometry and algebraic topology. UMAP can be utilized for a practical scalable algorithm that applies to real world data, such as morphological properties of one or more cells.
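A hedged example of such dimensionality reduction is shown below; it assumes the third-party umap-learn package is installed, and the embeddings are synthetic placeholders:

```python
# Illustrative sketch: project high-dimensional morphological embeddings to 2D
# coordinates that can be plotted as a cell morphology map.
import numpy as np
import umap

rng = np.random.default_rng(4)
embeddings = rng.normal(size=(1000, 64))                 # 1,000 cells x 64-dimensional embeddings

reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=0)
map_coords = reducer.fit_transform(embeddings)           # (1000, 2) coordinates for the map
print(map_coords.shape)
```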
[0181] The cell morphology map as disclosed herein can comprise an ontology of the one or more morphological features. The ontology can be an alternative medium to represent a relationship among various datapoints (e.g., each representing a cell) analyzed from an image data. For example, an ontology can be a data structure of information, in which nodes can be linked by edges. An edge can be used to define a relationship between two nodes. For example, a cell morphology map can comprise a cluster comprising sub-clusters, and the relationship between the cluster and the sub-clusters can be represented in a nodes/edges ontology (e.g., an edge can be used to describe the relationship as a subclass of, genus of, part of, stem cell of, differentiated from, progeny of, diseased state of, targets, recruits, interacts with, same tissue, different tissue, etc.).
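A non-limiting sketch of such a nodes/edges ontology, represented as a directed graph, is shown below; the node names and edge relations are illustrative assumptions, and the networkx package is assumed to be available:

```python
# Illustrative ontology sketch: nodes are clusters/sub-clusters, edges carry relations.
import networkx as nx

ontology = nx.DiGraph()
ontology.add_edge("sub-cluster: embryonic stem cells", "cluster: stem cells",
                  relation="is a subclass of")
ontology.add_edge("sub-cluster: mesenchymal stem cells", "cluster: stem cells",
                  relation="is a subclass of")
ontology.add_edge("cluster: differentiated cells", "cluster: stem cells",
                  relation="differentiated from")

for source, target, data in ontology.edges(data=True):
    print(f"{source} [{data['relation']}] {target}")
```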
[0182] In some examples, one-to-one morphology to genomics mapping can be utilized. An image
of a single cell or images of multiple “similar looking” cells can be mapped to its/their molecular profile(s) (e.g., genomics, proteomics, transcriptomics, etc.). In some examples, classifier-based barcoding can be performed. Each sorting event (e.g., positive classifier) can push the sorted cell(s) into an individual well or droplet with a unique barcode (e.g., nucleic acid or small molecule barcode).
The exact barcode(s) used for that individual classifier positive event can be recorded and tracked.
Subsequently, the cells can be lysed and molecularly analyzed together with the barcode(s). The result of the molecular analysis can then be mapped (e.g., one-to-one) to the image(s) of the individual (or ensemble of) sorted cell(s) captured while the cell(s) are flowing in the flow channel. In some examples, class-based sorting can be utilized. Cells that are classified in the same class based at least on their morphological features can be sorted into a single well or droplet with a pre-determined barcoded material, and the cells can be lysed, molecularly analyzed, then any molecular information can be used for the one-to-one mapping as disclosed herein.
[0183] FIG. 4 schematically illustrates different ways of representing analysis data of image data of cells. Tag-free image data 410 of cells (e.g., circular cells and square cells) having different nuclei (e.g., small nucleus and large nucleus) can be analyzed by any of the methods disclosed herein (e.g., based on extraction of one or more morphological features). For example, any of the classifier(s) disclosed herein can be used to analyze and plot the image data 410 into a cell morphology map 420, comprising four distinguishable clusters: cluster A (circular cell, small nucleus), cluster B (circular cell, large nucleus), cluster C (square cell, small nucleus), and cluster D (square cell, large nucleus).
The classifier(s) can also represent the analysis in a cell morphological ontology 430, in which a top node (“cell shape”) can be connected to two sub-nodes (“circular cell” and “rectangular cell”) using an edge (“is a subclass of”) to define the relationship between the nodes. Each sub-node can also be connected to its own sub-nodes (“small nucleus” and “large nucleus”) using an edge (“is a part of”) to define their relationships. The sub-nodes (e.g., “small nucleus” and “large nucleus”) can also be connected using one or more edges (“are similar”) to further define their relationship.
[0184] The cell morphology map or cell morphological ontology as disclosed herein can be further annotated with one or more non-morphological data of each cell. As shown in FIG. 3, the ontology 430 from FIG. 4 can be further annotated with information about the cells that may not be extractable from the image data used to classify the cells (e.g., molecular profiles obtained using molecular barcodes, as disclosed herein). Non-limiting examples of such non-morphological data can be from additional treatment and/or analysis, including, but not limited to, cell culture (e.g., proliferation, differentiation, etc.), cell permeabilization and fixation, cell staining by a probe, mass cytometry, multiplexed ion beam imaging (MIBI), confocal imaging, nucleic acid (e.g., DNA, RNA) or protein extraction, polymerase chain reaction (PCR), target nucleic acid enrichment, sequencing, sequence mapping, etc.
[0185] Examples of the probe used for cell staining (or tagging) may include, but are not limited to, a fluorescent probe (e.g., for staining chromosomes such as X, Y, 13, 18 and 21 in fetal cells), a chromogenic probe, a direct immunoagent (e.g. labeled primary antibody), an indirect immunoagent (e.g., unlabeled primary antibody coupled to a secondary enzyme), a quantum dot, a fluorescent nucleic acid stain (such as DAPI, Ethidium bromide, Sybr green, Sybr gold, Sybr blue, Ribogreen,
Picogreen, YoPro-1, YoPro-2, YoPro-3, YOYO, Oligreen, acridine orange, thiazole orange, propidium iodide, or Hoechst), another probe that emits a photon, or a radioactive probe.
[0186] In some examples, the instrument(s) for the additional analysis may comprise a computer executable logic that performs karyotyping, in situ hybridization (ISH) (e.g., fluorescence in situ hybridization (FISH), chromogenic in situ hybridization (CISH), nanogold in situ hybridization (NISH)), restriction fragment length polymorphism (RFLP) analysis, polymerase chain reaction (PCR) techniques, flow cytometry, electron microscopy, quantum dot analysis, or detects single nucleotide polymorphisms (SNPs) or levels of RNA.
[0187] Analysis of the image data (e.g., extracting one or more morphological features from the image data, determining clustering and/or cell morphology map based on the image data, etc.) can be performed (e.g., automatically) within less than about 1 hour, e.g., less than about 50 minutes, or less than about 40 minutes, or less than about 30 minutes, or less than about 25 minutes, or less than about 20 minutes, or less than about 15 minutes, or less than about 10 minutes, or less than about 9 minutes,
or less than about 8 minutes, or less than about 7 minutes, or less than about 6 minutes, or less than about 5 minutes, or less than about 4 minutes, or less than about 3 minutes, or less than about 2 minutes, or less than about 1 minute, or less than about 50 seconds, or less than about 40 seconds, or less than about 30 seconds, or less than about 20 seconds, or less than about 10 seconds, or less than about 5 seconds, about 1 second, or less. In some examples, such analysis can be performed in real-time.
[0188] One or more morphological features utilized for generating the clusters or the cell morphology map, as disclosed herein, can be selected automatically (e.g., by one or more machine learning algorithms) or, alternatively, selected manually by a user using a user interface (e.g., graphical user interface (GUI)). The GUI can show visualization of, for example, (i) the one or more morphological parameters extracted from the image data (e.g., represented as images, words, symbols, predefined codes, etc.), (ii) the cell morphology map comprising one or more clusters, or (iii) the cell morphological ontology. The user can select, using the GUI, which morphological parameter(s) to be used to generate the clusters and the cell morphological map prior to actual generation of the clusters and the cell morphological map. The user can, upon seeing or receiving a report about the generated clusters and the cell morphological map, retroactively modify the types of morphological parameter(s) to use, thereby to (i) modify the clustering or the cell morphological mapping and/or (ii) create new cluster(s) or new cell morphological map(s). In some examples, the user can select one or more regions to be excluded or included for further analysis or further processing of the cells (e.g., sorting in the future or in real-time). For example, a microfluidic system as disclosed herein can be utilized to capture image(s) of each cell from a population of cells, and any of the methods disclosed herein can be utilized to analyze such image data to generate a cell morphology map comprising clusters representing the population of cells. The user can select one or more clusters or sub-clusters to be sorted, and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly. In another example, the user can select one or more clusters or sub-clusters to be excluded during sorting (e.g., to get rid of artifacts, debris, or dead cells), and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly without such artifacts, debris, or dead cells.
[0189] The cell morphology map or cell morphological ontology as disclosed herein can be further annotated with one or more non-morphological data of each cell. As shown in FIG. 5, the ontology 430 from FIG. 4 can be further annotated with information about the cells that may not be extractable from the image data used to classify the cells (e.g., molecular profiles obtained using molecular barcodes, as disclosed herein). Non-limiting examples of such non-morphological data can be from additional treatment and/or analysis, including, but not limited to, cell culture (e.g., proliferation, differentiation, etc.), cell permeabilization and fixation, cell staining by a probe, mass cytometry, multiplexed ion beam imaging (MIBI), confocal imaging, nucleic acid (e.g., DNA, RNA) or protein extraction, polymerase chain reaction (PCR), target nucleic acid enrichment, sequencing, sequence mapping, etc.
[0190] Examples of the probe used for cell staining (or tagging) may include, but are not limited to, a fluorescent probe (e.g., for staining chromosomes such as X, Y, 13, 18 and 21 in fetal cells), a chromogenic probe, a direct immunoagent (e.g. labeled primary antibody), an indirect immunoagent (e.g., unlabeled primary antibody coupled to a secondary enzyme), a quantum dot, a fluorescent nucleic acid stain (such as DAPI, Ethidium bromide, Sybr green, Sybr gold, Sybr blue, Ribogreen,
Picogreen, YoPro-1, YoPro-2, YoPro-3, YOYO, Oligreen, acridine orange, thiazole orange, propidium iodide, or Hoechst), another probe that emits a photon, or a radioactive probe.
[0191] In some examples, the instrument(s) for the additional analysis may comprise a computer executable logic that performs karyotyping, in situ hybridization (ISH) (e.g., fluorescence in situ hybridization (FISH), chromogenic in situ hybridization (CISH), nanogold in situ hybridization (NISH)), restriction fragment length polymorphism (RFLP) analysis, polymerase chain reaction (PCR) techniques, flow cytometry, electron microscopy, quantum dot analysis, or detects single nucleotide polymorphisms (SNPs) or levels of RNA.
[0192] Analysis of the image data (e.g., extracting one or more morphological features from the image data, determining clustering and/or cell morphology map based on the image data, etc.) can be performed (e.g., automatically) within less than about 1 hour, about 50 minutes, about 40 minutes, about 30 minutes, about 25 minutes, about 20 minutes, about 15 minutes, about 10 minutes, about 9 minutes, about 8 minutes, about 7 minutes, about 6 minutes, about 5 minutes, about 4 minutes, about 3 minutes, about 2 minutes, about 1 minute, about 50 seconds, about 40 seconds, about 30 seconds, about 20 seconds, about 10 seconds, about 5 seconds, about 1 second, or less. In some examples, such analysis can be performed in real-time.
[0193] One or more morphological features utilized for generating the clusters or the cell morphology map, as disclosed herein, can be selected automatically (e.g., by one or more machine learning algorithms) or, alternatively, selected manually by a user using a user interface (e.g., graphical user interface (GUI)). The GUI can show visualization of, for example, (i) the one or more morphological parameters extracted from the image data (e.g., represented as images, words, symbols, predefined codes, etc.), (ii) the cell morphology map comprising one or more clusters, or (iii) the cell morphological ontology. The user can select, using the GUI, which morphological parameter(s) to be used to generate the clusters and the cell morphological map prior to actual generation of the clusters and the cell morphological map. The user can, upon seeing or receiving a report about the generated clusters and the cell morphological map, retroactively modify the types of morphological parameter(s) to use, thereby to (i) modify the clustering or the cell morphological mapping and/or (ii) create new cluster(s) or new cell morphological map(s). In some examples, the user can select one or more regions
to be excluded or included for further analysis or further processing of the cells (e.g., sorting in the future or in real-time). For example, a microfluidic system as disclosed herein can be utilized to capture image(s) of each cell from a population of cells, and any of the methods disclosed herein can be utilized to analyze such image data to generate a cell morphology map comprising clusters representing the population of cells. The user can select one or more clusters or sub-clusters to be sorted, and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly. In another example, the user can select one or more clusters or sub-clusters to be excluded during sorting (e.g., to get rid of artifacts, debris, or dead cells), and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly without such artifacts, debris, or dead cells.
[0194] FIG. 6 schematically illustrates a method for a user to interact (e.g., using GUI) with any one of the methods disclosed herein. Image data 610 of a plurality of cells can be processed, using any one of the methods disclosed herein, to generate a cell morphology map 620A that represents the plurality of cells as datapoints in different clusters A, B, C, and D. The cell morphology map 620A can be displayed to the user using the GUI 630. The user can select each cluster or a datapoint within each cluster to visualize one or more images 650a, b, c, or d of the cells classified into the cluster.
Upon visualization of the images, the user can draw a box 640 (e.g., using any user-defined shape and/or size) around one or more datapoints or around a cluster. For example, the user can draw a box 640 around a cluster of “debris” datapoints, to, e.g., remove the selected cluster and generate a new cell morphology map 620B. The user input can be used to update cell classifying algorithms, mapping algorithms, cell flowing mechanism (e.g., velocity of cells, positioning of the cells within a flow channel, adjusting imaging focal length/plane of one or more sensors/cameras of an imaging module (also referred to as an imaging device herein) that captures one or more images/videos of cells flowing through the cartridge, etc.), cell sorting mechanisms in the flow channel, cell sorting instructions in the flow channel, etc. For example, upon the user’s selection, the classifier can be trained to identify one or more common morphological features within the selected datapoints (e.g., features that distinguish the selected datapoints from the unselected data). Features of the selected group can be used to further identify other cells from other samples having similar feature(s) for further analysis or discard cells having similar feature(s), e.g., for cell sorting.
[0195] The present disclosure also describes a cell analysis platform, e.g., for analyzing or classifying a cell. The cell analysis platform can be a product of any one of the methods disclosed herein. In another example, or in addition to, the cell analysis platform can be used as a basis to execute any one of the methods disclosed herein. For example, the cell analysis platform can be used to process image data comprising tag-free images of single cells to generate a new cell morphology map of various cell clusters. In another example, the cell analysis platform can be used to process image data comprising tag-free images of single cells to compare the cell to pre-determined (e.g., pre-analyzed) images of known cells or cell morphology map(s), such that the single cells from the image data can be classified, e.g., for cell sorting. FIG. 7 illustrates an example cell analysis platform (e.g., machine learning/artificial intelligence platform) for analyzing image data of one or more cells. The cell analysis platform 700 can comprise a cell morphology atlas (CMA) 705. The CMA 705 can comprise a database 710 having a plurality of annotated single cell images that are grouped into morphologically-distinct clusters (e.g., represented as texts, as cell morphology map(s), or cell morphological ontology(ies)) corresponding to a plurality of classifications (e.g., predefined cell classes). The CMA 705 can comprise a modeling unit comprising one or more models (e.g., a modeling library 720 comprising, for example, one or more machine learning algorithms disclosed herein) that are trained and validated using datasets from the CMA 705, to process image data comprising images/videos of one or more cells to identify different cell types and/or states based at least on morphological features. The CMA 705 can comprise an analysis module 730 comprising one or more classifiers as disclosed herein. The classifier(s) can use one or more of the models from the modeling library 720 to, e.g., (1) classify one or more images taken from a sample, (2) assess a quality or state of the sample based on the one or more images, (3) map one or more datapoints representing such one or more images onto a cell morphology map (or cell morphological ontology) using a mapping module
740. The CMA 705 can be operatively coupled to one or more additional database 770 to receive the image data comprising the images/videos of one or more cells. For example, the image data from the database 770 can be obtained from an imaging module 792 of a cartridge 790, which can also be operatively coupled to the CMA 705. The cartridge can direct flow of a sample comprising or suspected of comprising a target cell, and capture one or more images of contents (e.g., cells) within the sample by the imaging module 792. Any image data obtained by the imaging module 792 can be transmitted directly to the CMA 705 and/or to the new image database 770. In another example, the
CMA 705 can be operatively coupled to one or more additional databases 780 comprising non-morphological data of any of the cells (e.g., genomics, transcriptomics, or proteomics, etc.), e.g., to further annotate any of the datapoint, cluster, map, ontology, images, as disclosed herein. The CMA 705 can be operatively coupled to a user device 750 (e.g., a computer or a mobile device comprising a display) comprising a GUI 760 for the user to receive information from and/or to provide input (e.g., instructions to modify or assist any portion of the method disclosed herein). Any classification made by the CMA and/or the user can be provided as an input to the sorting module 794 of the cartridge 790. Based on the classification, the sorting module can determine, for example, (i) when to activate one or more sorting mechanisms at the sorting junction of the cartridge 790 to sort one or more cells of interest, (ii) which sub-channel of a plurality of sub-channels to direct each single cell for sorting.
In some examples, the sorted cells can be collected for further analysis, e.g., downstream molecular assessment and/or profiling, such as genomics, transcriptomics, proteomics, metabolomics, etc. Any of the methods or platforms disclosed herein can be used as a tool that permits a user to train one or more models (e.g., from the modeling library) for cell clustering and/or cell classification. For example, a user may provide an initial image dataset of a sample to the platform, and the platform may process the initial set of image data. Based on the processing, the platform can determine a number of labels and/or an amount of data that the user needs to train the one or more models, based on the initial image dataset of the sample. In some examples, the platform can determine that the initial set of image data can be insufficient to provide an accurate cell classification or cell morphology map. For example, the platform can plot an initial cell morphology map and recommend to the user the number of labels and/or the amount of data needed for enhanced processing, classification, and/or sorting, based on proximity (or separability), correlation, or commonality of the datapoints in the map (e.g., whether there are no distinguishable clusters within the map, whether the clusters within the map are too close to each other, etc.). In another example, the platform can allow the user to select a different model (e.g.,
clustering model) or classifier, or different combinations of models or classifiers, to re-analyze the initial set of image data.
[0196] Any of the methods or platforms disclosed herein can be used to determine quality or state of the image(s) of the cell, that of the cell, or that of a sample comprising the cell. The quality or state of the cell can be determined at a single cell level. In another example, the quality or state of the cell can be determined at an aggregate level (e.g., as a whole sample, or as a portion of the sample). The quality or state can be determined and reported based on, e.g., a number system (e.g., a number scale from about 1 to about 10, a percentage scale from about 1% to about 100%), a symbolic system, or a color system. For example, the quality or state can be indicative of a preparation or priming condition of the sample (e.g., whether the sample has a sufficient number of cells, whether the sample has too much artifacts, debris, etc.) or indicative of a viability of the sample (e.g., whether the sample has an amount of “dead” cells above a predetermined threshold).
[0197] Any of the methods or platforms disclosed herein can be used to sort cells in silico (e.g., prior to actual sorting of the cells using a microfluidic channel). The in silico sorting can be, e.g., to discriminate among and/or between, e.g., multiple different cell types (e.g., different types of cancer cells, different types of immune cells, etc.), cell states, cell qualities. The methods and platforms disclosed herein can utilize pre-determined morphological properties (e.g., provided in the platform) for the discrimination. In another example, new morphological properties can be abstracted (e.g., generated) based on the input data for the discrimination. In some examples, new model(s) and/or classifier(s) can be trained or generated to process the image data. In some examples, the newly abstracted morphological properties can be used to discriminate among and/or between, e.g., multiple different cell types, cell states, cell qualities that are known. In another example, the newly abstracted morphological properties can be used to create new classes (or classifications) to sort the cells (e.g., in silico or via the microfluidic system). The newly abstracted morphological properties as disclosed herein may enhance accuracy or sensitivity of cell sorting (e.g., in silico or via the microfluidic system).
[0198] Subsequent to the in silico sorting of the cells, the actual cell sorting of the cells (e.g., via the microfluidic system or cartridge) based on the in silico sorting can be performed within less than about 1 hour, about 50 minutes, about 40 minutes, about 30 minutes, about 25 minutes, about 20 minutes, about 15 minutes, about 10 minutes, about 9 minutes, about 8 minutes, about 7 minutes, about 6 minutes, about 5 minutes, about 4 minutes, about 3 minutes, about 2 minutes, about 1 minute, about
50 seconds, about 40 seconds, about 30 seconds, about 20 seconds, about 10 seconds, about 5 seconds, about 1 second, or less. In some examples, the in silico sorting and the actual sorting can occur in real- time.
[0199] In any of the methods or platforms disclosed herein, the model(s) and/or classifier(s) can be validated (e.g., for the ability to demonstrate accurate cell classification performance). Non-limiting examples of validation metrics that can be utilized can include, but are not limited to, threshold metrics (e.g., accuracy, F-measure, Kappa, Macro-Average Accuracy, Mean-Class-Weighted Accuracy,
Optimized Precision, Adjusted Geometric Mean, Balanced Accuracy, etc.), the ranking methods and metrics (e.g., receiver operating characteristics (ROC) analysis or “ROC area under the curve (ROC
AUC)”), and the probabilistic metrics (e.g., root-mean-squared error). For example, the model(s) or classifier(s) can be determined to be balanced or accurate when the ROC AUC is greater than 0.5, greater than about 0.55, greater than about 0.6, greater than about 0.65, greater than about 0.7, greater than about 0.75, greater than about 0.8, greater than about 0.85, greater than about 0.9, greater than about 0.91, greater than about 0.92, greater than about 0.93, greater than about 0.94, greater than about 0.95, greater than about 0.96, greater than about 0.97, greater than about 0.98, greater than about 0.99, or more.
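An illustrative validation sketch using one of the ranking metrics named above (ROC AUC), computed with scikit-learn on synthetic labels and prediction scores, is shown below:

```python
# Hedged example: ROC AUC of a hypothetical binary cell classifier on synthetic data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
y_true = rng.integers(0, 2, size=500)                               # true cell classes (binary case)
y_score = np.clip(0.6 * y_true + rng.normal(0.2, 0.3, size=500), 0.0, 1.0)  # predicted probabilities

auc = roc_auc_score(y_true, y_score)
print(f"ROC AUC = {auc:.3f} (compare against a chosen threshold, e.g., greater than 0.5)")
```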
[0200] In any of the methods or platforms disclosed herein, the image(s) of the cell(s) can be obtained when the cell(s) are prepared and diluted in a sample (e.g., a buffer sample). The cell(s) can be diluted, e.g., in comparison to real-life concentrations of the cell in the tissue (e.g., solid tissue, blood, serum, spinal fluid, urine, etc.) to a dilution concentration. The methods or platforms disclosed herein can be compatible with a sample (e.g., a biological sample or derivative thereof) that is diluted by a factor of about 500 to about 1,000,000. The methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of at least about 500. The methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of at most about 1,000,000.
The methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of about 500 to about 1,000, about 500 to about 2,000, about 500 to about 5,000, about 500 to about 10,000, about 500 to about 20,000, about 500 to about 50,000, about 500 to about 100,000, about 500 to about 200,000, about 500 to about 500,000, about 500 to about 1,000,000, about 1,000 to about 2,000, about 1,000 to about 5,000, about 1,000 to about 10,000, about 1,000 to about 20,000, about 1,000 to about 50,000, about 1,000 to about 100,000, about 1,000 to about 200,000, about 1,000 to about 500,000, about 1,000 to about 1,000,000, about 2,000 to about 5,000, about 2,000 to about
10,000, about 2,000 to about 20,000, about 2,000 to about 50,000, about 2,000 to about 100,000, about 2,000 to about 200,000, about 2,000 to about 500,000, about 2,000 to about 1,000,000, about 5,000 to about 10,000, about 5,000 to about 20,000, about 5,000 to about 50,000, about 5,000 to about 100,000, about 5,000 to about 200,000, about 5,000 to about 500,000, about 5,000 to about 1,000,000, about 10,000 to about 20,000, about 10,000 to about 50,000, about 10,000 to about 100,000, about 10,000 to about 200,000, about 10,000 to about 500,000, about 10,000 to about 1,000,000, about 20,000 to about 50,000, about 20,000 to about 100,000, about 20,000 to about 200,000, about 20,000 to about 500,000, about 20,000 to about 1,000,000, about 50,000 to about 100,000, about 50,000 to about 200,000, about 50,000 to about 500,000, about 50,000 to about 1,000,000, about 100,000 to about 200,000, about 100,000 to about 500,000, about 100,000 to about 1,000,000, about 200,000 to about 500,000, about 200,000 to about 1,000,000, or about 500,000 to about 1,000,000. The methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of about 500, about 1,000, about 2,000, about 5,000, about 10,000, about 20,000, about 50,000, about 100,000, about 200,000, about 500,000, or about 1,000,000.
[0201] In any of the methods or platforms disclosed herein, the classifier can generate a prediction probability (e.g., based on the morphological clustering and analysis) that an individual cell or a cluster of cells belongs to a cell class (e.g., within a predetermined cell class provided in the CMA as disclosed herein), e.g., via a reporting module. The reporting module can communicate with the user via a GUI as disclosed herein. In another example, the classifier can generate a prediction vector that an individual cell or a cluster of cells belongs to a plurality of cell classes (e.g., a plurality of all of predetermined cell classes from the CMA as disclosed herein). The vector can be 1D (e.g., a single row of different cell classes), 2D (e.g., two dimensions, such as tissue origin vs. cell type), 3D, etc. In some examples, based on processing and analysis of image data obtained from a sample, the classifier can generate a report showing a composition of the sample, e.g., a distribution of one or more cell types, each cell type indicated with a relative proportion within the sample. Each cell of the sample can also be annotated with a most probable cell type and one or more less probable cell types.
[0202] Any one of the methods and platforms disclosed herein can be capable of processing image data of one or more cells to generate one or more morphometric maps of the one or more cells. Non-limiting examples of morphometric models that can be utilized to analyze one or more images of single cells (or cell clusters) can include, e.g., simple morphometrics (e.g., based on lengths, widths, masses, angles, ratios, areas, etc.), landmark-based geometric morphometrics (e.g., spatial information, intersections, etc. of one or more components of a cell), Procrustes-based geometric morphometrics (e.g., by removing non-shape information that is altered by translation, scaling, and/or rotation from the image data), Euclidean distance matrix analysis, diffeomorphometry, and outline analysis. The morphometric map(s) can be multi-dimensional (e.g., 2D, 3D, etc.). The morphometric map(s) can be reported to the user via the GUI.
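A hedged sketch of such simple morphometrics (areas, lengths, and ratios) computed from a binary cell mask with scikit-image region properties is shown below; the mask is a synthetic disk rather than real image data:

```python
# Illustrative simple-morphometrics extraction from a stand-in segmented cell mask.
import numpy as np
from skimage.draw import disk
from skimage.measure import label, regionprops

mask = np.zeros((100, 100), dtype=np.uint8)
rr, cc = disk((50, 50), 20)
mask[rr, cc] = 1                                          # stand-in for a segmented cell

props = regionprops(label(mask))[0]
features = {
    "area": float(props.area),
    "perimeter": float(props.perimeter),
    "major_axis_length": float(props.major_axis_length),
    "eccentricity": float(props.eccentricity),
    "circularity": float(4 * np.pi * props.area / props.perimeter ** 2),
}
print(features)
```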
[0203] Any of the methods or platforms disclosed herein (e.g., the analysis module) can be used to process, analyze, classify, and/or compare two or more samples (e.g., at least about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, or more test samples). The two or more samples can each be analyzed to determine a morphological profile (e.g., a cell morphology map) of each sample. For example, the morphological profiles of the two or more samples can be compared for identifying a disease state of a patient’s sample in comparison to a healthy cohort’s sample or a sample of image data representative of a disease of interest. In another example, the morphological profiles of the two or more samples can be compared to monitor progress of a condition of a subject, e.g., comparing first image data of a first set of cells from a subject before a treatment (e.g., a test drug candidate, chemotherapy, surgical resection of solid tumors, etc.) and second image data of a second set of cells from the subject after the treatment. The second set of cells can be obtained from the subject at least about 1 week, at least about 2 weeks, at least about 3 weeks, at least about 4 weeks, at least about 2 months, or at least about 3 months subsequent to obtaining the first set of cells from the subject.
In a different example, the morphological profiles of the two or more samples can be compared to monitor effects of two or more different treatment options (e.g., different test drugs) in two or more different cohorts (e.g., human subjects, animal subjects, or cells being tested in vitro/ex vivo).
Accordingly, the systems and methods disclosed herein can be utilized (e.g., using sorting or enrichment of a cell type of interest or a cell exhibiting a characteristic of interest) to select a drug and/or a therapy that yields a desired effect (e.g., a therapeutic effect greater than or equal to a threshold value).
[0204] Any of the platforms disclosed herein (e.g., cell analysis platform) can provide an inline end-to-end pipeline solution for continuous labeling and/or sorting of multiple different cell types and/or states based at least in part on (e.g., based solely on) morphological analysis of imaging data provided. A modeling library used by the platform can be scalable for large amounts of data, extensible (e.g., one or more models or classifiers can be modified), and/or generalizable (e.g., more resistant to data perturbations, such as artifacts, debris, random objects in the background, or image/video distortions, between samples). Any model of the modeling library can be removed or updated with a new model automatically by the machine learning algorithms or artificial intelligence, or by the user.
[0205] Any of the methods and platforms disclosed herein can adjust one or more parameters of the microfluidic system as disclosed herein. As cells are flowing through a flow channel, an imaging module (e.g., sensors, cameras) can capture image(s)/video(s) of the cells and generate new image data. The image data can be processed and analyzed (e.g., in real-time) by the methods and platforms of the present disclosure to train a model (e.g., machine learning model) to determine whether or not to adjust one or more parameters of the microfluidic system.
[0206] In some examples, the model(s) can determine that the cells are flowing too fast or too slow, and send an instruction to the microfluidic system to adjust (i) the velocity of the cells (e.g., by adjusting the velocity of the fluid medium carrying the cells) and/or (ii) the image recording rate of a camera that is capturing images/videos of cells flowing through the flow channel.
[0207] In some examples, the model(s) can determine that the cells are in-focus or out-of-focus in the images/videos, and send an instruction to the microfluidic system to (i) adjust a positioning of the cells within the cartridge (e.g., move the cell towards or away from the center of the flow channel via, for example, hydrodynamic focusing and/or inertial focusing) and/or (ii) adjust a focal length/plane of the camera that is capturing images/videos of cells flowing through the flow channel. Adjusting the focal length/plane can be performed for the same cell that has been analyzed (e.g., adjusting focal length/plane of a camera that is downstream) or a subsequent cell. Adjusting the focal length/plane can enhance clarity or reduce blurriness in the images. The focal length/plane can be adjusted based on a classified type or state of the cell. In some examples, adjusting the focal length/plane can allow enhanced focusing/clarity on all parts of the cell. In some examples, adjusting the focal length/plane can allow enhanced focusing/clarity on different portions (but not all parts) of the cell. Without wishing to be bound by any particular theory, out-of-focus images can be usable for any of the methods disclosed herein to extract morphological feature(s) of the cell that otherwise may not be abstracted from in-focus images, or vice versa. Thus, in some examples, instructing the imaging module to capture both in-focus and out-of-focus images of the cells can enhance accuracy of any of the analysis of cells disclosed herein.
[0208] In another example, the model(s) can send an instruction to the microfluidic system to modify the flow and adjust an angle of the cell relative to the camera, to adjust focus on different portions of the cell or a subsequent cell. Different portions as disclosed herein can comprise an upper portion, a mid portion, a lower portion, membrane, nucleus, mitochondria, etc. of the cell.
[0209] In order to image cells at the right focus (with respect to height or z dimension), the “focus measure” of an image is conventionally calculated using information-theoretic methods such as the Fourier transform or the Laplace transform.
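One conventional focus-measure sketch (an illustration, not the specific method of this disclosure) is the variance of the Laplacian of an image, where a higher score usually indicates sharper focus; OpenCV is assumed to be available and the images are synthetic:

```python
# Illustrative focus measure: variance of the image Laplacian.
import cv2
import numpy as np

def focus_measure(image_gray: np.ndarray) -> float:
    # Laplace transform of the image; its variance serves as a simple focus score
    return float(cv2.Laplacian(image_gray, cv2.CV_64F).var())

sharp = np.random.default_rng(6).integers(0, 255, size=(128, 128)).astype(np.uint8)
blurred = cv2.GaussianBlur(sharp, (11, 11), 5)
print(focus_measure(sharp), focus_measure(blurred))       # the blurred image scores lower
```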
[0210] In some examples, bi-directional out-of-focus (OOF) images of cells can be used (e.g., one or more first images that are OOF in a first direction, and one or more second images that are OOF in a second direction that is different from, such as opposite to, the first direction). For example, images that are
OOF in two opposite directions can be called “bright OOF” image(s) and “dark OOF” image(s), which can be obtained by changing the z-focus bi-directionally. A classifier as disclosed herein can be trained
with image data comprising both bright OOF image(s) and dark OOF image(s). The trained classifiers can be used to run inferences (e.g., in real-time) on new image data of cells to classify each image as a bright OOF image, a dark OOF image, and optionally an image that is not OOF (e.g., not OOF relative to the bright/dark OOF images). The classifier can also measure a percentage of bright OOF images, a percentage of dark OOF images, or a percentage of both bright and dark OOF images within the image data. For example, if any of the percentage of bright OOF images, the percentage of dark
OOF images, or the percentage of both bright and dark OOF images is above a threshold value (e.g., a predetermined threshold value), then the classifier can determine that the imaging device (e.g., of the microfluidic system as disclosed herein) may not be imaging cells at the right focal length/plane. The classifier can instruct the user, via a GUI of a user device, to adjust the imaging device’s focal length/plane. In some examples, the classifier can determine, based on analysis of the image data comprising OOF images, the direction and degree of adjustment of the focal length/plane that can be required to adjust the imaging device, to yield a reduced amount of OOF imaging. In some examples, the classifier and the microfluidic device can be operatively coupled to a machine learning/artificial intelligence controller, such that the focal length/plane of the imaging device can be adjusted automatically upon determination of the classifier.
[0211] A threshold (e.g., a predetermined threshold) of a percentage of OOF images (e.g., bright OOF, dark OOF, or both) can be about 0.1 % to about 20 %. A threshold (e.g., a predetermined threshold) of a percentage of OOF images (e.g., bright OOF, dark OOF, or both) can be at least about 0.1 %. A threshold (e.g., a predetermined threshold) of a percentage of OOF images (e.g., bright OOF, dark
OOF, or both) can be at most about 20 %. A threshold (e.g., a predetermined threshold) of a percentage of OOF images (e.g., bright OOF, dark OOF, or both) can be about 0.1 % to about 0.5 %, about 0.1 %
to about 1 %, about 0.1 % to about 2 %, about 0.1 % to about 4 %, about 0.1 % to about 6 %, about 0.1 % to about 8 %, about 0.1 % to about 10 %, about 0.1 % to about 15 %, about 0.1 % to about 20 %, about 0.5 % to about 1 %, about 0.5 % to about 2 %, about 0.5 % to about 4 %, about 0.5 % to about 6 %, about 0.5 % to about 8 %, about 0.5 % to about 10 %, about 0.5 % to about 15 %, about 0.5 % to about 20 %, about 1 % to about 2 %, about 1 % to about 4 %, about 1 % to about 6 %, about 1 % to about 8 %, about 1 % to about 10 %, about 1 % to about 15 %, about 1 % to about 20 %, about 2 % to about 4 %, about 2 % to about 6 %, about 2 % to about 8 %, about 2 % to about 10 %, about 2 % to about 15 %, about 2 % to about 20 %, about 4 % to about 6 %, about 4 % to about 8 %, about 4 % to about 10 %, about 4 % to about 15 %, about 4 % to about 20 %, about 6 % to about 8 %, about 6 % to about 10 %, about 6 % to about 15 %, about 6 % to about 20 %, about 8 % to about 10 %, about 8 % to about 15 %, about 8 % to about 20 %, about 10 % to about 15 %, about 10 % to about 20 %, or about 15 % to about 20 %. A threshold (e.g., a predetermined threshold) of a percentage of OOF images (e.g., bright OOF, dark OOF, or both) can be at least about 0.1 %, or at least about 0.5 %, or at least about 1 %, or at least about 2 %, or at least about 4 %, or at least about 6 %, or at least about 8 %, or at least about 10 %, or at least about 15 %, or at least about 20 %.
[0212] In some examples, the model(s) can determine that images of different modalities are needed for any of the analysis disclosed herein. Images of varying modalities can comprise a bright field image, a dark field image, a fluorescent image (e.g., of cells stained with a dye), an in-focus image, an out-of-focus image, a greyscale image, a monochrome image, a multi-chrome image, etc.
[0213] Any of the models or classifiers disclosed herein can be trained on a set of image data that is annotated with one imaging modality. In another example, the models/classifiers can be trained on a set of image data that is annotated with a plurality of different imaging modalities (e.g., about 2, about 3, about 4, about 5, or more different imaging modalities). Any of the models/classifiers disclosed herein can be trained on a set of image data that is annotated with a spatial coordinate indicative of a position or location within the flow channel. Any of the models/classifiers disclosed herein can be trained on a set of image data that is annotated with a timestamp, such that a set of images can be processed based on the time they are taken.
[0214] An image of the image data can be processed using various image processing methods, such as horizontal or vertical image flips, orthogonal rotation, Gaussian noise, contrast variation, or noise introduction to mimic microscopic particles or pixel-level aberrations. One or more of the processing methods can be used to generate replicas of the image or analyze the image. In some examples, the image can be processed into a lower-resolution image or a lower-dimension image (e.g., by using one or more deconvolution algorithms).
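A non-limiting sketch of generating such replicas with the processing methods listed above (flips, orthogonal rotation, Gaussian noise, and contrast variation) on a synthetic single-channel image is shown below:

```python
# Illustrative replica generation for a synthetic single-channel cell image.
import numpy as np

rng = np.random.default_rng(7)
image = rng.integers(0, 255, size=(64, 64)).astype(np.float32)

replicas = [
    np.fliplr(image),                                     # horizontal flip
    np.flipud(image),                                     # vertical flip
    np.rot90(image),                                      # orthogonal rotation
    image + rng.normal(0, 10, size=image.shape),          # Gaussian noise / pixel-level aberrations
    np.clip(1.2 * (image - image.mean()) + image.mean(), 0, 255),  # contrast variation
]
print(len(replicas), replicas[0].shape)
```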
[0215] In any of the methods disclosed herein, processing an image or video from image data can comprise identifying, accounting for, and/or excluding one or more artifacts from the image/video, either automatically or manually by a user. Upon identification, the artifact(s) can be fed into any of the models or classifiers, to train image processing or image analysis. The artifact(s) can be accounted for when classifying the type or state of one or more cells in the image/video. The artifact(s) can be excluded from any determination of the type or state of the cell(s) in the image/video. The artifact(s) can be removed in silico by any of the models/classifiers disclosed herein, and any new replica or modified variant of the image/video excluding the artifact(s) can be stored in a database as disclosed herein. The artifact(s) can be, for example, from debris (e.g., dead cells, dust, etc.), optical conditions during capturing the image/video of the cells (e.g., lighting variability, over-saturation, under-exposure, degradation of the light source, etc.), external factors (e.g., vibrations, misalignment of the microfluidic chip relative to the lighting or optical sensor/camera, power surges/fluctuations, etc.), and changes to the microfluidic system (e.g., deformation/shrinkage/expansion of the microfluidic channel or the microfluidic chip as a whole). The artifacts can be known. The artifacts can be unknown, and the models or classifiers disclosed herein can be configured to define one or more parameters of a new artifact, such that the new artifact can be identified, accounted for, and/or excluded in image processing and analysis.
[0216] In some examples, a plurality of artifacts disclosed herein can be identified, accounted for, and/or excluded during image/video processing or analysis. The plurality of artifacts can be weighted the same (e.g., determined to have the same degree of influence on the image/video processing or analysis) or can have different weights (e.g., determined to have different degrees of influence on the image/video processing or analysis). Weight assignments to the plurality of artifacts can be instructed manually by the user or determined automatically by the models/classifiers disclosed herein.
[0217] In some examples, one or more reference images or videos of the flow channel (e.g., with or without any cell) can be stored in a database and used as a frame of reference to help identify, account for, and/or exclude any artifact. The reference image(s)/video(s) can be obtained before use of the microfluidics system. The reference image(s)/video(s) can be obtained during the use of the microfluidics system. The reference image(s)/video(s) can be obtained periodically during the use of the microfluidics system, such as, each time the optical sensor/camera captures at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, or at least about 100,000 images.
The reference image(s)/video(s) can be obtained periodically during the use of the microfluidics system, such as, each time the microfluidics system passes at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, at least about 100,000 cells.
The reference image(s)/video(s) can be obtained at landmark periods during the use of the microfluidics system, such as, when the optical sensor/camera captures at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, at least about 100,000 images.
The reference image(s)/video(s) can be obtained at landmark periods during the use of the microfluidics system, such as, when the microfluidics system passes at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, at least about 100,000 cells.
[0218] The method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 images/second to about 100,000,000 images/second.
The rate of image data processing can be at least about 1,000 images/second.
The rate of image data processing can be at most about 100,000,000 images/second.
The rate of image data processing can be about 1,000 images/second to about 5,000 images/second, about 1,000 images/second to about 10,000 images/second, about 1,000 images/second to about 50,000 images/second, about 1,000 images/second to about 100,000 images/second, about 1,000 images/second to about 500,000 images/second, about 1,000 images/second to about 1,000,000 images/second, about 1,000 images/second to about 5,000,000 images/second, about 1,000 images/second to about 10,000,000 images/second, about 1,000 images/second to about 50,000,000 images/second, about 1,000 images/second to about 100,000,000 images/second, about 5,000 images/second to about 10,000 images/second, about 5,000 images/second to about 50,000 images/second, about 5,000 images/second to about 100,000 images/second, about 5,000 images/second to about 500,000 images/second, about 5,000 images/second to about 1,000,000 images/second, about 5,000 images/second to about 5,000,000 images/second, about 5,000 images/second to about 10,000,000 images/second, about 5,000 images/second to about 50,000,000 images/second, about 5,000 images/second to about 100,000,000 images/second, about 10,000 images/second to about 50,000 images/second, about 10,000 images/second to about 100,000 images/second, about 10,000 images/second to about 500,000 images/second, about 10,000 images/second to about 1,000,000 images/second, about 10,000 images/second to about 5,000,000 images/second, about 10,000 images/second to about 10,000,000 images/second, about 10,000 images/second to about 50,000,000 images/second, about 10,000 images/second to about 100,000,000 images/second, about 50,000 images/second to about 100,000 images/second, about 50,000 images/second to about 500,000 images/second, about 50,000 images/second to about 1,000,000 images/second, about 50,000 images/second to about 5,000,000 images/second, about 50,000 images/second to about 10,000,000 images/second, about 50,000 images/second to about 50,000,000 images/second, about 50,000 images/second to about 100,000,000 images/second, about 100,000 images/second to about 500,000 images/second, about 100,000 images/second to about 1,000,000 images/second, about 100,000 images/second to about 5,000,000 images/second, about 100,000 images/second to about 10,000,000 images/second, about 100,000 images/second to about 50,000,000 images/second, about 100,000 images/second to about 100,000,000 images/second, about 500,000 images/second to about 1,000,000 images/second, about 500,000 images/second to about 5,000,000 images/second, about 500,000 images/second to about 10,000,000 images/second, about 500,000 images/second to about 50,000,000 images/second, about 500,000 images/second to about 100,000,000 images/second, about 1,000,000 images/second to about 5,000,000 images/second, about 1,000,000 images/second to about 10,000,000 images/second, about 1,000,000 images/second to about 50,000,000 images/second, about 1,000,000 images/second to about 100,000,000 images/second, about 5,000,000 images/second to about 10,000,000 images/second, about 5,000,000 images/second to about 50,000,000 images/second, about 5,000,000 images/second to about 100,000,000 images/second, about 10,000,000 images/second to about 50,000,000 images/second, about 10,000,000 images/second to about 100,000,000 images/second, or about 50,000,000 images/second to about 100,000,000 images/second.
The rate of image data processing can be about 1,000 images/second, about 5,000 images/second, about 10,000 images/second, about 50,000 images/second, about 100,000 images/second, about 500,000 images/second, about 1,000,000 images/second, about 5,000,000 images/second, about 10,000,000 images/second, about 50,000,000 images/second, or about 100,000,000 images/second.
[0219] The method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 cells/second to about 100,000,000 cells/second.
The rate of image data processing can be at least about 1,000 cells/second.
The rate of image data processing can be at most about 100,000,000 cells/second.
The rate of image data processing can be about 1,000 cells/second to about 5,000 cells/second, about 1,000 cells/second to about 10,000 cells/second, about 1,000 cells/second to about 50,000 cells/second, about 1,000 cells/second to about 100,000 cells/second, about 1,000 cells/second to about 500,000 cells/second, about 1,000 cells/second to about 1,000,000 cells/second, about 1,000 cells/second to about 5,000,000 cells/second, about 1,000 cells/second to about 10,000,000 cells/second, about 1,000 cells/second to about 50,000,000 cells/second, about 1,000 cells/second to about 100,000,000 cells/second, about 5,000 cells/second to about 10,000 cells/second, about 5,000 cells/second to about 50,000 cells/second, about 5,000 cells/second to about 100,000 cells/second, about 5,000 cells/second to about 500,000 cells/second, about 5,000 cells/second to about 1,000,000 cells/second, about 5,000 cells/second to about 5,000,000 cells/second, about 5,000 cells/second to about 10,000,000 cells/second, about 5,000 cells/second to about 50,000,000 cells/second, about 5,000 cells/second to about 100,000,000 cells/second, about 10,000 cells/second to about 50,000 cells/second, about 10,000 cells/second to about 100,000 cells/second, about 10,000 cells/second to about 500,000 cells/second, about 10,000 cells/second to about 1,000,000 cells/second, about 10,000 cells/second to about 5,000,000 cells/second, about 10,000 cells/second to about 10,000,000 cells/second, about 10,000 cells/second to about 50,000,000 cells/second, about 10,000 cells/second to about 100,000,000 cells/second, about 50,000 cells/second to about 100,000 cells/second, about 50,000 cells/second to about 500,000 cells/second, about 50,000 cells/second to about 1,000,000 cells/second, about 50,000 cells/second to about 5,000,000 cells/second, about 50,000 cells/second to about 10,000,000 cells/second, about 50,000 cells/second to about 50,000,000 cells/second, about 50,000 cells/second to about 100,000,000 cells/second, about 100,000 cells/second to about 500,000 cells/second, about 100,000 cells/second to about 1,000,000 cells/second, about 100,000 cells/second to about 5,000,000 cells/second, about 100,000 cells/second to about 10,000,000 cells/second, about 100,000 cells/second to about 50,000,000 cells/second, about 100,000 cells/second to about 100,000,000 cells/second, about 500,000 cells/second to about 1,000,000 cells/second, about 500,000 cells/second to about 5,000,000 cells/second, about 500,000 cells/second to about 10,000,000 cells/second, about 500,000 cells/second to about 50,000,000 cells/second, about 500,000 cells/second to about 100,000,000 cells/second, about
1,000,000 cells/second to about 5,000,000 cells/second, about 1,000,000 cells/second to about 10,000,000 cells/second, about 1,000,000 cells/second to about 50,000,000 cells/second, about 1,000,000 cells/second to about 100,000,000 cells/second, about 5,000,000 cells/second to about 10,000,000 cells/second, about 5,000,000 cells/second to about 50,000,000 cells/second, about 5,000,000 cells/second to about 100,000,000 cells/second, about 10,000,000 cells/second to about 50,000,000 cells/second, about 10,000,000 cells/second to about 100,000,000 cells/second, or about 50,000,000 cells/second to about 100,000,000 cells/second. The rate of image data processing can be about 1,000 cells/second, about 5,000 cells/second, about 10,000 cells/second, about 50,000 cells/second, about 100,000 cells/second, about 500,000 cells/second, about 1,000,000 cells/second, about 5,000,000 cells/second, about 10,000,000 cells/second, about 50,000,000 cells/second, or about 100,000,000 cells/second.
[0220] The method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 datapoints/second to about 100,000,000 datapoints/second. The rate of image data processing can be at least about 1,000 datapoints/second.
The rate of image data processing can be at most about 100,000,000 datapoints/second. The rate of image data processing can be about 1,000 datapoints/second to about 5,000 datapoints/second, about 1,000 datapoints/second to about 10,000 datapoints/second, about 1,000 datapoints/second to about 50,000 datapoints/second, about 1,000 datapoints/second to about 100,000 datapoints/second, about 1,000 datapoints/second to about 500,000 datapoints/second, about 1,000 datapoints/second to about 1,000,000 datapoints/second, about 1,000 datapoints/second to about 5,000,000 datapoints/second, about 1,000 datapoints/second to about 10,000,000 datapoints/second, about 1,000 datapoints/second to about 50,000,000 datapoints/second, about 1,000 datapoints/second to about 100,000,000 datapoints/second, about 5,000 datapoints/second to about 10,000 datapoints/second, about 5,000 datapoints/second to about 50,000 datapoints/second, about 5,000 datapoints/second to about 100,000 datapoints/second, about 5,000 datapoints/second to about 500,000 datapoints/second, about 5,000 datapoints/second to about 1,000,000 datapoints/second, about 5,000 datapoints/second to about 5,000,000 datapoints/second, about 5,000 datapoints/second to about 10,000,000 datapoints/second, about 5,000 datapoints/second to about 50,000,000 datapoints/second, about 5,000 datapoints/second to about 100,000,000 datapoints/second, about 10,000 datapoints/second to about 50,000 datapoints/second, about 10,000 datapoints/second to about 100,000 datapoints/second, about 10,000 datapoints/second to about 500,000 datapoints/second, about 10,000 datapoints/second to about 1,000,000 datapoints/second, about 10,000 datapoints/second to about 5,000,000 datapoints/second, about 10,000 datapoints/second to about 10,000,000 datapoints/second, about 10,000 datapoints/second to about 50,000,000 datapoints/second, about 10,000 datapoints/second to about 100,000,000 datapoints/second, about 50,000 datapoints/second to about 100,000 datapoints/second, about 50,000 datapoints/second to about 500,000 datapoints/second, about 50,000 datapoints/second to about 1,000,000 datapoints/second, about 50,000 datapoints/second to about 5,000,000 datapoints/second, about 50,000 datapoints/second to about 10,000,000 datapoints/second, about 50,000 datapoints/second to about 50,000,000 datapoints/second, about 50,000 datapoints/second to about 100,000,000 datapoints/second, about 100,000 datapoints/second to about 500,000 datapoints/second, about 100,000 datapoints/second to about 1,000,000 datapoints/second, about 100,000 datapoints/second to about 5,000,000 datapoints/second, about 100,000 datapoints/second to about 10,000,000 datapoints/second, about 100,000 datapoints/second to about 50,000,000 datapoints/second, about 100,000 datapoints/second to about 100,000,000 datapoints/second, about 500,000 datapoints/second to about 1,000,000 datapoints/second, about 500,000 datapoints/second to about 5,000,000 datapoints/second, about 500,000 datapoints/second to about 10,000,000 datapoints/second, about 500,000 datapoints/second to about 50,000,000 datapoints/second, about 500,000 datapoints/second to about 100,000,000 datapoints/second, about 1,000,000 datapoints/second to about 5,000,000 datapoints/second, about 1,000,000 datapoints/second to about 10,000,000 datapoints/second, about 1,000,000 datapoints/second to about 50,000,000 datapoints/second, about 1,000,000 datapoints/second to about 100,000,000 datapoints/second, about 5,000,000 datapoints/second to about 10,000,000 datapoints/second, about 5,000,000 datapoints/second to about 50,000,000 datapoints/second, about 5,000,000 datapoints/second to about 100,000,000 datapoints/second, about 10,000,000 datapoints/second to about 50,000,000 datapoints/second, about 10,000,000 datapoints/second to about 100,000,000 datapoints/second, or about 50,000,000 datapoints/second to about 100,000,000 datapoints/second. The rate of image data processing can be about 1,000 datapoints/second, about 5,000 datapoints/second, about 10,000 datapoints/second, about 50,000 datapoints/second, about 100,000 datapoints/second, about 500,000 datapoints/second, about 1,000,000 datapoints/second, about 5,000,000 datapoints/second, about 10,000,000 datapoints/second, about 50,000,000 datapoints/second, or about 100,000,000 datapoints/second.
[0221] Any of the methods or platforms disclosed herein can be operatively coupled to an online crowdsourcing platform. The online crowdsourcing platform can comprise any of the databases disclosed herein. For example, the database can store a plurality of single cell images that are grouped into morphologically-distinct clusters corresponding to a plurality of cell classes (e.g., predetermined cell types or states). The online crowdsourcing platform can comprise one or more models or classifiers as disclosed herein (e.g., a modeling library comprising one or more machine learning models/classifiers as disclosed herein). The online crowdsourcing platform can comprise a web portal for a community of users to share contents, e.g., (1) upload, download, search, curate, annotate, or edit one or more existing images or new images into the database, (2) train or validate the one or more model(s)/classifier(s) using datasets from the database, and/or (3) upload new models into the modeling library. In some examples, the online crowdsourcing platform can allow users to buy, sell, share, or exchange the model(s)/classifier(s) with one another.
[0222] In some examples, the web portal can be configured to generate incentives for the users to update the database with new annotated cell images, model(s), and/or classifier(s). Incentives can be monetary. Incentives can be additional access to the global CMA, model(s), and/or classifier(s). In some examples, the web portal can be configured to generate incentives for the users to download, use, and review (e.g., rate or leave comments) any of the annotated cell images, model(s), and/or classifier(s) from, e.g., other users.
[0223] In some examples, a global cell morphology atlas (global CMA) can be generated by collecting (i) annotated cell images, (ii) cell morphology maps or ontologies, (iii) models, and/or (iv) classifiers from the users using the web portal. The global CMA can then be shared with the users via the web portal. All users can have access to the global CMA. In another example, specifically defined users can have access to specifically defined portions of the global CMA. For example, cancer centers can have access to the "cancer cells" portion of the global CMA, e.g., using a subscription-based service.
In a similar fashion, global models or classifiers can be generated based on the annotated cell images, model(s), and/or classifiers that are collected from the users using the web portal.
Microfluidic Systems and Methods Thereof
[0224] FIG. 8A shows a schematic illustration of the cell sorting system, as disclosed herein, with a cartridge design (e.g., a microfluidic design), with further details illustrated in FIG. 8B. The cell sorting system can be operatively coupled to a machine learning or artificial intelligence controller.
Such ML/AI controller can be configured to perform any of the methods disclosed herein. Such ML/AI controller can be operatively coupled to any of the platforms disclosed herein. In operation, a sample 802 is prepared and injected by a pump 804 (e.g., a syringe pump) into a cartridge 805, or flow-through device. In some examples, the cartridge 805 is a microfluidic device. Although FIG. 8A illustrates a classification and/or sorting system utilizing a syringe pump, any of a number of perfusion systems can be used such as (but not limited to) gravity feeds, peristalsis, or any of a number of pressure systems. In some examples, the sample is prepared by fixation and staining. In some examples, the sample comprises live cells. As can readily be appreciated, the specific manner in which the sample is prepared is largely dependent upon the requirements of a specific application.
[0225] Examples of the pump or other suitable flow unit may be, but are not limited to, a syringe
pump, a vacuum pump, an actuator (e.g., linear, pneumatic, hydraulic, etc.), a compressor, or any other suitable device to exert pressure (positive, negative, alternating thereof, etc.) to a fluid that may or may not comprise one or more particles (e.g., one or more cells to be classified, sorted, and/or analyzed). The pump or other suitable flow unit may be configured to raise, compress, move, and/or transfer fluid into or away from the microfluidic channel. In some examples, the pump or other suitable flow unit may be configured to deliver positive pressure, alternating positive pressure and vacuum pressure, negative pressure, alternating negative pressure and vacuum pressure, and/or only vacuum pressure. The cartridge of the present disclosure may comprise (or otherwise be in operable communication with) at least about 1, e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more, pumps or other flow units. The flow cell may comprise at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 pumps or other suitable flow units.
[0226] Each pump or other suitable flow unit can be in fluid communication with at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more sources of fluid. Each flow unit may be in fluid communication with at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 source of fluid.
The fluid may contain the particles (e.g., cells). In another example, the fluid may be particle-free. The pump or other suitable flow unit may be configured to maintain, increase, and/or decrease a flow velocity of the fluid within the microfluidic channel of the flow unit. Thus, the pump or other suitable flow unit may be configured to maintain, increase, and/or decrease a flow velocity (e.g., downstream of the microfluidic channel) of the particles. The pump or other suitable flow unit may be configured to accelerate or decelerate a flow velocity of the fluid within the microfluidic channel of the flow unit, thereby accelerating or decelerating a flow velocity of the particles.
[0227] The fluid can be liquid or gas (e.g., air, argon, nitrogen, etc.). The liquid can be an aqueous solution (e.g., water, buffer, saline, etc.). In another example, the liquid can be oil. In some examples, only one or more aqueous solutions can be directed through the microfluidic channels. In another example, only one or more oils can be directed through the microfluidic channels. In another alternative, both aqueous solution(s) and oil(s) can be directed through the microfluidic channels. In some examples, (i) the aqueous solution may form droplets (e.g., emulsions containing the particles) that are suspended in the oil, or (ii) the oil may form droplets (e.g., emulsions containing the particles) that are suspended in the aqueous solution. As can readily be appreciated, any perfusion system, including but not limited to peristalsis systems and gravity feeds, appropriate to a given classification and/or sorting system can be utilized.
[0228] As noted above, the cartridge 805 can be implemented as a fluidic device that focuses cells from the sample into a single streamline that is imaged continuously. In the illustrated example, the cell line is illuminated by a light source 806 (e.g., a lamp, such as an arc lamp) and an optical system 810 that directs light onto an imaging region 838 of the cartridge 805. An objective lens system 812 magnifies the cells by directing light toward the sensor of a high-speed camera system 814.
[0229] In some examples, a 10x, 20x, 40x, 60x, 80x, 100x, or 200x objective is used to magnify the cells. In some examples, a 10x objective is used to magnify the cells. In some examples, a 20x objective is used to magnify the cells. In some examples, a 40x objective is used to magnify the cells.
In some examples, a 60x objective is used to magnify the cells. In some examples, an 80x objective is used to magnify the cells. In some examples, a 100x objective is used to magnify the cells. In some examples, a 200x objective is used to magnify the cells. In some examples, a 10x to a 200x objective is used to magnify the cells, for example a 10x-20x, a 10x-40x, a 10x-60x, a 10x-80x, or a 10x-100x objective is used to magnify the cells. As can readily be appreciated by a person having ordinary skill in the art, the specific magnification utilized can vary greatly and is largely dependent upon the requirements of a given imaging system and cell types of interest.
[0230] In some examples, one or more imaging devices can be used to capture images of the cell.
In some examples, the imaging device is a high-speed camera. In some examples, the imaging device is a high-speed camera with a micro-second exposure time. In some instances, the exposure time is about 1 millisecond. In some instances, the exposure time is between about 1 millisecond (ms) and about 0.75 millisecond. In some instances, the exposure time is between about 1 ms and about 0.50 ms. In some instances, the exposure time is between about 1 ms and about 0.25 ms. In some instances, the exposure time is between about 0.75 ms and about 0.50 ms. In some instances, the exposure time is between about 0.75 ms and about 0.25 ms. In some instances, the exposure time is between about 0.50 ms and about 0.25 ms. In some instances, the exposure time is between about 0.25 ms and about 0.1 ms. In some instances, the exposure time is between about 0.1 ms and about 0.01 ms. In some instances, the exposure time is between about 0.1 ms and about 0.001 ms. In some instances, the exposure time is between about 0.1 ms and about 1 microsecond (µs). In some examples, the exposure time is between about 1 µs and about 0.1 µs. In some examples, the exposure time is between about 1 µs and about 0.01 µs. In some examples, the exposure time is between about 0.1 µs and about 0.01 µs.
In some examples, the exposure time is between about 1 µs and about 0.001 µs. In some examples, the exposure time is between about 0.1 µs and about 0.001 µs. In some examples, the exposure time is between about 0.01 µs and about 0.001 µs.
[0231] In some examples, the cartridge 805 may comprise at least about 1 — e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more, imaging devices (e.g., the high-speed camera system 814) on or adjacent to the imaging region 838. In some examples, the cartridge may comprise at most about 10 — e.g., at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 imaging device on or adjacent to the imaging region 838. In some examples, the cartridge 805 may comprise a plurality of imaging devices. Each of the plurality of imaging devices may use light from a same light source. In another example, each of the plurality of imaging devices may use light from different light sources. The plurality of imaging devices can be configured in parallel and/or in series with respect to one another. The plurality of imaging devices can be configured on one or more sides (e.g., two adjacent sides or two opposite sides) of the cartridge 805. The plurality of imaging devices can be configured to view the imaging region 838 along a same axis or different axes with respect to (i) a length of the cartridge 805 (e.g., a length of a straight channel of the cartridge 805) or (ii) a direction of migration of one or more particles (e.g., one or more cells) in the cartridge 805.
[0232] One or more imaging devices of the present disclosure can be stationary while imaging one or more cells, e.g., at the imaging region 838. In another example, one or more imaging devices may move with respect to the flow channel (e.g., along the length of the flow channel, towards and/or away from the flow channel, tangentially about the circumference of the flow channel, etc.) while imaging the one or more cells. In some examples, the one or more imaging devices can be operatively coupled to one or more actuators, such as, for example, a stepper actuator, linear actuator, hydraulic actuator, pneumatic actuator, electric actuator, magnetic actuator, and mechanical actuator (e.g., rack and pinion, chains, etc.).
[0233] In some examples, the cartridge 805 may comprise at least about 1 — e.g., at least about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, or more, imaging regions (e.g., the imaging region 838). In some examples, the cartridge 805 may comprise at most about 10 — e.g., at most about 9, about 8, about 7, about 6, about 5, about 4, about 3, about 2, or about 1 imaging region. In some examples, the cartridge 805 may comprise a plurality of imaging regions, and the plurality of imaging regions can be configured in parallel and/or in series with respect to one another.
The plurality of imaging regions may or may not be in fluid communication with each other. In an example, a first imaging region and a second imaging region can be configured in parallel, such that a first fluid that passes through the first imaging region does not pass through a second imaging region.
In another example, a first imaging region and a second imaging region can be configured in series, such that a first fluid that passes through the first imaging region also passes through the second imaging region.
[0234] The imaging device(s) (e.g., the high-speed camera) of the imaging system can comprise an electromagnetic radiation sensor (e.g., IR sensor, color sensor, etc.) that detects at least a portion of the electromagnetic radiation that is reflected by and/or transmitted from the cartridge or any content (e.g., the cell) in the cartridge. The imaging device can be in operative communication with one or more sources (e.g., at least about 1, about 2, about 3, about 4, about 5, or more) of the electromagnetic radiation. The electromagnetic radiation can comprise one or more wavelengths from the electromagnetic spectrum including, but not limited to, x-rays (about 0.1 nanometers (nm) to about 10.0 nm; or about 10^16 Hertz (Hz) to about 10^19 Hz), ultraviolet (UV) rays (about 10.0 nm to about 380 nm; or about 8x10^14 Hz to about 10^16 Hz), visible light (about 380 nm to about 750 nm; or about 4x10^14 Hz to about 8x10^14 Hz), infrared (IR) light (about 750 nm to about 0.1 centimeters (cm); or about 5x10^11 Hz to about 4x10^14 Hz), and microwaves (about 0.1 cm to about 100 cm; or about 10^8 Hz to about 5x10^11 Hz). In some examples, the source(s) of the electromagnetic radiation can be ambient light, and thus the cell sorting system may not have an additional source of the electromagnetic radiation.
[0235] The imaging device(s) can be configured to take a two-dimensional image (e.g., one or more pixels) of the cell and/or a three-dimensional image (e.g., one or more voxels) of the cell.
[0236] As can readily be appreciated, the exposure times can differ across different systems and can largely be dependent upon the requirements of a given application or the limitations of a given system such as but not limited to flow rates. Images are acquired and can be analyzed using an image analysis algorithm.
[0237] In some examples, the images are acquired and analyzed post-capture. In some examples, the images are acquired and analyzed in real-time continuously. Using object tracking software, single cells can be detected and tracked while in the field of view of the camera.
[0238] Background subtraction can then be performed. In a number of examples, the cartridge 805 causes the cells to rotate as they are imaged, and multiple images of each cell are provided to a computing system 816 for analysis. In some examples, the multiple images comprise images from a plurality of cell angles.
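A hedged sketch of the background subtraction and single-cell detection step, using OpenCV, is given below; the choice of the MOG2 background subtractor, the blur kernel, and the size gate are illustrative assumptions rather than disclosed parameters.

```python
# Illustrative background subtraction and cell detection on a frame stream.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16,
                                                detectShadows=False)

def detect_cells(frame):
    """Return bounding boxes of candidate cells in one grayscale frame."""
    fg = subtractor.apply(frame)                      # foreground mask (moving objects)
    fg = cv2.medianBlur(fg, 5)                        # suppress speckle noise
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if 5 <= w <= 200 and 5 <= h <= 200:           # crude size gate for cells
            boxes.append((x, y, w, h))
    return boxes
```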
[0239] The flow rate and channel dimensions can be determined to obtain multiple images of the same cell from a plurality of different angles (i.e., a plurality of cell angles). A degree of rotation from one angle to the next can be uniform or non-uniform. In some examples, a full 360° view of the cell is captured. In some examples, 4 images are provided in which the cell rotates 90° between successive frames. In some examples, 8 images are provided in which the cell rotates 45° between successive frames. In some examples, 24 images are provided in which the cell rotates 15° between successive frames. In some examples, at least three or more images are provided in which the cell rotates at a first angle between a first frame and a second frame, and the cell rotates at a second angle between the second frame and a third frame, wherein the first and second angles are different. In some examples, less than the full 360° view of the cell can be captured, and a resulting plurality of images of the same cell can be sufficient to classify the cell (e.g., determine a specific type of the cell).
[0240] The cell can have a plurality of sides. The plurality of sides of the cell can be defined with respect to a direction of the transport (flow) of the cell through the channel. In some examples, the cell can comprise a top side, a bottom side that is opposite the top side, a front side (e.g., the side towards the direction of the flow of the cell), a rear side opposite the front side, a left side, and/or a right side opposite the left side. In some examples, the image of the cell can comprise a plurality of images captured from the plurality of angles, wherein the plurality of images comprise: (1) an image captured from the top side of the cell, (2) an image captured from the bottom side of the cell, (3) an image captured from the front side of the cell, (4) an image captured from the rear side of the cell, (5) an image captured from the left side of the cell, and/or (6) an image captured from the right side of the cell.
[0241] In some examples, a two-dimensional "hologram" of a cell can be generated by superimposing the multiple images of the individual cell. The "hologram" can be analyzed to automatically classify characteristics of the cell based upon features including but not limited to the morphological features of the cell.
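As a minimal illustration of superimposing multiple rotation frames of one cell into a single two-dimensional composite, the sketch below center-pads each crop onto a common canvas and averages them; the simple centering step stands in for whatever registration the disclosed method would use and is an assumption made only for illustration.

```python
# Illustrative superposition of per-angle crops of the same cell.
import numpy as np

def superimpose(crops):
    """Average center-padded 2D crops of one cell taken at different angles."""
    size = max(max(c.shape[0], c.shape[1]) for c in crops)
    acc = np.zeros((size, size), dtype=np.float64)
    for c in crops:
        canvas = np.zeros((size, size), dtype=np.float64)
        y0 = (size - c.shape[0]) // 2      # center the crop on the canvas
        x0 = (size - c.shape[1]) // 2
        canvas[y0:y0 + c.shape[0], x0:x0 + c.shape[1]] = c
        acc += canvas
    return (acc / len(crops)).astype(np.float32)
```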
[0242] In some examples, at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 images are captured for each cell. In some examples, about 5 or more images are captured for each cell. In some examples, from about 5 to about 10 images are captured for each cell. In some examples, 10 or more images are captured for each cell. In some examples, from about 10 to about 20 images are captured for each cell. In some examples, about 20 or more images are captured for each cell. In some examples, from about 20 to about 50 images are captured for each cell. In some examples, about 50 or more images are captured for each cell. In some examples, from about 50 to about 100 images are captured for each cell. In some examples, at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 15, at least about 20, at least about 30, at least about 40, at least about 50, or more images may be captured for each cell at a plurality of different angles. In some examples, at most about 50, at most about 40, at most about 30, at most about 20, at most about 15, at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 images can be captured for each cell at a plurality of different angles.
[0243] In some examples, the imaging device is moved so as to capture multiple images of the cell from a plurality of angles. In some examples, the images are captured at an angle between 0 and 90 degrees to the horizontal axis. In some examples, the images are captured at an angle between 90 and 180 degrees to the horizontal axis. In some examples, the images are captured at an angle between 180 and 270 degrees to the horizontal axis. In some examples, the images are captured at an angle between 270 and 360 degrees to the horizontal axis. In some examples, multiple imaging devices (e.g., multiple cameras) are used, wherein each device captures an image of the cell from a specific cell angle. In some examples, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cameras are used. In some examples, more than about 10 cameras are used, wherein each camera images the cell from a specific cell angle.
[0244] As can readily be appreciated, the number of images that are captured is dependent upon the requirements of a given application or the limitations of a given system. In several examples, the cartridge has different regions to focus, order, and/or rotate cells. Although the focusing regions, ordering regions, and cell rotating regions are discussed as affecting the sample in a specific sequence, a person having ordinary skill in the art would appreciate that the various regions can be arranged differently, where the focusing, ordering, and/or rotating of the cells in the sample can be performed in any order. Regions within a microfluidic device implemented in accordance with an example of the disclosure are illustrated in FIG. 8B. Cartridge 805 may include a filtration region 830 to prevent channel clogging by aggregates/debris or dust particles. Cells pass through a focusing region 832 that focuses the cells into a single streamline of cells that are then spaced by an ordering region 834. In some examples, the focusing region utilizes “inertial focusing” to form the single streamline of cells.
In some examples, the focusing region utilizes “hydrodynamic focusing” to focus the cells into the single streamline of cells. Optionally, prior to imaging, rotation can be imparted upon the cells by a rotation region 836. The optionally spinning cells can then pass through an imaging region 838 in which the cells are illuminated for imaging prior to exiting the cartridge. These various regions are described and discussed in further detail below. In some examples, the rotation region 836 may precede the imaging region 838. In some examples, the rotation region 836 can be a part (e.g., a beginning portion, a middle portion, and/or an end portion with respect to a migration of a cell within the cartridge) of the imaging region 838. In some examples, the imaging region 838 can be a part of the rotation region 836.
[0245] In some examples, a single cell is imaged in a field of view of the imaging device, e.g., a camera.
In some examples, multiple cells are imaged in the same field of view of the imaging device. In some examples, at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cells are imaged in the same field of view of the imaging device. In some examples, up to about 100 cells are imaged in the same field of view of the imaging device. In some instances, about 10 to about 100 cells are imaged in the field of view, for example, about 10 to 20 cells, about 10 to about 30 cells, about 10 to about 40 cells, about 10 to about 50 cells, about 10 to about 60 cells, about 10 to about 80 cells, about 10 to about 90 cells, about 20 to about 30 cells, about 20 to about 40 cells, about 20 to about 50 cells, about 20 to about 60 cells, about 20 to about 70 cells, about 20 to about 80 cells, about 20 to about 90 cells, about 30 to about 40 cells, about 40 to about 50 cells, about 40 to about 60 cells, about 40 to about 70 cells, about 40 to about 80 cells, about 40 to about 90 cells, about 50 to about 60 cells, about 50 to about 70 cells, about 50 to about 80 cells, about 50 to about 90 cells, about 60 to about 70 cells, about 60 to about 80 cells, about 60 to about 90 cells, about 70 to about 80 cells, about 70 to about 90 cells, or about 90 to about 100 cells are imaged in the same field of view of the imaging device.
[0246] In some examples, only a single cell can be allowed to be transported across a cross-section of the flow channel perpendicular to the axis of the flow channel. In some examples, a plurality of cells (e.g., at least about 2, about 3, about 4, about 5, or more cells; at most about 5, about 4, about 3, about 2, or about 1 cell) can be allowed to be transported simultaneously across the cross-section of the flow channel perpendicular to the axis of the flow channel. In such an example, the imaging device (or the processor operatively linked to the imaging device) can be configured to track each of the plurality of cells as they are transported along the flow channel.
[0247] The imaging system can include, among other things, a camera, an objective lens system and a light source. In a number of examples, cartridges similar to those described above can be fabricated using standard 2D microfluidic fabrication techniques, requiring minimal fabrication time and cost.
[0248] Although specific classification and/or sorting systems, cartridges, and microfluidic devices are described above with respect to FIGS. 8A-8B, classification and/or sorting systems can be implemented in any of a variety of ways appropriate to the requirements of specific applications in accordance with various examples of the disclosure. Specific elements of microfluidic devices that can be utilized in classification and/or sorting systems in accordance with some examples of the disclosure are discussed further below.
[0249] In some examples, the microfluidic system can comprise a microfluidic chip (e.g., comprising one or more microfluidic channels for flowing cells) operatively coupled to an imaging device (e.g., one or more cameras). A microfluidic device can comprise the imaging device, and the chip can be inserted into the device, to align the imaging device to an imaging region of a channel of the chip. To align the chip to the precise location for the imaging, the chip can comprise one or more positioning identifiers (e.g., pattern(s), such as numbers, letters, symbols, or other drawings) that can be imaged to determine the positioning of the chip (and thus the imaging region of the channel of the chip) relative to the device as a whole or relative to the imaging device. For image-based alignment (e.g., auto-alignment) of the chip within the device, one or more images of the chip can be captured upon its coupling to the device, and the image(s) can be analyzed by any of the methods disclosed herein (e.g., using any model or classifier disclosed herein) to determine a degree or score of chip alignment. The positioning identifier(s) can be a "guide" to navigate the stage holding the chip towards a correct position within the device relative to the imaging unit.
[0250] In some examples, rule-based image processing can be used to navigate the stage to a precise range of location or a precise location relative to the image unit.
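One hedged example of such rule-based navigation is template matching against the positioning identifier: the sketch below locates the identifier in an image of the chip and converts its offset from an expected position into a stage correction; the function name `stage_correction`, the similarity metric, and the micrometers-per-pixel conversion are assumptions for illustration, not the disclosed implementation.

```python
# Illustrative template-matching alignment against a chip positioning identifier.
import cv2

def stage_correction(chip_image, identifier_template, expected_xy, um_per_pixel):
    """Return (dx_um, dy_um, score): stage move toward the expected position."""
    result = cv2.matchTemplate(chip_image, identifier_template,
                               cv2.TM_CCOEFF_NORMED)
    _, score, _, top_left = cv2.minMaxLoc(result)     # best match location
    found_x = top_left[0] + identifier_template.shape[1] // 2
    found_y = top_left[1] + identifier_template.shape[0] // 2
    dx_um = (expected_xy[0] - found_x) * um_per_pixel
    dy_um = (expected_xy[1] - found_y) * um_per_pixel
    return dx_um, dy_um, score   # a low score may indicate gross misalignment
```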
[0251] In some examples, machine learning/artificial intelligence methods as disclosed herein can be modified or trained to identify the pattern on the chip and navigate the stage to the precise imaging location for the image unit, to increase resilience.
[0252] In some examples, machine learning/artificial intelligence methods as disclosed herein can be modified or trained to implement reinforcement learning based alignment and focusing. The alignment process for the chip to the instrument or the image unit can involve moving the stage holding the chip in, e.g., either the X or Y axis and/or moving the imaging plane on the Z axis. In the training process, (i) the chip can start at an X, Y, and Z position (e.g., randomly selected), (ii) based on one or more image(s) of the chip and/or the stage holding the chip, a model can determine a movement vector for the stage and a movement for the imaging plane, (iii) depending on whether such movement vector may take the chip closer to the optimum X, Y, and Z position relative to the image unit, an error term can be determined as a loss for the model, and (iv) the magnitude of the error can be either constant or be proportional to how far the current X, Y, and Z position is from an optimal X, Y, and Z position (e.g., which can be predetermined). Such a trained model can be used to determine, for example, the movement vector and/or the movement of the imaging plane, to enhance relative alignment between the chip and the image unit (e.g., one or more sensors). The alignment can occur subsequent to capturing of the image(s). In another example, the alignment can occur in real time while capturing images/videos of the positioning identifier(s) of the chip.
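A self-contained, toy version of the training loop described above is sketched below; the linear feature encoding, learning rate, and normalized coordinate ranges are assumptions made only so that the sketch runs end to end, and a real system would use chip images and a learned model in place of these stand-ins.

```python
# Toy sketch: a linear "model" maps image-derived features to a movement vector,
# with a loss proportional to the remaining distance from the optimal position.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for features extracted from an image of the chip/stage at a position.
A = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.0, 0.1],
              [0.1, 0.0, 1.0],
              [0.3, 0.3, 0.3]])

def image_features(pos):
    # Append a constant term so the model can learn an absolute target offset.
    return np.append(A @ pos + rng.normal(0.0, 0.01, size=4), 1.0)

optimal_xyz = np.array([0.10, -0.05, 0.02])   # optimum position (normalized stage units)
W = np.zeros((3, 5))                          # model: features -> stage/imaging-plane move
lr = 0.05

for episode in range(20000):
    pos = rng.uniform(-1.0, 1.0, size=3)      # (i) randomly selected starting X, Y, Z
    feats = image_features(pos)
    move = W @ feats                          # (ii) proposed movement vector
    error = (pos + move) - optimal_xyz        # (iii) remaining offset from the optimum
    # (iv) loss magnitude proportional to distance; gradient of 0.5*||error||^2
    # with respect to W is outer(error, feats)
    W -= lr * np.outer(error, feats)

test_pos = np.array([0.8, -0.6, 0.4])
print("residual distance:",
      np.linalg.norm(test_pos + W @ image_features(test_pos) - optimal_xyz))
```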
[0253] One or more flow channels of the cartridge of the present disclosure may have various shapes and sizes. For example, referring to FIGS. 8A-8B, at least a portion of the flow channel (e.g., the focusing region 832, the ordering region 834, the rotation region 836, the imaging region 838, connecting region therebetween, etc.) may have a cross-section that is circular, triangular, square, rectangular, pentagonal, hexagonal, or any partial shape or combination of shapes thereof.
[0254] In some examples, the system of the present disclosure comprises straight channels with rectangular or square cross-sections. In some examples, the system of the present disclosure comprises straight channels with round cross-sections. In some examples, the system comprises straight channels with half-ellipsoid cross-sections. In some examples, the system comprises spiral channels. In some examples, the system comprises round channels with rectangular cross-sections. In some examples, the system comprises round channels with round cross-sections. In some examples, the system comprises round channels with half-ellipsoid cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with rectangular cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with round cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with half-ellipsoid cross-sections.
[0255] The flow channel can comprise one or more walls that are formed to focus one or more cells into a streamline. The flow channel can comprise a focusing region comprising the wall(s) to focus the cell(s) into the streamline. Focusing regions on a microfluidic device can take a disorderly stream of cells and utilize a variety of forces (e.g., inertial lift forces (wall effect and shear gradient forces) or hydrodynamic forces) to focus the cells within the flow into a streamline of cells. In some examples, the cells are focused in a single streamline. In some examples, the cells are focused in multiple streamlines, for example at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 streamlines.
[0256] The focusing region receives a flow of randomly arranged cells using an upstream section.
The cells flow into a region of contracted and expanded sections in which the randomly arranged cells are focused into a single streamline of cells. The focusing can be driven by the action of inertial lift forces (wall effect and shear gradient forces) acting on cells.
[0257] In some examples, the focusing region is formed with curvilinear walls that form periodic patterns. In some examples, the patterns form a series of square expansions and contractions. In other examples, the patterns are sinusoidal. In further examples, the sinusoidal patterns are skewed to form an asymmetric pattern. The focusing region can be effective in focusing cells over a wide range of flow rates. In the illustrated example, an asymmetrical sinusoidal-like structure is used as opposed to square expansions and contractions. This helps prevent the formation of secondary vortices and secondary flows behind the particle flow stream. In this way, the illustrated structure allows for faster and more accurate focusing of cells to a single lateral equilibrium position. Spiral and curved channels can also be used in an inertial regime; however, these can complicate the integration with other modules. Finally, straight channels where channel width is greater than channel height can also be used for focusing cells onto a single lateral position. However, in this case, since there will be more than one equilibrium position in the z-plane, imaging can become problematic, as the imaging focal plane is preferably fixed. As can readily be appreciated, any of a variety of structures that provide a cross section that expands and contracts along the length of the microfluidic channel or are capable of focusing the cells can be utilized as appropriate to the requirements of specific applications.
[0258] The cell sorting system can be configured to focus the cell at a width and/or a height within the flow channel along an axis of the flow channel. The cell can be focused to a center or off the center of the cross-section of the flow channel. The cell can be focused to a side (e.g., a wall) of the cross- section of the flow channel. A focused position of the cell within the cross-section of the channel can be uniform or non-uniform as the cell is transported through the channel.
[0259] While specific implementations of focusing regions within microfluidic channels are described above, any of a variety of channel configurations that focus cells into a single streamline can be utilized as appropriate to the requirements of a specific application in accordance with various examples of the disclosure.
[0260] Microfluidic channels can be designed to impose ordering upon a single streamline of cells formed by a focusing region in accordance with several examples of the disclosure. Microfluidic channels in accordance with some examples of the disclosure include an ordering region having pinching regions and curved channels. The ordering region orders the cells and distances single cells from each other to facilitate imaging. In some examples, ordering is achieved by forming the microfluidic channel to apply inertial lift forces and Dean drag forces on the cells.
[0261] Different geometries, orders, and/or combinations can be used. In some examples, pinching regions can be placed downstream from the focusing channels without the use of curved channels.
Adding the curved channels can help with more rapid and controlled ordering, as well as increasing the likelihood that particles follow a single lateral position as they migrate downstream. As can readily be appreciated, the specific configuration of an ordering region is largely determined based upon the requirements of a given application.
[0262] Architecture of the microfluidic channels of the cartridge of the present disclosure can be controlled (e.g., modified, optimized, etc.) to modulate cell flow along the microfluidic channels.
Examples of the cell flow may include (i) cell focusing (e.g., into a single streamline) and (ii) rotation of the one or more cells as the cell(s) are migrating (e.g., within the single streamline) down the length of the microfluidic channels. In some examples, microfluidic channels can be configured to impart rotation on ordered cells in accordance with a number of examples of the disclosure. One or more cell rotation regions (e.g., the cell rotation region 836) of microfluidic channels in accordance with some examples of the disclosure use co-flow of a particle-free buffer to induce cell rotation by using the co-flow to apply differential velocity gradients across the cells. In some examples, a cell rotation region may introduce co-flow of at least about 1, about 2, about 3, about 4, about 5, or more buffers (e.g., particle-free, or containing one or more particles, such as polymeric or magnetic particles) to impart rotation on one or more cells within the channel. In some examples, a cell rotation region may introduce co-flow of at most about 5, about 4, about 3, about 2, or about 1 buffer to impart the rotation of one or more cells within the channel. In some examples, the plurality of buffers can be co-flown at a same position along the length of the cell rotation region, or sequentially at different positions along the length of the cell rotation region. In some examples, the plurality of buffers can be the same or different. In several examples, the cell rotation region of the microfluidic channel is fabricated using a two-layer fabrication process so that the axis of rotation is perpendicular to the axis of cell downstream migration and parallel to cell lateral migration.
[0263] Cells can be imaged in at least a portion of the cell rotating region, while the cells are tumbling and/or rotating as they migrate downstream. In another example, the cells can be imaged in an imaging region that is adjacent to or downstream of the cell rotating region. In some examples, the cells can be flowing in a single streamline within a flow channel, and the cells can be imaged as the cells are rotating within the single streamline. A rotational speed of the cells can be constant or varied along the length of the imaging region. This may allow for the imaging of a cell at different angles (e.g., from a plurality of images of the cell taken from a plurality of angles due to rotation of the cell), which may provide more accurate information concerning cellular features than can be captured in a single image or a sequence of images of a cell that is not rotating to any significant extent. This also allows a 3D reconstruction of the cell using available software, since the angles of rotation across the images are known. In another example, every single image of the sequence of images may be analyzed individually to analyze (e.g., classify) the cell from each image. In some examples, results of the individual analysis of the sequence of images can be aggregated to determine a final decision (e.g., classification of the cell).
[0264] In some examples, a cell rotation region of a microfluidic channel incorporates an injected co-flow prior to an imaging region in accordance with an example of the disclosure. Co-flow can be introduced in the z plane (perpendicular to the imaging plane) to spin the cells. Since the imaging is done in the x-y plane, rotation of cells around an axis parallel to the y-axis provides additional information by rotating portions of the cell that may have been occluded in previous images into view in each subsequent image. Due to a change in channel dimensions, at point x0, a velocity gradient is applied across the cells, which can cause the cells to spin. The angular velocity of the cells depends on channel and cell dimensions and the ratio between Q1 (main channel flow rate) and Q2 (co-flow rate) and can be configured as appropriate to the requirements of a given application. In some examples, a cell rotation region incorporates an increase in one dimension of the microfluidic channel to initiate a change in the velocity gradient across a cell to impart rotation onto the cell. In some examples, a cell rotation region of a microfluidic channel incorporates an increase in the z-axis dimension of the cross section of the microfluidic channel prior to an imaging region in accordance with an example of the disclosure. The change in channel height can initiate a change in velocity gradient across the cell in the z axis of the microfluidic channel, which can cause the cells to rotate as with using co-flow.
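As a back-of-the-envelope illustration only (not the disclosed design equations), the sketch below shows how the Q1/Q2 flow-rate ratio and channel dimensions could be turned into a rough velocity gradient, and hence a rough cell rotation rate, using the approximation that a rigid sphere in a locally linear shear rotates at about half the shear rate; the geometry values and the shear estimate itself are simplifying assumptions.

```python
# Very rough order-of-magnitude estimate of cell rotation rate after a co-flow junction.
def estimated_rotation_rate(q1_ul_min, q2_ul_min, channel_width_um, channel_height_um):
    """Approximate angular velocity (rad/s); all steps are crude simplifications."""
    to_m3_s = 1e-9 / 60.0                          # uL/min -> m^3/s
    area_m2 = (channel_width_um * 1e-6) * (channel_height_um * 1e-6)
    v_main = (q1_ul_min * to_m3_s) / area_m2       # mean velocity from main flow alone
    v_total = ((q1_ul_min + q2_ul_min) * to_m3_s) / area_m2   # after the junction
    shear_rate = abs(v_total - v_main) / (channel_height_um * 1e-6)  # 1/s, crude gradient
    return 0.5 * shear_rate                        # sphere in simple shear: omega ~ shear/2

# Example with assumed values: 10 uL/min main flow, 5 uL/min co-flow, 60 um x 60 um channel
print(estimated_rotation_rate(10.0, 5.0, 60.0, 60.0))
```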
[0265] In some examples, the system and methods of the present disclosure focus the cells in microfluidic channels. The term focusing as used herein broadly means controlling the trajectory of cell/cells movement and comprises controlling the position and/or speed at which the cells travel within the microfluidic channels. In some examples, controlling the lateral position and/or the speed at which the particles travel inside the microfluidic channels allows the time of arrival of the cell at a bifurcation to be accurately predicted. The cells may then be accurately sorted. The parameters critical to the focusing of cells within the microfluidic channels include, but are not limited to, channel geometry, particle size, overall system throughput, sample concentration, imaging throughput, size of field of view, and method of sorting.
[0266] In some examples, the focusing is achieved using inertial forces. In some examples, the system and methods of the present disclosure focus cells to a certain height from the bottom of the channel using inertial focusing. In these examples, the distance of the cells from the objective is equal and images of all the cells will be clear. As such, cellular details, such as nuclear shape, structure, and size appear clearly in the outputted images with minimal blur. In some examples, the system disclosed herein has an imaging focusing plane that is adjustable. In some examples, the focusing plane is adjusted by moving the objective or the stage. In some examples, the best focusing plane is found by recording videos at different planes, and the plane wherein the imaged cells have the highest Fourier magnitude, and thus the highest level of detail and highest resolution, is the best plane.
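A minimal sketch of this Fourier-magnitude focus search is given below: each candidate z plane is scored by the mean non-DC Fourier magnitude of its frames, and the plane with the highest score is selected; the function names and the way frames are acquired are assumptions for illustration.

```python
# Illustrative Fourier-magnitude focus metric over candidate z planes.
import numpy as np

def focus_score(frame: np.ndarray) -> float:
    """Mean magnitude of the non-DC Fourier components of one frame."""
    f = np.fft.fftshift(np.fft.fft2(frame.astype(np.float64)))
    mag = np.abs(f)
    cy, cx = mag.shape[0] // 2, mag.shape[1] // 2
    mag[cy, cx] = 0.0                      # ignore the DC term (overall brightness)
    return float(mag.mean())

def best_focus_plane(planes, frames_at_plane):
    """planes: iterable of z positions; frames_at_plane(z) -> list of 2D frames."""
    scores = {z: np.mean([focus_score(f) for f in frames_at_plane(z)]) for z in planes}
    return max(scores, key=scores.get)     # plane with the most high-frequency detail
```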
[0267] In some examples, the systems and methods of the present disclosure utilize a hydrodynamic-based z focusing system to obtain a consistent z height for the cells of interest that are to be imaged.
In some examples, the design comprises hydrodynamic focusing using multiple inlets for main flow and side flow. In some examples, the hydrodynamic-based z focusing system is a triple-punch design.
In some examples, the design comprises hydrodynamic focusing with three inlets, wherein the two side flows pinch cells at the center. For certain channel designs, dual z focus points can be created, wherein a double-punch design similar to the triple-punch design can be used to send objects to one of the two focus points to get consistent focused images. In some examples, the design comprises hydrodynamic focusing with two inlets, wherein only one side flow channel is used and cells are focused near a channel wall. In some examples, the hydrodynamic focusing comprises side flows that do not contain any cells and a middle inlet that contains cells. The ratio of the flow rate on the side channel to the flow rate on the main channel determines the width of the cell focusing region. In some examples, the design is a combination of the above. In all examples, the design is integrable with the bifurcation and sorting mechanisms disclosed herein. In some examples, the hydrodynamic-based z focusing system is used in conjunction with inertia-based z focusing.
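The effect of the side-to-main flow-rate ratio on the width of the focused cell stream can be illustrated with a rough, first-order estimate. The proportional model below is an assumption made only for illustration (real widths also depend on channel geometry and fluid properties) and does not correspond to a particular design of the present disclosure.

```python
def focused_stream_width(channel_width_um: float,
                         q_sample: float,
                         q_side_total: float) -> float:
    """First-order estimate of the focused sample-stream width.

    Assumes the sample stream narrows roughly in proportion to its
    share of the total volumetric flow; treat the result as an
    order-of-magnitude guide only.
    """
    return channel_width_um * q_sample / (q_sample + q_side_total)

# Example: with side flows totaling 4x the sample flow, a 100 um wide
# channel pinches the cell stream to roughly 20 um.
print(focused_stream_width(100.0, q_sample=1.0, q_side_total=4.0))
```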
[0268] In some examples, the cell is a live cell. In some examples, the cell is a fixed cell (e.g., in methanol or paraformaldehyde). In some examples, one or more cells can be coupled (e.g., attached covalently or non-covalently) to a substrate (e.g., a polymeric bead or a magnetic bead) while flowing through the cartridge. In some examples, the cell(s) may not be coupled to any substrate while flowing through the cartridge.
[0269] A variety of techniques can be utilized to classify images of cells captured by classification and/or sorting systems in accordance with various examples of the disclosure. In some examples, the image captures are saved for future analysis/classification either manually or by image analysis software. Any suitable image analysis software can be used for image analysis. In some examples, image analysis is performed using OpenCV. In some examples, analysis and classification are performed in real time.
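Because OpenCV is named as one possible tool, the following minimal sketch shows one way a single cell could be detected and cropped from a bright-field frame with OpenCV. The Otsu threshold, the fixed crop size, and the assumption of an 8-bit grayscale frame are illustrative choices, not the object detection module of the platform.

```python
import cv2
import numpy as np

def crop_largest_object(frame: np.ndarray, crop_size: int = 128):
    """Detect the largest dark object in an 8-bit grayscale frame and crop around it."""
    blur = cv2.GaussianBlur(frame, (5, 5), 0)
    # Otsu threshold; cells appear darker than the bright background.
    _, mask = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(largest)
    cx, cy = x + w // 2, y + h // 2
    half = crop_size // 2
    return frame[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
```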
[0270] In some examples, the system and methods of the present disclosure comprise collecting a plurality of images of objects in the flow. In some examples, the plurality of images comprises at least 20 images of cells. In some examples, the plurality of images comprises at least about 19, about 18,
about 17, about 16, about 15, about 14, about 13, about 12, about 11, about 10, about 9, about 8, about 7, about 6, about 5, about 4, about 3, or about 2 images of cells. In some examples, the plurality of images comprises images from multiple cell angles. In some examples, the plurality of images, comprising images from multiple cell angles, helps derive extra features from the particle which would be hidden if the particle is imaged from a single point-of-view. In some examples, without wishing to be bound by any particular theory, the plurality of images, comprising images from multiple cell angles, helps derive extra features from the particle which would be hidden if a plurality of images are combined into a multi-dimensional reconstruction (e.g., a two-dimensional hologram or a three-dimensional reconstruction).
[0271] In some examples, the systems and methods of the present disclosure allow for a tracking ability, wherein the system and methods track a particle (e.g., cell) under the camera and maintain the knowledge of which frames belong to the same particle. In some examples, the particle is tracked until it has been classified and/or sorted. In some examples, the particle can be tracked by one or more morphological (e.g., shape, size, area, volume, texture, thickness, roundness, etc.) and/or optical (e.g., light emission, transmission, reflectance, absorbance, fluorescence, luminescence, etc.) characteristics of the particle. In some examples, each particle can be assigned a score (e.g., a characteristic score) based on the one or more morphological and/or optical characteristics, thereby tracking and confirming the particle as the particle travels through the microfluidic channel.
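A minimal sketch of such score-based frame-to-frame tracking is shown below. The greedy nearest-neighbour matching, the data structures, and the distance and score thresholds are illustrative assumptions rather than the tracking method actually used by the platform.

```python
from dataclasses import dataclass, field
from itertools import count

_track_ids = count()

@dataclass
class Track:
    track_id: int
    last_x: float                 # last known downstream position (pixels)
    score: float                  # characteristic score from morphology/optics
    frames: list = field(default_factory=list)

def assign_detections(tracks, detections, max_dx=80.0, max_dscore=0.15):
    """Greedily match new detections to open tracks.

    A detection joins a track when it lies within `max_dx` pixels downstream
    and its characteristic score differs by less than `max_dscore`; otherwise
    a new track is opened.
    """
    for det in detections:        # det: dict with keys 'x', 'score', 'frame'
        candidates = [t for t in tracks
                      if 0 <= det["x"] - t.last_x <= max_dx
                      and abs(det["score"] - t.score) <= max_dscore]
        if candidates:
            best = min(candidates, key=lambda t: det["x"] - t.last_x)
            best.last_x, best.score = det["x"], det["score"]
            best.frames.append(det["frame"])
        else:
            tracks.append(Track(next(_track_ids), det["x"], det["score"],
                                [det["frame"]]))
    return tracks
```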
[0272] In some examples, the systems and methods of the disclosure comprise imaging a single particle in a particular field of view of the camera. In some examples, the same instrument that performs imaging operations can also perform sorting operations. In some examples, the system and methods of the present disclosure image multiple particles in the same field of view of camera.
Imaging multiple particles in the same field of view of the camera can provide additional advantages, for example it will increase the throughput of the system by batching the data collection and transmission of multiple particles. In some instances, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, or more particles are imaged in the same field of view of the camera. In some instances, about 100 to about 200 particles are imaged in the same field of view of the camera. In some instances, at most about 100, at most about 90, at most about 80, at most about 70, at most about 60, at most about 50, at most about 40, at most about 30, at most about
20, at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 particles are imaged in the same field of view of the camera. In some examples, the number of the particles (e.g., cells) that are imaged in the same field of view may not be changed throughout the operation of the cartridge. In another example, the number of the particles (e.g., cells) that are imaged in the same field of view can be changed in real-time throughout the operation of the cartridge, e.g., to increase speed of the classification and/or sorting process without negatively affecting quality or accuracy of the classification and/or sorting process.
[0273] The imaging region may be downstream of the focusing region and the ordering region.
Thus, the imaging region may not be part of the focusing region and the ordering region. In an example, the focusing region may not comprise or be operatively coupled to any imaging device that is configured to capture one or more images to be used for particle analysis (e.g., cell classification).
[0274] In some examples, the systems and the methods of the present disclosure actively sort a stream of particles. The term sort or sorting as used herein refers to physically separating particles, e.g., cells, with one or more desired characteristics. The desired characteristic(s) can comprise a feature of the cell(s) analyzed and/or obtained from the image(s) of the cell. [0275] Examples of the morphometric feature of the cell(s) can comprise a size, shape, volume, electromagnetic radiation absorbance and/or transmittance (e.g., fluorescence intensity, luminescence intensity, etc.), or viability (e.g., when live cells are used).
[0276] The flow channel can branch into a plurality of channels, and the cell sorting system can be configured to sort the cell by directing the cell to a selected channel of the plurality of channels based on the analyzed image of the cell. The analyzed image can be indicative of one or more features of the cell, wherein the feature(s) are used as parameters of cell sorting. In some examples, one or more channels of the plurality of channels can have a plurality of sub-channels, and the plurality of sub-channels can be used to further sort the cells that have been sorted once.
[0277] Cell sorting may comprise isolating one or more target cells from a population of cells. The target cell(s) can be isolated into a separate reservoir that keeps the target cell(s) separate from the other cells of the population. Cell sorting accuracy can be defined as a proportion (e.g., a percentage) of the target cells in the population of cells that have been identified and sorted into the separate reservoir. In some examples, the cell sorting accuracy of the cartridge provided herein can be at least about 80 %, at least about 81 %, at least about 82 %, at least about 83 %, at least about 84 %, at least about 85 %, at least about 86 %, at least about 87 %, at least about 88 %, at least about 89 %, at least about 90 %, at least about 91 %, at least about 92 %, at least about 93 %, at least about 94 %, at least about 95 %, at least about 96 %, at least about 97 %, at least about 98 %, at least about 99 %, or more (e.g., about 99.9% or about 100%). In some examples, the cell sorting accuracy of the cartridge provided herein may be at most about 100 %, at most about 99 %, at most about 98 %, at most about 97 %, at most about 96 %, at most about 95 %, at most about 94 %, at most about 93 %, at most about 92 %, at most about 91 %, at most about 90 %, at most about 89 %, at most about 88 %, at most about 87 %, at most about 86 %, at most about 85 %, at most about 84 %, at most about 83 %, at most about 82 %, at most about 81 %, or at most about 80 %, or less.
[0278] In some examples, cell sorting may be performed at a rate of at least about 1 cell/second, at least about 5 cells/second, at least about 10 cells/second, at least about 50 cells/second, at least about 100 cells/second, at least about 500 cells/second, at least about 1,000 cells/second, at least about 5,000 cells/second, at least about 10,000 cells/second, at least about 50,000 cells/second, or more. In some examples, cell sorting may be performed at a rate of at most about 50,000 cells/second, at most about 10,000 cells/second, at most about 5,000 cells/second, at most about 1,000 cells/second, at most about 500 cells/second, at most about 100 cells/second, at most about 50 cells/second, at most about 10 cells/second, at most about 5 cells/second, or at most about 1 cell/second, or less.
[0279] In some examples, the systems and methods disclosed herein use an active sorting mechanism. In various examples, the active sorting is independent from analysis and decision making platforms and methods. In various examples the sorting is performed by a sorter, which receives a signal from the decision making unit (e.g., a classifier), or any other external unit, and then sorts cells as they arrive at the bifurcation. The term bifurcation as used herein refers to the termination of the flow channel into two or more channels, such that cells with the one or more desired characteristics are sorted or directed towards one of the two or more channels and cells without the one or more desired characteristics are directed towards the remaining channels. In some examples, the flow channel terminates into at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more channels. In some examples, the flow channel terminates into at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 channels. In some examples, the flow channel terminates in two channels and cells with one or more desired characteristics are directed towards one of the two channels (the positive channel), while cells without the one or more desired characteristics are directed towards the other channel (the negative channel).
In some examples, the flow channel terminates in three channels and cells with a first desired characteristic are directed to one of the three channels, cells with a second desired characteristic are directed to another of the three channels, and cells without the first desired characteristic and the second desired characteristic are directed to the remaining of the three channels.
[0280] In some examples, the sorting is performed by a sorter. The sorter may function by predicting the exact time at which the particle will arrive at the bifurcation. To predict the time of particle arrival, the sorter can use any applicable method. In some examples, the sorter predicts the time of arrival of the particle by using (i) velocity of particles (e.g., downstream velocity of a particle along the length of the microfluidic channel) that are upstream of the bifurcation and (ii) the distance between the velocity measurement/calculation location and the bifurcation. In some examples, the sorter predicts the time of arrival of the particles by using a constant delay time as an input.
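The velocity-and-distance approach can be summarized by a short, non-limiting sketch; the function name, the units, and the assumption of a roughly constant downstream velocity over the remaining distance are illustrative only.

```python
def predict_arrival_time(t_measured_s: float,
                         velocity_um_per_s: float,
                         distance_to_bifurcation_um: float) -> float:
    """Predict when a particle reaches the bifurcation.

    Combines the velocity measured at a known upstream location with the
    remaining travel distance, assuming the downstream velocity stays
    roughly constant over that distance.
    """
    return t_measured_s + distance_to_bifurcation_um / velocity_um_per_s

# Example: a cell measured at t = 2.000 s moving at 50,000 um/s with
# 5,000 um left to travel is predicted to arrive ~0.1 s later.
print(predict_arrival_time(2.000, 50_000.0, 5_000.0))   # -> 2.1
```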
[0281] In some examples, prior to the cell’s arrival at the bifurcation, the sorter may measure the velocity of a particle (e.g., a cell) at least about 1, at least about 2, at least about 3, at least about 4, or at least about 5, or more times. In some examples, prior to the cell’s arrival at the bifurcation, the sorter may measure the velocity of the particle at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 time. In some examples, the sorter may use at least about 1, at least about 2, at least about 3, at least about 4, or at least about 5, or more sensors. In some examples, the sorter may use at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 sensor.
Examples of the sensor(s) include an imaging device (e.g., a camera such as a high-speed camera), a one- or multi-point light (e.g., laser) detector, etc. Referring to FIGS. 8A-8B, the sorter may use any one of the imaging devices (e.g., the high-speed camera system 814) disposed at or adjacent to the imaging region 838. In some examples, the same imaging device(s) can be used to capture one or more images of a cell as the cell is rotating and migrating within the channel, and the one or more images can be analyzed to (i) classify the cell and (ii) measure a rotational and/or lateral velocity of the cell within the channel and predict the cell’s arrival time at the bifurcation. In some examples, the sorter may use one or more sensors that are different than the imaging devices of the imaging region 838. The sorter may measure the velocity of the particle (i) upstream of the imaging region 838, (ii) at the imaging region 838, and/or (iii) downstream of the imaging region 838.
[0282] The sorter may comprise or be operatively coupled to a processor, such as a computer processor. Such processor can be the processor 816 that is operatively coupled to the imaging device
814 or a different processor. The processor can be configured to calculate the velocity of a particle (rotational and/or downstream velocity of the particle) and predict the time of arrival of the particle at the bifurcation. The processor can be operatively coupled to one or more valves of the bifurcation.
The processor can be configured to direct the valve(s) to open and close any channel in fluid communication with the bifurcation. The processor can be configured to predict and measure when operation of the valve(s) (e.g., opening or closing) is completed.
[0283] In some examples, the sorter may comprise a self-included unit (e.g., comprising the sensors, such as the imaging device(s)) which is capable of (i) predicting the time of arrival of the particles and/or (ii) detecting the particle as it arrives at the bifurcation. In order to sort the particles, the order at which the particles arrive at the bifurcation, as detected by the self-included unit, can be matched to the order of the received signal from the decision making unit (e.g., a classifier). In some examples, controlled particles are used to align and update the order as necessary. In some examples, the decision making unit may classify a first cell, a second cell, and a third cell, respectively, and the sorter may confirm that the first cell, the second cell, and the third cell are sorted, respectively, in the same order. If the order is confirmed, the classification and sorting mechanisms (or deep learning algorithms) may remain the same. If the order is different between the classifying and the sorting, then the classification and/or sorting mechanisms (or deep learning algorithms) can be updated or optimized, either manually or automatically. In some examples, the controlled particles can be cells (e.g., live or dead cells).
[0284] In some examples, the controlled particles can be special calibration beads (e.g., plastic beads, metallic beads, magnetic beads, etc.). In some examples the calibration beads used are polystyrene beads with size ranging between about 1 µm to about 50 µm. In some examples the calibration beads used are polystyrene beads with size of at least about 1 µm. In some examples the calibration beads used are polystyrene beads with size of at most about 50 µm. In some examples the calibration beads used are polystyrene beads with size ranging between about 1 µm to about 3 µm, about 1 µm to about 5 µm, about 1 µm to about 6 µm, about 1 µm to about 10 µm, about 1 µm to about 15 µm, about 1 µm to about 20 µm, about 1 µm to about 25 µm, about 1 µm to about 30 µm, about 1 µm to about 35 µm, about 1 µm to about 40 µm, about 1 µm to about 50 µm, about 3 µm to about 5 µm, about 3 µm to about 6 µm, about 3 µm to about 10 µm, about 3 µm to about 15 µm, about 3 µm to about 20 µm, about 3 µm to about 25 µm, about 3 µm to about 30 µm, about 3 µm to about 35 µm, about 3 µm to about 40 µm, about 3 µm to about 50 µm, about 5 µm to about 6 µm, about 5 µm to about 10 µm, about 5 µm to about 15 µm, about 5 µm to about 20 µm, about 5 µm to about 25 µm, about 5 µm to about 30 µm, about 5 µm to about 35 µm, about 5 µm to about 40 µm, about 5 µm to about 50 µm, about 6 µm to about 10 µm, about 6 µm to about 15 µm, about 6 µm to about 20 µm, about 6 µm to about 25 µm, about 6 µm to about 30 µm, about 6 µm to about 35 µm, about 6 µm to about 40 µm, about 6 µm to about 50 µm, about 10 µm to about 15 µm, about 10 µm to about 20 µm, about 10 µm to about 25 µm, about 10 µm to about 30 µm, about 10 µm to about 35 µm, about 10 µm to about 40 µm, about 10 µm to about 50 µm, about 15 µm to about 20 µm, about 15 µm to about 25 µm, about 15 µm to about 30 µm, about 15 µm to about 35 µm, about 15 µm to about 40 µm, about 15 µm to about 50 µm, about 20 µm to about 25 µm, about 20 µm to about 30 µm, about 20 µm to about 35 µm, about 20 µm to about 40 µm, about 20 µm to about 50 µm, about 25 µm to about 30 µm, about 25 µm to about 35 µm, about 25 µm to about 40 µm, about 25 µm to about 50 µm, about 30 µm to about 35 µm, about 30 µm to about 40 µm, about 30 µm to about 50 µm, about 35 µm to about 40 µm, about 35 µm to about 50 µm, or about 40 µm to about 50 µm. In some examples the calibration beads used are polystyrene beads with size of about 1 µm, about 3 µm, about 5 µm, about 6 µm, about 10 µm, about 15 µm, about 20 µm, about 25 µm, about 30 µm, about 35 µm, about 40 µm, or about 50 µm.
[0285] In some examples, the sorter (or an additional sensor disposed at or adjacent to the bifurcation) can be configured to validate arrival of the particles (e.g., the cells) at the bifurcation. In some examples, the sorter can be configured to measure an actual arrival time of the particles (e.g., the cells) at the bifurcation. The sorter may analyze (e.g., compare) the predicted arrival time, the actual arrival time, the velocity of the particles downstream of the channel prior to any adjustment of the velocity, and/or a velocity of the particles downstream of the channel subsequent to such adjustment of the velocity. Based on the analyzing, the sorter may modify any operation (e.g., cell focusing, cell rotation, controlling cell velocity, cell classification algorithms, valve actuation processes, etc.) of the cartridge. The validation by the sorter can be used for closed-loop and real-time update of any operation of the cartridge.
[0286] In some examples, to predict the time of arrival of one or more cells for sorting, the systems, methods, and platforms disclosed herein can dynamically adjust a delay time (e.g., a constant delay time) based on imaging of the cell(s) or based on tracking of the cell(s) with light (e.g., laser). By detecting changes (e.g., flow rates, velocity of aggregate of multiple cells, the lateral location of cells in the channel, etc.) the delay time (e.g., time at which the cells arrive at the bifurcation) can be predicted and adjusted in real-time (e.g., every few milliseconds). A feedback loop can be designed that can constantly read such changes and adjust the delay time accordingly. In another example, the delay time can be adjusted for each cell/particle. The delay time can be calculated separately for each individual cell, based on, e.g., its velocity, lateral position in the channel, and/or time of arrival at specific locations along the channel (e.g., using tracking based on lasers or other methods). The calculated delay time can then be applied to the individual cell/particle (e.g., if the cell is a positive cell or a target cell, the sorting can be performed according to its specific delay time or a predetermined delay time). In some examples, the sorters used in the systems and methods disclosed herein are self- learning cell sorting systems or intelligent cell sorting systems, as disclosed herein.
[0287] These sorting systems can continuously learn based on the outcome of sorting. For example, a sample of cells is sorted, the sorted cells are analyzed, and the results of this analysis are fed back to the classifier. In some examples, the cells that are sorted as “positive” (i.e., target cells or cells of interest) can be analyzed and validated. In some examples, the cells that are sorted as “negative” (i.e., non-target cells or cells not of interest) can be analyzed and validated. In some examples, both positive and negative cells can be validated. Such validation of sorted cells (e.g., based on secondary imaging and classification) can be used for closed-loop and real-time update of the primary cell classification algorithms.
[0288] In some examples, a flush mechanism can be used during sorting. The flush mechanism can ensure that the cell which has been determined to be sorted to a specific bucket or well will end up there (e.g., not be stuck in various parts of the channel or outlet). The flush mechanism can ensure that the channel and outlets stay clean and debris-free for maximum durability. The flush mechanism can inject additional solutions/reagents (e.g., cell lysis buffers, barcoded reagents, etc.) to the well or droplet that the cell is being sorted into. The flush mechanism can be supplied by a separate set of channels and/or valves which are responsible to flow a fluid at a predefined cadence in the direction of sorting.
[0289] In some examples, the methods and systems disclosed herein can use any sorting technique to sort particles. At least a portion of the collection reservoir may or may not be pre-filled with a fluid, e.g., a buffer. In some examples, the sorting technique comprises closing a channel on one side of the bifurcation to collect the desired cell on the other side. In some examples, the closing of the channels can be carried out by employing any known technique. In some examples, the closing is carried out by application of a pressure. In some instances, the pressure is pneumatic actuation. In some examples,
the pressure can be positive pressure or negative pressure. In some examples, positive pressure is used.
In some examples, one side of the bifurcation is closed by applying pressure and deflecting the soft membrane between top and bottom layers. Other examples of systems and methods of particle (e.g., cell) imaging, analysis, and sorting are further described in International Application No.
PCT/US2017/033676 and International Application No. PCT/US2019/046557, each of which is incorporated herein by reference in its entirety.
[0290] In various examples, the systems and methods of the present disclosure comprise one or more reservoirs designed to collect the particles after the particles have been sorted. In some examples, the number of cells to be sorted is about 1 cell to about 1,000,000 cells. In some examples, the number
of cells to be sorted is at least about 1 cell. In some examples, the number of cells to be sorted is at most about 1,000,000 cells. In some examples, the number of cells to be sorted is about 1 cell to about 100 cells, about 1 cell to about 500 cells, about 1 cell to about 1,000 cells, about 1 cell to about 5,000 cells, about 1 cell to about 10,000 cells, about 1 cell to about 50,000 cells, about 1 cell to about 100,000 cells, about 1 cell to about 500,000 cells, about 1 cell to about 1,000,000 cells, about 100 cells to about 500 cells, about 100 cells to about 1,000 cells, about 100 cells to about 5,000 cells, about 100 cells to about 10,000 cells, about 100 cells to about 50,000 cells, about 100 cells to about 100,000 cells, about 100 cells to about 500,000 cells, about 100 cells to about 1,000,000 cells, about 500 cells to about 1,000 cells, about 500 cells to about 5,000 cells, about 500 cells to about 10,000 cells, about 500 cells to about 50,000 cells, about 500 cells to about 100,000 cells, about 500 cells to about 500,000 cells, about 500 cells to about 1,000,000 cells, about 1,000 cells to about 5,000 cells, about 1,000 cells to about 10,000 cells, about 1,000 cells to about 50,000 cells, about 1,000 cells to about 100,000 cells, about 1,000 cells to about 500,000 cells, about 1,000 cells to about 1,000,000 cells, about 5,000 cells to about 10,000 cells, about 5,000 cells to about 50,000 cells, about 5,000 cells to about 100,000 cells, about 5,000 cells to about 500,000 cells, about 5,000 cells to about 1,000,000 cells, about 10,000 cells to about 50,000 cells, about 10,000 cells to about 100,000 cells, about 10,000 cells to about 500,000 cells, about 10,000 cells to about 1,000,000 cells, about 50,000 cells to about 100,000 cells, about 50,000 cells to about 500,000 cells, about 50,000 cells to about 1,000,000 cells, about 100,000 cells to about 500,000 cells, about 100,000 cells to about 1,000,000 cells, or about 500,000 cells to about 1,000,000 cells. In some examples, the number of cells to be sorted is about 1 cell, about 100 cells, about 500 cells, about 1,000 cells, about 5,000 cells, about 10,000 cells, about 50,000 cells, about 100,000 cells, about 500,000 cells, or about 1,000,000 cells.
[0291] In some examples, the number of cells to be sorted is about 100 to about 500 cells, about 200 to about 500 cells, about 300 to about 500 cells, about 350 to about 500 cells, about 400 to about 500 cells, or about 450 to about 500 cells. In some examples, the reservoirs can be milliliter scale reservoirs. In some examples, the one or more reservoirs are pre-filled with a buffer and the sorted cells are stored in the buffer. Using the buffer helps to increase the volume of the collected cell suspension, which can then be easily handled, for example by pipetting. In some examples, the buffer is a phosphate buffer, for example phosphate-buffered saline (PBS).
[0292] In some examples, the system and methods of the present disclosure comprise a cell sorting technique wherein pockets of buffer solution containing no negative objects are sent to the positive output channel in order to push rare objects out of the collection reservoir. In some examples, additional buffer solution is sent to the positive output channel to flush out all positive objects at the end of a run, once the channel is flushed clean (e.g., using the flush mechanism as disclosed herein).
[0293] In some examples, the system and methods of the present disclosure comprise a cell retrieving technique, wherein sorted cells can be retrieved for downstream analysis (e.g., molecular analysis). Non-limiting examples of the cell retrieving technique can include: retrieval by centrifugation; direct retrieval by pipetting; direct lysis of cells in a well; sorting in a detachable tube; feeding into a single cell dispenser to be deposited into 96- or 384-well plates; etc.
[0294] In some examples, the system and methods of the present disclosure comprise a combination of techniques, wherein a graphics processing unit (GPU) and a digital signal processor (DSP) are used to run artificial intelligence (AI) algorithms and apply classification results in real-time to the system.
In some examples, the system and methods of the present disclosure comprise a hybrid method for real-time cell sorting.
[0295] In some examples, the system and methods of the present disclosure comprise a feedback loop (e.g., an automatic feedback loop). For example, the system and methods can be configured to (i) monitor the vital signals and (ii) finetune one or more parameters of the system and methods based on the signals being read. At the beginning of or throughout the run (e.g., the use of the microfluidic channel for cell imaging, classification, and/or sorting), a processor (e.g., a ML/AI processor as disclosed herein) can specify target values for one or more selected parameters (e.g., flow rate, cell rate, etc.).
In another example, other signals that reflect (e.g., automatically reflect) the quality of the run (e.g., the number of cells that are out of focus within the last 100 imaged cells) can be utilized in the feedback loop. The feedback loop can receive (e.g., in real-time) values of the parameters/signals disclosed herein and, based on the predetermined target values and/or one or more general mandates (e.g., the fewer the out-of-focus cells, the better), the feedback loop can facilitate adjustments (e.g., adjustments to pressure systems, illumination, stage, etc.). In some examples, the feedback loop can be designed to monitor and/or handle degenerate scenarios, in which the microfluidic system is not responsive or malfunctioning (e.g., outputting a value read that is out of range of acceptable reads).
[0296] In some examples, the system and methods of the present disclosure can adjust a cell classification threshold based on an expected true positive rate for a sample type. The expected true positive rate can come from statistics gathered in one or more previous runs from the same or other patients with similar conditions. Such an approach can help neutralize run-to-run variations (e.g., illumination, chip fabrication variation, etc.) that would impact imaging and hence any inference therefrom.
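One plausible, non-limiting way to realize such a threshold adjustment is to place the threshold at the score quantile that matches the expected positive fraction, as sketched below. The quantile rule, the function names, and the simulated scores are assumptions made for illustration rather than the method actually used by the system.

```python
import numpy as np

def threshold_for_expected_rate(recent_scores, expected_positive_rate: float) -> float:
    """Set the classification threshold so that the fraction of cells called
    positive in the recent window matches the expected true positive rate.

    `recent_scores` are classifier probabilities from the current run;
    `expected_positive_rate` comes from statistics of prior, similar runs.
    """
    return float(np.quantile(recent_scores, 1.0 - expected_positive_rate))

# Example: if ~5% of cells are expected to be targets, the threshold is set
# at the 95th percentile of recently observed scores.
scores = np.random.default_rng(0).beta(2, 8, size=10_000)
print(threshold_for_expected_rate(scores, expected_positive_rate=0.05))
```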
[0297] In some examples, the systems disclosed herein further comprise a validation unit that detects the presence of a particle without getting detailed information, such as imaging. In some instances, the validation unit can be used for one or more purposes. In some examples, the validation unit detects a particle approaching the bifurcation and enables precise sorting. In some examples, the validation unit detects a particle after the particle has been sorted to one of subchannels in fluid communication with the bifurcation. In some examples, the validation unit provides timing information with a plurality of laser spots, e.g., two laser spots. In some instances, the validation unit provides timing information by referencing the imaging time. In some instances, the validation unit provides precise time delay information and/or flow speed of particles.
[0298] In some examples, the particles (e.g., cells) analyzed by the systems and methods disclosed herein are comprised in a sample. The sample can be a biological sample obtained from a subject (e.g., a human or any animal). It should be appreciated that the animal can be of any applicable type, including, but not limited to, mammals or non-mammals. For example, the animal can be a veterinary animal, livestock animal, or pet-type animal, etc. As an example, the animal can be a laboratory animal specifically selected to have certain characteristics similar to a human (e.g., rat, dog, pig, monkey, or the like). It should be appreciated that the subject can be any applicable human patient, for example. In some examples, the biological sample comprises a biopsy sample from a subject. In some examples, the biological sample comprises a tissue sample from a subject. In some examples, the biological sample comprises liquid biopsy from a subject. In some examples, the biological sample can be a solid biological sample, e.g., a tumor sample. In some examples, a sample from a subject can comprise at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% tumor cells from a tumor.
[0299] In some examples, the sample can be a liquid biological sample. In some examples, the liquid biological sample can be a blood sample (e.g., whole blood, plasma, or serum). A whole blood sample can be subjected to separation of non-cellular components (e.g., plasma, serum) and cellular components by use of a Ficoll reagent. In some examples, the liquid biological sample can be a urine sample. In some examples, the liquid biological sample can be a perilymph sample. In some examples, the liquid biological sample can be a fecal sample. In some examples, the liquid biological sample can be saliva. In some examples, the liquid biological sample can be semen. In some examples, the liquid biological sample can be amniotic fluid. In some examples, the liquid biological sample can be cerebrospinal fluid. In some examples, the liquid biological sample can be bile. In some examples, the liquid biological sample can be sweat. In some examples, the liquid biological sample can be tears. In some examples, the liquid biological sample can be sputum. In some examples, the liquid biological sample can be synovial fluid. In some examples, the liquid biological sample can be vomit.
[0300] In some examples, samples can be collected over a period of time and the samples can be compared to each other or with a standard sample using the systems and methods disclosed herein. In some examples the standard sample is a comparable sample obtained from a different subject, for example a different subject that is known to be healthy or a different subject that is known to be unhealthy. Samples can be collected over regular time intervals, or can be collected intermittently over irregular time intervals.
Cell Feature Extraction: Output Data and Interpretation
[0301] FIG. 9 illustrates an example training architecture 900 for aspects of the herein-disclosed systems, such as the human foundation model previously illustrated in FIG. 1. Architecture 900 can be completely symmetric, completely asymmetric, partially symmetric, or partially asymmetric, or a combination of the foregoing, and include a multi-layered (e.g., 18 layers deep) convolutional neural network. It is understood that, after training, the system (e.g., system 100) can predict morphometric characteristics of cellular images, including but not limited to cell class, cell type, cell state, other morphometric features such as blobs, related probabilities, and related accuracy identifiers.
[0302] In some examples, architecture 900 can be deployed on instruments (e.g., the REM-I instrument of Table 4) and can be used to generate embeddings in a cloud-based computer system.
Architecture 900 can be used in a high-throughput setting so that images 912 can be captured and processed at scale. In some examples, images 912 are captured by a camera (e.g., an ultra high-speed bright-field camera) as cell suspensions flow through a channel in the microfluidics chip. Architecture 900 can include an augmentation module 940 configured to crop collected ultra-high-speed bright-field images 912 of cells as they pass through an imaging zone (e.g., an imaging zone of a microfluidic chip such as those captured images of FIGs. 8A-8B). Augmentation module 940 can implement one or more augmentation methods to generate batches 942a, b of altered replicas of the images 912. Augmentation techniques of module 940 include, but are not limited to, horizontal and vertical flips of images, orthogonal rotation, translation, Gaussian noise, contrast variation, and the like.
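A minimal, non-limiting sketch of such an augmentation step is shown below; the flip probabilities, noise level, translation range, and contrast factors are placeholder values chosen for illustration and are not the parameters of module 940.

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Produce one altered replica of a cropped cell image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:                                   # horizontal flip
        out = out[:, ::-1]
    if rng.random() < 0.5:                                   # vertical flip
        out = out[::-1, :]
    out = np.rot90(out, k=int(rng.integers(0, 4)))           # orthogonal rotation
    shift = tuple(int(s) for s in rng.integers(-4, 5, size=2))
    out = np.roll(out, shift, axis=(0, 1))                   # small translation
    out = out + rng.normal(0.0, 2.0, out.shape)              # Gaussian noise
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.8, 1.2) + mean        # contrast variation
    return out

def make_view_batches(images, rng, n_views: int = 2):
    """Return `n_views` batches of independently augmented replicas (e.g., 942a, b)."""
    return [np.stack([augment(img, rng) for img in images]) for _ in range(n_views)]
```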
[0303] Batches 942a, b can be used to train a deep learning (DL) encoder 950. Specifically, batches 942a, b of altered replicas of the images 912 can be introduced along with images 912 into DL encoder 950 to generate augmented embeddings 952a, b. Encoder 950 can be trained using a self-supervised learning (SSL) method that learns image features without labels and relies at least on preserving information of its embeddings, including embeddings 952a, 952b, as well as concatenated deep learning predictive embeddings 964 (discussed more particularly below). DL encoder 950 can be a
ResNet based encoder trained using a plurality of unlabeled cell images from different types of samples to detect differences in cell morphology without labeled training data. In some examples, encoder 950 learns image features without labels and with orthogonal morphometric features to improve model performance and interpretability.
[0304] Encoder 950 may include a plurality of convolution layers that use, for example, edge detectors to detect a plurality of edge components of images 912 and batches 942a, b of altered replicas of the images 912. Encoder 950 can also use shape detectors to detect shape components of images 912 and batches 942a, b of altered replicas of the images 912 (e.g., a particular type of cell ridge).
Augmented embeddings 952a, b from deep learning encoder 950 can be used to determine deep learning interpretations of captured images 912 performed in real-time (e.g., approximately <150 ms latency). To generate embeddings 952a, b, encoder 950 can encode features of batches 942a, b into multi-dimensional vectors. In some examples, encoder 950 can extract a 64-dimensional feature vector for each altered image of batches 942a, b and images 912.
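As a non-limiting sketch of such an encoder, the snippet below wraps a stock torchvision ResNet backbone with a 64-dimensional embedding head. The text mentions both an 18-layer network and a ResNet-50 backbone; a ResNet-18 is used here purely for illustration, and all layer choices are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class CellEncoder(nn.Module):
    """ResNet backbone with a 64-dimensional embedding head.

    Single-channel bright-field crops are repeated to three channels so the
    stock backbone can be reused; the 64-d output matches the embedding
    size described in the text.
    """
    def __init__(self, embed_dim: int = 64):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()               # keep the 512-d pooled features
        self.backbone = backbone
        self.head = nn.Linear(512, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.shape[1] == 1:                       # (B, 1, H, W) -> (B, 3, H, W)
            x = x.repeat(1, 3, 1, 1)
        return self.head(self.backbone(x))

encoder = CellEncoder()
crops = torch.randn(8, 1, 128, 128)               # a batch of cropped cell images
print(encoder(crops).shape)                        # torch.Size([8, 64])
```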
[0305] Encoder 950 can be trained with a loss function that utilizes maximum likelihood-based invariance between augmented images (such as mean squared error or categorical cross entropy), as well as variance, covariance, and morphometric decorrelation terms. In some examples, the variance and covariance terms used herein can include estimates of variance and covariance between feature dimensions by calculating the invariance and covariance directly for batches of images (e.g., of hundreds to thousands of images). In some examples, encoder 950 can be iteratively optimized until the DL model converges and calculate statistical quality (e.g., covariance) using the loss function. In some examples, encoder 950 can include a backbone, such as a ResNet-50 backbone, trained with the herein described invariance, variance, covariance, and morphometric decorrelation terms. The loss function uses an invariance term that learns invariance to vector transformations and is regularized with a variance term that prevents norm collapse. In some examples, the invariance term is determined using the mean square distance between embedding vectors (e.g., vectors of embeddings 952a, 952b).
The loss function also uses a covariance term that prevents informational collapse by decorrelating the different dimensions of the vectors of embeddings 952a, 952b. The variance loss constrains the variance term of the vectors of embeddings 952a, 952b along each dimension independently. To determine similarity loss between vectors of embeddings 952a, 952b, a distance between vector pairs of embeddings 952a, b of the augmented images of batches 942a, b of the same cell is minimized (e.g.,
Euclidean distance) and the variance of each embedding 952a, b over a training batch is maintained above a threshold. In some examples, the threshold is a hyperparameter determined by a value that gives the best or most-optimized performance on downstream tasks. In some examples, variance is optimized to be around approximately 1. In some examples, variance can be optimized to be any value in the range (0, ∞), i.e., strictly greater than 0.
[0306] With respect to the covariance term, the different dimensions of embeddings 952a, 952b are driven so that the non-diagonal values of a cross-correlation matrix approach zero, indicating that the dimensions are orthogonal. The covariance between every pair of centered embedding variables 952a, b over a respective batch is attracted to zero (e.g., see each of embeddings 952a, 952b) so that the dimensions of embeddings 952a, b are decorrelated from each other.
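The invariance, variance, covariance, and morphometric-decorrelation terms described above can be illustrated with a short, non-limiting sketch of a VICReg-style objective. The term weights, the small epsilon, and the exact form of the morphometric-decorrelation term are assumptions made for illustration and may differ from the loss actually used to train encoder 950.

```python
import torch
import torch.nn.functional as F

def vicreg_style_loss(z_a, z_b, morpho,
                      w_inv=25.0, w_var=25.0, w_cov=1.0, w_decor=1.0):
    """Invariance + variance + covariance + morphometric-decorrelation loss.

    z_a, z_b : (B, D) embeddings of two augmented views of the same cells.
    morpho   : (B, M) rule-based morphometric features for the same cells.
    """
    b, d = z_a.shape
    inv = F.mse_loss(z_a, z_b)                          # invariance term

    def variance_term(z):
        std = torch.sqrt(z.var(dim=0) + 1e-4)
        return torch.relu(1.0 - std).mean()             # keep per-dim std near/above 1
    var = variance_term(z_a) + variance_term(z_b)

    def covariance_term(z):
        zc = z - z.mean(dim=0)
        cov = (zc.T @ zc) / (b - 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return off_diag.pow(2).sum() / d                # decorrelate embedding dims
    cov = covariance_term(z_a) + covariance_term(z_b)

    # Morphometric decorrelation: push the cross-covariance between the
    # learned embedding and the rule-based features toward zero so the
    # embedding retains information orthogonal to the morphometrics.
    zc = z_a - z_a.mean(dim=0)
    mc = morpho - morpho.mean(dim=0)
    cross = (zc.T @ mc) / (b - 1)
    decor = cross.pow(2).mean()

    return w_inv * inv + w_var * var + w_cov * cov + w_decor * decor
```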
[0307] Architecture 900 can also include a computer vision encoder 960 that can be self-supervised and can include human-constructed algorithms, which in some cases can be referred to as the previously-described "rule-based morphometrics." See Table 1. Encoder 960 may process captured images 912 as input and extract morphometric cell features into a plurality of morphometric vectors 962 (e.g., dimensional morphometric features encoded into 95-dimensional vectors representing the cell morphology). The multi-dimension vectors 962 can include cell position, cell shape, pixel intensity, pixel count, cell size, texture, focus, or combinations thereof. In one nonlimiting example, encoder 960 can extract 99 dimensional embedding vectors representing cell morphology from high resolution images 912. In one nonlimiting example, previously described 64-dimensional embeddings 952a, b, and 51-dimensional morphometric features 962 can be encoded into 95-dimensional vectors representing the cell morphology.
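As a non-limiting illustration of rule-based morphometric extraction, the sketch below computes a handful of such features from a segmented cell image with scikit-image. The particular feature selection, the availability of a binary segmentation mask, and the library choice are assumptions made for illustration and do not reproduce the full feature set of encoder 960.

```python
import numpy as np
from skimage.measure import label, regionprops

def rule_based_morphometrics(mask: np.ndarray, intensity: np.ndarray) -> np.ndarray:
    """Small morphometric feature vector for a single segmented cell.

    `mask` is a binary segmentation of the cell and `intensity` the matching
    bright-field crop; only a few of the many features mentioned in the
    text are included here.
    """
    props = regionprops(label(mask), intensity_image=intensity)[0]
    return np.array([
        props.area,                   # cell size (pixel count)
        props.perimeter,              # boundary length
        props.eccentricity,           # shape elongation
        props.feret_diameter_max,     # max Feret diameter
        props.solidity,               # area / convex hull area
        props.mean_intensity,         # average pixel intensity
        props.major_axis_length,      # long axis
        props.minor_axis_length,      # short axis
    ], dtype=float)
```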
[0308] Example depictions of certain contemplated morphometric cell features are shown in FIGs. 11A to 11B, where FIG. 11A illustrates representative images showing features that include cell shape and size (e.g., convex hull, max/min radius, max Feret diameter, min Feret diameter, long/short axis, etc.). FIG. 11B shows representative images showing features that include pixel intensity and texture (e.g., small white “blobs”, small black “blobs”, large white “blobs”, large black “blobs”, etc.). The illustrated morphometrics that describe size and intensities of “blobs” relate to cellular structures like granules, vesicles, and the like. In some examples, “blobs” can be understood as connected set(s) of pixels that are either substantially or entirely dark or substantially or entirely bright. In some examples, “blobs” can be understood as region(s) in a respective image that differ in properties (e.g., brightness, color, etc.) relative to surrounding region(s).
[0309] In some examples, outputs of encoders 950, 960 can be analyzed together and concatenated as decorrelated concatenated morphometric predictive embeddings 964. Embeddings 964 can be generated adopting a probabilistic approach and/or using deep learning features of encoder 950 (e.g., using conditional batch normalization) concatenated with computer vision morphometric feature embeddings 962 from encoder 960 into different dimensions. Embeddings 964 can be predictive multidimensional vectors that include predictive features related to individual cells, clusters of cells, morphometric features, and related probabilities.
[0310] Embeddings 964 can be generated using morphometric decorrelation, with encoder 950 minimizing the covariance term between vector pairs of embeddings 962 and rule-based morphometric dimensions over a training batch. Embeddings 964 can include encoded novel cell morphology information retained to be orthogonal to the rule-based morphometrics of encoder 960, including "blob" features such as those shown in FIGs. 11A to 11B. In so doing, architecture 900 effectively uses encoder 950 to split its embedding layer into multiple sections to separate morphometric de-correlated embeddings 962 and morphometrics predictive embeddings 964 to predict or otherwise approximate morphometric features, such as "blobs". In so splitting, architecture 900 is able to optimize conflicting objectives in different parameter spaces and thus avoids adding latency to the system. In some examples, a correlation coefficient of predicted and/or approximated morphometric features, such as "blob" features, can be greater than approximately 0.9, though other ranges are contemplated as needed or required, including approximately 0.85 to 0.95.
[0311] In FIG. 10, architecture 900 is continued from FIG. 9 and shows that in one example a multi-layer projector 970 can be included downstream of encoder 950. In this example, projector 970 is provided to project from a high dimension to a low dimension (e.g., 2D), such as by reducing the dimensionality of the embeddings 952a, 952b. In this respect, embeddings 952a, 952b are capable of being visualized.
Projector 970 is configured to reduce the dimensionality to projected embeddings 972a, 972b and map representations of embeddings 952a, 952b. In this example, the previously discussed criterion, including the loss functions with invariance, variance, covariance, and morphometric decorrelation terms can be applied on projected embeddings 972a, 972b.
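A minimal sketch of such a multi-layer projector is given below; the hidden width, the activation, and the batch-normalization layer are placeholder choices, and the 2-dimensional output is only one possible reduced dimensionality for visualization.

```python
import torch
import torch.nn as nn

class Projector(nn.Module):
    """Multi-layer projector mapping embeddings to a lower dimension."""
    def __init__(self, in_dim: int = 64, hidden_dim: int = 128, out_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

# Project a batch of 64-d embeddings (e.g., 952a, 952b) down to 2-D points
# that can be plotted; the training criterion can be applied to these outputs.
projector = Projector()
print(projector(torch.randn(8, 64)).shape)        # torch.Size([8, 2])
```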
[0312] FIG. 12A illustrates cell classes, numbers of images used as a training dataset to train a classifier using features extracted using the human foundation model, numbers of images processed by the human foundation model as a test dataset, and corresponding representative cell images, in accordance with some examples of the present disclosure. A scale bar of 10 µm is shown on the representative cell images.
[0313] FIG. 12B illustrates a confusion matrix between predicted cell classes classified using features extracted using the human foundation model and actual cell classes, in accordance with some examples of the present disclosure. As illustrated, the human foundation model predicts cell classes with a high accuracy. The accuracy for predicting the Jurkat cell line, A375 cell line, and Caov-3 cell line is 90.5%, 89%, and 95.8%, respectively. The accuracy for predicting polystyrene beads is 100%.
Classifying and Sorting
[0314] Views (a) to (f) of FIG. 13 schematically illustrate an example system for classifying and sorting one or more cells. The platform as disclosed herein can allow for the input and flow of cells in suspension with confinement along a single lateral trajectory to obtain a narrow band of focus across the z-axis (views (a) to (f) of FIG. 13). View (a) of FIG. 13 shows the microfluidic chip and the inputs and output of the sorter platform according to one example of the present disclosure. Cells in suspension and sheath fluid are inputted, along with run parameters entered by the user: target cell type(s) and a cap on the number of cells to sort, if sorting is of interest. Upon run completion, the system generates reports of the sample composition (number and types of all of the processed cells) and the parameters of the run, including: length of run, number of analyzed cells, quality of imaging, and quality of the sample. If the sorting option is selected, the system outputs isolated cells in a reservoir on the chip as well as a report of the number of sorted cells, purity of the collected cells, and yield of the sort.
Referring to view (b) of FIG. 13, a combination of hydrodynamic focusing and inertial focusing is used to focus the cells on a single z plane and a single lateral trajectory. Referring to views (c) and (d) of FIG. 13, the diagram shows the interplay between different components of the software (view (c) of FIG. 13) and hardware pieces (view (d) of FIG. 13). The classifier is blown up in view (e) of FIG. 13, depicting the process of image collection, and automated real-time assessment of single cells in flow. After the images are taken, individual cell images are cropped using an automated object detection module; the cropped images are then run through a deep neural network model trained on the relevant cells (e.g., DL encoder 950). For each image, the model can generate deep learning embeddings (e.g., embeddings 952a, 952b), deep learning predictive embeddings 964, as well as generate a prediction vector over the available cell classes, and an inference will be made according to a selection rule (e.g., argmax). The model may also infer the z focusing plane of the image. The percentage of debris and cell clumps may also be predicted by the neural network model as a proxy for "sample quality". View (f) of FIG. 13 shows the performance of sorting. In this figure, the tradeoff between purity and yield is shown in three different modes, for profiling and sorting of 130,000 [A], 500,000 [B] or 1,000,000 [C] cells within one hour.
[0315] Using a combination of hydrodynamic and inertial focusing, the platform can collect ultra high-speed bright-field images of cells as they pass through the imaging zone of the microfluidic chip (views (a) and (b) of FIG. 13). In order to capture the single cell images for processing, an automated object detection module can be incorporated to crop each image centered around the cell, before feeding the cropped images into a deep convolutional neural network (CNN) based on Inception architecture, which is trained on images of relevant cell types. In addition to classifying cells into categories of interest, the CNN can be trained to assess the focus of each image (in Z plane) and identify debris and cell clusters, thus providing information to assess sample quality (view (e) of FIG. 13). A feedback loop can be engineered so that the CNN inferred cell type can be used in real time to regulate pneumatic valves for sorting a cell into either the positive reservoir (cell collection reservoir)
for a targeted category of interest or a waste outlet (FIG. 13A). Sorted cells in the reservoir may then be retrieved for downstream processing and molecular analysis. In some examples, the feedback loop can be engineered so that the generated deep learning embeddings (e.g., embeddings 952a, 952b, 964, etc.) can be used in real time to regulate pneumatic valves for sorting a cell into either a cell collection reservoir or a waste outlet (FIG. 13A).
[0316] FIG. 14 schematically illustrates operations that can be performed in an example method.
View (a) of FIG. 14 shows that high resolution images of single cells in flow are stored. Referring to view (b) of FIG. 14, AIAIA (AI-Assisted Image Annotation) is used to cluster individual cell images into morphologically similar groups of cells. In some examples, AIAIA is used to cluster individual cell images into groups of cells using deep learning embeddings (e.g., embeddings 952a, 952b, 964). In some examples, a user uses the labeling tool to adjust and batch-label the cell clusters. In the example shown, one AML cell can be mis-clustered into a group of WBC cells and an image showing a cell clump (debris) can be mis-clustered in an NSCLC cell group. These errors are corrected by the "Expert clean-up" operation of view (b). Referring to view (c) of FIG. 14, the annotated cells are then integrated into a Cell Morphology Atlas (CMA). Referring to view (d) of FIG. 14, the CMA is used to generate both training and validation sets of the next generation of the models. Referring to view (e) of FIG. 14, during a sorting experiment, the pre-trained model shown in view (d) of FIG. 14 is used to infer the cell type (class) in real-time. The enriched cells are retrieved from the device. The retrieved cells are further processed for molecular profiling. The platform can be run in multiple different modes. In the training/validation mode, the collected images of a sample can be fed to the AIAIA, configured to use unsupervised learning to group cells into sub-clusters. In some examples, the sub-clusters can be morphologically distinct sub-clusters. In some examples, the cells can be grouped using deep learning embeddings (e.g., embeddings 952a, 952b, 964). Using AIAIA, a user can clean up the sub-clusters by removing cells that are incorrectly clustered and annotate each cluster based on a
Atlas (CMA), a growing database of expert-annotated images of single cells. The CMA is broken down into training and validation sets and is used to train and evaluate CNN models aimed at identifying cell types, cell states, morphometric features, and/or the like. Under the analysis mode (view (d) of FIG. 14), the collected images are fed into models that had been previously trained using the CMA, and a report is generated demonstrating the composition of the sample of interest. A UMAP visualization is used to depict the morphometric map of all the single cells within the sample. A set of prediction probabilities is also generated showing the classifier prediction of each individual cell within the sample belonging to every predefined cell class within the CMA. In the sorting mode (view (e) of FIG. 14), the collected images are passed to the CNN in real-time and a decision is made on the fly to assign each single cell to one of the predefined classes within the CMA. In some examples, the collected images are passed to the CNN in real-time and the decision is made on the fly to assign each single cell using deep learning embeddings (e.g., embeddings 952a, 952b, 964) within the CMA. The target cells are then sorted in real-time and are outputted for downstream molecular assessment.
[0317] FIG. 15 shows a computer system that is programmed or otherwise configured to implement methods provided herein. For example, the present disclosure provides computer systems that are programmed to implement methods of the disclosure. FIG. 15 shows a computer system 1501 that is programmed or otherwise configured to capture and/or analyze one or more images of the cell. The computer system 1501 can regulate various examples of components of the cell sorting system of the present disclosure, such as, for example, the pump, the valve, and the imaging device. The computer system 1501 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device can be a mobile electronic device.
[0318] The computer system 1501 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1505, which can be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 1501 also includes memory or memory location 1510 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1515 (e.g., hard disk), communication interface 1520 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1525, such as cache, other memory, data storage and/or electronic display adapters. The memory 1510, storage unit 1515, interface 1520 and peripheral devices 1525 are in communication with the CPU 1505 through a communication bus (solid lines), such as a motherboard. The storage unit 1515 can be a data storage unit (or data repository) for storing data. The computer system 1501 can be operatively coupled to a computer network (“network”) 1530 with the aid of the communication interface 1520. The network 1530 can be the
Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the
Internet. The network 1530 in some cases is a telecommunication and/or data network. The network 1530 can include one or more computer servers, which can enable distributed computing, such as cloud computing. The network 1530, in some cases with the aid of the computer system 1501, can implement a peer-to-peer network, which may enable devices coupled to the computer system 1501 to behave as a client or a server.
[0319] The CPU 1505 can execute a sequence of machine-readable instructions, which can be embodied in a program or software. The instructions can be stored in a memory location, such as the memory 1510. The instructions can be directed to the CPU 1505, which can subsequently program or otherwise configure the CPU 1505 to implement methods of the present disclosure. Examples of operations performed by the CPU 1505 can include fetch, decode, execute, and writeback.
[0320] The CPU 1505 can be part of a circuit, such as an integrated circuit. One or more other components of the system 1501 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).
[0321] The storage unit 1515 can store files, such as drivers, libraries and saved programs. The storage unit 1515 can store user data, e.g., user preferences and user programs. The computer system 1501 in some cases can include one or more additional data storage units that are external to the computer system 1501, such as located on a remote server that is in communication with the computer system 1501 through an intranet or the Internet.
[0322] The computer system 1501 can communicate with one or more remote computer systems through the network 1530. For instance, the computer system 1501 can communicate with a remote computer system of a user. Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs, telephones, smart phones (e.g., Apple® iPhone, Android-enabled device,
Blackberry®), or personal digital assistants. The user can access the computer system 1501 using the network 1530.
[0323] Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1501, such as, for example, on the memory 1510 or electronic storage unit 1515. The machine executable or machine readable code can be provided in the form of software. During use, the code can be executed by the processor 1505. In some cases, the code can be retrieved from the storage unit 1515 and stored on the memory 1510 for ready access by the processor 1505. In some situations, the electronic storage unit 1515 can be precluded, and machine-executable instructions are stored on memory 1510.
[0324] The code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime. The code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as- compiled fashion.
[0325] Examples of the systems and methods provided herein, such as the computer system 1501, can be embodied in programming. Various examples of the technology can be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine- executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
[0326] “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also can be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
[0327] Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as can be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM,
a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media can be involved in carrying one or more sequences of one or more instructions to a processor for execution.
[0328] The computer system 1501 can include or be in communication with an electronic display 1535 that comprises a user interface (UI) 1540 for providing, for example, the one or more images of the cell that is transported through the channel of the cell sorting system. In some cases, the computer system 1501 can be configured to provide live feedback of the images. Examples of UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
[0329] Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 1505. The algorithm can include, for example, the human foundation model.
[0330] It will be appreciated that the features and operations described herein can be used in any suitable combination with one another. For example, FIG. 16 illustrates an example flow of operations in a method of processing images. Method 1600 illustrated in FIG. 16 includes extracting, using a
Deep Learning (DL) model (e.g., DL encoder 950), a set of machine learning (ML)-based features from a cell image (e.g., images 912 and/or augmented images thereof such as in batches 942a, 942b) (operation 1610). As one example, any suitable component(s) may include a processor and a non-transitory computer readable medium storing the machine learning model and related encoder, and instructions for causing the processor to perform operations. For example, the processor can be included with a cloud-based computing environment and/or within a microfluidics platform (e.g., platform 20, 310).
Nonlimiting examples of machine learning encoders (such as deep learning encoders, for example, convolutional neural networks) are provided elsewhere herein. In some examples, cells in the one or more cell images are unstained. The one or more cell images are brightfield cell images in some examples, though it will be appreciated that other types of cell images readily can be provided to the machine learning encoder and computer vision encoder.
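As a hedged illustration of operation 1610 (and not the actual DL encoder 950), a small convolutional encoder in PyTorch could map a 256 x 256 single-channel brightfield image to an ML-based feature vector; the layer sizes, feature dimension, and the class name TinyCellEncoder are assumptions for this sketch.

```python
import torch
import torch.nn as nn

class TinyCellEncoder(nn.Module):
    """Toy CNN encoder mapping a single-channel cell image to a feature vector."""
    def __init__(self, feature_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 256 -> 128
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 128 -> 64
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.AdaptiveAvgPool2d(1),                                # global pooling
        )
        self.head = nn.Linear(64, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 256, 256) label-free brightfield cell images
        return self.head(self.backbone(x).flatten(1))

encoder = TinyCellEncoder()
images = torch.rand(8, 1, 256, 256)   # a batch of cell images
ml_features = encoder(images)          # (8, 128) ML-based feature vectors
```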
[0331] Method 1600 illustrated in FIG. 16 also may include generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other (operation 1620).
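One way mutual decorrelation (approximate orthogonality) between embedding dimensions can be encouraged during training is with a covariance penalty on a batch of embeddings, optionally combined with a cross-decorrelation penalty against the rule-based morphometric features. The sketch below mirrors the covariance and morphometric decorrelation terms recited in the clauses later in this disclosure; the exact loss, weighting, and tensor shapes here are illustrative assumptions.

```python
import torch

def covariance_penalty(z: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal covariance between embedding dimensions."""
    z = z - z.mean(dim=0)
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - torch.diag(torch.diag(cov))
    return (off_diag ** 2).sum() / d

def morphometric_decorrelation(z: torch.Tensor, m: torch.Tensor) -> torch.Tensor:
    """Penalize correlation between DL embeddings z and morphometric features m."""
    z = (z - z.mean(0)) / (z.std(0) + 1e-6)
    m = (m - m.mean(0)) / (m.std(0) + 1e-6)
    cross = (z.T @ m) / z.shape[0]
    return (cross ** 2).mean()

z = torch.randn(64, 128)   # batch of DL embeddings
m = torch.randn(64, 30)    # batch of rule-based morphometric features
loss = covariance_penalty(z) + morphometric_decorrelation(z, m)
```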
[0332] Method 1600 illustrated in FIG. 16 also may include extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings (operation 1630). Some nonlimiting examples of cell morphometric features include cell position, cell shape, pixel intensity, texture, focus, or any combination thereof. These and other nonlimiting examples of cell morphological features are described in Table 2.
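For illustration only, a few rule-based morphometric features of the kind listed in Table 2 (position, size, shape, pixel intensity, a texture proxy) could be extracted with a conventional computer vision pipeline such as scikit-image. The thresholding choice and feature names below are assumptions, and the actual computer vision model may compute a much larger panel.

```python
import numpy as np
from skimage import filters, measure

def morphometric_features(image: np.ndarray) -> dict:
    """image: 2D grayscale cell image; returns features of the largest object."""
    mask = image > filters.threshold_otsu(image)          # simple segmentation
    labels = measure.label(mask)
    regions = measure.regionprops(labels, intensity_image=image)
    if not regions:
        return {}
    cell = max(regions, key=lambda r: r.area)              # keep largest object
    return {
        "centroid_row": cell.centroid[0],                   # cell position
        "centroid_col": cell.centroid[1],
        "area": cell.area,                                   # cell size
        "eccentricity": cell.eccentricity,                   # cell shape
        "mean_intensity": cell.mean_intensity,               # pixel intensity
        "intensity_std": float(image[labels == cell.label].std()),  # texture proxy
    }

features = morphometric_features(np.random.rand(256, 256))
```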
[0333] Method 1600 illustrated in FIG. 16 also may include generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features (operation 1640).
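A minimal sketch of operation 1640, under the assumption that the combination is a simple standardization and concatenation (the actual combination into morphometric predictive embeddings may differ), is:

```python
import numpy as np

def predictive_embeddings(dl_embeddings: np.ndarray,
                          morphometrics: np.ndarray) -> np.ndarray:
    """dl_embeddings: (n_cells, d1); morphometrics: (n_cells, d2).

    Returns one combined predictive vector per cell.
    """
    def zscore(x):
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-6)
    return np.concatenate([zscore(dl_embeddings), zscore(morphometrics)], axis=1)

combined = predictive_embeddings(np.random.randn(100, 128), np.random.randn(100, 30))
print(combined.shape)  # (100, 158)
```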
[0334] FIG. 17 illustrates an example flow of operations in a method of processing images. Method 1700 illustrated in FIG. 17 includes extracting, using a Deep Learning (DL) model (e.g., DL encoder 950) and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model (operation 1710). As one example, any suitable component(s) may include a processor and a non-transitory computer readable medium storing the machine learning model and related encoder, and instructions for causing the processor to perform operations. For example, the processor can be included with a cloud-based computing environment and/or within a microfluidics platform (e.g., platform 20, 310). Nonlimiting examples of machine learning encoders (such as deep learning encoders, for example, convolutional neural networks) are provided elsewhere herein. In some examples, cells in the one or more cell images are unstained. The one or more cell images are brightfield cell images in some examples, though it will be appreciated that other types of cell images readily can be provided to the machine learning encoder and computer vision encoder.
[0335] Method 1700 illustrated in FIG. 17 also may include generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other (operation 1720).
Non-Limiting Working Examples
[0336] The following examples are intended to be purely illustrative, and not limiting of the present subject matter.
Example 1 — Evaluation of the human foundation model using different cell lines.
[0337] Cancer cell lines A375 and Caov-3, and immune cell line Jurkat, were used to evaluate the classification performance of the human foundation model. Polystyrene beads with a size of 6 micrometers (µm) were used as a control. The cell lines and polystyrene beads were imaged using the microfluidics platform (e.g., REM-I platform) as described herein and combined in silico to evaluate the performance of the human foundation model. The human foundation model processed the images of the cell lines and polystyrene beads, and extracted deep learning and morphometric features. These features were standardized and projected into a lower dimensional principal components analysis (PCA) basis. Nearest neighbors were computed in the PCA space, then used to compute 2D Uniform Manifold Approximation and Projection (UMAP) embeddings. Table 1 (above) is a panel of deep learning-derived features generated using the DL model of the human foundation model and used in the present examples. Table 2 (above) is a panel of morphometric features generated using the computer vision model of the human foundation model and used in the present examples. It should be noted that the number and types of features listed in Tables 1 and 2 are provided only as examples, without limiting the scope of the present disclosure. More and/or different features can be included. This example illustrates features generated and extracted using the human foundation model and its computer vision model, including features that can be referred to as blobs or granules.
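The analysis pipeline described in this example (standardize, project into a PCA basis, compute nearest neighbors in PCA space, then embed with 2D UMAP) can be approximated with scikit-learn and umap-learn; the feature matrix, component count, and neighbor settings below are illustrative assumptions rather than the parameters actually used.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
import umap

features = np.random.randn(500, 158)                  # per-cell DL + morphometric features
scaled = StandardScaler().fit_transform(features)     # standardize each feature
pcs = PCA(n_components=50).fit_transform(scaled)      # lower-dimensional PCA basis
_, neighbor_idx = NearestNeighbors(n_neighbors=15).fit(pcs).kneighbors(pcs)
embedding_2d = umap.UMAP(n_neighbors=15, random_state=0).fit_transform(pcs)
print(embedding_2d.shape)                             # (500, 2) morphometric map
```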
Example 2 — System for Cell Morphology Analysis.
[0338] Table 3 in this example lists parameters and specifications of an example system such as described with reference to FIG. 2A (e.g., the REM-I system) for cell morphology analysis, as well as image analysis generally, cell sorting, and other operations described herein. Table 4 lists example components of one example system. An asterisk ("*") in Table 3 denotes that the specification depends on sample characteristics and/or sorting configurations.
Table 3. Example system specifications.
Facilities:
- Instrument dimensions: H: 29.5 in / 75 cm; W: 35.5 in / 90 cm; D: 28.5 in / 75 cm
- Included ancillary equipment: computer tower, monitor, keyboard, mouse
- Electrical: 3 x 100-240 V surge-protected outlets
- Network connection: 1 Gbps ethernet with >150 Mbps upload bandwidth
- Clean dry air connection: 0.55-0.72 MPa / 80-105 psi
- Temperature operation ranges: 59-86 °F / 15-30 °C
- Operating relative humidity (%): 15-70, non-condensing
Instrument:
- Input cell size (µm): *
- Output cell viability (% of input viability): >95%
- Output collection: 6x positive outlet wells, 1x negative outlet tube, 1x waste bottle
- Positive outlet capacity (cells/well): up to 3,000
- Recommended run time (min): live cells, up to 180; fixed cells, up to 600
- Image resolution (µm/pixel): 3.16
- Image size (pixels): 256 x 256
- Imaging throughput (events/s): up to 1,000
- Sorting throughput (cells/s): up to 30
Table 4. Components of the REM-I system.
- REM-I Instrument: Microfluidic instrument
- REM-I Imaging Kit: Reagents and consumables for the imaging workflow
- REM-I Sorting Kit: Reagents and consumables for the imaging plus sorting workflow
- Human Foundation Model (HFM): Artificial intelligence (AI) model for high-dimensional single-cell morphology analysis
- Data Suite: Data suite for visualizing, analyzing, and storing data
[0339] While certain examples of the present subject matter have been shown and described herein, it will be obvious to those skilled in the art that such examples are purely illustrative. It is not intended that the invention be limited by the specific examples provided within the specification. While nonlimiting examples of the invention have been described with reference to the aforementioned specification, the descriptions and illustrations of the examples herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations or relative proportions set forth herein, which depend upon a variety of conditions and variables. It should be understood that various alternatives to the examples described herein can be employed in practicing the invention. It is therefore contemplated that the claims shall also cover any such alternatives, modifications, variations, or equivalents.
[0340] The disclosure further comprises the following clauses, which correspond to the appended
Dutch-language claims:
CLAUSES
1. A method of processing, comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
2. The method of Clause 1, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features.
3. The method of Clause 1, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or combinations thereof.
4. The method of Clause 1, further comprising:
predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient greater than approximately 0.9. 5. The method of Clause 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient ranging between approximately 0.85 and approximately 0.95. 6. The method of Clause 1, wherein the set of cell morphometric features comprises a plurality of blob features. 7. The method of Clause 1, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting. 8. The method of Clause 1, wherein the plurality of DL embeddings comprises cell morphology information independent from a fixed set of rule-based morphometric features. 9. The method of Clause 1, wherein the plurality of DL embeddings comprises cell morphology data orthogonal to a fixed set of rule-based morphometric features.
10. The method of Clause 1, further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions. 11. The method of Clause 1, wherein the DL model is a self-supervised machine learning (SSL) system, the method further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model. 12. The method of Clause 1, wherein the cell image comprises a label free image.
13. The method of Clause 1, further comprising: hosting the DL model and the computer vision model in a cloud computing environment. 14. The method of Clause 1, wherein the method is performed in a cloud computing environment. 15. The method of Clause 1, further comprising: generating an instruction to sort a cell of the cell image based on the plurality of morphometric predictive embeddings.
16. The method of Clause 1, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and feeding data from the sorting back to the DL model in order to train the DL model for future generating of the plurality of DL embeddings.
17. A method for assessing image data, comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; and generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other.
18. The method of Clause 17, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or combinations thereof.
19. The method of Clause 17, further comprising: generating, using the DL model, a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
20. The method of Clause 19, wherein the generating the plurality of morphometric predictive embeddings comprises:
generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features.
21. The method of Clause 19, further comprising:
predicting, using the DL model and at least the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation ranging between approximately 0.85 and approximately 0.95.
22. The method of Clause 17, wherein the plurality of DL embeddings comprise cell morphology information orthogonal to the set of morphometric features, and wherein the set of morphometric features are determined using a fixed set of rules.
23. The method of Clause 17, further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
24. The method of Clause 17, wherein the DL model is a self-supervised machine learning (SSL) system, the method comprising: de-correlating, using the DL model, the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
25. The method of Clause 17, wherein the image data comprises a label free image of each cell of the plurality of cells.
26. The method of Clause 17, further comprising: hosting the DL model and the computer vision model in a cloud computing environment. 27. The method of Clause 17, further comprising: generating an instruction to sort the plurality of cells using the plurality of DL embeddings. 28. The method of Clause 17, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and feeding data from the sorting to the DL model in order to train the DL model for future generating of the plurality of DL embeddings. 29. A system for analyzing image data, the system comprising: at least one processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features. 30. The system of Clause 29, wherein the set of cell morphometric features comprises a plurality of blob features. 31. The system of Clause 29, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting. 32. The system of Clause 29, wherein the plurality of DL embeddings comprise cell morphology data orthogonal to a fixed set of rule based morphometric features.
33. The system of Clause 29, the operations further comprising:
separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
34. The system of Clause 29, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising:
de-correlating the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
35. The system of Clause 29, wherein the at least one processor is in a cloud computing environment.
36. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations for analyzing image data of a cell image, the operations comprising:
extracting, using a trained Deep Learning (DL) model and from image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted by a trained computer vision model; generating, using the trained DL model and the ML-based features, a plurality of DL embeddings orthogonal to each other; and extracting, using the trained computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings.
37. A cloud-based computing system, the system comprising: at least one cloud-based processor to execute the instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image;
generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
38. The system of Clause 37, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features.
39. The system of Clause 37, wherein the DL model is trained using a loss function comprising one or more of invariance, variance, covariance, and morphometric decorrelation terms.
40. The system of Clause 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient greater than approximately 0.9.
41. The system of Clause 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient ranging between approximately 0.85 and approximately 0.95.
42. The system of Clause 37, the operations further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
43. The system of Clause 37, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model. 44. A method for cell sorting, comprising: transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the one or more images, the cell morphometric features being orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of morphometric predictive embeddings. 45. A method for cell sorting, comprising: transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of
DL embeddings.
46. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and enabling sorting of the cell using one of the plurality of morphometric predictive embeddings, the plurality of DL embeddings, and the set of cell morphometric features.
47. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and enabling sorting of the cell using one of the plurality of DL embeddings and the set of cell morphometric features.

Claims (47)

CONCLUSIESCONCLUSIONS 1. Een werkwijze voor verwerking, omvattende: het extraheren, met behulp van een deep-learning (DL )-model, een set op machine learning (ML) gebaseerde kenmerken uit een celbeeld; het genereren, met behulp van het DL-model en de set op ML gebaseerde kenmerken, van een aantal DL-inbeddingen die orthogonaal ten opzichte van elkaar staan; het extraheren, met behulp van een computervisiemodel, van een set van celmorfometrische kenmerken uit het celbeeld, waarbij de celmorfometrische kenmerken orthogonaal staan op het aantal DL-inbeddingen, en het genereren van een aantal morfometrische voorspellende inbeddingen met behulp van de set op ML gebaseerde kenmerken en de set van celmorfometrische kenmerken.1. A method of processing comprising: extracting, using a deep learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features. 2. De werkwijze volgens conclusie 1, waarbij het genereren van het aantal morfometrische voorspellende inbeddingen omvat: het genereren, met behulp van het computervisiemodel en de set van celmorfometrische kenmerken, van een aantal morfometrische inbeddingen, en het samenvoegen, met behulp van het DL-model, van het aantal morfometrische inbeddingen en het aantal DL-inbeddingen tot een aantal voorspellende multidimensionale vectoren omvattende voorspellende waarschijnlijkheden morfometrische kenmerken.2. The method of claim 1, wherein generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and the set of cell morphometric features, a plurality of morphometric embeddings, and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features. 3. De werkwijze volgens conclusie 1, waarbij het DL-model wordt getraind met behulp van een verliesfunctie omvattende invariantie-, variantie-, covariantie-, morfometrische decorrelatie-termen of combinaties daarvan.3. The method of claim 1, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms or combinations thereof. 4. De werkwijze volgens conclusie 1, verder omvattende: het voorspellen, met behulp van het DL-model en het aantal morfometrische voorspellende inbeddingen, van een aantal morfometrische blob-kenmerken met behulp van een correlatiecoëfficiënt groter dan ongeveer 0,9.4. The method of claim 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient greater than about 0.9. 5. De werkwijze volgens conclusie 1, verder omvattende: het voorspellen, met behulp van het DL-model en het aantal morfometrische voorspellende inbeddingen, van een aantal morfometrische blob-kenmerken met behulp van een correlatiecoëfficiënt tussen ongeveer 0,85 en ongeveer 0,95.5. 
The method of claim 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient between about 0.85 and about 0.95. 6. De werkwijze volgens conclusie 1, waarbij de set van celmorfometrische kenmerken een aantal blob-kenmerken omvat.6. The method of claim 1, wherein the set of cell morphometric features comprises a plurality of blob features. 7. De werkwijze volgens conclusie 1, waarbij het genereren van het aantal morfometrische voorspellende inbeddingen in een high-throughput omgeving gebeurt.7. The method of claim 1, wherein generating the plurality of morphometric predictive embeddings occurs in a high-throughput environment. 8. De werkwijze volgens conclusie 1, waarbij het aantal DL-inbeddingen celmorfologische informatie omvat die onafhankelijk is van een vaste set op regels gebaseerde morfometrische kenmerken.8. The method of claim 1, wherein the plurality of DL embeddings comprise cell morphological information that is independent of a fixed set of rule-based morphometric features. 9. De werkwijze volgens conclusie 1, waarbij het aantal DL-inbeddingen celmorfologische gegevens omvat die orthogonaal staan op een vaste set op regels gebaseerde morfometrische kenmerken.9. The method of claim 1, wherein the plurality of DL embeddings comprise cell morphological data orthogonal to a fixed set of rule-based morphometric features. 10. De werkwijze volgens conclusie 1, verder omvattende: het scheiden, met behulp van het DL-model, van een aantal gedecorreleerde DL-inbeddingen en het aantal morfometrische voorspellende inbeddingen in verschillende dimensies.10. The method of claim 1, further comprising: separating, using the DL model, a plurality of decorrelated DL embeddings and the plurality of morphometric predictive embeddings in different dimensions. 11. De werkwijze volgens conclusie 1, waarbij het DL-model een zelf-gesuperviseerd machine learning (SSL) systeem is, waarbij de werkwijze verder omvattende: het de-correleren, met behulp van het DL-model, van de set van morfometrische kenmerken van de DL-inbeddingen, zodat het DL-model wordt getraind om informatie te verkrijgen die niet wordt verkregen met behulp van het computervisiemodel.11. The method of claim 1, wherein the DL model is a self-supervised machine learning (SSL) system, the method further comprising: de-correlating, using the DL model, the set of morphometric features of the DL embeddings, such that the DL model is trained to obtain information not obtained using the computer vision model. 12. De werkwijze volgens conclusie 1, waarbij het celbeeld een label-vrije afbeelding omvat.12. The method of claim 1, wherein the cell image comprises a label-free image. 13. De werkwijze volgens conclusie 1, verder omvattende:13. The method of claim 1, further comprising: het hosten van het DL-model en het computervisiemodel in een cloudcomputing-omgeving.hosting the DL model and the computer vision model in a cloud computing environment. 14. De werkwijze volgens conclusie 1, waarbij de werkwijze wordt uitgevoerd in een cloudcomputing-omgeving.14. The method of claim 1, wherein the method is performed in a cloud computing environment. 15. De werkwijze volgens conclusie 1, verder omvattende: het genereren van een instructie om een cel ait het celbeeld te sorteren op basis van het aantal morfometrische voorspellende inbeddingen.15. 
The method of claim 1, further comprising: generating an instruction to sort a cell from the cell image based on the number of morphometric predictive embeddings. 16. De werkwijze volgens conclusie 1, verder omvattende: Het sorteren, met behulp van het aantal morfometrische voorspellende inbeddingen, van een cel uit het celbeeld; en het terugvoeren van de gegevens van het sorteren aan het DL-model om het DL-model te trainen voor het genereren van het aantal DL-inbeddingen in de toekomst.16. The method of claim 1, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell from the cell image; and feeding the sorting data back to the DL model to train the DL model to generate the plurality of DL embeddings in the future. 17. Een werkwijze voor het beoordelen van beeldgegevens, omvattende: het extraheren, met behulp van een Deep Learning (DL )-model en beeldgegevens van een aantal cellen, van een vector voor een cel van het aantal cellen, waarbij de vector een set op machine learning (ML) gebaseerde kenmerken en een set van morfometrische kenmerken van cellen omvat die zijn geëxtraheerd met behulp van een computervisiemodel; en het genereren, met behulp van het DL-model en met behulp van de set op ML gebaseerde kenmerken, van een aantal DL-inbeddingen die orthogonaal ten opzichte van elkaar staan.17. A method for evaluating image data, comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine learning (ML)-based features and a set of morphometric features of cells extracted using a computer vision model; and generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other. 18. De werkwijze volgens conclusie 17, waarbij het DL-model wordt getraind met behulp van een verliesfunctie omvattende invariantie-, variantie-, covariantie-, morfometrische decorrelatie-termen of combinaties daarvan.18. The method of claim 17, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms or combinations thereof. 19. De werkwijze volgens conclusie 17, verder omvattende: het genereren, met behulp van het DL-model, van een aantal morfometrische voorspellende inbeddingen met behulp van de set op ML gebaseerde kenmerken en de set van celmorfometrische kenmerken.19. The method of claim 17, further comprising: generating, using the DL model, a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features. 20. De werkwijze volgens conclusie 19, waarbij het genereren van het aantal morfometrische voorspellende inbeddingen omvat: het genereren, met behulp van het computervisiemodel en ten minste de set de celmorfometrische kenmerken, van een aantal morfometrische inbeddingen; en het samenvoegen, met behulp van het DL-model, van het aantal morfometrische inbeddingen en het aantal DL-inbeddingen tot een aantal voorspellende multidimensionale vectoren die voorspellende waarschijnlijkheden morfometrische kenmerken omvatten.20. 
The method of claim 19, wherein generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features. 21. De werkwijze volgens conclusie 19, verder omvattende: het voorspellen, met behulp van het DL-model en ten minste het aantal morfometrische voorspellende inbeddingen, van een aantal morfometrische blob-kenmerken met behulp van een correlatie in het bereik van ongeveer 0,85 tot en met ongeveer 0,95.21. The method of claim 19, further comprising: predicting, using the DL model and at least the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation in the range of about 0.85 to about 0.95. 22. De werkwijze volgens conclusie 17, waarbij het aantal DL-inbeddingen celmorfologische informatie die loodrecht staat op de set van morfometrische kenmerken omvat, en waarbij de set van morfometrische kenmerken wordt vastgesteld met behulp van een vaste set regels.22. The method of claim 17, wherein the plurality of DL embeddings comprise cell morphological information orthogonal to the set of morphometric features, and wherein the set of morphometric features is determined using a fixed set of rules. 23. De werkwijze volgens conclusie 17, verder omvattende: het scheiden, met behulp van het DL-model, van een aantal gedecorreleerde DL-inbeddingen en het aantal morfometrische voorspellende inbeddingen in verschillende dimensies.23. The method of claim 17, further comprising: separating, using the DL model, a plurality of decorrelated DL embeddings and the plurality of morphometric predictive embeddings in different dimensions. 24. De werkwijze volgens conclusie 17, waarbij het DL-model een zelf-gesuperviseerd machine learning (SSL) systeem is, waarbij de werkwijze omvat: het de-correleren, met behulp van het DL-model, van de set van morfometrische kenmerken van het aantal DL-inbeddingen, zodat het DL-model wordt getraind om informatie te verkrijgen die niet wordt verkregen met behulp van het computervisiemodel.24. The method of claim 17, wherein the DL model is a self-supervised machine learning (SSL) system, the method comprising: de-correlating, using the DL model, the set of morphometric features of the plurality of DL embeddings, such that the DL model is trained to obtain information not obtained using the computer vision model. 25. De werkwijze volgens conclusie 17, waarbij de beeldgegevens een label-vrije afbeelding van elke cel uit het aantal cellen omvatten.25. The method of claim 17, wherein the image data comprises a label-free image of each cell from the plurality of cells. 26. De werkwijze volgens conclusie 17, verder omvattende: het hosten van het DL-model en het computervisiemodel in een cloudcomputing-omgeving.26. The method of claim 17, further comprising: hosting the DL model and the computer vision model in a cloud computing environment. 27. De werkwijze volgens conclusie 17, verder omvattende: het genereren van een instructie om het aantal cellen te sorteren met behulp van het aantal DL-inbeddingen.27. The method of claim 17, further comprising: generating an instruction to sort the plurality of cells using the plurality of DL embeddings. 28. 
De werkwijze volgens conclusie 17, verder omvattende: Het sorteren, met behulp van het aantal morfometrische voorspellende inbeddingen, van een IO cel uit het celbeeld; en het terugvoeren van de gegevens van het sorteren aan het DL-model om het DL-model te trainen voor het genereren van het aantal DL-inbeddingen in de toekomst.28. The method of claim 17, further comprising: sorting, using the plurality of morphometric predictive embeddings, an IO cell from the cell image; and feeding the sorting data back to the DL model to train the DL model to generate the plurality of DL embeddings in the future. 29. Een systeem voor het analyseren van beeldgegevens, het systeem omvattende: ten minste één processor om instructies uit te voeren voor het uitvoeren van bewerkingen omvattende: het extraheren, met behulp van een Deep Learning (DL )-model, van een set op machine learning (ML) gebaseerde kenmerken uit een celbeeld; het genereren, met behulp van het DL-model en met behulp van de op ML gebaseerde kenmerken, van een aantal DL-inbeddingen die orthogonaal ten opzichte van elkaar staan; het extraheren, met behulp van een computervisiemodel, van een set van morfometrische kenmerken uit het celbeeld die orthogonaal staan op het aantal DL-inbeddingen, en het genereren van een aantal morfometrische voorspellende inbeddingen met behulp van de set op ML gebaseerde kenmerken en de set van celmorfometrische kenmerken.29. A system for analyzing image data, the system comprising: at least one processor to execute instructions for performing operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the ML-based features, a plurality of DL embeddings that are orthogonal to each other; extracting, using a computer vision model, a set of morphometric features from the cell image that are orthogonal to the plurality of DL embeddings, and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features. 30. Het systeem volgens conclusie 29, waarbij de set van celmorfometrische kenmerken een aantal blob-kenmerken omvat.30. The system of claim 29, wherein the set of cell morphometric features comprises a plurality of blob features. 31. Het systeem volgens conclusie 29, waarbij het genereren van het aantal morfometrische voorspellende inbeddingen in een high-throughput omgeving gebeurt.31. The system of claim 29, wherein generating the plurality of morphometric predictive embeddings occurs in a high-throughput environment. 32. Het systeem volgens conclusie 29, waarbij het aantal DL-inbeddingen celmorfologische gegevens die orthogonaal staan op een vaste set op regels gebaseerde morfometrische kenmerken omvat.32. The system of claim 29, wherein the plurality of DL embeddings comprise cell morphological data orthogonal to a fixed set of rule-based morphometric features. 33. Het systeem volgens conclusie 29, de bewerkingen verder omvattende: het scheiden, met behulp van het DL-model, van een aantal gedecorreleerde DL-inbeddingen en het aantal morfometrische voorspellende inbeddingen in verschillende dimensies.33. The system of claim 29, further comprising the operations of: separating, using the DL model, a plurality of decorrelated DL embeddings and the plurality of morphometric predictive embeddings in different dimensions. 34. 
Het systeem volgens conclusie 29, waarbij het DL-model een zelf-gesuperviseerd machine learning (SSL) systeem is, waarbij de bewerkingen verder omvat: het de-correleren van de set van morfometrische kenmerken van het aantal DL-inbeddingen, zodat het DL-model wordt getraind om informatie te verkrijgen die niet wordt verkregen met behulp van het computervisiemodel. 34. The system of claim 29, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating the set of morphometric features of the plurality of DL embeddings, such that the DL model is trained to obtain information not obtained using the computer vision model. 35, Het systeem volgens conclusie 29, waarbij de ten minste één processor zich in een cloudcomputing-omgeving bevindt.35. The system of claim 29, wherein the at least one processor is in a cloud computing environment. 36. Een door de computer leesbaar medium van niet-voorbijgaande aard dat instructies opslaat die, wanneer ze door de processor worden uitgevoerd, doen ontstaan dat de processor bewerkingen uitvoert voor het analyseren van beeldgegevens van een celbeeld, de bewerkingen omvattende: het extraheren, met behulp van een getraind Deep Learning (DL )-model en uit beeldgegevens een aantal cellen, een vector voor een cel uit het aantal cellen, de vector omvattende een set op machine learning (ML) gebaseerde kenmerken en een set van morfometrische kenmerken van cellen die zijn geëxtraheerd door een getraind computervisiemodel; het genereren, met behulp van het getrainde DL-model en de op ML gebaseerde kenmerken, een aantal DL-inbeddingen die orthogonaal ten opzichte van elkaar staan; en het extraheren, met behulp van het getrainde computervisiemodel, van een set van celmorfometrische kenmerken uit het celbeeld die orthogonaal staan op het aantal DL-inbeddingen.36. A non-transient computer-readable medium storing instructions that, when executed by the processor, cause the processor to perform operations for analyzing image data of a cell image, the operations comprising: extracting, using a trained Deep Learning (DL) model and from image data a plurality of cells, a vector for a cell from the plurality of cells, the vector comprising a set of machine learning (ML)-based features and a set of cell morphometric features extracted by a trained computer vision model; generating, using the trained DL model and the ML-based features, a plurality of DL embeddings that are orthogonal to each other; and extracting, using the trained computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings. 37. Een cloudgebaseerde computersysteem, het systeem omvattende:37. 
A cloud-based computer system, the system comprising: ten minste één cloudgebaseerde processor om de instructies uit te voeren voor het uitvoeren van bewerkingen, omvattende: het extraheren, met behulp van een Deep Learning (DL )-model, van een set op machine learning (ML) gebaseerde kenmerken uit een celbeeld; het genereren, met behulp van het DL-model en met behulp van de set op ML gebaseerde kenmerken, van een aantal DL-inbeddingen die orthogonaal ten opzichte van elkaar staan; het extraheren, met behulp van een computervisiemodel, van een set van morfometrische kenmerken van de cel uit het celbeeld, waarbij de morfometrische kenmerken van de cel orthogonaal staan op het aantal DL-inbeddingen, en het genereren van een aantal morfometrische voorspellende inbeddingen met behulp van de set op ML gebaseerde kenmerken en de set van celmorfometrische kenmerken.at least one cloud-based processor to execute the instructions for performing operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other; extracting, using a computer vision model, a set of morphometric features of the cell from the cell image, the morphometric features of the cell being orthogonal to the plurality of DL embeddings, and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features. 38. Het systeem volgens conclusie 37, waarbij het genereren van het aantal morfometrische voorspellende inbeddingen omvat: het genereren, met behulp van het computervisiemodel en ten minste de set celmorfometrische kenmerken, van een aantal morfometrische inbeddingen, en het samenvoegen, met behulp van het DL-model, van het aantal morfometrische inbeddingen en het aantal DL-inbeddingen tot een aantal voorspellende multidimensionale vectoren die voorspellende waarschijnlijkheden morfometrische kenmerken omvatten.38. The system of claim 37, wherein generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of cell morphometric features, a plurality of morphometric embeddings, and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings into a plurality of predictive multidimensional vectors comprising predictive probabilities of morphometric features. 39. Het systeem volgens conclusie 37, waarbij het DL-model wordt getraind met behulp van een verliesfunctie omvattende invariantie-, variantie-, covariantie-, morfometrische decorrelatie-termen of combinaties daarvan.39. The system of claim 37, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms or combinations thereof. 40. Het systeem volgens conclusie 37, de bewerkingen verder omvattende: het voorspellen, met behulp van het DL-model en het aantal morfometrische voorspellende inbeddingen, van een aantal morfometrische blob-kenmerken met behulp van een correlatiecoëfficiënt groter dan ongeveer 0,9.40. The system of claim 37, further comprising the operations of: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using a correlation coefficient greater than about 0.9. 41. 
41. The system of claim 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features with a correlation coefficient between about 0.85 and about 0.95.

42. The system of claim 37, the operations further comprising: separating, using the DL model, a plurality of decorrelated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.

43. The system of claim 37, wherein the DL model is a self-supervised machine learning (SSL) system, the operations comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings, such that the DL model is trained to obtain information that is not obtained using the computer vision model.

44. A method of sorting cells, comprising: transporting a cell suspended in a fluid through a flow channel, the flow channel being in fluid communication with a plurality of subchannels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the one or more images, the cell morphometric features being orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and classifying the cell into a selected subchannel of the plurality of subchannels using the plurality of morphometric predictive embeddings.

45. A method of sorting cells, comprising: transporting a cell suspended in a fluid through a flow channel, the flow channel being in fluid communication with a plurality of subchannels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other; and classifying the cell into a selected subchannel of the plurality of subchannels using the plurality of DL embeddings.

46. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions for performing operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the ML-based features, a plurality of DL embeddings that are orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and enabling the cell to be sorted using one of the plurality of morphometric predictive embeddings, the plurality of DL embeddings, and the set of cell morphometric features.

47. A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions for performing operations comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other; and enabling the cell to be sorted using one of the plurality of DL embeddings and the set of cell morphometric features.
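The sketches below illustrate, in order, the correlation check of claim 41, the decorrelation of claims 42 and 43, the subchannel classification of claims 44 and 45, and the per-cell feature vector of claims 46 and 47. All are minimal, hedged examples: the variable names, shapes, models, and data are assumptions made for illustration and are not taken from the disclosure. For claim 41, a linear predictor fit by least squares on synthetic data shows how a Pearson correlation between predicted and measured morphometric blob features can be computed; the claimed range of about 0.85 to about 0.95 would be assessed on real measurements, not on this synthetic example.

```python
# Minimal sketch (synthetic data, illustrative only): predict morphometric blob
# features from predictive embeddings with least squares, then report the
# Pearson correlation between predicted and measured values per feature.
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_embed, n_blob = 1000, 16, 4

embeddings = rng.normal(size=(n_cells, n_embed))           # predictive embeddings
weights = rng.normal(size=(n_embed, n_blob))
blob_features = embeddings @ weights + 2.0 * rng.normal(size=(n_cells, n_blob))

# Fit a linear predictor on a training split; evaluate on a held-out split.
train, test = slice(0, 800), slice(800, None)
coef, *_ = np.linalg.lstsq(embeddings[train], blob_features[train], rcond=None)
predicted = embeddings[test] @ coef

for j in range(n_blob):
    r = np.corrcoef(predicted[:, j], blob_features[test, j])[0, 1]
    print(f"blob feature {j}: r = {r:.2f}")
```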
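For claims 42 and 43, one simple way to express decorrelation of the DL embeddings from the morphometric features is a linear residualization; the disclosed self-supervised model may instead decorrelate within its training objective, so this is only a sketch under that assumption.

```python
# Minimal sketch (assumptions throughout): remove the linear component of the
# DL features that is explained by the morphometric features, so the remaining
# embedding dimensions carry information the computer vision model does not.
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_dl, n_morpho = 500, 64, 12

dl_features = rng.normal(size=(n_cells, n_dl))     # ML-based features per cell
morpho = rng.normal(size=(n_cells, n_morpho))      # morphometric features per cell

# Center both sets so the projection acts on correlations rather than means.
dl_c = dl_features - dl_features.mean(axis=0)
mo_c = morpho - morpho.mean(axis=0)

# Least-squares fit of DL features on morphometric features; keep the residuals.
coef, *_ = np.linalg.lstsq(mo_c, dl_c, rcond=None)
decorrelated = dl_c - mo_c @ coef                  # orthogonal to the morpho columns

# Sanity check: the residuals are numerically orthogonal to every morpho column.
print(np.abs(mo_c.T @ decorrelated).max())
```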
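For claims 44 and 45, the classification of a cell into a subchannel from its embedding vector could, for example, be a nearest-centroid rule; the claims do not fix a particular classifier, and the centroid values below are hypothetical.

```python
# Minimal sketch (illustrative values only): route a cell to a subchannel by a
# nearest-centroid rule over its embedding vector.
import numpy as np

def choose_subchannel(embedding: np.ndarray, centroids: np.ndarray) -> int:
    """Return the index of the subchannel whose class centroid is closest."""
    distances = np.linalg.norm(centroids - embedding, axis=1)
    return int(np.argmin(distances))

# Example: three subchannels, eight-dimensional predictive embeddings.
centroids = np.array([[0.0] * 8, [1.0] * 8, [-1.0] * 8])
cell_embedding = np.full(8, 0.9)
print(choose_subchannel(cell_embedding, centroids))   # prints 1
```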
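For claims 46 and 47, the per-cell vector combines the ML-based features with the computer-vision morphometric features; the stand-in linear projection below plays the role of a learned mapping to morphometric predictive embeddings and is an assumption made for illustration.

```python
# Minimal sketch (placeholder shapes and names): build the per-cell vector of
# ML-based and morphometric features, then map it to a small set of
# "predictive embedding" dimensions with a stand-in linear projection.
import numpy as np

def cell_vector(ml_features: np.ndarray, morpho_features: np.ndarray) -> np.ndarray:
    """Concatenate ML-based and morphometric features into one cell vector."""
    return np.concatenate([ml_features, morpho_features])

rng = np.random.default_rng(1)
ml = rng.normal(size=64)                       # placeholder ML-based features
morpho = rng.normal(size=12)                   # placeholder morphometric features
vec = cell_vector(ml, morpho)                  # 76-dimensional cell vector

projection = rng.normal(size=(vec.size, 8))    # stand-in for a learned projection
predictive_embedding = vec @ projection        # eight predictive dimensions
print(predictive_embedding.shape)              # (8,)
```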
NL2036275A (priority date 2023-10-30, filing date 2023-11-15), Systems and methods for morphometric analysis, published as NL2036275B1 (en)

Priority Applications (1)

Application: PCT/US2024/053235, published as WO2025096338A1 (en); priority date 2023-10-30; filing date 2024-10-28; title: Systems and methods for morphometric analysis

Applications Claiming Priority (1)

Application: US202363546385P; priority date 2023-10-30; filing date 2023-10-30

Publications (1)

Publication: NL2036275B1; publication date 2025-05-13

Family

ID=95745077

Family Applications (1)

Application: NL2036275A, published as NL2036275B1 (en); priority date 2023-10-30; filing date 2023-11-15; title: Systems and methods for morphometric analysis

Country Status (1)

Country: NL; publication: NL2036275B1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220180975A1 (en) * 2019-01-28 2022-06-09 The Broad Institute, Inc. Methods and systems for determining gene expression profiles and cell identities from multi-omic imaging data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Blondel et al., "Fast unfolding of communities in large networks", Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, 2008, pages 10008
Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale", International Conference on Learning Representations, 2021, pages 21, retrieved from the Internet: arxiv.org/abs/2010.11929
Gerasimiuk et al., "MURAL: An Unsupervised Random Forest-Based Embedding for Electronic Health Record Data", arXiv (Cornell University Library), 19 November 2021, XP091101069 *
Traag et al., "From Louvain to Leiden: guaranteeing well-connected communities", Scientific Reports, vol. 9, no. 5233, 2019, pages 12

Similar Documents

Publication Title
EP4022500B1 (en) Multiple instance learner for tissue image classification
US20240153289A1 (en) Systems and methods for cell analysis
Walter et al. Artificial intelligence in hematological diagnostics: Game changer or gadget?
Sommer et al. A deep learning and novelty detection framework for rapid phenotyping in high-content screening
CN113454733B (en) Multi-instance learner for prognostic tissue pattern recognition
Van Valen et al. Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments
Amat et al. Fast, accurate reconstruction of cell lineages from large-scale fluorescence microscopy data
Kraus et al. Computer vision for high content screening
Bakker et al. Morphologically constrained and data informed cell segmentation of budding yeast
CN117178302A (en) Systems and methods for cell analysis
Ota et al. Implementing machine learning methods for imaging flow cytometry
US20240362454A1 (en) Systems and methods of machine learning-based physical sample classification with sample variation control
US20250191680A1 (en) Analyzing cell phenotypes
US20250356959A1 (en) Toxicity Prediction Of Compounds In Cellular Structures
US20250218201A1 (en) Analyzing phenotypes of cells
Eulenberg et al. Deep learning for imaging flow cytometry: cell cycle analysis of Jurkat cells
NL2036275B1 (en) Systems and methods for morphometric analysis
US20240362462A1 (en) Systems and methods of machine learning-based sample classifiers for physical samples
Mapstone et al. Machine learning approaches for image classification in developmental biology and clinical embryology
US20250356483A1 (en) Quality Control Of In-Vitro Analysis Sample Output
Fishman et al. Segmenting nuclei in brightfield images with neural networks
WO2024238130A2 (en) Systems and methods for cell morphology analysis
WO2025122842A1 (en) Analyzing cell phenotypes
WO2025096338A1 (en) Systems and methods for morphometric analysis
Verdier et al. A maximum mean discrepancy approach reveals subtle changes in α-synuclein dynamics