Disclosure of Invention
In view of this, the invention provides a method for organizing source tracing information of multi-source remote sensing image metadata and a management system, which are used for solving the problem that information recorded in the remote sensing image metadata cannot meet complex source tracing requirements.
In a first aspect, the invention discloses a method for organizing metadata traceability information of multi-source remote sensing images, which comprises the following steps:
obtaining multi-source remote sensing image traceability information in different scenarios; abstracting the multi-source remote sensing image traceability information into four types of elements, namely events, entities, relationships and attributes, and establishing a conceptual model of the multi-source remote sensing image traceability information in graph form;
and carrying out dimension induction on the remote sensing image metadata model UMM, and embedding the traceability information into the remote sensing image metadata model UMM based on the image source, the processing process and the relationship among the images of the traceability information in the conceptual model to obtain a metadata organization model with enhanced traceability expression.
On the basis of the above technical solution, preferably, the method for obtaining the multi-source remote sensing image traceability information based on the original remote sensing image data includes:
modeling from top to bottom according to the interlayer level relation of the remote sensing images, and creating traceability information in batches;
automatically capturing algorithm, input/output and execution time information used by a remote sensing data processing tool, and recording a traceability information fragment by using PROV-O;
manually inputting the tracing information of the remote sensing data by a user through an interactive interface;
and mining traceability relationships according to the semantic relations of the traceability graph, and automatically completing the traceability information.
On the basis of the above technical solution, preferably, abstracting the multi-source remote sensing image traceability information into four types of elements of events, entities, relationships, and attributes specifically includes:
abstracting the processing process of the remote sensing image into event elements, wherein the event elements have attribute information including a start time and an end time;
abstracting an image data set, a single remote sensing image, a processing algorithm and related individuals/mechanisms involved in the processing process of the remote sensing image into entity elements;
the relationship elements comprise relationships between entities and relationships between entities and events;
the attribute element is semantic information included in the event element, the entity element, or the relationship element.
On the basis of the above technical solution, preferably, the relationship between the entity and the entity includes an inclusion relationship between the image data set and a single remote sensing image, a derivation or substitution relationship between the image data set and the image data set, a derivation or substitution relationship between a single remote sensing image and a single remote sensing image, and an attribution relationship between the image data set or the single remote sensing image and the individual/organization;
the variable dimension information in the derivative relationship is expressed by a coding scheme, and the coding order represents the spatial range, the band information, the resolution and the data type from left to right.
On the basis of the above technical solution, preferably, the performing dimensionality induction on the remote sensing image metadata model UMM, based on the image source, the processing process and the inter-image relationship of the traceability information in the conceptual model, embedding the traceability information into the remote sensing image metadata model UMM, and obtaining the metadata organization model with enhanced traceability expression specifically includes:
the metadata information of the remote sensing image metadata model UMM is summarized into eight dimensions: identification dimension, time dimension, space dimension, traceability dimension, platform dimension, data dimension, quality dimension and authority dimension;
analyzing the image source, the processing process and the relationship information between the images of the traceability information in the conceptual model, embedding the traceability information into the traceability dimension of the remote sensing image metadata model UMM, and obtaining the metadata organization model with enhanced traceability expression.
On the basis of the above technical solution, preferably, the method further includes:
establishing a mapping framework between the conceptual model and the PROV model, expressing the traceability information of the conceptual model as RDF data by using PROV-O, and sharing the traceability information of the remote sensing image in the Web environment based on the RDF data.
On the basis of the above technical solution, preferably, the establishing a mapping framework of the conceptual model and the PROV model specifically includes:
mapping event elements in the conceptual model to activities of the PROV model;
mapping entity elements in the conceptual model into entities or agents of the PROV model; wherein the image data set and the single remote sensing image are mapped into an entity of the PROV model, and the individuals/institutions and the software are mapped into an agent of the PROV model;
and mapping the relation elements in the conceptual model into seven relations in the PROV model.
In a second aspect, the invention discloses a multi-source remote sensing image metadata traceability information management system based on the method of the first aspect of the invention, the system comprising:
a traceability information import module, used for importing traceability information and supporting the import of traceability information fragments in PROV-O format;
a metadata storage module, used for organizing and storing the traceability information and the metadata according to the metadata organization model;
a traceability information query module, used for traceability tracking of remote sensing images and for metadata search and retrieval based on the metadata organization model;
a traceability information visualization module, used for graph-based visual display of the traceability information.
Compared with the prior art, the invention has the following beneficial effects:
1) The invention expresses the event, entity, relationship and attribute information of the remote sensing image derivation process in graph form, constructs a conceptual model for the traceability information, embeds the traceability model into a metadata model, and designs a metadata organization model with enhanced traceability information, thereby enriching the metadata content;
2) The conceptual model for the traceability information records the steps performed from the data source through processing, the algorithms used, the software environment, the activity executors, and the changes in the derived data produced by processing. It expresses the changed dimension information in a derivation relationship through an encoding scheme, supports traceability tracking in all directions, and improves the availability and reliability of remote sensing image traceability;
3) The invention provides a method for mapping the traceability conceptual model to the PROV traceability model, extends the W3C PROV model, improves the interoperability of remote sensing image traceability information on the Web, and facilitates data sharing.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments of the present invention, belong to the protection scope of the present invention.
Referring to fig. 1, the present invention provides a method for organizing metadata traceability information of a multi-source remote sensing image, where the method includes:
S1, obtaining multi-source remote sensing image traceability information in different scenarios.
Remote sensing image traceability information is the complete record of every process in the life cycle of a remote sensing image, from its creation to its retirement, and includes the data source, producer information, and the processing steps and processing algorithms used during data production. The invention provides four ways of acquiring remote sensing image traceability information:
1) Top-down modeling: modeling from top to bottom according to the hierarchical relations between remote sensing image levels, and creating traceability information in batches; for example, in fig. 2, the image data L2 in the data set Landsat Collection 2 is derived from the image data L1.
2) Automatic capture by the remote sensing data processing tool: when a remote sensing image is processed with a remote sensing data processing software tool, information such as the algorithm used, the inputs and outputs, and the execution time is captured automatically, and a traceability information fragment is recorded by using PROV-O.
3) Manual input: the user manually enters the traceability information of the remote sensing data through an interactive interface.
4) Traceability relationship mining: traceability relationships are mined according to the semantic relations of the traceability graph, and the traceability information is completed automatically.
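The automatic capture in mode 2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool described in the invention: the decorator `capture_provenance`, the in-memory list `PROVENANCE_LOG` and the placeholder function `atmospheric_correction` are hypothetical names, and the captured fragment is kept as a plain dictionary rather than being serialized with PROV-O.

```python
import functools
from datetime import datetime, timezone

# Hypothetical in-memory log; a real tool would persist each fragment.
PROVENANCE_LOG = []


def capture_provenance(algorithm_name):
    """Wrap a processing function so that each call records the algorithm,
    its inputs and output, and its start and end times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*inputs, **params):
            started = datetime.now(timezone.utc)
            output = func(*inputs, **params)
            ended = datetime.now(timezone.utc)
            PROVENANCE_LOG.append({
                "algorithm": algorithm_name,
                "inputs": list(inputs),
                "parameters": dict(params),
                "output": output,
                "started_at": started.isoformat(),
                "ended_at": ended.isoformat(),
            })
            return output
        return wrapper
    return decorator


@capture_provenance("atmospheric_correction")
def atmospheric_correction(granule_id):
    # Placeholder for real processing: it only derives a new identifier.
    return granule_id + "_corrected"


result = atmospheric_correction("Granule2")
```

Because the capture happens in a wrapper, the processing function itself stays unchanged, so the same wrapper could be applied to any processing step.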
S2, abstracting the multi-source remote sensing image traceability information into four types of elements, namely events, entities, relationships and attributes, and establishing a conceptual model of the multi-source remote sensing image traceability information in graph form.
The method constructs the conceptual model of the traceability information in graph form and abstracts the traceability information into four types of elements: events, entities, relationships and attributes. The processing process of the remote sensing image is abstracted into event elements, which have attribute information including a start time and an end time; the image data set (Collection), the single remote sensing image (Granule), the processing algorithm and the related individuals/organizations involved in the processing process of the remote sensing image are abstracted into entity elements; the relationship elements describe the relationships between entities and the relationships between entities and events; the attribute elements describe the semantic information contained in the event elements, entity elements or relationship elements.
Fig. 2 is a conceptual model diagram of remote sensing image traceability information, illustrated with Landsat data. It records the steps, algorithms, software environments and activity executors involved from the data source through processing, together with the changes produced by the processing, such as changes in spatial range, resolution and band information. In fig. 2, the image data sets Collection 1 L1 and Collection 2 L1 obtained from the data source Landsat are in a substitution relationship, and the image data set Collection 2 L1 derives the image data set Collection 2 L2. The single image Granule 2 in the image data set Collection 2 L1 undergoes atmospheric correction to produce the single image Granule 3, and Granule 3 undergoes water body extraction to produce the single image Granule 4. Together with the various attribution, association and derivation relationships, these elements form a complete conceptual model of remote sensing image traceability information.
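The four element types and the Landsat example can be sketched as a small in-memory graph. This is an illustrative sketch only: the class names, the relationship labels (`substitutes`, `derived_from`, `contains`, `used`, `generated_by`), the direction chosen for the substitution relationship and the timestamps are assumptions, not part of the described model.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    kind: str  # "collection", "granule", "algorithm" or "agent"
    attributes: dict = field(default_factory=dict)


@dataclass
class Event:
    id: str
    start_time: str  # placeholder ISO 8601 strings in this sketch
    end_time: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    kind: str
    source: str
    target: str
    attributes: dict = field(default_factory=dict)


entities = [
    Entity("Collection1_L1", "collection"),
    Entity("Collection2_L1", "collection"),
    Entity("Collection2_L2", "collection"),
    Entity("Granule2", "granule"),
    Entity("Granule3", "granule"),
    Entity("Granule4", "granule"),
]

events = [
    # Timestamps are invented placeholders.
    Event("atmospheric_correction", "2020-01-01T00:00:00Z", "2020-01-01T00:05:00Z"),
    Event("water_extraction", "2020-01-01T00:10:00Z", "2020-01-01T00:12:00Z"),
]

relations = [
    # Direction of the substitution is an assumption of this sketch.
    Relation("substitutes", "Collection2_L1", "Collection1_L1"),
    Relation("derived_from", "Collection2_L2", "Collection2_L1"),
    Relation("contains", "Collection2_L1", "Granule2"),
    Relation("used", "atmospheric_correction", "Granule2"),
    Relation("generated_by", "Granule3", "atmospheric_correction"),
    Relation("derived_from", "Granule3", "Granule2"),
    Relation("used", "water_extraction", "Granule3"),
    Relation("generated_by", "Granule4", "water_extraction"),
    # Attribute element attached to a relationship element.
    Relation("derived_from", "Granule4", "Granule3", {"change_code": "00010000"}),
]


def targets(kind, source):
    """Return the targets of all relations of one kind leaving a node."""
    return [r.target for r in relations if r.kind == kind and r.source == source]
```

For instance, `targets("derived_from", "Granule4")` returns the direct source image of Granule 4.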
Table 1 lists the concepts of the four types of elements of events, entities, relationships and attributes and the elements that they contain.
Table 1 traceability information conceptual model element table
As can be seen from table 1, the relationship between the entities includes an inclusion relationship between the image dataset and a single remote-sensing image, a derivation or substitution relationship between the image dataset and the image dataset, a derivation or substitution relationship between a single remote-sensing image and a single remote-sensing image, and an attribution relationship between the image dataset or a single remote-sensing image and a person/organization. The derivative relationship between the images can identify the variation dimensionality of the images through the attribute of the variation code, for example, the variation of the spatial dimensionality (cropping operation and the like), the variation of the spatial resolution dimensionality (resampling and the like), and the variation of the image wave band dimensionality (calculation of the normalized vegetation index and the like).
The invention expresses the changed dimension information in a derivation relationship through an encoding scheme: each dimension occupies a two-digit field, which is encoded as '01' if the dimension has changed and as '00' if it has not. Fig. 3 illustrates a specific encoding, in which the fields represent, from left to right, the spatial range, band information, resolution, data type, and so on; the encoding can be extended according to actual needs.
Taking the image processing in fig. 2 as an example, a water body extraction operation is performed between image Granule 3 and image Granule 4. The operation calculates a water index (NDWI), which changes the band information of the image, so the metadata change information is encoded as "00010000", where the fields represent, from left to right, the spatial range, band information, resolution and data type.
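The encoding can be illustrated with a short sketch. It assumes exactly the four dimensions named in the example above, in that order, with one two-digit field per dimension; the function and dimension names are hypothetical.

```python
# Field order from left to right, as in the example above.
DIMENSIONS = ("spatial_range", "band_information", "resolution", "data_type")


def encode_changes(changed):
    """Encode the set of changed dimensions: '01' if changed, '00' if not."""
    unknown = set(changed) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return "".join("01" if name in changed else "00" for name in DIMENSIONS)


def decode_changes(code):
    """Return the names of the dimensions marked as changed in a code."""
    if len(code) != 2 * len(DIMENSIONS):
        raise ValueError("code length does not match the number of dimensions")
    fields = [code[i:i + 2] for i in range(0, len(code), 2)]
    return [name for name, value in zip(DIMENSIONS, fields) if value == "01"]


# Water body extraction changes only the band information.
water_extraction_code = encode_changes({"band_information"})
```

Extending the scheme to further dimensions would amount to appending names to `DIMENSIONS`, which lengthens every code by two digits per added dimension.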
S3, establishing a mapping framework between the conceptual model and the PROV model, expressing the traceability information of the conceptual model as RDF data by using PROV-O, and sharing the traceability information of the remote sensing image in the Web environment based on the RDF data.
The W3C PROV model defines three core classes, namely Entity, Agent and Activity, and seven relations that hold within and between these classes, such as the derivation relation between output and input data, the delegation relation between an individual and an organization, and the usage or generation relation between processing and data.
The method shares traceability information in a distributed environment by establishing a mapping framework between the traceability information conceptual model and the W3C PROV traceability model. Specifically, event elements in the conceptual model are mapped to activities (Activity) of the PROV model; entity elements in the conceptual model are mapped to entities or agents of the PROV model, wherein the image data set (Collection) and the single remote sensing image (Granule) are mapped to entities (Entity) of the PROV model, and individuals/organizations and software are mapped to agents (Agent) of the PROV model; the relationship elements in the conceptual model are mapped to the seven relations of the PROV model. Two entities may be regarded as alternates of each other if they differ only slightly or not at all.
Fig. 4 shows the mapping framework constructed by the invention. Elements of different shapes represent the three core classes of the PROV model, namely Entity, Activity and Agent, and the relationships between the elements are mapped accordingly. For example, an ellipse represents an Entity in the PROV model, which may be an image data set (Collection), a single remote sensing image (Granule), an algorithm (Algorithm), and the like; a rectangle represents an Activity in the PROV model, which may be the processing of an image; a hexagon represents an Agent in the PROV model, which may be the organization or individual to which the remote sensing image data source belongs, the executor of a processing process, and the like.
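The mapping framework can be sketched as two lookup tables. The class mapping follows the description above; the relation mapping is an assumption of this sketch, since the text states only that relationship elements map to the seven PROV relations without pairing them. The conceptual relationship labels on the left are hypothetical, and the inclusion and substitution relationships are not covered here.

```python
# Conceptual element kind -> PROV class, following the description above.
PROV_CLASS_BY_KIND = {
    "event": "prov:Activity",
    "collection": "prov:Entity",
    "granule": "prov:Entity",
    "algorithm": "prov:Entity",
    "individual": "prov:Agent",
    "organization": "prov:Agent",
    "software": "prov:Agent",
}

# Assumed pairing of hypothetical relationship labels with the seven
# PROV-O starting-point properties.
PROV_PROPERTY_BY_RELATION = {
    "derived_from": "prov:wasDerivedFrom",
    "generated_by": "prov:wasGeneratedBy",
    "used": "prov:used",
    "informed_by": "prov:wasInformedBy",
    "attributed_to": "prov:wasAttributedTo",
    "associated_with": "prov:wasAssociatedWith",
    "acted_on_behalf_of": "prov:actedOnBehalfOf",
}


def to_prov_class(kind):
    """Map a conceptual element kind to its PROV class."""
    return PROV_CLASS_BY_KIND[kind]


def to_prov_statement(relation_kind, source, target):
    """Map one conceptual relationship to a (subject, property, object) triple."""
    return (source, PROV_PROPERTY_BY_RELATION[relation_kind], target)


statement = to_prov_statement("derived_from", "Granule4", "Granule3")
```

Keeping the mapping in tables rather than in code branches means a different pairing can be swapped in without touching the conversion functions.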
The traceability information is expressed as RDF (Resource Description Framework) data by using the PROV Ontology (PROV-O) recommended by W3C, which gives the traceability information a knowledge representation. Traceability information expressed on the basis of the PROV model improves the interoperability of remote sensing image traceability information in the Web environment and makes the information easy to share.
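As an illustration of the RDF expression, the following sketch writes the water body extraction step as N-Triples using PROV-O terms. The namespace `http://example.org/rs/`, the resource names and the timestamps are invented for illustration; the PROV-O, RDF and XSD IRIs are the standard ones. The triples are built by hand with the standard library only, which is an assumption of this sketch rather than the approach of the invention.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"
BASE = "http://example.org/rs/"  # hypothetical namespace


def iri(value):
    return "<" + value + ">"


def typed_literal(value, datatype):
    return '"' + value + '"^^' + iri(datatype)


def triple(subject, predicate, obj):
    return f"{subject} {predicate} {obj} ."


def water_extraction_fragment():
    """Return an N-Triples fragment for Granule 3 -> Granule 4."""
    g3 = iri(BASE + "Granule3")
    g4 = iri(BASE + "Granule4")
    activity = iri(BASE + "water_extraction")
    lines = [
        triple(g3, iri(RDF_TYPE), iri(PROV + "Entity")),
        triple(g4, iri(RDF_TYPE), iri(PROV + "Entity")),
        triple(activity, iri(RDF_TYPE), iri(PROV + "Activity")),
        triple(activity, iri(PROV + "used"), g3),
        triple(g4, iri(PROV + "wasGeneratedBy"), activity),
        triple(g4, iri(PROV + "wasDerivedFrom"), g3),
        # Placeholder timestamps, not taken from any real processing run.
        triple(activity, iri(PROV + "startedAtTime"),
               typed_literal("2020-01-01T00:10:00Z", XSD_DATETIME)),
        triple(activity, iri(PROV + "endedAtTime"),
               typed_literal("2020-01-01T00:12:00Z", XSD_DATETIME)),
    ]
    return "\n".join(lines)


fragment = water_extraction_fragment()
```

In practice an RDF library would normally handle serialization and escaping; the hand-written form is used here only to keep the example self-contained.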
S4, summarizing the metadata information of the remote sensing image metadata model UMM into dimensions, and embedding the traceability information into the remote sensing image metadata model UMM based on the image source, the processing process and the inter-image relationships of the traceability information in the conceptual model, to obtain a metadata organization model with enhanced traceability expression.
UMM (Unified Metadata Model), the remote sensing image metadata model used by NASA, is an extensible metadata model. It mainly comprises seven profiles, namely UMM-C, UMM-G, UMM-S, UMM-Var, UMM-Vis, UMM-T and UMM-Common, and provides a bridge for mapping between the metadata standards supported by the CMR (Common Metadata Repository). The method extends the remote sensing image metadata model UMM, embeds the traceability information into the metadata model, and designs a metadata organization model with enhanced traceability expression.
First, the metadata information of the remote sensing image metadata model UMM is summarized into eight dimensions: the identification dimension, the time dimension, the space dimension, the traceability dimension, the platform dimension, the data dimension, the quality dimension and the authority dimension. The metadata information mainly contained in each of the eight dimensions is shown in table 2.
Table 2 metadata organizational model dimension table
Then, the image source, the processing process and the inter-image relationship information of the traceability information in the conceptual model established in step S2 are analyzed, and the traceability information is embedded into the traceability dimension of the remote sensing image metadata model UMM to obtain the metadata organization model with enhanced traceability expression. Fig. 4 shows an example of the extended metadata organization model, in which the dashed box marks the embedded traceability information. Specifically, the embedded information includes the relationships between images, the relationship types, single-image traceability, image processing procedures, image processing events, image processing data sources, and image set traceability.
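The embedding step can be sketched as a metadata record with one slot per dimension, where only the traceability dimension is populated from the conceptual model. The field names inside each dimension are assumptions of this sketch, since the actual fields belong to table 2 and to the UMM profiles.

```python
# The eight dimensions named above; their internal fields are hypothetical.
METADATA_DIMENSIONS = (
    "identification", "time", "space", "traceability",
    "platform", "data", "quality", "authority",
)


def new_metadata_record(identifier):
    """Create an empty record with one dictionary per dimension."""
    record = {name: {} for name in METADATA_DIMENSIONS}
    record["identification"]["id"] = identifier
    return record


def embed_traceability(record, data_source, processing_events, image_relations):
    """Return a copy of the record with its traceability dimension filled in."""
    enriched = dict(record)
    enriched["traceability"] = {
        "data_source": data_source,
        "processing_events": list(processing_events),
        "image_relations": list(image_relations),
    }
    return enriched


granule4 = embed_traceability(
    new_metadata_record("Granule4"),
    data_source="Landsat",
    processing_events=[{"event": "water_extraction", "algorithm": "NDWI"}],
    image_relations=[{
        "type": "derived_from",
        "target": "Granule3",
        "change_code": "00010000",  # band information changed
    }],
)
```

Because the traceability information lives in its own dimension, the other seven dimensions of an existing record are left untouched by the embedding.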
S5, performing visual traceability tracking and metadata search and retrieval of the remote sensing image.
The metadata information of the remote sensing image metadata model UMM is summarized into multiple dimensions, and the traceability information is embedded into it. This enriches the metadata content and improves the ability of remote sensing image metadata to express traceability information. Through the organization model, traceability tracking of images and metadata search and retrieval can be performed, the image processing process becomes open and transparent, and the quality of remote sensing image products becomes traceable.
On the basis of the above method for organizing the source tracing information of the metadata of the multi-source remote sensing image, the invention also provides a system for managing the source tracing information of the metadata of the multi-source remote sensing image, which comprises:
a traceability information import module, used for importing traceability information and supporting the import of traceability information fragments in PROV-O format;
a metadata storage module, used for organizing and storing the traceability information and the metadata according to the metadata organization model, the specific organization being the same as in steps S1 to S4 of the method embodiment described above;
a traceability information query module, used for traceability tracking of remote sensing images and for metadata search and retrieval based on the metadata organization model;
a traceability information visualization module, used for graph-based visual display of the traceability information.
The system adopts a browser/server (B/S) architecture. The back end is developed in Java on the Spring Boot framework, and the front end is developed with OpenLayers. The system stores remote sensing image metadata in the open-source relational database PostgreSQL, with spatial dimension information stored through PostGIS, the spatial extension of PostgreSQL. The system supports the import of traceability information fragments in PROV-O format, and also provides a remote sensing image traceability information query interface and a graph-style visualization.
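A traceability query over relational storage can be sketched with a recursive common table expression. The sketch uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server; the table and column names are hypothetical, and the `WITH RECURSIVE` form shown is also accepted by PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
conn.executescript("""
CREATE TABLE image (
    id   TEXT PRIMARY KEY,
    kind TEXT NOT NULL
);
CREATE TABLE derivation (
    derived_id  TEXT NOT NULL REFERENCES image(id),
    source_id   TEXT NOT NULL REFERENCES image(id),
    change_code TEXT  -- NULL where no code is known
);
""")
conn.executemany(
    "INSERT INTO image VALUES (?, ?)",
    [("Granule2", "granule"), ("Granule3", "granule"), ("Granule4", "granule")],
)
conn.executemany(
    "INSERT INTO derivation VALUES (?, ?, ?)",
    [
        # No change code is stated for atmospheric correction, so it is NULL.
        ("Granule3", "Granule2", None),
        ("Granule4", "Granule3", "00010000"),
    ],
)


def ancestors(connection, image_id):
    """Return (source image, distance) pairs for the whole upstream chain."""
    rows = connection.execute(
        """
        WITH RECURSIVE lineage(id, depth) AS (
            SELECT source_id, 1 FROM derivation WHERE derived_id = ?
            UNION ALL
            SELECT d.source_id, l.depth + 1
            FROM derivation AS d
            JOIN lineage AS l ON d.derived_id = l.id
        )
        SELECT id, depth FROM lineage ORDER BY depth
        """,
        (image_id,),
    )
    return rows.fetchall()


granule4_lineage = ancestors(conn, "Granule4")
```

Here `granule4_lineage` lists Granule 3 at distance 1 and Granule 2 at distance 2, which corresponds to walking the derivation chain back to the source image.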
The system embodiment is implemented on the basis of the method embodiment and is therefore described only briefly; for details, refer to the method embodiment.
The above-described system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, i.e. may be distributed over a plurality of network units. Without creative labor, a person skilled in the art can select some or all of the modules according to actual needs to achieve the purpose of the solution of the embodiment.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.