
CN117216006A - File content retrieval methods, devices, storage media and electronic equipment


Info

Publication number
CN117216006A
Authority
CN
China
Prior art keywords
file
creating
content
character
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311468353.XA
Other languages
Chinese (zh)
Inventor
王佳新
廖逍
丁学英
张喆
吴刚
张志治
杨洋
王莹
李治
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Information and Telecommunication Group Co Ltd
Original Assignee
State Grid Information and Telecommunication Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Information and Telecommunication Group Co Ltd
Priority to CN202311468353.XA
Publication of CN117216006A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a method, an apparatus, a storage medium and an electronic device for retrieving file content. The method includes: creating, for each of a plurality of preset file types, a content object that points to the text content in files of that type; constructing a file index set corresponding to a plurality of files and, based on the files corresponding to the file index set, constructing a target list according to preset search conditions, the search conditions including key content and a file type; for each file corresponding to the target list, reading the text content of the file using the content object corresponding to that file, and searching whether the key content exists in that text content; and taking each file whose text content contains the key content as a candidate file, forming the file names of the candidate files into a result list and displaying it, such that selecting any file name in the result list displays the text content corresponding to that file name and/or calls up the corresponding file.

Description

File content retrieval method and device, storage medium and electronic equipment
Technical Field
The embodiment of the application relates to the technical field of file retrieval, in particular to a method and a device for retrieving file contents, a storage medium and electronic equipment.
Background
Related retrieval methods only support searching by file name and cannot search file contents directly, which makes file searching inconvenient and inefficient. In addition, related retrieval methods cannot search and display files of multiple types, so no unified search is possible when the files to be searched are of various file types.
Based on this, a solution is needed that can realize the retrieval of text content of a file and can uniformly retrieve different file types.
Disclosure of Invention
In view of the above, an object of the present application is to provide a method, an apparatus, a storage medium, and an electronic device for retrieving file contents.
Based on the above object, the present application provides a method for retrieving file contents, comprising:
creating, based on a plurality of preset file types, a content object corresponding to each file type and pointing to the text content in a file of that type;
constructing a file index set corresponding to a plurality of files, and constructing a target list according to preset search conditions based on the files corresponding to the file index set, wherein the search conditions comprise key content and a file type;
for each file corresponding to the target list, reading the text content of the file by using the content object corresponding to the file, and searching whether the key content exists in the text content of the file;
and taking each file whose text content contains the key content as a candidate file, forming the file names of the candidate files into a result list and displaying the result list, and, upon selection of any file name in the result list, displaying the text content corresponding to the file name and/or calling up the file corresponding to the file name.
Further, the file types include a TXT type, a Word type, an Excel type, a PPT type and a PDF type; the files comprise TXT files, Word files, Excel files, PPT files and PDF files;
creating a content object that points to text content in a file for each file type, comprising:
creating a first character object pointing to characters in a TXT file and creating a first image object pointing to images in the TXT file according to the TXT type, wherein the first character object is used for reading the character content in the TXT file and the first image object is used for reading the image content in the TXT file;
creating a second character object pointing to characters in a Word file and creating a second image object pointing to images in the Word file according to the Word type, wherein the second character object is used for reading the character content in the Word file, and the second image object is used for reading the image content in the Word file;
creating a third character object pointing to characters in an Excel file and creating a third image object pointing to images in the Excel file according to the Excel type, wherein the third character object is used for reading the character content in the Excel file, and the third image object is used for reading the image content in the Excel file;
creating a fourth character object pointing to characters in the PPT file and creating a fourth image object pointing to images in the PPT file according to the PPT type, wherein the fourth character object is used for reading character contents in the PPT file, and the fourth image object is used for reading image contents in the PPT file;
and creating a fifth character object pointing to characters in the PDF file and a fifth image object pointing to an image in the PDF file according to the PDF type, wherein the fifth character object is used for reading the character content in the PDF file, and the fifth image object is used for reading the image content in the PDF file.
Further, creating a second character object pointing to a character in the Word file, and creating a second image object pointing to an image in the Word file, comprising:
creating a primary Word object pointing to a Word program, and creating a secondary Word object pointing to a Word file under the primary Word object, wherein the primary Word object is used for determining the Word file, and the secondary Word object is used for reading the Word file;
creating the second character object and the second image object under the secondary Word object;
creating a third character object pointing to a character in the Excel file, and creating a third image object pointing to an image in the Excel file, comprising:
creating a primary Excel object pointing to an Excel program, and creating a secondary Excel object pointing to an Excel file under the primary Excel object, wherein the primary Excel object is used for determining the Excel file, and the secondary Excel object is used for reading the Excel file;
creating the third character object and the third image object under the secondary Excel object;
creating a fourth character object pointing to a character in the PPT file, and creating a fourth image object pointing to an image in the PPT file, comprising:
creating a primary PPT object pointing to a PPT program, and creating a secondary PPT object pointing to a PPT file under the primary PPT object, wherein the primary PPT object is used for determining the PPT file, and the secondary PPT object is used for reading the PPT file;
creating the fourth character object and the fourth image object under the secondary PPT object;
creating a fifth character object pointing to a character in the PDF file, and creating a fifth image object pointing to an image in the PDF file, comprising:
creating a primary PDF object pointing to a PDF program, and creating a secondary PDF object pointing to a PDF file under the primary PDF object, wherein the primary PDF object is used for determining the PDF file, and the secondary PDF object is used for reading each page in the PDF file;
and creating the fifth character object and the fifth image object under the secondary PDF object.
Further, constructing a file index set corresponding to the plurality of files, including:
scanning all files in a database to be retrieved, and constructing the file name of each file and a corresponding file storage path as an index entry corresponding to the file;
and forming all index entries into a file index set corresponding to all files in the database.
Further, based on the file corresponding to the file index set, constructing a target list according to a preset search condition, including:
determining a file with the same file type as the file type in the retrieval condition from the files corresponding to the file index set;
and forming index entries corresponding to the files with the same file types into the target list.
Further, before retrieving whether the key content exists in the text content of the file, the method further includes:
judging whether each file is encrypted;
in response to determining that any file is encrypted, judging whether a decryption password is available for that file;
in response to determining that the decryption password is available, decrypting the file with the decryption password;
in response to determining that the decryption password is not available, skipping retrieval of the file.
Further, the search condition also includes a file name;
after constructing the file index set corresponding to the plurality of files, the method further comprises:
in response to the search condition including a file name, searching the file index set for a file whose file name is consistent with the file name in the search condition, and taking that file as the candidate file.
Based on the same inventive concept, the application also provides a retrieval device of file content, comprising: an object creation module, a target list construction module, a retrieval module and a display module;
the object creation module is configured to create a content object pointing to text content in a file corresponding to each file type based on a plurality of preset file types;
the target list construction module is configured to construct a file index set corresponding to a plurality of files, and construct a target list according to preset retrieval conditions based on the files corresponding to the file index set, wherein the retrieval conditions comprise key contents and file types;
the retrieval module is configured to, for each file corresponding to the target list, read text content of the file by using a content object corresponding to the file, and retrieve whether the key content exists in the text content of the file;
the display module is configured to take each file whose text content contains the key content as a candidate file, form the file names of the candidate files into a result list and display the result list, and, upon selection of any file name in the result list, display the text content corresponding to the file name and/or call up the file corresponding to the file name.
Based on the same inventive concept, the application also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the file content searching method according to any one of the above when executing the program.
Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium, wherein the non-transitory computer readable storage medium stores computer instructions for causing the computer to perform a method for retrieving file contents as described above.
As can be seen from the above description, the method, the device, the storage medium and the electronic equipment for retrieving file contents provided by the application create a respective content object for each file type of the files to be retrieved, so that text content can be read from files of different types and retrieval across file types is realized. Retrieval is performed against the constructed file index set, so the files need not be searched one by one, which improves efficiency. The search conditions also pre-screen the files to be retrieved by file type, which further improves the efficiency of searching for key content. Finally, when candidate files are displayed, the created content objects can read the text content from the candidate files, so that the text content can be displayed.
Drawings
In order to more clearly illustrate the technical solutions of the present application or related art, the drawings that are required to be used in the description of the embodiments or related art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a first flowchart of a method for retrieving file contents according to an embodiment of the present application;
FIG. 2 is a second flowchart of a method for retrieving file contents according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a retrieving device for file contents according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent.
It should be noted that, unless otherwise defined, technical or scientific terms used in the embodiments of the present application have the ordinary meaning understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected" or "coupled," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
As described in the background section, related file content retrieval methods have difficulty meeting the needs of actual retrieval work.
In the process of implementing the present application, the applicant found that the main problems of related file content retrieval methods are as follows: they only support searching by file name and cannot search file contents directly, which makes file searching inconvenient and inefficient; in addition, they cannot search and display files of multiple types, so no unified search is possible when the files to be searched are of various file types.
Based on this, one or more embodiments of the present application provide a method of retrieving file contents.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1, a method for retrieving file contents according to an embodiment of the present application includes the steps of:
Step S101, creating a content object pointing to text content in a file corresponding to each file type based on a plurality of preset file types.
In the embodiment of the application, the plurality of files to be retrieved can be distinguished according to a plurality of preset file types, and a different content object can be created for the files of each file type, wherein each content object points to the text content of files of the corresponding file type.
Specifically, a plurality of file types may be preset, including: TXT type, Word type, Excel type, PPT type, PDF type, and the like.
Based on this, all files in the database to be retrieved can be classified into TXT files, Word files, Excel files, PPT files, and PDF files according to the file types described above.
Further, for a TXT file, a first content object may be created that points to the text content in its file.
Specifically, in a TXT file, its text content may include characters and images, based on which a first character object and a first image object may be created for the TXT file.
Wherein the first character object points to a character in the text content of the TXT file and the first image object points to an image in the text content of the TXT file.
Specifically, a primary TXT object may be first created that points to a TXT program, with which the TXT file may be determined and read out of a plurality of files of different file types.
Further, the primary TXT object may be utilized as a parent object to create a next child object, i.e., a first character object and a first image object under the primary TXT object, the first character object may be used to read or extract character content in the TXT file, and the first image object may be used to read or extract image content in the TXT file.
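As an illustration of the first character object, the following is a minimal sketch of reading the character content of a TXT file. The function name `read_txt_characters` and the list of candidate encodings are assumptions made for this example; the application does not state how encodings are handled.

```python
def read_txt_characters(path, encodings=("utf-8", "gb18030")):
    """Return the character content of a TXT file.

    The encodings are tried in order (an assumption of this sketch);
    if none decodes cleanly, undecodable bytes are replaced.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep whatever can be decoded as UTF-8.
    return data.decode("utf-8", errors="replace")
```

Reading the raw bytes first and decoding afterwards lets one file be tried against several encodings without reopening it.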
Further, in the Word file, the text content thereof may also include characters and images, on the basis of which a second character object and a second image object with respect to the Word file may be set.
Wherein the second character object points to a character in the text content of the Word file and the second image object points to an image in the text content of the Word file.
Specifically, a primary Word object pointing to a Word program may be first created, with which a Word file may be determined from among a plurality of files of different file types.
Further, the primary Word object may be utilized as a parent object to create a next child object, i.e., a secondary Word object is created under the primary Word object, which may be used to read or extract the determined Word file.
Based on this, the secondary Word object may be utilized as a parent object to create child objects of the secondary Word object, i.e., a second character object that may be used to read or extract character content in the Word file and a second image object that may be used to read or extract image content in the Word file.
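The primary and secondary Word objects described above suggest driving the Word program itself. As a hedged alternative that needs no installed Word program, the sketch below swaps in a different technique: it reads a `.docx` file directly as a ZIP archive of WordprocessingML parts. The function names `read_docx_characters` and `list_docx_images` are hypothetical, and legacy `.doc` files are not covered.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_characters(source) -> str:
    """Return the paragraph text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def list_docx_images(source) -> list[str]:
    """Return the archive paths of the images embedded in a .docx file."""
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
    return sorted(name for name in names if name.startswith("word/media/"))
```

Here `read_docx_characters` plays the role of the second character object and `list_docx_images` that of the second image object; `source` may be a path or a file-like object.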
Further, in an Excel file, the text content thereof may also include characters and images, on the basis of which a third character object and a third image object with respect to the Excel file may be set.
Wherein the third character object points to a character in the text content of the Excel file and the third image object points to an image in the text content of the Excel file.
Specifically, a primary Excel object pointing to an Excel program may be first created, with which an Excel file may be determined from among a plurality of files of different file types.
Further, the primary Excel object may be utilized as a parent object to create a next child object, i.e., a secondary Excel object under the primary Excel object, which may be used to read or extract the determined Excel file, i.e., the Excel workbook.
Based on this, the secondary Excel object may be used as a parent object to create child objects of the secondary Excel object, that is, a third character object and a third image object. These child objects may be used to traverse each cell in the Excel workbook and read or extract the valid cells therein; further, the third character object may read or extract the character content in the valid cells, and the third image object may read or extract the image content in the valid cells.
Further, in the PPT file, the text content thereof may also include characters and images, on the basis of which a fourth character object and a fourth image object with respect to the PPT file may be set.
Wherein the fourth character object points to a character in the text content of the PPT file and the fourth image object points to an image in the text content of the PPT file.
Specifically, a primary PPT object pointing to a PPT program may be first created, with which a PPT file may be determined from among a plurality of files of different file types.
Further, the primary PPT object may be utilized as a parent object to create a next child object, i.e., a secondary PPT object under the primary PPT object, which may be used to read or extract the determined PPT file.
Based on this, the secondary PPT object may be utilized as a parent object to create child objects of the secondary PPT object, i.e., a fourth character object and a fourth image object, which may be used to traverse the respective slides in the PPT file, and further, the fourth character object may read or extract character content in the text box, and the fourth image object may read or extract image content in the PPT file.
Further, in the PDF file, the text content thereof may also include characters and images, on the basis of which a fifth character object and a fifth image object with respect to the PDF file may be set.
Wherein the fifth character object points to a character in the text content of the PDF file and the fifth image object points to an image in the text content of the PDF file.
Specifically, a primary PDF object pointing to a PDF program may be first created with which PDF files may be determined among a plurality of files of different file types.
Further, the primary PDF object may be utilized as a parent object to create a next child object, i.e., a secondary PDF object under the primary PDF object, which may be used to read or extract the determined PDF file.
Based on this, a child level object of the secondary PDF object, that is, a fifth character object and a fifth image object, which can be used to read or extract each page in the PDF file, can be created using the secondary PDF object as a parent level object, and further, the fifth character object can read or extract character content in each page, and the fifth image object can read or extract image content in each page.
Step S102, constructing a file index set corresponding to a plurality of files, and constructing a target list according to preset search conditions based on the files corresponding to the file index set, wherein the search conditions comprise key contents and file types.
In the embodiment of the application, for all files in the database to be searched, a file index set can be established for the files, and a subset is established based on the file index set so as to be capable of searching in a smaller range when the files are searched.
Specifically, all files in the database to be retrieved can be scanned to obtain the file name of each file and the storage path corresponding to the file, and the file name and the corresponding storage path are constructed as the index entry corresponding to the file.
In this way, an index entry containing the file name and the corresponding storage path is formed for each file in the database to be retrieved, and accordingly, all the index entries can be formed into a file index set.
Wherein the file index set corresponds to all files in the database to be retrieved.
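The index construction described above can be sketched as follows, assuming that the "database to be retrieved" is a directory tree on a local file system (the application does not specify the storage backend). The function name `build_file_index` and the representation of an index entry as a `(file_name, storage_path)` tuple are choices made for this example.

```python
import os


def build_file_index(root_dir: str) -> list[tuple[str, str]]:
    """Scan root_dir recursively and return one index entry per file.

    Each index entry is a (file_name, storage_path) tuple.
    """
    index_set = []
    for folder, _subfolders, file_names in os.walk(root_dir):
        for file_name in file_names:
            index_set.append((file_name, os.path.join(folder, file_name)))
    # Sort by storage path so the index order is reproducible.
    index_set.sort(key=lambda entry: entry[1])
    return index_set
```

Because only names and paths are recorded, building the index does not require opening any file.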
In this embodiment, as shown in fig. 2, when performing file search, step S201 may be performed first, and search conditions may be set.
In particular, the search conditions may include a file type and key content; one or more file types can be set, and the key content can be key characters in the form of characters or key pictures in the form of pictures.
Further, when searching is performed based on the obtained file index set, according to the file type set in the search condition, an index entry consistent with the file type in the search condition can be selected from the file index set, and the index entry can be formed into a target list.
It can be seen that, when the file types set in the search condition are the Word type, the Excel type and the PPT type, the files correspondingly determined in the target list are Word files, Excel files and PPT files; that is, step S202 in fig. 2 determines the Word files, Excel files and PPT files.
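A minimal sketch of this type-based screening is given below. It assumes that a file's type is inferred from its file-name extension and that index entries are `(file_name, storage_path)` tuples; the `TYPE_EXTENSIONS` mapping and both function names are hypothetical, since the application does not say how the file type of an index entry is determined.

```python
import os

# Hypothetical mapping from the preset file types to file-name extensions.
TYPE_EXTENSIONS = {
    "TXT": {".txt"},
    "Word": {".doc", ".docx"},
    "Excel": {".xls", ".xlsx"},
    "PPT": {".ppt", ".pptx"},
    "PDF": {".pdf"},
}


def file_type_of(file_name):
    """Return the preset file type of file_name, or None if unsupported."""
    extension = os.path.splitext(file_name)[1].lower()
    for type_name, extensions in TYPE_EXTENSIONS.items():
        if extension in extensions:
            return type_name
    return None


def build_target_list(index_set, wanted_types):
    """Keep only the index entries whose file type is in wanted_types."""
    wanted = set(wanted_types)
    return [entry for entry in index_set if file_type_of(entry[0]) in wanted]
```

For example, calling `build_target_list(index_set, ["Word", "Excel", "PPT"])` mirrors the selection made in step S202.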
Step S103, for each file corresponding to the target list, reads the text content of the file by using the content object corresponding to the file, and searches whether the key content exists in the text content of the file.
In the embodiment of the present application, based on the above-determined target list, the search may be performed in all files corresponding to the target list.
Specifically, based on the created content objects corresponding to the file types, the content objects corresponding to the different file types can be utilized to read out text content from the files, and key content in the search condition is searched in the text content.
In the specific example shown in fig. 2, the second character object may be used to read the character content of each Word file corresponding to the target list, and the second image object may be used to read the image content of each such Word file; the third character object may be used to read the character content of each Excel file corresponding to the target list, and the third image object may be used to read the image content of each such Excel file; and the fourth character object may be used to read the character content of each PPT file corresponding to the target list, and the fourth image object may be used to read the image content of each such PPT file.
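The retrieval step can be sketched as a loop that dispatches each file in the target list to a reader for its type and tests whether the key content occurs in the text returned. In this sketch the `readers` mapping (keyed by lower-case extension) stands in for the character objects, target entries are assumed to be `(file_name, storage_path)` tuples, and only character key content is handled; matching key pictures against image content is outside the scope of this example.

```python
import os
from typing import Callable, Iterable


def search_key_content(
    target_list: Iterable[tuple[str, str]],
    readers: dict[str, Callable[[str], str]],
    key_content: str,
) -> list[str]:
    """Return the names of the files whose text contains key_content."""
    candidates = []
    for file_name, storage_path in target_list:
        extension = os.path.splitext(file_name)[1].lower()
        reader = readers.get(extension)
        if reader is None:
            continue  # no reader registered for this file type
        if key_content in reader(storage_path):
            candidates.append(file_name)
    return candidates
```

Passing the readers in as a mapping keeps the loop independent of how each file type is actually parsed.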
Step S104, taking each file whose text content contains the key content as a candidate file, forming the file names of the candidate files into a result list and displaying the result list, and, upon selection of any file name in the result list, displaying the text content corresponding to the file name and/or calling up the file corresponding to the file name.
In the embodiment of the present application, based on the above, for any file corresponding to the target list, when the text content of the file contains character content and/or image content consistent with the key content, the file can be used as a candidate file, and the candidate file is displayed, so that the user can select the target file from the candidate files.
In the specific example shown in fig. 2, before the files in the target list are retrieved, it may also be determined whether each file in the target list has been encrypted.
Further, if the file is not encrypted, step S203 may be performed to retrieve text content.
Further, if the file is encrypted, step S204 may be executed to determine whether a decryption password has been preset, where the decryption password may be preset in the execution body of the method.
Further, if no decryption password is preset for the encrypted file, it is considered that the text content cannot be retrieved, and step S205 may be executed to abandon the retrieval of that file.
Further, if the decryption password is preset for the encrypted file, step S203 may be continued to be executed to retrieve the text content of the file.
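The branching among steps S203, S204 and S205 can be summarized in a small decision function. This is a sketch of the control flow only: how encryption is detected and how a file is decrypted depend on the file format and are not modeled here, and the function name and return labels are hypothetical.

```python
from typing import Optional


def plan_retrieval(is_encrypted: bool, preset_password: Optional[str]) -> str:
    """Decide how an individual file in the target list is handled."""
    if not is_encrypted:
        return "retrieve"  # step S203: retrieve the text content directly
    if preset_password is None:
        return "abandon"  # step S205: no preset password, skip this file
    return "decrypt_then_retrieve"  # step S204 passed, then step S203
```

Keeping the decision separate from the decryption itself means the same flow applies to every file type.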
Further, for the retrieved candidate files, the respective file names may be formed into a result list, and the result list is presented to the user, that is, step S206 is performed, where the file names are displayed in the result list.
In other embodiments, a file name may be included in the search criteria, and the file name may be used to perform a search in the file index set.
In the specific example shown in fig. 2, if a file name is set in the search condition, the search of the file type may be skipped, and the process of constructing the target list may be skipped, and the set file name may be directly searched in the file index set.
Further, if a consistent file name is retrieved in the file index set, the file corresponding to the file name may be listed as a candidate file and displayed in the result list, that is, step S206 may be performed.
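A file-name search against the index set can be sketched as a dictionary lookup. The sketch assumes index entries are `(file_name, storage_path)` tuples and that "consistent" means an exact match; both function names are hypothetical.

```python
from collections import defaultdict


def build_name_lookup(index_set):
    """Group storage paths by file name; one name may exist in several folders."""
    lookup = defaultdict(list)
    for file_name, storage_path in index_set:
        lookup[file_name].append(storage_path)
    return lookup


def find_by_file_name(lookup, wanted_name):
    """Return the index entries whose file name equals wanted_name exactly."""
    return [(wanted_name, path) for path in lookup.get(wanted_name, [])]
```

Since the lookup touches only the index, no file content is read on this path, which is consistent with skipping the target-list construction.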
Further, in addition to the file names of the candidate files being shown to the user in the result list, a content list may be created, and after the user selects an arbitrary file name from the result list, the text content of the corresponding file is shown in the content list, that is, step S207 is performed to display the text content.
Specifically, the user may complete the selection of the file name by, for example, clicking the left mouse button.
When the text content is displayed, a part, which is consistent with the key content, of the text content can be highlighted.
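Highlighting can be sketched by wrapping each occurrence of the key content in marker strings. The `[[` and `]]` markers and the function name `highlight` are assumptions of this example; a graphical interface would more likely apply a text style to the matched ranges.

```python
def highlight(text: str, key_content: str, start: str = "[[", end: str = "]]") -> str:
    """Wrap every occurrence of key_content in text with start/end markers."""
    if not key_content:
        return text  # nothing to highlight
    return text.replace(key_content, f"{start}{key_content}{end}")
```

For instance, `highlight("grid load and grid loss", "grid")` returns `"[[grid]] load and [[grid]] loss"`.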
Further, in addition to displaying the file names of the candidate files in the result list and displaying the text content to the user, the user may select an arbitrary file name from the result list to open the folder in which the corresponding file is located, that is, step S208 is executed to open the folder.
Specifically, the user may complete the selection of the file name by, for example, clicking a right mouse button.
Further, the user may also read the corresponding file after selecting an arbitrary file name from the result list, and open the file, that is, execute step S209 to open the file.
Specifically, the user may complete the selection of the file name by, for example, double clicking the left mouse button.
Therefore, the method for retrieving file content according to the embodiment of the application creates a respective content object for each file type of the files to be retrieved, so that text content can be read from files of different types and retrieval across file types is realized. Retrieval is performed against the constructed file index set, so the files need not be searched one by one, which improves efficiency. The search conditions also pre-screen the files to be retrieved by file type, which further improves the efficiency of searching for key content. Finally, when candidate files are displayed, the created content objects can read the text content from the candidate files, so that the text content can be displayed.
It should be noted that the method of the embodiments of the present application may be performed by a single device, such as a computer or a server. The method may also be applied in a distributed scenario and completed through the cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method, and the devices interact with each other to complete the method.
It should be noted that the foregoing describes some embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the embodiment of the application also provides a retrieval device of file content, which corresponds to the method of any embodiment.
Referring to fig. 3, the file content retrieving apparatus includes: an object creation module 301, a target list construction module 302, a retrieval module 303, and a presentation module 304;
wherein, the object creating module 301 is configured to create, based on a plurality of preset file types, a content object pointing to text content in a file corresponding to each file type;
the target list construction module 302 is configured to construct a file index set corresponding to a plurality of files, and construct a target list according to preset search conditions based on the files corresponding to the file index set, where the search conditions include key content and file types;
the retrieving module 303 is configured to, for each file corresponding to the target list, read text content of the file by using a content object corresponding to the file, and retrieve whether the key content exists in the text content of the file;
the display module 304 is configured to take the files whose text content contains the key content as candidate files, compose the file names of the candidate files into a result list and display it, and, upon selection of any file name in the result list, display the text content corresponding to that file name and/or call up the file corresponding to that file name.
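As a non-limiting illustration, the cooperation of the four modules may be sketched as follows. All names (`CONTENT_READERS`, `IndexEntry`, `build_target_list`, `retrieve`, `result_list`) are hypothetical. The sketch assumes that a "content object" can be modelled as a reader function selected by file extension and that the key content is matched as a plain substring; only a plain-text reader is included, since readers for Word, Excel, PPT and PDF files would need format-specific libraries.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List


def read_txt(path: Path) -> str:
    """Reader standing in for the content object of the TXT type."""
    return path.read_text(encoding="utf-8", errors="ignore")


# Object creation module: one reader per preset file type (assumed keyed by extension).
CONTENT_READERS: Dict[str, Callable[[Path], str]] = {".txt": read_txt}


@dataclass
class IndexEntry:
    name: str   # file name
    path: Path  # storage path


def build_target_list(index: List[IndexEntry], file_types: List[str]) -> List[IndexEntry]:
    """Target list construction module: keep entries whose type is in the search condition."""
    wanted = {t.lower() for t in file_types}
    return [e for e in index if e.path.suffix.lower() in wanted]


def retrieve(targets: List[IndexEntry], key: str) -> List[IndexEntry]:
    """Retrieval module: read each target through its reader and test for the key content."""
    candidates = []
    for entry in targets:
        reader = CONTENT_READERS.get(entry.path.suffix.lower())
        if reader is None:
            continue  # no content object was created for this file type
        if key in reader(entry.path):
            candidates.append(entry)
    return candidates


def result_list(candidates: List[IndexEntry]) -> List[str]:
    """Display module: the result list consists of the candidate file names."""
    return [e.name for e in candidates]
```

Keeping the readers in a dictionary means that support for a further file type can be added by registering one more reader, without touching the retrieval loop.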
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in the same piece or pieces of software and/or hardware when implementing an embodiment of the present application.
The device of the foregoing embodiment is used to implement the corresponding method for retrieving file content in any of the foregoing embodiments and has the beneficial effects of the corresponding method embodiment, which are not repeated here.
Based on the same inventive concept, the embodiment of the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the method for searching file contents according to any embodiment.
Fig. 4 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, and is used for executing relevant programs to implement the technical solutions provided by the embodiments of the present application.
The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the embodiments of the present application are implemented in software or firmware, the associated program code is stored in the memory 1020 and executed by the processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown in the figure) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary for implementing the embodiments of the present application, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is used to implement the corresponding method for retrieving file content in any of the foregoing embodiments and has the beneficial effects of the corresponding method embodiment, which are not repeated here.
Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium corresponding to the method of any of the above embodiments, where the non-transitory computer readable storage medium stores computer instructions for causing the computer to execute the method of retrieving file content according to any of the above embodiments.
The computer readable media of the present embodiments include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information that can be accessed by a computing device.
The storage medium of the foregoing embodiments stores computer instructions for causing the computer to execute the method for retrieving file content according to any one of the foregoing embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the application, the steps may be implemented in any order and there are many other variations of the different aspects of the embodiments of the application as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring embodiments of the present application, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The embodiments of the application are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalents, improvements and the like, which are within the spirit and principles of the embodiments of the application, are intended to be included within the scope of the application.

Claims (10)

1. A method for retrieving content of a file, comprising:
creating a content object pointing to text content in a file corresponding to each file type based on a plurality of preset file types;
constructing a file index set corresponding to a plurality of files, and constructing a target list according to preset search conditions based on the files corresponding to the file index set, wherein the search conditions comprise key contents and file types;
for each file corresponding to the target list, reading the text content of the file by using the content object corresponding to the file, and searching whether the key content exists in the text content of the file;
and taking the files whose text content contains the key content as candidate files, composing the file names of the candidate files into a result list and displaying it, and, upon selection of any file name in the result list, displaying the text content corresponding to that file name and/or calling up the file corresponding to that file name.
2. The method of claim 1, wherein the file type comprises a TXT type, a Word type, an Excel type, a PPT type, and a PDF type; the files comprise TXT files, Word files, Excel files, PPT files and PDF files;
creating a content object pointing to text content in a file corresponding to each file type, including:
creating a first character object pointing to characters in a TXT file and creating a first image object pointing to images in the TXT file according to the TXT type, wherein the first character object is used for reading the character content in the TXT file and the first image object is used for reading the image content in the TXT file;
creating a second character object pointing to a character in a Word file and creating a second image object pointing to an image in the Word file, wherein the second character object is used for reading character content in the Word file, and the second image object is used for reading image content in the Word file;
creating a third character object pointing to characters in an Excel file and creating a third image object pointing to images in the Excel file, wherein the third character object is used for reading character contents in the Excel file, and the third image object is used for reading image contents in the Excel file;
creating a fourth character object pointing to characters in the PPT file and creating a fourth image object pointing to images in the PPT file according to the PPT type, wherein the fourth character object is used for reading character contents in the PPT file, and the fourth image object is used for reading image contents in the PPT file;
and creating a fifth character object pointing to characters in the PDF file and a fifth image object pointing to an image in the PDF file according to the PDF type, wherein the fifth character object is used for reading the character content in the PDF file, and the fifth image object is used for reading the image content in the PDF file.
3. The method of claim 2, wherein creating a second character object that points to a character in the Word file and creating a second image object that points to an image in the Word file comprises:
creating a primary Word object pointing to a Word program, and creating a secondary Word object pointing to a Word file under the primary Word object, wherein the primary Word object is used for determining the Word file, and the secondary Word object is used for reading the Word file;
creating the second character object and the second image object under the secondary Word object;
the creating a third character object pointing to a character in the Excel file and creating a third image object pointing to an image in the Excel file comprises:
creating a primary Excel object pointing to an Excel program, and creating a secondary Excel object pointing to an Excel file under the primary Excel object, wherein the primary Excel object is used for determining the Excel file, and the secondary Excel object is used for reading the Excel file;
creating the third character object and the third image object under the secondary Excel object;
the creating a fourth character object pointing to a character in the PPT file and creating a fourth image object pointing to an image in the PPT file comprises:
creating a primary PPT object pointing to a PPT program, and creating a secondary PPT object pointing to a PPT file under the primary PPT object, wherein the primary PPT object is used for determining the PPT file, and the secondary PPT object is used for reading the PPT file;
creating the fourth character object and the fourth image object under the secondary PPT object;
the creating a fifth character object pointing to a character in the PDF file and creating a fifth image object pointing to an image in the PDF file comprises:
creating a primary PDF object pointing to a PDF program, and creating a secondary PDF object pointing to a PDF file under the primary PDF object, wherein the primary PDF object is used for determining the PDF file, and the secondary PDF object is used for reading each page in the PDF file;
and creating the fifth character object and the fifth image object under the secondary PDF object.
4. The method of claim 1, wherein building a file index set corresponding to a plurality of files comprises:
scanning all files in a database to be retrieved, and constructing the file name of each file and a corresponding file storage path as an index entry corresponding to the file;
and forming all index entries into a file index set corresponding to all files in the database.
5. The method of claim 4, wherein the constructing the target list based on the files corresponding to the file index set according to the preset search condition includes:
determining a file with the same file type as the file type in the retrieval condition from the files corresponding to the file index set;
and forming index entries corresponding to the files with the same file types into the target list.
6. The method of claim 1, wherein said retrieving whether said key content exists in the text content of said file further comprises:
judging whether each file is encrypted or not;
judging whether a decryption password is available or not in response to determining that any file is encrypted;
decrypting the file with the decryption password in response to determining that the decryption password is available;
and skipping retrieval of the file in response to determining that the decryption password is not available.
7. The method of claim 1, wherein the search condition further comprises a file name;
after the file index set corresponding to the plurality of files is constructed, the method further comprises:
in response to the search condition including a file name, searching the file index set for a file whose file name is consistent with the file name in the search condition, and taking that file as the candidate file.
8. A retrieval device for file contents, comprising: an object creation module, a target list construction module, a retrieval module and a display module;
the object creation module is configured to create a content object pointing to text content in a file corresponding to each file type based on a plurality of preset file types;
the target list construction module is configured to construct a file index set corresponding to a plurality of files, and construct a target list according to preset retrieval conditions based on the files corresponding to the file index set, wherein the retrieval conditions comprise key contents and file types;
the retrieval module is configured to, for each file corresponding to the target list, read text content of the file by using a content object corresponding to the file, and retrieve whether the key content exists in the text content of the file;
the display module is configured to take the files whose text content contains the key content as candidate files, compose the file names of the candidate files into a result list and display it, and, upon selection of any file name in the result list, display the text content corresponding to that file name and/or call up the file corresponding to that file name.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable by the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
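As a non-limiting illustration of claims 4 to 6, the construction of the file index set, the type-based target list, and the handling of encrypted files may be sketched as follows. The sketch assumes that the "database to be retrieved" is a directory tree on a local file system. The names `build_index`, `filter_by_type` and `should_search`, the `is_encrypted` callable and the `passwords` mapping are hypothetical; how encryption is detected depends on the file format and is deliberately left to the caller.

```python
import os
from typing import Callable, Dict, List, Optional, Tuple

IndexEntry = Tuple[str, str]  # (file name, file storage path)


def build_index(root: str) -> List[IndexEntry]:
    """Scan every file under `root` and record its name and storage path."""
    index: List[IndexEntry] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            index.append((name, os.path.join(dirpath, name)))
    return index


def filter_by_type(index: List[IndexEntry], extension: str) -> List[IndexEntry]:
    """Keep the index entries whose file extension equals the requested type."""
    ext = extension.lower()
    return [e for e in index if os.path.splitext(e[0])[1].lower() == ext]


def should_search(
    path: str,
    is_encrypted: Callable[[str], bool],
    passwords: Dict[str, str],
) -> Tuple[bool, Optional[str]]:
    """Decide whether a file is searched, and with which password if any."""
    if not is_encrypted(path):
        return True, None       # unencrypted: search directly
    password = passwords.get(path)
    if password is None:
        return False, None      # encrypted and no password available: skip the file
    return True, password       # encrypted with a password: decrypt, then search
```

Because the index stores only names and paths, it can be built once and reused for any number of searches; file contents are read only for the entries that survive the type filter.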
CN202311468353.XA 2023-11-07 2023-11-07 File content retrieval methods, devices, storage media and electronic equipment Pending CN117216006A (en)


Publications (1)

Publication Number Publication Date
CN117216006A true CN117216006A (en) 2023-12-12




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20231212