
WO2019023953A1 - Video editing method and video editing system based on intelligent terminal - Google Patents


Info

Publication number
WO2019023953A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
portrait
feature
character
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/095540
Other languages
French (fr)
Chinese (zh)
Inventor
覃桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Transsion Communication Co Ltd
Original Assignee
Shenzhen Transsion Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Transsion Communication Co Ltd filed Critical Shenzhen Transsion Communication Co Ltd
Priority to PCT/CN2017/095540 priority Critical patent/WO2019023953A1/en
Publication of WO2019023953A1 publication Critical patent/WO2019023953A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs

Definitions

  • the present invention relates to the field of smart devices, and in particular, to a video editing method and a video editing system based on a smart terminal.
  • the image recognition algorithm distinguishes the back view of the target person from the front view and compares them separately, providing a video editing method that combines with the user's needs and is applicable not only to video programs but also to videos from the user's daily life.
  • it can edit a video that contains a specific character, a video that contains multiple characters, or a video that excludes a certain character, and can provide a video containing the back of the target person, saving time while maintaining accuracy. By letting the user filter the extracted content interactively, accuracy is improved further.
  • an object of the present invention is to provide a video editing method and a video editing system based on a smart terminal, which can produce a dedicated video clip that includes or excludes one or more characters according to the needs of the user, and which is fast, convenient, highly accurate and time-saving.
  • the invention discloses a video editing method based on a smart terminal, comprising the following steps:
  • the step of acquiring a person portrait picture having a portrait element and extracting the person portrait feature of the person portrait element comprises:
  • the body contour feature of the character portrait element is extracted as a first body contour feature
  • the facial portrait feature of the person portrait element is extracted as a first facial portrait feature
  • the step of acquiring the video clip of the character to be clipped that matches the character portrait feature comprises:
  • the video clip, or the remaining video clips in the video to be clipped other than that clip, are spliced
  • the subsequent steps include:
  • between the step of acquiring the video segments of the video to be clipped that match the character portrait feature and the step of splicing those segments (or the remaining segments other than them), the video editing method further includes:
  • the video clip is pushed to the user and filtered by the user to remove irrelevant video clips.
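The claimed steps can be sketched as a minimal, hypothetical pipeline. Every function name and the frame/feature representations below are illustrative stand-ins for the recognition and splicing stages described above, not part of the disclosure.

```python
# Hypothetical sketch of the claimed editing pipeline; the recognition
# stages are represented by placeholder similarity scores per frame.

def extract_portrait_features(portrait_picture):
    """Return (body_contour_feature, facial_feature) for the target person."""
    # Placeholder: a real system would run contour/face extraction here.
    return portrait_picture.get("contour"), portrait_picture.get("face")

def match_frames(frames, contour_feat, face_feat, threshold=0.9):
    """Indices of frames whose back-contour or face similarity matches."""
    matched = []
    for i, frame in enumerate(frames):
        if frame.get("contour_sim", 0.0) >= threshold or \
           frame.get("face_sim", 0.0) >= threshold:
            matched.append(i)
    return matched

def edit_video(frames, portrait_picture, keep_matching=True):
    """Keep (or drop) the frames containing the target person."""
    contour_feat, face_feat = extract_portrait_features(portrait_picture)
    matched = set(match_frames(frames, contour_feat, face_feat))
    return [f for i, f in enumerate(frames)
            if (i in matched) == keep_matching]
```

Setting `keep_matching=False` corresponds to the claimed variant of editing a video that excludes a certain character.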
  • the invention also discloses a video editing system based on a smart terminal, comprising:
  • a video acquisition module, which acquires the video file to be edited and stores it in the smart terminal
  • a portrait feature extraction module, which acquires a person portrait picture having a portrait element and extracts the portrait feature of that element
  • a video segment acquisition module, configured to connect to the video acquisition module and the portrait feature extraction module, and acquire the video segments of the to-be-edited video that contain a character matching the character portrait feature;
  • a video splicing module, connected to the video segment acquisition module, to splice the video clips or the remaining video segments in the video to be clipped other than those clips.
  • the portrait feature extraction module comprises:
  • a picture obtaining unit which acquires a portrait picture of a person having a portrait element and stores it in the smart terminal;
  • a portrait element identification unit connected to the picture acquisition unit to identify a person portrait element in the person portrait picture
  • a portrait feature extraction unit coupled to the portrait element recognition unit, extracting a body contour feature of the person portrait element as a first body contour feature, and extracting a facial portrait feature of the person portrait element as a first facial portrait feature.
  • the video segment obtaining module includes:
  • an element extracting unit, which splits the video to be clipped into frames, obtains each frame, and extracts the to-be-obtained portrait elements in each frame, including a character back element and a character face element;
  • a feature extraction unit connected to the element extraction unit, extracting a body contour feature of the character back image element as a second body shape contour feature, and extracting a face portrait feature of the character face element as a second face portrait feature;
  • a back picture acquisition unit connected to the feature extraction unit, comparing the second body contour feature with the first body contour feature and, when the similarity is greater than or equal to the first similarity threshold, acquiring the picture corresponding to the second body contour feature as a character back picture
  • a front picture acquisition unit connected to the feature extraction unit, comparing the second facial portrait feature with the first facial portrait feature and, when the similarity is greater than or equal to the second similarity threshold, acquiring the picture corresponding to the second facial portrait feature as a character front picture
  • a cutting unit connected to the back picture acquisition unit and the front picture acquisition unit, which cuts the character back pictures and character front pictures from the to-be-edited video to form video segments.
  • the video splicing module comprises:
  • a separating unit separating audio information and video information in the video segment to form an audio portion and a video portion
  • a splicing unit connected to the separating unit, splicing the audio portion and the video portion separately to form a complete audio portion and a complete video portion;
  • a synchronization unit coupled to the splicing unit to synchronize the complete audio portion with the complete video portion.
  • the video editing system further includes:
  • a video clip screening module, which pushes the video clips to the user, who filters them to eliminate irrelevant video fragments.
  • the back of the target person can be identified, so as to obtain a video clip that includes or excludes one or more characters;
  • FIG. 1 is a flow chart showing a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 2 is a flow chart showing a method for extracting a portrait feature of a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 3 is a flow chart showing a method for acquiring a video clip by a video editing method in accordance with a preferred embodiment of the present invention
  • FIG. 4 is a schematic flowchart of a method for splicing the video clips, or the remaining video clips in the video to be clipped other than those clips, in accordance with a preferred embodiment of the present invention
  • FIG. 5 is a schematic flow chart of a video editing method according to another preferred embodiment of the present invention.
  • Figure 6 is a block diagram showing the structure of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a portrait feature extraction module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a video clip acquiring module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a video splicing module of a video editing system in accordance with a preferred embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a system of a video editing system in accordance with another preferred embodiment of the present invention.
  • the mobile terminal can be implemented in various forms.
  • the terminal described in the present invention may include mobile terminals such as a mobile phone, a smart phone, a notebook computer, a PDA (Personal Digital Assistant), a PAD (tablet), a PMP (Portable Multimedia Player) and a navigation device, as well as fixed terminals such as a digital TV and a desktop computer.
  • FIG. 1 is a schematic flowchart of a video editing method based on a smart terminal according to a preferred embodiment of the present invention.
  • the video editing method specifically includes the following steps:
  • S100 Acquire a video file to be edited and store it in the smart terminal.
  • In order to implement the video clip, the video file to be edited must first be obtained.
  • the method for obtaining the video file to be edited includes importing the video in the smart terminal, and importing the video from outside the smart terminal and storing it in the smart terminal.
  • the imported video file to be edited must contain the target person whom the user wants to edit. If the user imports the wrong video, i.e. one that does not include the target person, subsequent image recognition will find no matching video clips, and the user is reminded that no relevant video was obtained and asked to check whether the to-be-clipped video file or the target portrait picture was imported incorrectly.
  • the method of obtaining a portrait picture includes not only importing the picture in the smart terminal, but also importing the picture from outside the smart terminal and storing it in the smart terminal.
  • the imported portrait picture must closely match the user's needs. If the user needs a video clip of the target person's frontal portrait, the provided portrait picture must contain the target person's facial element; if the user needs a video clip containing the target person's back, the provided picture must contain the target person's back.
  • when there is a single target person, the provided portrait picture must contain the target person alone; when the number of target persons is greater than one, the user needs to provide a corresponding portrait picture for each target person, and no person other than the targets may appear in any of the pictures, though a single picture containing only multiple target persons together is acceptable.
  • the characters in the video frames need to be compared against the extracted portrait features to obtain the video segments containing the character corresponding to those features.
  • it is necessary to establish a strategy according to the user's needs, covering the distinction between the target person's frontal picture and back picture, the number of target persons, and the distinction between the target person and other characters.
  • if the video clip the user needs contains only the target person's frontal portrait, only the frontal pictures need to be obtained; if the user needs only pictures of the target person's back, only the back pictures need to be acquired; if both are needed, there should be a logical OR relationship between the pictures containing the frontal portrait and the pictures containing the back of the target person.
  • the relationship between the characters needs to be considered.
  • when the user requires all target persons to appear simultaneously, the relationship between the portrait features of the target persons should be logical AND; when the user requires any of the target persons to appear, the relationship should be logical OR.
  • the logical relationship between the target persons can also be determined by the user's needs, for example two of them in a logical AND relationship, with a third in a logical OR relationship with the pair. Take TV dramas as an example:
  • if the two characters should appear at the same time, the relationship between the two should be logical AND, and only pictures in which both characters appear, whether from the back or the front, are selected.
  • the number of characters extracted in the picture should be consistent with the number of target characters in the picture.
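The logical AND/OR strategy between target persons described above can be sketched as follows; the rule names and the dictionary representation are illustrative assumptions, not part of the disclosure.

```python
def frame_matches(per_character_hits, mode="all"):
    """Combine per-target-character match results for one frame.

    per_character_hits: dict mapping character name -> bool (seen in frame).
    mode: "all" -> every target must appear (logical AND),
          "any" -> any single target suffices (logical OR).
    """
    hits = per_character_hits.values()
    return all(hits) if mode == "all" else any(hits)

def grouped_match(per_character_hits, and_group, or_group):
    """Mixed rule from the example above: the AND-group characters must
    appear together, while any OR-group character may appear on its own."""
    both = all(per_character_hits.get(c, False) for c in and_group)
    either = any(per_character_hits.get(c, False) for c in or_group)
    return both or either
```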
  • the process of obtaining the video clips that match the target person's portrait features is as follows: the video to be clipped is split into frames; each frame is acquired; the portrait elements in each frame are extracted through image transform, image enhancement, image recognition and image segmentation techniques; the portrait features in those elements are extracted by sampling and compared with the portrait features extracted from the target person's picture; if the two match, the frame is a picture containing the target person, and consecutive matching frames form a video segment.
  • after obtaining the video clips that match the target person's portrait features, the clips need to be stitched together in a certain order: chronological order; the order of the number of characters in the picture from fewest to most (or most to fewest), where the change from few to many characters is the change from only the target person to the target person plus others, and vice versa; or the order of the target person's proportion of the video picture from small to large (or large to small). The latter two orders should use chronological order as a tie-breaker. Take the order of the target person's proportion of the video picture as an example:
  • the proportion of the character in the video picture can be calculated by dividing the area of the portrait element in the picture by the area of the picture; this calculation should be done after each frame has been identified.
  • if a picture with a relatively high character proportion lies within a selected video segment, the whole segment is treated as one body and spliced at that rank, regardless of the proportions of the other pictures in the segment.
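As a rough illustration of the proportion-based splicing order, the sketch below treats each segment as one body, ranks segments by their maximum per-frame portrait ratio, and breaks ties chronologically. The segment representation is a hypothetical simplification.

```python
def portrait_ratio(portrait_area, frame_area):
    """Share of the video picture occupied by the target person."""
    return portrait_area / frame_area

def splice_order(segments):
    """Order segments by each segment's maximum portrait ratio
    (descending), breaking ties chronologically, as the text describes.

    Each segment: {"start": seconds, "ratios": [per-frame portrait ratios]}.
    """
    return sorted(segments, key=lambda s: (-max(s["ratios"]), s["start"]))
```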
  • according to the user's needs, the user can thus quickly and accurately obtain a dedicated video that includes or excludes one or more characters, and video segments containing the back of the target person can be identified accurately.
  • the step of acquiring a portrait image of a person with a portrait element and extracting the portrait feature of the portrait element includes:
  • S201 Acquire a portrait picture of a person having a portrait element and store it in the smart terminal.
  • the method of obtaining the portrait picture includes both importing a picture already in the smart terminal and importing one from outside the smart terminal; in either case the picture is stored in the smart terminal.
  • image transforms such as the Fourier transform, the Walsh-Hadamard transform and the discrete Karhunen-Loève transform convert the image from the spatial domain to the frequency domain; image enhancement then strengthens the high-frequency abrupt components of the frequency-domain image, sharpening the image edges.
  • image recognition technology is then needed to identify the portrait elements in the picture through feature extraction, index building and query steps, after which the portrait elements are extracted by image segmentation.
  • the feature extraction here relies on an external portrait database: recognition models for different portrait elements are built by sampling the elements in the database, so as to distinguish them. For example, a recognition model for facial portraits is built by sampling the large number of facial portraits in the database, and that model is used during recognition; a region matching the model is considered a facial portrait element.
  • after the portrait elements are extracted, the portrait features need to be extracted, distinguishing between back portraits and frontal portraits.
  • for a back portrait, the contour features of the character's back should be extracted, including the body contour and the proportions of each body part, forming the first body contour feature;
  • for a frontal portrait, the facial portrait features should be extracted, including facial skin colour, the shape, size and relative positions of the facial features, and recognisable marks on the face such as a black mole at the corner of the mouth; the face and the size of the facial portrait in the picture can also be uniformly sampled.
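A minimal sketch of comparing two uniformly sampled facial feature vectors, assuming both have already been scaled to the same size and normalised to [0, 1]. The similarity measure here (one minus mean absolute difference) is an illustrative choice, not the one specified in the disclosure.

```python
def face_similarity(feat_a, feat_b):
    """Similarity of two uniformly sampled facial feature vectors, in [0, 1].

    Both vectors are assumed to be scaled to the same length beforehand,
    mirroring the scaling-before-comparison step described in the text.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("features must be scaled to the same size first")
    # 1 - mean absolute difference, for component values in [0, 1]
    diff = sum(abs(a - b) for a, b in zip(feat_a, feat_b)) / len(feat_a)
    return 1.0 - diff
```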
  • the step of acquiring the video segments of the video to be edited that contain a character matching the character portrait feature may specifically include:
  • S301 split the video to be clipped, obtain a frame of each frame, and extract a to-be-obtained portrait element in each frame, including a character back element and a character face element.
  • to identify a person's portrait in the video to be clipped, the video must first be split into individual frames and the portrait elements extracted from each frame. The image is first converted from the spatial domain to the frequency domain through image transforms such as the Fourier transform, the Walsh-Hadamard transform and the discrete Karhunen-Loève transform; image enhancement then strengthens the high-frequency abrupt components to sharpen the image edges; image recognition then identifies the portrait elements through feature extraction, indexing and query steps; finally, image segmentation extracts the portrait elements, which include the character back elements and the character face elements.
  • the feature extraction here likewise relies on the external portrait database: recognition models for different portrait elements are built by sampling the elements in the database, e.g. a recognition model for facial portraits is built by sampling the large number of facial portraits in the database, that model is used during recognition, and a region matching the model is considered a facial portrait element.
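The per-frame element extraction of step S301 can be sketched as below; the detector callables are injected placeholders standing in for the image recognition and segmentation stages, which are not implemented here.

```python
def split_and_extract(frames, detect_face, detect_back):
    """For each frame, collect the portrait elements the text names:
    a character face element and a character back element.

    detect_face / detect_back are injected detector callables that return
    the detected element for a frame, or None when nothing is found."""
    elements = []
    for idx, frame in enumerate(frames):
        elements.append({
            "frame": idx,
            "face": detect_face(frame),   # may be None
            "back": detect_back(frame),   # may be None
        })
    return elements
```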
  • the extraction method here should be consistent with the method used to extract the character's first body contour feature; after the character's face part is extracted from a frame of the video to be edited, the face part is sampled, and the facial skin colour, the facial feature shapes, their positional relationships and the recognisable facial marks are extracted.
  • the step of deleting the sunglasses portion includes building a sunglasses model from features such as contour lines and colours obtained by sampling a large sunglasses database;
  • the constructed sunglasses model is then indexed so that matching regions can be located and removed.
  • S303: Compare the second body contour feature with the first body contour feature, and obtain the picture corresponding to the second body contour feature when the similarity is greater than or equal to the first similarity threshold.
  • an index is built from the first body contour feature, the second body contour feature is scaled to the same size as the first, and the scaled second feature is sampled and searched against the index, e.g. checking whether the body contour lines are consistent and whether the proportions of each body part agree. With the first similarity threshold set to 90%, when the degree of coincidence is greater than or equal to this threshold, the person corresponding to the second body contour feature is considered the same person as the one corresponding to the first body contour feature, and the picture corresponding to the second body contour feature is obtained as a character back picture.
  • the first similarity threshold can be adjusted up or down to meet the required recognition accuracy. Since back recognition is difficult and error-prone, a higher similarity threshold improves precision; to avoid missing pictures when the threshold is too high, pictures with a similarity between 85% and 90% can be popped up for the user to decide whether to keep them, reducing omissions.
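The thresholding and borderline-confirmation logic for back-contour matching can be sketched as follows, using the 90% threshold and the 85%-90% user-confirmation band mentioned above. The function and constant names are illustrative.

```python
FIRST_SIMILARITY_THRESHOLD = 0.90  # back-contour threshold from the text
BORDERLINE_LOW = 0.85              # 85%-90% band is pushed to the user

def classify_back_match(similarity):
    """Decide how to treat one back-contour comparison result.

    >= 0.90      -> automatic match
    0.85 .. 0.90 -> pop up the picture and ask the user (reduces omissions)
    <  0.85      -> no match
    """
    if similarity >= FIRST_SIMILARITY_THRESHOLD:
        return "match"
    if similarity >= BORDERLINE_LOW:
        return "ask_user"
    return "no_match"
```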
  • S304: Compare the second facial portrait feature with the first facial portrait feature, and obtain the picture corresponding to the second facial portrait feature when the similarity is greater than or equal to the second similarity threshold.
  • an index is built from the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled second feature is compared against the index, e.g. checking whether the facial skin colour is the same, whether the facial feature shapes and positional relationships are consistent, and whether recognisable facial marks such as a black mole at the corner of the mouth match. When the similarity of the second facial portrait feature to the first is greater than or equal to the second similarity threshold, the person corresponding to the second facial portrait feature is considered the same person as the one corresponding to the first, and the picture corresponding to the second facial portrait feature is obtained as a character front picture.
  • the second similarity threshold can also be adjusted up or down to meet the required recognition accuracy. Since facial portrait features are more numerous and their comparison more accurate, the facial comparison threshold, i.e. the second similarity threshold, is slightly lower than the body contour comparison threshold, i.e. the first similarity threshold; pictures with a similarity between 80% and 85% can be popped up for the user to decide whether to keep them, reducing omissions.
  • the continuous matched frames are treated as a whole and cut from the video to be clipped to form a video segment.
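Treating continuous matched frames as one body can be sketched as grouping consecutive frame indices into inclusive (start, end) segments:

```python
def frames_to_segments(matched_frames):
    """Group consecutive matched frame indices into (start, end) segments,
    end inclusive, mirroring 'continuous frames are cut as one body'."""
    segments = []
    for idx in sorted(matched_frames):
        if segments and idx == segments[-1][1] + 1:
            segments[-1][1] = idx          # extend the current run
        else:
            segments.append([idx, idx])    # start a new run
    return [tuple(s) for s in segments]
```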
  • the step of splicing the video clips or the remaining video clips other than the video clips in the video to be clipped includes:
  • S401 Separating the audio information and the video information in the remaining video segments except the video segment in the video clip or the video to be clipped to form an audio portion and a video portion.
  • the audio information in each clip needs to be separated from the video information (the separated video information contains no audio); the positional relationship between the audio information and the video information needs to be recorded, and the audio and video information are extracted to form an audio part and a video part.
  • the audio parts and the video parts then need to be stitched together in order, forming a complete audio portion composed entirely of audio parts and a complete video portion composed entirely of video parts.
  • finally, the complete audio portion is synchronised with the complete video portion according to the recorded positional relationship between the audio information and the video information, forming the final complete video.
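A toy sketch of the separate-splice-synchronise sequence: each clip is split into audio and video parts while its position is recorded, each track is spliced, and the tracks are paired back up by the recorded positions. The data representation is purely illustrative.

```python
def separate(segments):
    """Split each segment into audio and video parts, recording the
    original position so the tracks can be re-synchronised later."""
    audio_parts, video_parts = [], []
    for pos, seg in enumerate(segments):
        audio_parts.append({"pos": pos, "data": seg["audio"]})
        video_parts.append({"pos": pos, "data": seg["video"]})
    return audio_parts, video_parts

def splice_and_sync(audio_parts, video_parts):
    """Splice the complete audio and video tracks, then pair them back up
    by the recorded positions to form the final synchronised video."""
    audio = [p["data"] for p in sorted(audio_parts, key=lambda p: p["pos"])]
    video = [p["data"] for p in sorted(video_parts, key=lambda p: p["pos"])]
    return list(zip(audio, video))
```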
  • S500 Push the video clip to a user, and the user performs screening to remove irrelevant video clips.
  • by pushing the obtained video clips to the user, the user can screen them and delete the irrelevant clips.
  • a smart terminal-based video editing system 100 in accordance with a preferred embodiment of the present invention specifically includes the following components:
  • In order to implement the video clip, the video acquisition module 11 must first obtain the video file to be edited; the methods of obtaining it include importing a video already in the smart terminal as well as importing a video from outside the smart terminal and storing it in the smart terminal.
  • the imported video file to be edited must contain the target person whom the user wants to edit. If the user imports the wrong video, i.e. one that does not include the target person, subsequent image recognition will find no matching video clips, and the user is reminded that no relevant video was obtained and asked to check whether the to-be-clipped video file or the target portrait picture was imported incorrectly.
  • the portrait feature extraction module 13 is configured to, after the video to be clipped is acquired, acquire a person portrait picture having a portrait element and extract the character portrait feature of that element.
  • the method of obtaining a portrait picture includes not only importing the picture in the smart terminal, but also importing the picture from outside the smart terminal and storing it in the smart terminal.
  • the imported portrait picture must closely match the user's needs. If the user needs a video clip of the target person's frontal portrait, the provided portrait picture must contain the target person's facial element; if the user needs a video clip containing the target person's back, the provided picture must contain the target person's back.
  • when there is a single target person, the provided portrait picture must contain the target person alone; when the number of target persons is greater than one, the user needs to provide a corresponding portrait picture for each target person, and no person other than the targets may appear in any of the pictures, though a single picture containing only multiple target persons together is acceptable.
  • the video segment acquisition module 12 is connected to the video acquisition module 11 and the portrait feature extraction module 13; after the video file to be edited and the portrait picture are obtained and the character portrait feature is extracted, the characters in the video frames need to be compared against the extracted portrait feature to obtain the video clips containing the corresponding character.
  • it is necessary to establish a strategy according to the user's needs, covering the distinction between the target person's frontal picture and back picture, the number of target persons, and the distinction between the target person and other characters.
  • if the video clip the user needs contains only the target person's frontal portrait, only the frontal pictures need to be obtained; if the user needs only pictures of the target person's back, only the back pictures need to be acquired; if both are needed, there should be a logical OR relationship between the pictures containing the frontal portrait and the pictures containing the back of the target person.
  • the relationship between the characters needs to be considered.
  • when the user requires all target persons to appear simultaneously, the relationship between the portrait features of the target persons should be logical AND; when the user requires any of the target persons to appear, the relationship should be logical OR.
  • the logical relationship between the target persons can also be determined by the user's needs, for example two of them in a logical AND relationship.
  • the process of obtaining the video clips that match the target person's portrait features is as follows: the video to be clipped is split into frames; each frame is acquired; the portrait elements in each frame are extracted through image transform, image enhancement, image recognition and image segmentation techniques; the portrait features in those elements are extracted by sampling and compared with the portrait features extracted from the target person's picture; if the two match, the frame is a picture containing the target person, and consecutive matching frames form a video segment.
  • the video splicing module 14 is connected to the video segment obtaining module 12, and after acquiring the video segment of the character to be clipped that matches the character of the target person, the spliced segment needs to be spliced, and the splicing should be in a certain order. It can be in chronological order or in the order of the characters in the picture from less to more or from more to less. The change of characters from small to many is the change from only the target person to other people, from more to less. Conversely, the order of the target characters in the video screen may be in the order of small to large or large to small, and the latter two sequences shall be supplemented by chronological order. For example, the order of the target person's proportion of the video screen is taken as an example.
  • The proportion of a character in the video picture can be calculated by dividing the area of the portrait element in the picture by the area of the whole picture; this calculation should be done after each frame has been identified.
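The proportion calculation above is a simple area ratio. This is an illustrative sketch; the 0/1 mask representation of the segmented portrait element is an assumption for illustration.

```python
# Illustrative sketch: the character's proportion of the video picture is the
# area of the segmented portrait element divided by the area of the picture.

def portrait_ratio(portrait_mask, frame_width, frame_height):
    """portrait_mask: rows of 0/1 values marking pixels that belong to the
    portrait element; returns portrait area / frame area."""
    portrait_area = sum(sum(row) for row in portrait_mask)
    return portrait_area / (frame_width * frame_height)
```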
  • When the character proportion of a certain picture is high and that picture lies within a selected video segment, the video segment is treated as one body, regardless of how low the proportions of the other pictures in the segment are, and its pictures are kept in chronological order; this prevents splicing by proportion from breaking a segment apart. It is equivalent to comparing the maximum proportion among the pictures of each video segment and splicing in order of decreasing maximum proportion; when the proportions are equal, the segments are spliced in chronological order.
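The ordering rule above can be sketched as a sort key. This is an illustrative sketch; the segment representation (a dict with a start time and per-picture proportions) is an assumption for illustration.

```python
# Illustrative sketch of the splicing order described above: each segment is
# keyed by the maximum character proportion among its pictures, sorted from
# largest to smallest, with chronological order (start time) breaking ties.

def splice_order(segments):
    """segments: list of dicts with 'start' (a timestamp) and 'ratios'
    (the per-picture character proportions). Returns the splicing order."""
    return sorted(segments, key=lambda s: (-max(s["ratios"]), s["start"]))
```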
  • the portrait feature extraction module 13 specifically includes:
  • The picture acquisition unit: after the video to be edited is obtained, in order to realize a video clip centered on the target person, a portrait picture containing a portrait element must be obtained. The ways of obtaining the portrait picture include both selecting a picture already in the smart terminal and importing a picture from outside the smart terminal and storing it in the smart terminal.
  • The portrait element identification unit is connected to the picture acquisition unit. After the portrait picture containing the portrait element is acquired, the portrait element in the picture must be extracted, since the picture may contain a background with interfering factors. First the image is transformed from the time domain to the frequency domain by an image transform such as the Fourier transform, the Walsh-Hadamard transform, or the discrete Karhunen-Loève transform; then image enhancement technology strengthens the high-frequency mutation components in the frequency-domain image, strengthening the edges of the image.
  • After the image edges are strengthened, image recognition technology identifies the character portrait element in the image through feature extraction, index building, and query steps, and image segmentation technology then extracts the portrait element.
  • The feature extraction operation here is based on an external portrait database: recognition models for different portrait elements are established by sampling the portrait elements in the database, so that different portrait elements can be distinguished. For example, a large number of facial portraits in the database are sampled to create a recognition model for facial portraits; this model is then used during recognition, and a portion of the image that matches it is considered a facial portrait element.
  • The portrait feature extraction unit is connected to the portrait element recognition unit. After the portrait element is extracted, the portrait features must be extracted, and the kind of portrait picture should be distinguished. When the portrait element in the portrait picture is the back of the character, the figure contour of the character should be extracted, including the contour of the body and the proportions of each part, forming the first figure contour feature. When the portrait element is the front portrait of the character, the facial portrait features of the character should be extracted, including the facial skin color, the sizes and positional distance relationships of the facial features, and distinguishing marks such as a black mole at the corner of the mouth; the face can also be uniformly sampled and the size of the facial portrait in the picture recorded, forming the first facial portrait feature.
  • the video segment obtaining module 12 specifically includes:
  • The element extraction unit, in order to recognize the portraits of characters in the video to be edited, first splits the video into individual frames and extracts the portrait element in each frame. The image is first transformed from the time domain to the frequency domain by an image transform such as the Fourier transform, the Walsh-Hadamard transform, or the discrete Karhunen-Loève transform; image enhancement technology then strengthens the high-frequency mutation components in the frequency-domain image, strengthening the image edges. After the edges are strengthened, image recognition technology identifies the portrait element through feature extraction, index building, and query steps, and image segmentation technology finally extracts it; the extracted character portrait elements include both character back-view elements and character face elements.
  • The feature extraction operation here is likewise based on an external portrait database: recognition models for different portrait elements are established by sampling the portrait elements in the database, so that different portrait elements can be distinguished. For example, a large number of facial portraits in the database are sampled to create a recognition model for facial portraits; this model is then used during recognition, and a portion of the image that matches it is considered a facial portrait element.
  • The feature extraction unit is connected to the element extraction unit. After the back-view element of a character is extracted from a picture of the video to be edited, the back-view element is sampled, and the contour of the body, the proportions of each part, and the like are extracted to form the second figure contour feature; the extraction method here should be consistent with the method used to extract the first figure contour feature. After the face element of a character is extracted from a picture of the video to be edited, the face element is sampled, and facial portrait features including the facial skin color, the shapes, sizes, and positional distance relationships of the facial features, and distinguishing marks such as a black mole at the corner of the mouth are extracted to form the second facial portrait feature; the extraction method here should be consistent with the method used to extract the first facial portrait feature.
  • The target person may wear sunglasses, so when extracting the facial portrait features the sunglasses portion must be deleted, and only the remaining facial portrait portion is considered. The step of deleting the sunglasses portion includes sampling a large sunglasses database to form features such as contour lines and colors, building a sunglasses model from them, indexing the image against the sunglasses model, and querying for the portion that matches the model; that portion is considered the sunglasses portion and is deleted.
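The deletion step above can be sketched as masking out the matched region so it is ignored downstream. This is an illustrative sketch; locating the sunglasses region via the sunglasses model is outside its scope, and the nested-list image representation is an assumption.

```python
# Illustrative sketch of the sunglasses-deletion step: once a region matching
# the sunglasses model has been located, its pixels are excluded so that only
# the remaining facial portion takes part in the feature comparison.

def delete_region(image_rows, region):
    """Replace pixels inside region = (top, left, bottom, right) with None so
    downstream feature comparison ignores them."""
    top, left, bottom, right = region
    return [[None if top <= r < bottom and left <= c < right else px
             for c, px in enumerate(row)]
            for r, row in enumerate(image_rows)]
```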
  • The back picture acquisition unit is connected to the feature extraction unit. After the second figure contour feature and the first figure contour feature are acquired, an index is established according to the first figure contour feature, the second figure contour feature is scaled to the same size as the first, and the scaled second figure contour feature is compared against the index: whether the body contour lines are consistent, whether the proportions of each part of the body are consistent, and so on. Taking a first similarity threshold of 90%, when the degree of coincidence is greater than or equal to the first similarity threshold, the similarity between the second figure contour feature and the first figure contour feature is considered to be greater than or equal to the first similarity threshold, the character corresponding to the second figure contour feature is considered to be the same person as the character corresponding to the first figure contour feature, and the picture corresponding to the second figure contour feature is acquired as a character back picture.
  • The standard of the first similarity threshold can be adjusted up or down to meet a given recognition accuracy. Since back-view recognition is difficult and error-prone, setting a higher similarity threshold helps improve accuracy. To prevent pictures from being missed when the threshold is set too high, a picture whose similarity falls between 85% and 90% can be popped up for the user, who selects whether to keep it, thereby reducing omissions.
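The threshold rule above has three outcomes: automatic acceptance, a borderline pop-up, and rejection. This is an illustrative sketch; the `ask_user` callback is an assumption standing in for the pop-up dialog, and the numeric defaults follow the text.

```python
# Illustrative sketch of the threshold rule above: similarity >= 90% accepts
# the picture automatically, 85%-90% is borderline and the picture is popped
# up for the user to decide, and below 85% the picture is rejected.

def accept_back_picture(similarity, ask_user, threshold=0.90, borderline=0.85):
    if similarity >= threshold:
        return True
    if similarity >= borderline:
        return ask_user(similarity)  # pop up the picture; the user decides
    return False
```

The same shape applies to the facial comparison below, with the defaults lowered to 85% and 80%.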
  • The front picture acquisition unit is connected to the feature extraction unit. After the second facial portrait feature and the first facial portrait feature are acquired, an index is established according to the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled second facial portrait feature is compared according to the index: whether the facial skin color is the same, whether the sizes and positional distance relationships of the facial features are the same, and whether distinguishing marks such as a black mole at the corner of the mouth match. Taking a second similarity threshold of 85%, when the degree of coincidence is greater than or equal to the second similarity threshold, the similarity between the second facial portrait feature and the first facial portrait feature is considered to be greater than or equal to the second similarity threshold, the person corresponding to the second facial portrait feature and the person corresponding to the first facial portrait feature are considered the same person, and the picture corresponding to the second facial portrait feature is acquired as a character front picture.
  • The standard of the second similarity threshold can likewise be adjusted up or down to meet a given recognition accuracy. Since the facial portrait features are more numerous and their comparison is more accurate, the facial portrait comparison threshold (the second similarity threshold) is set slightly lower than the figure contour comparison threshold (the first similarity threshold). As with the back-view comparison, a higher similarity threshold helps improve accuracy, and a picture whose similarity falls between 80% and 85% can be popped up for the user, who selects whether to keep it, thereby reducing omissions.
  • The cutting unit is connected to the back picture acquisition unit and the front picture acquisition unit. After the character back pictures and character front pictures have been acquired, it is checked whether the adjacent frames were also acquired; when they were, the continuous run of frames is treated as one body and cut from the video to be edited to form a video segment.
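The cutting rule above amounts to grouping consecutive matched frame indices into runs. This is an illustrative sketch; the index-based representation is an assumption for illustration.

```python
# Illustrative sketch of the cutting rule above: runs of consecutive matched
# frame indices are treated as one body, each run becoming one video segment.

def group_into_segments(frame_indices):
    """Group matched frame indices into (first, last) runs."""
    segments = []
    for i in sorted(frame_indices):
        if segments and i == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], i)  # extend the current run
        else:
            segments.append((i, i))              # start a new run
    return segments
```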
  • the video splicing module 14 specifically includes:
  • The separation unit, after acquiring the video segments (or the remaining segments of the video to be edited other than those segments), separates the audio information in each segment from the video information. Since the video information here does not include the audio information, the positional relationship between the audio information and the video information must be recorded; the extracted audio information and video information form an audio part and a video part.
  • The splicing unit splices the audio parts and the video parts separately, in order, to form a complete audio part composed entirely of audio and a complete video part composed entirely of video.
  • The synchronization unit, after the complete audio part and the complete video part are acquired, synchronizes the complete audio part with the complete video part according to the recorded positional relationship of the audio information and the video information, forming the final complete video.
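The separate-splice-synchronize flow above can be sketched with a toy data model. This is an illustrative sketch; each segment is modelled as a list of (position, audio, video) tuples, and real container formats and codecs are outside its scope.

```python
# Illustrative sketch of separate -> splice -> synchronize. Each segment is a
# list of (position, audio_sample, video_frame) tuples; the positions recorded
# during separation are used to synchronize the two parts at the end.

def splice_segments(segments):
    positions, audio_part, video_part = [], [], []
    for segment in segments:        # separate, recording each position
        for pos, audio, video in segment:
            positions.append(pos)
            audio_part.append(audio)
            video_part.append(video)
    # the audio part and video part were spliced separately above; the
    # recorded positions now synchronize them back into one complete video
    return [(pos, a, v) for pos, a, v in zip(positions, audio_part, video_part)]
```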
  • the video editing system 100 further includes the following components:
  • The video segment screening module 15, after the video segments are obtained, further improves accuracy: the obtained video segments are pushed to the user, who screens them and may perform deletion operations to remove unrelated segments that were identified in error.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

提供了一种基于智能终端的视频剪辑方法及视频剪辑系统，视频剪辑方法包括以下步骤：获取待剪辑视频文件，并存储于所述智能终端内；获取具有人物肖像元素的人物肖像图片，并提取所述人物肖像元素的人物肖像特征；获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段；将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。采用上述技术方案后，可根据用户的需求结合用户导入的人物图片实现智能剪辑并形成含或不含某一人物或多个人物的完整视频，同时提供交互接口，由用户对提取的视频片段再次筛选，提高准确度。A video editing method and a video editing system based on a smart terminal are provided. The video editing method comprises the following steps: acquiring a video file to be edited and storing it in the smart terminal; acquiring a portrait picture of a person containing a portrait element, and extracting the portrait features of the portrait element; acquiring the video segments of the video to be edited that contain a character matching the portrait features; and splicing those video segments, or the remaining segments of the video to be edited other than those segments. With the above technical solution, intelligent editing can be performed according to the user's needs in combination with the person pictures imported by the user, forming a complete video with or without a certain character or several characters; an interactive interface is also provided through which the user screens the extracted video segments again, improving accuracy.

Description

一种基于智能终端的视频剪辑方法及视频剪辑系统Video editing method and video editing system based on intelligent terminal 技术领域Technical field

本发明涉及智能设备领域,尤其涉及一种基于智能终端的视频剪辑方法及视频剪辑系统。The present invention relates to the field of smart devices, and in particular, to a video editing method and a video editing system based on a smart terminal.

背景技术Background technique

随着视频节目的多元蓬勃发展，电视剧、电影等视频节目成为了人们生活中不可或缺的一部分，而人们对视频节目中的情节、人物都存在一定的偏好，经常出现想看某一演员但又不想看完全剧或整个电影的情况，人们一般都会采取快进的手段，但此方法不仅浪费时间、而且容易错过故事情节。现有技术中存在对感兴趣人物的视频片段进行提取的算法，但准确度不高。With the flourishing diversity of video programs, TV dramas, movies, and other video programs have become an indispensable part of people's lives. People have certain preferences for the plots and characters in video programs, and often want to watch a certain actor without watching the whole drama or the whole movie. People generally resort to fast-forwarding, but this method not only wastes time, it also makes it easy to miss the storyline. The prior art contains algorithms for extracting the video segments of a person of interest, but their accuracy is not high.

因此，本发明通过图像识别算法，将目标人物的背影与正面画面相区分，分别比对，并结合用户的需求提供一种视频剪辑的方法，不仅适用与视频节目，同样适用于用户拍摄的生活视频，可以剪辑包含某一人物的专属视频，也可剪辑同时包含某多个人物的视频，或不包含某一人物的视频，同时可以提供包含目标人物背影的视频，节省时间，并且精确度高，并通过与用户交互对提取内容进行筛选，再次提高精确度。Therefore, the present invention uses an image recognition algorithm to distinguish the back view of the target person from the frontal picture, compares them separately, and, in combination with the user's needs, provides a video editing method. It applies not only to video programs but equally to life videos shot by the user: it can edit an exclusive video containing a certain character, a video containing several characters at once, or a video excluding a certain character, and can also provide video containing the back view of the target person, saving time with high accuracy; by screening the extracted content through interaction with the user, the accuracy is improved once more.

发明内容Summary of the invention

为了克服上述技术缺陷,本发明的目的在于提供一种基于智能终端的视频剪辑方法及视频剪辑系统,可根据用户的需求进行包含或不包含某一人物或某多个人物的专属视频剪辑,且快捷方便,精确度高,节省时间。In order to overcome the above technical deficiencies, an object of the present invention is to provide a video editing method and a video editing system based on a smart terminal, which can perform a dedicated video clip with or without a certain character or a plurality of characters according to the needs of the user, and Fast and convenient, high precision and time saving.

本发明公开了一种基于智能终端的视频剪辑方法,包括以下步骤:The invention discloses a video editing method based on a smart terminal, comprising the following steps:

获取待剪辑视频文件,并存储于所述智能终端内;Obtaining a video file to be edited and storing in the smart terminal;

获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;Obtaining a portrait image of a person having a portrait element and extracting a portrait feature of the portrait element;

获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;Obtaining, in the video to be edited, a video segment of a character that matches the character portrait feature;

将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。Splicing the video clip or the remaining video clips other than the video clip in the video to be clipped.

优选地,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征的步骤包括:Preferably, the step of acquiring a person portrait picture having a portrait element and extracting the person portrait feature of the person portrait element comprises:

获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;Obtaining a portrait picture of a person having a portrait element and storing it in the smart terminal;

识别所述人物肖像图片中的人物肖像元素;Identifying a portrait element in the portrait image of the person;

提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。The body contour feature of the character portrait element is extracted as a first body contour feature, and the facial portrait feature of the person portrait element is extracted as a first facial portrait feature.

优选地,获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤包括:Preferably, the step of acquiring the video clip of the character to be clipped that matches the character portrait feature comprises:

将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;Splitting the video to be clipped, obtaining each frame of the picture, and extracting the to-be-obtained portrait elements in the frame of each frame, including the character back element and the character face element;

提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;Extracting a body contour feature of the character back element as a second body shape feature, and extracting a face portrait feature of the character face element as a second face portrait feature;

对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;Comparing the second body contour feature with the first body contour feature, and obtaining a picture corresponding to the second body contour feature when the similarity is greater than or equal to the first similarity threshold is a character back image;

对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;Comparing the second facial portrait feature with the first facial portrait feature, and acquiring a picture corresponding to the second facial portrait feature when the similarity is greater than or equal to the second similarity threshold is a front view of the character;

从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。Cutting the character back picture from the character front picture from the to-be-edited video to form a video segment.

优选地,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼 接的步骤包括:Preferably, the video clip or the remaining video clips other than the video clip in the video to be clipped are spelled The steps to follow include:

分离所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段中的音频信息与视频信息,形成音频部分与视频部分;Separating the audio information and the video information in the remaining video segments except the video segment in the video clip or the video to be clipped to form an audio portion and a video portion;

将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;Separating the audio portion and the video portion separately to form a complete audio portion and a complete video portion;

将所述完整音频部分与所述完整视频部分进行同步。Synchronizing the complete audio portion with the full video portion.

优选地,在获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段的步骤与将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接的步骤之间,所述视频剪辑方法还包括:Preferably, the step of acquiring a video segment of the character in the video to be clipped that matches the character portrait feature is splicing the remaining video segments other than the video segment in the video segment or the video to be clipped Between the steps, the video editing method further includes:

向用户推送所述视频片段,由用户进行筛选,剔除无关的视频片段。The video clip is pushed to the user and filtered by the user to remove irrelevant video clips.

本发明还公开了一种基于智能终端的视频剪辑系统,包括:The invention also discloses a video editing system based on a smart terminal, comprising:

视频获取模块,获取待剪辑视频文件,并存储于所述智能终端内;a video acquisition module, which acquires a video file to be edited and stores the video file in the smart terminal;

肖像特征提取模块,获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征;a portrait feature extraction module, acquiring a portrait image of a person having a portrait element, and extracting a portrait feature of the portrait element;

视频片段获取模块,与所述视频获取模块及所述肖像特征提取模块连接,获取所述待剪辑视频中包含与所述人物肖像特征相符的人物的视频片段;a video segment acquisition module, configured to connect to the video acquisition module and the portrait feature extraction module, and acquire a video segment of the to-be-edited video that includes a character that matches the character portrait feature;

视频拼接模块,与所述视频片段获取模块连接,将所述视频片段或待剪辑视频中除所述视频片段外的剩余视频片段进行拼接。The video splicing module is connected to the video segment obtaining module to splicing the video clip or the remaining video segments except the video segment in the video to be clipped.

优选地,所述肖像特征提取模块包括:Preferably, the portrait feature extraction module comprises:

图片获取单元,获取具有人物肖像元素的人物肖像图片,并存储于所述智能终端内;a picture obtaining unit, which acquires a portrait picture of a person having a portrait element and stores it in the smart terminal;

肖像元素识别单元,与所述图片获取单元连接,识别所述人物肖像图片中的人物肖像元素;a portrait element identification unit connected to the picture acquisition unit to identify a person portrait element in the person portrait picture;

肖像特征提取单元,与所述肖像元素识别单元连接,提取所述人物肖像元素的身形轮廓特征为第一身形轮廓特征,提取所述人物肖像元素的面部肖像特征为第一面部肖像特征。a portrait feature extraction unit, coupled to the portrait element recognition unit, extracting a body profile feature of the person portrait element as a first body profile feature, and extracting a face portrait feature of the person portrait element as a first face portrait feature .

优选地,所述视频片段获取模块包括:Preferably, the video segment obtaining module includes:

元素提取单元,将待剪辑视频进行拆分,获取每一帧画面,提取所述每一帧画面中的待比对人物肖像元素,包括人物背影元素与人物面部元素;An element extracting unit splits the video to be clipped, obtains a frame of each frame, and extracts a pair of portrait elements in the frame of each frame, including a character back element and a character face element;

特征提取单元,与所述元素提取单元连接,提取所述人物背影元素的身形轮廓特征为第二身形轮廓特征,提取所述人物面部元素的面部肖像特征为第二面部肖像特征;a feature extraction unit, connected to the element extraction unit, extracting a body contour feature of the character back image element as a second body shape contour feature, and extracting a face portrait feature of the character face element as a second face portrait feature;

背影画面获取单元,与所述特征提取单元连接,对所述第二身形轮廓特征与所述第一身形轮廓特征进行比对,获取相似度大于等于第一相似度阈值时所述第二身形轮廓特征对应的画面为人物背影画面;a back image acquisition unit, connected to the feature extraction unit, and comparing the second body contour feature with the first body contour feature, and acquiring the second when the similarity is greater than or equal to the first similarity threshold The picture corresponding to the figure outline feature is a character back picture;

正面画面获取单元,与所述特征提取单元连接,对所述第二面部肖像特征与所述第一面部肖像特征进行比对,获取相似度大于等于第二相似度阈值时所述第二面部肖像特征对应的画面为人物正面画面;a front view acquiring unit, connected to the feature extracting unit, comparing the second facial portrait feature with the first facial portrait feature, and acquiring the second facial when the similarity is greater than or equal to a second similarity threshold The picture corresponding to the portrait feature is the front view of the character;

剪切单元,与所述背影画面获取单元及所述正面画面获取单元连接,从所述待剪辑视频中剪切所述人物背影画面与所述人物正面画面,形成视频片段。The cutting unit is connected to the back view picture acquiring unit and the front picture acquiring unit, and cuts the character back picture from the character front picture from the to-be-edited video to form a video segment.

优选地,所述视频拼接模块包括:Preferably, the video splicing module comprises:

分离单元,分离所述视频片段中的音频信息与视频信息,形成音频部分与视频部分;a separating unit, separating audio information and video information in the video segment to form an audio portion and a video portion;

拼接单元,与所述分离单元连接,将所述音频部分与所述视频部分单独进行拼接形成完整音频部分与完整视频部分;a splicing unit, connected to the separating unit, splicing the audio portion and the video portion separately to form a complete audio portion and a complete video portion;

同步单元,与所述拼接单元连接,将所述完整音频部分与所述完整视频部分进行同步。a synchronization unit coupled to the tiling unit to synchronize the complete audio portion with the complete video portion.

优选地,在视频片段获取模块与视频拼接模块之间,所述视频剪辑系统还包括:Preferably, between the video segment acquisition module and the video splicing module, the video editing system further includes:

视频片段筛选模块,向用户推送所述视频片段,由用户进行筛选,剔除无关的视频 片段。a video clip screening module that pushes the video clip to a user, and the user filters the video to eliminate irrelevant video Fragment.

采用了上述技术方案后,与现有技术相比,具有以下有益效果:After adopting the above technical solution, compared with the prior art, the following beneficial effects are obtained:

1.满足用户对获取包含或不包含某一人物或某多个人物的专属视频的需求;1. Meet the user's need to acquire a dedicated video with or without a character or multiple characters;

2.可识别目标人物的背影,获取包含或不包含某一人物或某多个人物背影的视频片段;2. The back of the target person can be identified to obtain a video clip containing or not including a character or a plurality of characters;

3.快捷方便,准确度高;3. Fast and convenient, high accuracy;

附图说明DRAWINGS

图1为符合本发明一优选实施例中视频剪辑方法的流程示意图;1 is a flow chart showing a video editing method in accordance with a preferred embodiment of the present invention;

图2为符合本发明一优选实施例中,视频剪辑方法的提取人物肖像特征的方法的流程示意图;2 is a flow chart showing a method for extracting a portrait feature of a video editing method in accordance with a preferred embodiment of the present invention;

图3为符合本发明一优选实施例中,视频剪辑方法的获取视频片段的方法的流程示意图;3 is a flow chart showing a method for acquiring a video clip by a video editing method in accordance with a preferred embodiment of the present invention;

图4为符合本发明一优选实施例中,视频剪辑方法的将视频片段或待剪辑视频中除视频片段外的剩余视频片段进行拼接的方法的流程示意图;4 is a schematic flow chart of a method for splicing a video clip or a video clip other than a video clip in a video clip or a video to be clipped according to a preferred embodiment of the present invention;

图5为符合本发明另一优选实施例中视频剪辑方法的流程示意图;FIG. 5 is a schematic flow chart of a video editing method according to another preferred embodiment of the present invention; FIG.

图6为符合本发明一优选实施例视频剪辑系统的系统结构示意图。Figure 6 is a block diagram showing the structure of a video editing system in accordance with a preferred embodiment of the present invention.

图7为符合本发明一优选实施例中,视频剪辑系统的肖像特征提取模块的结构示意图。FIG. 7 is a schematic structural diagram of a portrait feature extraction module of a video editing system in accordance with a preferred embodiment of the present invention.

图8为符合本发明一优选实施例中,视频剪辑系统的视频片段获取模块的结构示意图。FIG. 8 is a schematic structural diagram of a video clip acquiring module of a video editing system in accordance with a preferred embodiment of the present invention.

图9为符合本发明一优选实施例中,视频剪辑系统的视频拼接模块的结构示意图。FIG. 9 is a schematic structural diagram of a video splicing module of a video editing system in accordance with a preferred embodiment of the present invention.

图10为符合本发明另一优选实施例视频剪辑系统的系统结构示意图。FIG. 10 is a schematic structural diagram of a system of a video editing system in accordance with another preferred embodiment of the present invention.

附图标记:Reference mark:

100-视频剪辑系统;11-视频获取模块;12-视频片段获取模块;13-肖像特征提取模块;14-视频拼接模块;15-视频片段筛选模块。100-video editing system; 11-video acquisition module; 12-video segment acquisition module; 13-portrait feature extraction module; 14-video splicing module; 15-video segment screening module.

具体实施方式Detailed ways

这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. The following description refers to the same or similar elements in the different figures unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present disclosure. Instead, they are merely examples of devices and methods consistent with aspects of the present disclosure as detailed in the appended claims.

在本公开使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本公开。在本公开和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。The terms used in the present disclosure are for the purpose of describing particular embodiments only, and are not intended to limit the disclosure. The singular forms "a", "the" and "the" It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

在本发明的描述中,除非另有规定和限定,对于本领域的普通技术人员而言,可以根据具体情况理解上述术语的具体含义。In the description of the present invention, the specific meaning of the above terms may be understood by one of ordinary skill in the art, unless otherwise specified and defined.

在后续的描述中,使用用于表示元件的诸如“模块”、“部件”或“单元”的后缀仅为了有利于本发明的说明,其本身并没有特定的意义。因此,“模块”与“部件”可以混合地使用。In the following description, the use of suffixes such as "module", "component" or "unit" for indicating an element is merely an explanation for facilitating the present invention, and does not have a specific meaning per se. Therefore, "module" and "component" can be used in combination.

移动终端可以以各种形式来实施。例如,本发明中描述的终端可以包括诸如移动电话、智能电话、笔记本电脑、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、导航装置等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。The mobile terminal can be implemented in various forms. For example, the terminal described in the present invention may include a mobile terminal such as a mobile phone, a smart phone, a notebook computer, a PDA (Personal Digital Assistant), a PAD (Tablet), a PMP (Portable Multimedia Player), a navigation device, and the like, and such as Fixed terminal for digital TV, desktop computer, etc.

参阅图1,为本发明一优选实施例中基于智能终端的视频剪辑方法的流程示意图。该实施例中,视频剪辑方法具体包括以下步骤: 1 is a schematic flowchart of a video clip method based on a smart terminal according to a preferred embodiment of the present invention. In this embodiment, the video editing method specifically includes the following steps:

S100:获取待剪辑视频文件,并存储于所述智能终端内S100: Acquire a video file to be edited and store it in the smart terminal.

为了实现视频剪辑，首先必须要获取待剪辑的视频文件，获取待剪辑视频文件的方式既包括导入智能终端内的视频，也包括从智能终端外部导入视频，并存储在智能终端内。此处导入的待剪辑视频文件必须包含用户想要剪辑的目标人物，若用户导入视频错误，即不包含其想要剪辑的目标人物，则在后续图像识别获取视频片段时将会没有结果，并提醒用户未获取相关视频，请用户核对待剪辑视频文件或目标人物肖像图片是否导入错误。To implement video editing, the video file to be edited must first be obtained. The ways of obtaining it include both selecting a video already in the smart terminal and importing a video from outside the smart terminal and storing it in the smart terminal. The imported video file must contain the target person the user wants to edit around; if the user imports the wrong video, i.e., one that does not contain the intended target person, the subsequent image recognition will yield no video segments, and the user is reminded that no relevant video was obtained and asked to check whether the video file to be edited or the target person's portrait picture was imported incorrectly.

S200:获取具有人物肖像元素的人物肖像图片,并提取所述人物肖像元素的人物肖像特征S200: Acquire a portrait image of a person with a portrait element, and extract a portrait feature of the portrait element

After the video to be edited is obtained, a portrait picture containing a portrait element must be acquired and the portrait feature of that element extracted, so that the editing can be centered on the target person. The picture may come either from inside the smart terminal or be imported from outside and stored in it. The imported portrait picture must match the user's needs: if the user wants video segments showing the target person's front view, the picture must contain the target person's facial elements; if the user wants segments showing the target person's back, the picture must contain the target person's back view. When there is a single target person, the picture must show that person alone; when there is more than one target person, the user must provide a corresponding portrait picture for each, and no person other than the target persons may appear in any picture, although a single picture containing only the multiple target persons together is acceptable. To improve accuracy, the user is advised to capture pictures meeting these requirements directly from the video to be edited, which makes the comparison results more reliable.

S300: Obtain video segments of the video to be edited that contain persons matching the portrait feature.

After the video file and the portrait picture are obtained and the portrait feature extracted, the persons appearing in the video frames are compared against the extracted portrait feature to obtain the video segments containing matching persons. In this process a strategy is built from the user's needs, covering the distinction between front views and back views of the target person, the number of target persons, and the distinction between target persons and other persons. First, if the user only wants frames containing the target person's front view, only those frames are collected; if only back views are wanted, only back-view frames are collected; if both are wanted, the relation between front-view frames and back-view frames is a logical OR. Second, when there is more than one target person, the relations between them must be considered: if the user wants frames where all target persons appear together, the relation between their portrait features is a logical AND; if frames where any of them appears are acceptable, the relation is a logical OR. Beyond that, the logical relations between target persons can be set according to the user's needs, e.g., two of them combined with AND and a third combined with that pair by OR. Taking a TV series as an example, if the user wants all scenes shared by a particular supporting actor and the leading actress, both persons must appear at the same time, so the relation between the two is a logical AND, and only frames in which both portraits appear, whether front or back views, are collected. Finally, regarding the distinction between target persons and other persons: if the user wants frames containing only the target persons, the number of persons extracted from a frame must equal the number of target persons.

The process of obtaining the video segments containing persons matching the target portrait feature is as follows: the video to be edited is split into frames; for each frame, the portrait elements are extracted using image transform, image enhancement, image recognition, and image segmentation techniques; portrait features are then sampled from the portrait elements and compared one by one against the portrait features extracted from the target person's picture. When they match, the frame is a frame containing the target person, and consecutive matching frames form a video segment.
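The per-frame matching loop described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: `similarity` stands in for the whole transform/enhancement/recognition pipeline, which the description does not specify in code, and the feature representation is a hypothetical placeholder.

```python
def match_frames(frames, target_features, similarity, threshold=0.9):
    """Return indices of frames whose extracted portrait features
    match any of the target portrait features.

    frames: per-frame lists of extracted portrait features
    target_features: features extracted from the target portrait picture
    similarity: hypothetical comparison function returning a score in [0, 1]
    """
    matched = []
    for i, frame_features in enumerate(frames):
        # A frame matches when any portrait in it is close enough
        # to any of the target portrait features.
        if any(similarity(f, t) >= threshold
               for f in frame_features for t in target_features):
            matched.append(i)
    return matched
```

Runs of adjacent indices in the returned list correspond to the consecutive matching frames that the method joins into one video segment.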

In the TV-series example above, the portrait feature is divided into a facial portrait feature and a body contour feature. If the supporting actor's facial portrait feature is A1 and his body contour feature is A2, the leading actress's facial portrait feature is B1 and her body contour feature is B2, and the number of portrait elements in a frame is N, then a frame in which only the supporting actor and the leading actress appear together, whether as front or back views, and no one else appears, must satisfy the logical relation (A1 or A2) and (B1 or B2) and (N = 2).
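The relation (A1 or A2) and (B1 or B2) and (N = 2) translates directly into a frame predicate. The sketch below assumes each frame's matched features are collected into a set of labels; the label names mirror the example in the text.

```python
def scene_matches(features, n_people):
    """Frames where only the supporting actor (A1 front / A2 back) and
    the leading actress (B1 front / B2 back) appear, and nobody else:
    (A1 or A2) and (B1 or B2) and (N = 2)."""
    a_present = "A1" in features or "A2" in features
    b_present = "B1" in features or "B2" in features
    return a_present and b_present and n_people == 2
```

Other user requirements (any-of-the-targets, mixed AND/OR groupings) would swap in a different boolean combination of the same per-person tests.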

S400: Splice the obtained video segments, or the remaining segments of the video to be edited other than the obtained segments.

After the video segments containing persons matching the target portrait feature are obtained, they must be spliced together in a certain order: chronological order; order of the number of persons in the frame, ascending (from frames containing only the target persons to frames also containing others) or descending; or order of the proportion of the frame occupied by the target person, ascending or descending. The latter two orders must be supplemented by chronological order. Taking the occupancy-based order as an example, the proportion a person occupies in a frame is computed as the area of the person's portrait element divided by the area of the frame, recorded as the occupancy ratio; this is computed after each frame is recognized. When a frame with a high occupancy ratio belongs to a selected video segment, the whole segment is treated as one unit and its frames are kept in chronological order regardless of the ratios of its other frames, which prevents the picture from breaking up discontinuously if splicing were driven by the ratio alone. This is equivalent to comparing segments by the maximum occupancy ratio of their frames and splicing them in order of that maximum; segments with equal ratios are spliced in chronological order.
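The occupancy-based ordering above can be sketched as a sort key: each segment is ranked by the maximum occupancy ratio among its frames, with the segment's start time as the chronological tie-breaker. The segment representation here is a simplifying assumption.

```python
def occupancy_ratio(portrait_area, frame_area):
    """Occupancy ratio of one frame: portrait element area / frame area."""
    return portrait_area / frame_area

def splice_order(clips):
    """clips: list of (start_time, per_frame_ratios) pairs.
    Sort by the maximum ratio in the clip, descending, so that the
    clip is kept whole; equal maxima fall back to chronological order."""
    return sorted(clips, key=lambda c: (-max(c[1]), c[0]))
```

Sorting by the per-segment maximum rather than per-frame ratios is what keeps each selected segment intact, as the text requires.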

Besides splicing the obtained segments, when the user wants a video that excludes a certain person or persons, the remaining segments of the video to be edited other than the obtained segments are spliced instead, removing the obtained segments.

With the above configuration, the user can quickly and accurately obtain a dedicated video that contains, or excludes, one or more specific persons, and video segments containing the target person's back view can be identified accurately.

Referring to FIG. 2, in a preferred embodiment, the step of acquiring a portrait picture containing a portrait element and extracting the portrait feature of the portrait element specifically includes:

S201: Acquire a portrait picture containing a portrait element and store it in the smart terminal.

After the video to be edited is obtained, a portrait picture containing a portrait element must be acquired in order to center the editing on the target person. The picture may come either from inside the smart terminal or be imported from outside and stored in it.

S202: Identify the portrait element in the portrait picture.

After the portrait picture is obtained, the portrait element must be extracted, since the picture may contain a background with interfering factors. The image is first transformed from the time domain to the frequency domain using an image transform such as the Fourier transform, the Walsh–Hadamard transform, or the discrete Karhunen–Loève transform; image enhancement then strengthens the high-frequency components of the frequency-domain image, sharpening the image edges. With the edges enhanced, image recognition identifies the portrait element in the picture through feature extraction, index building, and querying, and image segmentation extracts the element. The feature-extraction step relies on an external portrait database: recognition models for different portrait element types are built by sampling the portrait elements in the database, so that the types can be distinguished. For example, a facial-portrait recognition model is built by sampling a large number of facial portraits in the database and is then used during recognition; when a region of the picture is found to be consistent with the model, that region is taken to be a facial portrait element.

S203: Extract the body contour feature of the portrait element as a first body contour feature, and extract the facial portrait feature of the portrait element as a first facial portrait feature.

After the portrait element is extracted, its portrait features must be extracted, distinguishing between the kinds of portrait picture. When the portrait element is a back view, the body contour features of the back view are extracted, including the body outline and the proportions of its parts, forming the first body contour feature. When the portrait element is a front view, the facial portrait features of the front view are extracted, including facial skin color; the shape, size, and relative positions of the facial features; and distinctive marks such as a mole at the corner of the mouth. The face may also be sampled uniformly, and the area the facial portrait occupies in the picture is recorded.

Referring to FIG. 3, in a preferred embodiment, the step of obtaining video segments of the video to be edited that contain persons matching the portrait feature may specifically include:

S301: Split the video to be edited, obtain each frame, and extract the portrait elements to be compared in each frame, including back-view elements and facial elements.

To recognize the portraits in the video to be edited, the video is first split into individual frames and the portrait elements in each frame are extracted. The image is first transformed from the time domain to the frequency domain using an image transform such as the Fourier transform, the Walsh–Hadamard transform, or the discrete Karhunen–Loève transform; image enhancement strengthens the high-frequency components, sharpening the image edges; image recognition then identifies the portrait elements through feature extraction, index building, and querying; finally, image segmentation extracts the portrait elements, which include back-view elements and facial elements. As before, the feature-extraction step relies on an external portrait database: recognition models for different portrait element types are built by sampling the database's portrait elements, e.g., a facial-portrait model built from a large number of sampled facial portraits; a region of the frame consistent with the model is taken to be a facial portrait element.

S302: Extract the body contour feature of the back-view element as a second body contour feature, and extract the facial portrait feature of the facial element as a second facial portrait feature.

After a back-view element is extracted from a frame of the video to be edited, it is sampled to extract body contour features such as the body outline and the proportions of its parts, forming the second body contour feature; the extraction method must be consistent with the one used for the first body contour feature. After a facial element is extracted from a frame, it is sampled to extract facial portrait features including facial skin color; the shape, size, and relative positions of the facial features; and distinctive marks such as a mole at the corner of the mouth, forming the second facial portrait feature; again the method must match the one used for the first facial portrait feature. The target person may be wearing sunglasses, in which case the sunglasses region must be removed before the facial portrait feature is extracted, and only the remaining facial portrait is considered. Removing the sunglasses involves building a sunglasses model of contour lines, colors, and similar features by sampling a large sunglasses database, indexing and querying the facial element against that model; a region consistent with the sunglasses model is taken to be the sunglasses region and is deleted.
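The sunglasses-removal idea can be sketched as masking out feature samples that fall in the detected region. This is an illustrative assumption: the text only says the region is found with a sunglasses model, so the bounding-box representation and the `(x, y, value)` sample format below are hypothetical.

```python
def drop_sunglasses_samples(face_samples, sunglasses_box):
    """Keep only facial feature samples outside the detected sunglasses
    region, so the comparison uses the uncovered part of the face.

    face_samples: list of (x, y, value) sample points on the face
    sunglasses_box: (x0, y0, x1, y1) region reported by the model
    """
    x0, y0, x1, y1 = sunglasses_box
    return [(x, y, v) for (x, y, v) in face_samples
            if not (x0 <= x <= x1 and y0 <= y <= y1)]
```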

S303: Compare the second body contour feature with the first body contour feature; when the similarity is greater than or equal to a first similarity threshold, the frame corresponding to the second body contour feature is taken as a back-view frame.

After the second and first body contour features are obtained, an index is built from the first body contour feature, the second body contour feature is scaled to the same size as the first, and the scaled feature is sampled and queried against the index, checking for example whether the body outlines coincide and whether the proportions of the body parts agree. With the first similarity threshold set at 90%, when the degree of agreement reaches the threshold, the similarity between the second and first body contour features is considered to be greater than or equal to the first similarity threshold, the person corresponding to the second body contour feature is considered to be the same person as the one corresponding to the first, and the corresponding frame is collected as a back-view frame. The first similarity threshold can be adjusted up or down, as long as a certain recognition accuracy is maintained. Back-view recognition is difficult and error-prone, so a higher similarity threshold improves accuracy; to avoid missing frames when the threshold is too high, a frame whose similarity falls between 85% and 90% can be popped up for the user to decide whether to keep it, reducing omissions.
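The three-way decision described above, with an accept threshold at 90% and a user-review band from 85% to 90%, can be sketched as a small classifier; the string return values are placeholders for the actions taken.

```python
def back_view_decision(similarity, threshold=0.90, review_band=0.85):
    """Classify one back-view comparison result:
    - at or above the threshold: automatic match,
    - in the 85%-90% band: pop the frame up for the user to decide,
    - below the band: reject."""
    if similarity >= threshold:
        return "match"
    if similarity >= review_band:
        return "ask_user"
    return "reject"
```

The same structure applies to the facial comparison in S304, with the band shifted to 80%–85% and the threshold to 85%.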

S304: Compare the second facial portrait feature with the first facial portrait feature; when the similarity is greater than or equal to a second similarity threshold, the frame corresponding to the second facial portrait feature is taken as a front-view frame.

After the second and first facial portrait features are obtained, an index is built from the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled feature is compared against the index, checking for example whether the facial skin color is the same, whether the shape, size, and relative positions of the facial features agree, and whether distinctive marks such as a mole at the corner of the mouth match. With the second similarity threshold set at 85%, when the degree of agreement reaches the threshold, the similarity between the second and first facial portrait features is considered to be greater than or equal to the second similarity threshold, the person corresponding to the second facial portrait feature is considered to be the same person as the one corresponding to the first, and the corresponding frame is collected as a front-view frame. The second similarity threshold can also be adjusted up or down, as long as a certain recognition accuracy is maintained. Because facial portraits offer more features and the comparison is more accurate, the facial comparison threshold, i.e., the second similarity threshold, is slightly lower than the body contour comparison threshold, i.e., the first similarity threshold. As with body contours, a higher threshold improves accuracy; to avoid missing frames when the threshold is too high, a frame whose similarity falls between 80% and 85% can be popped up for the user to decide whether to keep it, reducing omissions.

S305: Cut the back-view frames and the front-view frames out of the video to be edited to form video segments.

After the back-view and front-view frames are collected, the system checks whether the adjacent frames were also collected; when they were, the consecutive frames are treated as one unit and cut out of the video to be edited, forming a video segment.
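Treating consecutive collected frames as one unit amounts to merging runs of adjacent frame indices into segments, which can be sketched as follows; the `(first, last)` pair representation of a segment is an assumption for illustration.

```python
def frames_to_clips(matched_frames):
    """Merge runs of adjacent frame indices into (first, last) clips,
    so each run of consecutive collected frames becomes one segment."""
    clips = []
    for i in sorted(matched_frames):
        if clips and i == clips[-1][1] + 1:
            # Extend the current run by one frame.
            clips[-1] = (clips[-1][0], i)
        else:
            # Start a new clip at this frame.
            clips.append((i, i))
    return clips
```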

Referring to FIG. 4, in a preferred embodiment, the step of splicing the obtained video segments, or the remaining segments of the video to be edited other than the obtained segments, specifically includes:

S401: Separate the audio information and the video information in the obtained video segments, or in the remaining segments of the video to be edited other than the obtained segments, forming an audio part and a video part.

After the obtained video segments, or the remaining segments other than the obtained ones, are collected, the audio information in each segment must be separated from the video information (here the video information excludes the audio). The positional relationship between the audio information and the video information is recorded, and the audio and video information are extracted to form the audio part and the video part.

S402: Splice the audio parts and the video parts separately to form a complete audio part and a complete video part.

After the audio part and video part of each segment are obtained, the audio parts and the video parts are spliced in order, forming a complete audio part consisting entirely of audio and a complete video part consisting entirely of video.

S403: Synchronize the complete audio part with the complete video part.

After the complete audio part and the complete video part are obtained, they are synchronized according to the recorded positional relationship between the audio information and the video information, forming the final complete video.
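The synchronization step S401–S403 can be sketched as bookkeeping over the stitched timeline: each segment records its duration and the audio's recorded position relative to its own video, and after concatenation the audio start of each segment is its video start plus that recorded offset. The representation (integer time units, per-segment offsets) is a simplifying assumption; the text only requires that the recorded positional relationship be preserved.

```python
def stitch_and_sync(segments):
    """segments: list of (duration, audio_offset) pairs in some time
    unit (e.g. milliseconds); audio_offset is the recorded position of
    the segment's audio relative to its video. Returns, per segment,
    the video and audio start positions on the final stitched timeline."""
    timeline = []
    t = 0
    for duration, audio_offset in segments:
        timeline.append({"video_start": t, "audio_start": t + audio_offset})
        t += duration  # next segment begins where this one ends
    return timeline
```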

Referring to FIG. 5, which is a schematic flowchart of a smart-terminal-based video editing method according to another preferred embodiment of the present invention. In this embodiment, between the step of obtaining the video segments of the video to be edited that contain persons matching the portrait feature and the step of splicing the obtained segments, or the remaining segments other than the obtained ones, the video editing method further includes the following step:

S500: Push the obtained video segments to the user, who screens them and removes irrelevant segments.

After the video segments are obtained, accuracy can be further improved by pushing them to the user for screening; the user may delete misrecognized, irrelevant segments.

Referring to FIG. 6, a smart-terminal-based video editing system 100 according to a preferred embodiment of the present invention specifically includes the following components:

Video acquisition module 11: to perform video editing, the video file to be edited must first be acquired, either from videos already in the smart terminal or imported from outside the terminal and stored in it. The imported video file must contain the target person the user wants to clip; if the user imports the wrong video, i.e., one not containing the intended target person, the subsequent image-recognition step will return no video segments, and the user is reminded that no relevant video was obtained and asked to check whether the video file or the target portrait picture was imported incorrectly.

Portrait feature extraction module 13: after the video to be edited is obtained, a portrait picture containing a portrait element must be acquired and the portrait feature of the portrait element extracted, so that the editing can be centered on the target person. The picture may come either from inside the smart terminal or be imported from outside and stored in it. The imported portrait picture must match the user's needs: if the user wants segments showing the target person's front view, the picture must contain the target person's facial elements; if the user wants segments showing the target person's back, the picture must contain the back view. When there is a single target person, the picture must show that person alone; when there is more than one target person, the user must provide a corresponding portrait picture for each, and no person other than the target persons may appear in any picture, although a single picture containing only the multiple target persons together is acceptable. To improve accuracy, the user is advised to capture pictures meeting these requirements directly from the video to be edited, which makes the comparison results more reliable.

Video segment acquisition module 12, connected to the video acquisition module 11 and the portrait feature extraction module 13: after the video file and the portrait picture are obtained and the portrait feature extracted, the persons in the video frames are compared against the extracted portrait feature to obtain the segments containing matching persons. In this process a strategy is built from the user's needs, covering the distinction between front views and back views of the target person, the number of target persons, and the distinction between target persons and other persons. If the user only wants frames containing the target person's front view, only those frames are collected; if only back views are wanted, only back-view frames are collected; if both are wanted, the relation between front-view and back-view frames is a logical OR. When there is more than one target person, the relations between them are considered: if all target persons must appear together, the relation between their portrait features is a logical AND; if any one of them suffices, the relation is a logical OR; other logical relations can be set according to the user's needs, e.g., two combined with AND and a third combined with that pair by OR. Taking a TV series as an example, if the user wants all scenes shared by a particular supporting actor and the leading actress, both must appear at the same time, the relation between the two is a logical AND, and only frames in which both portraits appear, whether front or back views, are collected. Finally, if the user wants frames containing only the target persons, the number of persons extracted from a frame must equal the number of target persons.

The process of obtaining segments of the video to be edited that contain persons matching the target portrait features is as follows. The video to be edited is split into frames and each frame is obtained. Image transformation, image enhancement, image recognition and image segmentation techniques are used to extract the portrait elements in each frame; portrait features are then extracted from these elements by sampling and compared one by one with the portrait features extracted from the target person's picture. When they match, the frame is a frame containing the target person, and consecutive matching frames form a video segment.

In the TV-series example above, the portrait features are divided into facial portrait features and body contour features. Let the supporting actor's facial portrait feature be A1 and his body contour feature A2, let the leading actress's facial portrait feature be B1 and her body contour feature B2, and let N be the number of portrait elements in a frame. Then the frames in which only the supporting actor and the leading actress appear together — whether facing the camera or seen from behind — with no other persons present, are the frames satisfying the logical relation (A1 or A2) and (B1 or B2) and (N = 2).
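The logical relation above can be expressed directly as a frame-selection predicate. This is a minimal sketch: the label names A1/A2/B1/B2 follow the example, and the detection step that would produce them is assumed, not implemented.

```python
def frame_matches(frame_features, num_people):
    """Return True when a frame contains exactly the supporting actor
    and the leading actress (frontal or back view) and nobody else.

    frame_features: set of feature labels detected in the frame,
    e.g. {"A1", "B2"}.  A1/A2 are the actor's face/body contour
    features; B1/B2 are the actress's.
    """
    actor_present = "A1" in frame_features or "A2" in frame_features
    actress_present = "B1" in frame_features or "B2" in frame_features
    # (A1 or A2) and (B1 or B2) and (N = 2)
    return actor_present and actress_present and num_people == 2
```

A frame showing the actor's back and the actress's face (`{"A2", "B1"}`, two people) matches; a frame where a third person also appears does not.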

The video splicing module 14 is connected to the video segment acquisition module 12. After the segments of the video to be edited that contain persons matching the target portrait features are obtained, the segments must be spliced together in a certain order. This may be chronological order; it may be the order of the number of persons in the frame, from fewest to most or from most to fewest, where "fewest to most" means progressing from frames containing only the target persons to frames that also contain other people, and "most to fewest" is the reverse; or it may be the order of the proportion of the video frame occupied by the target person, from smallest to largest or largest to smallest. The latter two orderings are supplemented by chronological order. Taking the proportion-based ordering as an example, the proportion of the frame occupied by a person is computed by dividing the area of that person's portrait element in the frame by the area of the frame, and is recorded as the occupancy ratio; this computation is performed after each frame is recognized. When a frame with a high occupancy ratio belongs to a selected video segment, the whole segment is treated as a unit regardless of the ratios of its other frames, and its frames are spliced in chronological order, so that splicing purely by ratio does not break the continuity of the picture. This is equivalent to comparing segments by the maximum occupancy ratio among their frames and splicing them in order of that maximum; when ratios are equal, segments are spliced in chronological order.
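The occupancy-ratio computation and the resulting splice order can be sketched as follows; the segment representation (start time plus per-frame ratios) is an assumption made for illustration.

```python
def occupancy_ratio(portrait_area, frame_area):
    """Occupancy ratio of a person in one frame:
    area of the portrait element divided by the area of the frame."""
    return portrait_area / frame_area

def splice_order(segments):
    """Order segments by the maximum occupancy ratio among their frames
    (largest first), breaking ties chronologically, as described above.

    segments: list of (start_time, [per-frame occupancy ratios]).
    """
    return sorted(segments, key=lambda seg: (-max(seg[1]), seg[0]))
```

Two segments whose peak ratios are equal keep their original time order, so a segment is never broken apart by the ratio-based sort.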

Besides splicing the obtained video segments, when the user wants a video that excludes one or more particular persons, the remaining segments of the video to be edited — everything except the obtained segments — are spliced together, and the obtained segments are discarded.

Referring to FIG. 7, in a preferred embodiment, the portrait feature extraction module 13 specifically includes:

A picture acquisition unit: after the video to be edited is obtained, a portrait picture containing portrait elements must be obtained in order to edit around the target person. The picture may be selected from pictures already on the intelligent terminal, or imported from outside the intelligent terminal and stored on it.

A portrait element recognition unit, connected to the picture acquisition unit: after the portrait picture is obtained, the portrait elements must be extracted from it, since the picture may contain a background with interfering factors. The image is first transformed from the spatial domain to the frequency domain by an image transform such as the Fourier transform, the Walsh-Hadamard transform or the discrete Karhunen-Loève transform; image enhancement then strengthens the high-frequency components of the frequency-domain image, sharpening the image edges. Once the edges are sharpened, image recognition identifies the portrait elements in the picture through feature extraction, index building and query steps, and image segmentation extracts the portrait elements. The feature extraction here relies on an external portrait database: recognition models for different portrait elements are built by sampling the portrait elements in the database, so that different elements can be distinguished. For example, a facial-portrait recognition model is built by sampling a large number of facial portraits in the database and is then used during recognition of a portrait; when a region consistent with the model is found in the picture, that region is taken to be a facial portrait element.
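The transform-then-enhance step can be sketched with a simple high-frequency boost using NumPy's FFT; the cutoff radius and gain values are illustrative assumptions, not parameters from the embodiment.

```python
import numpy as np

def enhance_edges(image, cutoff=0.1, gain=2.0):
    """Strengthen the high-frequency components of a grayscale image in
    the frequency domain, sharpening its edges.  A minimal sketch of the
    transform/enhancement step; cutoff (fraction of the image size) and
    gain are hypothetical tuning parameters.
    """
    # Spatial domain -> frequency domain, low frequencies centered.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    dist = np.hypot(y - rows / 2, x - cols / 2)
    radius = cutoff * min(rows, cols)
    # Amplify everything beyond the cutoff radius (the "high-frequency
    # mutation components"), leave the low frequencies untouched.
    boost = np.where(dist > radius, gain, 1.0)
    enhanced = np.fft.ifft2(np.fft.ifftshift(spectrum * boost))
    return np.real(enhanced)
```

Any frequency-domain transform mentioned in the text could stand in for the FFT here; the structure — transform, amplify high frequencies, transform back — is the same.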

A portrait feature extraction unit, connected to the portrait element recognition unit: after the portrait elements are extracted, their portrait features must be extracted, and the portrait pictures must be distinguished by type. When the portrait element in the picture is a person's back view, body contour features of the back view — including the body outline and the proportions of its parts — are extracted to form the first body contour feature. When the portrait element is a person's frontal portrait, facial portrait features of the frontal portrait are extracted, including skin tone, the shape and size of the facial features, their relative positions and distances, and distinguishing facial marks such as a mole at the corner of the mouth; uniform sampling of the face may also be performed, and the area of the facial portrait within the picture is recorded.

Referring to FIG. 8, in a preferred embodiment, the video segment acquisition module 12 specifically includes:

An element extraction unit: to recognize the portraits in the video to be edited, the video is first split into individual frames, and the portrait elements in each frame are extracted. The image is first transformed from the spatial domain to the frequency domain by an image transform such as the Fourier transform, the Walsh-Hadamard transform or the discrete Karhunen-Loève transform; image enhancement then strengthens the high-frequency components of the frequency-domain image, sharpening the image edges. Once the edges are sharpened, image recognition identifies the portrait elements through feature extraction, index building and query steps, and finally image segmentation extracts them. The extracted portrait elements include person back-view elements and person facial elements. The feature extraction relies on an external portrait database: recognition models for different portrait elements are built by sampling the portrait elements in the database, so that different elements can be distinguished. For example, a facial-portrait recognition model is built by sampling a large number of facial portraits in the database and is then used during recognition; when a region consistent with the model is found in the picture, that region is taken to be a facial portrait element.

A feature extraction unit, connected to the element extraction unit: after a back-view element is extracted from a frame of the video to be edited, it is sampled to extract body contour features such as the body outline and the proportions of its parts, forming the second body contour feature; the extraction method is the same as that used for the first body contour feature. After a facial element is extracted from a frame, it is sampled to extract facial portrait features — including skin tone, the shape and size of the facial features, their relative positions and distances, and distinguishing facial marks such as a mole at the corner of the mouth — forming the second facial portrait feature; the extraction method is the same as that used for the first facial portrait feature. The target person may be wearing sunglasses, in which case the sunglasses region must be removed before extracting the facial portrait features, and only the remaining facial portrait is considered. Removing the sunglasses involves building a sunglasses model of features such as outline and color by sampling a large sunglasses database, building an index from the model, and querying the facial elements against it; a region matching the sunglasses model is taken to be the sunglasses region and is deleted.

A back-view picture acquisition unit, connected to the feature extraction unit: after the second body contour feature and the first body contour feature are obtained, an index is built from the first body contour feature, the second body contour feature is scaled to the same size as the first, and the scaled feature is sampled and queried against the index — checking, for example, whether the body outlines coincide and whether the proportions of the body parts match. With a first similarity threshold of 90%, when every check meets or exceeds the threshold, the similarity between the second and first body contour features is taken to be at least the first similarity threshold, the person corresponding to the second body contour feature is taken to be the same person as the one corresponding to the first, and the frame corresponding to the second body contour feature is obtained as a person back-view frame. The first similarity threshold may be adjusted up or down, as long as a given recognition accuracy is met. Because back views are harder to recognize and errors are easy to make, a higher similarity threshold improves accuracy; to avoid missing frames when the threshold is set too high, the system may pop up any frame whose similarity falls between 85% and 90% and let the user decide whether to keep it, reducing omissions.
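The three-way decision above — keep, ask the user, or discard — can be sketched as a small classifier; the default values follow the 90% threshold and 85%–90% review band of this embodiment, and the function name is an illustrative assumption.

```python
def classify_back_view(similarity, threshold=0.90, review_band=0.05):
    """Decide how to treat a frame based on its body-contour similarity.

    Returns "keep" when similarity >= threshold, "ask_user" when it
    falls in the band just below the threshold (85%-90% by default,
    per the embodiment), and "discard" otherwise.
    """
    if similarity >= threshold:
        return "keep"
    if similarity >= threshold - review_band:
        return "ask_user"   # pop the frame up for the user to decide
    return "discard"
```

The frontal-portrait case described below is the same scheme with `threshold=0.85` and a review band of 80%–85%.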

A frontal picture acquisition unit, connected to the feature extraction unit: after the second facial portrait feature and the first facial portrait feature are obtained, an index is built from the first facial portrait feature, the second facial portrait feature is scaled to the same size as the first, and the scaled feature is compared against the index — checking, for example, whether the skin tone is the same, whether the shape, size and relative positions of the facial features match, and whether distinguishing facial marks such as a mole at the corner of the mouth are the same. With a second similarity threshold of 85%, when every check meets or exceeds the threshold, the similarity between the second and first facial portrait features is taken to be at least the second similarity threshold, the person corresponding to the second facial portrait feature is taken to be the same person as the one corresponding to the first, and the frame corresponding to the second facial portrait feature is obtained as a person frontal frame. The second similarity threshold may likewise be adjusted up or down, as long as a given recognition accuracy is met.
Because facial portraits carry more features and the comparison is more accurate, the facial comparison threshold — the second similarity threshold — is slightly lower than the body contour comparison threshold, the first similarity threshold. As with body contours, a higher similarity threshold improves accuracy; to avoid missing frames when the threshold is set too high, the system may pop up any frame whose similarity falls between 80% and 85% and let the user decide whether to keep it, reducing omissions.

A cutting unit, connected to the back-view picture acquisition unit and the frontal picture acquisition unit: after the person back-view frames and person frontal frames are obtained, the unit checks whether the adjacent frames were also obtained; when they were, the consecutive frames are treated as a unit and cut from the video to be edited, forming a video segment.
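Grouping consecutive matched frames into segments, as the cutting unit does, can be sketched as a run-length grouping over sorted frame indices; the function name and the tuple representation are illustrative.

```python
def cut_segments(matched_frames):
    """Group consecutive matched frame indices into (first, last) runs.

    matched_frames: sorted list of indices of frames whose portrait
    matched the target features.  Each run of adjacent indices is
    treated as one video segment, as the cutting unit describes.
    """
    segments = []
    for idx in matched_frames:
        if segments and idx == segments[-1][1] + 1:
            segments[-1][1] = idx          # extend the current run
        else:
            segments.append([idx, idx])    # start a new run
    return [tuple(seg) for seg in segments]
```

An isolated matched frame becomes a one-frame segment; unmatched gaps split the runs.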

Referring to FIG. 9, in a preferred embodiment, the video splicing module 14 specifically includes:

A separation unit: after the video segments — or the remaining segments of the video to be edited other than those segments — are obtained, the audio information of each segment must be separated from its video information (the video information here does not include audio). The positional relationship between the audio information and the video information is recorded, and the two are extracted to form an audio part and a video part.

A splicing unit: after the audio part and video part of each segment are obtained, the audio parts and the video parts are each spliced in order, forming a complete audio part consisting entirely of audio and a complete video part consisting entirely of video.

A synchronization unit: after the complete audio part and complete video part are obtained, they are synchronized according to the recorded positional relationship between the audio information and the video information, forming the final complete video.
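The separation/splicing/synchronization pipeline can be modeled abstractly as follows. This is a toy sketch: segments are plain lists of frames and samples, and the recorded "positional relationship" is reduced to a per-segment offset; real codecs and container formats are out of scope.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    video_frames: list    # video information (no audio)
    audio_samples: list   # audio information
    offset: float         # recorded position of the audio relative to the video

def splice_and_sync(segments):
    """Splice the audio parts and video parts separately, then record
    where each piece lands so the complete audio can be paired back
    with the complete video, mirroring the three units above.
    """
    full_video, full_audio, positions = [], [], []
    for seg in segments:
        # Remember the landing point and the recorded offset for syncing.
        positions.append((len(full_video), seg.offset))
        full_video.extend(seg.video_frames)
        full_audio.extend(seg.audio_samples)
    return full_video, full_audio, positions
```

The returned `positions` list plays the role of the recorded positional relationship used by the synchronization unit.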

Referring to FIG. 10, in a video editing system 100 based on an intelligent terminal according to another preferred embodiment of the present invention, the video editing system 100 further includes the following component between the video segment acquisition module 12 and the video splicing module 14:

A video segment screening module 15: after the video segments are obtained, the obtained segments are pushed to the user for screening in order to further improve accuracy; the user may perform delete operations to remove irrelevant segments arising from recognition errors.

Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses or adaptations that follow its general principles and include common general knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the claims.

It should be noted that the embodiments of the present invention are preferred embodiments and do not limit the invention in any form. Any person skilled in the art may use the technical content disclosed above to make changes or modifications into equivalent embodiments; any modification or equivalent variation of the above embodiments made in accordance with the technical substance of the present invention, without departing from its technical solution, remains within the scope of the technical solution of the present invention.

Claims (10)

1. A video editing method based on an intelligent terminal, characterized by comprising the following steps: obtaining a video file to be edited and storing it in the intelligent terminal; obtaining a portrait picture having person portrait elements, and extracting person portrait features of the person portrait elements; obtaining video segments of the video to be edited that contain persons matching the person portrait features; splicing the video segments, or the remaining video segments of the video to be edited other than the video segments.

2. The video editing method according to claim 1, characterized in that the step of obtaining a portrait picture having person portrait elements and extracting person portrait features of the person portrait elements comprises: obtaining a portrait picture having person portrait elements and storing it in the intelligent terminal; identifying the person portrait elements in the portrait picture; extracting a body contour feature of the person portrait elements as a first body contour feature, and extracting a facial portrait feature of the person portrait elements as a first facial portrait feature.

3. The video editing method according to claim 2, characterized in that the step of obtaining video segments of the video to be edited that contain persons matching the person portrait features comprises: splitting the video to be edited, obtaining each frame, and extracting the person portrait elements to be compared in each frame, including person back-view elements and person facial elements; extracting a body contour feature of the person back-view elements as a second body contour feature, and extracting a facial portrait feature of the person facial elements as a second facial portrait feature; comparing the second body contour feature with the first body contour feature, and obtaining, as person back-view frames, the frames corresponding to the second body contour feature when the similarity is greater than or equal to a first similarity threshold; comparing the second facial portrait feature with the first facial portrait feature, and obtaining, as person frontal frames, the frames corresponding to the second facial portrait feature when the similarity is greater than or equal to a second similarity threshold; cutting the person back-view frames and the person frontal frames from the video to be edited to form video segments.

4. The video editing method according to claim 1, characterized in that the step of splicing the video segments or the remaining video segments of the video to be edited other than the video segments comprises: separating the audio information and video information in the video segments, or in the remaining video segments of the video to be edited other than the video segments, to form audio parts and video parts; splicing the audio parts and the video parts separately to form a complete audio part and a complete video part; synchronizing the complete audio part with the complete video part.

5. The video editing method according to any one of claims 1 to 4, characterized in that, between the step of obtaining the video segments and the step of splicing, the method further comprises: pushing the video segments to a user for screening, removing irrelevant video segments.

6. A video editing system based on an intelligent terminal, characterized by comprising: a video acquisition module, which obtains a video file to be edited and stores it in the intelligent terminal; a portrait feature extraction module, which obtains a portrait picture having person portrait elements and extracts person portrait features of the person portrait elements; a video segment acquisition module, connected to the video acquisition module and the portrait feature extraction module, which obtains video segments of the video to be edited that contain persons matching the person portrait features; a video splicing module, connected to the video segment acquisition module, which splices the video segments or the remaining video segments of the video to be edited other than the video segments.

7. The video editing system according to claim 6, characterized in that the portrait feature extraction module comprises: a picture acquisition unit, which obtains a portrait picture having person portrait elements and stores it in the intelligent terminal; a portrait element recognition unit, connected to the picture acquisition unit, which identifies the person portrait elements in the portrait picture; a portrait feature extraction unit, connected to the portrait element recognition unit, which extracts a body contour feature of the person portrait elements as a first body contour feature and a facial portrait feature of the person portrait elements as a first facial portrait feature.

8. The video editing system according to claim 7, characterized in that the video segment acquisition module comprises: an element extraction unit, which splits the video to be edited, obtains each frame, and extracts the person portrait elements to be compared in each frame, including person back-view elements and person facial elements; a feature extraction unit, connected to the element extraction unit, which extracts a body contour feature of the person back-view elements as a second body contour feature and a facial portrait feature of the person facial elements as a second facial portrait feature; a back-view picture acquisition unit, connected to the feature extraction unit, which compares the second body contour feature with the first body contour feature and obtains, as person back-view frames, the frames corresponding to the second body contour feature when the similarity is greater than or equal to a first similarity threshold; a frontal picture acquisition unit, connected to the feature extraction unit, which compares the second facial portrait feature with the first facial portrait feature and obtains, as person frontal frames, the frames corresponding to the second facial portrait feature when the similarity is greater than or equal to a second similarity threshold; a cutting unit, connected to the back-view picture acquisition unit and the frontal picture acquisition unit, which cuts the person back-view frames and the person frontal frames from the video to be edited to form video segments.

9. The video editing system according to claim 6, characterized in that the video splicing module comprises: a separation unit, which separates the audio information and video information in the video segments to form audio parts and video parts; a splicing unit, connected to the separation unit, which splices the audio parts and the video parts separately to form a complete audio part and a complete video part; a synchronization unit, connected to the splicing unit, which synchronizes the complete audio part with the complete video part.

10. The video editing system according to any one of claims 6 to 9, characterized in that, between the video segment acquisition module and the video splicing module, the video editing system further comprises: a video segment screening module, which pushes the video segments to a user for screening, removing irrelevant video segments.
PCT/CN2017/095540 2017-08-02 2017-08-02 Video editing method and video editing system based on intelligent terminal Ceased WO2019023953A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/095540 WO2019023953A1 (en) 2017-08-02 2017-08-02 Video editing method and video editing system based on intelligent terminal


Publications (1)

Publication Number Publication Date
WO2019023953A1 true WO2019023953A1 (en) 2019-02-07

Family

ID=65232285


Country Status (1)

Country Link
WO (1) WO2019023953A1 (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521565A (en) * 2011-11-23 2012-06-27 浙江晨鹰科技有限公司 Garment identification method and system for low-resolution video
JP2013196518A (en) * 2012-03-21 2013-09-30 Casio Comput Co Ltd Image processing apparatus, image processing method and program
CN103577063A (en) * 2012-07-23 2014-02-12 Lg电子株式会社 Mobile terminal and control method thereof
CN103827913A (en) * 2011-09-27 2014-05-28 三星电子株式会社 Apparatus and method for clipping and sharing content in portable terminal
CN104820711A (en) * 2015-05-19 2015-08-05 深圳久凌软件技术有限公司 Video retrieval method for figure target in complex scene
CN106021496A (en) * 2016-05-19 2016-10-12 海信集团有限公司 Video search method and video search device
CN106534967A (en) * 2016-10-25 2017-03-22 司马大大(北京)智能系统有限公司 Video editing method and device


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111953919A (en) * 2019-05-17 2020-11-17 成都鼎桥通信技术有限公司 Video recording method and device of handheld terminal in video single call
CN111953919B (en) * 2019-05-17 2022-11-04 成都鼎桥通信技术有限公司 Video recording method and device of handheld terminal in video single call

Similar Documents

Publication Publication Date Title
US10872416B2 (en) Object oriented image editing
CN101706793B (en) Method and device for searching picture
CN106933465A (en) A content display method based on an intelligent desktop and an intelligent desktop terminal
JP5358083B2 (en) Person image search device and image search device
US6578040B1 (en) Method and apparatus for indexing of topics using foils
US9478054B1 (en) Image overlay compositing
CN105760461A (en) Automatic album establishing method and device
CN110866236B (en) Private picture display method, device, terminal and storage medium
US10229323B2 (en) Terminal and method for managing video file
WO2021203823A1 (en) Image classification method and apparatus, storage medium, and electronic device
CN105302315A (en) Image processing method and device
WO2013060269A1 (en) Association relationship establishment method and device
WO2010027481A1 (en) Indexing related media from multiple sources
CN105513007A (en) Mobile terminal based photographing beautifying method and system, and mobile terminal
WO2013049374A2 (en) Photograph digitization through the use of video photography and computer vision technology
CN103604271A (en) Intelligent-refrigerator based food recognition method
US20170139911A1 (en) Address book based picture matching method and terminal
CN101334780A (en) Method and system for searching figure image and recording medium for storing image metadata
CN103702117A (en) Image processing apparatus, image processing method, and program
TWI472936B (en) Human photo search system
CN110110147A (en) A kind of method and device of video frequency searching
JP2006079458A (en) Image transmission system, method, and program
CN105159959A (en) Image file processing method and system
WO2019144840A1 (en) Method and apparatus for acquiring video semantic information
CN107527604A (en) A kind of photo display methods and user terminal

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17920352

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17920352

Country of ref document: EP

Kind code of ref document: A1