
CN111210111A - Urban environment assessment method and system based on online learning and crowdsourcing data analysis - Google Patents


Info

Publication number
CN111210111A
Authority
CN
China
Prior art keywords
urban environment
model
street view
city
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911332515.0A
Other languages
Chinese (zh)
Other versions
CN111210111B (en
Inventor
张一杨
马小雯
舒元昊
张惠根
林兴萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETHIK Group Ltd
Original Assignee
CETHIK Group Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETHIK Group Ltd
Priority to CN201911332515.0A
Publication of CN111210111A
Application granted
Publication of CN111210111B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an urban environment assessment method and system based on online learning and crowdsourcing data analysis. The method comprises the following steps: collecting urban environment assessment data from a crowdsourcing data platform, and preprocessing the collected data with a crowdsourcing algorithm; constructing an urban environment assessment model, and training it with the preprocessed urban environment assessment data; and collecting new model training data, constructing an online learning algorithm, periodically optimizing the urban environment assessment model with the online learning algorithm based on the new model training data, and outputting the city impression attribute comparison result to be evaluated by using the latest urban environment assessment model. The method and system take full account of differences in evaluation among different groups of people, can update the assessment model in real time, generalize well, and significantly improve the accuracy of the assessment results.

Description

Urban environment assessment method and system based on online learning and crowdsourcing data analysis
Technical Field
The application belongs to the field of smart cities, and particularly relates to a city environment assessment method and system based on online learning and crowdsourcing data analysis.
Background
The expansion of cities and the acceleration of urbanization have brought great challenges to urban development. Smart cities have emerged to address these challenges. The urban environment, i.e. the natural and artificial external conditions that affect urban human activities, is one of the important indicators for measuring urban development. Urban environment assessment is an important component of smart city research; its ultimate purpose is to improve the urban environment, increase citizen satisfaction, assist in making relevant policies, and achieve sustainable urban development.
The urban environment includes the urban economic, social, ecological, and aesthetic environments, among others. Urban environment assessment has traditionally relied on field investigation, which is costly, inefficient, and difficult to apply from a macroscopic perspective. With the development of sensor technology and the arrival of the big data era, new data acquisition methods provide a large number of street view pictures for urban environment assessment research, and street view assessment can serve as a new branch of urban environment assessment. Street view pictures are widely distributed, large in volume, and rich in detail, and can reflect the state of a city at both the micro and macro levels. Meanwhile, introducing rapidly developing deep learning technology, especially computer vision models, into urban environment assessment tasks reduces the cost of image feature extraction, improves the authenticity of the assessment, and expands the application field of smart cities.
Research on street-view-based urban environment assessment (i.e., street view assessment) has recently become a new direction in the smart city field. At present, such research mainly focuses on improving the accuracy with which a model scores city impressions from paired street view pictures. It ignores the fact that urban environment assessment is highly subjective and that labeling results differ widely between individuals, so current assessment results vary widely and have low accuracy. In addition, existing assessment systems lack real-time model updating, cannot cope with changing street view data, and generalize poorly.
Disclosure of Invention
The application aims to provide an urban environment assessment method and system based on online learning and crowdsourcing data analysis that take full account of differences in assessment among different groups of people, can update the assessment model in real time, generalize well, and significantly improve the accuracy of the assessment results.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
An urban environment assessment method based on online learning and crowdsourcing data analysis comprises the following steps:
Step S1, collecting urban environment assessment data from a crowdsourcing data platform, and preprocessing the collected urban environment assessment data by using a crowdsourcing algorithm;
Step S2, constructing an urban environment assessment model, and training the urban environment assessment model with the preprocessed urban environment assessment data;
Step S3, collecting new model training data, constructing an online learning algorithm, periodically optimizing the urban environment assessment model by using the online learning algorithm based on the new model training data, and outputting the city impression attribute comparison result to be evaluated by using the latest urban environment assessment model.
Preferably, collecting urban environment assessment data from the crowdsourcing data platform comprises:
Step S1.1, releasing paired street view pictures on the crowdsourcing data platform;
Step S1.2, receiving the comparison result given by each annotator on the crowdsourcing data platform for each pair of street view pictures;
Step S1.3, taking the comparison result as the label generated by the current annotator for the current pair of street view pictures, and establishing a one-to-one correspondence among the annotator, the paired street view pictures, and the label.
Preferably, preprocessing the collected urban environment assessment data by using a crowdsourcing algorithm comprises:
Step S1.4, letting the total number of samples to be labeled be N, i.e., the total number of street view picture pairs is N, and obtaining from the crowdsourcing data platform the comparison results, i.e., the labels, of K annotators, the total number of label classes being I;
Step S1.5, for the nth sample, the following relation holds:
$$P(t_n = i) = p_i$$
where $P(t_n = i)$ represents the probability that the true label $t_n$ of the nth sample is i, denoted $p_i$; i represents label class i, and $i \in \{1, \dots, I\}$;
Step S1.6, each annotator corresponds to an I × I confusion matrix, and the confusion matrix of the kth annotator is denoted $\pi^{(k)}$; its entry $\pi^{(k)}_{ab}$ denotes the probability that the kth annotator labels a sample whose true label is a as b, where a represents label class a with $a \in \{1, \dots, I\}$, b represents label class b with $b \in \{1, \dots, I\}$, and
$$\sum_{b=1}^{I} \pi^{(k)}_{ab} = 1$$
Step S1.7, constructing the conditional probability as follows:
$$Q_{ni} = P(t_n = i \mid S_n, p, \pi)$$
where $S_n$ represents the nth sample; $Q_{ni}$ represents the probability that the true label $t_n$ of the nth sample is i under the current parameters; $p = \{p_1, \dots, p_i, \dots, p_I\}$ represents the set of probabilities that the true label of each sample is 1, …, i, …, I, with $p_i$ representing the probability that the true label $t_n$ of the nth sample is i, i.e. $p_i = P(t_n = i)$; and $\pi$ represents the confusion matrices of the annotators;
Step S1.8, calculating, by using the crowdsourcing algorithm, the probability set $p$ that the true label of each sample is 1, …, i, …, I and the confusion matrix $\pi$ of each annotator; from the calculated $p$ and $\pi$, the conditional probability of each sample for every comparison result can be calculated through the conditional probability formula $Q_{ni}$, and the comparison result with the maximum conditional probability is selected as the final label of the current sample.
Preferably, calculating, by using the crowdsourcing algorithm, the probability set $p$ that the true label of each sample is i and the confusion matrix $\pi$ of each annotator comprises:
the crowdsourcing algorithm adopted is the EM algorithm, and $p$ and $\pi$ are calculated by the EM algorithm as follows:
The initial value of $Q_{ni}$ is defined as:
$$Q_{ni} = \frac{\sum_{k=1}^{K} n^{(k)}_{ni}}{\sum_{k=1}^{K} \sum_{b=1}^{I} n^{(k)}_{nb}}$$
where k is the index of the annotator, $\sum_{k} n^{(k)}_{ni}$ represents the number of times all annotators label the nth sample as i, and $n^{(k)}_{nb}$ denotes the number of times the kth annotator labels the nth sample as b; thus the initial value of $Q_{ni}$ is the number of times all annotators label the nth sample as i divided by the total number of times all annotators label the nth sample;
The M step is defined as follows: constructing an auxiliary function and maximizing it by maximum likelihood estimation to update the parameters $p$ and $\pi$, where the auxiliary function is:
$$\mathcal{L}(p, \pi) = \sum_{n=1}^{N} \sum_{i=1}^{I} Q_{ni} \left[ \log p_i + \sum_{k=1}^{K} \sum_{b=1}^{I} n^{(k)}_{nb} \log \pi^{(k)}_{ib} \right]$$
with $\pi^{(k)}_{ib}$ the probability that the kth annotator labels a sample whose true label is i as b;
The E step is defined as follows: according to the Bayes formula, $Q_{ni}$ is updated from $p$ and $\pi$ by:
$$Q_{ni} = \frac{p_i \prod_{k=1}^{K} \prod_{b=1}^{I} \left(\pi^{(k)}_{ib}\right)^{n^{(k)}_{nb}}}{\sum_{i'=1}^{I} p_{i'} \prod_{k=1}^{K} \prod_{b=1}^{I} \left(\pi^{(k)}_{i'b}\right)^{n^{(k)}_{nb}}}$$
The E step and the M step are executed in a loop until the termination condition of the EM algorithm is met, yielding the final parameters $p$ and $\pi$.
Preferably, the constructing of the urban environment assessment model includes:
Step S2.1, constructing a twin (Siamese) network, the twin network consisting of two city impression scoring models that share the same weights, each city impression scoring model taking one of the paired street view pictures as input and outputting a city impression attribute score for that picture; the city impression scoring model comprises a computer vision model for extracting features from the street view picture and a fully connected layer for outputting the city impression attribute score according to the extracted features;
Step S2.2, constructing a logistic regression model on top of the twin network, the twin network and the logistic regression model together forming the urban environment assessment model; the logistic regression model takes as input the difference between the two city impression attribute scores output by the twin network, and outputs the probability that the subjective feeling evoked by the first street view picture of the pair is greater in degree than that evoked by the second street view picture; the dependent variable of the logistic regression model takes the values 0, 0.5, and 1, i.e., the output result is represented by 0, 0.5, and 1.
Preferably, collecting new model training data comprises:
1) collecting new urban environment assessment data from the crowdsourcing data platform; and
2) collecting user feedback data, the user feedback data being the user's judgment of whether the city impression attribute comparison result output by the urban environment assessment model accords with the user's own subjective impression.
Preferably, constructing an online learning algorithm and periodically optimizing the urban environment assessment model by using the online learning algorithm based on the new model training data comprises:
Step S3.1, letting x denote the input of the urban environment assessment model, i.e., the paired street view pictures input to the twin network; f denote the urban environment assessment model; θ denote the parameters of the urban environment assessment model; and $y_p$ denote the output of the urban environment assessment model, i.e., the probability, output by the logistic regression model, that the subjective feeling evoked by the first street view picture of the pair is greater than that evoked by the second street view picture; the urban environment assessment model can then be expressed as:
$$y_p = f(x \mid \theta)$$
Step S3.2, establishing a loss function as follows:
Figure BDA0002330046700000041
where $y_t$ represents the actual city impression attribute comparison result of the paired street view pictures;
Step S3.3, differentiating the loss function to obtain the gradient ξ with respect to all parameters; after one stochastic gradient descent step, the parameters of the urban environment assessment model are updated to θ′ as follows:
$$\theta' = \theta - \eta \xi$$
where η is the learning rate.
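The update rule of steps S3.1 to S3.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the loss is assumed to be cross-entropy (the source gives the loss only as an image), the scoring branch is reduced to a linear function over precomputed feature vectors, and the names `predict`, `sgd_step`, `theta`, and `sample` are hypothetical.

```python
import math


def predict(theta, x):
    """y_p = f(x | theta); x is a pair (features_a, features_b)."""
    fa, fb = x
    # Shared linear scorer applied to both pictures, then score difference.
    diff = sum(t * (a - b) for t, a, b in zip(theta, fa, fb))
    return 1.0 / (1.0 + math.exp(-diff))


def sgd_step(theta, x, y_t, eta=0.1):
    """One online update theta' = theta - eta * xi for a single new sample.

    Assumes a cross-entropy loss, whose gradient with respect to theta is
    (y_p - y_t) * (features_a - features_b) for this toy model.
    """
    fa, fb = x
    y_p = predict(theta, x)
    xi = [(y_p - y_t) * (a - b) for a, b in zip(fa, fb)]
    return [t - eta * g for t, g in zip(theta, xi)]


theta = [0.0, 0.0]
sample = ([1.0, 0.2], [0.4, 0.9])   # hypothetical feature vectors
before = predict(theta, sample)
for _ in range(100):
    theta = sgd_step(theta, sample, y_t=1.0)
after = predict(theta, sample)
```

With the target set to 1 (the first picture evokes the stronger feeling), repeated updates move the predicted probability from its starting value of 0.5 toward 1.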
The application also provides an urban environment assessment system based on online learning and crowdsourcing data analysis, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the urban environment assessment method based on online learning and crowdsourcing data analysis according to any of the above technical solutions.
In the urban environment assessment method and system based on online learning and crowdsourcing data analysis provided by the application, the crowdsourcing algorithm improves the reliability of the labeled data and accounts for differences between annotators, and the online learning algorithm periodically optimizes the assessment model, so that the assessment model generalizes well while the accuracy of the assessment results is significantly improved.
Drawings
Fig. 1 is a flowchart of an urban environment assessment method based on online learning and crowd-sourced data analysis according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
It should be understood that steps in this application are not limited to being performed in the exact order described, and that steps may be performed in other orders, unless explicitly stated otherwise. Moreover, at least a portion of the steps may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternating with other steps or at least a portion of the sub-steps or stages of other steps.
In one embodiment, an urban environment assessment method based on online learning and crowdsourcing data analysis is provided. It derives people's subjective impressions and evaluations of a city from visual information, such as the urban landscape, provided by street view images, so that these impressions and evaluations can serve as important indicators for city construction.
As shown in fig. 1, the urban environment assessment method based on online learning and crowd-sourced data analysis includes:
Step S1, collecting urban environment assessment data from the crowdsourcing data platform, and preprocessing the collected urban environment assessment data by using a crowdsourcing algorithm.
Data acquired through a crowdsourcing data platform are comprehensive and come from a wide range of people, which benefits the training of the model. Collecting data through the crowdsourcing data platform mainly involves the following steps:
Step S1.1, releasing paired street view pictures on the crowdsourcing data platform. After the pictures are released, annotators are asked to give a comparison result indicating which picture of the pair evokes the stronger subjective feeling with respect to a specific city impression attribute (such as safety or aesthetics).
Step S1.2, receiving the comparison result given by each annotator on the crowdsourcing data platform for each pair of street view pictures.
Step S1.3, taking the comparison result as the label generated by the current annotator for the current pair of street view pictures, and establishing a one-to-one correspondence among the annotator, the paired street view pictures, and the label.
To ensure the comprehensiveness of the comparison results, in one embodiment the comparison result is divided into three categories, i.e., the label has three categories. If one street view picture of the pair is taken as the first street view picture and the other as the second street view picture, the three categories of comparison result are as follows:
label category 1: the subjective feeling evoked by the first street view picture is greater in degree than that evoked by the second street view picture; label category 2: the subjective feeling evoked by the first street view picture is equal in degree to that evoked by the second street view picture; label category 3: the subjective feeling evoked by the first street view picture is less in degree than that evoked by the second street view picture.
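The correspondence among annotator, picture pair, and label from step S1.3, together with the three label categories, could be represented as in the following sketch. All class, field, and function names (`Comparison`, `Annotation`, `group_by_pair`) are illustrative assumptions rather than anything specified in the source.

```python
from dataclasses import dataclass
from enum import IntEnum


class Comparison(IntEnum):
    """The three label categories (member names are illustrative)."""
    FIRST_GREATER = 1   # first picture evokes the stronger feeling
    EQUAL = 2           # both pictures evoke the same degree of feeling
    FIRST_LESS = 3      # first picture evokes the weaker feeling


@dataclass(frozen=True)
class Annotation:
    """One (annotator, picture pair, label) record, as in step S1.3."""
    annotator_id: int
    pair_id: int        # index of the street view picture pair (the sample)
    attribute: str      # city impression attribute, e.g. "safety"
    label: Comparison


def group_by_pair(annotations):
    """Collect all (annotator, label) entries for each pair, keyed by pair_id."""
    grouped = {}
    for a in annotations:
        grouped.setdefault(a.pair_id, []).append((a.annotator_id, a.label))
    return grouped


records = [
    Annotation(0, 0, "safety", Comparison.FIRST_GREATER),
    Annotation(1, 0, "safety", Comparison.EQUAL),
    Annotation(0, 1, "safety", Comparison.FIRST_LESS),
]
by_pair = group_by_pair(records)
```

Grouping by pair makes it explicit that one sample can carry several, possibly conflicting, labels, which is the situation the preprocessing step has to resolve.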
In addition, to ensure the comprehensiveness of the collected data, an annotator's subjective feeling may be shaped by the annotator's own aesthetic standards or by specific professional knowledge (such as industry evaluation standards).
Because the data collected on the crowdsourcing data platform involve a wide range of people, and multiple annotators label the same pair of street view pictures, multiple comparison results (which may be the same or different) are obtained for the same pair, so the final label of each pair needs to be determined by integrating multiple pieces of information.
The prior art usually uses "majority voting", but this method ignores the differences between individual annotators. To avoid the adverse effect of these differences on the output of the final assessment model, in one embodiment a crowdsourcing algorithm models the annotators' ability and the comparison results separately, and the EM algorithm solves for the model parameters to complete the data preprocessing. The crowdsourcing algorithm can also account for bad annotators who deliberately assign labels at random: the annotator ability model can be used to screen them out and reduce their influence on the comparison results.
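For contrast, the majority-voting baseline mentioned above can be written in a few lines. This is a generic sketch with made-up votes, not code from the source; note that every annotator's vote counts equally, regardless of how reliable that annotator is.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the given votes."""
    return Counter(labels).most_common(1)[0][0]


# Labels for one picture pair from five hypothetical annotators
# (1: first > second, 2: equal, 3: first < second).
votes = [1, 3, 3, 1, 3]
final = majority_vote(votes)
```

If the three annotators who chose category 3 happened to be the unreliable ones, majority voting would still return 3, which is the weakness the annotator ability model is meant to address.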
Preprocessing the collected urban environment assessment data by using the crowdsourcing algorithm comprises the following steps:
Step S1.4, letting the total number of samples to be labeled be N, i.e., the total number of street view picture pairs is N, and obtaining from the crowdsourcing data platform the comparison results, i.e., the labels, of K annotators, the total number of label classes being I; since the comparison result is divided into three categories as described above, I = 3 in this embodiment.
Step S1.5, for the nth sample, the following relation holds:
$$P(t_n = i) = p_i$$
where $P(t_n = i)$ represents the probability that the true label $t_n$ of the nth sample is i, denoted $p_i$; i represents label class i, and $i \in \{1, \dots, I\}$;
Step S1.6, each annotator corresponds to an I × I confusion matrix, and the confusion matrix of the kth annotator is denoted $\pi^{(k)}$; its entry $\pi^{(k)}_{ab}$ denotes the probability that the kth annotator labels a sample whose true label is a as b, where a represents label class a with $a \in \{1, \dots, I\}$, b represents label class b with $b \in \{1, \dots, I\}$, and
$$\sum_{b=1}^{I} \pi^{(k)}_{ab} = 1$$
Step S1.7, constructing the conditional probability as follows:
$$Q_{ni} = P(t_n = i \mid S_n, p, \pi)$$
where $S_n$ represents the nth sample; $Q_{ni}$ represents the probability that the true label $t_n$ of the nth sample is i under the current parameters; $p = \{p_1, \dots, p_i, \dots, p_I\}$ represents the set of probabilities that the true label of each sample is 1, …, i, …, I, with $p_i$ representing the probability that the true label $t_n$ of the nth sample is i, i.e. $p_i = P(t_n = i)$; and $\pi$ represents the confusion matrices of the annotators;
Step S1.8, calculating, by using the crowdsourcing algorithm, the probability set $p$ that the true label of each sample is 1, …, i, …, I and the confusion matrix $\pi$ of each annotator; from the calculated $p$ and $\pi$, the conditional probability of each sample for every comparison result can be calculated through the conditional probability formula $Q_{ni}$, and the comparison result with the maximum conditional probability is selected as the final label of the current sample.
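The final selection in step S1.8 amounts to taking, for each sample, the label class with the largest conditional probability. A minimal sketch, assuming the conditional probabilities are already available as a list of rows (the function name `final_labels` and the example values are hypothetical):

```python
def final_labels(posteriors):
    """Pick, for each sample, the label class with the largest Q_ni.

    posteriors[n][i] holds Q_ni for label class i + 1 (classes are 1-based).
    """
    labels = []
    for q_n in posteriors:
        best_index = max(range(len(q_n)), key=lambda i: q_n[i])
        labels.append(best_index + 1)
    return labels


# Hypothetical conditional probabilities for two samples and I = 3 classes.
Q = [[0.7, 0.2, 0.1],
     [0.1, 0.3, 0.6]]
chosen = final_labels(Q)
```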
For calculating the parameters $p$ (the label-class probabilities) and $\pi$ (the annotator confusion matrices), the EM algorithm is taken as an example, since it suits the application environment of the crowdsourcing algorithm in this embodiment. It should be understood that this embodiment is not limited to the EM algorithm.
The parameters of the EM algorithm are defined as follows: the number of samples is assumed to be N; the observed values are $x^{(j)}$, where j runs from 1 to N and denotes the sample index; and the hidden variable $\gamma_{jk}$ takes the value 1 when the jth sample is labeled by the kth-class annotator, and the value 0 otherwise.
The main steps of the EM algorithm are parameter initialization followed by repetition of the E step and the M step until the model converges. The parameter initialization, the E step, and the M step are defined as follows.
The initial value of $Q_{ni}$ is defined as:
$$Q_{ni} = \frac{\sum_{k=1}^{K} n^{(k)}_{ni}}{\sum_{k=1}^{K} \sum_{b=1}^{I} n^{(k)}_{nb}}$$
where k is the index of the annotator, $\sum_{k} n^{(k)}_{ni}$ represents the number of times all annotators label the nth sample as i, and $n^{(k)}_{nb}$ denotes the number of times the kth annotator labels the nth sample as b; thus the initial value of $Q_{ni}$ is the number of times all annotators label the nth sample as i divided by the total number of times all annotators label the nth sample;
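The initialization described above (the share of all labels on a sample that equal class i) can be sketched as follows. The nested-list layout `counts[n][k][b]` and the function name are assumptions made for illustration, and the sketch assumes every sample has received at least one label.

```python
def initial_posteriors(counts):
    """counts[n][k][b]: times annotator k labeled sample n as class b + 1."""
    posteriors = []
    for sample_counts in counts:
        num_classes = len(sample_counts[0])
        # Number of times all annotators labeled this sample as each class.
        per_class = [sum(annot[i] for annot in sample_counts)
                     for i in range(num_classes)]
        total = sum(per_class)   # total number of labels on this sample
        posteriors.append([c / total for c in per_class])
    return posteriors


# One sample, three annotators, I = 3: two chose class 1, one chose class 3.
counts = [[[1, 0, 0], [1, 0, 0], [0, 0, 1]]]
Q0 = initial_posteriors(counts)
```

For this example the initial values are 2/3 for class 1, 0 for class 2, and 1/3 for class 3.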
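The M step is defined as follows: constructing an auxiliary function and maximizing it by maximum likelihood estimation to update the parameters $p$ and $\pi$, where the auxiliary function is:
$$\mathcal{L}(p, \pi) = \sum_{n=1}^{N} \sum_{i=1}^{I} Q_{ni} \left[ \log p_i + \sum_{k=1}^{K} \sum_{b=1}^{I} n^{(k)}_{nb} \log \pi^{(k)}_{ib} \right]$$
with $n^{(k)}_{nb}$ the number of times the kth annotator labels the nth sample as b, and $\pi^{(k)}_{ib}$ the probability that the kth annotator labels a sample whose true label is i as b;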
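As a hedged illustration of the M step: if the auxiliary function is taken to be the expected complete-data log-likelihood of the standard Dawid-Skene annotator model (an assumption, since the source gives the formula only as an image), then maximizing it subject to the constraints $\sum_i p_i = 1$ and $\sum_b \pi^{(k)}_{ab} = 1$ yields closed-form updates. Here $Q_{na}$ is the current probability that the true label of sample n is a, and $n^{(k)}_{nb}$ is the number of times annotator k labels sample n as b.

```latex
% Class prior: average posterior mass assigned to class i.
p_i = \frac{1}{N} \sum_{n=1}^{N} Q_{ni}

% Confusion matrix of annotator k: posterior-weighted label counts,
% normalized so that each row a sums to 1.
\pi^{(k)}_{ab} =
  \frac{\sum_{n=1}^{N} Q_{na} \, n^{(k)}_{nb}}
       {\sum_{b'=1}^{I} \sum_{n=1}^{N} Q_{na} \, n^{(k)}_{nb'}}
```

Both updates follow from setting the derivative of the Lagrangian to zero; each is a normalized weighted count.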
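The E step is defined as follows: according to the Bayes formula, the value of $Q_{ni}$ is updated from $p$ and $\pi$ by:
$$Q_{ni} = \frac{p_i \prod_{k=1}^{K} \prod_{b=1}^{I} \left(\pi^{(k)}_{ib}\right)^{n^{(k)}_{nb}}}{\sum_{i'=1}^{I} p_{i'} \prod_{k=1}^{K} \prod_{b=1}^{I} \left(\pi^{(k)}_{i'b}\right)^{n^{(k)}_{nb}}}$$
where $n^{(k)}_{nb}$ is the number of times the kth annotator labels the nth sample as b, and $\pi^{(k)}_{ib}$ is the probability that the kth annotator labels a sample whose true label is i as b.
The E step and the M step are executed in a loop until the termination condition of the EM algorithm is met, yielding the final parameters $p$ and $\pi$.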
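The whole EM loop can be sketched in plain Python as below. This is an illustrative implementation of the standard Dawid-Skene procedure under several assumptions not stated in the source: a fixed number of iterations stands in for the termination condition, a small constant `eps` smooths the confusion matrices to avoid zero rows, and the function name, data layout, and toy labels are all hypothetical.

```python
def dawid_skene(counts, num_iters=50, eps=1e-9):
    """counts[n][k][b]: times annotator k labeled sample n as class b + 1.

    Returns (p, pi, Q): class priors, per-annotator confusion matrices,
    and per-sample probabilities over the true label.
    """
    N, K, I = len(counts), len(counts[0]), len(counts[0][0])

    # Initialization: Q_ni = share of all labels on sample n equal to i.
    Q = []
    for n in range(N):
        per_class = [sum(counts[n][k][i] for k in range(K)) for i in range(I)]
        total = sum(per_class)
        Q.append([c / total for c in per_class])

    for _ in range(num_iters):
        # M step: closed-form maximizers (eps smoothing is an assumption).
        p = [sum(Q[n][i] for n in range(N)) / N for i in range(I)]
        pi = []
        for k in range(K):
            matrix = []
            for a in range(I):
                row = [sum(Q[n][a] * counts[n][k][b] for n in range(N)) + eps
                       for b in range(I)]
                row_sum = sum(row)
                matrix.append([v / row_sum for v in row])
            pi.append(matrix)

        # E step: Bayes' rule, Q_ni proportional to
        # p_i * prod_k prod_b pi[k][i][b] ** counts[n][k][b].
        for n in range(N):
            weights = []
            for i in range(I):
                w = p[i]
                for k in range(K):
                    for b in range(I):
                        w *= pi[k][i][b] ** counts[n][k][b]
                weights.append(w)
            z = sum(weights)
            Q[n] = [w / z for w in weights]
    return p, pi, Q


def one_hot(label, num_classes=3):
    """Count vector for a single label (1-based class index)."""
    return [1 if b + 1 == label else 0 for b in range(num_classes)]


# Toy data: annotators 0 and 1 agree everywhere; annotator 2 deviates twice.
labels_by_annotator = [[1, 1, 3, 3, 2], [1, 1, 3, 3, 2], [3, 1, 1, 3, 2]]
counts = [[one_hot(labels_by_annotator[k][n]) for k in range(3)]
          for n in range(5)]
p, pi, Q = dawid_skene(counts)
```

On this toy data the two consistent annotators end up with near-diagonal confusion matrices, and the disputed samples are assigned the label those two annotators agree on.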
In this way, the urban environment assessment data are preprocessed so that differences in the subjective feelings of different annotators are taken into account when the final label of each pair of street view pictures is selected, which effectively improves the reliability of the data and lays a foundation for the accuracy of the assessment model's output.
Step S2, constructing an urban environment assessment model, and training the urban environment assessment model with the preprocessed urban environment assessment data.
Because the street view pictures come in pairs, a twin network is used for recognition. In one embodiment, the urban environment assessment model is constructed as follows:
Step S2.1, constructing a twin (Siamese) network, the twin network consisting of two city impression scoring models that share the same weights, each city impression scoring model taking one of the paired street view pictures as input and outputting a city impression attribute score for that picture.
The city impression scoring model comprises a computer vision model (such as a VGG model) for extracting features from the street view picture and a fully connected layer for outputting the city impression attribute score according to the extracted features.
The twin network adopts a structure from the prior art and is not the focus of the improvement of the application.
Step S2.2, constructing a logistic regression model on top of the twin network, the twin network and the logistic regression model together forming the urban environment assessment model; the logistic regression model takes as input the difference between the two city impression attribute scores output by the twin network, and outputs the probability that the subjective feeling evoked by the first street view picture of the pair is greater in degree than that evoked by the second street view picture; the dependent variable of the logistic regression model takes the values 0, 0.5, and 1, i.e., the output result is represented by 0, 0.5, and 1.
The dependent variable values 0, 0.5, and 1 of the urban environment assessment model respectively represent a probability of 0%, 50%, and 100% that the subjective feeling evoked by the first street view picture is greater than that evoked by the second street view picture; the corresponding crowdsourced label categories are that the subjective feeling evoked by the first street view picture is less than, equal to, or greater than that evoked by the second street view picture.
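The structure of steps S2.1 and S2.2 can be sketched as follows. This is a toy illustration, not the model of the application: each scoring branch is replaced by a linear function over precomputed feature vectors (a real branch would extract features with a vision model such as VGG), and the names `score`, `compare`, `scale`, `offset`, and the parameter values are hypothetical.

```python
import math


def score(features, weights, bias):
    """Stand-in for one city impression scoring branch: a linear layer
    over precomputed image features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def compare(features_a, features_b, weights, bias, scale=1.0, offset=0.0):
    """Twin network plus logistic regression: both branches share the same
    weights, and the score difference is mapped to the probability that
    the first picture evokes the stronger feeling."""
    diff = score(features_a, weights, bias) - score(features_b, weights, bias)
    return 1.0 / (1.0 + math.exp(-(scale * diff + offset)))


# Label categories 1 (greater), 2 (equal), 3 (less) mapped to the
# dependent variable values 1, 0.5, and 0.
LABEL_TO_TARGET = {1: 1.0, 2: 0.5, 3: 0.0}

w, b = [0.8, -0.3], 0.1            # hypothetical shared parameters
pic_a, pic_b = [1.0, 0.2], [0.4, 0.9]
prob = compare(pic_a, pic_b, w, b)
same = compare(pic_a, pic_a, w, b)
```

Because the two branches share weights, identical inputs always give a probability of 0.5, and with a zero offset swapping the two pictures turns the probability `prob` into `1 - prob`.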
Step S3, collecting new model training data, constructing an online learning algorithm, periodically optimizing the urban environment assessment model with the online learning algorithm on the new model training data, and using the latest urban environment assessment model to output the city impression attribute comparison result for the pictures to be evaluated.
The urban environment assessment model is updated periodically to keep its output accurate over time and to adapt to changes in the environment and in people's subjective feelings; the model can also be optimized in a targeted manner to fit the subjective feelings of particular users.
The new model training data accordingly consists of the following two parts:
1) New urban environment assessment data collected from the crowd-sourced data platform.
2) User feedback data, that is, each user's judgment of whether the city impression comparison result output by the urban environment assessment model matches that user's own subjective impression.
User feedback data can be collected, for example, by building an urban travel route recommendation system on top of the urban environment assessment model: the system presents paired street view pictures together with the model's prediction, and the user judges whether the comparison result matches his or her own subjective impression. In this way the urban environment assessment result can be optimized for a specific user.
The goal of the online learning algorithm is to further optimize the urban environment assessment model with the new data obtained in the previous step. In one embodiment, to ensure that online learning optimizes the model accurately, the online learning method used is stochastic gradient descent. The basic idea is as follows: for each newly input sample, the existing model produces a prediction; a loss function is constructed from the true result of the sample and the model's prediction; and the model parameters are then updated by gradient descent.
The specific steps are as follows:
Step S3.1, letting x denote the input of the urban environment assessment model, that is, the pair of street view pictures fed to the twin network; letting f denote the urban environment assessment model; letting θ denote the parameters of the model; and letting $y_p$ denote the output of the model, that is, the probability output by the logistic regression model that the subjective feeling evoked by the first street view picture of the pair is stronger than that evoked by the second. The urban environment assessment model can then be written as:
$y_p = f(x \mid \theta)$
Step S3.2, establishing a loss function as follows:
Figure BDA0002330046700000091
where $y_t$ denotes the actual city impression attribute comparison result for the pair of street view pictures;
Step S3.3, differentiating the loss function to obtain the gradient ξ with respect to all parameters; after the stochastic gradient descent step, the parameters of the urban environment assessment model are updated to θ′ as follows:
$\theta' = \theta - \eta \xi$
where η is the learning rate.
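The per-sample update in steps S3.1 to S3.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the loss formula of step S3.2 is given as an image that is not reproduced in this text, so binary cross-entropy with a soft target is assumed here; the model f is reduced to the logistic regression on a score difference d, with hypothetical parameters θ = (w, b); and the sample stream is invented for the example.

```python
import math


def predict(d: float, theta: tuple) -> float:
    """y_p = f(x | theta), reduced here to a logistic model on the score difference d."""
    w, b = theta
    z = w * d + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgd_step(theta: tuple, d: float, y_t: float, eta: float = 0.1) -> tuple:
    """One online update theta' = theta - eta * xi for a single new sample.

    Assumed loss: L = -[y_t * log(y_p) + (1 - y_t) * log(1 - y_p)],
    whose gradient with respect to (w, b) is ((y_p - y_t) * d, y_p - y_t).
    """
    y_p = predict(d, theta)
    xi = ((y_p - y_t) * d, (y_p - y_t))
    return tuple(t - eta * g for t, g in zip(theta, xi))


# Hypothetical stream of new samples: (score difference, target in {0, 0.5, 1}).
stream = [(1.5, 1.0), (-2.0, 0.0), (0.1, 0.5), (2.5, 1.0)]

theta = (0.0, 0.0)
first_update = sgd_step(theta, *stream[0])
for d, y_t in stream:
    theta = sgd_step(theta, d, y_t)
```

In the full model the gradient would also flow back into the twin network's weights; only the last layer is updated here to keep the sketch short.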
Online learning optimization is performed periodically to keep the model at its best, which improves the accuracy of the city impression attribute comparison results and provides reliable data for urban construction.
In another embodiment, a city environment assessment system based on online learning and crowd-sourced data analysis is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the city environment assessment method based on online learning and crowd-sourced data analysis according to any embodiment when executing the computer program.
In this embodiment, an urban environment assessment system based on online learning and crowd-sourced data analysis is a computer device, which may be a terminal. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a city environment assessment method based on online learning and crowd-sourced data analysis. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
The urban environment assessment system based on online learning and crowdsourcing data analysis provided by the embodiment comprises a crowdsourcing data platform, an urban environment evaluation model and an online learning module, and aims to obtain reliable urban environment evaluation data through a crowdsourcing method and establish an online learning mechanism to achieve the aim of updating the model in real time.
For further limitation of the urban environment assessment system based on online learning and crowd-sourced data analysis, reference may be made to the above-mentioned limitation on the urban environment assessment method based on online learning and crowd-sourced data analysis, and details are not repeated.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (8)

1. An urban environment assessment method based on online learning and crowd-sourced data analysis, characterized in that the method comprises the following steps:
step S1, collecting urban environment assessment data from the crowd-sourced data platform, and preprocessing the collected urban environment assessment data by using a crowd-sourced algorithm;
step S2, constructing an urban environment assessment model, and training the urban environment assessment model by adopting preprocessed urban environment assessment data;
step S3, collecting new model training data, constructing an online learning algorithm, periodically optimizing the urban environment assessment model with the online learning algorithm on the new model training data, and using the latest urban environment assessment model to output the city impression attribute comparison result for the pictures to be evaluated.
2. The urban environment assessment method based on online learning and crowdsourcing data analysis of claim 1, wherein collecting urban environment assessment data from a crowdsourcing data platform comprises:
step S1.1, publishing paired street view pictures on the crowd-sourced data platform;
step S1.2, receiving, from annotators on the crowd-sourced data platform, a comparison result for each pair of street view pictures;
step S1.3, taking the comparison result as the label produced by the current annotator for the current pair of street view pictures, and establishing a one-to-one correspondence among the annotator, the pair of street view pictures and the label.
3. The urban environment assessment method based on online learning and crowdsourcing data analysis according to claim 2, wherein preprocessing the collected urban environment assessment data by using a crowdsourcing algorithm comprises:
step S1.4, letting the total number of samples to be labelled be N, that is, the total number of street view picture pairs is N, and obtaining from the crowd-sourced data platform the comparison results, that is, the labels, of K annotators, wherein the total number of label classes is I;
step S1.5, for the nth sample, the following relation holds:
Figure FDA0002330046690000011
wherein P denotes the probability that the true label of the nth sample
Figure FDA0002330046690000012
is i, denoted
Figure FDA0002330046690000013
and i denotes label class i, with i ∈ {1, ..., I};
step S1.6, each annotator corresponds to an I × I confusion matrix, and the confusion matrix of the kth annotator is denoted $\pi^{(k)}$;
Figure FDA0002330046690000014
denotes the probability that the kth annotator labels a sample whose true label is a as b, where a denotes label class a with a ∈ {1, ..., I}, b denotes label class b with b ∈ {1, ..., I}, and
Figure FDA0002330046690000015
step S1.7, constructing conditional probabilities as follows:
Figure FDA0002330046690000016
wherein $S_n$ denotes the nth sample, and $Q_{ni}$ denotes, under the current parameters, the probability that the true label of the nth sample
Figure FDA0002330046690000017
is i;
Figure FDA0002330046690000018
denotes the set of probabilities that the true label corresponding to each sample is i;
Figure FDA0002330046690000019
denotes the probability that the true label of the nth sample
Figure FDA0002330046690000021
is i, where i
Figure FDA0002330046690000022
and π denotes the confusion matrices corresponding to the annotators;
step S1.8, calculating, with a crowdsourcing algorithm, the probability set
Figure FDA0002330046690000023
and the confusion matrix π corresponding to each annotator; from the calculated
Figure FDA0002330046690000024
and π, the conditional probability formula for $Q_{ni}$ gives the conditional probability of every comparison result for each sample, and the comparison result with the largest conditional probability is selected as the final label of that sample.
4. The urban environment assessment method based on online learning and crowdsourcing data analysis according to claim 3, wherein calculating, with the crowdsourcing algorithm, the probability set of the true label i corresponding to each sample
Figure FDA0002330046690000025
and the confusion matrix π corresponding to each annotator comprises:
the crowdsourcing algorithm adopted is the EM algorithm, which is used to calculate
Figure FDA0002330046690000026
and π, the calculation proceeding as follows:
the initial value of $Q_{ni}$ is defined as:
Figure FDA0002330046690000027
wherein k denotes the index of the annotator,
Figure FDA0002330046690000028
denotes the number of times all annotators labelled the nth sample as i, and
Figure FDA0002330046690000029
denotes the number of times the kth annotator labelled the nth sample as b; the initial value of $Q_{ni}$ is thus the number of times all annotators labelled the nth sample as i divided by the total number of times all annotators labelled the nth sample;
the M step is defined as follows: constructing an auxiliary function and maximizing it by maximum likelihood estimation to update the parameters
Figure FDA00023300466900000210
and π, wherein the auxiliary function is:
Figure FDA00023300466900000211
the E step is defined as follows: according to Bayes' formula, updating $Q_{ni}$ from
Figure FDA00023300466900000212
and π, the update formula being:
Figure FDA00023300466900000213
the E step and the M step are executed in a loop until the termination condition of the EM algorithm is met, yielding the final parameters
Figure FDA00023300466900000214
and π.
5. The urban environment assessment method based on online learning and crowd-sourced data analysis according to claim 1, wherein the building of the urban environment assessment model comprises:
step S2.1, constructing a twin network, wherein the twin network consists of two city impression scoring models with shared weights, and each city impression scoring model takes one picture of a pair of street view pictures as input and outputs the city impression attribute score of that picture; the city impression scoring model comprises a computer vision model that extracts features from the street view picture and a fully connected layer that outputs the city impression attribute score from the extracted features;
step S2.2, constructing a logistic regression model on top of the twin network, wherein the twin network and the logistic regression model together form the urban environment assessment model, the logistic regression model takes as input the difference between the two city impression attribute scores output by the twin network and outputs the probability that the subjective feeling evoked by the first street view picture of the pair is stronger than that evoked by the second, and the dependent variable of the logistic regression model takes the values 0, 0.5 and 1, that is, the target output is expressed as 0, 0.5 or 1.
6. The method for urban environment assessment based on online learning and crowd-sourced data analysis according to claim 5, wherein the collecting new model training data comprises:
1) collecting new urban environment evaluation data from a crowd-sourced data platform;
2) collecting user feedback data, wherein the user feedback data is each user's judgment of whether the city impression attribute comparison result output by the urban environment assessment model matches that user's own subjective impression.
7. The urban environment assessment method based on online learning and crowdsourcing data analysis according to claim 6, wherein constructing an online learning algorithm and periodically optimizing the urban environment assessment model with the online learning algorithm on the new model training data comprises:
step S3.1, letting x denote the input of the urban environment assessment model, that is, the pair of street view pictures fed to the twin network; letting f denote the urban environment assessment model; letting θ denote the parameters of the model; and letting $y_p$ denote the output of the model, that is, the probability output by the logistic regression model that the subjective feeling evoked by the first street view picture of the pair is stronger than that evoked by the second, so that the urban environment assessment model can be written as:
$y_p = f(x \mid \theta)$
step S3.2, establishing a loss function as follows:
Figure FDA0002330046690000031
wherein $y_t$ denotes the actual city impression attribute comparison result for the pair of street view pictures;
step S3.3, differentiating the loss function to obtain the gradient ξ with respect to all parameters, and, after the stochastic gradient descent step, updating the parameters of the urban environment assessment model to θ′ as follows:
$\theta' = \theta - \eta \xi$
wherein η is the learning rate.
8. An urban environment assessment system based on online learning and crowd-sourced data analysis, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the urban environment assessment method based on online learning and crowd-sourced data analysis according to any one of claims 1 to 7 when executing the computer program.
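The label aggregation recited in claims 3 and 4 can be illustrated with a short Python sketch. Because the formulas in those claims are given as images that are not reproduced in this text, the updates below follow the standard Dawid-Skene style EM scheme, which matches the described structure (vote-share initialization of Q, an M step that re-estimates class probabilities and per-annotator confusion matrices, and a Bayes-rule E step) but is an assumption as to the exact formulas. The function name, the fixed iteration count used in place of the unspecified termination condition, the `eps` smoothing and the toy data are all hypothetical. Every sample is assumed to have at least one label.

```python
def aggregate_labels(labels, num_classes, iterations=20, eps=1e-9):
    """EM label aggregation; labels[n] is a list of (annotator, label) pairs."""
    annotators = sorted({k for votes in labels for k, _ in votes})

    # Initial Q[n][i]: share of the votes on sample n that chose class i.
    Q = []
    for votes in labels:
        counts = [0.0] * num_classes
        for _, b in votes:
            counts[b] += 1.0
        total = sum(counts)
        Q.append([c / total for c in counts])

    for _ in range(iterations):
        # M step: class probabilities and confusion matrices pi[k][a][b],
        # the probability that annotator k labels a true-class-a sample as b.
        prior = [sum(q[i] for q in Q) / len(Q) for i in range(num_classes)]
        pi = {k: [[eps] * num_classes for _ in range(num_classes)]
              for k in annotators}
        for q, votes in zip(Q, labels):
            for k, b in votes:
                for a in range(num_classes):
                    pi[k][a][b] += q[a]
        for k in annotators:
            for a in range(num_classes):
                row_sum = sum(pi[k][a])
                pi[k][a] = [v / row_sum for v in pi[k][a]]

        # E step: Bayes update of Q[n][i] from the class probabilities and pi.
        for n, votes in enumerate(labels):
            post = list(prior)
            for k, b in votes:
                for i in range(num_classes):
                    post[i] *= pi[k][i][b]
            norm = sum(post)
            Q[n] = [v / norm for v in post]

    # Final label of each sample: the class with the largest probability.
    final = [max(range(num_classes), key=lambda i: q[i]) for q in Q]
    return final, Q, prior, pi


# Toy data: classes 0 = less, 1 = equal, 2 = greater; annotator 2 is less reliable.
toy_labels = [
    [(0, 2), (1, 2), (2, 0)],
    [(0, 0), (1, 0), (2, 0)],
    [(0, 1), (1, 1), (2, 2)],
    [(0, 2), (1, 2), (2, 2)],
]
final, Q, prior, pi = aggregate_labels(toy_labels, num_classes=3)
```

On this toy data the aggregated labels agree with the majority vote; the two can differ when the estimated confusion matrices show that some annotators are systematically unreliable.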
CN201911332515.0A 2019-12-22 2019-12-22 An urban environment assessment method and system based on online learning and crowdsourcing data analysis Active CN111210111B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911332515.0A CN111210111B (en) 2019-12-22 2019-12-22 An urban environment assessment method and system based on online learning and crowdsourcing data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911332515.0A CN111210111B (en) 2019-12-22 2019-12-22 An urban environment assessment method and system based on online learning and crowdsourcing data analysis

Publications (2)

Publication Number Publication Date
CN111210111A true CN111210111A (en) 2020-05-29
CN111210111B CN111210111B (en) 2023-10-13

Family

ID=70789245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911332515.0A Active CN111210111B (en) 2019-12-22 2019-12-22 An urban environment assessment method and system based on online learning and crowdsourcing data analysis

Country Status (1)

Country Link
CN (1) CN111210111B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111679829A (en) * 2020-06-11 2020-09-18 北京百度网讯科技有限公司 Determination method and device for user interface design
CN112417556A (en) * 2020-11-18 2021-02-26 同济大学 BIM forward design method based on image measurable intelligent evaluation
CN113379284A (en) * 2021-06-24 2021-09-10 哈尔滨工业大学 Indoor environment condition equivalence determination method and determination system based on environment experience probability quality function
CN114565300A (en) * 2022-03-04 2022-05-31 中国科学院生态环境研究中心 A method, system and electronic device for quantifying public subjective emotion
CN114611989A (en) * 2022-03-28 2022-06-10 广州市怡地环保有限公司 Urban environment condition assessment method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130177235A1 (en) * 2012-01-05 2013-07-11 Philip Meier Evaluation of Three-Dimensional Scenes Using Two-Dimensional Representations
CN110163224A (en) * 2018-01-23 2019-08-23 天津大学 It is a kind of can on-line study auxiliary data mask method
CN110580499A (en) * 2019-08-20 2019-12-17 北京邮电大学 Method and system for deep learning object detection based on crowdsourced repeated labels

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张帆等: "大数据背景下的虚拟地理认知实验方法" *
甘欣悦;佘天唯;龙瀛;: "街道建成环境中的城市非正规性 基于北京老城街景图片的人工打分与机器学习相结合的识别探索" *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111679829A (en) * 2020-06-11 2020-09-18 北京百度网讯科技有限公司 Determination method and device for user interface design
CN112417556A (en) * 2020-11-18 2021-02-26 同济大学 BIM forward design method based on image measurable intelligent evaluation
CN113379284A (en) * 2021-06-24 2021-09-10 哈尔滨工业大学 Indoor environment condition equivalence determination method and determination system based on environment experience probability quality function
CN114565300A (en) * 2022-03-04 2022-05-31 中国科学院生态环境研究中心 A method, system and electronic device for quantifying public subjective emotion
CN114565300B (en) * 2022-03-04 2022-12-23 中国科学院生态环境研究中心 Method and system for quantifying subjective emotion of public and electronic equipment
CN114611989A (en) * 2022-03-28 2022-06-10 广州市怡地环保有限公司 Urban environment condition assessment method and device

Also Published As

Publication number Publication date
CN111210111B (en) 2023-10-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant