Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an Internet of Vehicles-oriented regional traffic flow prediction system and method, so as to improve traffic control capability and the utilization efficiency of Internet of Vehicles resources.
The technical idea of the invention is as follows: the GPS data of each vehicle user collected through the Internet of Vehicles are combined with weather, holiday, date and time information, and a prediction model is trained with a support vector regression machine, so as to provide more accurate prediction of regional traffic flow.
According to the above idea, the Internet of Vehicles-oriented regional traffic flow prediction system of the present invention is characterized by comprising:
an external influence data module, which records the weather condition of each day and whether each day is a holiday, and serves as the external influence data source of the data processing module;
an Internet of Vehicles data module, which records the GPS data of all driving vehicle users in the Internet of Vehicles and serves as the internal influence data source of the data processing module;
a data processing module, which performs numerical quantization on the external influence data input by the external influence data module and the internal influence data input by the Internet of Vehicles data module to generate multi-dimensional row vectors, and inputs the row vectors to the support vector regression module;
and a support vector regression module, which trains on the multi-dimensional row vectors input by the data processing module and learns a prediction model so as to predict the traffic flow at future cycle moments.
According to the above idea, the present invention provides a method for predicting regional traffic flow using the above system, comprising the steps of:
1) initialization: determining a reference year, a prediction period T and a number of training samples m;
2) the data processing module generates data for m cycle moments, namely the current cycle moment and the m-1 cycle moments preceding it, according to the initialization result and the data provided by the external influence data module and the Internet of Vehicles data module;
3) the support vector regression module trains and learns a prediction model by using the data generated by the data processing module, and uses the prediction model to predict and output the traffic flow at the (m+1)-th cycle moment;
4) when the predicted (m+1)-th cycle moment becomes a historical moment, updating the (m+1)-th cycle moment to be the current cycle moment;
5) repeating steps 2) to 4) to achieve uninterrupted prediction of the regional traffic flow at the next cycle moment.
The invention has the following advantages:
firstly, the invention combines external influence data with the GPS data of vehicles in the Internet of Vehicles, generates multi-dimensional row vectors through the quantization processing of the data processing module, and trains a support vector regression machine on these row vectors to learn the relationship between regional traffic flow and weather, holidays, dates and time, from which a prediction model is constructed;
secondly, the prediction model constructed by the invention can be applied to a plurality of areas, so that the traffic flow of each area at future cycle moments can be analyzed and predicted;
thirdly, the prediction results of the invention can guide traffic dispersion and the allocation of Internet of Vehicles resources, thereby improving traffic control capability and the utilization efficiency of Internet of Vehicles resources.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
Referring to fig. 1, the Internet of Vehicles-oriented regional traffic flow prediction system of the present invention includes an external influence data module 1, an Internet of Vehicles data module 2, a data processing module 3 and a support vector regression module 4, wherein:
the external influence data module 1 records the weather condition of each day and the data information of whether each day is a holiday or not, and inputs the data information into the data processing module 3 as an external influence data source of the data processing module 3;
the Internet of Vehicles data module 2 is used for recording the GPS data of all driving vehicle users in the Internet of Vehicles and inputting the data into the data processing module 3 as an internal influence data source of the data processing module 3;
the data processing module 3 is used for performing numerical quantization processing on external influence data input by the external influence data module 1 and internal influence data input by the internet of vehicles data module 2 to generate multi-dimensional row vectors and inputting the multi-dimensional row vectors into the support vector regression module 4 to serve as training data of the support vector regression module 4;
and the support vector regression module 4 is used for carrying out training prediction by using the multidimensional row vector input by the data processing module 3, learning a prediction model so as to predict the traffic flow at the future cycle moment and outputting a prediction result.
The data recorded by the external influence data module 1 comprise at least the date, the time, the weather condition and a holiday label, where a holiday is labeled 1 and a non-holiday is labeled 0.
The Internet of Vehicles data module 2 records the GPS data of each driving vehicle user in the Internet of Vehicles, and each record comprises at least the date, the time, and the longitude and latitude.
The multidimensional row vector generated by the data processing module 3 is represented as:
wherein:
$x_{weather}$ is a quantized value representing the weather condition: it is set to 1 when the weather condition is severe weather such as thunderstorm, strong wind, hail, tornado, local heavy rainfall or snowstorm, and to 0 when the weather condition is not severe weather;
$x_{hday}$ is a quantized value representing whether the date is a holiday: it is set to 1 when the date is a holiday and to 0 when the date is a non-holiday, where holidays comprise weekends and legal holidays;
$x_{year}$ is a quantized value representing the year of the date, set relative to a selected reference year: the reference year is set to 1, and the quantized value of any other year equals the difference between that year and the reference year plus 1;
$x_{week}$ is a quantized value representing the week number within the year: the first week is set to 1, and the quantized values of subsequent weeks are incremented sequentially;
$x_{day}$ is a quantized value representing the day of the week: Monday is set to 1, and the quantized values of subsequent days are incremented sequentially;
$x_{time}$ is a quantized value representing the prediction time within a day, set according to the prediction period T so that there are $1440/T$ quantized values in a day: the quantized value of the first prediction time is 1, the quantized values of subsequent prediction times are incremented sequentially, and the quantized value of the last prediction time is $1440/T$, where T is expressed in minutes;
the vehicle-count component is a quantized value representing the number of vehicle users in area i at the k-th prediction time of a day, and is quantized in the following manner:
wherein i denotes the identifier of an area, and the three quantities used denote, respectively: the number of vehicles in area i at both the (k-1)-th and the k-th prediction times of the day; the number of vehicles in area i at the (k-1)-th prediction time of the day but not in area i at the k-th prediction time; and the number of vehicles not in area i at the (k-1)-th prediction time of the day but in area i at the k-th prediction time;
the specific calculation formula is as follows:
wherein g represents the GPS data of one vehicle user; G represents the GPS data of all vehicle users; Region_i represents the statistical area i; and $g_k$ represents the GPS data of a vehicle user at the k-th prediction time of the day.
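The quantization rules for the weather and calendar components can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the weather condition is assumed to arrive as a plain text label, legal holidays as a set of dates, week 1 is assumed to cover the first seven days of the year, and the first prediction time of a day is assumed to be 00:00. The function name `quantize_context` and its parameters are hypothetical.

```python
from datetime import date, datetime

# Severe-weather labels listed in the description (label strings are assumed).
SEVERE_WEATHER = {"thunderstorm", "strong wind", "hail", "tornado",
                  "local heavy rainfall", "snowstorm"}


def quantize_context(ts, weather, legal_holidays,
                     reference_year=2015, period_min=15):
    """Return (x_weather, x_hday, x_year, x_week, x_day, x_time)."""
    x_weather = 1 if weather in SEVERE_WEATHER else 0
    # Holidays comprise weekends and legal holidays.
    is_weekend = ts.isoweekday() >= 6
    x_hday = 1 if (is_weekend or ts.date() in legal_holidays) else 0
    # Reference year maps to 1; other years are offset from it.
    x_year = ts.year - reference_year + 1
    # Assumption: week 1 is days 1-7 of the year, week 2 is days 8-14, ...
    x_week = (ts.timetuple().tm_yday - 1) // 7 + 1
    # Monday = 1 ... Sunday = 7.
    x_day = ts.isoweekday()
    # Assumption: the slot starting at 00:00 is prediction time 1.
    x_time = (ts.hour * 60 + ts.minute) // period_min + 1
    return (x_weather, x_hday, x_year, x_week, x_day, x_time)


print(quantize_context(datetime(2015, 1, 1, 0, 0), "hail", {date(2015, 1, 1)}))
# -> (1, 1, 1, 1, 4, 1)
```

With a 15-minute period, the slot starting at 23:45 receives the value 96, which matches the last prediction time of the day in the example embodiment.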
The support vector regression module 4 performs training prediction by using the multidimensional row vector input by the data processing module 3, and the learned prediction model is represented as follows:
wherein m is the number of training samples; $\kappa(x_i, x)$ is the kernel function, and in this example a Gaussian kernel is chosen, whose expression is $\kappa(x_i, x) = \exp(-g \cdot \|x_i - x\|^2)$, where $\|x_i - x\|$ is the Euclidean distance between vector $x_i$ and vector $x$; training is performed with the penalty constant C set to 80, the Gaussian kernel parameter g set to 20 and the interval set to 0.1; and the weight parameters and the bias parameter b of the prediction model are learned in training.
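For orientation, a support vector regression model built from the quantities named above (m training samples, a kernel function, weight parameters and a bias b) is conventionally written in the form below. The weight symbol $\beta_i$ is assumed notation; in the usual dual formulation each weight is the difference of two Lagrange multipliers that are bounded by the penalty constant C, so that $-C \le \beta_i \le C$.

```latex
f(x) = \sum_{i=1}^{m} \beta_i \, \kappa(x_i, x) + b,
\qquad
\kappa(x_i, x) = \exp\!\left(-g \, \lVert x_i - x \rVert^{2}\right)
```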
Referring to fig. 2, the Internet of Vehicles-oriented regional traffic flow prediction method of the present invention includes the following steps:
Step 1: Initialization.
Determine the reference year, the prediction period T and the number of training samples m. In this example the reference year is 2015, the prediction period T is 15 minutes, and the number of training samples m is 960.
Step 2: The data processing module 3 generates data for 960 cycle moments, namely the current cycle moment and the 959 cycle moments preceding it, according to the initialization result and the data provided by the external influence data module 1 and the Internet of Vehicles data module 2.
The specific implementation of this step is as follows:
2a) the data at each cycle instant is quantized as follows:
2a1) set the quantized value $x_{weather}$ of the weather condition: if the weather condition is severe weather such as thunderstorm, strong wind, hail, tornado, local heavy rainfall or snowstorm, set $x_{weather}$ to 1; if the weather condition is not severe, set $x_{weather}$ to 0;
2a2) set the quantized value $x_{hday}$ according to whether the date is a holiday: when the date is a holiday, set $x_{hday}$ to 1; when the date is a non-holiday, set $x_{hday}$ to 0;
2a3) set the quantized value $x_{year}$ of the year of the date: the quantized value $x_{year}$ of the selected reference year is set to 1, and the quantized value $x_{year}$ of any other year equals the difference between that year and the reference year plus 1;
2a4) set the quantized value $x_{week}$ of the week number within the year: the quantized value $x_{week}$ of the first week of the year is set to 1, and the quantized values $x_{week}$ of subsequent weeks are incremented sequentially;
2a5) set the quantized value $x_{day}$ of the day of the week: the quantized value $x_{day}$ of Monday is set to 1, and the quantized values $x_{day}$ of subsequent days are incremented sequentially;
2a6) set the quantized value $x_{time}$ of the prediction time within a day according to the prediction time and the prediction period T: the quantized value $x_{time}$ of the first prediction time of the day is set to 1, the quantized values $x_{time}$ of subsequent prediction times are incremented sequentially, and the quantized value $x_{time}$ of the last prediction time of the day is set to 96;
2a7) calculate the number of vehicle users in area i at the k-th prediction time of a day.
Let $k \in \{1, 2, 3, \ldots, 96\}$; let i denote the identifier of an area, g the GPS data of one vehicle user, G the GPS data of all vehicle users, Region_i the statistical area i, and $g_k$ the GPS data of a vehicle user at the k-th prediction time of the day.
The number of vehicles in area i at both the (k-1)-th and the k-th prediction times of the day is calculated as follows:
The number of vehicles in area i at the (k-1)-th prediction time of the day but not in area i at the k-th prediction time is calculated as follows:
The number of vehicles not in area i at the (k-1)-th prediction time of the day but in area i at the k-th prediction time is calculated as follows:
The number of vehicle users in area i at the k-th prediction time of the day is then calculated from the three quantities above:
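The counting in step 2a7) can be illustrated with a small set-based sketch. It is written under stated assumptions and is not the formula of the invention: the statistical area is assumed to be a latitude/longitude bounding box, each GPS record is assumed to carry a vehicle identifier, and the relation "present = stayed + entered" is a plain set identity used here for illustration, which is an assumption about how the counts are combined. All function names are hypothetical.

```python
def in_region(lat, lon, region):
    """region is an assumed (lat_min, lat_max, lon_min, lon_max) box."""
    lat_min, lat_max, lon_min, lon_max = region
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


def vehicles_in_region(fixes, region):
    """fixes maps vehicle_id -> (lat, lon) at one prediction time."""
    return {vid for vid, (lat, lon) in fixes.items()
            if in_region(lat, lon, region)}


def flow_counts(fixes_prev, fixes_curr, region):
    """Compare the (k-1)-th and k-th prediction times for one region."""
    prev = vehicles_in_region(fixes_prev, region)
    curr = vehicles_in_region(fixes_curr, region)
    stayed = len(prev & curr)    # in the region at both times
    left = len(prev - curr)      # in the region at k-1, not at k
    entered = len(curr - prev)   # not in the region at k-1, in it at k
    return {"stayed": stayed, "left": left, "entered": entered,
            "present": stayed + entered}  # equals len(curr)


region = (0.0, 1.0, 0.0, 1.0)
prev = {"a": (0.5, 0.5), "b": (0.2, 0.2), "c": (2.0, 2.0)}
curr = {"a": (0.6, 0.5), "b": (3.0, 3.0), "c": (0.1, 0.1), "d": (0.9, 0.9)}
print(flow_counts(prev, curr, region))
```

Using sets of vehicle identifiers keeps the three counts mutually consistent, since every vehicle falls into exactly one of the categories at each pair of consecutive prediction times.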
2b) according to the quantization result of step 2a), perform format normalization on the data of each cycle moment, that is, arrange the $x_{weather}$, $x_{hday}$, $x_{year}$, $x_{week}$, $x_{day}$, $x_{time}$ and the vehicle-user count corresponding to each cycle moment in step 2a) into a multi-dimensional row vector, represented as:
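Step 2b) amounts to packing the seven quantized values of one cycle moment into a fixed-order row. The sketch below shows one way to do this in Python and to split the rows for regression. Treating the vehicle-user count as the regression target and the other six components as inputs is an assumption made for illustration, and the names `FIELDS`, `make_row` and `split_features_target` are hypothetical.

```python
# Assumed component order, following the order in which step 2b) lists them.
FIELDS = ("x_weather", "x_hday", "x_year", "x_week", "x_day", "x_time",
          "vehicle_count")


def make_row(context, vehicle_count):
    """Build one row vector from six context values and the vehicle count."""
    row = [float(v) for v in context] + [float(vehicle_count)]
    if len(row) != len(FIELDS):
        raise ValueError("expected %d components, got %d"
                         % (len(FIELDS), len(row)))
    return row


def split_features_target(rows):
    """Assumption: the last component is the value to be predicted."""
    features = [row[:-1] for row in rows]
    targets = [row[-1] for row in rows]
    return features, targets


rows = [make_row((1, 1, 1, 1, 4, 1), 3), make_row((0, 0, 1, 1, 5, 2), 7)]
X, y = split_features_target(rows)
print(X, y)
```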
Step 3: The support vector regression module 4 trains and learns the prediction model by using the data generated by the data processing module 3.
3a) input the data of the 960 cycle moments generated in step 2 into the support vector regression module 4, which normalizes the data to the range [0, 1];
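The [0, 1] normalization of step 3a) is sketched below as column-wise min-max scaling. That the module uses min-max scaling per column is an assumption, as is the choice to map a constant column to 0.0; the function names are hypothetical.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(row, lows, highs):
    """Scale one row to [0, 1] using the fitted column ranges."""
    scaled = []
    for value, low, high in zip(row, lows, highs):
        if high > low:
            scaled.append((value - low) / (high - low))
        else:
            scaled.append(0.0)  # constant column: assumed to map to 0.0
    return scaled


training_rows = [[0.0, 10.0], [5.0, 10.0], [10.0, 10.0]]
lows, highs = fit_min_max(training_rows)
print([apply_min_max(r, lows, highs) for r in training_rows])
```

The ranges are fitted on the training window only, so that the vector for the cycle moment being predicted is scaled with the same parameters as the training data.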
3b) the support vector regression module 4 trains and learns a prediction model by using the data after the normalization processing in the step 3a), wherein the prediction model is as follows:
wherein $\kappa(x_i, x)$ is the Gaussian kernel function, whose expression is $\kappa(x_i, x) = \exp(-g \cdot \|x_i - x\|^2)$; g is the Gaussian kernel parameter, with g = 20; $\|x_i - x\|$ is the Euclidean distance between vector $x_i$ and vector $x$; and the weight parameters and the bias parameter b of the prediction model are learned in training.
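Once the weight parameters and the bias have been learned, evaluating a kernel model of this kind is a weighted sum of kernel values. The sketch below shows the Gaussian kernel with g = 20 and such an evaluation in plain Python. The weights and bias are passed in as given values; how they are obtained by training is outside this sketch, and the function names are hypothetical.

```python
import math


def gaussian_kernel(xi, x, g=20.0):
    """exp(-g * ||xi - x||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-g * squared_distance)


def svr_predict(x, training_vectors, weights, b, g=20.0):
    """Weighted sum of kernel values plus bias (weights assumed given)."""
    total = sum(w * gaussian_kernel(xi, x, g)
                for xi, w in zip(training_vectors, weights))
    return total + b


print(gaussian_kernel([0.0, 0.0], [0.0, 0.0]))       # identical vectors -> 1.0
print(svr_predict([0.0], [[0.0]], [2.0], b=0.5))     # -> 2.5
```

With g = 20 on inputs scaled to [0, 1], the kernel value falls off quickly with distance, so each prediction is dominated by the training vectors closest to the query vector.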
Step 4: Predict the regional traffic flow at the cycle moment following the current cycle moment in the training data by using the prediction model learned by the support vector regression module 4, and update the current cycle moment.
4a) using the support vector regression prediction model trained in step 3 together with the vector data of the 961st cycle moment, predict and output the regional traffic flow at the 961st cycle moment;
4b) when the predicted 961st cycle moment becomes a historical moment, update the 961st cycle moment to be the current cycle moment.
Step 5: Repeat steps 2 to 4 to achieve uninterrupted prediction of the regional traffic flow at the next cycle moment.
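The loop formed by steps 2 to 5 can be sketched as a sliding window over the most recent m cycle moments. The sketch below is illustrative only: `train` is a hypothetical callable that takes feature rows and observed counts and returns a prediction function, and a simple mean predictor stands in for the support vector regression module in the usage example.

```python
from collections import deque


def rolling_forecast(stream, train, m=960):
    """Yield (features, prediction) for each moment after the first m.

    stream yields (features, observed_count) pairs in time order.
    train(X, y) returns a callable mapping features to a prediction.
    """
    window = deque(maxlen=m)  # holds the m most recent cycle moments
    for features, observed in stream:
        if len(window) == m:
            X = [f for f, _ in window]
            y = [c for _, c in window]
            model = train(X, y)              # step 3: retrain on the window
            yield features, model(features)  # step 4a: predict next moment
        # Step 4b: the moment becomes history and slides into the window.
        window.append((features, observed))


def mean_trainer(X, y):
    """Stand-in trainer (not SVR): always predicts the window mean."""
    mean = sum(y) / len(y)
    return lambda features: mean


stream = [([t], float(t)) for t in range(5)]
print(list(rolling_forecast(stream, mean_trainer, m=3)))
# -> [([3], 1.0), ([4], 2.0)]
```

Because the window has a fixed length, each new observation displaces the oldest one, which keeps the number of training samples at m throughout.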
The above description is only one specific example of the present invention and should not be construed as limiting the present invention, and it will be apparent to those skilled in the art that various modifications, equivalent substitutions, changes and the like can be made without departing from the spirit of the present invention, and these changes and modifications should be considered as falling within the scope of the present invention.