WO2024108265A1 - A system for predicting property value indicators - Google Patents

A system for predicting property value indicators

Info

Publication number
WO2024108265A1
Authority
WO
WIPO (PCT)
Prior art keywords
property
data
machine learning
land
learning model
Prior art date
Application number
PCT/AU2023/051201
Other languages
French (fr)
Inventor
Gene NICHOLSON
George-Adrian CIOBANU
Original Assignee
AICloud Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2022903555A
Application filed by AICloud Pty Ltd
Publication of WO2024108265A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/16 Real estate
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors

Definitions

  • the present disclosure relates to a system for predicting property value indicators and to a method of predicting property value indicators.
  • Such property value indicators may include gross rental values (GRVs) indicative of the annual gross rental amount that a property might reasonably be expected to produce; unimproved values (UVs) indicative of land value, exclusive of value attributable to a property that exists on the land; and capital values (CV) indicative of the total value inclusive of land value and value attributable to a property that exists on the land.
  • GRVs gross rental values
  • UVs unimproved values
  • CV capital values
  • property value indicators may also be used to identify outlier transactions that for example potentially indicate erroneous reported transaction values, or deliberate reporting of inappropriate transaction amounts that do not reflect true value and are therefore potentially non-compliant with relevant tax legislation, such as transactions between related parties that refer to valuations significantly below an expected amount. Such erroneous or inappropriate transfer amounts can result in payment of a transfer tax amount that is too low.
  • Property value indicators may also be used by financial institutions, for example to assess the value of proposed loan security, or by insurance institutions.
  • Typical existing approaches for obtaining property value indicators for a large number of properties include valuations generated using a mass valuation tool, such as a rule based expert system, time series analysis tool, and/or a multiple regression analysis tool.
  • the task of producing valuations for a large number of properties is typically a very long process that can take several years to complete.
  • the quality of produced valuations may be low or the consistency of the quality of the produced values may be low.
  • a system for predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation comprising: at least one machine learning model trained using a training data set to predict property value indicators at a selectable date of valuation, the training data set including first property data indicative of attributes of a plurality of properties as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and a custom property data generator arranged to determine custom second property data using the first property data; wherein the at least one machine learning model is also trained using the custom second property data as inputs; and wherein property value indicators are inferred for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
  • the first property data includes property data indicative of characteristics of properties.
  • the characteristics may include configuration information associated with each parcel of land, property feature information, configuration information associated with each building disposed on each parcel of land, and/or the year that a building on a parcel of land was built.
  • the property attributes include information indicative of whether a property is associated with multiple parcels of land and, if so, which parcels of land are associated with the property; whether the property is for commercial, residential, mixed, industrial or farming use; information indicative of the sub market area in which each property is located; and zoning classifications assigned to each property.
  • the custom second property data is determined using data stored in at least one stored lookup table.
  • the custom second property data includes data indicative of a quality and/or usability of a property.
  • the custom second property data may include frontage quality data indicative of a frontage quality of a property.
  • the custom second property data may include shape quality data indicative of a shape quality of a property.
  • the shape quality data includes data determined using the following formula: where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
  • the data in the min_area_per_dwelling field is contained in a lookup table.
  • the shape quality data includes data determined using the following formula: where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
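The formula itself is not reproduced in this text, so it is not reconstructed here. As a minimal sketch of how the three input fields it names could be derived from a parcel polygon, assuming the parcel is a simple polygon supplied as an ordered list of (x, y) vertices (the function name and the returned dictionary are hypothetical):

```python
import math


def polygon_fields(vertices):
    """Derive land_area, boundary_line_count and angles_u45 from a simple polygon.

    vertices: ordered list of (x, y) tuples, first point not repeated at the end.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")

    # Shoelace formula; the sign of twice_area gives the winding direction.
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    land_area = abs(twice_area) / 2.0

    angles_u45 = 0
    for i in range(n):
        px, py = vertices[i - 1]
        cx, cy = vertices[i]
        nx, ny = vertices[(i + 1) % n]
        ax, ay = px - cx, py - cy  # vector from corner to previous vertex
        bx, by = nx - cx, ny - cy  # vector from corner to next vertex
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        # A corner is convex when it turns the same way as the polygon winds.
        # Reflex corners have interior angles above 180 degrees, so skip them.
        if -cross * twice_area <= 0:
            continue
        interior_angle = math.degrees(math.atan2(abs(cross), dot))
        if interior_angle < 45.0:
            angles_u45 += 1

    return {
        "land_area": land_area,
        "boundary_line_count": n,  # one boundary segment per vertex
        "angles_u45": angles_u45,
    }
```

For example, a thin right-angled triangle with vertices (0, 0), (10, 0), (10, 1) has three boundary lines and one corner sharper than 45°, whereas a square has four boundary lines and none.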
  • the custom second property data may include development potential data indicative of a development potential of a property.
  • the system is arranged to determine based on the first property data whether a building disposed on the property has been demolished within a defined period after a transaction date, and to add a demolish financial amount to the property value indicator associated with the property if the property has been demolished within the defined period.
  • the demolish financial amount is determined using the following formula: where k is a constant
  • the system includes a validation component arranged to validate operation of the at least one machine learning model.
  • the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined first subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
  • the first subset of financial transaction data is defined based on a defined start validation financial transaction date and a defined end validation transaction date.
  • the defined start and end validation transaction dates may be user definable.
  • the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined second subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; excluding first and second property data associated with the excluded financial transaction data from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
  • the defined second subset of financial transaction data is a subset of the first subset of financial transaction data.
  • the second subset of financial transaction data may be selected randomly from the first subset of financial transaction data.
  • the second subset of financial transaction data is a user selectable proportion of the first subset of financial transaction data.
  • the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined third subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators; wherein the third subset of financial transaction data is defined based on transaction dates within a defined period of the date of valuation.
  • the defined period is user definable.
  • the system includes a data standardisation component arranged to standardise the first property data so that the first property data is in a format compatible with the at least one machine learning model.
  • the system includes an outlier remover component arranged to remove property data that is considered to be unlikely to be correct.
  • the system includes a missing value inferrer arranged to add data to a first property data field if the data for the property data field can be reasonably inferred.
  • the missing value inferrer is arranged to add data to a property data field using a lookup table.
  • the lookup table includes data indicative of average building areas, and the system is arranged to infer a building area of a property by determining whether the building area of the property is above or below an average property area using the first property data and the lookup table.
  • the first property data includes names of buyers and sellers involved in a financial transaction
  • the system is arranged to determine whether the financial transaction was between related parties using the names of buyers and sellers.
  • the system includes a data anonymiser arranged to remove data indicative of the buyer and seller names associated with properties after the determination is made as to whether the financial transaction was between related parties using the names of buyers and sellers.
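The text does not say how related parties are detected. The sketch below assumes one simple heuristic, a shared surname between any buyer and any seller, purely for illustration; the field names `buyer_names`, `seller_names` and `related_parties` are hypothetical. It shows the ordering described above: the flag is computed first and the names are dropped afterwards.

```python
def surname(full_name):
    """Return the last whitespace-separated token of a name, lower-cased."""
    parts = full_name.strip().lower().split()
    return parts[-1] if parts else ""


def flag_and_anonymise(transaction):
    """Flag a possible related-party transaction, then strip the party names.

    Assumption: two parties are treated as related when they share a surname.
    A production system would likely need a more careful matching rule.
    """
    buyer_surnames = {surname(n) for n in transaction.get("buyer_names", [])} - {""}
    seller_surnames = {surname(n) for n in transaction.get("seller_names", [])} - {""}
    related = bool(buyer_surnames & seller_surnames)

    # Anonymise only after the determination has been made.
    cleaned = {
        key: value
        for key, value in transaction.items()
        if key not in ("buyer_names", "seller_names")
    }
    cleaned["related_parties"] = related
    return cleaned
```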
  • the system is arranged to facilitate reception of property value assessment criteria used by the system to predict the property value indicators.
  • the property value assessment criteria may include a date of valuation, at least one property value indication type that defines the or each type of property value indication predicted by the system, a residential or industrial property type, and/or a date of effect.
  • the property value indication type may include a gross rental value, an unimproved value and/or a capital value.
  • the system includes a plurality of machine learning models, each machine learning model associated with a combination of one property value indication type and one of residential or industrial property type.
  • the system includes a database and the first property information is obtained from an external data source and stored in the database, the system extracting first property data from the database and using the extracted first property data to train the at least one machine learning model and infer property value indicators.
  • the system includes an online accessible user interface that may be a web-based user interface.
  • the at least one machine learning model comprises at least one Gradient Boosting Machine (GBM) algorithm that uses an xgboost library.
  • GBM Gradient Boosting Machine
  • the system is arranged to facilitate selection of a revaluation option wherein at least one machine learning model is trained using current first and second property data to produce at least one trained machine learning model, and the at least one trained machine learning model is used to infer property value indicators for all properties in a geographical area, or an interim assessment wherein property value indicators for a defined set of properties are inferred using an existing at least one trained machine learning model.
  • the system is arranged to compare predicted property value indicators associated with at least 2 property value assessments for at least 2 different dates of valuation and to produce property value change results.
  • the property value change results include information indicative of changes in mean property values, median property values and/or total property values.
  • the property value change results are based only on properties that have not changed in the period between the at least 2 different dates of valuation.
  • the property value change results are based on all properties covered by the property value assessments.
  • the property value change results include total value data indicative of the total financial value of all properties covered by the property value assessments, valuation property count data indicative of the total number of properties covered by the property value assessments, value change data indicative of the total financial value change of the total number of properties covered by the property value assessments, and a count change section indicative of a change in the total number of properties.
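A minimal sketch of how such change results could be computed from two assessments, assuming each assessment is a mapping from a property identifier to its predicted value (the function names and the `unchanged_ids` parameter are hypothetical):

```python
from statistics import mean, median


def summarise(values):
    """Count, total, mean and median of a non-empty collection of values."""
    vals = list(values)
    return {
        "count": len(vals),
        "total": sum(vals),
        "mean": mean(vals),      # raises StatisticsError on an empty list
        "median": median(vals),
    }


def value_change_results(earlier, later, unchanged_ids=None):
    """Compare two assessments made at different dates of valuation.

    earlier, later: dicts mapping property id -> predicted value.
    unchanged_ids: optional set of ids of properties that did not change
    between the two dates; when given, only those properties are compared.
    """
    if unchanged_ids is not None:
        earlier = {k: v for k, v in earlier.items() if k in unchanged_ids}
        later = {k: v for k, v in later.items() if k in unchanged_ids}
    before, after = summarise(earlier.values()), summarise(later.values())
    return {
        "total_change": after["total"] - before["total"],
        "mean_change": after["mean"] - before["mean"],
        "median_change": after["median"] - before["median"],
        "count_change": after["count"] - before["count"],
    }
```

Passing `unchanged_ids` corresponds to the variant based only on properties that have not changed; omitting it corresponds to the variant covering all properties.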
  • a method of predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation comprising: receiving first property data indicative of attributes of a plurality of properties; determining custom second property data using the first property data; training at least one machine learning model using a training data set to predict property value indicators at a selectable date of valuation, the training data set including the first and second property data as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and inferring property value indicators for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
  • the method includes, for each new property value indicator revaluation, training the at least one machine learning model using current first and second property data and current actual financial transaction data to produce at least one trained machine learning model, and using the at least one trained machine learning model to infer property value indicators for properties in a geographical area.
  • the method includes, for each interim assessment, inferring property value indicators using an existing at least one trained machine learning model.
  • Figure 1 is a schematic block diagram of a system for predicting property values in accordance with an embodiment of the present invention;
  • Figure 2 is a block diagram illustrating functional application components of the system shown in Figure 1;
  • Figure 3 is a block diagram illustrating application components of a pre-processing application component shown in Figure 2;
  • Figure 4 is a block diagram illustrating application components of a custom field generator application component shown in Figure 3;
  • Figure 5 is a flow diagram illustrating a method of predicting a property value indicator in accordance with an aspect of the present invention;
  • Figure 6 is a representation of a home page of a user interface of the system shown in Figure 1;
  • Figure 7 is a representation of a revaluation page of a user interface of the system shown in Figure 1;
  • Figure 8 is a representation of a new assessment criteria page of a user interface of the system shown in Figure 1;
  • Figure 9 is a representation of an assessment criteria pane of the assessment criteria page shown in Figure 8;
  • Figure 10 is a representation of a preparation page of a user interface of the system shown in Figure 1;
  • Figure 11 is a representation of a processing page of a user interface of the system shown in Figure 1;
  • Figure 12 is a representation of a results page of a user interface of the system shown in Figure 1;
  • Figure 13 is a representation of a review page of a user interface of the system shown in Figure 1;
  • Figure 14 is a representation of a performance page that includes an error distribution associated with a blind validation test;
  • Figure 15 is a representation of a map page of a user interface of the system shown in Figure 1 showing a first map view;
  • Figure 16 is a representation of the map page of Figure 15 showing a second map view;
  • Figures 17a to 17e show example tables of an example relational database structure of the system shown in Figure 1;
  • Figure 18 is a representation of a change analytics page of a user interface of the system shown in Figure 1;
  • Figure 19 is a representation of an assessment comparison selection page of a user interface of the system shown in Figure 1;
  • Figures 20a and 20b are representations of a selected assessments page of a user interface of the system shown in Figure 1;
  • Figure 21 is a representation of a change analytics processing page of a user interface of the system shown in Figure 1;
  • Figure 22 is a representation of a property index results page of a user interface of the system shown in Figure 1;
  • Figure 23 is a representation of a property index results page of a user interface of the system shown in Figure 1.
  • the present disclosure relates to a system for and method of predicting property value indicators.
  • the term ‘property’ will be understood to mean a vacant parcel of land, or a developed parcel of land, that is, a parcel of land on which a building is disposed.
  • property value indicator means any numerical value that is indicative of a financial value of a property, and in the embodiments described in this specification, the following examples of property value indicators are used:
  • the Gross Rental Value (GRV) of a property is the annual rental financial amount that would be expected to be received for a property at a defined date. If a property is vacant land, the GRV may for example be calculated as a defined percentage of the Unimproved Value (UV) of the property, such as 3% of the UV.
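As a worked example of the vacant-land case, a property with a UV of 500,000 and the 3% example rate would have a GRV of 15,000. A minimal sketch (the function and parameter names are hypothetical, and 3% is only the example rate given above):

```python
def vacant_land_grv(unimproved_value, percent_of_uv=3.0):
    """GRV of vacant land as a defined percentage of its Unimproved Value."""
    if unimproved_value < 0:
        raise ValueError("unimproved_value must not be negative")
    return unimproved_value * percent_of_uv / 100
```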
  • UV Unimproved Value
  • the Capital Value (CV) of a property is the expected sale financial amount that would be expected to be received for the property, inclusive of both the financial value of a parcel of land associated with a property and the financial value of any building disposed on the property.
  • property value indicator is synonymous with the term ‘transaction value’ used in the property valuation industry to mean GRV, UV and/or CV.
  • transaction value is used to refer to financial values associated with actual GRV, UV and CV transactions that have occurred
  • property value indicator is used to refer to predicted financial values for GRVs, UVs and CVs.
  • Date of Valuation is a defined point in time at which properties in an assessment area are assessed to determine property value indicators (i.e. predicted values) for GRV, UV and/or CV.
  • Figure 1 shows a schematic block diagram of a system 10 for predicting property value indicators according to an embodiment of the invention.
  • the system 10 in this example is accessible through a wide area network such as the Internet 12, and is arranged to enable multiple authorised users to interact with the system 10, for example using a computing device that may be any suitable device including a personal computer 14, tablet computer 16 and/or a smartphone 18.
  • the system 10 is also accessible by a system administrator, such as using a suitable computing device 20, for example in order to manage user authorisations.
  • the system 10 uses machine learning to predict property value indicators based on property information obtained from a suitable source of property information, for example a source associated with a government authority, and based on custom property information produced using the property information.
  • the present inventors have realised that property value indicators that are fit for purpose can be obtained using machine learning if custom property information is produced that simulates ‘street level’ judgement of a human valuer, and the custom property information is used with conventional property information to train a machine learning model and infer data outcomes using the trained model.
  • custom property information is produced based on the existing property information using defined algorithms, as described in more detail below.
  • the system 10 is arranged to predict property value indicators based on property information obtained from a source 22 associated with a government authority, and in an example the property information is imported to the system 10 periodically from the government source 22, for example every month.
  • the property value indicators are predicted using the property data and custom property data as inputs to the machine learning models, and without using actual transaction values as machine learning model inputs.
  • properties are treated equally irrespective of whether significant, little or no actual transaction evidence exists and a consistent valuation approach is taken to each assessed property.
  • the imported property information includes the following attribute information associated with a property:
  • VEN unique valuation entity number
  • property attribute data including any other property feature data may be imported, the important aspect being that the property information is indicative of an attribute of a property and includes information that could be considered to be a factor relevant to consideration of the value of a property.
  • the imported data is stored in a relational database 24 that includes several related tables.
  • the tables include: a valuation entity number (VEN) table 26 for storing unique VEN numbers received from the data source 22; a land table 28 for storing unique property identifiers received from the data source 22, each property identifier indicative of a parcel of land that may or may not include an associated building disposed on the parcel of land; a zoning table 30 for storing a zoning classification assigned to each parcel of land; a features table 32 for storing property feature information, for example that includes information indicative of the year a building on a property was built, whether an ocean view exists, the floor area of a building, the number of bedrooms of a building, and building and land shape information; a classifications table 34 for storing information indicative of whether the property is for commercial, residential, mixed, industrial or farming use; a transactions table 36 for storing information indicative of actual sale amounts and/or rental amounts of properties and the relevant dates of the transactions; and a sub market areas table 38 for
  • the system 10 also includes a control unit 40 that in the present embodiment includes a processor arranged to implement programs associated with program data 42 stored in a data storage device 44 using memory 46.
  • the data storage device 44 also stores custom data 48 produced using the property data stored in the relational database 24, and for example using lookup table data 50, such as table data usable to facilitate correction or inference of missing data in the relational database 24.
  • the lookup table data 50 may also include other tables, such as a minimum area table indicative of the minimum area allowed for development of a building on a parcel of land in each geographical area.
  • the data storage device 44 also stores machine learning model data 52 indicative of one or more machine learning models that are capable of being trained using the imported property data and custom data, and used to predict property value indicators.
  • the machine learning model(s) are based on at least one Gradient Boosting Machine (GBM) algorithm that uses an xgboost library, although it will be understood that any suitable machine learning framework may be used.
  • GBM Gradient Boosting Machine
  • a RMSLE Root Mean Squared Logarithmic Error
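The text names RMSLE without stating how it is applied (for example as a training objective or as an evaluation metric). Under its standard definition, RMSLE is the square root of the mean of (log(1 + predicted) − log(1 + actual))², which penalises relative rather than absolute error and so suits values that span a wide price range. A minimal standard-library sketch:

```python
import math


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error between two equal-length sequences.

    Values must be greater than -1; property values are non-negative in practice.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))
```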
  • the data storage device 44 also stores data indicative of trained machine learning models 54, each trained model 54 associated with a previous assessment and capable of being implemented to perform new assessments based on the trained model, for example as an interim assessment.
  • a machine learning model is defined for each of the following property value indicators: i) GRV for residential properties; ii) GRV for industrial properties; iii) UV for residential properties; iv) UV for industrial properties; v) CV for residential properties; and vi) CV for industrial properties.
  • a single machine learning model may be used for GRV, UV and CV and both residential and industrial properties, or multiple machine learning models specific to a combination of one or more of the above property value indicators i) to vi) may be used.
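One way to organise the six-model arrangement is a registry keyed by (indicator type, property type). This is a sketch only; `make_model` stands in for whatever model constructor is used and is a hypothetical parameter:

```python
from itertools import product

INDICATOR_TYPES = ("GRV", "UV", "CV")
PROPERTY_TYPES = ("residential", "industrial")


def build_model_registry(make_model):
    """Create one independent model per (indicator type, property type) pair."""
    return {
        (indicator, property_type): make_model()
        for indicator, property_type in product(INDICATOR_TYPES, PROPERTY_TYPES)
    }
```

A request for, say, UV predictions on industrial properties would then look up `registry[("UV", "industrial")]`.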
  • system 10 is implemented using a client-server model with the system accessible by a client computing device through the Internet, for example by accessing a dedicated web page and providing authentication information.
  • other implementations are possible.
  • Example functional program components 56 implemented by the system 10 based on the stored program data 42 are shown in Figure 2.
  • the functional program components 56 include a criteria selection application 58 arranged to facilitate selection by a user of criteria that will be used by the system 10 to train the machine learning model(s) of the system 10; criteria that will be used by the system to validate operation of the system 10, in particular in terms of accuracy of the predicted property value indicators; and criteria that will be used to set the parameters of the desired predicted property value indicators, such as the relevant date of valuation and the geographical scope of the predictions.
  • the criteria selection application 58 receives training and implementation criteria from an operator using a dedicated user interface that in this example is accessed through a web site by providing suitable authentication details.
  • the selectable criteria include the following:
  • DoV date of valuation
  • An evidence date range defined by specifying start and end evidence dates, the evidence date range defining a training information set having transaction dates within the start and end evidence dates and being used to train the machine learning model(s).
  • a validation date range defined by specifying start and end validation dates, the validation date range defining a validation information set having transaction dates within the start and end validation dates, and being used to validate operation of the system 10.
  • Transaction values are removed from training data having transaction dates within the validation date range, and the removed transaction values are subsequently compared with property value indicators predicted by the system to provide an indication of performance of the system.
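A minimal sketch of the date-range hold-out described above, assuming each transaction is a dictionary with `transaction_date` and `transaction_value` keys and that both date ranges are inclusive (the key names, the inclusive bounds and the percentage-error measure are assumptions):

```python
from datetime import date


def split_by_validation_range(transactions, evidence_start, evidence_end,
                              validation_start, validation_end):
    """Separate training evidence from transactions held out for validation."""
    training, held_out = [], []
    for txn in transactions:
        when = txn["transaction_date"]
        if validation_start <= when <= validation_end:
            # Held out: its transaction value is never shown to the model.
            held_out.append(txn)
        elif evidence_start <= when <= evidence_end:
            training.append(txn)
    return training, held_out


def percentage_errors(held_out, predict):
    """Signed percentage error of each prediction against the held-out value.

    predict: any callable taking a transaction record and returning a
    predicted value; it stands in for the trained model here.
    """
    return [
        100.0 * (predict(txn) - txn["transaction_value"]) / txn["transaction_value"]
        for txn in held_out
    ]
```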
  • a blind testing ratio that defines the percentage of the properties in the validation date range that will be used to assess performance of the system 10 using a blind testing methodology.
  • the properties used for blind testing are selected randomly from the validation information set. With blind testing, all property information for the blind testing properties is removed from the training information set, and the transaction values of the removed properties are subsequently compared with property value indicators predicted by the system for the removed properties to provide an indication of performance of the system.
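The blind-testing selection can be sketched as follows, assuming properties are identified by a `property_id` field and the ratio is given as a percentage (both names, and the optional `seed` for reproducibility, are hypothetical):

```python
import random


def select_blind_test_ids(validation_ids, blind_ratio_percent, seed=None):
    """Randomly pick the given percentage of validation properties."""
    ids = sorted(set(validation_ids))
    sample_size = round(len(ids) * blind_ratio_percent / 100)
    return set(random.Random(seed).sample(ids, sample_size))


def remove_blind_properties(training_rows, blind_ids):
    """Drop every row belonging to a blind-test property.

    Unlike the date-range validation, the whole record is removed, not
    just its transaction value.
    """
    return [row for row in training_rows if row["property_id"] not in blind_ids]
```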
  • Transaction values are removed from training data having transaction dates within the regulation date range, and the removed transaction values are subsequently compared with property value indicators predicted by the system to provide an indication of performance of the system.
  • the regulation period covers a period that is close to and spans the date of valuation, and in this way relatively recent actual GRV, UV and CV values can be compared against system predicted GRV, UV and CV values.
  • the regulation period spans a period 2/3 months before and 2/3 months after the date of valuation.
  • CV capital values
  • CV capital value
  • An unimproved values (UV) field that when selected causes UV property value indicators to be produced by the system.
  • UV unimproved value
  • GRV gross rental value
  • the program components 56 also include a data ingester 60 that extracts property information from the relational database 24 according to the criteria defined using the criteria selection application 58, and a pre-processing component 62 that processes the extracted property information prior to training the machine learning model(s) and inferring new property value indicators.
  • the program components 56 also include a model training component 64 arranged to implement training of the machine learning model(s), a model validation component 66 arranged to implement validation of operation of the machine learning model(s), and a model application component 68 arranged to apply the machine learning model(s) to property data and custom data in order to infer new property value indicators.
  • Example functional components of the pre-processing component 62 are shown in Figure 3.
  • the pre-processing component 62 includes a data standardisation component 70 that standardises the property data received from the data source 22 so that the data used to train the machine learning model(s) and produce predicted property value indicators is in a format compatible with the machine learning model(s).
  • the data standardisation component 70 carries out the following actions:
  • the data standardisation component 70 modifies the data where necessary so that all data set labels separate words using underscore lines.
  • the data standardisation component 70 modifies the data where necessary so that all percentage values are numbers between 0 and 100.
  • All data values that are expected to be numerical are considered to be valid or not valid according to whether the values appear to be numerical. For example, ‘x123’ is considered to be invalid, ‘0123’ is considered to be valid, ‘-123’ is considered to be valid, ‘+123’ is considered to be valid, ‘1.2.3’ is considered to be invalid, ‘123’ is considered to be valid, ‘$123’ is considered to be valid, ‘1e3’ is considered to be valid and ‘1+3’ is considered to be invalid.
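The label and numeric-validity rules above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function names `standardise_label` and `looks_numeric` are hypothetical, lower-casing of labels is an assumption, and the regular expression is an assumption chosen only to agree with the listed examples (optional sign, optional leading `$`, digits with at most one decimal point, optional exponent).

```python
import re

# Assumed pattern: optional sign, optional '$', digits with at most one
# decimal point, optional exponent. Chosen to agree with the listed examples.
_NUMERIC = re.compile(r"[+-]?\$?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def standardise_label(label: str) -> str:
    """Separate the words of a data set label with underscores.

    Lower-casing is an assumption made for this sketch.
    """
    words = re.findall(r"[A-Za-z0-9]+", label)
    return "_".join(word.lower() for word in words)


def looks_numeric(value: str) -> bool:
    """Return True if the value appears to be numerical."""
    return _NUMERIC.fullmatch(value.strip()) is not None
```

For example, `standardise_label("Land Area")` gives `land_area`, and `looks_numeric` accepts `$123` and `1e3` but rejects `1.2.3` and `1+3`.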
  • the pre-processing component 62 also includes an outlier remover component 74 that removes data values considered to be unlikely to be correct, for example if a data value indicates that a property was built after the current date or the property has a land area less than 0.
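A minimal sketch of the two outlier checks mentioned above, assuming each property is held as a dictionary. The field names `year_built` and `land_area` and the function name `remove_outliers` are hypothetical, and blanking the value (rather than dropping the whole record) is one reading of "removes data values".

```python
from datetime import date


def remove_outliers(record, today=None):
    """Return a copy of the record with implausible values blanked out."""
    today = today or date.today()
    cleaned = dict(record)
    # A property cannot have been built after the current date.
    year_built = cleaned.get("year_built")
    if year_built is not None and year_built > today.year:
        cleaned["year_built"] = None
    # A parcel of land cannot have a land area less than 0.
    land_area = cleaned.get("land_area")
    if land_area is not None and land_area < 0:
        cleaned["land_area"] = None
    return cleaned
```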
  • the pre-processing component 62 also includes a missing value inferrer 76 that adds data values to some fields if the data value can be reasonably inferred. For example, if a data field associated with the presence of a garage at a property is blank, it can be inferred that no garage exists. The standard data format used by the system to represent this is ‘0’, so in this case the missing value inferrer 76 adds ‘0’ to the garage data field. Other data values that are missing cannot be inferred simply from the absence of a data value, but may be inferred from other information. For example, for a data field that contains a value for house area, a lookup table 50 is used to estimate the house area.
  • the lookup table for house area includes values for average house areas in each defined geographical area, such as each suburb or local government area (LGA), based on the number of rooms in the house and an allowance data field.
  • the allowance data field provides an indication as to whether the rooms in the house are smaller or larger than average, for example in percentage terms. If the allowance field is blank, the missing value inferrer 76 infers that the value of the allowance field is zero, which indicates that the room sizes of the house are of average size.
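The missing value rules above might be sketched as follows. The lookup table contents, the field names (`garage`, `allowance`, `house_area`, `area_name`, `rooms`) and the way the percentage allowance scales the average area are all assumptions made for illustration; the source does not state the scaling rule.

```python
# Hypothetical lookup table: (geographical area, number of rooms) -> average
# house area in square metres. The figures are made up for illustration.
AVERAGE_HOUSE_AREA = {
    ("Suburb A", 3): 120.0,
    ("Suburb A", 4): 160.0,
}


def infer_missing_values(record):
    """Return a copy of the record with inferable blanks filled in."""
    filled = dict(record)
    # A blank garage field is taken to mean that no garage exists.
    if filled.get("garage") is None:
        filled["garage"] = 0
    # A blank allowance means the rooms are of average size.
    if filled.get("allowance") is None:
        filled["allowance"] = 0.0
    # Estimate a missing house area from the lookup table, scaled by the
    # percentage allowance (the scaling rule is an assumption).
    if filled.get("house_area") is None:
        key = (filled.get("area_name"), filled.get("rooms"))
        average = AVERAGE_HOUSE_AREA.get(key)
        if average is not None:
            filled["house_area"] = average * (1 + filled["allowance"] / 100)
    return filled
```

If no lookup entry exists for the area and room count, the house area is left blank rather than guessed.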
  • the pre-processing component 62 also includes a data anonymiser 78 that removes data indicative of the names associated with properties. For example, the present embodiment uses the names of buyers and sellers involved in a property transaction to obtain an indication as to whether an actual property transaction was between related parties, and after this information is obtained the name information is removed.
  • the pre-processing component 62 also includes a custom field data generator 80 arranged to generate custom property data for use in custom data fields that are used to train the machine learning model(s), and for use as inputs when property value indicators are inferred by the trained machine learning model(s).
  • the custom property data provides the machine learning model(s) with simulated ‘street level’ information of a human valuer, so that subtle, human-intuition based value aspects of a property not directly available through existing property information data sets can be incorporated into the trained model(s) and therefore the property value indicators that are predicted by the model(s).
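  • the following custom property data is produced by the custom field data generator 80: frontage_1, frontage_2, shape_quality_1, shape_quality_2, development_potential and vacant_demolition_indicator.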
  • the frontage_1 custom field data is calculated using a frontage_1 custom field generator 82
  • the frontage_2 custom field data is calculated using a frontage_2 custom field generator 84
  • the shape_quality_1 custom field data is calculated using a shape_quality_1 custom field generator 86
  • the shape_quality_2 custom field data is calculated using a shape_quality_2 custom field generator 88
  • the development_potential custom field data is calculated using a development_field custom field generator 90
  • the vacant_demolition_indicator custom field data is calculated using a vacant/demolition field generator 92.
  • the frontage_1 field provides a numerical indication of frontage characteristics of a parcel of land.
  • frontage_2 provides a numerical indication of frontage characteristics of a parcel of land.
  • the frontage_2 field therefore defines a ratio of total frontage of a parcel of land to the perimeter length of the parcel of land, and therefore provides a further numerical indication of the frontage characteristics of the parcel of land.
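As an illustration of the frontage_2 ratio described above, the sketch below computes total frontage divided by perimeter for a parcel represented as a polygon. The function name `frontage_ratio` and the way frontage edges are supplied (as a list of edge indices) are assumptions; how the system identifies road-facing edges is not covered here.

```python
import math


def frontage_ratio(vertices, frontage_edges):
    """Ratio of total frontage length to perimeter length for a parcel.

    vertices: list of (x, y) points describing the parcel polygon in order.
    frontage_edges: indices i of the edges (vertices[i] -> vertices[i + 1])
    that face a road.
    """
    n = len(vertices)
    lengths = [math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    perimeter = sum(lengths)
    frontage = sum(lengths[i] for i in frontage_edges)
    return frontage / perimeter
```

For a 20 m by 30 m rectangular lot with one 20 m side facing the street, the ratio is 20 / 100 = 0.2.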
  • the shape_quality_1 field includes custom property data calculated according to the following formula: where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a valuation entity number (VEN), since a VEN can include one or more individual parcels of land.
  • the data in the min_area_per_dwelling field is contained in a lookup table 50 stored in the data storage device 44.
  • the shape_quality_1 field provides a numerical indication of the quality of the shape of a parcel of land, for example in terms of utility value of the land shape.
  • the shape_quality_2 field includes custom property data calculated according to the following formula: where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
  • the shape_quality_2 field provides a further numerical indication of the quality of the shape of a parcel of land, for example in terms of utility value of the land shape.
  • the development_potential field provides a numerical indication of the potential number of subdivisions available for a parcel of land, and therefore the development potential of the parcel of land.
  • the vacant_demolition_indicator field includes custom property data determined according to the following rules:
  • sale_analysed_status is a field that defines the status of a parcel of land at date of sale, such as vacant ‘V’, demolished ‘D’ or improved ‘I’;
  • transaction_date is a field that includes transaction dates;
  • classification is a field that defines the current recorded status of a parcel of land;
  • purpose is a field that indicates the type of transaction associated with the transaction date, such as ‘GRV’, ‘UV’ or ‘CV’.
  • Rule 3: if the parcel of land was deemed ‘vacant (demolition)’ according to Rule 2, amend the value in the transaction price field by adding a defined demolition cost defined in a demolition cost field, in the present example $20,000, to the transaction price if the transaction date is in a defined transaction year. If the transaction date is not in the defined transaction year, then modify the defined demolition cost for earlier (or later) transaction years using a compounding yearly 2% inflation rate.
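The Rule 3 price adjustment can be illustrated with a short sketch. The function name `adjust_for_demolition` and the parameter `base_year` (standing for the defined transaction year) are hypothetical, and it is an assumption that the cost is compounded upwards for later years and discounted at the same rate for earlier years.

```python
def adjust_for_demolition(price, transaction_year, base_year,
                          demolition_cost=20_000.0, inflation=0.02):
    """Add an inflation-adjusted demolition cost to a transaction price.

    The cost is defined for base_year. It is compounded at the yearly rate
    for later years and discounted at the same rate for earlier years.
    """
    years = transaction_year - base_year
    return price + demolition_cost * (1 + inflation) ** years
```

With a defined transaction year of 2022, a $500,000 sale in 2022 becomes $520,000, while a sale two years later has approximately $20,808 added (20,000 × 1.02²).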
  • the vacant_demolition_indicator field includes property information determined according to the following rule:
  • custom property data, for example data that provides an indication of the usability and/or quality of a parcel of land and/or a building disposed on the land, may be produced by the custom field data generator 80.
  • the system 10 also generates related party data for inclusion in a related_parties field that indicates whether buying and selling parties are related using the names of the buying and selling parties.
  • the related_parties field is used to flag transactions that occur between related parties, in order to help determine whether a recorded transaction value is significantly below market rate and therefore whether relevant transaction-based taxes are potentially too low.
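The source states only that the related_parties field is derived from the names of the buying and selling parties, and that name information is removed afterwards. The sketch below is therefore a loose illustration: the shared-surname heuristic, the field names `buyer_names` and `seller_names`, and the function name `flag_and_anonymise` are all assumptions.

```python
def _surnames(names):
    """Collect the lower-cased final token of each full name."""
    return {name.split()[-1].lower() for name in names if name.split()}


def flag_and_anonymise(record):
    """Set related_parties from buyer and seller names, then drop the names.

    Assumed heuristic: parties are treated as related if any buyer shares a
    surname with any seller.
    """
    cleaned = dict(record)
    buyers = cleaned.pop("buyer_names", [])
    sellers = cleaned.pop("seller_names", [])
    cleaned["related_parties"] = int(bool(_surnames(buyers) & _surnames(sellers)))
    return cleaned
```

Computing the flag before removing the names mirrors the order described for the data anonymiser 78.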
  • Figure 5 shows a flow diagram 100 illustrating steps 102 to 126 of a method of predicting property value indicators according to an embodiment of the present invention.
  • Figures 6 to 16 show example pages of a user interface of the system 10 implemented using a web site.
  • separate machine learning models are used, one for each of the following combinations: GRV/residential, GRV/industrial, UV/residential, UV/industrial, CV/residential and CV/industrial.
  • the system 10 is arranged to periodically import property data from a data source 22, typically a government source of property information.
  • the imported property data is indicative of attributes of properties in a regional area, such as a national geographical area, a state geographical area or a local government authority geographical area.
  • the property attributes include features associated with the properties in the defined geographical area such as area of the relevant parcel of land, number of sides of a polygon representing the shape of a parcel of land, number of bedrooms in a building disposed on a parcel of land, whether the property has a garage, and so on.
  • the property data also includes data associated with previous actual property financial transactions associated with the properties, in the present embodiment including gross rental values (GRVs), unimproved values (UVs) and capital values (CVs).
  • GRVs gross rental values
  • UVs unimproved values
  • CVs capital values
  • a user desiring to obtain GRV, UV and/or CV property value indicators for a desired geographical area first logs into the system by accessing a website associated with the system and entering login details. After successful login, a home page 130 is displayed, as shown in Figure 6.
  • the home page 130 includes a residential selection menu 132 and an industrial selection menu 134 that are respectively usable to carry out residential and industrial property valuations.
  • a revaluation assessment generates property value indicators for the entire geographical area based on a new training set, whilst an interim assessment generates property value indicators based on one or more existing machine learning models.
  • the interim assessments may also be limited to a specific sub-geographical area or specific properties, such as properties created since the last assessment was carried out.
  • the user has selected a residential revaluation assessment, and in response an assessments page 136 is displayed, as shown in Figure 7 and indicated at step 104.
  • the assessments page 136 includes a list of existing assessments 138 that may be in progress assessments 140, wherein criteria have been at least partially defined for an assessment but the assessment has not yet been performed, and completed assessments 142, wherein the assessments have been performed by the system 10 and property value indicators determined.
  • Each in progress assessment 140 includes a continue button 144 usable to continue with an in progress assessment, and each completed assessment 142 includes a view outcome button 144 usable to view the outcome of the relevant assessment.
  • the assessments page 136 also includes a create assessment button 148 usable to create and implement a new assessment and thereby determine new property value indicators for properties in a defined geographical area.
  • Selection of the create assessment button 148 causes a new assessment criteria page 150 to be displayed, as shown in Figure 8.
  • Selection of the interim assessment option causes a similar page to be displayed with similar selection criteria available, but also an option to select a previous assessment.
  • Completed assessments are associated with existing trained machine learning models 54 that are stored in the data storage device 44.
  • the new assessment criteria page 150 includes a new assessment criteria pane 152, an enlarged view of which is shown in Figure 9.
  • the new assessment criteria pane 152 is used to receive assessment criteria information from the user, as indicated at step 104.
  • the assessment criteria information includes the following:
  • the system 10 ingests the relevant property data from the relational database 24 and carries out pre-processing actions on the data using the pre-processing component 62, as indicated at steps 106 to 118.
  • the pre-processing actions include standardisation of the data 106, wherein the property data is modified as necessary so as to be in a standard format used by the system 10; addition of missing values 110, wherein data values are added to fields if the data value can be reasonably inferred; outlier removal 112, wherein data values that are considered unlikely to be correct are removed; and data anonymisation 114, wherein data indicative of the names associated with properties is removed after an assessment has been made as to whether a relationship exists between seller and buyer.
  • the pre-processing actions may also include predefined data filtering 116 that for example removes particular properties from use as training data based on predefined filters that may include any one or more of the following: Lower/upper transaction price limit;
  • the pre-processing actions also include custom data generation 118, wherein custom property data for use in custom data fields is produced by the custom field generator 80.
  • the custom data is used to train the machine learning model(s), and as inputs to the trained machine learning model when property value indicators are inferred by the machine learning model(s).
  • the custom fields described above are generated: frontage_1, frontage_2, shape_quality_1, shape_quality_2, development_potential and vacant_demolition_indicator.
  • the system 10 trains the relevant machine learning models using property data and custom property data with associated transaction dates within the defined evidence date range 156 as model inputs, and the transaction data as outputs, but excluding the following:
  • the validation step involves carrying out the following performance tests:
  • a validation test wherein the property data and custom property data associated with the transaction values excluded from training according to the defined evidence date range 156 is provided as input to the machine learning models, and property value indicators (in the present example GRV, UV and CV values) inferred by the machine learning models for the relevant transaction dates of the excluded transaction values are compared to the excluded transaction values.
  • a blind test wherein the excluded property data and custom property data according to the defined validation date range 158 and blind testing ratio 160 is provided as input to the machine learning models, and the property value indicators (in the present example GRV, UV and CV) inferred by the machine learning models for the relevant transaction dates of the excluded transaction values are compared to the excluded transaction values.
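The random selection of blind test properties can be sketched as below. The function name `split_blind_test`, the use of property identifiers, and the fixed random seed are assumptions made for illustration; the source specifies only that properties are selected randomly from the validation set according to a blind testing ratio and that their information is withheld from training.

```python
import random


def split_blind_test(validation_ids, blind_testing_ratio, seed=0):
    """Randomly select a share of validation properties for blind testing.

    Returns (blind_ids, remaining_ids). All records for blind_ids would be
    withheld from training and used only to score predictions afterwards.
    """
    ids = sorted(validation_ids)
    count = round(len(ids) * blind_testing_ratio)
    blind_ids = set(random.Random(seed).sample(ids, count))
    remaining_ids = set(ids) - blind_ids
    return blind_ids, remaining_ids
```

After training, the transaction values of the withheld properties are compared with the values the model infers for them.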
  • the property data and custom property data associated with properties in the defined geographical area are used as inputs to the machine learning models, and the machine learning models produce property value indicators (in the present example residential GRV, UV and CV) that constitute predicted values for GRV, UV and CV at the date of valuation.
  • Figure 10 shows a preparation page 180 displayed to the user after the assessment criteria have been selected.
  • the preparation page 180 includes property identification data that identifies the sub-geographical areas, in this example the local government authorities (LGAs), that will be included in the assessment.
  • LGAs local government authorities
  • Figure 11 shows a processing page 188 displayed to a user during the training, validation and inference steps 120, 122 and 124, the processing page 188 including a processing graphic 190 showing the geographical area covered by the assessment. As indicated by step 126, after processing is complete, the predicted property value indicators are available for output to the user.
  • Figure 12 shows a results page 192 that includes a summary 194 of the assessment criteria selected using the new assessment criteria page 150 shown in Figures 8 and 9 and used for the assessment.
  • the results page 192 also includes a statistics panel 196 that includes summary information associated with the assessment.
  • a review page 200 is shown in Figure 13.
  • the review page 200 illustrates the relative influence of features 202 of the machine learning model inputs.
  • the first 2 features 202 are longitude and latitude location coordinate features.
  • a performance page 206 is shown in Figure 14, the performance page 206 showing system performance metrics obtained by carrying out the validation tests, and including an error distribution 208 indicating a median error of approximately 1%.
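One way an error distribution and its median could be derived from the validation results is sketched below. The source does not define the error metric, so the use of percentage error relative to the actual transaction value, the choice of the absolute value for the median, and both function names are assumptions.

```python
import statistics


def percentage_errors(predicted, actual):
    """Signed percentage error of each prediction relative to the actual value."""
    return [100.0 * (p - a) / a for p, a in zip(predicted, actual)]


def median_absolute_percentage_error(predicted, actual):
    """Median of the absolute percentage errors."""
    return statistics.median(abs(e) for e in percentage_errors(predicted, actual))
```

The list returned by `percentage_errors` is what would be binned to draw a distribution such as the one on the performance page.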
  • the ‘hit rate decay’, that is, the difference in error across the geographical area covered by the assessment, is low since the peak is relatively steep on both sides of the distribution.
  • a map page 210 is shown in Figures 15 and 16, Figure 15 displaying a first map view 212 showing a relatively large portion of the assessed geographical area, and Figure 16 displaying a second map view 216 showing a relatively small portion of the assessed geographical area.
  • the first map view 212 includes property clusters 214 that when selected cause the second map view 216 associated with the property cluster 214 to be displayed.
  • Each assessed property 218 is shown in the second map view 216 and illustrated using a coloured indicator, such as a coloured dot.
  • the assessed properties are selectable, and when selected relevant information associated with the property is displayed, including the predicted property value indicators.
  • the system 10 is arranged to enable a user to download the results of an assessment, for example as a CSV file.
  • the system 10 may also be arranged to analyse the predicted property value indicators to determine whether any of the property value indicators appear to be of interest, such as too low or too high, for example by marking properties that are considered to be within expectations using a first colour indicator and marking properties that are considered to be unexpected with a different second colour indicator.
  • the machine learning models determine property value indicators based only on property characteristics defined by the property data and custom property data.
  • the property value indicators for properties that have significant actual associated transaction data are not coloured by the actual transaction data, and properties are treated equally irrespective of whether significant, little or no actual transaction evidence exists.
  • the system 10 is able to achieve a low ‘hit rate decay’ with respect to the accuracy of the predicted property value indicators.
  • the ‘hit rate decay’ is a measure of how consistent the error is between predicted property value indicators and actual transaction values during model validation across the assessed geographical area.
  • predicted property value indicators may be used for any purpose that relies on property value indications to carry out an action, for example to calculate rates and/or taxes; to identify erroneous reported transaction values and therefore applicable tax amounts that are too low; by financial institutions to assess the value of property, for example as proposed loan security; or by insurance institutions.
  • the present system and method is able to produce property value indications significantly faster than is possible with conventional valuation techniques, and is therefore significantly more cost effective and efficient than conventional techniques.
  • the system was able to produce property value indications for 1,000,000 properties in about 30 minutes.
  • the present system and method is flexible in that it is able to generate property value indications for any valuation date and/or any individual property or group of properties, and to generate interim valuations quickly and cost effectively.
  • the present system and method is able to produce three different types of property value predictions (GRV, UV and CV) essentially in parallel.
  • the system 10 may also be arranged to enable a user to compare predicted property value indicators associated with assessments carried out for different dates of valuation. In this way, a user is able to compare predicted property value indicators (GRV, UV and/or CV) associated with different dates of valuations, for example to provide an indication of changes between the dates of valuation of mean property values and/or total property values.
  • in order to carry out a change analysis, a user selects a change analytics menu 230, which causes a change analytics page 232 to be displayed, as shown in Figure 18.
  • the change analytics page 232 includes a list of existing change analyses 234 that can be selected by the user to view the results of previously carried out change analyses.
  • Selection of a create new change analysis button causes an assessment comparison selection page 236 to be displayed, as shown in Figure 19.
  • the assessment comparison selection page 236 includes a first assessment selection drop down box 238 and a second assessment selection drop down box 240 usable to select 2 existing assessments that are desired to be compared.
  • a selected assessments page 242 is displayed, as shown in Figures 20a and 20b.
  • the selected assessments page 242 shows the first and second selected assessments in the first and second assessment selection drop down boxes 238, 240, the property data files 242 and assessment criteria 244 used for the first assessment, as shown in Figure 20a, and the property data files 246 and assessment criteria 248 used for the second assessment, as shown in Figure 20b.
  • Figure 21 shows a change analytics processing page 250 displayed to a user as the change analytics processing progresses, the processing page 250 including a processing graphic 252 showing the geographical area covered by the change analytics.
  • a results page 254 is displayed, as shown in Figures 22 and 23.
  • using a property index link 256 and a value change link 258, a user is able to display property index results 260, as shown in Figure 22, or value change results 261, as shown in Figure 23.
  • the property index results 260 show change analytic data for properties that have not changed in the period between the selected first and second assessments, and therefore properties that have not been renovated or in substance altered in the period are included in the change analytics, but properties that have changed, for example because the properties have been renovated or demolished, are excluded. In this way, the property index results 260 provide an indication of property value changes based purely on changes in property market values.
  • the property index results 260 include separate property index data sections for gross rental values (GRV) 262, unimproved values (UV) 264 and capital values (CV) 266, with each section including metro results 270a, 272a, 274a associated with properties located in a metropolitan area, country results 270b, 272b, 274b associated with properties located in a country area and state results 270c, 272c, 274c that represent combined metropolitan and country results.
  • GRV gross rental values
  • UV unimproved values
  • CV capital values
  • the property index results 260 include a property count column 280, a mean property percentage value increase column 282, a Q1 percentage value increase column 284, and a median value increase column 286.
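The property index columns described above (count, mean, Q1 and median percentage increase over unchanged properties) could be computed along the following lines. The function name `property_index_row`, the dictionary-based inputs and the quartile method (the default of `statistics.quantiles`) are assumptions; the source does not say how Q1 is calculated.

```python
import statistics


def property_index_row(first_values, second_values, changed_ids=()):
    """Summarise percentage value increases for unchanged properties.

    first_values / second_values: dicts mapping a property id to its predicted
    value in the first and second assessments. changed_ids lists properties
    that were renovated, demolished or otherwise altered and are excluded.
    At least two unchanged properties are needed to compute the quartile.
    """
    excluded = set(changed_ids)
    increases = [
        100.0 * (second_values[pid] - first_values[pid]) / first_values[pid]
        for pid in first_values
        if pid in second_values and pid not in excluded
    ]
    return {
        "property_count": len(increases),
        "mean_increase": statistics.mean(increases),
        "q1_increase": statistics.quantiles(increases, n=4)[0],
        "median_increase": statistics.median(increases),
    }
```

Excluding the changed properties is what lets the resulting figures reflect market movement rather than physical changes to the properties.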
  • the value change results 261 show total change analytic data for all properties covered by the assessments.
  • the value change results 261 include a total valuation section 290 that shows total value data for all properties covered by the assessments for both the first and second assessments, a valuation property count section 292 that shows counts for the total number of properties covered by the assessments for both the first and second assessments, a change valuation section 294 that shows percentage value change data for all properties, and a count change section 296 that shows percentage count change data for all properties.
  • Each section 290, 292, 294, 296 includes separate metro 298, country 300 and state 302 results corresponding to metropolitan properties, country properties and combined metropolitan and country properties.
  • between the first and second assessments, statewide GRV and UV increased by 246.08% and 228.89% respectively, and the total number of GRV and UV properties statewide increased by 130.17% and 129.61% respectively.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system is disclosed for predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation. The system comprises at least one machine learning model trained using a training data set to predict property value indicators at a selectable date of valuation, the training data set including first property data indicative of attributes of a plurality of properties as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs, and a custom property data generator arranged to determine custom second property data using the first property data. At least one machine learning model is also trained using the custom second property data as inputs, and property value indicators are inferred for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.

Description

A SYSTEM FOR PREDICTING PROPERTY VALUE INDICATORS
Field of the Invention
The present disclosure relates to a system for predicting property value indicators and to a method of predicting property value indicators.
Background of the Invention
The value of property is perceived by governments and the public as a fair and equitable basis for determining some rates and taxes, and for this purpose it is common for properties to be regularly re-valued, for example every 1 to 3 years, in order to obtain the appropriate property value indicators to use for new rates and taxes determinations. Such property value indicators may include gross rental values (GRVs) indicative of the annual gross rental amount that a property might reasonably be expected to produce; unimproved values (UVs) indicative of land value, exclusive of value attributable to a property that exists on the land; and capital values (CVs) indicative of the total value inclusive of land value and value attributable to a property that exists on the land.
In addition to government applications for determining rate and tax amounts, property value indicators may also be used to identify outlier transactions that for example potentially indicate erroneous reported transaction values, or deliberate reporting of inappropriate transaction amounts that do not reflect true value and are therefore potentially non-compliant with relevant tax legislation, such as transactions between related parties that refer to valuations significantly below an expected amount. Such erroneous or inappropriate transfer amounts can result in payment of a transfer tax amount that is too low.
Property value indicators may also be used by financial institutions, for example to assess the value of proposed loan security, or by insurance institutions.
Typical existing approaches for obtaining property value indicators for a large number of properties include valuations generated using a mass valuation tool, such as a rule based expert system, time series analysis tool, and/or a multiple regression analysis tool.
However, such approaches suffer from one or more limitations in that they are: labour and/or resource intensive; and/or reliant on human expertise and software familiarity; and/or overly reliant on previous transaction values; and/or difficult to manage on a large scale; and/or an oversimplification of the complex valuation process; and/or prone to human bias and subjectivity where human input is required; and/or prone to lack of consistency where human input is required and multiple people are used to assess different properties in a geographical area.
Notwithstanding the use of mass valuation tools by valuers, the task of producing valuations for a large number of properties is typically a very long process that can take several years to complete. In addition, at least for some of the valuation approaches, the quality of produced valuations may be low or the consistency of the quality of the produced valuations may be low.
It is also known to carry out manual valuations that do not use mass valuation tools but require detailed individual kerbside and computer inspections. Such assessments are considered to provide the most accurate valuations since they necessarily incorporate human ‘street level’ judgment that is specific to each property.
However, manual valuations are extremely inefficient and labour intensive to the extent that it is impractical to use manual valuations for mass valuation of properties.
Summary of the Invention
In accordance with a first aspect of the present invention, there is provided a system for predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation, the system comprising: at least one machine learning model trained using a training data set to predict property value indicators at a selectable date of valuation, the training data set including first property data indicative of attributes of a plurality of properties as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and a custom property data generator arranged to determine custom second property data using the first property data; wherein the at least one machine learning model is also trained using the custom second property data as inputs; and wherein property value indicators are inferred for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
In an embodiment, the first property data includes property data indicative of characteristics of properties. The characteristics may include configuration information associated with each parcel of land, property feature information, configuration information associated with each building disposed on each parcel of land, and/or the year that a building on a parcel of land was built.
In an embodiment, the property attributes include information indicative of whether a property is associated with multiple parcels of land and, if so, which parcels of land are associated with the property; whether the property is for commercial, residential, mixed, industrial or farming use; information indicative of the sub market area in which each property is located; and zoning classifications assigned to each property.
In an embodiment, the custom second property data is determined using data stored in at least one stored lookup table.
In an embodiment, the custom second property data includes data indicative of a quality and/or usability of a property.
The custom second property data may include frontage quality data indicative of a frontage quality of a property.
In an embodiment, the frontage quality data includes data determined using the following formula:
frontage_1 = Σ (boundary_line_length) if boundary_line_usage_code = 1
where boundary_line_length is a field that includes the length of a land boundary, and boundary_line_usage_code is a field that indicates the type of land boundary using a numerical code, and code ‘1’ indicates that the boundary line is a frontage boundary line.
In an embodiment, the frontage quality data includes data determined using the following formula:
frontage_2 = frontage_1 / Σ (boundary_line_length)
The custom second property data may include shape quality data indicative of a shape quality of a property.
In an embodiment, the shape quality data includes data determined using the following formula:
Figure imgf000005_0001
where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
In this embodiment, the data in the min_area_per_dwelling field is contained in a lookup table.
In an embodiment, the shape quality data includes data determined using the following formula:
Figure imgf000005_0002
where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
The custom second property data may include development potential data indicative of a development potential of a property.
In an embodiment, the development potential data includes data determined using the following formula:
development_potential = ven_land_area / min_area_per_dwelling
where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
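The frontage and development potential fields described above can be illustrated with a minimal Python sketch. It assumes that frontage_2 is the ratio of frontage length to total boundary length, that lengths and areas are in consistent units, and that usage code 2 stands for a non-frontage boundary; the class name, function names and example figures are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass

FRONTAGE_CODE = 1  # usage code that marks a frontage boundary line


@dataclass
class BoundaryLine:
    boundary_line_length: float    # length of one land boundary
    boundary_line_usage_code: int  # numerical code for the boundary type


def frontage_1(lines):
    """Sum of the lengths of boundary lines flagged as frontage."""
    return sum(
        b.boundary_line_length
        for b in lines
        if b.boundary_line_usage_code == FRONTAGE_CODE
    )


def frontage_2(lines):
    """Frontage length as a share of total boundary length (assumed reading)."""
    total = sum(b.boundary_line_length for b in lines)
    return frontage_1(lines) / total if total else 0.0


def development_potential(ven_land_area, min_area_per_dwelling):
    """Land area divided by the minimum area allowed per dwelling."""
    return ven_land_area / min_area_per_dwelling


# Hypothetical rectangular lot: one 20-unit frontage, three other boundaries.
lot = [
    BoundaryLine(20.0, 1),
    BoundaryLine(30.0, 2),  # code 2 is an assumed non-frontage code
    BoundaryLine(20.0, 2),
    BoundaryLine(30.0, 2),
]
lot_frontage_1 = frontage_1(lot)
lot_frontage_2 = frontage_2(lot)
lot_potential = development_potential(900.0, 300.0)
```

In this sketch the minimum area per dwelling is passed in directly; in the described system that value would come from a lookup table.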
In an embodiment, the system is arranged to determine based on the first property data whether a building disposed on the property has been demolished within a defined period after a transaction date, and to add a demolish financial amount to the property value indicator associated with the property if the property has been demolished within the defined period.
In an embodiment, the demolish financial amount is determined using the following formula:
Figure imgf000006_0001
where k is a constant.
In an embodiment, the system includes a validation component arranged to validate operation of the at least one machine learning model.
In an embodiment, the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined first subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
In an embodiment, the first subset of financial transaction data is defined based on a defined start validation financial transaction date and a defined end validation transaction date. The defined start and end validation transaction dates may be user definable.
In an embodiment, the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined second subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; excluding first and second property data associated with the excluded financial transaction data from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
In an embodiment, the defined second subset of financial transaction data is a subset of the first subset of financial transaction data.
In an embodiment, the second subset of financial transaction data may be selected randomly from the first subset of financial transaction data.
In an embodiment, the second subset of financial transaction data is a user selectable proportion of the first subset of financial transaction data.
In an embodiment, the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined third subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators; wherein the third subset of financial transaction data is defined based on transaction dates within a defined period of the date of valuation.
In an embodiment, the defined period is user definable.
In an embodiment, the system includes a data standardisation component arranged to standardise the first property data so that the first property data is in a format compatible with the at least one machine learning model.
In an embodiment, the system includes an outlier remover component arranged to remove property data that is considered to be unlikely to be correct.
In an embodiment, the system includes a missing value inferrer arranged to add data to a first property data field if the data for the property data field can be reasonably inferred.
In an embodiment, the missing value inferrer is arranged to add data to a property data field using a lookup table.
In an embodiment, the lookup table includes data indicative of average building areas, and the system is arranged to infer a building area of a property by determining whether the building area of the property is above or below an average property area using the first property data and the lookup table.
In an embodiment, the first property data includes names of buyers and sellers involved in a financial transaction, and the system is arranged to determine whether the financial transaction was between related parties using the names of buyers and sellers.
In an embodiment, the system includes a data anonymiser arranged to remove data indicative of the buyer and seller names associated with properties after the determination is made as to whether the financial transaction was between related parties using the names of buyers and sellers.
In an embodiment, the system is arranged to facilitate reception of property value assessment criteria used by the system to predict the property value indicators. The property value assessment criteria may include a date of valuation, at least one property value indication type that defines the or each type of property value indication predicted by the system, a residential or industrial property type, and/or a date of effect.
The property value indication type may include a gross rental value, an unimproved value and/or a capital value.
In an embodiment, the system includes a plurality of machine learning models, each machine learning model associated with a combination of one property value indication type and one of residential or industrial property type.
In an embodiment, the system includes a database and the first property information is obtained from an external data source and stored in the database, the system extracting first property data from the database and using the extracted first property data to train the at least one machine learning model and infer property value indicators.
In an embodiment, the system includes an online accessible user interface that may be a web-based user interface.
In an embodiment, the at least one machine learning model comprises at least one Gradient Boosting Machine (GBM) algorithm that uses an xgboost library.
In an embodiment, the system is arranged to facilitate selection of a revaluation option wherein at least one machine learning model is trained using current first and second property data to produce at least one trained machine learning model, and the at least one trained machine learning model is used to infer property value indicators for all properties in a geographical area, or an interim assessment wherein property value indicators for a defined set of properties are inferred using an existing at least one trained machine learning model.
In an embodiment, the system is arranged to compare predicted property value indicators associated with at least 2 property value assessments for at least 2 different dates of valuation and to produce property value change results.
In an embodiment, the property value change results include information indicative of changes in mean property values, median property values and/or total property values.
In an embodiment, the property value change results are based only on properties that have not changed in the period between the at least 2 different dates of valuation.
In an embodiment, the property value change results are based on all properties covered by the property value assessments.
In an embodiment, the property value change results include total value data indicative of the total financial value of all properties covered by the property value assessments, valuation property count data indicative of the total number of properties covered by the property value assessments, value change data indicative of the total financial value change of the total number of properties covered by the property value assessments, and a count change section indicative of a change in the total number of properties.
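The change results described above can be sketched in Python as a comparison of two assessments. This is an illustration only: it assumes each assessment is a mapping from a property identifier to a predicted value, and that the set of unchanged properties is supplied by the caller; the function names and figures are hypothetical.

```python
from statistics import mean, median


def summarise(values):
    """Count, total, mean and median for one assessment (must be non-empty)."""
    vals = list(values)
    return {
        "count": len(vals),
        "total": sum(vals),
        "mean": mean(vals),
        "median": median(vals),
    }


def value_change_results(earlier, later, unchanged_ids=None):
    """Differences between two assessments, optionally on unchanged properties only."""
    if unchanged_ids is not None:
        earlier = {k: v for k, v in earlier.items() if k in unchanged_ids}
        later = {k: v for k, v in later.items() if k in unchanged_ids}
    before = summarise(earlier.values())
    after = summarise(later.values())
    return {f"{key}_change": after[key] - before[key] for key in before}


# Hypothetical assessments at two dates of valuation; P3 is a new property.
assessment_a = {"P1": 400_000, "P2": 600_000}
assessment_b = {"P1": 440_000, "P2": 630_000, "P3": 500_000}

all_properties = value_change_results(assessment_a, assessment_b)
unchanged_only = value_change_results(
    assessment_a, assessment_b, unchanged_ids={"P1", "P2"}
)
```

Restricting the comparison to unchanged properties separates value movement from changes in the number of properties, which is why both views are useful.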
In accordance with a second aspect of the present invention, there is provided a method of predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation, the method comprising: receiving first property data indicative of attributes of a plurality of properties; determining custom second property data using the first property data; training at least one machine learning model using a training data set to predict property value indicators at a selectable date of valuation, the training data set including the first and second property data as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and inferring property value indicators for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
In an embodiment, the method includes, for each new property value indicator revaluation, training the at least one machine learning model using current first and second property data and current actual financial transaction data to produce at least one trained machine learning model, and using the at least one trained machine learning model to infer property value indicators for properties in a geographical area.
In an embodiment, the method includes, for each interim assessment, inferring property value indicators using an existing at least one trained machine learning model.
Brief Description of the Drawings
The present invention will now be described by way of example only with reference to the accompanying drawings, in which:
Figure 1 is a schematic block diagram of a system for predicting property values in accordance with an embodiment of the present invention;
Figure 2 is a block diagram illustrating functional application components of the system shown in Figure 1;
Figure 3 is a block diagram illustrating application components of a pre-processing application component shown in Figure 2;
Figure 4 is a block diagram illustrating application components of a custom field generator application component shown in Figure 3;
Figure 5 is a flow diagram illustrating a method of predicting a property value indicator in accordance with an aspect of the present invention;
Figure 6 is a representation of a home page of a user interface of the system shown in Figure 1;
Figure 7 is a representation of a revaluation page of a user interface of the system shown in Figure 1;
Figure 8 is a representation of a new assessment criteria page of a user interface of the system shown in Figure 1;
Figure 9 is a representation of an assessment criteria pane of the assessment criteria page shown in Figure 8;
Figure 10 is a representation of a preparation page of a user interface of the system shown in Figure 1;
Figure 11 is a representation of a processing page of a user interface of the system shown in Figure 1;
Figure 12 is a representation of a results page of a user interface of the system shown in Figure 1;
Figure 13 is a representation of a review page of a user interface of the system shown in Figure 1;
Figure 14 is a representation of a performance page that includes an error distribution associated with a blind validation test;
Figure 15 is a representation of a map page of a user interface of the system shown in Figure 1 showing a first map view;
Figure 16 is a representation of the map page of Figure 15 showing a second map view;
Figures 17a to 17e show example tables of an example relational database structure of the system shown in Figure 1;
Figure 18 is a representation of a change analytics page of a user interface of the system shown in Figure 1;
Figure 19 is a representation of an assessment comparison selection page of a user interface of the system shown in Figure 1;
Figures 20a and 20b are representations of a selected assessments page of a user interface of the system shown in Figure 1;
Figure 21 is a representation of a change analytics processing page of a user interface of the system shown in Figure 1;
Figure 22 is a representation of a property index results page of a user interface of the system shown in Figure 1; and
Figure 23 is a representation of a property index results page of a user interface of the system shown in Figure 1.
Detailed Description of Embodiments
The present disclosure relates to a system for and method of predicting property value indicators.
In this specification, the term ‘property’ will be understood to mean a vacant parcel of land, or a developed parcel of land, that is, a parcel of land on which a building is disposed.
It will be understood that in this specification the term ‘property value indicator’ means any numerical value that is indicative of a financial value of a property, and in the embodiments described in this specification, the following examples of property value indicators are used:
Gross Rental Value (GRV);
Unimproved Value (UV); and
Capital Value (CV).
GRV
The Gross Rental Value (GRV) of a property is the annual rental financial amount that would be expected to be received for a property at a defined date. If a property is vacant land, the GRV may for example be calculated as a defined percentage of the Unimproved Value (UV) of the property, such as 3% of the UV.
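As a worked example of the vacant land case above, the following minimal sketch applies the 3% figure given as an example in the text. The function name, the percentage parameter and the dollar amount are hypothetical.

```python
def vacant_land_grv(unimproved_value, percent_of_uv=3):
    """GRV of vacant land as a defined percentage of its Unimproved Value."""
    return unimproved_value * percent_of_uv / 100


# A vacant parcel with an assumed UV of 400,000 gives an annual GRV of 12,000.
example_grv = vacant_land_grv(400_000)
```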
UV
The Unimproved Value (UV) of a property is the sale financial amount that would be expected to be received for a parcel of land associated with a property. Accordingly, the UV is land value only, and does not include the financial value of any building disposed on the land.
CV
The Capital Value (CV) of a property is the sale financial amount that would be expected to be received for the property, inclusive of both the financial value of a parcel of land associated with a property and the financial value of any building disposed on the property.
The term ‘property value indicator’ is synonymous with the term ‘transaction value’ used in the property valuation industry to mean GRV, UV and/or CV. In the present specification, ‘transaction value’ is used to refer to financial values associated with actual GRV, UV and CV transactions that have occurred, and the term ‘property value indicator’ is used to refer to predicted financial values for GRVs, UVs and CVs.
In the present specification, it will be understood that the Date of Valuation (DoV) is a defined point in time at which properties in an assessment area are assessed to determine property value indicators (i.e. predicted values) for GRV, UV and/or CV.
Referring to the drawings, Figure 1 shows a schematic block diagram of a system 10 for predicting property value indicators according to an embodiment of the invention.
The system 10 in this example is accessible through a wide area network such as the Internet 12, and is arranged to enable multiple authorised users to interact with the system 10, for example using a computing device that may be any suitable device including a personal computer 14, tablet computer 16 and/or a smartphone 18. The system 10 is also accessible by a system administrator, such as using a suitable computing device 20, for example in order to manage user authorisations.
The system 10 uses machine learning to predict property value indicators based on property information obtained from a suitable source of property information, for example a source associated with a government authority, and based on custom property information produced using the property information.
The present inventors have realised that property value indicators that are fit for purpose can be obtained using machine learning if custom property information is produced that simulates ‘street level’ judgement of a human valuer, and the custom property information is used with conventional property information to train a machine learning model and infer data outcomes using the trained model. In this way, subtle, human-intuition-based value aspects of a property that are not directly available through existing property information data sets can be incorporated into the machine learning models and a simulated human valuation produced. Such custom property information is produced based on the existing property information using defined algorithms, as described in more detail below.
In this example, the system 10 is arranged to predict property value indicators based on property information obtained from a source 22 associated with a government authority, and in an example the property information is imported to the system 10 periodically from the government source 22, for example every month.
In the present embodiments, it will be understood that the property value indicators are predicted using the property data and custom property data as inputs to the machine learning models, and without using actual transaction values as machine learning model inputs. As a consequence, properties are treated equally irrespective of whether significant, little or no actual transaction evidence exists and a consistent valuation approach is taken to each assessed property.
Typically, the imported property information includes the following attribute information associated with a property:
- a unique valuation entity number (VEN) indicative of one unit of valuation, in this example a valuation for one or more properties associated with the unit of valuation;
- a unique property identifier for each property, that is, each parcel of land and any associated building disposed on the parcel of land;
- configuration information associated with each parcel of land, such as the shape and area of the parcel of land;
- property characteristic information, such as whether a garage exists or whether a pool exists;
- configuration information associated with each building disposed on a parcel of land, such as the shape and area of the building footprint;
- transaction information indicative of the sale amount and/or rental amount of properties and the relevant dates of the transactions;
- information indicative of whether a property is associated with multiple parcels of land and, if so, which parcels of land are associated with the property;
- the year that a building on a parcel of land was built;
- whether a property has an ocean view;
- the number of bedrooms in each building on a parcel of land;
- whether the property is for commercial, residential, mixed, industrial or farming use;
- information indicative of the sub market area, such as the suburb, in which a property is located; and
- the zoning classifications assigned to each parcel of land.
However, it will be understood that additional, or different, property attribute data including any other property feature data may be imported, the important aspect being that the property information is indicative of an attribute of a property and includes information that could be considered to be a factor relevant to consideration of the value of a property.
In this example, the imported data is stored in a relational database 24 that includes several related tables. In the present embodiment, the tables include: a valuation entity number (VEN) table 26 for storing unique VEN numbers received from the data source 22; a land table 28 for storing unique property identifiers received from the data source 22, each property identifier indicative of a parcel of land that may or may not include an associated building disposed on the parcel of land; a zoning table 30 for storing a zoning classification assigned to each parcel of land; a features table 32 for storing property feature information, for example that includes information indicative of the year a building on a property was built, whether an ocean view exists, the floor area of a building, the number of bedrooms of a building, and building and land shape information; a classifications table 34 for storing information indicative of whether the property is for commercial, residential, mixed, industrial or farming use; a transactions table 36 for storing information indicative of actual sale amounts and/or rental amounts of properties and the relevant dates of the transactions; and a sub market areas table 38 for storing information indicative of the sub market area, such as the suburb and/or local government authority (LGA), in which a property is located.
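The table structure described above can be sketched with Python's built-in sqlite3 module. This is an illustrative schema only: the table names follow the description, but every column name, key and relationship is an assumption and is not taken from the specification.

```python
import sqlite3

# Hypothetical schema: one row per VEN, with land parcels linked to a VEN
# and the remaining tables keyed by the parcel's property identifier.
SCHEMA = """
CREATE TABLE ven (ven_id INTEGER PRIMARY KEY);
CREATE TABLE land (
    property_id INTEGER PRIMARY KEY,
    ven_id INTEGER NOT NULL REFERENCES ven(ven_id)
);
CREATE TABLE zoning (
    property_id INTEGER REFERENCES land(property_id),
    zoning_code TEXT
);
CREATE TABLE features (
    property_id INTEGER REFERENCES land(property_id),
    year_built INTEGER,
    ocean_view INTEGER,
    floor_area REAL,
    bedrooms INTEGER
);
CREATE TABLE classifications (
    property_id INTEGER REFERENCES land(property_id),
    land_use TEXT
);
CREATE TABLE transactions (
    property_id INTEGER REFERENCES land(property_id),
    transaction_date TEXT,
    amount REAL,
    transaction_type TEXT
);
CREATE TABLE sub_market_areas (
    property_id INTEGER REFERENCES land(property_id),
    suburb TEXT,
    lga TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# One valuation entity covering two parcels of land.
conn.execute("INSERT INTO ven VALUES (1)")
conn.executemany("INSERT INTO land VALUES (?, ?)", [(10, 1), (11, 1)])

parcels_per_ven = conn.execute(
    "SELECT ven_id, COUNT(*) FROM land GROUP BY ven_id"
).fetchall()
table_names = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}
```

Linking several land rows to one VEN row is one way to represent a property associated with multiple parcels of land.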
A further example relational database structure that shows example table fields is shown in Figures 17a to 17e.
In this example, functionality of the system 10 is controlled, coordinated and implemented using a control unit 40 that in the present embodiment includes a processor arranged to implement programs associated with program data 42 stored in a data storage device 44 using memory 46.
In this example, the data storage device 44 also stores custom data 48 produced using the property data stored in the relational database 24, and for example using lookup table data 50, such as table data usable to facilitate correction or inference of missing data in the relational database 24.
The lookup table data 50 may also include other tables, such as a minimum area table indicative of the minimum area allowed for development of a building on a parcel of land in each geographical area.
The data storage device 44 also stores machine learning model data 52 indicative of one or more machine learning models that are capable of being trained using the imported property data and custom data, and used to predict property value indicators.
In the present example, the machine learning model(s) are based on at least one Gradient Boosting Machine (GBM) algorithm that uses an xgboost library, although it will be understood that any suitable machine learning framework may be used.
In this example, the xgboost parameters used for training include the following:
eta = 0.02
gamma = 0
subsample = 0.7
colsample_bytree = 0.7
max_depth = 5 (for Residential) or 3 (for Industrial)
nthread = 8
nrounds = 2000
In the present example, an RMSLE (Root Mean Squared Logarithmic Error) is used as the loss function of the GBM algorithm.
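The training parameters and loss measure described above can be sketched in plain Python as follows. The dictionary uses the xgboost parameter spellings max_depth and nthread; nrounds is the name used by the xgboost R interface for the number of boosting rounds, which the Python interface calls num_boost_round. The rmsle function follows the standard definition of the metric, and how the described system passes it to the library is not stated, so this is a sketch under those assumptions with hypothetical function names.

```python
import math

# Values as listed in the description; shared by both property types.
BASE_PARAMS = {
    "eta": 0.02,
    "gamma": 0,
    "subsample": 0.7,
    "colsample_bytree": 0.7,
    "nthread": 8,
}
MAX_DEPTH = {"residential": 5, "industrial": 3}
NROUNDS = 2000  # number of boosting rounds


def training_params(property_type):
    """Parameter dictionary for a residential or industrial model."""
    return {**BASE_PARAMS, "max_depth": MAX_DEPTH[property_type]}


def rmsle(predicted, actual):
    """Root Mean Squared Logarithmic Error over paired non-negative values."""
    squared = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


residential_params = training_params("residential")
industrial_params = training_params("industrial")
perfect_score = rmsle([500_000.0, 750_000.0], [500_000.0, 750_000.0])
```

Because the error is measured on logarithms, a prediction that is 10% too high contributes about the same amount for a low-value property as for a high-value one.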
The data storage device 44 also stores data indicative of trained machine learning models 54, each trained model 54 associated with a previous assessment and capable of being implemented to perform new assessments based on the trained model, for example as an interim assessment.
In the present embodiment, a machine learning model is defined for each of the following property value indicators: i) GRV for residential properties; ii) GRV for industrial properties; iii) UV for residential properties; iv) UV for industrial properties; v) CV for residential properties; and vi) CV for industrial properties.
However, it will be understood that a single machine learning model may be used for GRV, UV and CV and both residential and industrial properties, or multiple machine learning models specific to a combination of one or more of the above property value indicators i) to vi) may be used.
In the present embodiment, the system 10 is implemented using a client-server model with the system accessible by a client computing device through the Internet, for example by accessing a dedicated web page and providing authentication information. However, it will be understood that other implementations are possible.
Example functional program components 56 implemented by the system 10 based on the stored program data 42 are shown in Figure 2.
In this example, the functional program components 56 include a criteria selection application 58 arranged to facilitate selection by a user of criteria that will be used by the system 10 to train the machine learning model(s) of the system 10; criteria that will be used by the system to validate operation of the system 10, in particular in terms of accuracy of the predicted property value indicators; and criteria that will be used to set the parameters of the desired predicted property value indicators, such as the relevant date of valuation and the geographical scope of the predictions.
In the present embodiment, the criteria selection application 58 receives training and implementation criteria from an operator using a dedicated user interface that in this example is accessed through a web site by providing suitable authentication details.
In the present embodiment, the selectable criteria include the following:
- A date of valuation (DoV) that corresponds to the date at which properties in a defined geographical area are assessed to determine property value indicators.
- An evidence date range defined by specifying start and end evidence dates, the evidence date range defining a training information set having transaction dates within the start and end evidence dates and being used to train the machine learning model(s).
- A validation date range defined by specifying start and end validation dates, the validation date range defining a validation information set having transaction dates within the start and end validation dates, and being used to validate operation of the system 10. Transaction values are removed from training data having transaction dates within the validation date range, and the removed transaction values are subsequently compared with property value indicators predicted by the system to provide an indication of performance of the system.
- A blind testing ratio that defines the percentage of the properties in the validation date range that will be used to assess performance of the system 10 using a blind testing methodology. The properties used for blind testing are selected randomly from the validation information set. With blind testing, all property information for the blind testing properties is removed from the training information set, and the transaction values of the removed properties are subsequently compared with property value indicators predicted by the system for the removed properties to provide an indication of performance of the system.
- A regulation period defined by specifying start and end regulation dates, the regulation period defining a regulation validation information set having transaction dates within the start and end regulation dates, and being used to validate operation of the system 10 for transaction dates that are relatively close to the date of valuation. Transaction values are removed from training data having transaction dates within the regulation date range, and the removed transaction values are subsequently compared with property value indicators predicted by the system to provide an indication of performance of the system. The regulation period covers a period that is close to and spans the date of valuation, and in this way relatively recent actual GRV, UV and CV values can be compared against system predicted GRV, UV and CV values. Typically, the regulation period spans a period 2/3 months before and 2/3 months after the date of valuation.
- A capital values (CV) field that when selected causes CV property value indicators to be produced by the system.
- A capital value (CV) date of effect that defines the date that the property valuations for CV will take effect, for example for rates and taxing purposes.
- An unimproved values (UV) field that when selected causes UV property value indicators to be produced by the system.
- An unimproved value (UV) date of effect that defines the date that the property valuations for UV will take effect, for example for rates and taxing purposes.
- A gross rental value (GRV) field that when selected causes GRV property value indicators to be produced by the system.
- A gross rental value (GRV) date of effect that defines the date that the property valuations for GRV will take effect, for example for rates and taxing purposes.
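The evidence date range, validation date range and blind testing ratio described in the criteria above can be sketched as a data split. This is a minimal illustration: the record layout, function names, fixed random seed and example dates are assumptions, and the sketch works on transaction records only rather than on the full property data.

```python
import random
from datetime import date


def in_range(d, start, end):
    """True if date d falls within the inclusive range start..end."""
    return start <= d <= end


def split_transactions(transactions, evidence, validation, blind_percent, seed=0):
    """Split transactions into training, validation and blind testing sets."""
    validation_set = [t for t in transactions if in_range(t["date"], *validation)]

    # Blind testing properties are drawn at random from the validation set.
    n_blind = int(len(validation_set) * blind_percent / 100)
    blind_set = random.Random(seed).sample(validation_set, n_blind)
    blind_ids = {t["property_id"] for t in blind_set}

    # Training keeps evidence-range transactions, minus anything held out
    # for validation and minus every record of a blind testing property.
    training_set = [
        t
        for t in transactions
        if in_range(t["date"], *evidence)
        and not in_range(t["date"], *validation)
        and t["property_id"] not in blind_ids
    ]
    return training_set, validation_set, blind_set


# Hypothetical data: ten properties, one sale each, January to October 2022.
transactions = [
    {"property_id": i, "date": date(2022, i, 15), "value": 400_000 + 10_000 * i}
    for i in range(1, 11)
]
training, validation, blind = split_transactions(
    transactions,
    evidence=(date(2020, 1, 1), date(2022, 12, 31)),
    validation=(date(2022, 7, 1), date(2022, 12, 31)),
    blind_percent=50,
)
```

The held-out transaction values would then be compared with the property value indicators predicted for the same properties. A regulation period could reuse the same date filter with dates either side of the date of valuation.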
The program components 56 also include a data ingester 60 that extracts property information from the relational database 24 according to the criteria defined using the criteria selection application 58, and a pre-processing component 62 that processes the extracted property information prior to training the machine learning model(s) and inferring new property value indicators. The program components 56 also include a model training component 64 arranged to implement training of the machine learning model(s), a model validation component 66 arranged to implement validation of operation of the machine learning model(s), and a model application component 68 arranged to apply the machine learning model(s) to property data and custom data in order to infer new property value indicators.
Example functional components of the pre-processing component 62 are shown in Figure 3.
In this embodiment, the pre-processing component 62 includes a data standardisation component 70 that standardises the property data received from the data source 22 so that the data used to train the machine learning model(s) and produce predicted property value indicators is in a format compatible with the machine learning model(s). In the present embodiment, the data standardisation component 70 carries out the following actions:
- Since datasets use different conventions for separating words in labels, such as commas, spaces and underscore lines, the data standardisation component 70 modifies the data where necessary so that all data set labels separate words using underscore lines.
- Since percentage values may be stored using different conventions, such as a number between 0 and 100 or a number between 0 and 1, the data standardisation component 70 modifies the data where necessary so that all percentage values are numbers between 0 and 100.
- All data values that are expected to be numerical are considered to be valid or not valid according to whether the values appear to be numerical or not. For example, ‘x123’ is considered to be invalid, ‘0123’ is considered to be valid, ‘0123’ is considered to be valid, ‘-123’ is considered to be valid, ‘+123’ is considered to be valid, ‘1.2.3’ is considered to be invalid, ‘123‘ is considered to be valid, ‘$123’ is considered to be valid, ‘1e3’ is considered to be valid and ‘1+3’ is considered to be invalid.
- All data values that are considered to be dates are stored in the format YYYY-MM-DD, and any dates that are not in this format are converted.
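By way of illustration only, the standardisation actions listed above might be sketched in Python as follows. The function names, the regular expression, the per-column percentage heuristic and the list of accepted input date formats are all assumptions made for this sketch and are not taken from the described embodiment.

```python
import re
from datetime import datetime

# Assumed pattern: optional '$', optional sign, digits with at most one
# decimal point, optional exponent. Surrounding whitespace is ignored.
_NUMERIC = re.compile(r"^\s*\$?[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?\s*$")

# Hypothetical list of input date formats, tried in order.
_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%Y%m%d")


def standardise_label(label: str) -> str:
    """Separate the words of a data set label using underscores."""
    return re.sub(r"[,\s_]+", "_", label.strip())


def standardise_percentages(values: list[float]) -> list[float]:
    """Express a column of percentages as numbers between 0 and 100.

    Assumption: a column whose values all lie in [0, 1] is stored as fractions.
    """
    if values and all(0.0 <= v <= 1.0 for v in values):
        return [v * 100.0 for v in values]
    return list(values)


def is_valid_number(text: str) -> bool:
    """Return True if the text appears to be numerical."""
    return _NUMERIC.match(text) is not None


def standardise_date(text: str) -> str:
    """Convert a date in one of the assumed formats to YYYY-MM-DD."""
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")
```

The pattern reproduces the validity examples given above, but it is only one possible reading of "appears to be numerical". The percentage heuristic cannot distinguish a column of genuine percentages that all happen to be 1 or less, so a real implementation would likely rely on per-source metadata instead.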
The pre-processing component 62 also includes an outlier remover component 74 that removes data values considered to be unlikely to be correct, for example if a data value indicates that a property was built after the current date or the property has a land area less than 0.
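A minimal sketch of the two outlier checks mentioned above is given below. The record layout and the field names year_built and land_area are hypothetical, and representing a removed value as None is an assumption of this sketch.

```python
from datetime import date
from typing import Optional


def remove_outliers(record: dict, today: Optional[date] = None) -> dict:
    """Return a copy of the record with implausible values removed (set to None)."""
    today = today or date.today()
    cleaned = dict(record)

    # A property cannot have been built after the current date.
    year_built = cleaned.get("year_built")
    if year_built is not None and year_built > today.year:
        cleaned["year_built"] = None

    # A land area less than 0 is not physically meaningful.
    land_area = cleaned.get("land_area")
    if land_area is not None and land_area < 0:
        cleaned["land_area"] = None

    return cleaned
```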
The pre-processing component 62 also includes a missing value inferrer 76 that adds data values to some fields if the data value can be reasonably inferred. For example, if a data field associated with the presence of a garage at a property is blank, it can be inferred that no garage exists. The standard data format used by the system to represent this is ‘0’, so in this case the missing value inferrer 76 adds ‘0’ to the garage data field. Other missing data values cannot be inferred simply from the absence of a data value, but may be inferred from other information. For example, if a data field that contains a value for house area is blank, a lookup table 50 is used to estimate the house area. In this example, the lookup table for house area includes values for average house areas in each defined geographical area, such as each suburb or local government area (LGA), based on the number of rooms in the house and an allowance data field. The allowance data field provides an indication as to whether the rooms in the house are smaller or larger than average, for example in percentage terms. If the allowance field is blank, the missing value inferrer 76 infers that the value of the allowance field is zero, which indicates that the room sizes of the house are of average size.
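The inference behaviour described above might be sketched as follows. The lookup table contents, the field names and, in particular, the way the allowance percentage is applied to the average area are assumptions made for illustration; the description does not specify how the allowance modifies the looked-up value.

```python
# Hypothetical lookup table: (suburb, number of rooms) -> average house area in m^2.
AVERAGE_HOUSE_AREA = {
    ("Exampleville", 3): 120.0,
    ("Exampleville", 4): 160.0,
}


def infer_missing(record: dict) -> dict:
    """Return a copy of the record with inferable blank fields filled in."""
    filled = dict(record)

    # A blank garage field is taken to mean that no garage exists.
    if filled.get("garage") in (None, ""):
        filled["garage"] = 0

    # A blank allowance is taken to be zero (rooms of average size).
    if filled.get("allowance") in (None, ""):
        filled["allowance"] = 0.0

    # A blank house area is estimated from the lookup table where possible.
    if filled.get("house_area") in (None, ""):
        key = (filled.get("suburb"), filled.get("rooms"))
        average = AVERAGE_HOUSE_AREA.get(key)
        if average is not None:
            # Assumption: the allowance scales the average area as a percentage.
            filled["house_area"] = average * (1 + filled["allowance"] / 100.0)

    return filled
```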
The pre-processing component 62 also includes a data anonymiser 78 that removes data indicative of the names associated with properties. For example, the present embodiment uses the names of buyers and sellers involved in a property transaction to obtain an indication as to whether an actual property transaction was between related parties, and after this information is obtained the name information is removed.
The pre-processing component 62 also includes a custom field data generator 80 arranged to generate custom property data for use in custom data fields that are used to train the machine learning model(s), and for use as inputs when property value indicators are inferred by the trained machine learning model(s).
The custom property data provides the machine learning model(s) with simulated ‘street level’ information of a human valuer, so that subtle, human-intuition based value aspects of a property not directly available through existing property information data sets can be incorporated into the trained model(s) and therefore the property value indicators that are predicted by the model(s).
In the present embodiment, the following custom property data is produced by the custom field data generator 80:
- frontage_1
- frontage_2
- shape_quality_1
- shape_quality_2
- development_potential
- vacant_demolition_indicator
As shown in Figure 4, the frontage_1 custom field data is calculated using a frontage_1 custom field generator 82, the frontage_2 custom field data is calculated using a frontage_2 custom field generator 84, the shape_quality_1 custom field data is calculated using a shape_quality_1 custom field generator 86, the shape_quality_2 custom field data is calculated using a shape_quality_2 custom field generator 88, the development_potential custom field data is calculated using a development_potential custom field generator 90, and the vacant_demolition_indicator custom field data is calculated using a vacant/demolition field generator 92.
frontage_1
The frontage_1 field includes custom property data calculated according to the following formula:
frontage_1 = Σ (boundary_line_length) if boundary_line_usage_code = 1
where boundary_line_length is a field that defines the length of a land boundary, and boundary_line_usage_code is a field that indicates the type of land boundary using a numerical code. Code ‘1’ indicates that the boundary line is a frontage boundary line.
The frontage_1 field provides a numerical indication of frontage characteristics of a parcel of land.
frontage_2
The frontage_2 field includes custom property data calculated according to the following formula:
frontage_2 = frontage_1 / Σ (boundary_line_length)
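The two frontage formulas can be sketched in Python as shown below. The BoundaryLine class and the handling of a zero-length perimeter are assumptions made for this sketch; the field names follow those used in the formulas.

```python
from dataclasses import dataclass

FRONTAGE_CODE = 1  # usage code that marks a frontage boundary line


@dataclass
class BoundaryLine:
    boundary_line_length: float
    boundary_line_usage_code: int


def frontage_1(lines: list[BoundaryLine]) -> float:
    """Sum the lengths of the boundary lines whose usage code marks frontage."""
    return sum(
        line.boundary_line_length
        for line in lines
        if line.boundary_line_usage_code == FRONTAGE_CODE
    )


def frontage_2(lines: list[BoundaryLine]) -> float:
    """Ratio of total frontage to the sum of all boundary line lengths."""
    perimeter = sum(line.boundary_line_length for line in lines)
    if perimeter == 0:
        return 0.0  # assumption: a parcel without boundary data has no frontage ratio
    return frontage_1(lines) / perimeter
```

For a rectangular parcel measuring 20 m by 40 m with a single 20 m street frontage, frontage_1 is 20 and frontage_2 is 20/120, or about 0.17.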
The frontage_2 field therefore defines a ratio of total frontage of a parcel of land to the perimeter length of the parcel of land, and therefore provides a further numerical indication of the frontage characteristics of the parcel of land.
shape_quality_1
The shape_quality_1 field includes custom property data calculated according to the following formula:
Figure imgf000022_0001
where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a valuation entity number (VEN), since a VEN can include one or more individual parcels of land.
In this embodiment, the data in the min_area_per_dwelling field is contained in a lookup table 50 stored in the data storage device 44.
The shape_quality_1 field provides a numerical indication of the quality of the shape of a parcel of land, for example in terms of utility value of the land shape.
shape_quality_2
The shape_quality_2 field includes custom property data calculated according to the following formula:
Figure imgf000023_0001
where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
The shape_quality_2 field provides a further numerical indication of the quality of the shape of a parcel of land, for example in terms of utility value of the land shape.
development_potential
The development_potential field includes custom property data calculated according to the following formula:
development_potential = ven_land_area / min_area_per_dwelling
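As a brief illustration, the development_potential formula might be computed as follows. The summing of individual parcel areas into ven_land_area follows the definition of that field given earlier; the function names and the rejection of non-positive minimum areas are assumptions of this sketch.

```python
def ven_land_area(parcel_areas: list[float]) -> float:
    """Sum of all land areas associated with one valuation entity number (VEN)."""
    return sum(parcel_areas)


def development_potential(ven_area: float, min_area_per_dwelling: float) -> float:
    """Ratio of VEN land area to the minimum area allowed per dwelling."""
    if min_area_per_dwelling <= 0:
        # Assumption: a non-positive minimum area is treated as invalid input.
        raise ValueError("min_area_per_dwelling must be positive")
    return ven_area / min_area_per_dwelling
```

For example, a VEN made up of two parcels of 450 m² and 350 m² with a minimum area per dwelling of 200 m² gives a development_potential of 4.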
The development_potential field provides a numerical indication of the potential number of subdivisions available for a parcel of land, and therefore the development potential of the parcel of land.
vacant_demolition_indicator
The vacant_demolition_indicator field includes custom property data determined according to the following rules:
Rule 1
If sale_analysed_status for a parcel of land is 'V' (vacant) or 'D' (demolished), classification is 'V' at the transaction date and for at least 180 days after, and purpose is 'GRV' at transaction date and for at least 180 days after, then the land is deemed ‘vacant (classic)’.
where sale_analysed_status is a field that defines the status of a parcel of land at date of sale, such as vacant ‘V’, demolished ‘D’ or improved ‘I’; transaction date is a field that includes transaction dates; classification is a field that defines the current recorded status of a parcel of land; and purpose is a field that indicates the type of transaction associated with the transaction date, such as ‘GRV’, ‘UV’ or ‘CV’.
Rule 2
If sale_analysed_status for the parcel of land is 'I' and classification is not 'V' at the transaction date, but becomes 'V' between 90 and 455 days after the transaction date and remains with a classification 'V' for at least 180 days, then the land is deemed ‘vacant (demolition)’.
Rule 3
If the parcel of land was deemed ‘vacant (demolition)’ according to Rule 2, amend the value in the transaction price field by adding a defined demolition cost defined in a demolition cost field, in the present example $20,000, to the transaction price if the transaction date is in a defined transaction year. If the transaction date is not in the defined transaction year, then modify the defined demolition cost for earlier (or later) transaction years using a compounding yearly 2% inflation rate.
For example, if a defined demolition cost is defined as a constant k (for example $20,000), the formula for demolition cost is as follows:
Figure imgf000024_0001
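Rules 1 to 3 might be sketched in Python as follows. This is one possible reading only: the formula in the figure above is not reproduced here, so the compounding expression is inferred from the wording of Rule 3, and the value chosen for the defined transaction year is hypothetical. The inputs to classify_vacancy are assumed to be pre-computed summaries of the classification and purpose history of a parcel.

```python
from typing import Optional

DEMOLITION_COST = 20_000.0  # the constant k in the example above
INFLATION_RATE = 0.02       # compounding yearly rate
BASE_YEAR = 2022            # hypothetical "defined transaction year"


def classify_vacancy(sale_analysed_status: str,
                     classification_at_sale: str,
                     days_until_vacant: Optional[int],
                     days_vacant: int,
                     days_grv_purpose: int) -> Optional[str]:
    """Apply Rules 1 and 2.

    days_until_vacant: days from the transaction date until classification
        becomes 'V' (None if it never does).
    days_vacant: continuous days the classification then remains 'V'.
    days_grv_purpose: continuous days from the transaction date for which
        purpose is 'GRV' (0 if it is not 'GRV' at the transaction date).
    """
    if (sale_analysed_status in ("V", "D")
            and classification_at_sale == "V"
            and days_vacant >= 180
            and days_grv_purpose >= 180):
        return "vacant (classic)"
    if (sale_analysed_status == "I"
            and classification_at_sale != "V"
            and days_until_vacant is not None
            and 90 <= days_until_vacant <= 455
            and days_vacant >= 180):
        return "vacant (demolition)"
    return None


def demolition_cost(transaction_year: int) -> float:
    """Demolition cost compounded at 2% per year away from the base year."""
    return DEMOLITION_COST * (1 + INFLATION_RATE) ** (transaction_year - BASE_YEAR)


def adjusted_price(price: float, transaction_year: int, status: Optional[str]) -> float:
    """Rule 3: add the demolition cost for 'vacant (demolition)' parcels."""
    if status == "vacant (demolition)":
        return price + demolition_cost(transaction_year)
    return price
```

Under these assumptions a transaction one year after the base year attracts a demolition cost of about $20,400, and a transaction one year before attracts about $19,608.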
For capital value (CV) property value indicators, the vacant_demolition_indicator field includes property information determined according to the following rule:
If classification is not 'V' at the transaction date and it does not become 'V' for at least 12 continuous months after, then the parcel of land is deemed ‘improved’.
However, it will be understood that other custom property data, for example that provides an indication of the usability and/or quality of a parcel of land and/or a building disposed on the land may be produced by the custom field data generator 80.
The system 10 also generates related party data for inclusion in a related_parties field that indicates whether buying and selling parties are related using the names of the buying and selling parties. The related_parties field is used to indicate transaction values that occur between relatives in order to determine whether a recorded transaction value is significantly below market rate, and therefore relevant transaction-based taxes are potentially too low.
An example implementation of the system during use will now be described with reference to Figures 5 to 16. Figure 5 shows a flow diagram 100 illustrating steps 102 to 126 of a method of predicting property value indicators according to an embodiment of the present invention. Figures 6 to 16 show example pages of a user interface of the system 10 implemented using a web site.
In the present example, 6 machine learning models are used, one for each of the following combinations: GRV/residential, GRV/industrial, UV/residential, UV/industrial, CV/residential and CV/industrial.
Referring to Figure 5, as indicated at step 102, the system 10 is arranged to periodically import property data from a data source 22, typically a government source of property information. The imported property data is indicative of attributes of properties in a regional area, such as a national geographical area, a state geographical area or a local government authority geographical area. The property attributes include features associated with the properties in the defined geographical area such as area of the relevant parcel of land, number of sides of a polygon representing the shape of a parcel of land, number of bedrooms in a building disposed on a parcel of land, whether the property has a garage, and so on. The property data also includes data associated with previous actual property financial transactions associated with the properties, in the present embodiment including gross rental values (GRVs), unimproved values (UVs) and capital values (CVs).
A user desiring to obtain GRV, UV and/or CV property value indicators for a desired geographical area first logs into the system by accessing a website associated with the system and entering login details. After successful login, a home page 130 is displayed, as shown in Figure 6. The home page 130 includes a residential selection menu 132 and an industrial selection menu 134 that are respectively usable to carry out residential and industrial property valuations.
Using the residential or the industrial selection menu 132, 134, the user is able to select a revaluation assessment option or an interim assessment option. A revaluation assessment generates property value indicators for the entire geographical area based on a new training set, whilst an interim assessment generates property value indicators based on one or more existing machine learning models. The interim assessments may also be limited to a specific sub-geographical area or specific properties, such as properties created since the last assessment was carried out.
In the present example, the user has selected a residential revaluation assessment, and in response an assessments page 136 is displayed, as shown in Figure 7 and indicated at step 104.
The assessments page 136 includes a list of existing assessments 138 that may be in progress assessments 140, wherein criteria has been at least partially defined for an assessment but the assessment has not yet been performed, and completed assessments 142, wherein the assessments have been performed by the system 10 and property value indicators determined.
Each in progress assessment 140 includes a continue button 144 usable to continue with an in progress assessment, and each completed assessment 142 includes a view outcome button 146 usable to view the outcome of the relevant assessment.
The assessments page 136 also includes a create assessment button 148 usable to create and implement a new assessment and thereby determine new property value indicators for properties in a defined geographical area.
Selection of the create assessment button 148 causes a new assessment criteria page 150 to be displayed, as shown in Figure 8. Selection of the interim assessment option causes a similar page to be displayed with similar selection criteria available, but also an option to select a previous assessment. Completed assessments are associated with existing trained machine learning models 54 that are stored in the data storage device 44.
The new assessments criteria page 150 includes a new assessment criteria pane 152, an enlarged view of which is shown in Figure 9. The new assessment criteria pane 152 is used to receive assessment criteria information from the user, as indicated at step 104. In this example, the assessment criteria information includes the following:
- date of valuation 154;
- start and end evidence dates that define the evidence date range 156;
- start and end validation dates that define the validation date range 158;
- blind testing ratio 160;
- regulation period 162;
- whether capital values will be included in the assessment 164 and, if so, the associated date of effect 166 of the capital values;
- whether unimproved values will be included in the assessment 168 and, if so, the associated date of effect 170 of the unimproved values;
- whether gross rental values will be included in the assessment 172 and, if so, the associated date of effect 174 of the gross rental values.
Using the defined assessment criteria, the system 10 ingests the relevant property data from the relational database 24 and carries out pre-processing actions on the data using the pre-processing component 62, as indicated at steps 106 to 118.
The pre-processing actions include standardisation of the data 106, wherein the property data is modified as necessary so as to be in a standard format used by the system 10; addition of missing values 110, wherein data values are added to fields if the data value can be reasonably inferred; outlier removal 112, wherein data values that are considered unlikely to be correct are removed; and data anonymisation 114, wherein data indicative of the names associated with properties is removed after an assessment has been made as to whether a relationship exists between seller and buyer.
The pre-processing actions may also include predefined data filtering 116 that, for example, removes particular properties from use as training data based on predefined filters that may include any one or more of the following:
- Lower/upper transaction price limit;
- Lower/upper limits of land area;
- Lower/upper limits of development potential;
- Minimum/maximum dates of transactions;
- VEN type and classification;
- Land type and zoning.
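The lower/upper limit filters listed above might be sketched as follows. The RangeFilter class, the dictionary-based record layout and the choice to reject records that lack the filtered field are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeFilter:
    field: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def accepts(self, record: dict) -> bool:
        """Return True if the record's value lies within the configured limits."""
        value = record.get(self.field)
        if value is None:
            return False  # assumption: records lacking the field are not used
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def apply_filters(records: list[dict], filters: list[RangeFilter]) -> list[dict]:
    """Keep only the records accepted by every filter."""
    return [r for r in records if all(f.accepts(r) for f in filters)]
```

Categorical filters such as VEN type or zoning could be handled in the same way with a set-membership test in place of the numerical limits.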
The pre-processing actions also include custom data generation 118, wherein custom property data for use in custom data fields is produced by the custom field data generator 80. The custom data is used to train the machine learning model(s), and as inputs to the trained machine learning model(s) when property value indicators are inferred by the machine learning model(s).
In the present example, the custom fields described above are generated: frontage_1, frontage_2, shape_quality_1, shape_quality_2, development_potential, and vacant_demolition_indicator.
As indicated at step 120, the system 10 then trains the relevant machine learning models using property data and custom property data with associated transaction dates within the defined evidence date range 156 as model inputs, and the transaction data as outputs, but excluding the following:
- transaction values for property data with transaction dates within the defined validation date range 158;
- property data, custom property data and transaction values for a percentage of the property data with transaction dates within the defined validation date range 158, the percentage defined by the blind testing ratio 160 and the property data selected randomly from the validation date range; and
- transaction values for property data with transaction dates within the regulation period.
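The exclusions listed above might be sketched as a partitioning step, as shown below. The Transaction class, the partition function and the order in which overlapping date ranges are resolved are assumptions of this sketch. For simplicity, all three held-out groups are kept out of the labelled training examples; the distinction that blind testing also removes the associated property data is represented only by the separate "blind" group.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    property_id: str
    transaction_date: date
    transaction_value: float


def in_range(d: date, bounds: tuple[date, date]) -> bool:
    return bounds[0] <= d <= bounds[1]


def partition(transactions: list[Transaction],
              evidence: tuple[date, date],
              validation: tuple[date, date],
              regulation: tuple[date, date],
              blind_ratio_percent: float,
              seed: int = 0) -> dict[str, list[Transaction]]:
    """Split transactions into training and held-out groups."""
    rng = random.Random(seed)

    # Blind testing properties are drawn randomly from the validation date range.
    in_validation = [t for t in transactions if in_range(t.transaction_date, validation)]
    blind_count = round(len(in_validation) * blind_ratio_percent / 100.0)
    blind = rng.sample(in_validation, blind_count)
    blind_ids = {id(t) for t in blind}

    parts: dict[str, list[Transaction]] = {
        "train": [], "validation": [], "blind": blind, "regulation": [],
    }
    for t in transactions:
        if id(t) in blind_ids:
            continue
        if in_range(t.transaction_date, validation):
            parts["validation"].append(t)
        elif in_range(t.transaction_date, regulation):
            parts["regulation"].append(t)
        elif in_range(t.transaction_date, evidence):
            parts["train"].append(t)
    return parts
```

Fixing the random seed makes the blind selection repeatable between runs, which is a design choice of the sketch rather than something stated in the description.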
In this example, since a residential assessment is carried out for GRV, UV and CV property value indicators, GRV/residential, UV/residential and CV/residential machine learning models are trained and used to infer property value indicators.
As discussed above, specific transaction data, and for blind testing also the related property data, is excluded for validation purposes. After completion of the training process, the performance of the machine learning models is validated, as indicated at step 122. The validation step involves carrying out the following performance tests:
- A validation test wherein the property data and custom property data associated with the transaction values excluded from training according to the defined validation date range 158 is provided as input to the machine learning models, and property value indicators (in the present example GRV, UV and CV values) inferred by the machine learning models for the relevant transaction dates of the excluded transaction values are compared to the excluded transaction values.
- A blind test wherein the excluded property data and custom property data according to the defined validation date range 158 and blind testing ratio 160 is provided as input to the machine learning models, and the property value indicators (in the present example GRV, UV and CV) inferred by the machine learning models for the relevant transaction dates of the excluded transaction values are compared to the excluded transaction values.
- A regulation test wherein the property data and custom property data associated with transaction values excluded from training according to the defined regulation period 162 and that are close to the date of valuation is provided as input to the machine learning models, and the property value indicators (in the present example residential GRV, UV and CV) inferred by the machine learning models for the relevant transaction dates of the excluded transaction values are compared to the excluded transaction values.
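Each of the three tests above compares inferred values with excluded transaction values. One way such a comparison might be expressed is sketched below; the use of a signed percentage error and of the median absolute percentage error as the summary figure is an assumption of this sketch, since the description does not name a specific metric at this point.

```python
from statistics import median


def percentage_errors(predicted: list[float], actual: list[float]) -> list[float]:
    """Signed percentage error of each prediction relative to the actual value."""
    return [(p - a) * 100.0 / a for p, a in zip(predicted, actual)]


def median_absolute_percentage_error(predicted: list[float], actual: list[float]) -> float:
    """Median of the absolute percentage errors."""
    return median(abs(e) for e in percentage_errors(predicted, actual))
```

The same two functions can be applied to the validation, blind and regulation held-out sets in turn to obtain one summary figure per test.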
After the validation step 122, the property data and custom property data associated with properties in the defined geographical area are used as inputs to the machine learning models, and the machine learning models produce property value indicators (in the present example residential GRV, UV and CV) that constitute predicted values for GRV, UV and CV at the date of valuation.
Figure 10 shows a preparation page 180 displayed to the user after the assessment criteria has been selected. The preparation page 180 includes property identification data that identifies the sub-geographical areas, in this example the local government authorities (LGAs), that will be included in the assessment.
Figure 11 shows a processing page 188 displayed to a user during the training, validation and inference steps 120, 122 and 124, the processing page 188 including a processing graphic 190 showing the geographical area covered by the assessment. As indicated by step 126, after processing is complete, the predicted property value indicators are available for output to the user.
A summary of the assessment is shown in Figure 12, which is a results page 192 showing a summary 194 of the assessment criteria selected using the new assessment criteria page 150 shown in Figures 8 and 9 and used for the assessment. The results page 192 also includes a statistics panel 196 that includes summary information associated with the assessment.
A review page 200 is shown in Figure 13. The review page 200 illustrates the relative influence of features 202 of the machine learning model inputs. For example, in the example shown in Figure 13, the first 2 features 202 are longitude and latitude location coordinate features.
A performance page 206 is shown in Figure 14, the performance page 206 showing system performance metrics obtained by carrying out the validation tests, and including an error distribution 208 indicating a median error of approximately 1%.
It will be understood from the error distribution 208 that the ‘hit rate decay’, that is, the differences in error across the geographical area covered by the assessment is low since the peak is relatively steep on both sides of the distribution.
A map page 210 is shown in Figures 15 and 16, Figure 15 displaying a first map view 212 showing a relatively large portion of the assessed geographical area, and Figure 16 displaying a second map view 216 showing a relatively small portion of the assessed geographical area. The first map view 212 includes property clusters 214 that when selected cause the second map view 216 associated with the property cluster 214 to be displayed. Each assessed property 218 is shown in the second map view 216 and illustrated using a coloured indicator, such as a coloured dot. The assessed properties are selectable, and when selected relevant information associated with the property is displayed, including the predicted property valuation indicators.
The system 10 is arranged to enable a user to download the results of an assessment, for example as a CSV file.
The system 10 may also be arranged to analyse the predicted property value indicators to determine whether any of the property value indicators appear to be of interest, such as too low or too high, for example by marking properties that are considered to be within expectations using a first colour indicator and marking properties that are considered to be unexpected with a different second colour indicator.
It will be understood that since actual transaction values are not used as inputs to the machine learning models, the machine learning models determine property value indicators based only on property characteristics defined by the property data and custom property data. As a consequence, the property value indicators for properties that have significant actual associated transaction data are not coloured by the actual transaction data, and properties are treated equally irrespective of whether significant, little or no actual transaction evidence exists.
By using only the property data and custom property data as inputs in this way, the system 10 is able to achieve a low ‘hit rate decay’ with respect to the accuracy of the predicted property value indicators. The ‘hit rate decay’ is a measure of how consistent the error is between predicted property value indicators and actual transaction values during model validation across the assessed geographical area.
It will be appreciated that the predicted property value indicators may be used for any purpose that uses property value indications to carry out an action, such as to calculate rates and/or taxes, to identify erroneous reported transaction values and thereby applicable tax amounts that are too low, by financial institutions to assess the value of property, for example as proposed loan security, or by insurance institutions.
It will be appreciated that the present system and method is able to produce property value indications significantly faster than is possible with conventional valuation techniques, and is therefore significantly more cost effective and efficient than conventional techniques. For example, in a particular embodiment implementation, the system was able to produce property value indications for 1,000,000 properties in about 30 minutes.
It will also be appreciated that the present system and method is flexible in that it is able to generate property value indications for any valuation date and/or any individual property or group of properties, and to generate interim valuations quickly and cost effectively.
It will also be appreciated that the present system and method is able to produce three different types of property value predictions (GRV, UV and CV) essentially in parallel. The system 10 may also be arranged to enable a user to compare predicted property value indicators associated with assessments carried out for different dates of valuation. In this way, a user is able to compare predicted property value indicators (GRV, UV and/or CV) associated with different dates of valuations, for example to provide an indication of changes between the dates of valuation of mean property values and/or total property values.
In an example implementation, in order to carry out a change analysis a user selects a change analytics menu 230 which causes a change analytics page 232 to be displayed, as shown in Figure 18. The change analytics page 232 includes a list of existing change analyses 234 that can be selected by the user to view the results of previously carried out change analyses.
Selection of a create new change analysis button (not shown) causes an assessment comparison selection page 236 to be displayed, as shown in Figure 19.
The assessment comparison selection page 236 includes a first assessment selection drop down box 238 and a second assessment selection drop down box 240 usable to select 2 existing assessments that are desired to be compared.
After selection of 2 existing assessments, a selected assessments page 242 is displayed, as shown in Figures 20a and 20b.
The selected assessments page 242 shows the first and second selected assessments in the first and second assessment selection drop down boxes 238, 240, the property data files 242 and assessment criteria 244 used for the first assessment, as shown in Figure 20a, and the property data files 246 and assessment criteria 248 used for the second assessment, as shown in Figure 20b.
Figure 21 shows a change analytics processing page 250 displayed to a user as the change analytics processing progresses, the processing page 250 including a processing graphic 252 showing the geographical area covered by the change analytics.
After completion of the change analysis, a results page 254 is displayed, as shown in Figures 22 and 23. Using a property index link 256 and a value change link 258, a user is able to display property index results 260, as shown in Figure 22, or value change results 261, as shown in Figure 23. The property index results 260 show change analytic data for properties that have not changed in the period between the selected first and second assessments, and therefore properties that have not been renovated or in substance altered in the period are included in the change analytics, but properties that have changed, for example because the properties have been renovated or demolished, are excluded. In this way, the property index results 260 provide an indication of property value changes based purely on changes in property market values.
In this example, the property index results 260 include separate property index data sections for gross rental values (GRV) 262, unimproved values (UV) 264 and capital values (CV) 266, with each section including metro results 270a, 272a, 274a associated with properties located in a metropolitan area, country results 270b, 272b, 274b associated with properties located in a country area and state results 270c, 272c, 274c that represent combined metropolitan and country results.
The property index results 260 include a property count column 280, a mean property percentage value increase column 282, a Q1 percentage value increase column 284, and a median value increase column 286.
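The property index columns described above might be computed as sketched below. The dictionary inputs, the set of changed property identifiers and the quartile method (the default of Python's statistics.quantiles) are assumptions made for illustration.

```python
from statistics import mean, median, quantiles


def property_index(first: dict[str, float],
                   second: dict[str, float],
                   changed: set[str]) -> dict:
    """Summarise percentage value increases for unchanged properties.

    first/second map a property identifier to its predicted value in the
    first and second assessments; changed holds identifiers of properties
    that were renovated, demolished or otherwise altered in between.
    """
    ids = [pid for pid in first if pid in second and pid not in changed]
    increases = [(second[pid] - first[pid]) * 100.0 / first[pid] for pid in ids]
    if not increases:
        return {"property_count": 0, "mean_increase": None,
                "q1_increase": None, "median_increase": None}
    # quantiles() needs at least two data points on older Python versions.
    q1 = quantiles(increases, n=4)[0] if len(increases) >= 2 else increases[0]
    return {
        "property_count": len(ids),
        "mean_increase": mean(increases),
        "q1_increase": q1,
        "median_increase": median(increases),
    }
```

Running the function separately on metropolitan, country and combined property sets would give the three result rows of each section.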
In the example shown, it can be seen that between the first and second assessments the statewide mean GRV, UV and CV increased by 189.45%, 192.6% and 166.99% respectively.
The value change results 261 show total change analytic data for all properties covered by the assessments.
In this example, the value change results 261 include a total valuation section 290 that shows total value data for all properties covered by the assessments for both the first and second assessments, a valuation property count section 292 that shows counts for the total number of properties covered by the assessments for both the first and second assessments, a change valuation section 294 that shows percentage value change data for all properties, and a count change section 296 that shows percentage count change data for all properties.
Each section 290, 292, 294, 296 includes separate metro 298, country 300 and state 302 results corresponding to metropolitan properties, country properties and combined metropolitan and country properties. In the example shown, it can be seen that between the first and second assessments statewide GRV and UV increased by 246.08% and 228.89% respectively, and the total number of GRV and UV properties statewide increased by 130.17% and 129.61% respectively.
In the claims which follow and in the preceding description, except where the context requires otherwise due to express language or necessary implication, the word “comprise” or variations such as “comprises” or “comprising” is used in an inclusive sense, i.e. to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments.
Modifications and variations as would be apparent to a skilled addressee are deemed to be within the scope of the present invention.

Claims

1. A system for predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation, the system comprising: at least one machine learning model trained using a training data set to predict property value indicators at a selectable date of valuation, the training data set including first property data indicative of attributes of a plurality of properties as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and a custom property data generator arranged to determine custom second property data using the first property data; wherein the at least one machine learning model is also trained using the custom second property data as inputs; and wherein property value indicators are inferred for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
2. A system as claimed in claim 1, wherein the first property data includes property data indicative of characteristics of properties.
3. A system as claimed in claim 2, wherein the characteristics include configuration information associated with each parcel of land, property feature information, configuration information associated with each building disposed on each parcel of land, and/or the year that a building on a parcel of land was built.
4. A system as claimed in any one of claims 1 to 3, wherein the property attributes include information indicative of whether a property is associated with multiple parcels of land and, if so, which parcels of land are associated with the property; whether the property is for commercial, residential, mixed, industrial or farming use; information indicative of the sub market area in which each property is located; and zoning classifications assigned to each property.
5. A system as claimed in any one of the preceding claims, wherein the custom second property data is determined using data stored in at least one stored lookup table.
6. A system as claimed in any one of the preceding claims, wherein the custom second property data includes data indicative of a quality and/or usability of a property.
7. A system as claimed in claim 6, wherein the custom second property data includes frontage quality data indicative of a frontage quality of a property.
8. A system as claimed in claim 7, wherein the frontage quality data includes data determined using the following formula: frontage_1 = Σ (boundary_line_length) if boundary_line_usage_code = 1, where boundary_line_length is a field that includes the length of a land boundary, and boundary_line_usage_code is a field that indicates the type of land boundary using a numerical code, and code '1' indicates that the boundary line is a frontage boundary line.
9. A system as claimed in claim 8, wherein the frontage quality data includes data determined using the following formula: frontage_2 = frontage_1 / Σ (boundary_line_length)
10. A system as claimed in any one of claims 6 to 9, wherein the custom second property data includes shape quality data indicative of a shape quality of a property.
11. A system as claimed in claim 10, wherein the shape quality data includes data determined using the following formula:
Figure imgf000036_0001
where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
12. A system as claimed in claim 11, wherein the data in the min_area_per_dwelling field is contained in a lookup table.
13. A system as claimed in any one of claims 10 to 12, wherein the shape quality data includes data determined using the following formula: shape_quality_2 = Σ (land_area * angles_u45 / boundary_line_count) / Σ (land_area), where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
14. A system as claimed in any one of the preceding claims, wherein the custom second property data includes development potential data indicative of a development potential of a property.
15. A system as claimed in claim 14, wherein the development potential data includes data determined using the following formula: development_potential = ven_land_area / min_area_per_dwelling, where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
16. A system as claimed in any one of the preceding claims, wherein the system is arranged to determine based on the first property data whether a building disposed on the property has been demolished within a defined period after a transaction date, and to add a demolish financial amount to the property value indicator associated with the property if the property has been demolished within the defined period.
17. A system as claimed in claim 16, wherein the demolish financial amount is determined using the following formula:
Figure imgf000037_0001
where k is a constant.
18. A system as claimed in any one of the preceding claims, wherein the system includes a validation component arranged to validate operation of the at least one machine learning model.
19. A system as claimed in claim 18, wherein the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined first subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
20. A system as claimed in claim 19, wherein the first subset of financial transaction data is defined based on a defined start validation financial transaction date and a defined end validation transaction date.
21. A system as claimed in claim 20, wherein the defined start and end validation transaction dates are user definable.
22. A system as claimed in any one of claims 18 to 21, wherein the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined second subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; excluding first and second property data associated with the excluded financial transaction data from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
23. A system as claimed in claim 22, wherein the defined second subset of financial transaction data is a subset of the first subset of financial transaction data.
24. A system as claimed in claim 22 or claim 23, wherein the second subset of financial transaction data is selected randomly from the first subset of financial transaction data.
25. A system as claimed in any one of claims 22 to 24, wherein the second subset of financial transaction data is a user selectable proportion of the first subset of financial transaction data.
26. A system as claimed in any one of claims 18 to 25, wherein the validation component is arranged to validate operation of the at least one machine learning model by: excluding a defined third subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators; wherein the third subset of financial transaction data is defined based on transaction dates within a defined period of the date of valuation.
27. A system as claimed in claim 26, wherein the defined period is user definable.
28. A system as claimed in any one of the preceding claims, wherein the system includes a data standardisation component arranged to standardise the first property data so that the first property data is in a format compatible with the at least one machine learning model.
29. A system as claimed in any one of the preceding claims, wherein the system includes an outlier remover component arranged to remove property data that is considered to be unlikely to be correct.
30. A system as claimed in any one of the preceding claims, wherein the system includes a missing value inferrer arranged to add data to a first property data field if the data for the property data field can be reasonably inferred.
31. A system as claimed in claim 30, wherein the missing value inferrer is arranged to add data to a property data field using a lookup table.
32. A system as claimed in claim 31, wherein the lookup table includes data indicative of average building areas, and the system is arranged to infer a building area of a property by determining whether the building area of the property is above or below an average property area using the first property data and the lookup table.
33. A system as claimed in any one of the preceding claims, wherein the first property data includes names of buyers and sellers involved in a financial transaction, and the system is arranged to determine whether the financial transaction was between related parties using the names of buyers and sellers.
34. A system as claimed in claim 33, wherein the system includes a data anonymiser arranged to remove data indicative of the buyer and seller names associated with properties after the determination is made as to whether the financial transaction was between related parties using the names of buyers and sellers.
35. A system as claimed in any one of the preceding claims, wherein the system is arranged to facilitate reception of property value assessment criteria used by the system to predict the property value indicators.
36. A system as claimed in claim 35, wherein the property value assessment criteria include a date of valuation, at least one property value indication type that defines the or each type of property value indication predicted by the system, a residential or industrial property type, and/or a date of effect.
37. A system as claimed in claim 36, wherein the property value indication type includes a gross rental value, an unimproved value and/or a capital value.
38. A system as claimed in any one of the preceding claims, wherein the system includes a plurality of machine learning models, each machine learning model associated with a combination of one property value indication type and one of residential or industrial property type.
39. A system as claimed in any one of the preceding claims, wherein the system includes a database and the first property information is obtained from an external data source and stored in the database, the system extracting first property data from the database and using the extracted first property data to train the at least one machine learning model and infer property value indicators.
40. A system as claimed in any one of the preceding claims, wherein the system includes an online accessible user interface that may be a web-based user interface.
41. A system as claimed in any one of the preceding claims, wherein the at least one machine learning model comprises at least one Gradient Boosting Machine (GBM) algorithm.
42. A system as claimed in any one of the preceding claims, wherein the system is arranged to facilitate selection of a revaluation option wherein at least one machine learning model is trained using current first and second property data to produce at least one trained machine learning model, and the at least one trained machine learning model is used to infer property value indicators for all properties in a geographical area, or an interim assessment wherein property value indicators for a defined set of properties are inferred using an existing at least one trained machine learning model.
43. A system as claimed in any one of the preceding claims, wherein the system is arranged to compare predicted property value indicators associated with at least 2 property value assessments for at least 2 different dates of valuation and to produce property value change results.
44. A system as claimed in claim 43, wherein the property value change results include information indicative of changes in mean property values, median property values and/or total property values.
45. A system as claimed in claim 43 or claim 44, wherein the property value change results are based only on properties that have not changed in the period between the at least 2 different dates of valuation.
46. A system as claimed in claim 43 or claim 44, wherein the property value change results are based on all properties covered by the property value assessments.
47. A system as claimed in claim 46, wherein the property value change results include total value data indicative of the total financial value of all properties covered by the property value assessments, valuation property count data indicative of the total number of properties covered by the property value assessments, value change data indicative of the total financial value change of the total number of properties covered by the property value assessments, and a count change section indicative of a change in the total number of properties.
48. A method of predicting a property value indicator indicative of financial value of a real estate property at a selected date of valuation, the method comprising: receiving first property data indicative of attributes of a plurality of properties; determining custom second property data using the first property data; training at least one machine learning model using a training data set to predict property value indicators at a selectable date of valuation, the training data set including the first and second property data as model inputs and actual financial transaction data associated with the properties at specific transaction dates as model outputs; and inferring property value indicators for a selected date of valuation by the at least one machine learning model using the first and second property data as inputs to the at least one machine learning model but not using transaction values as inputs to the at least one machine learning model.
49. A method as claimed in claim 48, wherein the method includes, for each new property value indicator revaluation, training the at least one machine learning model using current first and second property data and current actual financial transaction data to produce at least one trained machine learning model, and using the at least one trained machine learning model to infer property value indicators for properties in a geographical area.
50. A method as claimed in claim 48 or claim 49, wherein the method includes, for each interim assessment, inferring property value indicators using an existing at least one trained machine learning model.
51. A method as claimed in any one of claims 48 to 50, wherein the first property data includes property data indicative of characteristics of properties.
52. A method as claimed in claim 51, wherein the characteristics include configuration information associated with each parcel of land, property feature information, configuration information associated with each building disposed on each parcel of land, and/or the year that a building on a parcel of land was built.
53. A method as claimed in any one of claims 48 to 52, wherein the property attributes include information indicative of whether a property is associated with multiple parcels of land and, if so, which parcels of land are associated with the property; whether the property is for commercial, residential, mixed, industrial or farming use; information indicative of the sub market area in which each property is located; and zoning classifications assigned to each property.
54. A method as claimed in any one of claims 48 to 53, comprising determining the custom second property data using data stored in at least one stored lookup table.
55. A method as claimed in any one of claims 48 to 54, wherein the custom second property data includes data indicative of a quality and/or usability of a property.
56. A method as claimed in claim 55, wherein the custom second property data includes frontage quality data indicative of a frontage quality of a property.
57. A method as claimed in claim 56, wherein the frontage quality data includes data determined using the following formula: frontage_1 = Σ (boundary_line_length) if boundary_line_usage_code = 1, where boundary_line_length is a field that includes the length of a land boundary, and boundary_line_usage_code is a field that indicates the type of land boundary using a numerical code, and code '1' indicates that the boundary line is a frontage boundary line.
58. A method as claimed in claim 57, wherein the frontage quality data includes data determined using the following formula: frontage_2 = frontage_1 / Σ (boundary_line_length)
59. A method as claimed in any one of claims 55 to 58, wherein the custom second property data includes shape quality data indicative of a shape quality of a property.
60. A method as claimed in claim 59, wherein the shape quality data includes data determined using the following formula:
Figure imgf000043_0001
where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
61. A method as claimed in claim 60, wherein the data in the min_area_per_dwelling field is contained in a lookup table.
62. A method as claimed in any one of claims 59 to 61, wherein the shape quality data includes data determined using the following formula: shape_quality_2 = Σ (land_area * angles_u45 / boundary_line_count) / Σ (land_area), where land_area is a field that specifies the land area of a parcel of land, boundary_line_count is a field that specifies the number of boundary lines in a plot of land, that is, the number of segments of a polygon representing the shape of the parcel of land, and angles_u45 is a field that specifies the number of angles in the polygon representing the shape of the parcel of land that are less than 45°.
63. A method as claimed in any one of claims 48 to 62, wherein the custom second property data includes development potential data indicative of a development potential of a property.
64. A method as claimed in claim 63, wherein the development potential data includes data determined using the following formula: development_potential = ven_land_area / min_area_per_dwelling, where min_area_per_dwelling is a field that defines the minimum area allowed for development of a building on a parcel of land, and ven_land_area is a field that specifies the sum of all land areas associated with a property.
65. A method as claimed in any one of claims 48 to 64, comprising determining based on the first property data whether a building disposed on the property has been demolished within a defined period after a transaction date, and adding a demolish financial amount to the property value indicator associated with the property if the property has been demolished within the defined period.
66. A method as claimed in claim 65, wherein the demolish financial amount is determined using the following formula:
Figure imgf000044_0001
where k is a constant.
67. A method as claimed in any one of claims 48 to 66, comprising validating operation of the at least one machine learning model.
68. A method as claimed in claim 67, comprising validating operation of the at least one machine learning model by: excluding a defined first subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
69. A method as claimed in claim 68, comprising defining the first subset of financial transaction data based on a defined start validation financial transaction date and a defined end validation transaction date.
70. A method as claimed in claim 69, wherein the defined start and end validation transaction dates are user definable.
71. A method as claimed in any one of claims 67 to 70, comprising validating operation of the at least one machine learning model by: excluding a defined second subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; excluding first and second property data associated with the excluded financial transaction data from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators.
72. A method as claimed in claim 71, wherein the defined second subset of financial transaction data is a subset of the first subset of financial transaction data.
73. A method as claimed in claim 71 or claim 72, wherein the second subset of financial transaction data is selected randomly from the first subset of financial transaction data.
74. A method as claimed in any one of claims 71 to 73, wherein the second subset of financial transaction data is a user selectable proportion of the first subset of financial transaction data.
75. A method as claimed in any one of claims 67 to 74, comprising validating operation of the at least one machine learning model by: excluding a defined third subset of financial transaction data associated with a plurality of properties from the training data set used to train the at least one machine learning model; and subsequently comparing the excluded financial transaction data with corresponding inferred property value indicators; wherein the third subset of financial transaction data is defined based on transaction dates within a defined period of the date of valuation.
76. A method as claimed in claim 75, wherein the defined period is user definable.
77. A method as claimed in any one of claims 48 to 76, comprising standardising the first property data so that the first property data is in a format compatible with the at least one machine learning model.
78. A method as claimed in any one of claims 48 to 77, comprising removing property data that is considered to be unlikely to be correct.
79. A method as claimed in any one of claims 48 to 78, comprising adding data to a first property data field if the data for the property data field can be reasonably inferred.
80. A method as claimed in claim 79, comprising adding data to a property data field using a lookup table.
81. A method as claimed in claim 80, wherein the lookup table includes data indicative of average building areas, and the method comprises inferring a building area of a property by determining whether the building area of the property is above or below an average property area using the first property data and the lookup table.
82. A method as claimed in any one of claims 48 to 81, wherein the first property data includes names of buyers and sellers involved in a financial transaction, and the method comprises determining whether the financial transaction was between related parties using the names of buyers and sellers.
83. A method as claimed in claim 82, comprising removing data indicative of the buyer and seller names associated with properties after the determination is made as to whether the financial transaction was between related parties using the names of buyers and sellers.
84. A method as claimed in any one of claims 48 to 83, comprising facilitating reception of property value assessment criteria used by the system to predict the property value indicators.
85. A method as claimed in claim 84, wherein the property value assessment criteria include a date of valuation, at least one property value indication type that defines the or each type of property value indication predicted by the system, a residential or industrial property type, and/or a date of effect.
86. A method as claimed in claim 85, wherein the property value indication type includes a gross rental value, an unimproved value and/or a capital value.
87. A method as claimed in any one of claims 48 to 86, comprising using a plurality of machine learning models, each machine learning model associated with a combination of one property value indication type and one of residential or industrial property type.
88. A method as claimed in any one of claims 48 to 87, comprising obtaining the first property information from an external data source, storing the first property information in a database, extracting first property data from the database and using the extracted first property data to train the at least one machine learning model and infer property value indicators.
89. A method as claimed in any one of claims 48 to 88, wherein the at least one machine learning model comprises at least one Gradient Boosting Machine (GBM) algorithm.
90. A method as claimed in any one of claims 48 to 89, comprising facilitating selection of a revaluation option wherein at least one machine learning model is trained using current first and second property data to produce at least one trained machine learning model, and using the at least one trained machine learning model to infer property value indicators for all properties in a geographical area, or an interim assessment wherein property value indicators for a defined set of properties are inferred using an existing at least one trained machine learning model.
91. A method as claimed in any one of claims 48 to 90, comprising comparing predicted property value indicators associated with at least 2 property value assessments for at least 2 different dates of valuation and producing property value change results.
92. A method as claimed in claim 91, wherein the property value change results include information indicative of changes in mean property values, median property values and/or total property values.
93. A method as claimed in claim 91 or claim 92, wherein the property value change results are based only on properties that have not changed in the period between the at least 2 different dates of valuation.
94. A method as claimed in claim 43 or claim 44, wherein the property value change results are based on all properties covered by the property value assessments.
95. A method as claimed in claim 94, wherein the property value change results include total value data indicative of the total financial value of all properties covered by the property value assessments, valuation property count data indicative of the total number of properties covered by the property value assessments, value change data indicative of the total financial value change of the total number of properties covered by the property value assessments, and a count change section indicative of a change in the total number of properties.
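By way of illustration only, the custom property data formulas recited in claims 8, 9, 13 and 15 (and the corresponding method claims) can be sketched as below. The sketch reads those formulas as frontage_1 = Σ boundary_line_length over frontage boundaries, frontage_2 = frontage_1 / Σ boundary_line_length, shape_quality_2 = Σ (land_area * angles_u45 / boundary_line_count) / Σ land_area, and development_potential = ven_land_area / min_area_per_dwelling. The `Parcel` class, the function signatures, the summation over all parcels of a property, and deriving boundary_line_count from the number of recorded boundary lengths are assumptions introduced for the example and are not stated in the claims.

```python
from dataclasses import dataclass

FRONTAGE_CODE = 1  # boundary_line_usage_code value marking a frontage boundary


@dataclass
class Parcel:
    """Hypothetical container for the per-parcel fields named in the claims."""
    land_area: float
    boundary_lengths: list      # boundary_line_length of each boundary line
    boundary_usage_codes: list  # boundary_line_usage_code of each boundary line
    angles_u45: int             # number of polygon angles below 45 degrees


def frontage_1(parcels):
    """Total length of boundary lines whose usage code marks them as frontage."""
    return sum(
        length
        for p in parcels
        for length, code in zip(p.boundary_lengths, p.boundary_usage_codes)
        if code == FRONTAGE_CODE
    )


def frontage_2(parcels):
    """Frontage length as a fraction of the total boundary length."""
    total = sum(sum(p.boundary_lengths) for p in parcels)
    return frontage_1(parcels) / total


def shape_quality_2(parcels):
    """Area-weighted share of boundary angles that are under 45 degrees."""
    weighted = sum(
        p.land_area * p.angles_u45 / len(p.boundary_lengths) for p in parcels
    )
    return weighted / sum(p.land_area for p in parcels)


def development_potential(parcels, min_area_per_dwelling):
    """ven_land_area (sum of parcel areas) divided by the minimum dwelling area."""
    ven_land_area = sum(p.land_area for p in parcels)
    return ven_land_area / min_area_per_dwelling
```

For a property made up of a 600 m² rectangular parcel with one 20 m frontage and a 400 m² triangular parcel with one 10 m frontage and one acute angle under 45°, `frontage_1` would be 30 and `development_potential` with a 250 m² minimum would be 4.0. In practice `min_area_per_dwelling` would come from the lookup table mentioned in claims 12 and 61.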
PCT/AU2023/051201 2022-11-23 2023-11-23 A system for predicting property value indicators WO2024108265A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2022903555A AU2022903555A0 (en) 2022-11-23 System for predicting property value indicators
AU2022903555 2022-11-23

Publications (1)

Publication Number Publication Date
WO2024108265A1 true WO2024108265A1 (en) 2024-05-30

Family

ID=91194786

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2023/051201 WO2024108265A1 (en) 2022-11-23 2023-11-23 A system for predicting property value indicators

Country Status (1)

Country Link
WO (1) WO2024108265A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5361201A (en) * 1992-10-19 1994-11-01 Hnc, Inc. Real estate appraisal using predictive modeling
US20040254803A1 (en) * 2003-06-11 2004-12-16 David Myr Method and system for optimized real estate appraisal
US20150242747A1 (en) * 2014-02-26 2015-08-27 Nancy Packes, Inc. Real estate evaluating platform methods, apparatuses, and media
US20150324939A1 (en) * 2014-03-09 2015-11-12 Ashutosh Malaviya Real-estate client management method and system
US20160202800A1 (en) * 2013-09-10 2016-07-14 Sony Corporation Sensor device, input device, and electronic apparatus
US20160275633A1 (en) * 2015-03-20 2016-09-22 Bki Software Solutions Utility monitoring and database correlation system, including user interface generation for utility assessment
US20170357984A1 (en) * 2015-02-27 2017-12-14 Sony Corporation Information processing device, information processing method, and program
US20210406707A1 (en) * 2018-09-13 2021-12-30 Diveplane Corporation Feature and Case Importance and Confidence for Imputation in Computer-Based Reasoning Systems
WO2022191775A1 (en) * 2021-03-09 2022-09-15 Real Estate Analytics Pte Ltd. A system for generating a value index for properties and a method thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23892807

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE