Predicting House Prices in Taiwan: A Bayesian Approach



Bayesian methods are, alongside frequentist methods, one of the main approaches for estimating statistical models. In Bayesian models, prior information can be included by choosing a suitable prior, so the resulting estimates combine the data with that prior information. The package brms is used for fitting Bayesian models with Stan. It supports generalized linear and nonlinear multivariate multilevel models, and every parameter of the response distribution can be modelled with linear or nonlinear terms, which allows for a very flexible modelling strategy. Bayesian methods excel with datasets that have a hierarchical/multilevel structure and/or are relatively small. For a large number of observations with a relatively simple structure, Bayesian and frequentist models give very similar results. The main drawback, besides having to choose the priors, is computation time: Bayesian models typically need to be fitted by MCMC simulation, while classical estimates can usually be obtained by optimizing a function.
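To make the idea of "combining data and prior information" concrete, here is a minimal base R sketch of a conjugate normal update for a mean with known observation standard deviation. It is only an illustration, not the model fitted later with brms; the prior values, the five prices, and the assumed `sigma` are all made up for the example.

```r
# Hypothetical prior belief about the mean price (1000 NTD per ping)
prior_mean <- 30
prior_sd   <- 10

# Hypothetical small sample of prices; sigma is an assumed, known observation sd
y     <- c(38, 42, 47, 55, 43)
sigma <- 12
n     <- length(y)

# Precision (1 / variance) of the prior and of the sample mean
prior_prec <- 1 / prior_sd^2
data_prec  <- n / sigma^2

# Posterior for the mean: precisions add, means are precision-weighted
post_prec <- prior_prec + data_prec
post_mean <- (prior_prec * prior_mean + data_prec * mean(y)) / post_prec
post_sd   <- sqrt(1 / post_prec)

# Share of the posterior mean that comes from the data
w_data <- data_prec / post_prec

print(c(post_mean = post_mean, post_sd = post_sd, w_data = w_data))
```

The posterior mean lies between the prior mean and the sample mean, and the weight on the data grows with `n`. This is one way to see why Bayesian and frequentist results tend to agree for large datasets, while the prior matters more for small ones.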

The data are a historical real estate valuation dataset from Taipei. The dataset can be found here. The dataset is transformed into the sf format, which is used for spatial data. The sf format is similar to a data.frame, but includes an additional column for the geometry of each observation. The dataset includes the transaction date, the age of the house, the number of convenience stores, the distance to the nearest Metro station (MRT), the geographic coordinates, and the house price, measured in 1000 New Taiwan Dollars per ping. Ping is a local unit of area; one ping is about 3.3 square meters.
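The conversion to sf and the ping unit can be sketched as follows. This is a hedged example rather than the post's actual preprocessing: the three rows are invented, the object names `real_estate_df` and `real_estate_sf` and the column `Price_per_m2` are hypothetical, and the coordinate reference system (EPSG:4326, i.e. WGS84 longitude/latitude) is an assumption. The column names follow those used in the map below.

```r
library(sf)

# Hypothetical rows for illustration only
real_estate_df <- data.frame(
  House_age   = c(32.0, 19.5, 13.3),
  House_price = c(38, 42, 47),           # 1000 NTD per ping
  Latitude    = c(24.983, 24.980, 24.987),
  Longitude   = c(121.540, 121.539, 121.544)
)

# Build point geometries from the coordinates (x = longitude, y = latitude).
# crs = 4326 is an assumption; remove = FALSE keeps the coordinate columns.
real_estate_sf <- st_as_sf(
  real_estate_df,
  coords = c("Longitude", "Latitude"),
  crs    = 4326,
  remove = FALSE
)

# Approximate price per square meter, using 1 ping of about 3.3 square meters
real_estate_sf$Price_per_m2 <- real_estate_sf$House_price / 3.3

print(real_estate_sf)
```

The result still behaves like a data.frame, with one extra `geometry` column holding a point per observation.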

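library(mapview)

# Use OpenStreetMap as the basemap and draw one map layer per listed column
mapviewOptions(basemaps = "OpenStreetMap")
mapview(real_estate, zcol = c("Transaction_date", "House_age", "Distance_MRT", "Number_convenience_stores",
                              "House_price", "Latitude", "Longitude"))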