Project

Zillow Home Value Prediction (Zestimate) By Using XGBoost

A home is one of the largest and most expensive purchases people make in their lifetime, so it is important that homeowners can monitor the value of their property in a trusted way. The Zestimate model was developed to give first-time consumers detailed information about a home and its market value free of charge. In this paper, I analyze real estate property prices in three California counties (Los Angeles, Ventura, and Orange). The property listing data was taken from Kaggle.com. I predict the sold and asking prices of homes based on features such as bedroom count, bathroom count, and geographical location, using a gradient boosting model, XGBoost. Results are evaluated using the Mean Absolute Error (MAE) between the actual log error and the predicted log error. The shape of the training data set is (90275, 3) and that of the properties data set is (2985217, 58); the shape of the merged data set is (90275, 60). The target variable for this project is "log error", which is normally distributed. This paper presents the detailed prediction questions, the analysis of the real estate properties, and the testing and validation of other algorithms. In addition, I discuss my approach and methodology.
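The evaluation metric and the data set shapes quoted above can be illustrated with a minimal Python sketch. It is not the code used in this project. It assumes the log error follows the Kaggle competition definition, log(Zestimate) minus log(sale price), taken here as a natural logarithm, and that the two tables are joined on a single shared key column so that every training row matches exactly one property row. The function names and the sample values are hypothetical.

```python
import math


def log_error(zestimate, sale_price):
    """Log error, assuming the definition log(Zestimate) - log(sale price)."""
    return math.log(zestimate) - math.log(sale_price)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted log errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def merged_shape(train_shape, properties_shape, n_keys=1):
    """Shape of a join of the training table onto the properties table.

    Assumes each training row matches exactly one property row and that
    the n_keys shared key columns appear only once in the result.
    """
    rows = train_shape[0]
    cols = train_shape[1] + properties_shape[1] - n_keys
    return rows, cols


# Made-up log errors for three homes (illustrative values only).
actual = [0.02, -0.01, 0.05]
predicted = [0.01, 0.00, 0.03]
mae = mean_absolute_error(actual, predicted)

# Shapes reported in the abstract: 3 + 58 - 1 shared key column = 60 columns.
shape = merged_shape((90275, 3), (2985217, 58))
```

Under these assumptions the merged table keeps the 90275 training rows and has 3 + 58 - 1 = 60 columns, which matches the shape reported above. A log error of zero means the estimate equals the sale price.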
