House Sale Data Analysis PDF

Summary

This document presents an analysis of house sale data using visualizations such as a scatterplot, a correlation heatmap, and a bubble chart. The visualizations explore the relationships between above-ground living area, sale price, and other numerical features. The data may be useful for further real estate analysis and for modeling house prices from these features.

Full Transcript

## House Sale Data

| Id | MSSubClass | MSZoning | LotFrontage | LotArea | Street | Alley | LotShape | LandContour | Utilities | PoolArea | PoolQC | Fence | MiscFeature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 60 | RL | 65.0 | 8450 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 1 | 20 | RL | 80.0 | 9600 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 2 | 60 | RL | 68.0 | 11250 | Pave | NaN | IR1 | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 3 | 70 | RL | 60.0 | 9550 | Pave | NaN | IR1 | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 4 | 60 | RL | 84.0 | 14260 | Pave | NaN | IR1 | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 1455 | 60 | RL | 62.0 | 7917 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 1456 | 20 | RL | 85.0 | 13175 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | MnPrv | NaN |
| 1457 | 70 | RL | 66.0 | 9042 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | GdPrv | Shed |
| 1458 | 20 | RL | 68.0 | 9717 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | NaN | NaN |
| 1459 | 20 | RL | 75.0 | 9937 | Pave | NaN | Reg | Lvl | AllPub | 0 | NaN | NaN | NaN |

### GrLivArea vs. SalePrice

This is a scatterplot showing the relationship between the size of the above-ground living area (*GrLivArea*) and the house sale price (*SalePrice*).

### Correlation Matrix

This is a heatmap depicting the correlation between various numerical factors. The colours show the strength of the correlation between factors: red marks the strongest positive correlation, and blue marks the strongest negative correlation. The correlation between two factors is the value at their intersection. For example, the correlation between *GrLivArea* and *OverallQual* is 0.59.

The following features are shown on the correlation matrix:

* GrLivArea
* OverallQual
* TotalBsmtSF
* 1stFlrSF
* GarageCars
* GarageArea
* PoolArea

### Bubble Chart: GrLivArea vs. SalePrice vs. YearBuilt

The bubble chart displays data points showing the relationship between *GrLivArea*, *SalePrice*, and *YearBuilt*. The size of each bubble is determined by the *YearBuilt* value, while the colour of the bubble represents the *SalePrice* value.
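To make the values in the Correlation Matrix section more concrete, here is a minimal sketch of how one cell of such a matrix could be computed, assuming the heatmap shows Pearson correlation coefficients (the document does not say which measure was used). The features listed for the heatmap, such as *GrLivArea* and *OverallQual*, have no values in this transcript, so the sketch uses three numeric columns from the ten table rows that are visible. The helper names `pearson` and `correlation_matrix` are illustrative, and the results describe only these ten rows, not the full dataset.

```python
import math

# Values copied from the ten rows visible in the House Sale Data table
# (Ids 0-4 and 1455-1459). This is only an excerpt, not the full dataset.
columns = {
    "MSSubClass": [60, 20, 60, 70, 60, 60, 20, 70, 20, 20],
    "LotFrontage": [65.0, 80.0, 68.0, 60.0, 84.0, 62.0, 85.0, 66.0, 68.0, 75.0],
    "LotArea": [8450, 9600, 11250, 9550, 14260, 7917, 13175, 9042, 9717, 9937],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(cols):
    """Nested dict: matrix[a][b] is the correlation between columns a and b."""
    names = list(cols)
    return {a: {b: pearson(cols[a], cols[b]) for b in names} for a in names}


matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Each printed row corresponds to one row of a heatmap: the diagonal entries are 1 (every column correlates perfectly with itself), and the matrix is symmetric, which is why the value for a pair of factors can be read at either intersection.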
