Python-Based Predictive Housing Insights Web App with SHAP Interpretation

This Python web application, built with Streamlit, predicts Boston house prices from user-defined inputs. A Random Forest Regressor from scikit-learn produces the prediction; users set the input values with sliders in the sidebar, and the app updates the predicted price as the sliders change. The app also addresses model interpretability: it computes SHAP values and draws SHAP summary plots that show how much each feature contributes to the model's predictions.

Understanding the Python Code for Boston House Price Prediction

The Python code below is a Streamlit web application that predicts Boston house prices with a Random Forest Regressor and visualizes feature importance through SHAP values. The application loads the Boston housing dataset, collects input parameters from sliders in the sidebar, trains the model on the dataset, and predicts a price for the user's inputs. SHAP values then explain the model's predictions and indicate which features matter most. Because it combines scikit-learn, pandas, and SHAP in one short script, the app is also a practical reference for Python assignments that use these libraries.

Block 1: Library Imports

import streamlit as st
import pandas as pd
import shap
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.ensemble import RandomForestRegressor

This block imports the necessary libraries for the application, including Streamlit for the web app interface, pandas for handling data, SHAP for explaining model predictions, matplotlib for plotting, and scikit-learn for the RandomForestRegressor model.

Block 2: App Title and Description

st.write("""
# Boston House Price Prediction App
This app predicts the **Boston House Price**!
""")
st.write('---')

This block sets up the title and description of the Streamlit web application, indicating that it predicts Boston house prices.

Block 3: Loading and Preprocessing Data

# Loads the Boston House Price Dataset
boston = datasets.load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
Y = pd.DataFrame(boston.target, columns=["MEDV"])

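This block loads the Boston House Price dataset and separates the features (X) from the target variable (Y), the median home value MEDV. Both are wrapped in pandas DataFrames for easier manipulation. Note that datasets.load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2, so this block only runs on older releases.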
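If datasets.load_boston() is not available in the installed scikit-learn, one possible workaround is to parse the raw text file yourself. The sketch below is an illustration, not part of the original app. It assumes the layout of the copy at http://lib.stat.cmu.edu/datasets/boston that the scikit-learn deprecation notice pointed to: 22 header lines, then each record split across two physical lines (11 values followed by 3, the last being MEDV). The names parse_boston_text and FEATURE_NAMES are hypothetical helpers introduced here.

```python
# Column order of the 13 features in the raw file; MEDV is the 14th value.
FEATURE_NAMES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]


def parse_boston_text(text, header_lines=22):
    """Split the raw file text into 13-value feature rows and MEDV targets.

    Assumes `header_lines` lines of description, then two lines per record.
    """
    lines = [ln for ln in text.splitlines()[header_lines:] if ln.strip()]
    features, targets = [], []
    # Pair up consecutive lines: the first holds 11 values, the second 3.
    for first, second in zip(lines[0::2], lines[1::2]):
        values = [float(tok) for tok in (first + " " + second).split()]
        if len(values) != 14:
            raise ValueError(f"expected 14 values per record, got {len(values)}")
        features.append(values[:13])
        targets.append(values[13])
    return features, targets
```

With the file's text in hand, the two DataFrames from the block above could then be rebuilt as X = pd.DataFrame(features, columns=FEATURE_NAMES) and Y = pd.DataFrame(targets, columns=["MEDV"]).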

Block 4: Sidebar and User Input

# Sidebar
# Header of Specify Input Parameters
st.sidebar.header('Specify Input Parameters')

def user_input_features():
    # ... (code for sliders and user input)
    return features

df = user_input_features()

This block creates a sidebar in the web app where users can specify input parameters using sliders for each feature. The user_input_features function returns a DataFrame (df) containing the specified input parameters.
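The slider code itself is elided in the listing above. As a rough illustration of how one slider per feature might be wired up, the sketch below derives each slider's range and default from the observed minimum, maximum, and mean of the feature. This is an assumption about a reasonable design, not the original author's code. The helpers slider_specs, collect_inputs, and fake_slider are hypothetical, and the sample values are made up. The slider is passed in as a function so the logic can run outside Streamlit.

```python
def slider_specs(columns):
    """Map each feature name to (min, max, mean) of its observed values."""
    specs = {}
    for name, values in columns.items():
        specs[name] = (float(min(values)), float(max(values)),
                       float(sum(values) / len(values)))
    return specs


def collect_inputs(specs, slider):
    """Ask `slider` for one value per feature and return them as a dict."""
    return {name: slider(name, lo, hi, default)
            for name, (lo, hi, default) in specs.items()}


def fake_slider(label, min_value, max_value, value):
    """Stand-in for st.sidebar.slider that simply returns the default."""
    return value


# Illustrative values only, not taken from the Boston dataset.
columns = {"RM": [5.0, 6.0, 7.0], "LSTAT": [4.0, 12.0, 20.0]}
specs = slider_specs(columns)
features = collect_inputs(specs, fake_slider)
print(features)
```

Inside the app, st.sidebar.slider accepts the same (label, min_value, max_value, value) arguments and could be passed in place of fake_slider; wrapping the result as pd.DataFrame(features, index=[0]) would give a one-row DataFrame like df.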

Block 5: Display Specified Input Parameters

# Main Panel
# Print specified input parameters
st.header('Specified Input parameters')
st.write(df)
st.write('---')

This block displays the specified input parameters in the main panel of the web app.

Block 6: Model Training and Prediction

# Build Regression Model
model = RandomForestRegressor()
model.fit(X, Y)
# Apply Model to Make Prediction
prediction = model.predict(df)

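Here, a RandomForestRegressor with default settings is trained on the entire Boston dataset (X and Y), so no rows are held out to evaluate it. The trained model then predicts a price for the user-specified input parameters in df. Because Streamlit reruns the script on each interaction, the model is refit every time a slider moves. Note that Y is a single-column DataFrame; scikit-learn accepts it but warns that a 1-D target was expected, and passing Y.values.ravel() avoids the warning.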
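To get an error estimate for a model like this, one common approach is a train/test split. The sketch below is a hedged illustration on synthetic data, not the Boston dataset: the column names are borrowed from it only for familiarity, and the linear relationship, the noise, the random_state values, and the sample input are all assumptions made here. It also passes the target as a 1-D Series.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Boston dataset).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(0, 10, size=(200, 3)),
                 columns=["RM", "LSTAT", "PTRATIO"])
y = 5 * X["RM"] - 2 * X["LSTAT"] + rng.normal(0, 1, size=200)  # 1-D target

# Hold out a quarter of the rows so the error is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
holdout_mae = mean_absolute_error(y_test, model.predict(X_test))
print("hold-out MAE:", holdout_mae)

# Predict for one user-style input row with the same columns, in the same order.
df = pd.DataFrame({"RM": [6.0], "LSTAT": [4.0], "PTRATIO": [5.0]})
prediction = model.predict(df)
print("prediction:", prediction)
```

Fixing random_state makes the run repeatable, which helps when comparing settings; the app as written leaves it unset, so its predictions can vary slightly between reruns.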

Block 7: Display Prediction

st.header('Prediction of MEDV')
st.write(prediction)
st.write('---')

This block displays the predicted house price (MEDV) in the web app.

Block 8: SHAP Explanation and Visualization

# Explaining the model's predictions using SHAP values
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

st.header('Feature Importance')
plt.title('Feature importance based on SHAP values')
shap.summary_plot(shap_values, X)
st.pyplot(bbox_inches='tight')
st.write('---')

plt.title('Feature importance based on SHAP values (Bar)')
shap.summary_plot(shap_values, X, plot_type="bar")
st.pyplot(bbox_inches='tight')

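In this block, shap.TreeExplainer computes SHAP values for every row of X to explain the model's predictions. Two plots are generated: the default summary plot, which shows the distribution of each feature's SHAP values across the dataset, and a bar plot that ranks features by mean absolute SHAP value. Both are rendered in the web app with st.pyplot. Calling st.pyplot without a figure argument relies on Matplotlib's global figure, which recent Streamlit versions discourage.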
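To make the idea behind these plots concrete, the sketch below computes exact Shapley values for a tiny hand-written model by enumerating every feature subset. It is a simplified illustration under stated assumptions, not what shap.TreeExplainer does internally: that class uses a tree-specific algorithm, whereas here missing features are replaced by a single fixed baseline row. The toy_price function, the baseline, and the sample rows are all invented for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the instance's value, the rest the baseline's.
        return predict([x[j] if j in subset else baseline[j] for j in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_price(features):
    """Hypothetical two-feature price model with an interaction term."""
    rooms, lstat = features
    return 4.0 * rooms - 0.5 * lstat + 0.1 * rooms * lstat


baseline = [6.0, 12.0]                               # made-up reference row
samples = [[7.0, 5.0], [5.0, 20.0], [8.0, 10.0]]     # made-up inputs
all_phi = [shapley_values(toy_price, x, baseline) for x in samples]

for x, phi in zip(samples, all_phi):
    gap = toy_price(x) - toy_price(baseline)
    # The contributions add up to the gap between prediction and baseline.
    print(x, [round(p, 3) for p in phi], round(gap, 3))

# Mean absolute contribution per feature: the quantity a bar-style plot ranks.
mean_abs = [sum(abs(phi[i]) for phi in all_phi) / len(all_phi)
            for i in range(len(baseline))]
print(mean_abs)
```

The per-row contributions correspond to what the first summary plot scatters for each feature, and mean_abs corresponds to the bar lengths in the second. Enumerating subsets grows exponentially with the number of features, which is why tree-specific algorithms are used in practice.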


In conclusion, this Streamlit application is an interactive tool for predicting Boston house prices. It combines data loading, model training, prediction, and feature importance visualization in a single script. Users adjust house features with sliders and receive a prediction for those inputs from a random forest regression model trained on the Boston housing dataset. The application then uses SHAP values to explain the model's predictions and to show which features influence predicted prices most. Because the model is fit on the full dataset without a held-out evaluation, the app is best treated as an educational resource for machine learning and SHAP-based interpretability rather than as a source of real estate advice.