Building Age Classifier (Zero-shot) using GPT-4 & Facade Image in London (FI-London) Dataset

University College London

Abstract

A building's age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images using deep learning. However, building an accurate deep learning model requires a considerable amount of labelled training data, and the trained models often have geographical constraints. Recently, large pre-trained vision language models (VLMs) such as GPT-4 Vision, which demonstrate significant generalisation capabilities, have emerged as potential training-free tools for dealing with specific vision tasks, but their applicability and reliability for building information remain unexplored. In this study, a zero-shot building age classifier for facade images is developed using prompts that include logical instructions. Taking London as a test case, we introduce a new dataset, FI-London, comprising facade images and building age epochs. Although the training-free classifier achieved a modest accuracy of 39.69%, the mean absolute error of 0.85 decades indicates that the model can predict building age epochs successfully albeit with a small bias. The ensuing discussion reveals that the classifier struggles to predict the age of very old buildings and is challenged by fine-grained predictions within 2 decades. Overall, the classifier utilising GPT-4 Vision is capable of predicting the rough age epoch of a building from a single facade image without any training.

🏒 FI-London

FI-London contains a total of 131 high-resolution building facade images. These images contain individual building facades of varying building types such as residential apartment blocks, terraced houses, commercial properties, etc. FI-London covers 15 different architectural age epochs, derived from the Colourring Cities Research Programme. Due to the unbalanced sample distribution and small sample size, FI-London can currently only be used for testing. Please download FI-London here.

Sample of Each Age Epoch

Sample Image

Data Distribution

Data Distribution

Colection Location

Brent

Collection Location in Brent

Camden

Collection Location in Camden

πŸ€– Zero-shot Classifier

A training-free classifier is proposed in this study to identify the age epochs of buildings using GPT-4 Vision. A series of command prompts is crafted to perform zero-sample classification directly using GPT-4 Vision. The predicted age epoch and descriptive reasoning process are output by the classifie. Please find the demo here.

Workflow

Workflow

Prompt

Prompt

*The GPT model is GPT-4 Vison Preview. The cost was $2.08 for 131 images.

πŸ“Š Result

The classifier finally resulted in 52 correct and 79 incorrect cases with 1 "hallucination" case. Although GPT-4 Vision performed slightly less well in detailed prediction with 39.69% accuracy, the Mean Absolute Error (MAE) was only 0.85 decade, which means most of the age epochs can be predicted successfully to be in adjacent but not far off age epochs.

Visual Result Gallery

Please click πŸ‘‰ an image πŸ‘ˆ to see the predicted result 🌟.

ID: Image ID in FI-London

Ground Truth: Actual building age epoch

Predicted Result: Predicted result from our proposed classifier

Reason: Descriptive reasoning process given by GPT-4

General Performance

Output Example

Normal Confusion Matrix

Evaluation Metrics

Confusion Matrix with one adjacent age epoch

Normal Confusion Matrix

Normal Confusion Matrix

with one adjacent epoch

Confusion Matrix with one adjacent age epoch

with two adjacent epochs

Confusion Matrix with two adjacent age epochs

πŸ₯° Acknowledgement

This zero-shot classifier is based on GPT-4 Vision by OpenAI. The Facade Image in London (FI-London) dataset is grounded in Colourring Cities Research Programme by the Alan Turing Institude and Colouring London by Hudson, P., et al. in 2018. We thank for their great works.

This work is supported by the Engineering and Physical Sciences Research Council through an industrial CASE studentship with Ordnance Survey. We sincerely thank Dr James Haworth (UCL) and Dr Stefano Cavazzi (OS) for their help.

πŸ“ƒ BibTeX

@article{zeng2024zeroshot,
         title={Zero-shot Building Age Classification from Facade Image Using GPT-4},
         author={Zichao Zeng and June Moh Goo and Xinglei Wang and Bin Chi and Meihui Wang and Jan Boehm},
         journal={arXiv preprint arXiv:2404.09921},
         year={2024}
}