Residential unit count model
  • 11 Oct 2023

Article Summary



The Residential Unit Count (RUC) model estimates the residential unit count for locations where many people reside. This attribute is available in the LightBox Location BSL add-on product for SmartFabric.

Introduction

LightBox constructs SmartFabric from thousands of data sources, and it contains over two hundred fifty million addresses. As a result, the dataset includes addresses with filled, estimated, or sometimes unreported unit counts. Estimating residential unit count for locations where many people reside (high rises, garden-style apartment buildings, and more) represents the most significant challenge.

This article describes our specific methodologies for these complex housing types.

Our goal is to neither overestimate nor underestimate residential unit count (RUC), which means that we need to detect and mitigate potential anomalies in our unit count data. To improve the quality of SmartFabric's residential unit count totals, we have developed a methodology that detects such anomalies and uses the result to correct them. Using a real-world residential unit count dataset covering hundreds of thousands of properties across the country, we created three machine learning models that compute an estimated unit count per property, along with a low estimate and a high estimate of total unit count that together define a range.

Each property in SmartFabric is routed to one of the three pre-trained models depending on the availability of proprietary data for the address. The first model, the benchmark, requires the least data to compute a unit count prediction. The third model requires the most data points and produces a more holistic unit count prediction for an address. The second model lies in between. Each of the three models is trained using proprietary data that has been vetted internally and validated by one or more trustworthy external sources.
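
The routing described above can be sketched as follows. This is a minimal illustration, not LightBox's implementation: the field names (address_count, building_area_sqft, story_count, year_built), the tier labels, and the rule "use the most demanding model whose inputs are all present" are assumptions made for the example, since the actual inputs to each model are not published.

```python
from typing import Mapping, Optional

# Hypothetical input fields for each tier (assumed for illustration only).
BENCHMARK_FIELDS = ("address_count",)
MIDDLE_FIELDS = BENCHMARK_FIELDS + ("building_area_sqft",)
FULL_FIELDS = MIDDLE_FIELDS + ("story_count", "year_built")


def has_all(record: Mapping[str, object], fields: tuple) -> bool:
    """Return True when every listed field is present and not None."""
    return all(record.get(name) is not None for name in fields)


def select_model(record: Mapping[str, object]) -> Optional[str]:
    """Pick the most data-hungry tier whose inputs are all available.

    Returns None when even the benchmark tier's inputs are missing.
    """
    if has_all(record, FULL_FIELDS):
        return "full"
    if has_all(record, MIDDLE_FIELDS):
        return "middle"
    if has_all(record, BENCHMARK_FIELDS):
        return "benchmark"
    return None
```

Checking the tiers from most to least demanding means a property with rich data is never sent to the benchmark model by accident.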

To ensure that our models produce data that aligns with other key datasets, we have several internal checks in place. These checks ensure that unit count estimates are produced only where our proprietary data is significant and matches the conditions each model was trained on. Because the models are trained to estimate residential unit count specifically for addresses that we recognize as multi-family dwellings, they suggest counts of two or more units for an address. With reasonable tested accuracy, the models support traditional apartment buildings as well as properties that we recognize as hotels, nursing homes, and other group quarters.

Ultimately, we combine address counts with our model results to determine the final residential unit count.
For example, when the current address unit count is within the range of the low and high estimates, the original unit count is kept in SmartFabric. However, if the current address unit count falls outside the estimated range, it is replaced with the estimated unit count from the model.
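
The keep-or-replace rule above can be sketched in a few lines. This is a hedged illustration: the UnitEstimate type and the reconcile_unit_count function are hypothetical names, and two details are assumptions not stated in the article, namely that the range bounds are inclusive and that an unreported (missing) count falls back to the model estimate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnitEstimate:
    """Model output for one property (hypothetical container)."""
    low: int       # low estimate of total units
    estimate: int  # point estimate of total units
    high: int      # high estimate of total units


def reconcile_unit_count(current: Optional[int], est: UnitEstimate) -> int:
    """Keep the current count if it lies in [low, high], else use the estimate.

    Assumptions: bounds are inclusive, and a missing current count is
    treated like an out-of-range value.
    """
    if current is not None and est.low <= current <= est.high:
        return current
    return est.estimate


est = UnitEstimate(low=40, estimate=50, high=60)
print(reconcile_unit_count(48, est))  # in range, so 48 is kept
print(reconcile_unit_count(12, est))  # out of range, so 50 is used
```

The rule treats the existing count as trustworthy unless the model range contradicts it, so in-range source values are never overwritten by an estimate.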

By using LightBox's own proven address and assessor universe, we can better assess and improve residential unit count data quality. We will continue to evolve this process to enhance SmartFabric's reporting of unit count.

The model described above mainly addresses the challenges of multi-dwelling units and group quarters. LightBox determines unit counts for single-family residences, duplexes, triplexes, and quadruplexes through more straightforward approaches that use address data, assessor data, and more.

