# Continuous Accuracy Testing for Accurate Air Quality Data

BreezoMeter’s Algorithm Team explains how the accuracy of our computational air quality model is continuously tested and examined for error.

Learn how BreezoMeter continuously monitors our air quality data algorithms to ensure high levels of accuracy for our clients' innovative and impactful integrations.

## Why it’s important to calculate and monitor data accuracy

When providing a product that is the result of a computational model, one of the first things to check is the quality of the result: is it good enough for the end users’ needs? In the case of air quality data, is it accurate enough to help people improve their health by reducing their exposure to harmful air pollutants? A suitable way to answer these questions is to test the accuracy of the model as a whole. This kind of test typically produces a single result that can act as a metric for the accuracy of the product.

For example, when looking at a weather temperature forecast of 20 degrees Celsius, it can be useful to know how likely it is that this number will be close to reality: is it usually wrong by 1 degree or by 15 degrees? The same goes for air quality data.

Knowing the performance of a model, and possibly of different parts of it, is very valuable: it can reveal weak points or data anomalies, and provide insight into what can be improved and how.

## What methods can be used?

Assessing the quality of a model’s result, also called model validation, is done by comparing the result to data considered to be true. In statistics, this “true data,” or ground truth, can be any set of data points that represents the objective of the model. When the model describes a physical phenomenon, ground truth can be obtained through measurements: going out into the field with equipment and measuring the phenomenon that the model was designed to calculate. When the ground truth measurements are themselves part of the model’s input, another way to assess the result is cross validation.

Cross validation (CV) is a statistical technique for model validation in which a subset of the input data is left out, to later serve as the ground truth and be compared against the model’s results. This comparison can be a simple subtraction, which yields the model’s errors.

This process is typically repeated, each time with a different part of the data left out, to account for the data’s variance (each data chunk is slightly different). The results of all comparisons, which are the errors, are then combined to best represent the overall situation. One way to combine the errors is to take their average, where n is the number of repetitions:

$$Error = \frac{1}{n}\sum_{i=1}^{n} \;[(ground\;truth)_i - (model's\;result)_i]$$
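As a minimal sketch of this averaging step, the Python snippet below computes the mean of the signed errors. The function name `mean_error` and the sample ozone readings are hypothetical, chosen only for illustration; they are not taken from BreezoMeter's system.

```python
def mean_error(ground_truth, model_results):
    """Average of the signed errors (ground truth minus model result)."""
    if len(ground_truth) != len(model_results):
        raise ValueError("inputs must have the same length")
    if not ground_truth:
        raise ValueError("inputs must not be empty")
    errors = [g - m for g, m in zip(ground_truth, model_results)]
    return sum(errors) / len(errors)


# Hypothetical ozone values in ppb: held-out measurements vs. model output.
measured = [30.0, 42.0, 25.0, 51.0]
modelled = [28.0, 45.0, 25.0, 47.0]

# Individual errors are 2, -3, 0 and 4, so the mean error is 0.75 ppb.
bias = mean_error(measured, modelled)
```

Because the errors keep their sign, positive and negative values partly cancel. The result therefore describes the model's overall bias more than the typical size of an error.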

There are several types of CV, named for the way the data is divided. For example, in k-fold cross validation, the input data is randomly partitioned into k subsets, and the model is run k times, each time with a different subset left out to serve as the ground truth. Another example is leave-one-out cross validation (LOOCV), in which a single data point is removed from the input each time the model is run.
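The two schemes can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the helper names, the station readings, and especially the stand-in "model", which simply predicts the mean of the remaining readings and is not BreezoMeter's algorithm.

```python
import random


def k_fold_indices(n_points, k, seed=0):
    """Randomly partition the indices 0..n_points-1 into k disjoint folds."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validation_errors(values, folds, predict):
    """Hold out each fold in turn and return one error per data point.

    Each error is the held-out value (ground truth) minus the prediction
    made from the remaining values.
    """
    errors = {}
    for fold in folds:
        held_out = set(fold)
        remaining = [v for i, v in enumerate(values) if i not in held_out]
        prediction = predict(remaining)
        for i in fold:
            errors[i] = values[i] - prediction
    return [errors[i] for i in range(len(values))]


def mean_of_remaining(remaining):
    # Hypothetical stand-in model: predict the mean of the remaining readings.
    return sum(remaining) / len(remaining)


# Hypothetical readings from four monitoring stations.
station_readings = [10.0, 20.0, 30.0, 40.0]

# k-fold with k=2: two random folds of two stations each.
two_folds = k_fold_indices(len(station_readings), k=2)
two_fold_errors = cross_validation_errors(station_readings, two_folds, mean_of_remaining)

# LOOCV is the special case where k equals the number of data points.
loo_folds = k_fold_indices(len(station_readings), k=len(station_readings))
loo_errors = cross_validation_errors(station_readings, loo_folds, mean_of_remaining)
```

In the LOOCV run, the first station's reading of 10 is compared against the mean of the other three (30), giving an error of -20.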

## The Tools BreezoMeter has Developed

Mean, or average, is a widely known term: it is the sum of a group of numbers, divided by how many numbers are in the group. Percentiles and the median, on the other hand, are less commonly used in non-scientific contexts. According to Wikipedia, a percentile is “the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20% of the observations may be found.” (Wikipedia, 2018) The median is the 50th percentile: the value in the middle of the dataset. Percentiles are important for a deeper understanding of the model’s errors and behaviour, since they provide information on the distribution of the error values: are most errors close to the mean? Are most errors low and only a few of them very high?
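The short Python sketch below shows how these statistics can describe a set of errors. The error values are made up, and the helper `percentile_nearest_rank` is a hypothetical name. It uses the nearest-rank convention, which is only one of several common ways to define a percentile and is chosen here as an assumption.

```python
import math
import statistics


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank in the sorted list
    return ordered[rank - 1]


# Hypothetical absolute errors in ppb: mostly small, with one large outlier.
abs_errors = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0, 25.0]

mean_abs = statistics.mean(abs_errors)            # 5.2
median_abs = statistics.median(abs_errors)        # 3.0
p90 = percentile_nearest_rank(abs_errors, 90)     # 5.0
```

In this made-up sample the mean (5.2) is higher than the 90th percentile (5.0). This is a sign that most errors are low and a single very high value is pulling the mean up.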

A similar insight can be gained from the root mean squared error (RMSE):

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \;[(ground\;truth)_i - (model's\;result)_i]^2}$$

By squaring the errors we give more weight to larger values, making this statistic more sensitive to outliers (i.e., extreme values). Therefore, if the RMSE is significantly larger than the mean of the absolute errors, we know there are probably some large error values in our results.
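To illustrate this sensitivity, the sketch below compares the RMSE with the mean absolute error for two hypothetical sets of model results. The function names and numbers are assumptions made for the example. Both sets have the same mean absolute error, but only one contains a large outlier.

```python
import math


def rmse(ground_truth, model_results):
    """Root mean squared error between ground truth and model results."""
    squared = [(g - m) ** 2 for g, m in zip(ground_truth, model_results)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_error(ground_truth, model_results):
    """Mean of the absolute errors between ground truth and model results."""
    absolute = [abs(g - m) for g, m in zip(ground_truth, model_results)]
    return sum(absolute) / len(absolute)


# Hypothetical measurements (ppb) and two hypothetical sets of model results.
measured = [30.0, 30.0, 30.0, 30.0]
steady = [28.0, 32.0, 28.0, 32.0]  # errors: 2, -2, 2, -2
spiky = [30.0, 30.0, 30.0, 22.0]   # errors: 0, 0, 0, 8

steady_mae = mean_absolute_error(measured, steady)  # 2.0
steady_rmse = rmse(measured, steady)                # 2.0
spiky_mae = mean_absolute_error(measured, spiky)    # 2.0
spiky_rmse = rmse(measured, spiky)                  # 4.0
```

When all errors have the same size, the RMSE equals the mean absolute error. In the second set, the single error of 8 ppb doubles the RMSE while the mean absolute error stays at 2.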

## Dynamic Continuous Accuracy Testing: The Advantages

To complement the continuous accuracy testing (CAT) system, we also view the results as graphs in a dynamic, live CAT report. The report is built in Google’s Data Studio, which gives great flexibility: each type of accuracy test we perform has its own section and graphs, which can show data from varying time periods. We use this dynamic CAT report daily to monitor our performance and support important decisions. For example, it helps identify areas of the model that can benefit from improvement, and lets us demonstrate the accuracy of the models to stakeholders. It also shows how changes made to the model affect the level of accuracy, so those changes can be adjusted accordingly. Knowing accuracy levels helps identify problems before they become chronic.

Example graph from BreezoMeter’s CAT report. These are our model’s hourly errors over a two-week period in May 2018 for the pollutant ground-level ozone (O3), in parts per billion (ppb). The data is from the global air quality API, which covers 80+ countries. Each colored line represents a different statistic of the errors, and the grey bars represent the number of monitoring stations included in each calculation.

At BreezoMeter, accuracy in air quality data is of utmost importance.

Ms. Shaked Friedman has a B.Sc from the Technion, Israel Institute of Technology, and has been an Environmental Engineer on the Research and Development Team at BreezoMeter since 2014.
