Integrating air quality data is intended to give users around the world relevant information. One of the best ways to ensure relevance is maintaining a high level of accuracy, something that cannot be accomplished with raw data alone. This is where BreezoMeter’s big data solution for air pollution data comes in.
A high level of accuracy is essential for successful air quality data integration, and it must be continuously monitored.
Governments are working diligently to generate and share accurate air quality information with the public. It starts with building and maintaining complex and costly monitoring stations. Governments then usually make the collected data available online to the public and businesses for free, which is referred to as open data. In the United States, the EPA (Environmental Protection Agency) provides a real-time Air Quality Index (AQI) at a general level, per city and state, while AirNow pools together shared data among government agencies to report on AQI.
Although these initiatives and information sources are both essential and helpful for broadly informing citizens and cities about air pollution, they often lack the key elements needed to give businesses and individuals relevant information that can guide actions toward improved health outcomes, namely real-time and location-based data.
Businesses need location-based, real-time air pollution data
Government stations spread throughout cities and countries return air quality information that is relevant only at their exact locations, and sometimes this data is delayed by several hours. Additionally, air quality can change dramatically over time, even within an hour, and over space, within a few hundred meters (or between two city blocks), due to weather conditions (e.g., wind), pollution events (e.g., traffic) and more. Therefore, relying on government data alone doesn’t make sense for products and technology that need precise air quality information: by the time the data is available, conditions are likely to have changed, or the reading may never have been relevant to the point of interest.
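To make the two gaps concrete, here is a minimal Python sketch that measures how far a point of interest is from the nearest station and how old that station's latest reading is. The station names, coordinates, AQI values and timestamps are made up for illustration, and `StationReading` and `nearest_reading` are hypothetical helpers, not part of any government feed or of BreezoMeter's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from math import asin, cos, radians, sin, sqrt


@dataclass
class StationReading:
    """One reading from a monitoring station (hypothetical structure)."""
    name: str
    lat: float
    lon: float
    aqi: int
    measured_at: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


def nearest_reading(lat, lon, readings, now):
    """Return (reading, distance_km, age) for the station closest to (lat, lon)."""
    closest = min(readings, key=lambda r: haversine_km(lat, lon, r.lat, r.lon))
    distance_km = haversine_km(lat, lon, closest.lat, closest.lon)
    return closest, distance_km, now - closest.measured_at


# Made-up example data: two stations and one point of interest.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
readings = [
    StationReading("Station A", 40.7128, -74.0060, 42, now - timedelta(hours=3)),
    StationReading("Station B", 40.7831, -73.9712, 55, now - timedelta(minutes=30)),
]

closest, distance_km, age = nearest_reading(40.7180, -74.0020, readings, now)
print(f"{closest.name}: AQI {closest.aqi}, {distance_km:.2f} km away, {age} old")
```

In this made-up case the closest station is several city blocks away and its reading is hours old, which is exactly the situation described above: the number exists, but it may not describe the air at the point of interest right now.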
In contrast, BreezoMeter’s air quality data, easily accessible through API calls, is real-time and location-based, and is already helping companies both big and small make their solutions and products air pollution smart.
QA for accuracy
In addition to being limited to the exact area of the sensor station, governmental data is what we call raw data. It isn’t verified, tested or monitored, which means it can be significantly inaccurate.
BreezoMeter verifies all its data sources, and performs strict QA to ensure the highest level of accuracy. In addition, our proprietary models and machine learning algorithms bring our data feed quality to the next level.
To learn more about BreezoMeter’s continuous accuracy testing processes, check out this post from our algorithm team, which explains why it’s important to calculate and monitor data accuracy, and how we do it.
Raw or right data? Models, big data & machine learning: the benefits
Using big data analytics, BreezoMeter combines data from many sources to determine highly accurate air pollution levels. Governmental monitoring stations provide, at best, hourly concentration readings for air pollution. To increase the accuracy of our predictions, we add supplementary data from satellite measurements, meteorological and traffic data, and land cover data, together with air quality models such as the European program Copernicus Atmosphere Monitoring Service (CAMS). Naturally, this process generates big data, and our models learn from this data in order to constantly improve themselves.
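As a rough illustration of what it means to estimate pollution between stations, the sketch below uses inverse-distance weighting, a simple textbook interpolation method. This is not BreezoMeter's model, which is proprietary and also draws on the satellite, weather, traffic and land cover data described above. The `idw_estimate` function, the station positions (given in kilometers on a local grid) and the concentration values are all assumptions made for this example.

```python
from math import hypot


def idw_estimate(x_km, y_km, stations, power=2.0):
    """Estimate a concentration at (x_km, y_km) by inverse-distance weighting.

    stations: iterable of (x_km, y_km, concentration) tuples.
    Closer stations get larger weights: weight = 1 / distance ** power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = hypot(x_km - sx, y_km - sy)
        if distance == 0.0:
            return value  # the point sits exactly on a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


# Made-up stations 2 km apart, with different readings.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]

midpoint = idw_estimate(1.0, 0.0, stations)      # equally far from both stations
near_first = idw_estimate(0.5, 0.0, stations)    # dominated by the first station
print(midpoint, near_first)
```

Even this simple method shows the limits of station data alone: the estimate depends only on distance, so it cannot account for a busy road or a change in wind between the stations. That is the gap the additional data sources are meant to fill.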
Emil Fisher, Co-founder & CTO: “Every hour we validate and organize over 1.8TB of data, while calculating 7.5 billion pollutant concentrations for 440 million geographical points worldwide. In this process we produce, validate and organize 1.8TB of new data every hour. In order to succeed in this mission, we use Google Cloud services to manage our data, and run hundreds of CPUs every second. Many government agencies don’t have the capacity or willingness to carry out such heavy big data analysis.”
Machine learning is also used to enhance the readings’ quality, provide forecasts and ensure BreezoMeter offers impactful real-time air quality, instead of delayed information with consequently little relevance.