Generalised non-negative matrix factorisation for air pollution source apportionment
Journal
Science of the Total Environment
ISSN
0048-9697
Date Issued
2022-09-15
Author(s)
Lekinwala, Nirav L.
Bhushan, Mani
Abstract
Source Apportionment (SA) techniques are commonly used to identify key sources of air pollution, thereby providing critical inputs for policy measures. Positive Matrix Factorisation (PMF) (Paatero and Tapper, 1994) is a widely used SA technique. PMF takes the speciated concentration data (X) collected over several days and factorises it into source contribution (G) and source profile (F) matrices under a non-negativity constraint. To do so, it solves an optimisation problem in which the elements of X are weighted by the inverse of the standard deviations of the corresponding errors introduced during sampling and chemical analysis. PMF thus implicitly assumes that the errors in different elements of the X matrix are uncorrelated. This assumption may not hold, since the sampling and chemical analysis steps deployed in any data-collection campaign will inevitably lead to correlated errors. While other Non-Negative Matrix Factorisation (NMF) methods in the literature could potentially be used for SA, these also make various restrictive assumptions about the error covariance structure. In this work, we propose a new method called Generalised Non-Negative Matrix Factorisation (GNMF) to fill this gap. In particular, the proposed method can incorporate any error covariance matrix without making restrictive assumptions about its structure. To this end, we integrate the full error covariance matrix into the objective function that is minimised to obtain the F and G matrices. We derive the corresponding update rules for obtaining these matrices iteratively. To ensure non-negativity, we extend the multiplicative and projected gradient-based ideas available in the NMF literature to the proposed GNMF approach. The proposed method subsumes various NMF methods available in the literature as special cases. The utility of the proposed approach is demonstrated by comparing its performance with other methods on an SA problem using a dataset derived from field measurements.
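The idea of weighting the residual by a full error covariance matrix can be illustrated with a minimal sketch. This is not the update rule derived in the paper; it is a generic projected-gradient step written under stated assumptions. It assumes X ≈ GF with X of shape (n, m), G of shape (n, p) and F of shape (p, m), that the residual is vectorised row by row, and that the inverse error covariance matrix (here called `Sigma_inv`, of shape (n·m, n·m)) is symmetric and indexed in the same row-major order. The function names, the fixed step size and the simultaneous update of G and F are all illustrative choices.

```python
import numpy as np

def gnmf_objective(X, G, F, Sigma_inv):
    """Covariance-weighted squared residual: vec(X - GF)^T Sigma^{-1} vec(X - GF).

    Assumption: vec() is row-major (numpy's default ravel order) and
    Sigma_inv is symmetric and indexed consistently with it.
    """
    e = (X - G @ F).ravel()
    return float(e @ Sigma_inv @ e)

def projected_gradient_step(X, G, F, Sigma_inv, step=1e-5):
    """One illustrative projected-gradient step on G and F.

    This is a generic sketch, not the paper's derived update rule.
    Negative entries are clipped to zero to keep G and F non-negative.
    """
    # Weighted residual reshaped back to the shape of X.
    R = (Sigma_inv @ (X - G @ F).ravel()).reshape(X.shape)
    grad_G = -2.0 * R @ F.T   # gradient of the objective with respect to G
    grad_F = -2.0 * G.T @ R   # gradient of the objective with respect to F
    G_new = np.maximum(G - step * grad_G, 0.0)
    F_new = np.maximum(F - step * grad_F, 0.0)
    return G_new, F_new
```

When `Sigma_inv` is diagonal with entries 1/σ², the objective above reduces to the sum of squared residuals scaled by the per-element standard deviations, which is the uncorrelated-error weighting the abstract attributes to PMF. Off-diagonal entries are where correlated errors would enter.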
Subjects