To quantify the uncertainty of environmental footprints introduced by the proportionality assumption, we proceed as follows. We generate 4897^{Footnote 1} MRIO tables with globally randomised import allocations. With each of these new MRIO tables we then calculate national and industry footprints and investigate the variability of these footprints.
Data
We use EXIOBASE (Version 3.4), a global MRIO database that is based on the proportionality assumption and is widely used for environmental footprint analysis (Stadler et al. 2018). We use the tables for the year 2011 in current prices in their original resolution, covering 163 industries in 44 countries and 5 Rest of the World (RoW) regions. EXIOBASE provides MRIOs in two versions, which differ in the model used to create a symmetric MRIO table from a rectangular MRSUT (Eurostat 2008): industry-by-industry tables based on the fixed product sales assumption covering 163 industries, and product-by-product tables based on the industry technology assumption covering 200 products. Product-by-product MRIOs offer a higher level of detail for (trade) transactions than industry-by-industry MRIOs, as the former distinguish between different types of coal, natural gas, coke, refined products, biofuels and gas distribution services. However, this additional detail comes at the expense of a longer computation time for calculating environmental footprints. Since the computation time of the matrix inversion required for calculating environmental footprints (see below) grows steeply with the matrix dimension (Çetinay et al. 2020; roughly cubically for standard dense algorithms), and computation time was constrained to 40 h by the cluster used for the analysis, we decided to work with the industry-by-industry tables. This allows us to perform considerably more simulations and thus to obtain more robust results. We believe that the choice of model does not significantly change our conclusions since, as stated by Eurostat in their Manual of Supply, Use and Input–Output Tables, under most circumstances “industry-by-industry tables [are considered] a good approximation of product-by-product input–output tables” (Eurostat 2008).
Generating a MRIO table with globally randomised import allocations
We first show our approach to randomise the allocation of the imports of the output of one industry to the target sectors in one country. To generate an entire new MRIO table with globally randomised import allocations this procedure has to be repeated for all industries and countries covered by the MRIO.
We refer to the sets of exporting regions as R, exporting industries as I, importing regions as S and importing target sectors (comprising industries and final demand categories) as J. Matrices (capital letters) and vectors are represented as bold characters. For demonstration in this paper, we use a simple industry-by-industry MRIO system with four regions, four industries and two final demand categories to exemplify how the imports of the output of industry \(i \in I\) (say “leather industry”) from the exporting countries R are randomly allocated to the six target sectors J of region \(s \in S\) (say Germany) (see Fig. 1). Since the proportionality assumption concerns only the inter-industry matrix \({\mathbf {Z}}\) and the final demand matrix \({\mathbf {Y}}\), at this stage we omit the other elements of typical environmentally-extended MRIO tables (primary inputs, total output, environmental extensions).
The problem we are facing is the allocation of the import flows of a given good (here: the output of industry i = leather industry) from different countries R to different target sectors J in a given country (here: s = Germany), where both (i) the total amount of imports from each exporting country \(r \in R\), and (ii) the total amount of imported inputs for each target sector \(j \in J\) are known. This problem can be represented in the form of a matrix (the “import matrix” \({\mathbf {T}}^{si}\)), where both the row sums \({\mathbf {s}}^{si}\) (= imports of the output of industry i to region s by region of origin R) and the column sums \({\mathbf {u}}^{si}\) (= inputs of industry i’s output by target sector J in region s) are known, but the cell entries are not. Formally, we thus know:
$$\sum _{j\in J}t^{si}_{rj}= s^{si}_{r} \quad \forall \, r \in R$$
(1)
$$\sum _{r\in R}t^{si}_{rj}= u^{si}_{j} \quad \forall \, j \in J$$
(2)
Figure 1 shows how we extract the import matrix \({\mathbf {T}}^{si}\) from the inter-industry matrix \({\mathbf {Z}}\) and the final demand matrix \({\mathbf {Y}}\). For the sake of readability, we omit the superscripts in the following. Summing \({\mathbf {T}}\) row-wise we get the vector of import flows \({\mathbf {s}}\) depicting the imports (supply) of the leather industry’s output from different exporting regions to Germany (Eq. 1). Summing \({\mathbf {T}}\) column-wise we get the vector \({\mathbf {u}}\) depicting the use of the imported leather industry’s output in different industries and final consumption categories in Germany (Eq. 2).
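As a minimal sketch of Eqs. 1 and 2, the following Python snippet computes the supply vector and the use vector from a small import matrix. The numbers are purely hypothetical and are not taken from EXIOBASE; the three rows stand for the three foreign exporting regions of the four-region example, and the six columns for the four industries and two final demand categories of the importing region.

```python
# Hypothetical import matrix T (not EXIOBASE data):
# rows = exporting regions r, columns = target sectors j of the importing region.
T = [
    [2.0, 0.0, 1.0, 0.0, 3.0, 0.0],
    [0.0, 4.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0, 2.0, 0.0],
]

def row_sums(matrix):
    """Eq. 1: imports by region of origin (supply vector s)."""
    return [sum(row) for row in matrix]

def col_sums(matrix):
    """Eq. 2: use of imports by target sector (use vector u)."""
    return [sum(col) for col in zip(*matrix)]

s = row_sums(T)
u = col_sums(T)

# Trade flows are balanced: total supply equals total use.
total_supply = sum(s)
total_use = sum(u)
```

Only `s` and `u` are carried over into the randomisation; the individual cell entries of `T` are treated as unknown.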
Now, the aim is to randomly allocate the region-specific supply \({\mathbf {s}}\) to the industry-specific use \({\mathbf {u}}\). Thus, we want a ‘new’ randomised import matrix \({\mathbf {T}}'\). We follow the compilers of the most prominent global MRIOs EXIOBASE (Stadler et al. 2018), Eora (Lenzen et al. 2013) and GTAP (Peters et al. 2011) and, unlike WIOD (Dietzenbacher et al. 2013), do not include information from the Broad Economic Categories (BEC) classification.
We apply an algorithm that randomly allocates \({\mathbf {s}}\) to \(\mathbf{u}\) block-wise and works as follows (see Fig. 2A–C; a pseudocode version of the algorithm can be found in Additional file 1).

Step 1: We start by taking the supply of region 1 (\(s_1\) = 1st element of \({\mathbf {s}}\)) and the use of a randomly chosen target sector j (\(u_j\) = jth element of \({\mathbf {u}}\)).

Step 2: Now we differentiate three cases: If the supply from country 1 equals or is smaller than the use of industry j (case 1 or 2, respectively) we allocate the entire supply of country 1 to industry j. If, however, the supply of country 1 is larger than the use of industry j (case 3), the fraction of country 1’s supply which equals the entire use of j is allocated to j. In Fig. 2A–C these three cases are illustrated.

Step 3 depends on which case has occurred in the previous step:

In case 1, both the entire supply of country 1 and the entire need of industry j have been accounted for. Thus, we move on to the next country 2 and compare its supply with the next randomly chosen industry, following the procedure described under steps 2 and 3.

In case 2, the entire supply of country 1 has been accounted for but not the need of industry j. Thus, we move on to the next country 2 and compare its supply with the remainder of industry j’s need, following the procedure described under steps 2 and 3.

In case 3, the entire need of industry j is met, but country 1 still has imports left. Therefore, we continue with the next randomly chosen industry (in our example \(i_3\)) and compare its need with the remainder of country 1’s supply, following the procedure described under steps 2 and 3.
We run this algorithm until the supplies of all countries have been accounted for and the needs of all industries have been met. This condition is always reached since all trade flows in MRIO tables are balanced, i.e. the total imports of industry i’s output into region s equal the total use of imported inputs (\(\sum _{r}s_{r} = \sum _{j}u_{j}\)).
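The three steps can be sketched in Python as follows. This is an illustrative reading of the description above, not the pseudocode of Additional file 1; the function name `randomise_import_matrix`, the tolerance `tol` and the example vectors are assumptions made for the sketch.

```python
import random

def randomise_import_matrix(s, u, rng=None, tol=1e-9):
    """Block-wise random allocation of supply s (by region) to use u (by sector).

    Regions are processed in their given order; target sectors are drawn in
    random order. Returns a matrix with row sums s and column sums u.
    """
    if rng is None:
        rng = random.Random()
    if abs(sum(s) - sum(u)) > tol:
        raise ValueError("total supply must equal total use")

    n_regions, n_sectors = len(s), len(u)
    T_new = [[0.0] * n_sectors for _ in range(n_regions)]
    sector_order = list(range(n_sectors))
    rng.shuffle(sector_order)          # random order of target sectors

    supply_left = list(s)              # unallocated supply per region
    use_left = list(u)                 # unmet need per target sector
    r, k = 0, 0                        # current region, position in sector_order
    while r < n_regions and k < n_sectors:
        j = sector_order[k]
        # Step 2: allocate as much as both sides allow.
        amount = min(supply_left[r], use_left[j])
        T_new[r][j] += amount
        supply_left[r] -= amount
        use_left[j] -= amount
        # Step 3: case 1 advances both, case 2 only the region,
        # case 3 only the target sector.
        region_done = supply_left[r] <= tol
        sector_done = use_left[j] <= tol
        if region_done:
            r += 1
        if sector_done:
            k += 1
    return T_new

# Hypothetical example: three exporting regions, six target sectors.
s = [6.0, 6.0, 4.0]
u = [3.0, 5.0, 1.0, 1.0, 5.0, 1.0]
T_new = randomise_import_matrix(s, u, rng=random.Random(42))
```

Because every pass through the loop fills at most one cell and completes at least one region or sector, the resulting matrix has at most \(|R| + |J| - 1\) non-zero entries, which is the block-wise structure described above.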
Carrying out the procedure outlined above for the imports of each industry output \(i \in I\) into each region \(s \in S\) results in ‘new’ matrices \({\mathbf {Z}}^{\text{new}}\) and \({\mathbf {Y}}^{\text{new}}\) in which all imported industry outputs into all countries are randomly allocated to the target sectors, while keeping fixed both (i) the total imports per country and industry output, and (ii) the total use of imported products per target sector.
Our approach is, strictly speaking, not a randomisation, since we do not consider all possible versions of the import matrices. In the absence of knowledge on bilateral trade details we would have to assume that each of the theoretically infinitely many possible versions of this import matrix is equally likely. However, with our algorithm we only consider the extreme versions of the import matrix. By “extreme” we mean that our algorithm produces import matrices in which imports are bundled and allocated block-wise to the target sectors (Fig. 2A–C). Thus, we miss all versions of the import matrix \({\mathbf {T}}\) in which the import flows from different regions are split and randomly distributed over a large number of target sectors (i.e. all target sectors import a bit from country a, a bit from b and so on, Fig. 2E).
So instead of randomly sampling from all possible versions of the import matrix, we only sample from the extreme ones. Given that the number of repetitions is limited by computational constraints (in our case to 4897 repetitions), our approach increases the probability of capturing the extreme ends of the “real” distribution of the uncertainty of the respective footprint. Thus, we come closer to an estimate of the maximum possible uncertainty of the respective footprint, which is the aim of our study.
Calculating environmental footprints
We calculate the four most used environmental footprints: carbon, land, material and water (Steinmann et al. 2018). Following Steinmann et al. (2018) we define these footprints as the consumption-based ...

... emissions of the greenhouse gases \(\text {CO}_2\), \(\text {CH}_4\), \(\text {N}_2\text {O}\), \(\text {SF}_6\), hydrofluorocarbons (HFC), and perfluorocarbons (PFC) weighted by their global warming potential based on a time horizon of 100 years (Myhre et al. 2014) (carbon footprint)

... area of land required by forestry, agriculture, infrastructure, etc. (land footprint)

... mass of all used extractions including metal ores, other minerals, wood, fish, and crops (material footprint)

... volume of the total blue water consumption (water footprint).
To calculate the environmental footprints at the national and industry level we use \({\mathbf {Z}}^{\text{new}}\) and \({\mathbf {Y}}^{\text{new}}\), along with the stressor matrix \({\mathbf {S}}\) containing the relative environmental impacts per unit of sector output, the output per sector \({\mathbf {x}}\), the characterisation matrix \({\mathbf {C}}\) that weights the environmental impacts according to the four footprints, and the matrix of direct impacts from final demand \({\mathbf {H}}\) storing the total direct impacts caused by all final demand categories in a region by footprint category, all provided by EXIOBASE (Stadler et al. 2018; Miller and Blair 2009).
We first calculate the Leontief inverse matrix \({\mathbf {L}}\) as
$${\mathbf {L}} = ({\mathbf {I}} - {\mathbf {Z}}^{\text{new}} {\hat{\mathbf{X}}^{-1}})^{-1},$$
(3)
where \({\mathbf {I}}\) is the identity matrix and \({\hat{\mathbf{X}}^{-1}}\) is a square matrix with \(1/x_i\) on the main diagonal and zeros elsewhere.
We then calculate the matrix of environmental multipliers \(\mathbf{E}\) storing the environmental impacts per Euro of final demand produced by industry sector:
$${\mathbf {E}} = \mathbf {CSL}.$$
(4)
What we refer to as industry footprints \({\mathbf {F}}^{\text{ind}}\) we obtain by multiplying the environmental multipliers with the amount that is finally demanded for each industry’s output:
$${\mathbf {F}}^{\text{ind}} = {\mathbf {E}} {\hat{\mathbf {Y}}},$$
(5)
where \({\hat{\mathbf{Y}}}\) is a square matrix with \(y_i = \sum _{j}y_{ij}\) (i.e. the sum of final demand for the output of industry i over all final demand categories j) on the main diagonal and zeros elsewhere.
National footprints \({\mathbf {F}}^{\text{nat}}\) we calculate as
$${\mathbf {F}}^{\text{nat}} = {\mathbf {E}} {\mathbf {Y}}^{\text{new}} + {\mathbf {H}}.$$
(6)
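The chain from the Leontief inverse (Eq. 3) to the national footprints (Eq. 6) can be illustrated with a toy system. The sketch below assumes two sectors (one per region), a single stressor and a single footprint category; all numbers are hypothetical, and the helper functions `matmul` and `invert` are written out in plain Python only to keep the example self-contained. For tables of EXIOBASE size a numerical linear algebra library would be used instead.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def invert(M):
    """Gauss-Jordan inversion with partial pivoting (adequate for tiny examples)."""
    n = len(M)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-15:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Hypothetical toy data (not EXIOBASE): 2 sectors, 2 regions, 1 stressor.
Z_new = [[10.0, 20.0], [30.0, 10.0]]   # inter-industry flows
Y_new = [[50.0, 20.0], [60.0, 100.0]]  # final demand, columns = regions
x = [100.0, 200.0]                     # total output per sector
S = [[0.5, 0.2]]                       # stressor per unit of output
C = [[1.0]]                            # characterisation matrix
H = [[5.0, 8.0]]                       # direct impacts of final demand by region

n = len(x)
A = [[Z_new[i][j] / x[j] for j in range(n)] for i in range(n)]        # Z x_hat^-1
I_minus_A = [[float(i == j) - A[i][j] for j in range(n)] for i in range(n)]
L = invert(I_minus_A)                                                 # Eq. 3
E = matmul(C, matmul(S, L))                                           # Eq. 4
y = [sum(row) for row in Y_new]                                       # final demand per sector
Y_hat = [[y[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
F_ind = matmul(E, Y_hat)                                              # Eq. 5
EY = matmul(E, Y_new)
F_nat = [[a + b for a, b in zip(row_ey, row_h)] for row_ey, row_h in zip(EY, H)]  # Eq. 6
```

A useful consistency check in this toy system is that the industry footprints sum to the total production-based impact \(\sum_i S_i x_i\) (here 90), and that the national footprints sum to the same total plus the direct impacts in \({\mathbf {H}}\) (here 103).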
We carry out 4897 simulation runs, resulting in samples of 4897 different carbon, water, land and material footprints at the national and industry level, respectively. We quantify both the absolute and the relative variability within these samples. To measure the absolute variability we use the Standard Deviation (SD), while for the relative variability we use the Coefficient of Variation (CV), defined as
$$\text{CV} = \frac{\text{SD}}{\mu},$$
(7)
where \(\mu\) is the sample’s mean.
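A minimal sketch of Eq. 7 in Python, using the standard library. The sample values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`, with \(n-1\) in the denominator) is an assumption; with several thousand runs, the difference from the population standard deviation is negligible.

```python
import statistics

# Hypothetical footprints of one country across four simulation runs.
footprints = [98.0, 100.0, 102.0, 100.0]

mu = statistics.mean(footprints)    # sample mean
sd = statistics.stdev(footprints)   # absolute variability (sample SD, assumed)
cv = sd / mu                        # relative variability, Eq. 7
```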
We choose the more commonly used SD and CV instead of a measure that is more robust against outliers such as the (relative) Median Absolute Deviation, so that we can compare our results with other studies on the uncertainty of environmental footprints. In Additional files 3 and 4 we also provide our results with these alternative measures of variability. Since both SD and CV do not give any information about the exact appearance of a distribution (e.g. its skewness, number of modes, etc.), we take a closer look at the sample distributions for some example industries/nations by looking at their probability density function and describing their variability in terms of their 2.5th and 97.5th percentiles.
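The percentile range and an outlier-robust measure can be computed along the following lines. This is a sketch under assumptions: the relative Median Absolute Deviation is taken here as the unscaled MAD divided by the median, which may differ from the definition used in Additional files 3 and 4, and the interpolation method for the percentiles (`method="inclusive"`) is likewise an illustrative choice. The sample is hypothetical.

```python
import statistics

def percentile_range(sample):
    """Return the 2.5th and 97.5th percentiles of a sample."""
    # n=40 yields cut points at 2.5 %, 5 %, ..., 97.5 %.
    cuts = statistics.quantiles(sample, n=40, method="inclusive")
    return cuts[0], cuts[-1]

def relative_mad(sample):
    """Median absolute deviation divided by the median (assumed definition)."""
    med = statistics.median(sample)
    mad = statistics.median(abs(v - med) for v in sample)
    return mad / med

# Hypothetical sample of 101 footprint values: 1, 2, ..., 101.
sample = list(range(1, 102))
low, high = percentile_range(sample)
rel_mad = relative_mad(sample)
```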