A modular bottom-up approach for constructing physical input–output tables (PIOTs) based on process engineering models

Physical input–output tables (PIOTs) were first conceptualized in the 1990s but have not been widely adopted. However, with the increased emphasis on building a circular economy and understanding the resource nexus, PIOTs will become critical for optimizing resource flows and restructuring economies to close material loops. Wider adoption therefore requires improved methodologies for PIOT development. In this work, we propose and demonstrate a modular bottom-up approach for constructing PIOTs from process engineering models. The method was tested by building a nitrogen PIOT for a subset of sectors in Illinois (USA) and comparing it with a nitrogen PIOT developed earlier for the same time period; the comparison showed equal or higher confidence in sector balances. While the method has high initial costs, its suitability for automation allows fast creation of PIOTs whose technical coefficient matrices reflect the underlying physical processes and relationships within and between sectors, thus accurately capturing the physical structure of the economy. We also demonstrate how the method can be extended to create regional input coefficient matrices. While not implemented here, the method can potentially be used to create hybrid IO tables, to support trend analysis through time series and, in combination with non-survey methods, to fill data gaps. This would allow the strengths of complementary methodologies for constructing PIOTs to be combined and methods to be standardized for better reliability.

the physical interdependency between economic sectors as well as the interactions of the economy with the environment. While several researchers have highlighted the potential of PIOTs for material tracking and the benefits of including more IO data in physical units (Altimiras-Martin 2014; Hubacek and Giljum 2003; Hoekstra and van den Bergh 2006; Duchin and Levine 2011; Merciai and Heijungs 2014), their adoption and development have lagged compared to material flow analysis (MFA) (Hoekstra 2010), life cycle assessment (LCA) and environmentally extended input-output (EEIO) analysis. The slow adoption of PIOTs is attributed to high compilation costs, lack of physical data at the appropriate aggregation levels, lack of reproducibility and continuity in available datasets, and the limited number of applications demonstrating the use of PIOTs (Hoekstra 2010). However, the limited applications are themselves a consequence of the limited number of PIOTs available, leading to a self-reinforcing cycle of lagging PIOT development.
An early use of PIOTs for measuring the ecological footprint of trade was reported by Hubacek and Giljum (2003). PIOTs provide unique insights into the physical structure of the economy (Altimiras-Martin 2014), which differs from the economic structure since monetary input-output tables (MIOTs) record the flows of goods and services according to their economic values, not their physical mass (Hoekstra and van den Bergh 2006). Thus, PIOTs can also provide insights toward a transparent transition to the circular economy. Recent emphasis on the food-energy-water nexus further highlights the need to integrate physical data with consumption patterns at a subnational level, which PIOTs are uniquely suited to do (Wachs and Singh 2017). Additionally, tracking elemental cycles in the economy using PIOTs (Hoekstra and van den Bergh 2006) is important for studies of dematerialization and movements of critical materials. In this vein, Singh et al. have created a first regional PIOT to demonstrate the nitrogen (N) cycle in the economy of Illinois, motivated by the important role of N in food and energy provision and its significant environmental impacts such as eutrophication. Physical transaction matrices for sectors with material outputs provide the best basis for future projections (Weisz and Duchin 2006). Furthermore, the incorporation of flows to nature, intrinsic to PIOTs, is key for environmental analysis regarding waste management (Dietzenbacher et al. 2009). Therefore, interest in developing PIOTs has continued (despite the lack of a standardized approach) and multiple PIOTs have been developed using empirical or IO-based approaches as follows.
Currently, four approaches exist that can facilitate the construction of PIOTs or mixed unit IO tables: unit physical input-output by materials (UPIOM) (Nakamura et al. 2011), the approach for developing hybrid supply and use tables (HSUTs) in EXIOBASE (Merciai and Schmidt 2018), hybrid LCA/IO approaches (Lindner and Guan 2014) and an RAS-based approach (Fry et al. 2016). In UPIOM, a highly disaggregated MIOT is used to derive the unit structure, that is, a binary matrix for each sector depicting whether or not it is related by intersectoral flows to other sectors. Another binary matrix shows whether the monetary relationship corresponds to a physical relationship. A yield matrix indicates the proportion of the starting material that becomes product versus waste. The transformed matrix is then classified into resources, materials and products. (In their terminology, resources are used to produce materials, which in turn are used to make products.) In the USA, UPIOM has been employed to study metals at a detailed national level (Nuss et al. 2016).
Recently, UPIOM was integrated into the pioneering work in building HSUTs for EXIOBASE (Merciai and Schmidt 2018). In this approach, rather than calculating technical coefficients exclusively from existing monetary IO tables and converting via price, coefficients are primarily taken from life cycle inventory (LCI) data and literature, thus overcoming the need for highly disaggregated MIOTs as a basis. Yield factors (called transfer or transformation coefficients) specific to each material and total materials are calculated. An initial estimate of supply and use tables (SUTs) is generated by a multistep approach involving empirical data, the technical and yield coefficients, minimum material requirements and a trade module to estimate sufficient supply. These SUTs are then constrained to ensure mass balance when technical and transformation coefficients are readjusted. The trade module uses this information to build the multiregional HSUTs. This method is comprehensive; however, it depends on the quality of data available from LCIs, which in turn face challenges of validation, representativeness and future adaptation for changing technologies.
Another common approach for building PIOTs incorporates LCA data for individual sectors, similar to the tiered hybrid LCA approach (Crawford et al. 2018; Suh et al. 2004), where a subset of sectors is modeled with LCA and EIO data fills in for the rest. Nevertheless, the LCA data in physical units are often used to generate a new IOT in the IO-based hybrid LCA method (Malik et al. 2014; Wolfram et al. 2016; Teh et al. 2017). This approach offers good precision, validation, and comprehensive mass flows and environmental impacts. Yet, the opacity of the LCA datasets limits reproducibility, continuity and usefulness in long-term decision-making.
A final approach has recently been adopted in Australia to create a time series of PIOTs (Fry et al. 2016), using a variety of approximation techniques to fill in data gaps, notably a RAS technique applied sequentially. It requires a strong initial basis for the data tables, which is not yet available at the subnational level, thereby limiting the application of this approach. While significant advances have been made toward PIOT compilation methodologies, existing challenges for adoption of these methods include: lack of disaggregated IO tables for UPIOM approach, coverage gaps and opacity in LCA datasets, LCI data quality challenges (unmet thermodynamic balances) and limited updates. Gaps for subnational PIOT construction are even larger. Thus, a critical need remains for methods that allow reproducibility, transparency and continuity, so that PIOTs can be widely utilized in decision-making. This was also alluded to in work on the future of IO analysis, where Dietzenbacher hypothesized a world MRIO in physical units, linked to engineering and GIS data (Dietzenbacher et al. 2013).
The main contribution of this paper is in proposing a bottom-up approach for constructing PIOTs that combines the strength of process engineering models with the IO framework. This approach is suitable for automation and allows for continuity of datasets at multiple scales to rapidly generate PIOTs. This method can also be integrated with existing approaches to improve process data used in technical coefficients. Since the method utilizes a detailed technology model, PIOTs created with this method can be updated faster to reflect technology changes in the economy as envisioned by the rectangular choice-of-technology (RCOT) method (Duchin and Levine 2011). Specifically, the proposed method brings the following advantages over process LCA-based PIOT generation: (1) process modeling is a transparent methodology where model parameters and unit operations can be seen and adjusted, allowing easy updates and automation; (2) process modeling via the software used here (Aspen Plus) respects the laws of thermodynamics, a challenge in life cycle inventories that needs correction, such as via data reconciliation (Yi and Bakshi 2007; Singh et al. 2008; Hau et al. 2008), which is rarely done; and (3) process models detail the mechanisms of transformation occurring in each operation of a production system, rather than relying only on empirical documentation for coefficient calculation. Since process model results are used here mainly to generate technical and yield coefficients, this approach can be integrated with comprehensive methodologies such as UPIOM and EXIOBASE, and may allow for improvements in the databases used for calculating physical transactions matrices in existing approaches.
The rest of the paper is organized as follows: Sect. 2 describes the proposed methodology. Section 3 demonstrates the methodology by generating a 6-sector nitrogen PIOT for Illinois. In Sect. 4, we compare the PIOT generated in this work with a previously developed N PIOT as a validation of the proposed approach, and discuss the advantages, challenges and limitations of the approach. Section 5 concludes and details opportunities for future work and standardization of PIOT development.

Proposed method for PIOT construction using process engineering models
A PIOT provides a representation of physical intersectoral transactions within the economy and between natural systems and the economy. Flows of raw materials (from nature) provide information on the dependence on natural systems for physical resource requirements, whereas wastes and emissions quantify impact. The final demand matrix captures the consumption of products. These flows form the basic inputs for construction of PIOTs.
In this work, we utilize the strength of process engineering models to capture regional production, providing a reliable estimation of domestic supply of products, domestic consumption of intermediate products, raw materials, co-products and waste streams. The proposed method is a bottom-up modular approach that connects the flows associated with individual industrial systems in a region to build the PIOTs by linking them to the economic sectors designated by the system of national accounts.
After the initial sector selection, the first step in our method (Fig. 1) is constructing process engineering models representing sectoral technology, while empirical data related to model flows are gathered. Mapping is done between the model flows and producing or purchasing sectors (in accordance with the North American Industry Classification System, NAICS). In the second stage, models are scaled up and run to generate regional flows that are aggregated into a PIOT based on regional technical coefficients (RTCs). The final stage allows us to create a regional IO model based on regional input coefficients (RICs) by applying assumptions about imports or exports as per data availability. We next discuss this regional PIOT structure followed by an explanation of PIOT construction from information obtained through the proposed approach.
2.1 Regional input-output models
Regional (subnational) IO models are important due to the heterogeneity in products and production approaches for the same sectors in different regions. These can be constructed by utilizing surveys to identify the production recipes of firms in a region (Miller and Blair 2009). Two matrix types can be used in regional IO models: RTCs and RICs. RTCs provide information about the production practices of a sector in a region, i.e., the total inputs used by a sector in the region for producing outputs. RICs, on the other hand, provide information about the inputs to a sector only from firms within the same region. Note that RICs will not capture the whole production technology of a sector if some of its inputs are not produced in the region. Hence, to fully understand the production of a sector, RTCs and RICs must be evaluated together. The model based on RICs is similar to the standard IO model with the coefficients replaced by $a^{rr}_{ij}$, representing the input from sector i to sector j flowing within region r. It is calculated as $a^{rr}_{ij} = z^{rr}_{ij} / x^{r}_{j}$, where $z^{rr}_{ij}$ is the physical flow from sector i to sector j within region r and $x^{r}_{j}$ is the total output of sector j. Using this model, the total regional impact of a final demand change $y^{r}$ in the region is given as:

$$x^{r} = (I - A^{rr})^{-1} y^{r} \qquad (1)$$

Thus, RICs allow the expected regional impact of supporting a final demand to be computed, while RTCs allow the total expected impact (both inside and outside the region) of supporting a final demand from sectors within the region to be computed. However, a survey-based construction of RTCs and RICs would be costly and time-consuming, and it faces the challenge of continuity and reproducibility. The proposed approach in this work overcomes some of these challenges by providing a method to estimate the RTCs and RICs utilizing information from process engineering models, empirical data and scenarios based on additional exogenous data available with greater reliability.

Fig. 1 Overview of the approach and methodology for building regional PIOTs from process models and empirical data. Step 1 is described in depth in Sect. 2.1.1, step 2 in Sect. 2.1.2 and step 3 in Sect. 2.3
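As an illustration of the RIC-based model, the following minimal Python sketch computes regional input coefficients from an intraregional flow matrix and then solves the regional IO model for total output. The two-sector numbers and all function names are hypothetical and chosen only for illustration; a small Gauss-Jordan solver is included so the example needs no external libraries.

```python
# Hypothetical 2-sector region; every number here is illustrative only.

def regional_input_coefficients(Z_rr, x):
    """a_ij^rr = z_ij^rr / x_j: intraregional flow i -> j per unit output of j."""
    n = len(x)
    return [[Z_rr[i][j] / x[j] for j in range(n)] for i in range(n)]

def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def regional_output(A_rr, y):
    """x^r = (I - A^rr)^-1 y^r, obtained by solving (I - A^rr) x = y."""
    n = len(y)
    I_minus_A = [[(1.0 if i == j else 0.0) - A_rr[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(I_minus_A, y)

Z_rr = [[10.0, 20.0],   # intraregional flows z_ij^rr (row i sells to column j)
        [30.0, 5.0]]
x = [100.0, 50.0]       # total output of each sector
A_rr = regional_input_coefficients(Z_rr, x)

# Final demand implied by the row balance: y_i = x_i - sum_j z_ij^rr
y = [x[i] - sum(Z_rr[i]) for i in range(len(x))]

# Feeding that final demand back through the model should recover x.
x_check = regional_output(A_rr, y)
```

Recovering the original output vector from the implied final demand is a quick consistency check on the coefficient matrix before it is used for impact calculations.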

Process models to regional input-output models
As the proposed approach is a bottom-up approach (Fig. 1), it begins with selecting the sectors to be modeled. If the focus of the PIOT is a specific material or element, the choice can be informed by flowcharts based on MFA, which provide the subset of sectors for which physical data are most relevant. MFA is not a prerequisite for this approach, since modelers can also select sectors based on their knowledge of a regional economy. In Step 1, the technology used for production is identified and process models are developed to capture production mechanisms. Since process engineering models are based on mechanistic information that captures the physical and chemical relationships for production, knowledge about production, scalability and continuity is improved here over empirical methods. An initial model represents an average technology producing a single output type or linked output types (such as corn wet milling, which has several co-products). In the case of multiple technologies being used in a region for production of the same product, this method can include process models for variations in the production method combined with the specific production capacity for each technology. Simultaneously, data for scaling up process models (such as production capacity and utilization of capacity in an economy) are collected, which allows total sectoral activity in a region to be calculated. For agricultural sectors, the empirical data required are the true production values (P), which are tracked in the USDA NASS census (Additional file 1: Table A10). For industrial sectors, the total processing capacity (C) in the state is used here. The US Census Bureau (US Census Bureau 2005) provides a disaggregated survey of sectoral capacity utilization (U). Therefore, the scaling factor, S, is either assumed to be equivalent to the true production numbers (P) or calculated according to Eq. 2:

$$S = C \times U \qquad (2)$$

Hence, scaled process models provide values of production and consumption of intermediate products for the sectors being modeled. Since the process models capture all input requirements, scaling up these models to represent total regional production provides the information needed to calculate the RTCs for sectors modeled in the region, as described further in Sect. 2.1.2. Additionally, scaling up these process models provides "supply" and "use" side physical flow information. As models are built, element and compound level data are collected about flows entering the production process, allowing us to trace the linkages to their respective production sectors. Each input is thus mapped either to its production sector, the "rest of the economy" or to raw materials. Likewise, output streams are mapped either to residuals or to saleable product streams. This mapping can be stored for future automation, providing a major advantage for reproducibility and continuity of PIOTs. Data on consumption by households within the region, imports and exports are exogenous information required to allow full construction of regional PIOTs.
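The scaling step can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that are not stated in the text: that the scaling factor for industrial sectors is capacity multiplied by utilization, and that process model results are expressed per unit of product. Stream names and values are hypothetical.

```python
def scaling_factor(capacity=None, utilization=None, production=None):
    """Return S: reported production P if available, else capacity C times
    utilization U (assumed form of the scaling relation)."""
    if production is not None:
        return production
    if capacity is None or utilization is None:
        raise ValueError("need either production or both capacity and utilization")
    return capacity * utilization

def scale_flows(unit_flows, S):
    """Scale per-unit-of-product stream flows up to regional flows."""
    return {stream: flow * S for stream, flow in unit_flows.items()}

# Hypothetical model results, in tonnes per tonne of product.
unit_flows = {
    "natural_gas": 0.6,
    "air": 1.1,
    "water": 1.5,
    "product": 1.0,
    "residuals": 2.2,
}

# Hypothetical industrial sector: 1,000,000 t/yr capacity at 80 % utilization.
S = scaling_factor(capacity=1_000_000, utilization=0.8)
regional = scale_flows(unit_flows, S)

# Mass balance is preserved by linear scaling: inputs equal outputs.
inputs = regional["natural_gas"] + regional["air"] + regional["water"]
outputs = regional["product"] + regional["residuals"]
```

For an agricultural sector, the same function would be called with `production=P` so that the census value is used directly.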

Regional technical coefficients (RTCs) from process models
Physical flow information obtained from running the scaled process models provides the basis for RTCs. Since models reflect production recipes, no distinction can usually be made about the proportion of domestic or imported factors, as shown in Eq. 3. Flows from sector i to sector j can originate within the region being modeled ($z^{rr}_{ij}$) or outside the region ($z^{sr}_{ij}$); we capture $z_{ij}$, the sum of both, from process models:

$$z_{ij} = z^{rr}_{ij} + z^{sr}_{ij} \qquad (3)$$

Figure 2 shows the system boundary for a regional economy and flows for a modeled sector A. Inputs to sector A come from nature ($w_A$) and from other sectoral production ($z_{BA}$, $z_{NA}$). Flows entering our models are captured at the point of entry; thus, raw materials from nature can be obtained from regional extraction ($w^{R}_{A}$) or imported from other regions ($w^{S}_{A}$), but the models estimate only $w_A$. Inputs from other sectors are domestic ($z^{rr}_{BA}$, $z^{rr}_{NA}$) or from outside the region ($z^{sr}_{BA}$, $z^{sr}_{NA}$). If the focus is on single-region models, $z^{sr}_{BA}$ and $z^{sr}_{NA}$ are equivalent to imports by the respective sectors, $m_{BA}$ and $m_{NA}$. Likewise, products can be sold to other domestic sectors ($z^{rr}_{AB}$, $z^{rr}_{AN}$) or to outside regions ($z^{rs}_{AB}$, $z^{rs}_{AN}$). Products can also be consumed by households ($c_A$) in the same region. All flows from sector A to the exterior can be grouped as exports ($\bar{e}_A$) for a single-region model. All wastes are grouped as residuals ($r_A$) for simplicity. In some cases, sectors rely on their own products for production ($z_{AA}$). The dotted flow arrow represents flows to and from stocks ($s_A$), which are flows across the temporal boundary: either what is left as inventory at the end of the time period or what was left over from past time periods. We assume a closed temporal boundary here, and hence, stock flows are zero. Total flows into the sector must be equal to total flows out, designated as $x^{r,inputs}_{A}$ and $x^{r,outputs}_{A}$ (equivalent to $x^{r}_{A}$ and $x_A$). As Fig. 2 shows, flows from process models ($z_{ij}$, $w_A$, $r_A$, $x^{r,inputs}_{A}$) provide the input side information necessary to fill the PIOT.

The next step of mapping flows from process models to the PIOT structure is demonstrated in Fig. 3 by considering a single sector, ammonia fertilizer manufacturing, as our sector A. The left side of Fig. 3 provides a simplified diagram showing a process simulation model already scaled to the size of economic production from information on production capacity (C) and utilization (U). The model includes stream flows (numbered from 1 to 14, A1-A2 in Fig. 3), as well as unit operations where transformations and state changes occur. In the process depicted, natural gas is used as the feedstock for the production, via steam reformation, of hydrogen, which reacts with nitrogen from the air to form ammonia in the Haber-Bosch process. Thus, the model results provide flow information on raw material consumed ($w_A$, stream 3), products as feedstocks ($z_{iA}$, streams 1 and 2), auxiliary inputs ($z_{iA}$, stream A1), waste generation ($r_A$, streams 8, A2 and 14), recycling (stream 15), products (stream 13) and co-products. (None are shown in the example, but in some cases the CO2 stream shown in stream 8 will be divided into a co-product stream.) In Fig. 3, the flow in stream 1 is water, supplied by sector B, and the total steam per year is entered as $z_{BA}$. Likewise, natural gas is used in input streams A1 and 2, so they are summed to provide the entry for $z_{CA}$ (in Fig. 2, sector C is aggregated into N). Air is a flow from nature, so here stream 3's total input is used as the entry for $w_A$. Three residual streams are present: a purge stream (14) consisting primarily of air and water vapor, an exhaust stream from the heat exchanger (A2) carrying combustion byproducts, and waste CO2 exiting in stream 8. Those three streams are added together in the $r_A$ cell.

Fig. 3 To the left, a simplified diagram of an ammonia production process model is shown. All primary flow streams are labeled in black circles. Auxiliary streams are numbered in gray circles. Streams are mapped to the PIOT framework at right, following the arrows. Stream numbers included in the categories are shown in the PIOT. This is an example of a single sector A; outputs of other sectors in this sample economy are shaded. As referred to in the text, the process information fills the column for the given sector. The IO identity that $x^{r,outputs}_{A} = x^{r,inputs}_{A}$ is used to fill the total output column

As other sectors are modeled, information for each column is obtained to build the PIOT. The total input to each sector is then calculated as the column sum using Eq. 4 from these known flows (Giljum and Hubacek 2009).
This corresponds to a treatment of residuals equivalent to that of Suh (2004); the treatment of residuals in PIOT construction is discussed extensively in Pauliuk et al. (2015a, b). Since $z_{ij}$ and $x_j$ are known from the process model information, RTCs for the regional economy can be easily calculated:

$$a_{ij} = z_{ij} / x_j \qquad (5)$$

To build the whole PIOT, final demand must also be known, including consumption by households $c_A$ (in EIO, this typically includes consumption by government and investors as well) as well as net exports, $e_A$. Since total mass output from sector A ($x^{r,outputs}_{A}$) is known from process modeling, the market balance, i.e., the row sum for sector A, is shown in Eq. 6:

$$x^{r,outputs}_{A} = \sum_j z_{Aj} + c_A + e_A \qquad (6)$$

This equation can be rearranged to Eq. 7, which puts the variables known from process models (once all models are run) on the right-hand side:

$$c_A + e_A = x^{r,outputs}_{A} - \sum_j z_{Aj} \qquad (7)$$

At this point, empirical data are necessary to fill in the left-hand side of Eq. 7. The strength of this method is that since Eqs. 6 and 7 have only two degrees of freedom, $c_A$ and $e_A$, they can be used to approximate an initial PIOT based on RTCs when data are available only on consumption or net exports. To list imports and exports separately, note that Eq. 7 is equivalent to:

$$c_A + \bar{e}_A - m_A = x^{r,outputs}_{A} - \sum_j z_{Aj} \qquad (8)$$

Ideally, data on domestic consumption, imports and exports should all be available, and the balance in Eq. 8 holds. At the subnational scale, however, information on these flows is frequently lacking, so assumptions may need to be made (Sect. 2.2). It is also possible that the equality in Eq. 8 does not hold after values are found for $c_A$, $m_A$ and $\bar{e}_A$, i.e., when Eq. 8 is overspecified. In this case, non-survey approaches such as RAS can be used to balance the matrix. Defining the best strategy for this situation is left to future work since our focus is on regional PIOTs, which are typically constrained by a lack of information on intersectoral flows rather than by conflicting information, for which many approaches have been used in the IOA literature.
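The stream mapping and coefficient steps of this subsection can be sketched in Python as follows. The stream-to-category assignments follow the ammonia example in the text; the flow values, the function names, and the downstream sales figures are hypothetical. The sketch reads the text as defining a technical coefficient as the input flow divided by the sector's total output, and the final-demand balance as total output minus intersectoral sales.

```python
# Stream -> (PIOT category, supplying sector) for the ammonia example (sector A).
STREAM_MAP = {
    "1":  ("z", "B"),    # water, purchased from sector B
    "2":  ("z", "C"),    # natural gas feedstock, purchased from sector C
    "A1": ("z", "C"),    # natural gas, auxiliary input
    "3":  ("w", None),   # air, raw material from nature
    "8":  ("r", None),   # waste CO2
    "A2": ("r", None),   # heat-exchanger exhaust
    "14": ("r", None),   # purge stream
    "13": ("x", None),   # ammonia product
}

def build_column(stream_flows):
    """Aggregate scaled stream flows into PIOT column entries for one sector."""
    col = {"z": {}, "w": 0.0, "r": 0.0, "x": 0.0}
    for stream, flow in stream_flows.items():
        if stream not in STREAM_MAP:   # internal streams (e.g. recycle) are skipped
            continue
        kind, sector = STREAM_MAP[stream]
        if kind == "z":
            col["z"][sector] = col["z"].get(sector, 0.0) + flow
        else:
            col[kind] += flow
    return col

def technical_coefficients(col):
    """a_iA = z_iA / x_A for every supplying sector i."""
    return {i: z / col["x"] for i, z in col["z"].items()}

def final_demand_balance(x_out, sales_to_sectors):
    """c_A + e_A = x_A - sum_j z_Aj (known quantities on the right-hand side)."""
    return x_out - sum(sales_to_sectors)

# Hypothetical scaled flows (kt/yr); stream 15 is an internal recycle.
flows = {"1": 150.0, "2": 50.0, "A1": 10.0, "3": 110.0,
         "8": 120.0, "A2": 30.0, "14": 70.0, "13": 100.0, "15": 25.0}

column_A = build_column(flows)
rtc_A = technical_coefficients(column_A)

# Hypothetical sales of sector A's product to two downstream sectors.
c_plus_e = final_demand_balance(column_A["x"], [40.0, 25.0])
```

Storing the mapping as data, as in `STREAM_MAP`, is one way to make the step repeatable when models are rerun for another year or region.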
This section has described the derivation of a PIOT based on RTCs. Building RICs from RTCs requires information on imports, which we can address by scenarios, as detailed in Sect. 2.2. Next, the compatibility of this approach with the supply and use framework is explained, which also assists in obtaining additional information for scenarios used to build RICs.

Physical supply and use tables (PSUTs) from process models
The scale up of process models also provides the supply table and the use table in physical units (see Tables 1 and 2). Use tables provide information about consumption of commodities by industries or by final demand sectors such as households, government, investment or exports (Miller and Blair 2009; Eurostat/European Commission 2008). A use table row provides an alternative representation of the market balance shown in Eq. 6. Process models provide information on total consumption of specific commodities in the sector being modeled, as the models are scaled up to represent true production or activity of the sector in the regional economy. Hence, process models can be used to obtain the use table columns. Similarly, the "supply" or "make" table contains information about the production of commodities in a region by industries, which is also provided by the scale up of process models. (Table note: residuals are not considered here in detail, and a steady-state assumption is made at this point so that flows into and out from inventories are not included.)
The supply of a product in the region, as provided by domestic production and imports, can be compared with the use of the product in interindustry consumption, exports and household consumption. Domestic production and interindustry consumption data are available from process models as described earlier. Conversion of these PSUTs to square PIOTs can be done using the Eurostat approach (Eurostat/European Commission 2008) when data on the use of imports by sector are available. In this work, due to the lack of an imports use matrix, the Eurostat methods were not used directly to construct the PIOT. Instead, the supply and use of each commodity were compared directly, and the direction of the imbalance was used as a first approximation of imports and exports under simplifying assumptions, framed as scenarios, which can be used to obtain the RICs from RTCs (Sect. 2.3).

Regional input coefficients (RICs) from process models
In order to create the RIC matrix, imports must be removed from the intersectoral transactions matrix. Previous work has shown that at the subnational level, information on imports and the share of use of imports needed to construct the matrix M is generally not available in physical units. Hence, information obtained from physical supply and use tables (or, in the case of no secondary production as we have modeled, directly from the PIOT) can be used to make simplifying assumptions, as discussed below. Once M is available, it can be subtracted from Z to obtain the matrix $Z^{rr}$, which is desirable for regional calculations, i.e., $Z^{rr} = Z - M$ (matrix form of Eq. 3). The imports column sums are then included in the primary inputs quadrant.
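A minimal Python sketch of this subtraction is given below, with a hypothetical two-sector transactions matrix and imports matrix. Function names are illustrative, and the check that imports do not exceed total flows is an added safeguard, not part of the described method.

```python
def remove_imports(Z, M):
    """Z^rr = Z - M, element by element; imports may not exceed total flows."""
    Z_rr = []
    for z_row, m_row in zip(Z, M):
        row = []
        for z, m in zip(z_row, m_row):
            if m > z:
                raise ValueError("imported flow exceeds total flow")
            row.append(z - m)
        Z_rr.append(row)
    return Z_rr

def coefficients(flows, x):
    """Divide each column j of a flow matrix by total output x_j."""
    n = len(x)
    return [[flows[i][j] / x[j] for j in range(n)] for i in range(n)]

def import_column_sums(M):
    """Total imported inputs per purchasing sector (primary inputs quadrant)."""
    return [sum(M[i][j] for i in range(len(M))) for j in range(len(M[0]))]

# Hypothetical data: total flows from scaled models and an imports matrix.
Z = [[10.0, 40.0],
     [30.0, 5.0]]
M = [[0.0, 10.0],
     [5.0, 0.0]]
x = [100.0, 50.0]

Z_rr = remove_imports(Z, M)
RTC = coefficients(Z, x)       # technical coefficients from total flows
RIC = coefficients(Z_rr, x)    # input coefficients from intraregional flows
m_inputs = import_column_sums(M)
```

Comparing `RTC` and `RIC` entry by entry shows directly which inputs of a sector depend on supply from outside the region.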

Scenario 1
In scenario 1, no further information is available for physical imports and exports. This means that we can solve Eq. 6, but Eq. 8 is still underspecified ($\bar{e}_A$ and $m_A$ are unknown). While not ideal, this is the most common situation in the case of regional PIOTs. In this case, two methods can be used to approximate a vector of imports for the PIOT:
(a) In the standard approach used for regionalization of national IO tables, when an industrial sector i has higher than national average representation in the regional economy, we assume that imports = 0, while exports are nonzero. Thus, $z_{ij}$ from process models is equal to $z^{rr}_{ij}$. Regional consumption can be empirically estimated, and the balance of supply after total consumption (interindustry consumption + consumption) is accounted as exports. Conversely, if sector i has lower than average representation in the regional economy as compared to the national economy, we assume that exports = 0. Regional consumption can be empirically estimated, and the balance of use and supply in the region is accounted as imports.
(b) A simpler approach of looking at the difference between supply and demand is taken here. If the supply from process models for a product is higher than the demand side data, we assume imports to be 0. Similarly, if demand side data for a product is higher than the supply, we assume exports to be 0. Accordingly, we utilize the same approach as above to calculate $z^{rr}_{ij}$. This approach is similar to the balancing approach used by Singh et al. (2017) in developing a regional PIOT based on empirical data alone.
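Method (b) reduces to a simple rule on the sign of the supply and demand imbalance, sketched below in Python. The function name and the product figures are hypothetical; demand here stands for interindustry use plus regional consumption.

```python
def imports_exports_from_imbalance(supply, demand):
    """Return (imports, exports) for one product from its regional balance.

    If supply exceeds demand, imports are assumed zero and the surplus is
    exported; if demand exceeds supply, exports are assumed zero and the
    shortfall is imported.
    """
    surplus = supply - demand
    if surplus >= 0:
        return 0.0, surplus
    return -surplus, 0.0

# Hypothetical products: (regional supply, regional demand) in kt N per year.
products = {
    "product_with_surplus": (120.0, 100.0),
    "product_with_deficit": (80.0, 100.0),
}

trade = {name: imports_exports_from_imbalance(s, d)
         for name, (s, d) in products.items()}
```

By construction, this rule never reports both imports and exports for the same product, so cross-hauling is not represented.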
Once a vector of total imports m is available (via scenario 1 if necessary), a PIOT can be created. At this point, all data for the transactions matrix Z and interactions with nature ($w'$ and $r'$) have been taken from the scaled models, while exports $\bar{e}$, imports m and consumption c are available from empirical data collection and scenario 1. The final stage of our method transforms the PIOT based on RTCs into a PIOT based on RICs. To do this, matrix M, a distribution of the total imports to their use by sector, is needed. In some cases, the information is already present, whether from the data collection phase or from complete specification through the scenario 1 assumptions. Scenarios 2-4 give alternatives for constructing M.

Scenario 2
In scenario 2, we estimate the matrix of imports M from the vector of imports m using the scrubbing methods suggested in Miller and Blair (2009) or the method described in Dietzenbacher et al. (2005). In the first Miller and Blair method, all imports are imputed to consumption by industries, whereas in the second, some imports are consumed by final demand. The first method allocates the import vector proportionally between the sectors, and the second allocates it proportionally between the sectors and final demand. The approach in Dietzenbacher et al. (2005) is similar to the Miller and Blair approach that imputes all imports to industry consumption.
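The proportional allocation idea can be sketched in Python as follows. This is a generic proportional split written for illustration under the assumption that imports of a product are spread across users in proportion to their total use of that product; it is not a reproduction of the exact procedures in the cited references, and all names and numbers are hypothetical.

```python
def allocate_imports(Z, m, final_demand=None):
    """Spread each product's total imports m_i over its users.

    Without final_demand, imports go to industries only, in proportion to
    z_ij. With final_demand, the same share is also applied to final demand.
    Returns (M, imports_to_final_demand).
    """
    n = len(m)
    M, f_imports = [], []
    for i in range(n):
        fd = final_demand[i] if final_demand is not None else 0.0
        total_use = sum(Z[i]) + fd
        share = m[i] / total_use if total_use > 0 else 0.0
        M.append([Z[i][j] * share for j in range(n)])
        f_imports.append(fd * share)
    return M, f_imports

# Hypothetical flows: rows are supplying products, columns are using sectors.
Z = [[10.0, 30.0],
     [20.0, 0.0]]
m = [8.0, 5.0]       # total imports of each product
fd = [40.0, 30.0]    # final demand for each product

M_industry_only, _ = allocate_imports(Z, m)
M_with_fd, fd_imports = allocate_imports(Z, m, final_demand=fd)
```

In both variants the allocated imports of each product sum back to its entry in `m`, which is a useful check before subtracting the result from the transactions matrix.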

Scenario 3
In scenario 3, more complex interregional trade models such as described in Boero et al. (2018) are used for exports and imports values. Once the value of exports and imports is available, both supply and use data from process models along with RTCs can be used to develop the regional PIOT.

Scenario 4
In scenario 4, standard approaches proposed by Eurostat (Eurostat/European Commission 2008) to convert supply and use tables to symmetrical PIOTs can be utilized. Process models can provide data for physical supply and use tables. Then, the fixed technology assumption can be utilized to convert the PSUT to symmetrical PIOTs. For this approach, a use table providing import distribution to all sectors must also be present. Thus, scenarios 2 and 3 can be used to obtain the information necessary to implement scenario 4.
Among the scenarios described above, we hypothesize that scenario 3 should provide the most reliable data for PIOT construction since well-developed trade models exist (Boero et al. 2018; Többen and Kronenberg 2015). In this work, we focus on demonstrating the conversion of process model data to RTCs and RICs using scenario 1, as there were not enough data to build RICs using all scenarios. Hence, an assessment of relative confidence levels for building RICs using these scenarios is left for future work.

Overview
We tested the methodology described in Sect. 2 to create a nitrogen (N) PIOT for six sectors of the economy (Tables 4 and 5) in Illinois (IL), USA, for 2002. (Other economic sectors were aggregated as rest of the economy (ROE), with variable $o$ representing their consumption and $o'$ representing their production used as inputs to modeled sectors, so the row sum of $z_{ij}$ and $o_i$ represents total intersectoral consumption and the column sum of $z_{ij}$ and $o_j$ represents total inputs from industry.) A recently published N PIOT for IL in 2002 allowed us to benchmark the performance of our methodology. The published PIOT provided an MFA for N that we used to guide the selection of six sectors related to the corn supply chain (Fig. 4), which were then associated with NAICS codes as mentioned in Sect. 2, and consequently the technology selection for process modeling.
For a PIOT of N, the fertilizer manufacturing sector, where N is mobilized from air via the Haber-Bosch process into ammonia (in IL used primarily to fertilize corn), is key. Since better models for the corn farming sector are available in the literature, it is part of the rest of the economy in our PIOT. Corn is processed via feed mills as well as wet and dry mills. Wet mills separate the corn kernel into its constituent parts, processing the starch into food products (the NAICS corn wet milling sector includes just this production) or ethanol (this production is included in the ethyl alcohol manufacturing sector, part of other basic organic chemical manufacturing), while the rest of the kernel is used for feed. In dry mills, corn is heated and fermented to produce ethanol (and CO2) from the starch, while the remaining solids are lumped together in our model as dried distillers' grains with solubles (DDGS). Milling co-products contain almost all the N in corn. The animal feed processing sector is heterogeneous (see Sect. 4.3.1), necessitating a granular approach for this sector. We focused on hog feed production (hogs represent the major livestock type in IL) and the hog farming sector. The final sector modeled is slaughtering and processing, where hogs are slaughtered and carcasses are separated into primal cuts and other parts.

Empirical data collection, scaling and mapping to the PIOT
All sectors described above were modeled in Aspen Plus, a standard software package for simulating process engineering models (Aspen Plus 2018), to generate physical flow data. Each sector was modeled independently to capture its technology details (described more fully in Additional file 1: Section A1). Flows were then mapped to the sectors in IO accounts (Additional file 1: Table A2, Table 5) and to residuals or raw materials (process shown in Fig. 3). The completed models were scaled up to represent economy-level flows (scaling data provided in Additional file 1: Table A10) and run. PIOT values were obtained by multiplying the results by the N weight percentage. Stream information from simulation results was translated to PIOT columns for each sector as depicted in Fig. 3. (All inventories, process diagrams, N weight percentages and mappings are available in Additional file 1: Section A2.) Input streams were mapped either to raw materials, the "rest of the economy" o′, or to the modeled sector providing the respective products. Output residual streams were aggregated in r′.

Fig. 4 Models along the corn supply chain completed for this work are shown in green. Black sectors are excluded from the work since they are not suitable for Aspen modeling at this stage. Brown denotes sectors that have not yet been modeled
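The scaling and mapping steps described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the case study: the function name `piot_column`, the stream list, the scale factor and the N weight fractions are all hypothetical placeholders (the actual values are reported in Additional file 1).

```python
from collections import defaultdict

def piot_column(streams, scale_factor):
    """Aggregate simulated input streams into N flows for one PIOT column.

    streams: list of (destination_row, mass_flow, n_weight_fraction) tuples
    scale_factor: assumed ratio of economy-level to model-level throughput
    """
    column = defaultdict(float)
    for row, mass_flow, n_fraction in streams:
        # N flow = simulated mass flow x scale-up factor x N weight fraction
        column[row] += mass_flow * scale_factor * n_fraction
    return dict(column)

# Hypothetical input streams for one modeled sector (tons per year at model scale)
hog_farm_inputs = [
    ("feed processing", 1000.0, 0.03),
    ("rest of economy (o')", 200.0, 0.01),
    ("feed processing", 500.0, 0.02),
]

column = piot_column(hog_farm_inputs, scale_factor=10.0)
```

Streams mapped to the same row are summed, which mirrors the aggregation of several simulated streams into a single PIOT entry.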
Once all PIOT columns were filled in with data from the process modeling results, information was available to use Eq. 4 for calculating total inputs ($x_A^{r,\text{inputs}}$) to the respective modeled sectors and to check the balance with $x_A^{r,\text{outputs}}$. Hence, the right-hand side variables ($x_A^{r,\text{outputs}}$, $\sum_{j=1}^{m} z_{Aj}$) in Eqs. 7 and 8 can now be supplemented with information on exogenous final demand, imports and exports for all sectors to build the complete PIOT. For the case study, all sectors except "Other basic organic chemical mfg" required a scenario 1b approach (which utilizes the imbalance between supply and use for a sector product to generate a vector of imports, $m$, or exports, $\bar{e}$), since only limited information on imports, exports and consumption was available. Final demand data used for each sector are provided in Additional file 1: Table A11. After applying scenario 1b to the other sectors, a PIOT based on RTCs is generated (Table 3).
The final step is the preparation of an RIC matrix. In the case study, after applying the scenario 1b assumption, enough information was present to generate M since the RTC matrix (Table 3) had only three nonzero entries. Both sectors with nonzero entries produce intermediate products (hence consumption in households, $c_i = 0$). In hog farming, one nonzero entry is a self-flow, which was specified by the process model mapping as imports. Remaining imports were ascribed to slaughter and processing, the only other sector that uses products from hog farming. In the case of "feed processing," total production is 75 thousand tons N, whereas 76 thousand tons N are consumed on hog farms, as computed from the process model. Thus, total imports for feed processing must be attributed to the flow from feed processing to hog farms. The imports are shifted to the imports row in the primary inputs matrix of the PIOT, and the updated table gives an RIC matrix (Table 4) from the original RTC matrix (Table 3).
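The feed processing example can be worked through with the figures quoted in the text. The sketch below is illustrative only: it assumes that hog farms are the sole user of the sector's product, and the variable names are chosen for readability rather than taken from the paper's notation.

```python
# Figures quoted in the text, in thousand tons of N
feed_production = 75.0       # total regional output of feed processing
feed_use_hog_farms = 76.0    # consumption on hog farms (from the process model)

# Scenario 1b: the supply-use imbalance is interpreted as net trade
imbalance = feed_production - feed_use_hog_farms
imports = max(-imbalance, 0.0)   # use exceeds supply, so the gap is imported
exports = max(imbalance, 0.0)    # supply exceeds use, so the surplus is exported

# RTC-based table: the intersectoral cell holds all feed used, whatever its origin
rtc_cell_feed_to_hogs = feed_use_hog_farms

# RIC-based table: imports are shifted to the imports row of the primary inputs
ric_cell_feed_to_hogs = feed_use_hog_farms - imports
imports_row_hog_farms = imports

print(imports, ric_cell_feed_to_hogs)  # prints: 1.0 75.0
```

Under these assumptions, the regional flow from feed processing to hog farms equals regional production, and the one thousand tons of imported N appears as a primary input to hog farming.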

Comparison of PIOTs from process modeling versus empirical MFA approach
The N PIOTs from the "process-to-PIOT" approach are shown in Tables 3 (RTC) and 4 (RIC). We compared the results with the PIOT from Singh et al. (2017), as aggregated in Additional file 1: Table A12. The published PIOT in Singh et al. (2017) was also compiled using a bottom-up approach by first conducting an MFA for N in the three major agricultural commodities in Illinois's economy: corn, soy and wheat. Using the MFA, flows of N were tracked upstream and downstream for each commodity, and official statistics as well as data from trade associations were used to approximate an N mass balance for each sector. This approach is similar to using LCA data and linearly scaling all flows according to the inputs/outputs for the sectors. One major limitation of the PIOT generated from this approach was the lack of appropriate validation for scaling up data to represent full regional flows. This limitation has been addressed in our "process-to-PIOT" approach.

Table 3 N PIOT for Illinois, 2002, on basis of RTC from process modeling approach (metric tons of N)
Data for final demand matrix entries: scaling data for all process models given in Additional file 1: A10; net exports derived from Eq. 6 for all sectors except corn wet milling and ethyl alcohol mfg, see Additional file 1: A11; Type 1b assumption was used for estimation of imports or exports

Table 4 N PIOT for Illinois, 2002, on basis of RIC from process modeling approach (metric tons of N)
Note that this RIC-based PIOT has no import column, since the imports of the region have been taken out to obtain the regional input coefficients (RICs), and total imports used for production are included in the primary inputs quadrant. This can be compared to the RTC-based Table 3, where the import column is still included for each sector
In this work, we chose the sectors to model using the MFA completed in Singh et al. (2017). The key difference in our approach is that each mechanism of physical and chemical transformation was modeled. One major benefit is easy reproducibility and adoption of the process models to quickly generate a PIOT for another region and time period, as tested on Indiana, a neighboring state, for 2016. Since we have a model for each sector as well as a data log of the statistical sources used for scaling, the models could be run and adjusted to represent Indiana's economy as well (Wachs and Singh 2017).

Comparison of sectoral flows in PIOTs from two approaches
The two approaches differ in the extent to which the modeled sectors are covered, as shown in Table 5. Specifically, the previous work omitted the wet corn milling sector, which includes only mills with the final product of human food. This means that for direct comparison of the two approaches, the wet and dry corn milling sector totals from the previous PIOT should be aggregated to give ethyl alcohol manufacturing (part of NAICS account 325190, "Other Basic Organic Chemical Manufacturing"). In the process modeling approach, the animal food and animal slaughtering sectors were only partially represented, whereas the previous PIOT included complete coverage. Additional validation of the PIOTs was difficult since no other official PIOTs for the USA or US regions exist. We made a careful comparison of the two models, including an additional data search and determination of other production activities present in the NAICS codes surveyed, to estimate true production values (shown in the validation column of Table 5) and intersectoral flows. This showed exclusions such as the neglect of corn for animal feed in the prior published PIOT, highlighting limitations of the MFA approach based on empirical data alone, as tracing missing flows is nearly impossible. Additionally, we found that both approaches failed to account for the use of fertilizer by households (see Note on Consumption of Fertilizer outside the Farming Sectors, Additional file 1: Section A5).

Table 5 Mapping of models to IO sectors with approximate coverage along with total production (x) numbers from the process modeling approach (PM) as well as the approach in Singh et al. (2017) (MFA)
Validation values for comparison are given in the final column
a Both PIOTs only cover ethyl alcohol manufacturing, but the corresponding IO sector also includes production of other chemicals such as amines that may include N. This distinction is based on MFA in Singh et al., which did not look at other chemical production for the sector. In Additional file 1:

Sectoral total inputs and outputs (x) comparison
Besides discrepancies caused by incomplete sectoral coverage by models, Table 5 shows clear improvements in estimates for N total production flows. The improvements are especially striking for the fertilizer manufacturing and corn wet milling sectors. The relatively poor performance (~12% worse than the original result) for other basic organic chemical mfg is due to an outdated efficiency number for the yield of ethanol (2.47 gallons/bushel or 0.368 l/kg) in our process model (true yield estimated at 2.74 gallons/bushel or 0.408 l/kg in 2002), which can be rectified by updating the process model. The omission of the wet milling sector in the published PIOT shows the benefits of developing linked process models, as do the improvements to the fertilizer mfg sector (elaborated in the next section). The current approach also provided a significant correction to the animal farming sector, where, in the previous work, total mass inputs were assumed equal to total mass outputs, accounting for 44 thousand tons of N. Our models estimated manure production and enteric emissions, showing that most mass leaves in residuals, giving a total production (x) of 30 thousand tons of N. We estimate that hog production should make up ~90% of the mass flows in this sector, which also includes equines, sheep, goats and other miscellaneous livestock, so the process model provides a closer estimate than the published approach (which also covered ~90% of the sector). Further, mechanistic accounting of inputs, outputs and losses improved the reliability of these estimates.
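The two ethanol yield figures can be cross-checked with a short unit conversion. The sketch below assumes a standard bushel weight of 56 lb for shelled corn, which is not stated in the text; the function name is illustrative.

```python
LITERS_PER_US_GALLON = 3.785411784
KG_PER_POUND = 0.45359237
POUNDS_PER_BUSHEL_CORN = 56.0  # assumption: standard bushel weight of shelled corn

def gal_per_bushel_to_l_per_kg(yield_gal_per_bu):
    """Convert an ethanol yield from gallons per bushel to liters per kg of corn."""
    kg_per_bushel = POUNDS_PER_BUSHEL_CORN * KG_PER_POUND
    return yield_gal_per_bu * LITERS_PER_US_GALLON / kg_per_bushel

model_yield = gal_per_bushel_to_l_per_kg(2.47)   # value used in the process model
actual_yield = gal_per_bushel_to_l_per_kg(2.74)  # estimated true yield in 2002

print(round(model_yield, 3), round(actual_yield, 3))  # prints: 0.368 0.408
```

Both converted values agree with the l/kg figures quoted above, so updating the yield parameter in the process model is a one-number change.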

Structural comparison of RTCs, RICs and PIOTs
Figure 5 provides a visualization of all PIOT flows in the two approaches, showing (Fig. 5a) the dominance of the flows between the economy and natural resources via raw materials and residuals in the process model approach, versus a dominant fertilizer production sector in the previous MFA-based approach (Fig. 5b). Since all flows represent N, this difference is due primarily to the fertilizer manufacturing sector. Residuals in Fig. 5a come primarily from the purge stream of air, since the yield of the Haber-Bosch process is low and recycle is needed. An interaction from the hog farming sector is also clearly visible in Fig. 5a, since N in manure is classified into residuals. Figure 5b shows a much larger production figure from the fertilizer manufacturing sector than Fig. 5a. An assumption was made in the PIOT by Singh et al. (2017) that all demands were met by local production, giving a total production of 895 thousand tons of N (see Additional file 1: Table A11). Here, our approach should be more accurate, giving a production of 194 thousand tons of N with the rest of the local demand met by imports (593 thousand tons of N). Since the modeling based on capacity captures the production more closely than an assumption of meeting demand locally, it allowed a large correction to the figure in the previous PIOT when origin of products being consumed was not reported.
The process model-based PIOT (Table 3) showed a self-flow for animal farming in the RTC not shown in Singh et al. (2017). This represents imported animals grown to maturity in state. In the RIC created, the flow disappears since it does not reflect regional inputs. For animal feed production we accounted for corn, the major input to hog feed, which was omitted in the previously published PIOT. This difference accounts for most of the discrepancy in the animal feed sector's intersectoral flows. Missing flows from milling byproducts to animal feed in our PIOT are due to the focus on hog feed, which will be rectified as other livestock sectors are modeled and included since in 2002 these byproducts were used primarily for cattle and poultry feed.
In the primary inputs matrix, the process model approach provided estimates for residuals in all but one sector, while the previous approach has data only for two sectors. One major strength of this approach is that it is possible to approximate or incorporate information about residuals into process models, which can be difficult to find in official statistics. Residuals and other primary material flows are an important part of the PIOT, since they frequently make up a large portion of the total mass flows and so are critical for extended IO model applications in studying the environmental relationships of economic production structures. Therefore, it is evident from this six-sector example that the process modeling approach to building PIOTs provides a significant advantage over purely empirical approaches in this aspect.

Fig. 5 a … (Table 4) version of the PIOT created with our method. The large green areas represent the interactions between the fertilizer manufacturing sector and raw materials and residuals. b Chord diagram depicting flows from the PIOT prepared by Singh et al. (Additional file 1: A12). The large green area represents the flows from raw materials to fertilizer, and the large brown area represents the flows from fertilizer to the rest of the economy. Note: the perimeter space is proportional to the magnitude of outflow from the sector named. The flow between a sector and each additional sector is shown in a distinct color

Discussions on strengths and limitations of empirical versus process-based approaches to PIOT compilation
As discussed in Sects. 4.1 and 4.2, study of the two approaches allowed us to estimate correct values for total production and intersectoral flows. Overall, the process model-based results stayed within 30% of these values for all sectors. The highest discrepancies were in ethyl alcohol manufacturing and wet corn milling, due to high estimates of N content in corn and low efficiency factors for production. While 30% is a high margin of error, the previous approach performed much more variably against our estimates of correct values. (The highest discrepancy tracked was over 160%, and many of the streams compared showed > 30% discrepancy.) This suggests that our process modeling-based approach has significant potential to standardize PIOT construction while improving reliability.
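The kind of comparison described here can be expressed as a relative discrepancy against a validation value. The sketch below is a generic illustration: the sector names and numbers are hypothetical and are not taken from Table 5, and the 30% threshold simply echoes the figure quoted above.

```python
def relative_discrepancy(estimate, reference):
    """Absolute deviation of an estimate from a reference value, as a fraction."""
    if reference == 0:
        raise ValueError("reference value must be nonzero")
    return abs(estimate - reference) / abs(reference)

# Hypothetical sector totals in thousand tons of N: (estimate, validation value)
sectors = {
    "sector A": (90.0, 100.0),
    "sector B": (125.0, 100.0),
    "sector C": (260.0, 100.0),
}

# Flag every sector whose estimate deviates by more than 30% from validation
flagged = {
    name: relative_discrepancy(estimate, reference)
    for name, (estimate, reference) in sectors.items()
    if relative_discrepancy(estimate, reference) > 0.30
}
```

With these placeholder values only "sector C" is flagged, with a discrepancy of 160%.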
In both approaches, export and import values are primarily estimated from imbalances between supply and use. Better data on exports and imports will improve the confidence in PIOTs built from both approaches. We also see from simple mass balance estimates that minor sectoral flows and household consumption can cause up to 10% discrepancy for many of the sectors in terms of final demand. We perceive that the uncertainty in final demand remains high due to lack of data that can only be obtained by survey.

Heterogeneous and homogeneous sectors
The process modeling approach is best suited for homogeneous sectors (where there is a uniform production process across the region) such as corn milling, and may be unwieldy and costly for very heterogeneous sectors (with high variability in production processes in a region), such as feed processing. Over 1000 operations supply animal feed in IL, each following an array of different production recipes. Although processing equipment is similar, use varies based on availability of ingredients and local demand patterns. We modeled this sector by segmenting the market, i.e., accounting for the production from these sectors based on demand which in our case study was hog feed needed for hog farms.
Overall, homogeneous sectors are easy to model, and change occurs slowly in them due to large capital investments. Therefore, these models can be easily used for long-term projections of flows in homogeneous sectors. Heterogeneous sector models must be continually developed to stay relevant and be "forward looking." Nonetheless, this approach provides a transparent way to develop PIOTs from the bottom up and can easily be updated to capture technology changes.

Other limitations
One limitation of any process-based approach is the treatment of capital goods such as equipment and infrastructure, which are handled separately in the design phase of the production system. These represent stock flows for PIOTs and need to be modeled separately. Missing data are another important challenge. Some studies rely on non-survey techniques such as RAS to arrive at approximations based on whole-system data (Miller and Blair 2009). In our model, a mass-balanced transactions matrix is always generated, so balancing should require less adjustment of the thermodynamically validated process data. Better data availability related to subnational trade and consumption will also improve the reliability of PIOT generation.
Another limitation arises from the capacity of process modeling software to handle all production sectors; hence, relevant modeling software will need to be selected for each sector. Aspen Plus, used in this work, or open-source software such as the COCO simulator (COCO 2018) can be used for most of the chemical sectors, including refining, biomass and waste processing. Mechanical production can be modeled using assembly line simulation packages. Service sectors will need original model development relevant to the economy (following causal or correlational relationships between inputs and outputs). Farming sectors can use software and models from the agricultural modeling community, such as EPIC (Texas A&M AgriLife Research 2017). Process modeling software like Aspen will need improvement for handling complex mixtures and solids. Aspen is strong at handling recycle flows and sensitivity analysis, which helps ensure that product splits and supplemental feeds are present in correct quantities. Still, this method does not depend on a specific software package, and other simulation software for specific sectors can be used as long as production recipes and mass-balanced products and residuals are provided at a sectoral level. The modeling approach can be adapted to different regions and countries, but the models themselves may need to be adjusted based on regional practices. Finally, in this first demonstration of the process engineering models to PIOT approach, we have assumed closed temporal boundaries, with no movement in and out of inventories. This can be overcome by integration with dynamic models for capital (Pauliuk et al. 2015b), which should be supplemented with economic models.

Conclusions and future work
Based on the comparison of PIOT from the approach proposed in this work with a previously published PIOT, we conclude that a process modeling-based approach offers advantages for the construction of PIOTs in improving reliability, transparency, reproducibility and continuity at both regional and national scale. Aggregation and cost pose major challenges for PIOTs to enter fully into the set of environmental modeling tools based on the EEIO framework. The proposed "process-to-PIOT" approach can address the first concern by allowing detailed tracking of flows and transformations. Cost is addressed by developing reusable process models which with slight modifications allow generation of PIOTs for other regions and time periods.
This approach aligns with the recently proposed rectangular choice-of-technology (RCOT) approach, where a rectangular transactions matrix contains multiple rows for each sector corresponding to different production technologies (Duchin and Levine 2011). While currently our approach includes average technologies, we expect that the development of a wider array of process models will allow the RCOT approach to be adopted more widely. Additionally, the availability of reliable physical technical coefficients from this approach can also be integrated with the comprehensive HSUT approach (Merciai and Schmidt 2018) for constructing PIOTs.