Make sure you have Python 3.8 or above. You can check in the command line with python --version
To create a virtual environment within this head directory (cd climate-trace=metamodeling
), run
python -m venv venv
. The second "venv" can be any name of your virtual environment. (I used ct_venv
).
Run the following to activate virtual environment to use and disable to stop using:
- nix (Mac OS, Linux):
source venv/bin/activate
- win (windows):
venv\Scripts\activate.bat
orvenv\Scripts\activate.ps1
for powershell deactivate
will deactivate your virtual env
Once your environment is activated, run pip install --default-timeout=1000 -r requirements.txt
. The increased default timeout is for pandas, since it is a large file and takes additional time to download.
Run pip install -e .
to install local packages via setup.py
.
- First, you will need to add your database credentials to the
params.json
file. - Next, you can run the
main.py
file to run all actions. This will:- Load Climate TRACE and EDGAR data
- Project EDGAR data forward in time
- Write that projected data to the database
- Load the projected data and fill Climate TRACE gaps with both EDGAR and EDGAR-projected
- Write the filled data to the database
We use the pytest
library for unit testing the functionality. Simply run pytest tests
from the head directory to run all tests in the folder.
-
If the output of the gap filled value is very negative (less than -2), change the value to NaN
-
If the value is slightly negative (between -2 and 0), change the value to 0
- For every country, every sector, we initialize rows with a value of zero in the gapfilling code for: co2, ch4, nh4, co2e_20yr, co2e_100yr
- If the value is nan, this means that we expect there to be data for the particular observation but none was given. If the value is zero, it means there are no emissions for that observation, which includes when there are no emissions possible. For example, a particular sector may not emit a certain gas.
- Sectors designated to use regression are defined in the
__init__
function of theProjectData
class - Regression is used when the following are both true:
- The sector is part of the designated sector list
- There are 4 or more available data points within the 6 year training window (no more than 2 NaNs)
- Forward filling (using the most recent year of data and repeating that value for future years) is used when either ..
- The sector is part of the designated regression sector list AND there are less than 4 available data points within the 6 year training window (more than 2 NaNs)
- The sector is not part of the designated regression sector list
- The sector is part of the designated regression sector list AND there are less than 4 available data points within the 6 year training window (more than 2 NaNs)
- ceds_derived_sectors.py can be run to quantify country-level estimates calculated using a combination of existing CEDS, EDGAR and Climate-TRACE data in the country_emissions_staging table. Functions insert derivations to same table. These new sectors are needed for 2024 gap filling equations.