with switch for passing through or dropping not bridged columns
konstantinstadler committed May 3, 2024
1 parent 6b5649e commit 1fad0b2
Showing 8 changed files with 450 additions and 50 deletions.
61 changes: 61 additions & 0 deletions doc/source/notebooks/convert.ipynb
@@ -0,0 +1,61 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "60fd7d0e-a46e-4db0-8bbe-00256058ee71",
"metadata": {},
"source": [
"# Convert and Characterize"
]
},
{
"cell_type": "markdown",
"id": "850708b0-66c3-4ca8-a50b-f7396ec4c1a7",
"metadata": {},
"source": [
"Pymrio offers several ways to convert data from one system to another."
]
},
{
"cell_type": "markdown",
"id": "bde3cf89-6c36-47dd-b9d5-48433f4473b5",
"metadata": {},
"source": [
"The term *convert* is meant very general here, it contains \n",
" - finding and extracting data based on indicies across a table or an mrio(-extension) system based on name and potentially constrained by sector/region or any other specification\n",
" - converting the names of the found indicies\n",
" - adjusting the numerical values of the data, e.g. for unit conversion or characterisation\n",
" - aggregating the extracted data, e.g. for the purpose of characterization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "74d2a195-5e5f-4798-9aa6-4136a4b84342",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
66 changes: 66 additions & 0 deletions doc/source/notebooks/convert.py
@@ -0,0 +1,66 @@
# ---
# jupyter:
# jupytext:
# text_representation:
# extension: .py
# format_name: percent
# format_version: '1.3'
# jupytext_version: 1.15.2
# kernelspec:
# display_name: Python 3 (ipykernel)
# language: python
# name: python3
# ---

# %% [markdown]
# # Convert and Characterize

# %% [markdown]
# Pymrio offers several ways to convert data from one system to another.

# %% [markdown]
# The term *convert* is used in a very general sense here. It covers:
# - finding and extracting data based on indices across a table or an mrio(-extension) system, matched by name and optionally constrained by sector/region or any other specification
# - converting the names of the found indices
# - adjusting the numerical values of the data, e.g. for unit conversion or characterization
# - aggregating the extracted data, e.g. for the purpose of characterization
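
# %% [markdown]
# The four steps listed above can be sketched by hand in plain pandas.
# This is only an illustration of the idea and does not use pymrio's own convert function;
# the table, the index names, the target name `GWP100` and the factors are all made-up assumptions.

```python
import pandas as pd

# Hypothetical emissions table: rows are (stressor, compartment), columns are regions.
data = pd.DataFrame(
    {"reg1": [1.0, 2.0, 3.0], "reg2": [4.0, 5.0, 6.0]},
    index=pd.MultiIndex.from_tuples(
        [("CO2", "air"), ("CH4", "air"), ("CH4", "water")],
        names=["stressor", "compartment"],
    ),
)

# Step 1: find and extract rows by name, constrained to the air compartment.
stressors = data.index.get_level_values("stressor")
compartments = data.index.get_level_values("compartment")
found = data[stressors.str.fullmatch("CO2|CH4") & (compartments == "air")]

# Step 3 (done before renaming, while the stressor names are still available):
# adjust the numerical values with illustrative, assumed factors.
factors = {"CO2": 1.0, "CH4": 25.0}
row_factors = found.index.get_level_values("stressor").map(factors).to_numpy()
scaled = found.mul(row_factors, axis=0)

# Step 2: convert the names of the found indices to the target name.
scaled.index = pd.Index(["GWP100"] * len(scaled), name="impact")

# Step 4: aggregate all rows that now share the same target name.
result = scaled.groupby(level="impact").sum()
print(result)
```

# %% [markdown]
# Scaling happens before renaming here only because the factors are looked up by the original stressor name.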

# %% [markdown]
# Pymrio can apply these convert functions either to one specific table (which does not necessarily have to be a table of the mrio system) or to the whole mrio(-extension) system.

# %% [markdown]
# ## Structure of the bridge table


# %% [markdown]
# Irrespective of whether a single table or the mrio system is converted, the convert function always follows the same pattern.
# It requires a bridge table, which maps the indices of the source data to the indices of the target data.
# This bridge table has to follow a specific format, depending on the table to be converted.
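
# %% [markdown]
# To make the idea of a bridge table concrete, here is a minimal, table-driven sketch in plain pandas.
# The column names (`stressor`, `impact`, `factor`), the helper `apply_bridge` and all values are
# hypothetical and chosen for illustration; the actual format expected by pymrio may differ.

```python
import pandas as pd

# Table to be converted (made-up values): rows are stressors, columns are regions.
source = pd.DataFrame(
    {"reg1": [10.0, 2.0, 1.0], "reg2": [20.0, 4.0, 3.0]},
    index=pd.Index(["CO2 fossil", "CO2 biogenic", "CH4"], name="stressor"),
)

# Hypothetical bridge table: one row per mapping rule.
bridge = pd.DataFrame(
    {
        "stressor": ["CO2.*", "CH4"],    # regular expression matched against the source index
        "impact": ["GWP100", "GWP100"],  # name of the target index entry
        "factor": [1.0, 25.0],           # multiplier applied to the matched rows
    }
)


def apply_bridge(table: pd.DataFrame, bridge: pd.DataFrame) -> pd.DataFrame:
    """Apply each bridge rule to the table and sum rows with the same target name."""
    parts = []
    for rule in bridge.itertuples(index=False):
        # Select the source rows whose name fully matches the rule's pattern.
        mask = table.index.str.fullmatch(rule.stressor)
        matched = table[mask] * rule.factor
        # Rename the matched rows to the target name.
        matched.index = pd.Index([rule.impact] * len(matched), name="impact")
        parts.append(matched)
    # Aggregate rows that map to the same target name.
    return pd.concat(parts).groupby(level="impact").sum()


converted = apply_bridge(source, bridge)
print(converted)
```

# %% [markdown]
# In this sketch, source rows that match no rule are silently dropped; passing them through unchanged would be an alternative design.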


# %% [markdown]
# Let's assume a table with the following structure (the table to be converted):

# %% [markdown]
# TODO: table from the test cases

# %% [markdown]
# A potential bridge table for this table could look like this:

# %% [markdown]
# TODO: table from the test cases

# %% [markdown]
# Describe the column names, and which entries can be regular expressions

# %% [markdown]
# Once everything is set up, we can continue with the actual conversion.

# %% [markdown]
# ## Converting a single data table


# %% [markdown]
# ## Converting a pymrio extension


204 changes: 196 additions & 8 deletions doc/source/notebooks/extract_data.ipynb
@@ -901,8 +901,7 @@
"metadata": {},
"outputs": [],
"source": [
"rows_to_extract =[('emission_type1', 'air'),\n",
" ('emission_type2', 'water')]"
"rows_to_extract = [(\"emission_type1\", \"air\"), (\"emission_type2\", \"water\")]"
]
},
{
@@ -1001,14 +1000,17 @@
"id": "68f6f3e8",
"metadata": {},
"source": [
"Extracting to dataframes is also a convenient way to convert an extension object to a dictionary:"
"Extracting to dataframes is also a convenient\n",
"way to convert an extension object to a dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "b23d7415",
"metadata": {},
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [
{
"data": {
@@ -1023,18 +1025,204 @@
],
"source": [
"df_all = mrio.emissions.extract(mrio.emissions.get_rows(), return_type=\"dfs\")\n",
"df_all.keys()"
"df_all.keys()\n",
"\n",
"\n",
"# The method can also extract only some of the accounts:\n",
"df_some = mrio.emissions.extract(\n",
" mrio.emissions.get_rows(), dataframes=[\"D_cba\", \"D_pba\"], return_type=\"dfs\"\n",
")\n",
"df_some.keys()"
]
},
{
"cell_type": "markdown",
"id": "4357fd67",
"metadata": {},
"source": [
"### Extracting from all extensions"
]
},
{
"cell_type": "markdown",
"id": "d49af58b",
"metadata": {},
"source": [
"We can also extract data from all extensions at once.\n",
"This is done using the `extension_extract` method of the pymrio object.\n",
"It expects a dict with the extension names as keys and, as values, a list of rows (index) to extract."
]
},
{
"cell_type": "markdown",
"id": "db8053ac",
"metadata": {},
"source": [
"Let's assume we want to extract value added and all emissions.\n",
"We first define the rows (index) to extract:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd730723",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"to_extract = {\n",
" \"Factor Inputs\": \"Value Added\",\n",
" \"Emissions\": [(\"emission_type1\", \"air\"), (\"emission_type2\", \"water\")],\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "0882d1dc",
"metadata": {},
"source": [
"We can then use the `extension_extract` method to extract the data, for example as pandas DataFrames,\n",
"which returns a dictionary with the extension names as keys:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbfe113a",
"metadata": {},
"outputs": [],
"source": [
"df_extract_all = mrio.extension_extract(to_extract, return_type=\"dataframe\")\n",
"df_extract_all.keys()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47393c06",
"metadata": {},
"outputs": [],
"source": [
"df_extract_all[\"Factor Inputs\"].keys()"
]
},
{
"cell_type": "markdown",
"id": "e5fc1452",
"metadata": {},
"source": [
"We can also extract into a dictionary of extension objects:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b195ef6f",
"metadata": {},
"outputs": [],
"source": [
"ext_extract_all = mrio.extension_extract(to_extract, return_type=\"extensions\")\n",
"ext_extract_all.keys()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee908ec0",
"metadata": {},
"outputs": [],
"source": [
"str(ext_extract_all[\"Factor Inputs\"])"
]
},
{
"cell_type": "markdown",
"id": "d3eed3a5",
"metadata": {},
"source": [
"Or merge the extracted data into a new pymrio Extension object (when passing a new name as return_type):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3690981",
"metadata": {},
"outputs": [],
"source": [
"ext_new = mrio.extension_extract(to_extract, return_type=\"new_merged_extension\")\n",
"str(ext_new)"
]
},
{
"cell_type": "markdown",
"id": "417f397d",
"metadata": {},
"source": [
"CONT: Continue with explaining, mention the work with find_all etc"
]
},
{
"cell_type": "markdown",
"id": "a2887bce",
"metadata": {},
"source": [
"### Search and extract"
]
},
{
"cell_type": "markdown",
"id": "c5beffce",
"metadata": {},
"source": [
"The extract methods can also be used in combination with the [search/explore](./explore.ipynb) methods of pymrio.\n",
"This makes it possible to search for specific rows and then extract the data."
]
},
{
"cell_type": "markdown",
"id": "04144a4b",
"metadata": {},
"source": [
"For example, to extract all emissions from the air compartment we can use:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87303c51",
"metadata": {},
"outputs": [],
"source": [
"match_air = mrio.extension_match(find_all=\"air\")"
]
},
{
"cell_type": "markdown",
"id": "e50edc2f",
"metadata": {},
"source": [
"And then make a new extension object with the extracted data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2cac8d8a",
"metadata": {},
"outputs": [],
"source": [
"air_emissions = mrio.emissions.extract(match_air, return_type=\"extracted_air_emissions\")\n",
"print(air_emissions)"
]
},
{
"cell_type": "markdown",
"id": "9b1fef8b",
"metadata": {},
"source": [
"CONT: DESRIBE STUFF ABOVE\n",
"For example, to extract the total value added for all regions and sectors we can use:"
"For more information on the search methods see the [explore notebook](./explore.ipynb)."
]
}
],
@@ -1054,7 +1242,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.12.0"
}
},
"nbformat": 4,