diff --git a/Index.Rmd b/Index.Rmd
index 1b89291..3a63691 100644
--- a/Index.Rmd
+++ b/Index.Rmd
@@ -323,62 +323,72 @@ knitr::write_bib(c(.packages(),
 ```{r child="Seminar_Schedule.Rmd"}
 ```
 
-```{r child="Seminar_01_Experiential.Rmd"}
-```
-
 ```{r child="Lecture_02_Decision_Analysis.Rmd"}
 ```
 
-```{r child="Seminar_02_R_RStudio_1.Rmd"}
-```
-```{r child="Lecture_03_Define_Decision.Rmd"}
-```
-```{r child="Seminar_03_R_RStudio_2.Rmd"}
-```
+
+
-```{r child="Lecture_04_Case_Studies.Rmd"}
-```
-```{r child="Seminar_04_RMarkdown.Rmd"}
-```
+
+
+
+
-```{r child="Lecture_05_Participatory_Methods.Rmd"}
-```
-```{r child="Seminar_05_Git_Github.Rmd"}
-```
+
+
-```{r child="Lecture_06_Decision_Models.Rmd"}
-```
-```{r child="Seminar_06_Model_Programming.Rmd"}
-```
+
+
-```{r child="Lecture_07_calibration_lecture.Rmd"}
-```
-```{r child="Seminar_07_calibration_seminar.Rmd"}
-```
+
+
-```{r child="Lecture_08_Forecasts.Rmd"}
-```
-```{r child="Seminar_08_models_seminar.Rmd"}
-```
+
+
-```{r child="Lecture_08_01_Practical_Schiffers.Rmd"}
-```
+
+
-```{r child="Lecture_09_Bayesian_Thinking.Rmd"}
-```
-```{r child="Seminar_09_forecast_seminar.Rmd"}
-```
+
+
-```{r child="Lecture_10_Analyst_Profile.Rmd"}
-```
-```{r child="Seminar_10_Functions.Rmd"}
-```
+
+
-```{r child="Lecture_11_Communicating.Rmd"}
-```
-```{r child="Seminar_11_Writing.Rmd"}
-```
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
 ```{r child="Impressum.Rmd"}
 ```
 
diff --git a/Index.html b/Index.html
index beecedb..0d4d874 100644
--- a/Index.html
+++ b/Index.html
@@ -331,14 +331,14 @@

Bonus video from Sir Ken Robinson

Lecture schedule


Lecture schedule

 Lecture | Date | Materials | Reading
-1 | Friday, April 14, 2023 | Introduction | Watch: Sir Ken Robinson / Hans Rosling (optional)
+1 | Wednesday, April 10, 2024 | Introduction | Watch: Sir Ken Robinson / Hans Rosling (optional)
-2 | Friday, April 21, 2023 | Decision Analysis Overview | Hubbard (2014) (Chapter 1. The Challenge of Intangibles)
+2 | Friday, April 12, 2024 | Decision Analysis Overview | Hubbard (2014) (Chapter 1. The Challenge of Intangibles)
-3 | Friday, April 28, 2023 | Defining a Decision | Howard and Abbas (2015) (Chapter 1. Introduction to Quality Decision Making)
+3 | Wednesday, April 17, 2024 | Defining a Decision | Howard and Abbas (2015) (Chapter 1. Introduction to Quality Decision Making)
-4 | Friday, May 5, 2023 | Decision Analysis Case Studies | Shepherd et al. (2015)
+4 | Friday, April 19, 2024 | Decision Analysis Case Studies | Shepherd et al. (2015)
-5 | Friday, May 12, 2023 | Participatory Methods | Luedeling and Shepherd (2016)
+5 | Wednesday, April 24, 2024 | Participatory Methods | Luedeling and Shepherd (2016)
-6 | Friday, May 19, 2023 | Building Decision Models | Do, Luedeling, and Whitney (2020)
+6 | Friday, April 26, 2024 | Building Decision Models | Do, Luedeling, and Whitney (2020)
-7 | Friday, May 26, 2023 | Calibrating experts | Kahneman and Egan (2011) (Chapter 1)
-8 | Friday, June 9, 2023 | Using Models to Create Forecasts | Tetlock and Gardner (2015) (Chapter 1. An Optimistic Skeptic)
+7 | Friday, May 3, 2024 | Using Models to Create Forecasts | Tetlock and Gardner (2015) (Chapter 1. An Optimistic Skeptic)
-Practical | Friday, June 16, 2023 | Models with Dr. Schiffers | Build models
-9 | Friday, June 23, 2023 | Bayesian Thinking | Bertsch McGrayne (2011) (Chapter 1. Causes in the Air)
+8 | Wednesday, May 8, 2024 | Bayesian Thinking | Bertsch McGrayne (2011) (Chapter 1. Causes in the Air)
+9 | Friday, May 10, 2024 | Calibrating experts | Kahneman and Egan (2011) (Chapter 1)
-10 | Friday, June 30, 2023 | Profile of a Decision Analyst | Savage and Markowitz (2009) (Chapter 1)
+10 | Friday, May 17, 2024 | Profile of a Decision Analyst | Savage and Markowitz (2009) (Chapter 1)
-11 | Friday, July 7, 2023 | Communicating Decision Support | TBA
-12 | Date TBA | Group presentations | Present
-13 | Date TBA | Group presentations continued | Present
@@ -455,14 +429,14 @@

Lecture schedule

Seminar schedule


Seminar schedule

 Week | Date | Materials | Assignment
-1 | Friday, April 14, 2023 | Group Work Overview | Install R and Rstudio
+1 | Wednesday, May 28, 2024 | Group Work Overview | Install R and Rstudio
-2 | Friday, April 21, 2023 | Using R and RStudio | Share R scripts (part 1)
+2 | Friday, May 31, 2024 | Using R and RStudio | Share R scripts (part 1)
-3 | Friday, April 28, 2023 | Using R and RStudio continued | Share R scripts (part 2)
+3 | Wednesday, June 5, 2024 | Using R and RStudio continued | Share R scripts (part 2)
-4 | Friday, May 5, 2023 | Using RMarkdown | Share Rmarkdown file
+4 | Friday, June 7, 2024 | Using RMarkdown | Share Rmarkdown file
-5 | Friday, May 12, 2023 | Using git and Github | Start a Github repo
+5 | Wednesday, June 12, 2024 | Using git and Github | Start a Github repo
-6 | Friday, May 19, 2023 | Model Programming | Share R scripts (part 3)
+6 | Friday, June 14, 2024 | Model Programming | Share R scripts (part 3)
-7 | Friday, May 26, 2023 | Calibration training | Participate in calibration training
+7 | Wednesday, June 19, 2024 | Calibration training | Participate in calibration training
-8 | Friday, June 9, 2023 | Value of information | Share R scripts (part 3)
+8 | Friday, June 21, 2024 | Model functions | Share updated model
+9 | Wednesday, June 26, 2024 | Value of information | Share R scripts (part 3)
-Practical | Friday, June 16, 2023 | Models with Dr. Schiffers | Share initial models
-9 | Friday, June 23, 2023 | Model forecasts | Share initial forecasts
+10 | Friday, June 28, 2024 | Communicating Decision Support |
+11 | Wednesday, July 03, 2024 | Model forecasts | Share initial forecasts
-10 | Friday, June 30, 2023 | Model functions | Share updated model
-11 | Friday, July 7, 2023 | Citation Management | Join Zotero Group
+12 | Friday, July 5, 2024 | Citation Management | Join Zotero Group
-12 | Date TBA | Groups present / discuss final model | Present
+13 | Wednesday, July 10, 2024 | Guest Lecture |
+14 | Friday, July 12, 2024 | Consultation |
+15 | Wednesday, July 17, 2024 | Groups present / discuss final model | Prepare final model
-13 | Date TBA | Presentations continued | Present
+16 | Friday, July 19, 2024 | Presentations continued | Present
-
-

Seminar 1: Group Work Overview

- -

Welcome to the first seminar of Decision Analysis and -Forecasting for Agricultural Development.

-

You should already be familiar with the idea that we intend to have group work form a major part of this course. If you have not done so already, it is important that you choose a group to work with and begin to think about a topic (ideally an agricultural decision) that you intend to work on throughout this course. In this seminar we will share the general motivation behind the experiential nature of this course and talk about effective group work techniques and tools. The lecture in week three, Defining a decision, will provide a detailed exploration of how decisions are defined and of the steps we may need to go through to help decision makers define decisions.

-

Feel free to bring up any questions or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

To see an example of the result we will be aiming to create see the -paper by Do, Luedeling, and Whitney (2020) -and the associated supplementary -materials.

-

The first step in getting ready for the work of this course is to -install and run R (R Core Team 2023) and RStudio.

-

-

- -

For some expert advice on how to name and organize your projects and -files please scroll through Danielle Navarro’s Project -Structure slides.

-

Lecture 2: Decision Analysis Overview

@@ -793,12802 +743,632 @@

Bonus video by Professor Karl Claxton

- - -
-

Seminar 2: Using R and RStudio

- -

Welcome to the second seminar of Decision Analysis and -Forecasting for Agricultural Development. Feel free to bring up -any questions or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

Remember that this is an experiential course and will be based -largely on work from your side that is supported by us, rather than -passive learning where we deliver content. In order to work with us in -this course you will need to have some basic tools.

-

You should already have installed R and RStudio. If not, please check out the tools we use in this course. If you already have R and RStudio loaded then you should be able to follow parts of our R tutorial. Please follow the [Data Wrangling Part 1](https://agtools.app/learning_r/#section-data-wrangling-part-1) and [Data Wrangling Part 2](https://agtools.app/learning_r/#section-data-wrangling-part-2) portions of the tutorial. Share the results of your work with the ggplot2 diamonds data (Wickham, Chang, et al. 2023) after you have worked through these materials.
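If you want a concrete target, here is a minimal sketch of the kind of wrangling result you might share (the grouping choice is just an example, not part of the tutorial):

library(dplyr)
library(ggplot2) # provides the diamonds data

# summarise median price and count per cut class
diamonds %>%
  group_by(cut) %>%
  summarise(median_price = median(price),
            n = n())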

- - -
-

Further Reading

- -
-
-
-

Lecture 3: Defining a Decision

- -

-

-

The following videos cover some of the challenges that researchers -and agricultural development experts face when attempting to define -decisions. Please watch the videos and answer the questions that -follow.

-
-

Defining Decisions Part 1

- - -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-

Defining Decisions Part 2

- - -

Think about the following questions and jot down your thoughts.

- -
-
-

Defining Decisions Part 3

- - -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-

Think of challenges you may encounter when trying to do -credible, salient and legitimate research -(all of which decision analysts should aim to achieve).

-
-
-

Quality Decision Making

-

This video outlines the background for quality decision making, -closely following Howard and Abbas (2015). -At the end of the video please take some time to sketch out some -thoughts about a decision that you will be working on for the -experiential part of this course. You can find the slides for the first -three parts of the lecture here.

- - - -
-
-
-
-
- -
-
-
-
-

Group discussion reading:

-

This week you will all read Howard and Abbas -(2015) (Chapter 1. Introduction to Quality Decision Making). One -group will lead a discussion on the reading.

-
-
-
-
-
- -
-
- -
-
-
-

Seminar 3: Using R and RStudio Continued

- -

You should already have installed R and RStudio. If not, please check out the tools we use in this course. If you already have R and RStudio loaded then you should be able to follow the Data Visualization section of the R tutorial. Share the results of your work with a data set of your choice (e.g. the ggplot2 diamonds data (Wickham, Chang, et al. 2023)) after you have worked through these materials.
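As one possible starting point, a minimal sketch of a plot you could share (the variable choices are just an example):

library(ggplot2)

# scatterplot of price against carat in the diamonds data,
# colored by cut, with a log scale to spread out the prices
ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
  geom_point(alpha = 0.2) +
  scale_y_log10()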

-
-

Lifestyle choices

-

If you run the code usethis::use_blank_slate() in your -console it will set your RStudio preference to NEVER -save or restore .RData on exit or on startup, which is a -lifestyle choice endorsed by many excellent researchers like Jenny Bryan.

-
-
-
-

Lecture 4: Decision Analysis Case Studies

- -

Welcome to lecture 4 of Decision Analysis and Forecasting for Agricultural Development. In this lecture we will follow three case studies and use the #lecture-04-case-studies Slack channel and Zoom meeting to follow up. This will give us all a chance to see your progress and to provide feedback. As always, your engagement and sharing is appreciated; it will be encouraging for others to see activity from colleagues. Feel free to bring up any questions or concerns in the Slack or to Dr. Cory Whitney or the course tutor.

-

Please take a few minutes to watch these short videos and do the -associated exercises before our meeting.

-
-

Introduction

-

The following videos introduce some decision analysis case studies. -Please watch the videos and answer the questions that follow.

-
-
-

Decision analysis case - Water for Wajir

-

The city of Wajir in Northern Kenya lacks a reliable supply of clean drinking water and sanitation. To improve the situation, plans are being considered to construct a water pipeline from Habaswein, over 100 km away (Luedeling and Leeuw 2014; Luedeling et al. 2015). Watch the following video to learn more.

- -

Now try to answer the following questions:

-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-

Case study - Plastic covers in sweet cherry orchards in Chile

-

This video shows the study Rojas et al. -(2021) conducted in Chile for assessing the profitability of -implementing plastic covers in sweet cherry orchards to protect the -fruits from hazardous weather events.

- - -
-
-
-
-
- -
-
-

Based on the question above, think about and write down two to three -positive and negative implications of using the method we used to define -the decision.

-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-

The data and scripts for this study are also available online (Fernandez et al. 2021).

-
-
-

Group discussion reading:

- - - - - - - - - - - - - - - - - - - - + - - - - - - - - - - - + - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - -
-
-
-

Seminar 4: Using RMarkdown

- -
-

Using RMarkdown

-

RMarkdown: overview

-

-

Now let’s check out Rmarkdown, a powerful tool that allows us to produce polished reports, websites and more from our R code (Allaire et al. 2023). You’re already looking at an example, by the way. This website was produced with Rmarkdown (and it wasn’t hard at all)!

- - -

Here are the -slides and the -html file generated in the video.

-

To run this you will need to load library(rmarkdown)

-

-

and library(knitr)
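Once both are loaded, rendering from the console looks like this (a minimal sketch; the file name my_report.Rmd is hypothetical):

library(rmarkdown)
library(knitr)

# knit an R Markdown file in the working directory to HTML
render("my_report.Rmd", output_format = "html_document")

The same thing happens when you press the Knit button in RStudio.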

-

- -
-
-

RMarkdown: getting stuck

- - -
-
-

RMarkdown: basics

-

https://rmarkdown.rstudio.com/authoring_basics.html

-

https://bookdown.org/yihui/rmarkdown/r-code.html

- -

- Open an -Rmarkdown file and knit.

-

-

-

Now we’re equipped with all the basic tools for this course. We’ll -start using them pretty soon. Throughout the course you’ll find -occasional references to your RMarkdown file. This is supposed to be -produced with Rmarkdown, with subsequent versions of it stored on -Github.

-
-
-
-

Lecture 5: Participatory Methods For Qualitative Model -Development

- -
-

Participatory methods

-

The following video outlines some of the tools that can be used in -participatory modeling. Watch the video and answer the questions that -follow.

- -
-
-
-
-
- -
-
- -
-
-

Stakeholder management

-

The following video covers some definitions, methods and tools and a -case study related to stakeholder management in decision analysis. -Please watch the videos and answer the questions that follow. These will -be helpful in determining which tools/techniques you might use to -identify stakeholders in your decision analysis process.

- - - -
-
-
-
-
- -
-
-
-
-
-
-
-
+ -
-

Bonus: Plot the stakeholder analysis in R

-

Plot stakeholders’ experience, availability and expertise in decision analysis with the ggplot2, ggrepel and ggthemes libraries (Wickham, Chang, et al. 2023; Slowikowski 2023; Arnold 2023). Use the stakeholder data set from our git repository. To run this on your machine, save this raw csv file as .csv to your local .RProj directory. Then use stakeholder <- read.csv("stakeholder.csv") to load the data into your R environment and save it as stakeholder.

-
-
ggplot(data = stakeholder, aes(x = Experience, 
-                               y = Availability, 
-                               label = stakeholders, 
-                               color = Expertise)) + 
-  geom_point(aes(shape=Gender)) +
-  xlab("Relevant Experience") +
-  
- #label names of stakeholders and expand space to show full names
-  scale_x_continuous(labels = paste(seq(0, 5, by = 1)), 
-                     breaks = seq(0, 5, by = 1), 
-                     limits = c(0, 5),
-                     expand = c(0, 1)) +
-  scale_y_continuous(labels = paste(seq(0, 5, by = 1)), 
-                     breaks = seq(0, 5, by = 1), 
-                     limits = c(0, 5),
-                     expand = c(0, 1)) +
-  theme(plot.margin = unit(c(1, 1, 1, 1), "cm")) +
-  theme(legend.position = "none") +
-    
-# Create line to categorize stakeholders
-  geom_hline(yintercept=2.5, color="white", size=2) +
-  geom_vline(xintercept=2.5, color="white", size=2) +
-
- # label stakeholder names, avoiding overlaps (assumed completion;
- # geom_text_repel() comes from the ggrepel library loaded above)
-  geom_text_repel(size = 3)
+
+

References

+

+ -

-

Learn more about using names and overlapped -values.

-
-
-

Group discussion reading:

-

This week you will read Shepherd et al. -(2015). One of the members of each of your groups will present -the results of this reading.

-
    -
  • Shepherd, Keith, Douglas Hubbard, Norman Fenton, Karl Claxton, Eike -Luedeling, and Jan de Leeuw. “Development Goals Should Enable -Decision-Making.” Nature 523, no. 7559 (2015): 152–54.
  • -
-
-
-

Further Reading

-
    -
  • Liberating -Structures by Keith McCandless (and others)

  • -
  • Reed, M. S. et al. (2009) ‘Who’s in and why? A typology of -stakeholder analysis methods for natural resource management’, Journal -of Environmental Management, 90(5), pp. 1933–1949. doi: -10.1016/j.jenvman.2009.01.001.

  • -
-
-
-
-

Seminar 5: Using git and Github

- -

Welcome to the fifth seminar of Decision Analysis and Forecasting for Agricultural Development. All your programming and project work for this course should take place in the git environment, and this seminar is intended to show you how to make that happen. Feel free to bring up any questions or concerns in the Slack or to Dr. Cory Whitney or the course tutor.

-
-

Git and Github

-

Now we look at the programming version control environment git and the interface github, which we use to share our scripts -and changes to our work.

- -

Install Git & join Github (if you have not already):

- -

Important note for Windows users: when you download Git, it may be installed into your Program Files directory by default. This often causes issues later. The general recommendation is to choose a different directory for installation, e.g. C:/Users/username/bin/. Once Git is installed, open RStudio, go to Tools > Global Options, select Git/SVN, and then enter or select the path for your Git executable.
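A quick way to check which Git executable R can see (base R, no extra packages):

# returns the full path of the git executable on the search path,
# or an empty string if R cannot find one
Sys.which("git")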

- -

Some useful tips on getting these running from our friends

- -

Start simple with your own repository and work on it alone. Share your profile and a link to a repository you made in the Slack channel for this seminar. Soon you will start to collaborate on something in git. For the project part of this course your team will work in a collective repository, working together to generate the final project model and report.

-
-
-

Bonus: Sharing html files

-

We can use the options from htmlpreview.github.io to -show the results of any of our html output work that is stored in our -git repository.

-

To do that just preface the path for the html file with -http://htmlpreview.github.io/?.

-

For example the slides for this seminar can be found in the -CWWhitney/teaching_git repository.

-

The full path is: -https://github.com/CWWhitney/teaching_git/blob/master/R_Git_GitHub.html

-

and the link to this to be viewed directly as a web page is: http://htmlpreview.github.io/?https://github.com/CWWhitney/teaching_git/blob/master/R_Git_GitHub.html

-
-
-

Bonus: Further Reading

- -

Watch the ‘What is GitHub’ video from the GitHub developers

- -
-
-
-

Lecture 6: Building Decision Models

- -

Welcome to lecture 6 of Decision Analysis and Forecasting for -Agricultural Development. We will walk through brief examples -and offer the scripts. Feel free to bring up any questions or concerns -in the Slack, to Dr. Cory -Whitney or to the course tutor.

-

Decision-makers often wish to have a quantitative basis for their decisions. However, ‘hard data’ is often missing or unattainable for many important variables, which can paralyze decision-making processes or lead decision-makers to conclude that large research efforts are needed before a decision can be made. That is, many variables decision makers must consider cannot be precisely quantified, at least not without unreasonable effort. The major objective of (prescriptive) decision analysis is to support decision-making processes faced with this problem. Following the principles of decision analysis allows us to make forecasts of decision outcomes without precise numbers, as long as probability distributions describing the possible values for all variables can be estimated.

-

The decisionSupport package implements this as a Monte -Carlo simulation, which generates a large number of plausible system -outcomes, based on random numbers for each input variable that are drawn -from user-specified probability distributions. This approach is useful -for determining whether a clearly preferable course of action can be -delineated based on the present state of knowledge without the need for -further information. If the distribution of predicted system outcomes -does not imply a clearly preferable decision option, variables -identified as carrying decision-relevant uncertainty can then be -targeted by decision-supporting research. This approach is explained in -more detail below and in the model -programming seminar.

-

In this portion of the course we hope to introduce you to the methods -and inspire you to follow a few important guidelines in the process of -model building. One of the key aspects of model building has to do with -making a solid business case for the model before programming.

- -

Another important guideline is to start simple then move on to other -steps. This ensures that you always have a working model at each of the -steps in the process, i.e. starting with a skateboard rather than a car -part as in this example from MetaLab.

-
- -
‘Skateboard, Bike, Car’
-
- -

Before you start developing the decision model (an R function), open -R and download and load the decisionSupport package.

-
install.packages("decisionSupport")
-library(decisionSupport)
-
-

Creating a model

-

The mcSimulation function from the -decisionSupport package can be applied to conduct decision -analysis (Luedeling et al. 2023). The -function requires three inputs:

-
    -
  1. an estimate of the joint probability distribution of -the input variables. These specify the names and probability -distributions for all variables used in the decision model. These -distributions aim to represent the full range of possible values for -each component of the model.
  2. -
  3. a model_function that predicts decision outcomes based -on the variables named in a separate data table. This R function is -customized by the user to address a particular decision problem to -provide the decision analysis model.
    -
  4. -
  5. numberOfModelRuns indicating the number of times to run -the model function.
  6. -
-

These inputs are provided as arguments to the -mcSimulation function, which conducts a Monte Carlo -analysis with repeated model runs based on probability distributions for -all uncertain variables. The data table and model are customized to fit -the particulars of a specific decision.

-
-
-

The estimate

-

To support the model building process we design an input table to -store the estimate values. The table is stored locally as -example_decision_inputs.csv and contains many of the basic -values for the analysis. This table contains all the input variables -used in the model. Their distributions are described by 90% confidence -intervals, which are specified by lower (5% quantile) and upper (95% -quantile) bounds, as well as the shape of the distribution. This example -uses four different distributions:

-
    -
  1. const – a constant value
  2. -
  3. norm – a normal distribution
  4. -
  5. tnorm_0_1 – a truncated normal distribution that can -only have values between 0 and 1 (useful for probabilities; note that 0 -and 1, as well as numbers outside this interval are not permitted as -inputs)
  6. -
  7. posnorm – a normal distribution truncated at 0 (only -positive values allowed)
  8. -
-

For a full list of possible distributions, type -?random.estimate1d in your R console. When specifying -confidence intervals for truncated distributions, note that -approximately 5% of the random values should ‘fit’ within the truncation -interval on either side. If there is not enough space, the function will -generate a warning (usually it will still work, but the inputs may not -look like you intended them to).
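As a small illustration, here is a hedged sketch of an estimate that uses all four distribution types (the variable names and bounds are invented for this example):

library(decisionSupport)

# 90% confidence intervals: lower = 5% quantile, upper = 95% quantile
example_estimate <- as.estimate(data.frame(
  variable = c("n_years", "yield", "success_chance", "price"),
  lower = c(10, 5000, 0.2, 2),
  median = NA,
  upper = c(10, 9000, 0.8, 6),
  distribution = c("const", "norm", "tnorm_0_1", "posnorm")
))

# draw three random values per variable to inspect the behavior
random(rho = example_estimate, n = 3)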

-

We have provided default distributions for all the variables used here, but feel free to make adjustments by editing the .csv file in a spreadsheet program. You can download the ‘example_decision_inputs.csv’ here and save it as a .csv file locally (from your web browser try: File > Save page as). In this example it is stored in the ‘data’ folder. Once you have downloaded the data you can run example_decision_inputs <- read.csv("data/example_decision_inputs.csv") in your console to make the following code work on your machine. Note that the input table can be written to or read from a .csv file with the estimate_read_csv function, or converted to the correct format with the as.estimate function.

-
-
-

The model_function

-

The decision model is coded as an R function which takes in the -variables provided in the data table and generates a model output, such -as the Net Present Value.

-
-
-

Create a simple function

-

Here we define a simple model function that we call example_decision_model. It calculates profits as benefits minus costs and arbitrarily adds 500 to the result to arrive at final_profits. This simple example shows us how to use a function, the results of which can be applied elsewhere. The most basic way to apply the package is to use the mcSimulation function to run our example_decision_model. Here we run it 100 times using inputs from the example_decision_inputs.csv table.

-

Update this model by adding additional_benefits, one of -the variables from the example_decision_inputs data, to -replace the 500 that was arbitrarily added to the profit and changing -the number of model runs to 700.

-
-
example_decision_model <- function(x, varnames){
-  profit <- benefits-costs
-  
-  final_profits <- profit + 500
-  
-  return(final_profits)
-  
-}
+# load the data (use write.csv for data created in RMD)
+stakeholder <- read.csv("data/stakeholder.csv")
+input_estimates <- read.csv("data/input_estimates.csv")
+example_decision_inputs <- read.csv("data/example_decision_inputs.csv")
+input_table <- read.csv("data/example_input_table.csv")
+hail_estimates <- read.csv("data/hail_estimates.csv")
 
-mcSimulation(estimate = as.estimate(example_decision_inputs),
-                              model_function = example_decision_model,
-                              numberOfModelRuns = 100,
-                              functionSyntax = "plainNames")
- -
-
-
example_decision_model <- function(x, varnames){
-  
-  profit <- benefits-costs
-  
-  final_profits <- profit + additional_benefits
-  
-  return(final_profits)
-  
-}
+###### chile function #######
 
-mcSimulation(estimate = as.estimate(example_decision_inputs),
-                              model_function = example_decision_model,
-                              numberOfModelRuns = 700,
-                              functionSyntax = "plainNames")
-
-

Now we have a simulation of possible outcomes. In the model programming seminar this -week we will go into more detail on the options for assessment of the -results of simulations and on visualization of the results.

-

Note that this example was constructed for clarity and not for speed. -Speed considerations are not very important when we only run a process -once, but since in the Monte Carlo simulation all delays are multiplied -by numberOfModelRuns (e.g. 10,000), they can sometimes add -up to substantial time losses. Even with highly efficient coding, Monte -Carlo simulations can take a while when dealing with complex -decisions.
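To see why this matters, a toy illustration (the numbers are invented): a delay of just one millisecond per model run becomes ten seconds at 10,000 runs.

# 1 ms of simulated 'work' per model run
one_run <- function() Sys.sleep(0.001)

# time 1,000 repetitions; scale up by 10x for a 10,000-run simulation
system.time(for (i in 1:1000) one_run())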

-

The objective of the procedures used in the -decisionSupport package is to make it easier for analysts -to produce decision-relevant information that adequately reflects the -imperfect nature of the information we usually have. Adding -probabilistic elements to a simulation adds substantial value to an -analysis. Mostly, it avoids making spurious assumptions, replacing -uncertainty with ‘best bets’ and producing results that do not reflect -the knowledge limitations that make decision-making so challenging. More -information on all this is contained in the decisionSupport -manual, especially under welfareDecisionAnalysis.

-
-
-

Group discussion reading:

-
    -
  • Do, Luedeling, and Whitney (2020) -‘Decision analysis of agroforestry options reveals adoption risks for -resource-poor farmers’.
  • -
-
-
-

Bonus: Further reading

-
    -
  • Whitney et al. (2018) -‘Probabilistic decision tools for determining impacts of agricultural -development policy on household nutrition’.

  • -
  • Ruett, Whitney, and Luedeling -(2020) ‘Model-based evaluation of management options in -ornamental plant nurseries’

  • -
-
-
-
-

Seminar 6: Model Programming

- -

Welcome to the sixth seminar of Decision Analysis and -Forecasting for Agricultural Development. In this seminar we -will look into different options for model programming in the -R programming environment (R Core -Team 2023). In case you have not already installed R -and RStudio you may want to go back to the Using R and Rstudio seminar and follow the -instructions there. Feel free to bring up any questions or concerns in -the Slack or to Dr. Cory -Whitney or the course tutor.

-

We have just learned about the processes and methods of generating Decision Models. Now we can explore -some of the packages that we commonly use in R. The main -package that our team uses for simulations is the -decisionSupport. In order to use the standard tools for -running these models in the R environment you will need to load the -decisionSupport library. Use the -install.packages function, -i.e. install.packages("decisionSupport").

-
library(decisionSupport)
-
-

Plot the impact pathway

-

In this example, we show how we can transform a simple impact pathway -into code for estimating profit distributions.

-

We use the mermaid function from the -DiagrammeR library to create the graphical impact pathway -(Iannone 2023).

-

Run the code below and then add an additional cost to the graph -called Management cost (try to do this on your own but feel -free to look to the solution to see one way to do -this).

-
-
mermaid("graph LR
-        Y(Yield)-->I(Income); linkStyle 0 stroke:green, stroke-width:1.5px
-        M(Market price)-->I; linkStyle 1 stroke: green, stroke-width:1.5px
-        I-->F(Final result); linkStyle 2 stroke: green, stroke-width:1.5px
-        CL(Labor cost)-->F; linkStyle 3 stroke: red, stroke-width:1.5px")
- -
-
-
mermaid("graph LR
-        Y(Yield)-->I(Income); linkStyle 0 stroke:green, stroke-width:1.5px
-        M(Market price)-->I; linkStyle 1 stroke: green, stroke-width:1.5px
-        I-->F(Final result); linkStyle 2 stroke: green, stroke-width:1.5px
-        CL(Labor cost)-->F; linkStyle 3 stroke: red, stroke-width:1.5px
-        CM(Management cost)-->F; linkStyle 4 stroke: red, stroke-width:1.5px")
-
-

The LR argument in the mermaid function -specifies that the plot will go from left to right. Try and change this -to TD.

-

Use the style options in mermaid to change -the arrow widths to 2px and the node colors to red for -costs and green for benefits. You will need to break the line and put -the linkStyle on a new line to add the color to the -node.

-
-
mermaid("graph LR
-        Y(Yield)-->I(Income); linkStyle 0 stroke:green, stroke-width:1px
-        M(Market price)-->I; linkStyle 1 stroke: green, stroke-width:1px
-        I-->F(Final result); linkStyle 2 stroke: green, stroke-width:1px
-        CL(Labor cost)-->F; linkStyle 3 stroke: red, stroke-width:1px
-        CM(Management cost)-->F; linkStyle 4 stroke: red, stroke-width:1px")
- -
-
-
mermaid("graph TB
-        Y(Yield)-->I(Income); style I fill:green
-        linkStyle 0 stroke:green, stroke-width:2px
-        M(Market price)-->I; linkStyle 1 stroke: green, stroke-width:2px
-        I-->F(Final result); linkStyle 2 stroke: green, stroke-width:2px
-        CL(Labor cost)-->F; style CL fill:red
-        linkStyle 3 stroke: red, stroke-width:2px
-        CM(Management cost)-->F; style CM fill:red
-        linkStyle 4 stroke: red, stroke-width:2px")
-
-

That was just one of many ways to generate impact pathways. To see -more options see the Decision -Analysis Overview lecture materials.

-
-
-

Building the model

-

Here we generate an input table to feed the model function. Update -this with your new management cost variable in the graphical impact -pathway. Call your new variable "Management_cost", make the -lower bound 100 and the upper bound 2000, make -the distribution "posnorm", make the label -"Management cost (USD/ha)" and make the description -"Management costs in a normal season".

-
-
input_estimates <- data.frame(variable = c("Yield", "Market_price", "Labor_cost"),
-                    lower = c(6000, 3, 500),
-                    median = NA,
-                    upper = c(14000, 8, 1000),
-                    distribution = c("posnorm", "posnorm", "posnorm"),
-                    label = c("Yield (kg/ha)", "Price (USD/kg)", "Labor cost (USD/ha)"),
-                    Description = c("Yield in a sweet cherry farm under normal conditions",
-                                    "Price of sweet cherry in a normal season",
-                                    "Labor costs in a normal season"))
-
-input_estimates
- -
-
-
input_estimates <- data.frame(variable = c("Yield", "Market_price", "Labor_cost", "Management_cost"),
-                    lower = c(6000, 3, 500, 100),
-                    median = NA,
-                    upper = c(14000, 8, 1000, 2000),
-                    distribution = c("posnorm", "posnorm", "posnorm", "posnorm"),
-                    label = c("Yield (kg/ha)", "Price (USD/kg)", "Labor cost (USD/ha)", "Management cost (USD/ha)"),
-                    Description = c("Yield in a sweet cherry farm under normal conditions",
-                                    "Price of sweet cherry in a normal season",
-                                    "Labor costs in a normal season", 
-                                    "Management costs in a normal season"))
-
-input_estimates
-
-

Here we use the mcSimulation function from the decisionSupport package to implement a model (Luedeling et al. 2023). The model function describes the graphical impact pathway. Add a new line of code that sums Labor_cost and Management_cost into overall_costs, then subtract these from income to calculate final_result.

-
-
model_function <- function(){
-  
-  # Estimate the income in a normal season
-  income <- Yield * Market_price
-  
-  # Estimate the final results from the model
-  final_result <- income - Labor_cost
-  
-  # Generate the list of outputs from the Monte Carlo simulation
-  return(list(final_result = final_result))
-}
-
-# Run the Monte Carlo simulation using the model function
-example_mc_simulation <- mcSimulation(estimate = as.estimate(input_estimates),
-                              model_function = model_function,
-                              numberOfModelRuns = 800,
-                              functionSyntax = "plainNames")
-
-example_mc_simulation
- -
-
-
model_function <- function(){
-  
-  # Estimate the income in a normal season
-  income <- Yield * Market_price
-  
-  overall_costs <- Labor_cost + Management_cost
-    
-  # Estimate the final results from the model
-  final_result <- income - overall_costs
-  
-  # Generate the list of outputs from the Monte Carlo simulation
+# create the model function
+model_function <- function(){
+  income <- Yield * Market_price
+  overall_costs <- Labor_cost + Management_cost
+  final_result <- income - overall_costs
   return(list(final_result = final_result))
 }
 
-# Run the Monte Carlo simulation using the model function
-example_mc_simulation <- mcSimulation(estimate = as.estimate(input_estimates),
+# Run the Monte Carlo simulation using the model function ####
+example_mc_simulation <- mcSimulation(estimate = as.estimate(input_estimates),
                               model_function = model_function,
-                              numberOfModelRuns = 800,
-                              functionSyntax = "plainNames")
-
-example_mc_simulation
-
-

Here we show the results of a Monte Carlo simulation (800 model runs) -for estimating the profits in sweet cherry orchards.

-

Change the plot to a histogram by using the method -argument in the plot_distributions function.

-
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "boxplot_density",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
- -
-
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "hist_simple_overlay",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
-
-
-
-

Testing with make_variables

-

You could simply start further developing the decision model now, but -since the model function will be designed to make use of variables -provided to it externally (random numbers drawn according to the -information in the data table), you will need to define sample values -for all variables, if you want to test pieces of the function code -during the development process. This can be done manually, but it’s more -easily accomplished with the following helper function -make_variables:

-
-
make_variables <- function(est,n=1)
-{ x<-random(rho=est, n=n)
+                              numberOfModelRuns = 100,
+                              functionSyntax = "plainNames")
+###### make variables function #######
+# make variables
+make_variables <- function(est,n=1)
+{ x<-random(rho=est, n=n)
     for(i in colnames(x)) assign(i,
      as.numeric(x[1,i]),envir=.GlobalEnv)
-}
- -
-

This function is not included in the decisionSupport -package, because it places the desired variables in the global -environment. This is not allowed for functions included in packages on -R’s download servers.

-

Applying make_variables and as.estimate to -the data table (with default setting n=1) generates one -random number for each variable, which then allows you to easily test -the code you are developing. Try running this function on your code as -you build the decision function. This allows for testing the values -within a model rather than running the full model.

-

Run the make_variables and as.estimate on -the input_estimates input table that we created and then -calculate the result of Labor_cost + Management_cost given -a single random value for these variables. Note that each time -you run this code it generates a new random draw and produces a -different number from within the range for the variables in the input -table.

-
-
make_variables(as.estimate(input_estimates))
+}
 
-Market_price
- -
-
-
make_variables(as.estimate(input_estimates))
+# suppress warnings
+options(warn = - 1)
+
+###### burkina function #######
+# second decision function
+example_decision_function <- function(x, varnames){
 
-Labor_cost + Management_cost
-
-
-
-

Next steps

-

Once you have followed and run the code above on your machine it is a good time to look through the outline of these procedures in the decisionSupport vignette on CRAN called ‘Applying the mcSimulation function in decisionSupport’ (Fernandez, Whitney, and Luedeling 2021). This will show you how the tools can be applied for comparing decision outcomes. It runs a Monte-Carlo-based selection of sedimentation management strategies for a reservoir in the Upper Volta River Basin of Burkina Faso (Lanzanova et al. 2019). Tip: If you want to play with this code you can find the Rmarkdown file and the estimate table in the decisionSupport GitHub repository.

-

Taken together, all these outputs allow an evaluation of the -plausible range of net benefits that can be expected to arise from the -decision. They provide a recommendation on which decision option should -be preferred, and an appraisal of which input variables are responsible -for most of the variation in the output distribution.

-
-
-

Bonus: Bayesian modeling application

-

Listen to this interview on Alex Andorra’s podcast ‘Learning Bayesian -Statistics’, How -to use Bayes in industry, with Colin Carroll

- - - - -
-
-
-

Lecture 7: Calibrating experts

- -

Welcome to lecture 7 of Decision Analysis and Forecasting for -Agricultural Development. Feel free to bring up any questions -or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

The following videos tell you something about a method which is -frequently used in our group to get reliable values for the input tables -from our stakeholders. Please watch the videos and answer the questions -that follow.

-
-

What is calibration training?

- -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-

Why do we apply calibration training?

- -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-

Group discussion reading:

-
    -
  • Chapter 1. Kahneman, Daniel, and Patrick Egan. -Thinking, Fast and Slow. Vol. 1. New York: Farrar, Straus and Giroux, -2011.
  • -
-
-
-

Bonus Material

-

Watch the video and do the awareness test!

- -

Have a look at Terry Heick’s Cognitive -Bias codex.

-
-
-
-

Seminar 7: Calibration training

- -

Welcome to the seventh seminar of Decision Analysis and Forecasting for Agricultural Development. This week we will be going through the materials that were mentioned in the Calibrating experts lecture.

-

Your work for this seminar is to review the materials from Lecture 7 on the methods of calibration and to prepare to go through the calibration process yourself.

-

To prepare please load the required calibration sheets from our -shared Slack work space.

-

After this session please consider the inputs that will be necessary -for your own model and prepare the input table and formulate questions -for eliciting calibrated estimates from your colleagues and/or any -experts that you have identified for your team’s model.

-
-

Further tools

-

Sign in to GuidedTrack -calibration training to get your confidence intervals results.

-

Sign in to the Prediction -Book to ‘Find out just how sure you should be, and get better at -being only as sure as the facts justify’.

-
-
-

Group discussion - food for thought

-
    -
  1. What do you think about the idea of taking estimates as values -for the impact models?

  2. -
  3. What do you think about the method of calibration training as -presented? Do you have suggestions how to change or improve it?

  4. -
  5. Can you think of further cognitive biases and how to deal with -them?

  6. -
-
-
-
-

Lecture 8: Using Models to Create Forecasts

- -

Welcome to lecture 8 of Decision Analysis and Forecasting for -Agricultural Development. We will walk through brief examples -and offer the scripts. Feel free to bring up any questions or concerns -in the Slack or to Dr. Cory -Whitney or the course tutor.

-

In this lecture we will build on what we learned in the Decision Models lecture and the Model programming seminar to learn -more about how to generate useful forecasts for communicating model -results. We will implement another Monte Carlo simulation.

-

In the Model programming seminar we generated a model called example_mc_simulation. As a quick refresher, change the plot of the results of that model from boxplot_density to smooth_simple_overlay by using the method argument in the plot_distributions function.

-
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "boxplot_density",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
- -
-
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "smooth_simple_overlay",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
-
-
-

Making forecasts

-

First let’s look at a case study on agroforestry interventions in -Northwest Vietnam (Do, Luedeling, and Whitney -2020).

- -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-

Value of information and value of control

- -

Since the model inputs are uncertain variables, there is always a chance that the decision turns out to be wrong. Gathering new data to inform the decision will lead to (on average) better decisions. But how much better? Is it worth the measuring effort? What if we could not only measure but even manipulate the model inputs? These concepts are informally known as the “value of clairvoyance” and the “value of wizardry”. Please watch the following lecture by Johannes Kopton and enter the magical world of decision analysis:

- -
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-
-
-
-
-
- -
-
-

Technically, we assess the possibility of making the wrong decision with a payoff matrix \(R_{ij}\), with the row index \(i\) describing a choice and the column index \(j\) describing a random variable that the decision maker does not yet have knowledge of, which has probability \(p_{j}\) of being in state \(j\). If the decision maker has to choose \(i\) without knowing the value of \(j\), the best choice is the one that maximizes the expected value:

-

\({\mbox{EMV}}=\max _{i}\sum -_{j}p_{j}R_{ij}\)

-

where \(\sum _{j}p_{j}R_{ij}\) is the expected payoff for action \(i\), i.e. its expectation value, and taking \(\max _{i}\) means choosing the maximum of these expectations over all available actions.

-

However, with perfect knowledge of \(j\), the decision maker would choose a -value of \(i\) that optimizes the -expectation for that specific \(j\). -Therefore, the expected value given perfect information is

-

\({\mbox{EV}}|{\mbox{PI}}=\sum -_{j}p_{j}(\max _{i}R_{ij})\)

-

where \(p_{j}\) is the probability -that the system is in state \(j\), and -\(R_{ij}\) is the pay-off if one -follows action \(i\) while the system -is in state \(j\). Here \((\max_{i}R_{ij})\) indicates the best -choice of action \(i\) for each state -\(j\).

-

The expected value of perfect information is the difference between -these two quantities,

-

\({\mbox{EVPI}}={\mbox{EV}}|{\mbox{PI}}-{\mbox{EMV}}\)

-

This difference describes, in expectation, how much larger a value -the decision maker can hope to obtain by knowing \(j\) and picking the best \(i\) for that \(j\), as compared to picking a value of -\(i\) before \(j\) is known. Since \(EV|PI\) is necessarily greater than or -equal to \(EMV\), \(EVPI\) is always non-negative.
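To make this concrete, here is a small numerical sketch in R (the payoff matrix and probabilities are invented for illustration):

# payoff matrix R: two actions (rows) by three states (columns)
payoff <- matrix(c(100, 50, -20,
                    60, 60,  10),
                 nrow = 2, byrow = TRUE)
p <- c(0.3, 0.5, 0.2)  # probabilities of the three states

EMV <- max(payoff %*% p)                # best expected value without information
EV_PI <- sum(p * apply(payoff, 2, max)) # expected value with perfect information
EVPI <- EV_PI - EMV                     # always non-negative
EVPI

Here EMV = 51 and EV|PI = 62, so EVPI = 11: knowing the state in advance is worth 11 payoff units on average.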

-
-
-

Projection to Latent Structures

-

Projection to Latent Structures (PLS), also sometimes known as Partial Least Squares regression, is a multivariate statistical technique that can deal with multiple collinear dependent and independent variables (Wold, Sjöström, and Eriksson 2001). It can be used as another means to assess the outcomes of a Monte Carlo model. Read more in ‘A Simple Explanation of Partial Least Squares’ by Kee Siong Ng.

-

Variable Importance in Projection (VIP) scores estimate the importance of each variable in the projection used in a PLS model. VIP is a parameter used for calculating the cumulative measure of the influence of individual \(X\)-variables on the model. For a given PLS dimension \(a\), the squared PLS weight \(W_a^2\) of that term is multiplied by the explained sum of squares (\(SS\)) of that PLS dimension; the value obtained is then divided by the total explained \(SS\) of the PLS model and multiplied by the number of terms in the model. The final \(VIP\) is the square root of that number.

-

\(VIP_{PLS} = \sqrt{K\times \frac{\sum_{a=1}^{A}(W_{a}^{2} \times SSY_{comp,a})}{SSY_{cum}}}\)

-

Technically, VIP is a weighted combination over all components of the squared PLS weights (\(W_{a}\)), where \(SSY_{comp,a}\) is the sum of squares of \(Y\) explained by component \(a\), \(A\) is the total number of components, and \(K\) is the total number of variables. The average of the squared VIP values is equal to 1, because the sum of squares of all VIP values equals the number of variables in \(X\). A variable with a VIP score close to or greater than 1 (one) can be considered important in a given model. The input is a PLS model and the output is a column vector equal in length to the number of variables included in the model. See Galindo-Prieto, Eriksson, and Trygg (2014) for a detailed description of variations of VIP analysis.

-

In the seminar we will go into detail about how to calculate PLS and the VIP with the built-in functions of the decisionSupport package.
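Ahead of that, here is a hedged sketch of the calculation itself, using the pls package on simulated data (the data, variable names and the vip_scores helper are all invented for illustration, not part of decisionSupport):

library(pls)

set.seed(42)
X <- matrix(rnorm(100 * 5), ncol = 5,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- X %*% c(3, 1, 0, 0, -2) + rnorm(100)  # x3 and x4 are pure noise
pls_model <- plsr(y ~ X, ncomp = 2)

# VIP scores following the formula above (assumes the normalized
# loading weights produced by plsr)
vip_scores <- function(model) {
  W <- model$loading.weights                            # K x A weights
  SSY <- c(model$Yloadings)^2 * colSums(model$scores^2) # SS of Y per component
  K <- nrow(W)                                          # number of variables
  sqrt(K * colSums(t(W^2) * SSY) / sum(SSY))            # square root, as above
}

vip_scores(pls_model)  # scores near or above 1 flag influential variables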

-
-
-

Group discussion reading:

-
    -
  • Tetlock, Philip E., and Dan Gardner. Superforecasting: The Art and -Science of Prediction. New York, NY: Crown Publishers, 2015.(Chapter 1. -An Optimistic Skeptic)
  • -
-
-
-
-

Seminar 8: Models

- -

Welcome to the eighth seminar of Decision Analysis and Forecasting for Agricultural Development. Feel free to bring up any questions or concerns in the Slack or to Dr. Cory Whitney or the course tutor.

-
-

Decision analysis with the decisionSupport package

-

As we have discussed in previous lectures, when deciding on interventions in complex systems, many of the variables decision makers need to consider are difficult or impossible to precisely quantify. The decisionSupport package (Luedeling et al. 2023) can make use of uncertainty around these relationships and implement a Monte Carlo simulation, which generates a large number of plausible system outcomes. The model procedure draws random numbers for each input variable from specified probability distributions. The method requires two inputs to the mcSimulation function:

-
    -
  1. an R function that predicts decision outcomes based on the -variables named in a separate data table. This R function is customized -by the user to address a particular decision problem.

  2. -
  3. an input table (normally in comma separated values -format, ending in .csv) specifying the names and -probability distributions for all variables used in the decision model. -These distributions aim to represent the full range of possible values -for each component of the model.

  4. -
-

The mcSimulation function from the -decisionSupport package conducts a Monte Carlo analysis -with repeated model runs based on probability distributions for all -uncertain variables. The data table and model are customized to fit the -particulars of a specific decision.

-
-
-

Building a decision model as an R function

-

We have already built some simple models in the Decision Models lecture and the Model programming seminar. Now let’s look at a slightly more comprehensive decision model of sedimentation management strategies for a reservoir in the Upper Volta River Basin of Burkina Faso (Lanzanova et al. 2019). The reservoirs in the Upper Volta have multiple benefits for rural communities and are important for food security and livelihoods. Sedimentation is a major issue, requiring efficient sedimentation management, for which there are many options, each fraught with uncertainty and risk. We will walk through a simplified version of the model by Lanzanova et al. (2019), using various functions from the decisionSupport library.

-

First we generate a model as a function. We use the -decisionSupport functions vv() to produce time -series with variation from a pre-defined mean and coefficient of -variation, chance_event() to simulate whether events occur -and discount() to discount values along a time series.
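Before walking through the full function, a quick standalone look at these three helpers (the means, CVs and rates below are invented for illustration):

library(decisionSupport)

# vv(): a 5-year series varying around a mean of 100 with a 20% CV
vv(var_mean = 100, var_CV = 20, n = 5)

# chance_event(): one draw of an event with a 50% chance of occurring,
# returning 1 if it occurs and 0 otherwise
chance_event(chance = 0.5, value_if = 1, value_if_not = 0, n = 1)

# discount(): Net Present Value of a five-year cash flow of 1000 per year
# at a 10% discount rate
discount(x = rep(1000, 5), discount_rate = 10, calculate_NPV = TRUE)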

-
-
example_decision_function <- function(x, varnames){
-  
   # calculate ex-ante risks: impact the implementation of interventions ####
-  intervention_NonPopInvolvEvent <-
+  intervention_NonPopInvolvEvent <-
     chance_event(intervention_NonPopInvolv, 1, 0, n = 1)
-  
+
   # pre-calculate common random draws for all intervention model runs ####
-  
+
   # profits from Tropical Livestock Units (TLU)
-  TLU <- vv(TLU_no_intervention, var_CV, n_years)
-  TLU_profit <- vv(profit_per_TLU, var_CV, n_years)
-  
+  TLU <- vv(TLU_no_intervention, var_CV, n_years)
+  TLU_profit <- vv(profit_per_TLU, var_CV, n_years)
+
   # benefits of fruit
-  precalc_intervention_fruit_benefits <-
+  precalc_intervention_fruit_benefits <-
     vv(intervention_fruit_area_ha, var_CV, n_years) *
     vv(intervention_fruit_yield_t_ha, var_CV, n_years) *
     vv(intervention_fruit_profit_USD_t, var_CV, n_years)
-  
+
   # benefits of vegetables
-  precalc_intervention_vegetable_benefits <-
+  precalc_intervention_vegetable_benefits <-
     vv(intervention_vegetable_area_ha, var_CV, n_years) *
     vv(intervention_vegetable_yield_t_ha, var_CV, n_years) *
     vv(intervention_vegetable_profit_USD_t, var_CV, n_years)
-  
+
   # benefits of rain-fed crops
-  precalc_intervention_rainfed_crop_benefits <-
+  precalc_intervention_rainfed_crop_benefits <-
     vv(intervention_rainfed_crop_area_ha, var_CV, n_years) *
     vv(intervention_rainfed_crop_yield_t_ha, var_CV, n_years) *
     vv(intervention_rainfed_crop_profit_USD_t, var_CV, n_years)
-  
+
   #  Intervention ####
-  
-  for (decision_intervention_strips in c(FALSE,TRUE)){
-      
-  if (decision_intervention_strips){
-    
-    intervention_strips <- TRUE
-    intervention_strips_PlanningCost <- TRUE
-    intervention_strips_cost <- TRUE
+
+  for (decision_intervention_strips in c(FALSE,TRUE))
+      {
+
+  if (decision_intervention_strips)
+  {
+    intervention_strips <- TRUE
+    intervention_strips_PlanningCost <- TRUE
+    intervention_strips_cost <- TRUE
   } else
   {
-    intervention_strips <- FALSE
-    intervention_strips_PlanningCost <- FALSE
-    intervention_strips_cost <- FALSE
+    intervention_strips <- FALSE
+    intervention_strips_PlanningCost <- FALSE
+    intervention_strips_cost <- FALSE
   }
-  
+
   if (intervention_NonPopInvolvEvent) {
-    intervention_strips <- FALSE
-    intervention_strips_cost <- FALSE
+    intervention_strips <- FALSE
+    intervention_strips_cost <- FALSE
   }
-  
+
   # Costs ####
   if (intervention_strips_cost) {
-    cost_intervention_strips <-
-      intervention_adaptation_cost + 
-      intervention_tech_devices_cost + 
+    cost_intervention_strips <-
+      intervention_adaptation_cost +
+      intervention_tech_devices_cost +
       intervention_nursery_cost +
       intervention_wells_cost +
-      intervention_training_cost + 
-      intervention_mngmt_oprt_cost + 
+      intervention_training_cost +
+      intervention_mngmt_oprt_cost +
       intervention_mngmt_follow_cost +
       intervention_mngmt_audit_cost
   } else
-    cost_intervention_strips <- 0
-  
+    cost_intervention_strips <- 0
+
   if (intervention_strips_PlanningCost) {
-    plan_cost_intervention_strips <-
+    plan_cost_intervention_strips <-
       intervention_communication_cost + intervention_zoning_cost
   } else
-    plan_cost_intervention_strips <- 0
-  
-  maintenance_cost <- rep(0, n_years)
-  
+    plan_cost_intervention_strips <- 0
+
+  maintenance_cost <- rep(0, n_years)
+
   if (intervention_strips)
-    maintenance_cost <-
-    maintenance_cost + vv(maintenance_intervention_strips, 
+    maintenance_cost <-
+    maintenance_cost + vv(maintenance_intervention_strips,
                           var_CV, n_years)
-  
-  intervention_cost <- maintenance_cost
-  intervention_cost[1] <-
-    intervention_cost[1] + 
-    cost_intervention_strips + 
+
+  intervention_cost <- maintenance_cost
+  intervention_cost[1] <-
+    intervention_cost[1] +
+    cost_intervention_strips +
     plan_cost_intervention_strips
 
-  
+
   # Benefits from  cultivation in the intervention strips ####
-  
-  intervention_fruit_benefits <-
+
+  intervention_fruit_benefits <-
     as.numeric(intervention_strips) * precalc_intervention_fruit_benefits
-  intervention_vegetable_benefits <-
+  intervention_vegetable_benefits <-
     as.numeric(intervention_strips) * precalc_intervention_vegetable_benefits
-  intervention_rainfed_crop_benefits <-
+  intervention_rainfed_crop_benefits <-
     as.numeric(intervention_strips) * precalc_intervention_rainfed_crop_benefits
-  
+
   # Total benefits from crop production (agricultural development and riparian zone) ####
-  crop_production <-
+  crop_production <-
     intervention_fruit_benefits +
     intervention_vegetable_benefits +
     intervention_rainfed_crop_benefits
-  
+
   # Benefits from livestock ####
   # The following allows considering that intervention strips may
   # restrict access to the reservoir for livestock.
-  
+
   if (intervention_strips)
-    TLU_intervention <-
+    TLU_intervention <-
     TLU * (1 + change_TLU_intervention_perc / 100)
   else
-    TLU_intervention <- TLU
-  
+    TLU_intervention <- TLU
+
   if (decision_intervention_strips){
-    livestock_benefits <- TLU_intervention * TLU_profit
-    total_benefits <- crop_production + livestock_benefits
-    net_benefits <- total_benefits - intervention_cost
-    result_interv <- net_benefits}
-  
-  
+    livestock_benefits <- TLU_intervention * TLU_profit
+    total_benefits <- crop_production + livestock_benefits
+    net_benefits <- total_benefits - intervention_cost
+    result_interv <- net_benefits}
+
+
   if (!decision_intervention_strips){
-    livestock_benefits <- TLU_no_intervention * TLU_profit
-    total_benefits <- livestock_benefits
-    net_benefits <- total_benefits - intervention_cost
-    result_n_interv <- net_benefits}
-  
+    livestock_benefits <- TLU_no_intervention * TLU_profit
+    total_benefits <- livestock_benefits
+    net_benefits <- total_benefits - intervention_cost
+    result_n_interv <- net_benefits}
+
     } #close intervention loop bracket
 
-NPV_interv <-
+NPV_interv <-
   discount(result_interv, discount_rate, calculate_NPV = TRUE)
 
-NPV_n_interv <-
+NPV_n_interv <-
   discount(result_n_interv, discount_rate, calculate_NPV = TRUE)
 
-# Beware, if you do not name your outputs 
-# (left-hand side of the equal sign) in the return section, 
+# Beware, if you do not name your outputs
+# (left-hand side of the equal sign) in the return section,
 # the variables will be called output_1, _2, etc.
 
 return(list(Interv_NPV = NPV_interv,
             NO_Interv_NPV = NPV_n_interv,
             NPV_decision_do = NPV_interv - NPV_n_interv,
             Cashflow_decision_do = result_interv - result_n_interv))
-}

The input table

-

The second thing we need for the model is the input table describing the uncertainty in our variables. A draft of this has been generated for you already. You can get the input table from this Github repository by using the read_csv function from the readr library (Wickham, Hester, and Bryan 2023) and the url function from base R. You can also download the file directly from GitHub if you want. You will need to use the 'save as' option for the webpage so that you can have it stored locally as a .csv file on your machine.

-
-
library(readr)
+}
+###### run burkina model #######
+# second simulation
+mcSimulation_results <- decisionSupport::mcSimulation(
+  estimate = decisionSupport::estimate_read_csv("data/example_input_table.csv"),
+  model_function = example_decision_function,
+  numberOfModelRuns = 100,
+  functionSyntax = "plainNames"
+)
 
-example_input_table = "https://raw.githubusercontent.com/CWWhitney/Decision_Analysis_Course/main/data/example_input_table.csv"
+###### burkina evpi #####
+mcSimulation_table <- data.frame(mcSimulation_results$x, mcSimulation_results$y[1:3])
 
-input_table <- read_csv(url(example_input_table))

Want to have a look? Load the data using the read.csv -function.

-
example_data <- read.csv("example_input_table.csv")
-

Notice that the data has seven columns. Looking at the file it should be clear from the 'Description' column what each variable is about and which units it is in. Ideally this is also mostly clear from the name of the variable.

-
-
names(input_table)
-

Perform a Monte Carlo simulation

-

Using the model function above, we can perform a Monte Carlo -simulation with the mcSimulation() function from -decisionSupport. This function generates distributions of -all variables in the input table as well as the specified model outputs -(see return() function above) by calculating random draws -in our defined example_decision_function(). Make sure that -all the variables in the input table are included in the model -(erroneous variables listed there can cause issues with some of the -post-hoc analyses).

-

The numberOfModelRuns argument is an integer indicating -the number of model runs for the Monte Carlo simulation. Unless the -model function is very complex, 10,000 runs is a reasonable choice (for -complex models, 10,000 model runs can take a while, so especially when -the model is still under development, it often makes sense to use a -lower number).

-
-
mcSimulation_results <- decisionSupport::mcSimulation(
-  estimate = decisionSupport::estimate_read_csv("data/example_input_table.csv"),
-  model_function = example_decision_function,
-  numberOfModelRuns = 200,
-  functionSyntax = "plainNames"
-)
-

Model assessment

-

For the assessment of the results we will use the -decisionSupport and many tidyverse libraries -(Wickham et al. 2019) including -ggplot2 (Wickham, Chang, et al. -2023), plyr (Wickham -2023a) and dplyr (Wickham, -François, et al. 2023) among others in the R programming language (R Core Team 2023).

-
-
-

Plot Net Present Value (NPV) distributions

-

We can use the plot_distributions() function to produce -one of the several plotting options for distribution outputs. This shows -us an overlay of the full results of the Monte Carlo model of the -decision options, i.e. the expected NPV if we choose to do the -intervention Interv_NPV or not do the intervention -NO_Interv_NPV.

-
-
decisionSupport::plot_distributions(mcSimulation_object = mcSimulation_results, 
-                                    vars = c("Interv_NPV", "NO_Interv_NPV"),
-                                    method = 'smooth_simple_overlay', 
-                                    base_size = 7)
- -
-

We can use the same function to show the distributions of the ‘do’ -Interv_NPV and ‘do not do’ NO_Interv_NPV -decision scenarios as boxplots. This can be useful when comparing -multiple outputs by illustrating the spread of the data resulting from -the decision model. Boxplots show the median (central line), the -25th and 75th percentiles (sides of boxes) and any -outliers (light circles outside of boxes).

-
-
decisionSupport::plot_distributions(mcSimulation_object = mcSimulation_results, 
-                                    vars = c("Interv_NPV",
-                                    "NO_Interv_NPV"),
-                                    method = 'boxplot')
- -
-

We can use the same function for the value of the decision -(difference in NPV between do and do not do). This can be quite helpful -for us since it shows us the outcome distribution of the decision -itself.

-
-
decisionSupport::plot_distributions(mcSimulation_object = mcSimulation_results, 
-                                    vars = "NPV_decision_do",
-                                    method = 'boxplot_density')
- -
-
-

Cashflow analysis

-

Here we plot the distribution of annual cashflow over the entire -simulated period for the intervention (n_years). For this -we use the plot_cashflow() function which uses the -specified cashflow outputs from the mcSimulation() function -(in our case Cashflow_decision_do) to show cashflow over -time.

-
-
plot_cashflow(mcSimulation_object = mcSimulation_results, cashflow_var_name = "Cashflow_decision_do")
-

Value of Information (VoI) analysis

-

We calculate Value of Information (VoI) analysis with the Expected -Value of Perfect Information (EVPI). As we learned in Lecture 8 on forecasts, EVPI measures the -expected opportunity loss that is incurred when the decision-maker does -not have perfect information about a particular variable. EVPI is -determined by examining the influence of that variable on the output -value of a decision model.

-

We use the function data.frame() to transform the x and y outputs of the mcSimulation() function for EVPI calculation. We use the multi_EVPI() function to calculate the EVPI for multiple independent variables. For the first_out_var argument we choose Interv_NPV, since this is the first output variable in the combined data frame; everything before it is treated as a model input in the EVPI analysis.

-
-
#here we subset the outputs from the mcSimulation function (y) by selecting the correct variables
-# choose this carefully and be sure to run the multi_EVPI only on the variables that you want
-mcSimulation_table <- data.frame(mcSimulation_results$x, mcSimulation_results$y[1:3])
+evpi <- multi_EVPI(mc = mcSimulation_table, first_out_var = "Interv_NPV")
+###### Burkina pls ########
+pls_result <- plsr.mcSimulation(object = mcSimulation_results,
+                  resultName = names(mcSimulation_results$y)[3], ncomp = 1)
+###### HAIL Risk #######
+hail_function <- function(){
 
-evpi <- multi_EVPI(mc = mcSimulation_table, first_out_var = "Interv_NPV")

We use the function plot_evpi() on the results from -multi_EVPI() to plot the Expected Value of Perfect -Information (EVPI). Here we show the results with the standard settings. -The length of the bars is equal to EVPI.

-
-
plot_evpi(evpi, decision_vars = "NPV_decision_do")
- -
-

Finally, we can use the compound_figure() function to provide a single figure for a quick assessment. This can be used to run the full decision assessment for a simple binary decision ('do' or 'do not do').

-
-
compound_figure(mcSimulation_object = mcSimulation_results, 
-                input_table = input_table, plsrResults = pls_result, 
-                EVPIresults = evpi, decision_var_name = "NPV_decision_do", 
-                cashflow_var_name = "Cashflow_decision_do", 
-                base_size = 7)
-

Projection to Latent Structures (PLS) analysis

-

We can apply another post-hoc analysis to the mcSimulation() outputs with plsr.mcSimulation() to determine the Variable Importance in the Projection (VIP) score and coefficients of a Projection to Latent Structures (PLS) regression model. This function uses the outputs of mcSimulation(), selecting all the input variables from the decision analysis function in the parameter object, and then runs a PLS regression with an outcome variable defined in the parameter resultName. We use the code names(mcSimulation_results$y)[3] to select the outcome variable NPV_decision_do, which is the third element of the list y in our mcSimulation_results outputs (this must be a character element).

-
-
pls_result <- plsr.mcSimulation(object = mcSimulation_results,
-                  resultName = names(mcSimulation_results$y)[3], ncomp = 1)
- -
-

We run the plot_pls() on the results from -plsr.mcSimulation() with a number of standard settings. The -length of the bars is equal to VIP with a vertical line at ‘1’ on the -x-axis indicating a standard cut-off for VIP used for variable -selection. The overall plot only shows those variables with a VIP > -0.8, which is the common threshold for variable selection. The colors of -the bars represent the positive or negative coefficient of the given -input variable with the output variable.

-

Here we import the input table again to replace the labels for the -variables on the y-axis. The input table can include a -label and variable column. The standard labels -(from the variable column) are usually computer readable -and not very nice for a plot. The plot_pls() function uses -the text in the label column as replacement for the default -text in the variable column.

-
-
plot_pls(pls_result, input_table = input_table, threshold = 0)
-

Tasks

-

See the model section of the talk by Hoa Do in the Lecture 8 on forecasts (from around minute 4:40 to 6) for an example of how to present a model overview in a succinct and clear way. The model is presented in the first three minutes of the talk.

-

Lecture 8.1: Bayesian Analysis Practical

-
-

Confronting expert calibrated models with data

-

Welcome to the lecture on Bayesian modeling with me, Dr. Katja Schiffers. Although we mainly implement these methods in our group for decision analysis, we do not want to withhold the Bayesian modeling approaches from you. Originally, I prepared this introduction for a seminar on the analysis of experimental data, so the focus is more on comparing frequentist and Bayesian statistics than on decision models. In this lecture we will use the brms package and use the #lecture_08_01_practical Slack channel to follow up. This will allow us all to have a chance to see your progress and provide feedback. As always, your engagement and sharing is appreciated; it will be encouraging to others to see activity from colleagues. Feel free to bring up any questions or concerns in Slack or to Dr. Cory Whitney or the course tutor.

-

The following text heavily relies on a very well-written article “Short -introduction to Bayesian statistics with R for ornithologists” -(2016) by Fränzi Korner-Nievergelt & Ommo Hüppop in ‘Die -Vogelwarte’. At the end of the lecture, you will know

-
  • the principle of Bayesian statistics,
  • ways to apply it using the R package 'brms', and
  • the differences between frequentist and Bayesian statistics regarding the interpretation of results.
-
-
-
-

Bayes’ theorem

-

No introduction to Bayesian statistics is complete without Bayes' theorem. Thomas Bayes was an English clergyman, philosopher, and statistician who, in 1763, published a manuscript entitled "An essay towards solving a problem in the doctrine of chances." In this work, he describes how, based on existing knowledge and additional observations X, the probability can be calculated that a hypothesis H is true:

-

\[P(H|X) = \frac{P(X|H) \times P(H)}{P(X)}\]

-

In words, the probability (P) that the hypothesis is true, given the -observed data (P(H|X), a-posteriori knowledge), is -equal to the probability of observing the data, assuming the hypothesis -is true (P(X|H), likelihood), times the prior -probability that the hypothesis is true before observing the data (P(H), -a-priori knowledge or prior), divided -by the probability of observing the data regardless of any assumptions -about the hypothesis (P(X), normalization constant).

-

It is important to note that hypotheses are expressed as parameter -values. For example, if our hypothesis is that the mean of a dependent -variable is greater under treatment A than under treatment B, we would -calculate the probability for the parameter mean(A) - mean(B) and obtain -the probability that it is greater than zero.
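To make the theorem concrete, here is a minimal numeric sketch in R. All probabilities below are made-up values for illustration, not numbers from the lecture:

p_H <- 0.3                    # prior P(H), assumed for illustration
p_X_given_H <- 0.8            # likelihood P(X|H), assumed
p_X_given_notH <- 0.2         # P(X|not H), assumed
# normalization constant P(X): probability of the data over both cases
p_X <- p_X_given_H * p_H + p_X_given_notH * (1 - p_H)
# Bayes' theorem: a-posteriori probability P(H|X)
p_X_given_H * p_H / p_X       # approx. 0.63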

-
-
-

The principle of Bayesian statistics

-

In Bayesian statistics, this idea is used to combine prior knowledge -from experts/literature (P(H)) with the information contained in newly -collected data X to generate the updated a-posteriori knowledge. We also -see that as a result, we obtain the probability of our hypothesis. This -is much more digestible and closer to our “normal” way of thinking than -interpreting a p-value: “The probability of finding data as -extreme as this or more extreme, if the null hypothesis is -true.”

-

Excursion: Probability distributions

-

To represent probabilities, such as prior knowledge, probability distributions are used. In the following figure, you can see such a probability distribution: the higher the density, the more likely the parameter value plotted on the x-axis. The total area under the curve always adds up to 1 (some parameter value must always be true).

[Figure: probability density over parameter values]
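As a quick check of this property in R (a trivial illustration using the standard normal density):

# the density integrates to 1 over the whole real line
integrate(dnorm, lower = -Inf, upper = Inf)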

So how do we calculate this probability? Unfortunately, it’s usually -not as straightforward as it seems. While we can usually calculate the -likelihood (this is also done in frequentist statistics to determine the -parameter values of our models) and establish our a-priori knowledge -through a probability distribution, things get tricky with the P(X) -part, the probability of the data, at least once we no longer have very -simple models. To do this, we would have to integrate the probability of -the data over all possible parameter values, and this is often not -possible. Fortunately, this problem can be overcome with a simulation -technique developed in the 1980s and 1990s:

-
-

MCMC - Markov Chain Monte Carlo

-

MCMC methods can approximate probability distributions (in this case, -P(H|X)) that cannot be analytically computed. The most intuitive and -quite ingenious explanation I found for this is by Michael Franke on his -Github page at https://michael-franke.github.io/intro-data-analysis/Ch-03-03-estimation-algorithms.html. -Here are some parts of it:

-
[Image: apple trees, by brgfx on Freepik]
-
-

How the Apples Get to the Trees

-

Every year in spring, Mother Nature sends out her children to -distribute apples on the apple trees. For each tree, the number of -apples should be proportional to the number of leaves: Giant Ralph with -twice as many leaves as Skinny Daniel should also have twice as many -apples in the end. So if there are a total of \(n_a\) apples, and \(L(t)\) is the number of leaves on tree -\(t\), each tree should receive \(A(t)\) apples.

-

\[A(t) = \frac{L(t)}{\sum_{t'} L(t')} \, n_a\]

-

The problem is that Mother Nature cannot calculate \(\sum L(t')\) (the normalization -constant), so she doesn’t know how many leaves all the trees have -together.

-

The children (Markov chains) can count, however. They can’t visit all -the trees and they can’t remember the numbers for very long, but they -can do it for two trees at a time. Here’s what they do: they each start -at a random tree (parameter value), already put an apple in it, count -the leaves \(L(t_1)\) there and then -look for a nearby tree from which they can also count the leaves \(L(t_2)\) (the denominator of our -distribution). If the number of leaves in the second tree is higher, -they will definitely go there and put an apple in it. If it’s lower, -they will only go there with a probability of \(L(t_2)/L(t_1)\) and put an apple there. -They keep walking back and forth between the trees, and in the end, the -apples are roughly evenly distributed. The frequency of visits by the -children has therefore approximated the desired distribution (which -Mother Nature couldn’t calculate)!

-

MCMC methods do the same thing: an MCMC chain randomly draws values -for the model parameters and computes -Result1 = Likelihood of the data * Prior for them. Then it -draws values randomly around the first ones and computes -Result2 = Likelihood of the data * Prior for them. If -Result2 is higher than Result1, it jumps there -and draws new parameter values from there. If Result2 is -lower, it only jumps there with a probability of -Result2/Result1. Then, values are drawn randomly again, and -so on. In the following figure, successful jumps are represented by blue -arrows and rejected jumps by green arrows.

-
- -
Visualization of an MCMC simulation. You have to -imagine that you cannot see the salmon-colored parameter landscape, -which represents the values for Likelihood * Prior, but can only -calculate the values of individual points. Further explanations are -given in the text. Source: https://www.turing.ac.uk/research/research-projects/adaptive-multilevel-mcmc-sampling
-
-

If this process is continued long enough, the distribution of the -blue dots approaches the posterior probability distribution of the -parameters that we want: We have combined a priori knowledge with the -information contained in the newly collected data to successfully obtain -posterior knowledge!
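The apple-tree walk can be written down in a few lines of R. The following is a minimal Metropolis sketch with assumed toy data and an assumed prior (all numbers are illustrative; this is also not the sampler that brms/Stan actually use, which rely on the more efficient Hamiltonian Monte Carlo):

set.seed(42)
obs <- c(225, 222, 221)   # toy observations (illustrative)
log_post <- function(mu) {
  # log(Likelihood * Prior), the unnormalized posterior
  sum(dnorm(obs, mean = mu, sd = 3, log = TRUE)) +
    dnorm(mu, mean = 210, sd = 5, log = TRUE)
}
chain <- numeric(5000)
chain[1] <- 210                          # start at a random 'tree'
for (i in 2:5000) {
  proposal <- chain[i - 1] + rnorm(1)    # look at a nearby 'tree'
  # jump with probability min(1, Result2/Result1), on the log scale
  if (log(runif(1)) < log_post(proposal) - log_post(chain[i - 1]))
    chain[i] <- proposal
  else
    chain[i] <- chain[i - 1]
}
hist(chain[-(1:500)])   # approximate posterior after discarding burn-in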

-
-
-
-

A (somewhat contrived) example.

-

We need to decide which chicken breed we want to keep in our orchard. -We already have 3 Kraienköppe and 6 Niederrheiner. We prefer the -Niederrheiner because they are less aggressive, but the Kraienköppe seem -to have a slightly higher egg-laying performance. Is it really advisable -to choose Kraienköppe because of this tendency?

-

1: Definition of the A-priori-probability -distribution

-

We inquire about the egg production of the two breeds and learn that -Niederrheiners lay between 190 and 210 eggs per year, while Kraienkoppes -lay between 200 and 220. Accordingly, we formulate our prior probability -distributions using the brms package:

-

library(brms)

-

To get the brms library to work as expected you may need the latest rstan and StanHeaders: run install.packages("rstan") and install.packages("StanHeaders"). You may also need to set up these libraries with options(mc.cores = parallel::detectCores()) and rstan_options(auto_write = TRUE).
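Collected in one place, the setup steps mentioned above look like this (a one-time setup sketch; the installation lines only need to run once per machine):

# install the Stan back-end (once per machine)
install.packages("rstan")
install.packages("StanHeaders")
# use all available cores and cache compiled models
options(mc.cores = parallel::detectCores())
rstan::rstan_options(auto_write = TRUE)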

-

We define two normal distributions using the function set_prior, one with a mean of 200 and a standard deviation of 5, and another with a mean of 210 and a standard deviation of 5. We specify under 'coef' which parameters in the model that we later formulate these priors belong to.

-

priors <- c(set_prior("normal(200, 5)", coef = "rasseN"), set_prior("normal(210, 5)", coef = "rasseK"))

-

Plot them to get a feeling for the priors:

-

curve(dnorm(x, 200, 5), from = 170, to = 230, xlab="egg production", ylab="density")
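To see both priors in one plot, the second curve can be overlaid (a small addition to the line above; the dashed line type is just a stylistic assumption):

# overlay the Kraienköppe prior on the same plot
curve(dnorm(x, 210, 5), from = 170, to = 230, add = TRUE, lty = 2)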

-

2. Collecting new data

-

Now we report our own observations of the egg-laying performance of -the 3 Kraienköppe and 6 Niederrheiners from the previous year:

-

rasse <- c("K", "K", "K", "N", "N", "N", "N", "N", "N")

-

ll <- c(225, 222, 221, 219, 219, 216, 221, 218, 217)

-

daten <- data.frame(rasse=rasse, ll=ll)

-

3. Combining the prior distribution with the data to obtain -the posterior distribution

-

To estimate the posterior distribution, we use the ‘brm’ function of -the brms package. As we know from other evaluations, we -first formulate our model, namely that the egg-laying performance -ll depends on the breed. The -1 in the model -formula causes the model to estimate the mean egg-laying performance for -both breeds rather than just the mean for the first breed and the -difference to it. Under data, we enter our own collected -data and under prior, we define the above priors. The -silent argument determines how much information is output -to the console when the MCMC chains start running. Depending on the -complexity of the model, these simulations can take several seconds or -minutes.

-

Make the model

-

legeleistung <- brms::brm(formula = ll ~ rasse -1, data = daten , prior = priors, silent = 2)

-

4. Interpretation

-

The above code has performed the MCMC simulation and we can now -examine the posterior distributions for the laying performance of the -chicken breeds:

-

Print the summary with summary(legeleistung).

-

The summary first shows us our inputs and some information about the Markov chains. Most interesting to us are, of course, the estimates for rasseK and rasseN and their credibility intervals. As you can see, the credibility intervals overlap: the lower value for rasseK (l-95% CI) is about 218, which is smaller than the upper value (u-95% CI) of about 219 for rasseN, the breed that lays fewer eggs.

-

Sigma is the estimated standard deviation of the assumed normal -distribution and is less relevant to us here.

-

plot(legeleistung)

-

In the figure, the results are even easier to interpret. On the left, -you can see the a-posteriori distributions for both breeds (note that -the x-axes of the top two distributions are not aligned). To the right -of that are the dynamics of the Markov chains.

-

From the a-posteriori distributions, we could now also calculate the exact probability that the egg-laying performance really is different. However, since we have already seen that the credibility intervals overlap, we stick to our preference for the Niederrheiners, even though it is not ruled out that they lay slightly fewer eggs.
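If you do want that exact probability, one possible way is a sketch with the hypothesis() function that ships with brms:

# posterior probability that Kraienköppe lay more eggs than Niederrheiner
brms::hypothesis(legeleistung, "rasseK > rasseN")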

-
-
-

Frequentist versus Bayesian

Frequentist | Bayesian
Probability of the data, given the hypothesis; only a yes/no answer | Probability of the hypothesis, given the data
Confidence interval: in which interval do we expect 95% of the means of further random samples of the same sample size | Credibility interval: in which range of values does the true population mean lie with a probability of 95%
Assumption that there is one 'true' value, but observations lead to a distribution | Assumption that the observed values are true and have an intrinsic variance
-

The easier interpretability of Bayesian analyses is best illustrated -by the following figure:

-
[Figure] Source: Korner-Nievergelt and Hüppop (2016). Five possible results of estimating an effect, such as the difference in yield with different fertilization. The dots indicate the estimated differences, the vertical bars indicate the 95% uncertainty intervals (confidence interval or credible interval). The results of the corresponding null hypothesis test are shown in the first row. The second row shows the posterior probability for the hypothesis that the effect is "economically relevant". The background color distinguishes "economically relevant" (orange) from "economically not relevant" (light blue) effect sizes schematically.
-
-

An uncritical frequentist approach would result in: “The effect in -group/treatment A and E is significant, so we will use the tested -fertilizer for those groups. There seems to be no effect in the other -groups.”

-

However, there is an important effect in group B as well, and even -though the effect in group E is statistically significant, the effect -size on yield is so small that it probably does not make economic sense -to fertilize additionally.

-

The results of the Bayesian analysis (i.e., the probabilities for the -hypothesis that there is an effect) much more directly reflect these -interpretations.

-
-
-
-

Lecture 9: Bayesian Thinking

- -

Welcome to lecture 9 of Decision Analysis and Forecasting for -Agricultural Development. Feel free to bring up any questions -or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

Please watch the video and answer the multiple choice questions -below.


-

Summarize for yourself the difference between the frequentist and the Bayesian approach to knowledge generation. Which one do you feel more comfortable with? Are you ready to become a Bayesian?

-
-

Group discussion reading:

-
    -
  • Bertsch McGrayne (2011) 'The Theory That Would Not Die' (Chapter 1. Causes in the Air)
  • -
-
-
-
-

Seminar 9: Model forecasts

- -

Welcome to the ninth seminar of Decision Analysis and Forecasting for Agricultural Development. Feel free to bring up any questions or concerns in the Slack or to Dr. Cory Whitney or the course tutor.

-

In our seminar meeting each group will present the decision and -proposed impact pathway model that they will be programming for the rest -of the course.

-

Review the process in the Models lecture.

-
-

Building a decision model as an R function

-

We have already built some simple models in the Decision Models lecture and the Model programming seminar. Now we -can explore another simple model with decisionSupport.

-
-
-

Plot the impact pathway

-

Let's use the graph.formula function from the igraph library again (Csardi and Nepusz 2006), this time to create the graphical impact pathway of an investment into hail nets. Add another factor called 'Discount' that impacts the Net Present Value (NPV).

-
-
hail_path <- graph.formula(HailNet -+ Yield, 
-                          HailNet -+ Cost, 
-                          HailEvent -+ Yield,
-                          Yield -+ MarketPrice, 
-                          MarketPrice -+ NPV,
-                          Cost -+ NPV)
-
-plot(hail_path)
- -
-
-
hail_path <- graph.formula(HailNet -+ Yield, 
-                          HailNet -+ Cost, 
-                          HailEvent -+ Yield,
-                          Yield -+ MarketPrice, 
-                          MarketPrice -+ NPV,
-                          Cost -+ NPV,
-                          Discount -+ NPV)
-
-plot(hail_path)
-
-

That was just one of many ways to generate visual impact pathways. To -see more options see the Decision -Analysis Overview lecture materials.

-
-
-

Building the model

-

Here we generate an input table to feed the model function. Update this with the new hail probability variable from your graphical impact pathway. Call your new variable "p_hail", make the lower bound 0.02 and the upper bound 0.2, make the distribution "posnorm", make the label "% chance hail" and make the description "Probability of a hail storm".

-
-
hail_estimates <- data.frame(variable = c("yield", 
-                                           "var_CV", 
-                                           "initial_investment", 
-                                           "price"),
-                    lower = c(6000, 20, 500, 5),
-                    median = NA,
-                    upper = c(14000, 20, 1000, 80),
-                    distribution = c("posnorm", 
-                                     "const", 
-                                     "posnorm", 
-                                     "posnorm"),
-                    label = c("Yield (kg/ha)", 
-                              "Coefficient of variation", 
-                              "Investment cost (USD)", 
-                              "Market price (EUR/kg)"),
-                    Description = c("Yield under normal conditions",
-                                    "Coefficient of variation (measure of relative variability)",
-                                    "Investment cost", 
-                                    "Market price achieved for yields (EUR/kg)"))
-
-hail_estimates
- -
-
-
hail_estimates <- data.frame(variable = c("yield", 
-                                           "var_CV", 
-                                           "initial_investment", 
-                                           "price", 
-                                           "p_hail"),
-                    lower = c(6000, 20, 500, 5, 0.02),
-                    median = NA,
-                    upper = c(14000, 20, 1000, 80, 0.2),
-                    distribution = c("posnorm", 
-                                     "const", 
-                                     "posnorm", 
-                                     "posnorm",
-                                     "posnorm"),
-                    label = c("Yield (kg/ha)", 
-                              "Coefficient of variation", 
-                              "Investment cost (USD)", 
-                              "Market price (EUR/kg)", 
-                              "% chance hail"),
-                    Description = c("Yield under normal conditions",
-                                    "Coefficient of variation (measure of relative variability)",
-                                    "Investment cost", 
-                                    "Market price achieved for yields (EUR/kg)", 
-                                    "Probability of a hail storm"))
-
-hail_estimates
-
-

Here we create a function following the graphical impact pathway and -using the inputs above to calculate the Net Present Value for the -investment in hail nets. We use the vv() function from the -decisionSupport package to add more variation over time -(Luedeling et al. 2023). We also use the -chance_event() function to calculate a -hail_adjusted_yield for losses when there is hail.

-
-
hail_function <- function(){
-  
-# use vv() to add variability to the 
-# random draws of yield and of  price 
-# over a 20 year simulation 
-yields <- vv(var_mean = yield, 
-             var_CV = var_CV, 
+yields <- vv(var_mean = yield,
+             var_CV = var_CV,
              n = 20)
 
-prices <- vv(var_mean = price, 
-             var_CV = var_CV, 
+prices <- vv(var_mean = price,
+             var_CV = var_CV,
              n = 20)
 
-# use rep() to simulate the initial_investment 
+# use 'rep' to simulate the investment
 # only in the first year (assuming the net lasts 20 years)
-invest_costs <- c(initial_investment, rep(0, 19))
+invest_costs <- c(initial_investment, rep(0, 19))
 
-# use p_hail in the chance_event() 
-# to adjust yield for probability of hail
-# assuming no yield at all in the event of hail
-hail_adjusted_yield <- chance_event(chance = p_hail, 
+# use 'chance_event' to adjust yield for potential hail
+hail_adjusted_yield <- chance_event(chance = p_hail,
                                     value_if = 0,
                                     value_if_not = yield,
                                     n = 20)
 
 # calculate profit without net
-profit_no_net <- hail_adjusted_yield*prices
+profit_no_net <- hail_adjusted_yield*prices
 
 # calculate profit with the net
-profit_with_net <- (yields*prices)-invest_costs
+profit_with_net <- (yields*prices)-invest_costs
 
-# use 'discount' to calculate net present value 
-# 'discount_rate' is expressed in percent
-NPV_no_net <- discount(profit_no_net, discount_rate = 5, calculate_NPV = TRUE)
-NPV_net <- discount(profit_with_net, discount_rate = 5, calculate_NPV = TRUE)
+# use 'discount' to calculate net present value
+# 'discount_rate' is expressed in percent
+NPV_no_net <- discount(profit_no_net, discount_rate = 5, calculate_NPV = TRUE)
+NPV_net <- discount(profit_with_net, discount_rate = 5, calculate_NPV = TRUE)
 
-# calculate the overall NPV of the decision (do - don't do)
-NPV_decision <- NPV_net-NPV_no_net
+# calculate the overall NPV of the decision (do - don't do)
+NPV_decision <- NPV_net-NPV_no_net
 
 return(list(NPV_no_net =  NPV_no_net,
-            NPV_net =  NPV_net, 
+            NPV_net =  NPV_net,
             NPV_decision = NPV_decision))
-}
- -
-

We can use the mcSimulation() function from the -decisionSupport package to implement a model (Luedeling et al. 2023).

-
-
# Run the Monte Carlo simulation using the model function
-hail_mc_simulation <- mcSimulation(estimate = as.estimate(hail_estimates),
-                              model_function = hail_function,
-                              numberOfModelRuns = 200,
-                              functionSyntax = "plainNames")
-
-hail_mc_simulation
- -
-

Here we show the results of a Monte Carlo simulation (200 model runs) -for estimating the comparative profits with and without hail nets.

-
-
plot_distributions(mcSimulation_object = hail_mc_simulation, 
-                      vars = c("NPV_no_net", "NPV_net"),
-                      method = 'smooth_simple_overlay', 
-                      base_size = 7)
- -
-
-

Value of Information (VoI) analysis

-

Calculate Value of Information (VoI) analysis with the Expected Value -of Perfect Information (EVPI). As we learned in Lecture 8 on forecasts, EVPI helps us -determine if more research will be helpful in understanding the expected -change in the value of a decision outcome through reducing uncertainty -on a given variable.

-

Use the function data.frame() to transform the x and y outputs of the mcSimulation() function results for EVPI calculation. We use the multi_EVPI() function to calculate the EVPI for multiple independent variables. For the first_out_var argument we choose NPV_decision, since this is the first (and here the only) output variable in the subset and will be the subject of the EVPI.

-
-
# subset the outputs from the mcSimulation function (y) 
-# to run the multi_EVPI only on the variables that the we want 
-# (i.e. the NPV_decision)
-mcSimulation_table_hail <- data.frame(hail_mc_simulation$x, 
-                                 hail_mc_simulation$y[3])
-
-evpi_hail <- multi_EVPI(mc = mcSimulation_table_hail, 
-                   first_out_var = "NPV_decision")
- -
-

We use the function plot_evpi() on the results from -multi_EVPI() to plot the Expected Value of Perfect -Information (EVPI). Here we show the results with the standard settings. -The length of the bars is equal to EVPI.

-
-
plot_evpi(evpi_hail, decision_vars = "NPV_decision")
- -
-
-
-
-

Next steps

-

Once you have followed and run the code above on your machine it is a -good time to look through the outline of these procedures in the example -by Lanzanova et al. (2019) called ‘Sediment -management in a reservoir in Burkina Faso’. It runs a -Monte-Carlo-based selection of sedimentation management strategies for a -reservoir in the Upper Volta River Basin of Burkina Faso (Lanzanova et al. 2019).

-

Tip: If you want to play with this code you can find the Rmarkdown -file and the estimate -table in the GitHub repository.

-
- -
-
-

Lecture 10: Profile of a Decision Analyst

- - -

Welcome to lecture 10 of Decision Analysis and Forecasting -for Agricultural Development. Feel free to bring up any -questions or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

Those who work in Decision Analysis must be able to do a number of diverse jobs, from facilitating and integrating ideas into models to programming those models. In the following short lecture you will learn about the skills that are particularly important for the integration of knowledge, the facilitation of knowledge gathering processes and the programming of decision models. Please watch the video and answer the multiple choice questions below.

-
-

Reading

-

This week we will read and discuss Chapter 1 of ‘The Flaw of -Averages’ by Savage and Markowitz -(2009).

-

-Savage, Sam L., and Harry M. Markowitz. The Flaw of Averages: Why We -Underestimate Risk in the Face of Uncertainty. John Wiley & Sons, -2009.

-
-
-

Bonus reading

-

Sam Savage’s article The Flaw of -Averages in the Harvard Business Review.

-
-
-
-

Seminar 10: Model functions

- -

Welcome to the tenth seminar of Decision Analysis and Forecasting for Agricultural Development. Feel free to bring up any questions or concerns in the Slack or to Dr. Cory Whitney or the course tutor.

-

In this seminar we will go over some of the functions that are useful -for building parts of our models in the decisionSupport -package.

-
-

The value varier function

-

Many variables vary over time and it may not be desirable to ignore -this variation in time series analyses. The vv function -produces time series that contain variation from a specified mean and a -desired coefficient of variation. A trend can be added to this time -series.

-

The arguments for the vv function include:

-
    -
  • var_mean, which is the mean of the variable to be -varied
  • -
  • var_CV, which is the desired coefficient of variation -(in percent)
  • -
  • n, which is the integer; number of values to -produce
  • -
  • distribution, which is the probability distribution for introducing variation. This is currently only implemented for normal distributions
  • -
  • absolute_trend, which is the absolute increment in the -var_mean in each time step. Defaults to NA, -which means no such absolute value trend is present. If both absolute -and relative trends are specified, only original means are used
  • -
  • relative_trend, which is the relative trend in the -var_mean in each time step (in percent). Defaults to -NA, which means no such relative value trend is present. If -both absolute and relative trends are specified, only original means are -used
  • -
  • lower_limit, which is the lowest possible value for -elements of the resulting vector
  • -
  • upper_limit, which is the upper possible value for -elements of the resulting vector
  • -
-

Note that only one type of trend can be specified. If neither of the trend parameters is NA, the function uses only the original means.

-

The function produces a vector of n numeric values, -representing a variable time series, which initially has the mean -var_mean, and then increases according to the specified -trends. Variation is determined by the given coefficient of variation -var_CV.

-

Create a vector with the vv function and plot with the -base R plot function.

-
-
# reduce var_CV to 5 and change n to 40. Name the output 'valvar' and plot with base R  
-vv(var_mean = 100, 
-   var_CV = 10, 
-   n = 30)
- -
-
-
valvar <- vv(var_mean = 100, 
-             var_CV = 5, 
-             n = 40)
-
-plot(valvar)
-
-

Use the absolute_trend argument and plot with the base R -plot function.

-
-
# reduce var_mean to 50 and make absolute_trend 10. Name the output 'valvar' and plot with base R 
-vv(var_mean = 100, 
-           var_CV = 10, 
-           n = 30, 
-           absolute_trend = 5)
- -
-
-
valvar <- vv(var_mean = 50, 
-           var_CV = 10, 
-           n = 30, 
-           absolute_trend = 10)
-
-plot(valvar)
-
-

Use the relative_trend argument and plot with the base R -plot function.

-
-
# reduce var_CV to 5 and change n to 40. Name the output 'valvar' and plot with base R 
-vv(var_mean = 100, 
-             var_CV = 10, 
-             n = 30, 
-             relative_trend = 5)
- -
-
-
valvar <- vv(var_mean = 100, 
-             var_CV = 5, 
-             n = 40, 
-             relative_trend = 5)
-
-plot(valvar)
-
-
-
-

Simulate occurrence of random events

-

In many simulations, certain events can either occur or not, and -values for dependent variables can depend on which of the cases occurs. -This function randomly simulates whether events occur and returns output -values accordingly. The outputs can be single values or series of -values, with the option of introducing artificial variation into this -dataset.

-

The arguments for the chance_event function include:

-
    -
  • chance, which is the probability that the risky event -will occur (between 0 and 1)
  • -
  • value_if, which is the output value in case the event -occurs. This can be either a single numeric value or a numeric vector. -Defaults to 1.
  • -
  • value_if_not, which is the output value in case the -event does not occur. This can be either a single numeric value or a -numeric vector. If it is a vector, it must have the same length as -value_if
  • -
  • n, which is the number of times the risky event is -simulated. This is ignored if length(value_if)>1.
  • -
  • CV_if, which is the coefficient of variation for introducing randomness into the value_if data set. This defaults to 0 for no artificial variation. See documentation for the vv function for details.
  • CV_if_not, which is the coefficient of variation for introducing randomness into the value_if_not data set. This defaults to the value for CV_if. See documentation for the vv function for details.
  • one_draw, which is the boolean coefficient indicating -if event occurrence is determined only once (TRUE) with -results applying to all elements of the results vector, or if event -occurrence is determined independently for each element -(FALSE is the default).
  • -
-

The chance_event function provides a numeric vector of length n, containing outputs of a probabilistic simulation that assigns value_if if the event occurs, or value_if_not if it does not occur (both optionally with artificial variation).

-
-
# decrease the chance and value_if by half and repeat 20 times. Name the output 'chancevar' and plot with base R 
-chance_event(chance = 0.5, 
-             value_if = 6, 
-             n = 10)
- -
-
-
chancevar <- chance_event(chance = 0.25, 
-             value_if = 3, 
-             n = 20)
-
-plot(chancevar)
-
-

Use the value_if_not and CV_if -arguments.

-
-
# make the chance 10 percent, the value_if 5 and the value_if_not 20, repeat 100 times and reduce the coefficient of variation by half. Name the output 'chancevar' and plot with base R.
-chance_event(chance = 0.5,
-             value_if = 1,
-             value_if_not = 5,
-             n = 10,
-             CV_if = 20)
- -
-
-
chancevar <- chance_event(chance = 0.1,
-             value_if = 5,
-             value_if_not = 20,
-             n = 100,
-             CV_if = 10)
-
-plot(chancevar)
-
-
-
-

Gompertz function yield prediction for perennials

-

Yields of trees or other perennial plants have to be simulated in -order to predict the outcomes of many interventions. Unlike annual -crops, however, trees normally yield nothing for a few years after -planting, following which yields gradually increase until they reach a -tree-specific maximum. This is simulated with this function, which -assumes that a Gompertz function is a good way to describe this (based -on the general shape of the curve, not on extensive research…). The -function assumes that yields remain at the maximum level, once this is -reached. For long simulations, this may not be a valid assumption! The -function parameters are estimated based on yield estimates for two -points in time, which the user can specify. They are described by a year -number and by a percentage of the maximum yield that is attained at that -time.

-

The arguments for the gompertz_yield function -include:

-
    -
  • max_harvest, which is the maximum harvest from the tree -(in number of fruits, kg or other units)
  • -
  • time_to_first_yield_estimate, which is the year (or -other time unit) number, for which the first yield estimate is provided -by first_yield_estimate_percent
  • -
  • time_to_second_yield_estimate, which is the year (or -other time unit) number, for which the second yield estimate is provided -by second_yield_estimate_percent
  • -
  • first_yield_estimate_percent percentage of the maximum -yield that is attained in the year (or other time unit) given by -time_to_first_yield_estimate
  • -
  • second_yield_estimate_percent percentage of the maximum -yield that is attained in the year (or other time unit) given by -time_to_second_yield_estimate
  • -
  • n_years, which is the number of years to run the -simulation
  • -
  • var_CV, which is the coefficient indicating how much -variation should be introduced into the time series. If this is one -numeric value, then this value is used for all variables. The default is -0, for a time series with no artificially introduced variation. See -description of the vv function for more details on -this.
  • -
  • no_yield_before_first_estimate, which is the boolean variable indicating whether yields before the time unit indicated by time_to_first_yield_estimate should be 0.
  • -
-

The function provides a vector of n_years numeric -values, describing the simulated yield of the perennial. This starts at -0 and, if the simulation runs for a sufficient number of years, -approaches max_harvest. If var_CV>0, this -time series includes artificial variation.

-
-
# create a vector where the maximum harvest is 500, which is achieved in 10 years (i.e. 100% by the second yield estimate)
-gompertz_yield(max_harvest = 1000,
-               time_to_first_yield_estimate = 5,
-               first_yield_estimate_percent = 10,
-               time_to_second_yield_estimate = 15,
-               second_yield_estimate_percent = 90,
-               n_years = 30)
- -
-
-
gompertz_yield(max_harvest = 500,
-               time_to_first_yield_estimate = 5,
-               first_yield_estimate_percent = 10,
-               time_to_second_yield_estimate = 10,
-               second_yield_estimate_percent = 100,
-               n_years = 30)
-
-

In the seminar you will provide the updated version of your decision -model.

-
-
-

Bonus: Adding correlation to the model

-

To add correlation to your model visit the Adding -correlations to Decision Analysis models vignette.

-
-
-
-

Lecture 11: Communicating Decision Support

-

Welcome to lecture 11 of Decision Analysis and Forecasting -for Agricultural Development. Feel free to bring up any -questions or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-
-

Communicating results of Decision Analysis models

-

The results of a Monte Carlo model are not always easy to interpret and to communicate. One important step in communicating these often very large data sets is good visualization. In previous lectures and seminars we have covered the basics of plotting the distributions of model results for comparisons between decision options. In this lecture we will build on what we learned in the Decision Models lecture and the Model programming seminar to learn more about how to generate useful plots for communicating model results.

-

In the Model programming -seminar we generated a model called example_mc_simulation. -As a quick refresher change the plot of the results of that model from -hist_simple_overlay to boxplot_density by -using the method argument in the -plot_distributions function.

- -
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "hist_simple_overlay",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
- -
-
-
plot_distributions(mcSimulation_object = example_mc_simulation,
-                   vars = "final_result",
-                   method = "boxplot_density",
-                   old_names = "final_result",
-                   new_names = "Outcome distribution for profits")
-
-
-
-

Plot many results together

-

We can use the compound_figure function to create a -simple compound figure of model results and analyses of a binary -decision (do or do not do). The figure includes the distribution of the -expected outcome, the expected cashflow, as well as the variable -importance and the value of information.

-
-
# Create the estimate object:
-
-cost_benefit_table <- data.frame(label = c("Revenue", "Costs"),
-                                  variable = c("revenue", "costs"),
-                                  distribution = c("norm", "norm"),
-                                  lower = c(100,  500),
-                                  median = c(NA, NA),
-                                  upper = c(10000, 5000))
-
-# (a) Define the model function without name for the return value:
-
-profit1 <- function() {
-  Decision <- revenue - costs
-  cashflow <- rnorm(rep(revenue, 20))
-  return(list(Revenues = revenue,
-              Costs = costs, 
-              cashflow = cashflow, 
-              Decision = Decision))
-}
-
-compound_figure(model = profit1, 
-input_table = cost_benefit_table, 
-decision_var_name = "Decision",
-cashflow_var_name = "cashflow",
-model_runs = 1e2, 
-distribution_method = 'smooth_simple_overlay')
- -
-
-
-

Other visualization options

-

Here we demonstrate a few more graphical options for visualizing uncertainty intervals of outcomes of Monte Carlo simulations. We create a data set of yield distributions for three different farming practices and use the function mcmc_intervals() from the bayesplot library to plot the data set (Gabry and Mahr 2022).

-
-
test <- data.frame("practice 1" = rnorm(n = 1000, mean = 8, sd = 1.5), 
-                   "practice 2" = rnorm(n = 1000, mean = 7, sd = 1), 
-                   "practice 3" = rnorm(n = 1000, mean = 5, sd = 0.5))
-
-
-color_scheme_set("red")
-mcmc_intervals(test,prob = 0.5,prob_outer = 0.9,point_est = "median")
- -
-

Do the same with the mcmc_areas() function from the bayesplot library (Gabry and Mahr 2022).

-
-
test <- data.frame("practice 1" = rnorm(n = 1000, mean = 8, sd = 1.5), 
-                   "practice 2" = rnorm(n = 1000, mean = 7, sd = 1), 
-                   "practice 3" = rnorm(n = 1000, mean = 5, sd = 0.5))
-
-color_scheme_set("blue")
-
-mcmc_areas(test,prob = 0.9,point_est = "median")
- -
-
-
-

Comparative density curves

-

We can also use geom_density() in ggplot2 to compare the spread of different distributions (Wickham, Chang, et al. 2023):

-
-
test <- data.frame("practice 1" = rnorm(n = 1000, mean = 8, sd = 1.5), 
-                   "practice 2" = rnorm(n = 1000, mean = 7, sd = 1), 
-                   "practice 3" = rnorm(n = 1000, mean = 5, sd = 0.5))
-
-stacked_test <- stack(test)
-
-ggplot(stacked_test, 
-       aes(x=values,group=ind,fill=ind )) +
-        geom_density(colour=NA,alpha=.5) +
-        ylab("Probability density") +
-        xlab("Yield")
- -
-
-
-

Comparative histogram

-

Use the ggplot2 geom_histogram() function to show the histograms of the data in comparison:

-
-
test <- data.frame("practice 1" = rnorm(n = 1000, mean = 8, sd = 1.5), 
-                   "practice 2" = rnorm(n = 1000, mean = 7, sd = 1), 
-                   "practice 3" = rnorm(n = 1000, mean = 5, sd = 0.5))
-
-stacked_test <- stack(test)
-
-ggplot(stacked_test,aes(x=values))+ 
-  geom_histogram(data=subset(stacked_test,ind =='practice.1'),
-                 aes(fill = ind), alpha = 0.5, bins = 150) + 
-  geom_histogram(data=subset(stacked_test,ind == 'practice.2'),
-                 aes(fill = ind), alpha = 0.5, bins = 150) +
-  geom_histogram(data=subset(stacked_test,ind == 'practice.3'),
-                 aes(fill = ind), alpha = 0.5, bins = 150) 
- -
-
-
-

Plot cashflow

-

The plot_cashflow function from the -decisionSupport package (Luedeling -et al. 2023) creates a cashflow plot of the returned list of -related outputs from the mcSimulation function using -ggplot2 (Wickham, Chang, et al. 2023). The -function automatically defines quantiles (5 to 95% and 25 to 75%) as -well as a value for the median.

-
-
# Plotting the cashflow:
-
-# Create the estimate object (for multiple options):
-
-variable = c("revenue_option1", "costs_option1", "n_years", 
-             "revenue_option2", "costs_option2")
-distribution = c("norm", "norm", "const", "norm", "norm")
-lower = c(10000,  5000, 20, 8000,  500)
-upper = c(100000, 50000, 20, 80000,  30000)
-
-costBenefitEstimate <- as.estimate(variable, distribution, lower, upper)
-
-# Define the model function without name for the return value:
-
-profit1 <- function(x) {
-  
-cashflow_option1 <- vv(revenue_option1 - costs_option1, n = n_years, var_CV = 100)
-cashflow_option2 <- vv(revenue_option2 - costs_option2, n = n_years, var_CV = 100)
-
-return(list(Revenues_option1 = revenue_option1,
-            Revenues_option2 = revenue_option2,
-            Costs_option1 = costs_option1,
-            Costs_option2 = costs_option2,
-            Cashflow_option_one = cashflow_option1,
-            Cashflow_option_two = cashflow_option2))
-}
-
-# Perform the Monte Carlo simulation:
-
-predictionProfit1 <- mcSimulation(estimate = costBenefitEstimate,
-                                  model_function = profit1,
-                                  numberOfModelRuns = 10000,
-                                  functionSyntax = "plainNames")
-
-
-# Plot the cashflow distribution over time
-
-plot_cashflow(mcSimulation_object = predictionProfit1, cashflow_var_name = "Cashflow_option_one",
-              x_axis_name = "Years with intervention",
-              y_axis_name = "Annual cashflow in USD",
-              color_25_75 = "green4", color_5_95 = "green1",
-              color_median = "red")
- -
-

Plot the cashflow with panels to compare the cashflow distribution -over time for multiple decision options:

-
-
# Plotting the cashflow:
-
-# Create the estimate object (for multiple options):
-
-variable = c("revenue_option1", "costs_option1", "n_years", 
-             "revenue_option2", "costs_option2")
-distribution = c("norm", "norm", "const", "norm", "norm")
-lower = c(10000,  5000, 10, 8000,  500)
-upper = c(100000, 50000, 10, 80000,  30000)
-
-costBenefitEstimate <- as.estimate(variable, distribution, lower, upper)
-
-# Define the model function without name for the return value:
-
-profit1 <- function(x) {
-  
-cashflow_option1 <- vv(revenue_option1 - costs_option1, n = n_years, var_CV = 100)
-cashflow_option2 <- vv(revenue_option2 - costs_option2, n = n_years, var_CV = 100)
-
-return(list(Revenues_option1 = revenue_option1,
-            Revenues_option2 = revenue_option2,
-            Costs_option1 = costs_option1,
-            Costs_option2 = costs_option2,
-            Cashflow_option_one = cashflow_option1,
-            Cashflow_option_two = cashflow_option2))
-}
-
-# Perform the Monte Carlo simulation:
-
-predictionProfit1 <- mcSimulation(estimate = costBenefitEstimate,
-                                  model_function = profit1,
-                                  numberOfModelRuns = 10000,
-                                  functionSyntax = "plainNames")
-
-plot_cashflow(mcSimulation_object = predictionProfit1, 
-              cashflow_var_name = c("Cashflow_option_one", "Cashflow_option_two"),
-              x_axis_name = "Years with intervention",
-              y_axis_name = "Annual cashflow in USD",
-              color_25_75 = "green4", color_5_95 = "green1",
-              color_median = "red", 
-              facet_labels = c("Option 1", "Option 2"))
- -
-
-
-

Reading

- - - -

This week we will read and discuss chapter four of Hubbard’s ‘How To -Measure Anything’.

-

Hubbard, Douglas W. How To Measure Anything: Finding the Value of Intangibles in Business. 2nd ed. Hoboken, New Jersey: John Wiley & Sons, 2014.

-
-
-

Bonus: More plotting options

-
-
-

Violin & box plot overlays

-

Here we use R’s built in OrchardSprays data to run the -example from the tidyverse Violin -plot examples (Wickham 2023b).

-
-
ggplot(OrchardSprays, aes(y = decrease, x = treatment, fill = treatment))+
-  geom_violin() +
-  geom_boxplot(width = 0.1) +
-  theme(legend.position = "none")
- -
-
-
-

Ridge line plot

-

A variation on the example from edav using the -ggridges library (Wilke -2022).

-
-
ggplot(OrchardSprays, 
-       aes(x=decrease,y = treatment,fill = treatment)) +
-  geom_density_ridges_gradient(scale=2) + 
-  theme_ridges() +
-  theme(legend.position = "none")
- -
-

More examples on the rdrr.io CRAN -website.

-

To see more options for plotting high dimensional data visit the High Dimensional Data vignette.

-
-
-
-

Seminar 11: Academic Writing

- -

Welcome to the eleventh seminar of Decision Analysis and -Forecasting for Agricultural Development. Feel free to bring up -any questions or concerns in the Slack or to Dr. Cory -Whitney or the course tutor.

-

In this seminar we will build on some of the things we learned in the -Using RMarkdown seminar. We will use Rmarkdown (Allaire et al. 2023) to create a report on our -R code.

-

To run this you will need to load library(rmarkdown) and library(knitr).
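In code, that is:

library(rmarkdown)
library(knitr)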

- - -

Find the .Rmd and .bib files referenced in the video in the RMarkdown_template Git repository.
  • Open an RMarkdown file

(Figure: RMarkdown_options.png)

Yet Another Markup Language

  • When you open an RMarkdown file you get a default document with examples and the standard settings, as we discussed in the Using RMarkdown seminar. The top of the document contains information between two sets of three dashes, like this: ---. The language within these marks is known as YAML and it sets the general layout and format of your document.
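A minimal YAML header might look like the following (the title, author, and date values are placeholders; html_document is one of several possible output formats):

```yaml
---
title: "Decision Analysis Report"
author: "Your Name"
date: "2024-07-05"
output: html_document
---
```

Everything between the two --- lines is read by the knitting process before the rest of the document is rendered.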

Reuse chunk options

In case of repeated formatting in different code chunks, we can use the knitr opts.label option to reuse what we wrote and avoid repetition (Xie 2023), following the Don't Repeat Yourself (DRY) principle.
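As a minimal sketch (the template name `fig_quiet` is invented for this example), register a bundle of chunk options once in your setup chunk with knitr's opts_template:

```r
# Register a reusable bundle of chunk options under the label "fig_quiet"
knitr::opts_template$set(fig_quiet = list(echo = FALSE,
                                          message = FALSE,
                                          warning = FALSE,
                                          fig.width = 8,
                                          fig.height = 5))
```

Any later chunk can then inherit the whole bundle at once with the chunk header option `opts.label = "fig_quiet"` instead of repeating all five settings.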

Hosting an html

You can host your html files in your GitHub repository and view them with the htmlpreview.github.io tool.

The website hosting the raw html file for this course is:

http://htmlpreview.github.io/?https://github.com/CWWhitney/Decision_Analysis_Course/blob/main/Index.html

This works by adding http://htmlpreview.github.io/? in front of the link to the html file in an open-access GitHub repository, in this case https://github.com/CWWhitney/Decision_Analysis_Course/blob/main/Index.html.
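The prefixing can also be scripted; a small sketch in R (paste0() simply concatenates the two strings):

```r
# Build an htmlpreview.github.io link for an html file in a public GitHub repo
repo_html <- "https://github.com/CWWhitney/Decision_Analysis_Course/blob/main/Index.html"
preview_url <- paste0("http://htmlpreview.github.io/?", repo_html)
preview_url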

Value of information

Run the hail risk model with the mcSimulation function, then calculate the expected value of perfect information (EVPI) for the decision outcome:

```r
###### Run HAIL Risk model #######
hail_mc_simulation <- mcSimulation(estimate = as.estimate(hail_estimates),
                                   model_function = hail_function,
                                   numberOfModelRuns = 100,
                                   functionSyntax = "plainNames")

###### HAIL Risk evpi #######
mcSimulation_table_hail <- data.frame(hail_mc_simulation$x,
                                      hail_mc_simulation$y[3])

evpi_hail <- multi_EVPI(mc = mcSimulation_table_hail,
                        first_out_var = "NPV_decision")
```
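The `hail_estimates` input table and `hail_function` model are defined elsewhere in the course material; a minimal hypothetical stand-in so the chunk above can run on its own (every value and every name other than hail_estimates and hail_function is invented for illustration, not the course's actual model):

```r
library(decisionSupport)

# Hypothetical input estimates: yield, price, hail probability and damage
hail_estimates <- data.frame(
  variable = c("yield", "price", "chance_hail", "hail_damage", "net_cost"),
  lower = c(6000, 0.3, 0.02, 0.1, 1000),
  median = NA,
  upper = c(14000, 0.8, 0.2, 0.9, 4000),
  distribution = c("posnorm", "posnorm", "posnorm", "posnorm", "posnorm"),
  label = c("Yield (kg/ha)", "Price (USD/kg)", "Chance of hail",
            "Share of yield lost to hail", "Cost of hail net (USD/ha)"))

# Hypothetical model: compare profit with and without a protective hail net
hail_function <- function(){
  hail_occurs <- chance_event(chance_hail)      # 1 in hail years, 0 otherwise
  profit_no_net <- yield * price * (1 - hail_occurs * hail_damage)
  profit_with_net <- yield * price - net_cost   # net assumed fully protective
  NPV_decision <- profit_with_net - profit_no_net
  return(list(profit_no_net = profit_no_net,
              profit_with_net = profit_with_net,
              NPV_decision = NPV_decision))
}
```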

References

Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2023. Rmarkdown: Dynamic Documents for R. https://github.com/rstudio/rmarkdown.

Arnold, Jeffrey B. 2023. Ggthemes: Extra Themes, Scales and Geoms for Ggplot2. https://github.com/jrnold/ggthemes.

Bertsch McGrayne, Sharon. 2011. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted down Russian Submarines, & Emerged Triumphant from Two Centuries of Controversy. Yale University Press.

Csardi, Gabor, and Tamas Nepusz. 2006. "The Igraph Software Package for Complex Network Research." InterJournal Complex Systems: 1695. https://igraph.org.

Do, Hoa, Eike Luedeling, and Cory Whitney. 2020. "Decision Analysis of Agroforestry Options Reveals Adoption Risks for Resource-Poor Farmers." Agronomy for Sustainable Development 40 (3): 20. https://doi.org/10.1007/s13593-020-00624-5.

Fernandez, Eduardo, Gonzalo Rojas, Cory Whitney, Italo F. Cuneo, and Eike Luedeling. 2021. "Data and Scripts: Adapting Sweet Cherry Orchards to Extreme Weather Events – Decision Analysis in Support of Farmers' Investments in Central Chile."

Fernandez, Eduardo, Cory Whitney, and Eike Luedeling. 2021. "Applying the mcSimulation Function in decisionSupport." https://cran.r-project.org/web/packages/decisionSupport/vignettes/example_decision_function.html.

Gabry, Jonah, and Tristan Mahr. 2022. Bayesplot: Plotting for Bayesian Models. https://mc-stan.org/bayesplot/.

Galindo-Prieto, Beatriz, Lennart Eriksson, and Johan Trygg. 2014. "Variable Influence on Projection (VIP) for Orthogonal Projections to Latent Structures (OPLS): Variable Influence on Projection for OPLS." Journal of Chemometrics 28 (8): 623–32. https://doi.org/10.1002/cem.2627.

Howard, Ronald A., and Ali E. Abbas. 2015. Foundations of Decision Analysis. NY, NY: Prentice Hall.

Hubbard, Douglas W. 2014. How To Measure Anything: Finding the Value of Intangibles in Business. 2nd ed. Hoboken, New Jersey: John Wiley & Sons.

Iannone, Richard. 2023. DiagrammeR: Graph/Network Visualization. https://github.com/rich-iannone/DiagrammeR.

Kahneman, Daniel, and Patrick Egan. 2011. Thinking, Fast and Slow. Vol. 1. New York: Farrar, Straus and Giroux.

Lanzanova, Denis, Cory Whitney, Keith Shepherd, and Eike Luedeling. 2019. "Improving Development Efficiency Through Decision Analysis: Reservoir Protection in Burkina Faso." Environmental Modelling & Software 115 (May): 164–75. https://doi.org/10.1016/j.envsoft.2019.01.016.

Luedeling, Eike, Lutz Goehring, Katja Schiffers, Cory Whitney, and Eduardo Fernandez. 2023. decisionSupport: Quantitative Support of Decision Making Under Uncertainty. http://www.worldagroforestry.org/.

Luedeling, Eike, and Jan de Leeuw. 2014. "Water for Wajir." Worldagroforestry.org.

Luedeling, Eike, Arjen L. Oord, Boniface Kiteme, Sarah Ogalleh, Maimbo Malesu, Keith D. Shepherd, and Jan De Leeuw. 2015. "Fresh Groundwater for Wajir – Ex-Ante Assessment of Uncertain Benefits for Multiple Stakeholders in a Water Supply Project in Northern Kenya." Frontiers in Environmental Science 3, article 16: 1–18. https://doi.org/10.3389/fenvs.2015.00016.

Luedeling, Eike, and Keith Shepherd. 2016. "Decision-Focused Agricultural Research." Solutions 7 (5): 46–54.

Robinson, Ken. 2017. Out of Our Minds: Learning to Be Creative. Oxford, United Kingdom: John Wiley & Sons.

Rojas, Gonzalo, Eduardo Fernandez, Cory Whitney, Eike Luedeling, and Italo F. Cuneo. 2021. "Adapting Sweet Cherry Orchards to Extreme Weather Events – Decision Analysis in Support of Farmers' Investments in Central Chile." Agricultural Systems 187 (February): 103031. https://doi.org/10.1016/j.agsy.2020.103031.

Ruett, Marius, Cory Whitney, and Eike Luedeling. 2020. "Model-Based Evaluation of Management Options in Ornamental Plant Nurseries." Journal of Cleaner Production 271 (June): 122653. https://doi.org/10.1016/j.jclepro.2020.122653.

Savage, Sam L., and Harry M. Markowitz. 2009. The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty. Hoboken, New Jersey: John Wiley & Sons.

Shepherd, Keith, Douglas Hubbard, Norman Fenton, Karl Claxton, Eike Luedeling, and Jan de Leeuw. 2015. "Development Goals Should Enable Decision-Making." Nature 523 (7559): 152–54.

Slowikowski, Kamil. 2023. Ggrepel: Automatically Position Non-Overlapping Text Labels with Ggplot2. https://github.com/slowkow/ggrepel.

Tetlock, Philip E., and Dan Gardner. 2015. Superforecasting: The Art and Science of Prediction. New York: Crown.

Whitehead, Alfred North, and Bertrand Russell. 2011. Principia Mathematica. San Bernardino, CA: Rough Draft Printing.

Whitney, Cory, Denis Lanzanova, Caroline Muchiri, Keith D. Shepherd, Todd S. Rosenstock, Michael Krawinkel, John R. S. Tabuti, and Eike Luedeling. 2018. "Probabilistic Decision Tools for Determining Impacts of Agricultural Development Policy on Household Nutrition." Earth's Future 6 (3): 359–72. https://doi.org/10.1002/2017EF000765.

Wickham, Hadley. 2023a. Plyr: Tools for Splitting, Applying and Combining Data. http://had.co.nz/plyr.

———. 2023b. Tidyverse: Easily Install and Load the Tidyverse. https://tidyverse.tidyverse.org.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D'Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. "Welcome to the tidyverse." Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2023. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://ggplot2.tidyverse.org.

Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation. https://dplyr.tidyverse.org.

Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. Readr: Read Rectangular Text Data. https://readr.tidyverse.org.

Wilke, Claus O. 2022. Ggridges: Ridgeline Plots in Ggplot2. https://wilkelab.org/ggridges/.

Wold, Svante, Michael Sjöström, and Lennart Eriksson. 2001. "PLS-Regression: A Basic Tool of Chemometrics." Chemometrics and Intelligent Laboratory Systems, PLS Methods, 58 (2): 109–30. https://doi.org/10.1016/S0169-7439(01)00155-1.

Xie, Yihui. 2023. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.
diff --git a/Lecture_Schedule.Rmd b/Lecture_Schedule.Rmd
index 6e8b46d..ddb4927 100644
--- a/Lecture_Schedule.Rmd
+++ b/Lecture_Schedule.Rmd
@@ -3,6 +3,6 @@
 ```{r echo = FALSE, results = 'asis', message=FALSE, warning=FALSE}
 library(knitr)
 library(pander)
-lecture_schedule<-read.csv("lecture_schedule.csv")
+lecture_schedule<-read.csv("lecture_schedule2024.csv")
 kable(lecture_schedule)
 ```
diff --git a/Seminar_Schedule.Rmd b/Seminar_Schedule.Rmd
index e9bb285..e548389 100644
--- a/Seminar_Schedule.Rmd
+++ b/Seminar_Schedule.Rmd
@@ -3,6 +3,6 @@
 ```{r echo = FALSE, results = 'asis', warning=FALSE}
 library(knitr)
 library(pander)
-seminar_schedule<-read.csv("seminar_schedule.csv")
+seminar_schedule<-read.csv("seminar_schedule2024.csv")
 kable(seminar_schedule)
 ```
diff --git a/bib/packages.bib b/bib/packages.bib
index 489f1d4..518cc07 100644
--- a/bib/packages.bib
+++ b/bib/packages.bib
@@ -10,8 +10,8 @@ @Manual{R-base
 @Manual{R-bayesplot,
   title = {bayesplot: Plotting for Bayesian Models},
   author = {Jonah Gabry and Tristan Mahr},
-  year = {2022},
-  note = {R package version 1.10.0},
+  year = {2024},
+  note = {R package version 1.11.1},
   url = {https://mc-stan.org/bayesplot/},
 }
@@ -35,7 +35,7 @@ @Manual{R-dplyr
   title = {dplyr: A Grammar of Data Manipulation},
   author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
   year = {2023},
-  note = {R package version 1.1.3},
+  note = {R package version 1.1.4},
   url = {https://dplyr.tidyverse.org},
 }
@@ -77,24 +77,25 @@ @Manual{R-ggrepel
 @Manual{R-ggridges,
   title = {ggridges: Ridgeline Plots in ggplot2},
   author = {Claus O. Wilke},
-  year = {2022},
-  note = {R package version 0.5.4},
+  year = {2024},
+  note = {R package version 0.5.6},
   url = {https://wilkelab.org/ggridges/},
 }

 @Manual{R-ggthemes,
   title = {ggthemes: Extra Themes, Scales and Geoms for ggplot2},
   author = {Jeffrey B. Arnold},
-  year = {2023},
-  note = {R package version 5.0.0},
-  url = {https://github.com/jrnold/ggthemes},
+  year = {2024},
+  note = {R package version 5.1.0,
+https://github.com/jrnold/ggthemes},
+  url = {https://jrnold.github.io/ggthemes/},
 }

 @Manual{R-igraph,
   title = {igraph: Network Analysis and Visualization},
   author = {Gábor Csárdi and Tamás Nepusz and Vincent Traag and Szabolcs Horvát and Fabio Zanini and Daniel Noom and Kirill Müller},
   year = {2023},
-  note = {R package version 1.6.0},
+  note = {R package version 1.5.1},
   url = {https://r.igraph.org/},
 }
@@ -168,16 +169,16 @@ @Manual{R-rmarkdown
 @Manual{R-shiny,
   title = {shiny: Web Application Framework for R},
   author = {Winston Chang and Joe Cheng and JJ Allaire and Carson Sievert and Barret Schloerke and Yihui Xie and Jeff Allen and Jonathan McPherson and Alan Dipert and Barbara Borges},
-  year = {2023},
-  note = {R package version 1.7.5.1},
+  year = {2024},
+  note = {R package version 1.8.1},
   url = {https://shiny.posit.co/},
 }

 @Manual{R-stringr,
   title = {stringr: Simple, Consistent Wrappers for Common String Operations},
   author = {Hadley Wickham},
-  year = {2022},
-  note = {R package version 1.5.0,
+  year = {2023},
+  note = {R package version 1.5.1,
 https://github.com/tidyverse/stringr},
   url = {https://stringr.tidyverse.org},
 }
diff --git a/lecture_schedule2024.csv b/lecture_schedule2024.csv
new file mode 100644
index 0000000..db54a2a
--- /dev/null
+++ b/lecture_schedule2024.csv
@@ -0,0 +1,11 @@
+Lecture,Date,Materials,Reading
+1,"Wednesday, April 10, 2024",[Introduction](#introduction),Watch: Sir Ken Robinson / Hans Rosling (optional)
+2,"Friday, April 12, 2024",[Decision Analysis Overview](#decision_analysis),@hubbard_how_2014 (Chapter 1. The Challenge of Intangibles)
+3,"Wednesday, April 17, 2024",[Defining a Decision](#define_decision),@howard_foundations_2015 (Chapter 1. Introduction to Quality Decision Making)
+4,"Friday, April 19, 2024",[Decision Analysis Case Studies](#case_studies),@shepherd_development_2015
+5,"Wednesday, April 24, 2024",[Participatory Methods](#participatory_methods),@luedeling_decision-focused_2016
+6,"Friday, April 26, 2024",[Building Decision Models](#decision_models),@do_decision_2020
+7,"Friday, May 3, 2024",[Using Models to Create Forecasts](#forecasts),@tetlock_superforecasting_2015 (Chapter 1. An Optimistic Skeptic)
+8,"Wednesday, May 8, 2024",[Bayesian Thinking](#bayes),@bertsch_mcgrayne_theory_2011 (Chapter 1. Causes in the Air)
+9,"Friday, May 10, 2024",[Calibrating experts](#calibration_lecture),@kahneman_thinking_2011 (Chapter 1)
+10,"Friday, May 17, 2024",[Profile of a Decision Analyst](#analyst_profile),@savage_flaw_2009 (Chapter 1)
diff --git a/seminar_schedule2024.csv b/seminar_schedule2024.csv
new file mode 100644
index 0000000..ce5f74b
--- /dev/null
+++ b/seminar_schedule2024.csv
@@ -0,0 +1,17 @@
+Week,Date,Materials,Assignment
+1,"Wednesday, May 28, 2024",[Group Work Overview](#tools),Install R and Rstudio
+2,"Friday, May 31, 2024",[Using R and RStudio](#r_rstudio),Share R scripts (part 1)
+3,"Wednesday, June 5, 2024",[Using R and RStudio continued](#rstudio_continued),Share R scripts (part 2)
+4,"Friday, June 7, 2024",[Using RMarkdown](#rmarkdown),Share Rmarkdown file
+5,"Wednesday, June 12, 2024",[Using git and Github](#github_git),Start a Github repo
+6,"Friday, June 14, 2024",[Model Programming](#model_programming),Share R scripts (part 3)
+7,"Wednesday, June 19, 2024",[Calibration training](#calibration_seminar),Participate in calibration training
+8,"Friday, June 21, 2024",[Model functions](#model_functions),Share updated model
+9,"Wednesday, June 26, 2024",[Value of information](#model_share_seminar),Share R scripts (part 3)
+10,"Friday, June 28, 2024",[Communicating Decision Support](#communicating),
+11,"Wednesday, July 03, 2024",[Model forecasts](#forecast_seminar),Share initial forecasts
+12,"Friday, July 5, 2024",[Citation Management](#citations),Join Zotero Group
+13,"Wednesday, July 10, 2024",Guest Lecture,
+14,"Friday, July 12, 2024",Consultation,
+15,"Wednesday, July 17, 2024",Groups present / discuss final model,Present
+16,"Friday, July 19, 2024",Presentations continued,Present