---
title: |
  |
  |
  | \vspace{1cm}Literate programming with Python, R, Julia and Stata*^[*Corresponding address: miguel.portela@eeg.uminho.pt. The current template adapts part of the Rmd code by [Paul C. Bauer](https://github.com/paulcbauer/Writing_a_reproducable_paper_in_rmarkdown), Mannheim Centre for European Social Research.]\vspace{0.5cm}
  |
author: |
  | Miguel Portela
  |
  | Universidade do Minho
  |
date: |
  |
  |
  | `r gsub("^0", "", format(Sys.time(), "%d %B, %Y"))`
  |
  |
linestretch: 1.2
colorlinks: true
abstract: \noindent\setstretch{1}In this presentation I will discuss how we can enhance the workflow by using literate programming to combine key features of different statistical packages, namely Stata, R, Julia and Python, on the one hand, and LaTeX as the typesetting system on the other. The goal is to demonstrate and share a template aiming at producing a highly automated report, or research paper, within the same framework. The tasks will run from exploratory data analysis to regression analysis, where the output, from summary to regression tables and figures, is seamlessly included in the final document. Furthermore, important elements of LaTeX editing, such as automatic referencing, will be highlighted. We aim at freeing the researcher from repetitive tasks to focus on critical and creative writing. Efficiency and replicability will be at the core of the discussion. RStudio will be used to edit and compile R Markdown. The focus will be on producing PDF outputs. In the presentation I will make use of packages such as bookdown, knitr, stargazer, dlookr, ggplot2, plotly, Statamarkdown, reticulate, JuliaCall, pandas, numpy, matplotlib or FixedEffectModels.\vspace{.8cm}
bibliography: references.bib
csl: american-sociological-association.csl
output:
  bookdown::pdf_document2:
    includes:
    toc: no
    keep_tex: true
    latex_engine: xelatex
mainfont: Times New Roman
sansfont: Times New Roman
fontsize: 12pt
link-citations: true
documentclass: article
geometry: margin=1in
always_allow_html: yes
header-includes:
   - \usepackage{dcolumn}
   - \usepackage{color}
   - \usepackage{pdfpages}
   - \usepackage{amsmath}
---


```{r Setup, include=FALSE}
knitr::opts_chunk$set(cache = FALSE)
# Use cache = TRUE if you want to speed up compilation

# A function to allow for showing some of the inline code
rinline <- function(code){
  html <- '<code  class="r">``` `r CODE` ```</code>'
  sub("CODE", code, html)
}

# The configuration below runs at the top level of the chunk; it was
# previously trapped inside rinline() and therefore never executed

##https://opensource.com/article/19/5/python-3-default-mac

Sys.setenv(RETICULATE_PYTHON = "C:/Users/mangelo.EEG/AppData/Local/Microsoft/WindowsApps/python3")

##install.packages("reticulate")
library(reticulate)
##use_python("/Library/Frameworks/Python.framework/Versions/3.8/bin/python3")

use_virtualenv("C:/Users/mangelo.EEG/Documents/python")

##knitr::opts_chunk$set(python.reticulate=FALSE)

# library(devtools) # before this you may need to install devtools
# install_github("hemken/Statamarkdown")

library(JuliaCall)

library(Statamarkdown)
stataexe <- "C:/Program Files/Stata16/StataMP-64.exe"
knitr::opts_chunk$set(engine.path=list(stata=stataexe))
```

```{r RETICULATE,echo=FALSE}
Sys.setenv(RETICULATE_PYTHON = "C:/Program Files/Python39/python.exe")
library(reticulate)
use_virtualenv("C:/Users/mangelo.EEG/Documents/python")
```


```{r Libraries Stata and Julia, echo=FALSE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}
library(stargazer)
library(Statamarkdown)
stataexe <- "C:/Program Files/Stata16/StataMP-64.exe"
#stataexe <- "/Applications/Stata15/StataMP.app/Contents/MacOS//stata-mp"
knitr::opts_chunk$set(engine.path=list(stata=stataexe))
library(JuliaCall)
options(JULIA_HOME = "C:/Users/mangelo.EEG/AppData/Local/Programs/Julia-1.6.1/bin")
julia_setup()
```

\clearpage

\renewcommand{\baselinestretch}{0.5}\normalsize
[//]: # \tableofcontents
\renewcommand{\baselinestretch}{1.1}\normalsize

\clearpage


# Exploratory data analysis

I start by exploring the **NLSWORK** data (National Longitudinal Survey of Young Women, 14-26 years of age in 1968).


```{r ExPanDaR, echo=FALSE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}

## ExPanDaR: Explore Panel Data Interactively

  library(ExPanDaR)

    ## type ExPanD() in the Console

setwd("C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs")

library(haven)
library(ggplot2)

nlswork <- read_dta("C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/nlswork.dta")

nls<-data.frame(nlswork)

attach(nlswork)

## head(nlswork)  # run in the Console; with results="asis" the printed tibble would be parsed as Markdown

library(stargazer)
stargazer(nls,
          title = "Summary statistics",
          label="tab1",
          table.placement = "ht",
          header=FALSE)

```


```{r Stats, echo=FALSE, message=FALSE}
library(dplyr)
library(dlookr)
library(ggplot2)

##eda_report(nlswork,output_dir = "C:/Users/mangelo.EEG/Documents/GitHub/prjs/reports/",output_file = "eda_report.pdf")

## The data

names(nlswork)
##summary(nlswork)

## Missing values

library("visdat")

  vis_dat(nlswork)

## https://cran.r-project.org/web/packages/naniar/vignettes/naniar-visualisation.html

library(naniar)

  vis_miss(nlswork)

  gg_miss_upset(nlswork)

## GRAPHS
dplyr::glimpse(nlswork$ln_wage)
d <- density(nlswork$ln_wage, na.rm = TRUE)  # explicit data frame, no reliance on attach()
plot(d)

plot(nls$ln_wage,nls$ttl_exp)

ggplot(nlswork,
       aes(x = hours,
           y = year)) +
geom_miss_point()

ggplot(nlswork,
       aes(x = hours,
           y = year)) +
geom_miss_point() +
facet_wrap(~race)  # facet_wrap() needs a formula (or vars()), not a bare column

stats <- summary(nlswork$age)

```

The average age in our data is `r round(mean(nlswork$age,na.rm=TRUE),1)`.

# Tables {#sec:tables}
R Markdown can now produce well-formatted PDF tables from our output. For `stargazer` the label is set inside the function call, while for `kable` it is taken from the chunk name.

## stargazer(): Summary and regression tables

Table \@ref(tab1) shows the data's summary statistics.^[You can reference the table as \@ref(tab1cars).] `stargazer()` is an excellent solution for exporting output.


```{r Export Stats, echo=TRUE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}
library(stargazer)
stargazer(cars,
          title = "Summary table with stargazer",
          label="tab1cars",
          table.placement = "H",
          header=FALSE)
```

Table \@ref(tab2) reports regression outputs. Name the models so that you can refer to them by name in the text (M1, M2, M3).

```{r Export Regressions, echo=TRUE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}
library(stargazer)
model1 <- lm(speed ~ dist, data = cars)
model2 <- lm(speed ~ dist, data = cars)
model3 <- lm(dist ~ speed, data = cars)
stargazer(model1, model2, model3,
          title = "Regression table with stargazer",
          label="tab2",
          table.placement = "H",
          column.labels = c("M1", "M2", "M3"),
          model.numbers = FALSE,
          header=FALSE)
```

# Figures

## Graphs with R
You can insert figures like this. One would like to produce and insert them on the fly in the `.rmd` file. Figure \@ref(fig:fig-1) is such an example.

```{r fig-1, fig.align="center", fig.cap="Scatterplot of Speed and Distance", fig.pos="H", message=FALSE, warning=FALSE, paged.print=FALSE}
plot(cars$speed, cars$dist)
```

However, in some cases it does not work.

## Example: ggplot2 graphs
See the `ggplot2` output reported in Figure \@ref(fig:fig-2).

```{r fig-2, fig.align="center", fig.cap="Miles per gallon according to the weight", fig.pos="H", fig.width=6, fig.height=3, message=FALSE, warning=FALSE, paged.print=FALSE}
mtcars$cyl <- as.factor(mtcars$cyl) # Convert cyl to factor
library(ggplot2)
ggplot(mtcars, aes(x=wt, y=mpg, shape=cyl)) + geom_point() +
  labs(x="Weight (lb/1000)", y = "Miles/(US) gallon",
       shape="Number of \n Cylinders") + theme_classic()
```

## Another example using Plotly
With `Plotly` we can produce interactive graphs that work well when embedded in HTML webpages (see [here](https://paulcbauer.shinyapps.io/visualizing-causal-scenarios/) for an example). One can insert this type of graph in an R Markdown PDF using `Orca`, which generates static images from Plotly graphs. Go [here](https://github.com/plotly/orca#installation) to check how to install it. See Figure \@ref(fig:fig-3) for an example.

```{r Fig-3, message=FALSE, warning=FALSE}
library(plotly)
p <- plot_ly(cars, type = "scatter", mode="markers",
        x=~speed,
        y=~dist)
#Sys.setenv('MAPBOX_TOKEN' = '12423423') # set arbitrary token
#orca(p, "logs/plotly-plot.pdf")
```


\begin{figure}[ht]
\centering
\caption{Example: export a Plotly figure using \texttt{orca}}\label{fig:fig-3}
		\includegraphics[width=0.9\linewidth]{logs/plotly-plot.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}
\vspace{-1.2cm}


# Python

## API data download using Python

```{python Py,eval=TRUE,echo=TRUE}
import sys
print(sys.version)


import json
##from json.decoder import JSONDecodeError
import requests
import numpy as np
import pandas as pd

## INE: https://www.ine.pt/ine/json_indicador/pindica.jsp?
## op=2&varcd=0008074&Dim1=S7A2015&Dim2=200&Dim3=3&lang=PT

# api-endpoint

URL = "https://www.ine.pt/ine/json_indicador/pindica.jsp"

# define parameters

OP="2"
VARCD="0008074"
DIM1="S7A2015"
DIM2="200"
DIM3="3"
LANG="PT"


# defining a params dict for the parameters to be sent to the API
PARAMS = {'op':OP,'varcd':VARCD,'Dim1':DIM1,'Dim2':DIM2,'Dim3':DIM3,'lang':LANG}

# sending get request and saving the response as response object
r = requests.get(url = URL, params = PARAMS, timeout = 30)
r.raise_for_status()  # stop on an HTTP error instead of failing later in r.json()

# extracting data in json format
data = r.json()

valor = data[0]['Dados']['2015'][0]['valor']

valor

```

The crime rate is `r py$valor`\%o.

\vspace{0.3cm}
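
The inline value above relies on the response always having the nesting used in the chunk (a list whose first element holds `Dados`, keyed by year). The sketch below shows one way to guard that lookup, so that a missing year yields `None` instead of an error. The function name `extract_valor` and the sample payload are hypothetical: the payload only mimics the nesting and contains no real INE data.

```python
from typing import Any, Optional


def extract_valor(payload: Any, year: str) -> Optional[float]:
    """Return the first 'valor' reported for `year` as a float, or None if absent."""
    try:
        raw = payload[0]["Dados"][year][0]["valor"]
    except (KeyError, IndexError, TypeError):
        # The payload does not have the expected nesting for this year
        return None
    try:
        return float(raw)
    except (TypeError, ValueError):
        # The field exists but does not hold a number
        return None


# Made-up payload that only mimics the nesting used above; not real INE data.
sample = [{"Dados": {"2015": [{"valor": "12.3"}]}}]

print(extract_valor(sample, "2015"))  # 12.3
print(extract_valor(sample, "2016"))  # None
```

Returning `None` keeps the decision about what to print in the text (a placeholder, or nothing) on the R side of the inline expression.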

## Import data from PDF files

```{bash Bash,eval=FALSE}
  cd C:/Users/mangelo.EEG/Documents/GitHub/prjs/pdfs
	find . -name '*.pdf' -print0 | xargs -0 -n1 pdfsandwich -gray
	find . -name '*ocr.pdf' -print0 | xargs -0 -n1 pdftotext
```


```{python Python 2, eval=TRUE,echo=FALSE}
import os
import numpy as np
import pandas as pd
import re

## CHECK PyPDF2

## wget -A pdf -m -p -E -k -K -np https://joram.madeira.gov.pt/joram/4serie/
## find . -name '*.pdf' -print0 | xargs -0 -n1 pdfsandwich -gray
## find . -name '*ocr.pdf' -print0 | xargs -0 -n1 pdftotext

# Create list with .txt files for the specified folder
files_list = list()
for (dirpath, dirnames, filenames) in os.walk('C:/Users/mangelo.EEG/Documents/GitHub/prjs/pdfs/'):
    files_list += [os.path.join(dirpath, file)
                   for file in filenames if file.endswith('.txt')]


##print("START:FILES -- list")

##print(files_list)

##print("END:FILES -- list")

p1 = r'PORTARIA'
p2 = r'EXTENSAO'
p3 = r'Materiais'
p5 = r'PE das'

linha = []
output = []
other = []
palavra = []
source = []

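for file in files_list:

    # The context manager closes the file even if reading fails
    with open(file, "r", encoding='latin8') as f:
        data = f.read()

    line = []
    nh = 0

    tmp1 = str(data)
    #print(tmp1)
    tmp2 = tmp1.splitlines()
    #print(tmp2)
    for n,tmp3 in enumerate(tmp2):
        #print(tmp3)
        if (tmp3.find("PE das") == 0):
            # Join the heading with the line that follows it; the flag nh
            # makes the next iteration skip that already-merged line
            tmp4 = tmp3 + (tmp2[n + 1] if n + 1 < len(tmp2) else "")
            line.append(tmp4)
            #print(n)
            nh = 1
        elif (nh == 1):
            nh = 0
            continue
        elif (nh == 0):
            line.append(tmp3)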

    print(line)

    print("   ")

    print("FILE: ", file[46:-4])

    for num, word in enumerate(line):
            if num == 0:
                continue
            else:
                match1 = re.search(p1, word)
                match2 = re.search(p2, word)
                match3 = re.search(p3, word)
                match4 = re.search(r'\d{9}', word)
                match5 = re.search(p5, word)
                ##print("   ")
                ##print("START: ",num)

                if match1:
                        ##print("   ")
                        print("match 1")
                        if match4:
                            ##print("   ")
                            print("match 4")
                            linha.append(num)
                            output.append(re.search(r'\d{9}', word).group())
                            other.append("vazio")
                            palavra.append(p1)
                            source.append(file[46:-4])
                elif match2:
                            ##print("   ")
                            print("match 2")
                            # Only record the line when a nine-digit number was found
                            if match4:
                                linha.append(num)
                                output.append(match4.group())
                                other.append("vazio")
                                palavra.append(p2)
                                source.append(file[46:-4])
                elif match3:
                            ##print("   ")
                            print("match 3")
                            # Only record the line when a nine-digit number was found
                            if match4:
                                linha.append(num)
                                output.append(match4.group())
                                other.append("vazio")
                                palavra.append(p3)
                                source.append(file[46:-4])
                elif match5:
                            ##print("   ")
                            print("-> match 5")
                            ##word.sub(" e o ", " e a ",1)
                            print(word)

                            # Pick the connective present in the line; skip the line
                            # when neither is found so all lists keep the same length
                            if (word.find(" e o ") > 0):
                                print("11111")
                                sep = " e o "
                            elif (word.find(" e a ") > 0):
                                print("99999")
                                sep = " e a "
                            else:
                                continue

                            linha.append(num)
                            output.append((word.split("re a", 1)[1]).split(sep, -1)[-2])
                            other.append((word.split("re a", 1)[1]).split(sep, 1)[1])
                            palavra.append(p5)
                            source.append(file[46:-4])
## the paragraph must be on a single line and we need 'e a' instead of 'e o'
df = pd.DataFrame({'linha': linha, 'output': output,
                   'outra': other, 'source': source})
print(df)

df.to_csv('data/PE.csv', index=False)
df.to_stata('data/PE.dta', write_index = False)

```
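
The Python chunk in this section pulls nine-digit numbers out of each line with the pattern `\d{9}`. As a minimal sketch of that step in isolation, the helper below returns the first nine-digit run, or `None` when a line has none, which avoids calling `.group()` on a failed match. The helper name `first_nine_digits` and the two sample lines are invented for illustration.

```python
import re
from typing import Optional

NINE_DIGITS = re.compile(r"\d{9}")  # same pattern as in the chunk above


def first_nine_digits(text: str) -> Optional[str]:
    """Return the first run of nine digits in `text`, or None if there is none."""
    match = NINE_DIGITS.search(text)
    return match.group() if match else None


# Hypothetical lines, invented for illustration only.
lines = ["PORTARIA n. 12 ... 123456789", "EXTENSAO sem numero"]
print([first_nine_digits(line) for line in lines])  # ['123456789', None]
```

Note that this pattern also matches the first nine digits of a longer digit run; a stricter variant would need word boundaries, which I leave out here to stay close to the original pattern.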


And now we use Stata to explore the data.

```{stata Stata Import CSV, echo=TRUE,message=FALSE}

quiet cd "C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs"
quiet import delimited "C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/PE.csv", encoding(ISO-8859-2) clear
tab source

```

## Run a Python script to read a PDF using the shell to run the commands wget, pdfsandwich and pdftotext

```{bash Bash 2}
python3 "C:/Users/mangelo.EEG/Documents/GitHub/prjs/chunks/python_chunk.py"
```

## Use Stata to list the fiscal numbers (read the data produced with Python)

```{stata Codebook, echo=TRUE,message=FALSE}

quietly{
cd C:/Users/mangelo.EEG/Documents/GitHub/prjs/chunks

use C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/nipcs, clear
compress
contract nipc
drop _freq
drop if nipc == .
format %12.0f nipc
}

//codebook nipc

tab nipc
```

# Julia experiments

## Computations

```{julia Julia experiments, echo=FALSE}
## This is a julia language chunk.
## In julia, the command without ending semicolon will trigger the display
## so is JuliaCall package.
## The julia display will follow immediately after the corresponding command
## just as the R code in R Markdown.

using ReadStat
using StatFiles
using StatsBase
using DataFrames
using FixedEffectModels
using RegressionTables

@time results_hdfe1 = reg(DataFrame(load("C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/data_short.dta")), @formula(lnrealwage ~ education + lnsales + fe(workerid) + fe(year)));

@time results_hdfe2 = reg(DataFrame(load("C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/data_short.dta")), @formula(lnrealwage ~ education + lnsales + fe(workerid) + fe(firmid) + fe(year)));

regtable(results_hdfe1,results_hdfe2; renderSettings = latexOutput("logs/hdfe_output.tex"))

VERSION
```

## Grab results in R

```{r Grab results in R, echo=FALSE}
library(JuliaCall)

  julia_eval("results_hdfe2")

betas <- julia_eval("coef(results_hdfe2)")
r2 <- julia_eval("r2(results_hdfe2)")
```

## Insert Julia's results

\vspace{0.3cm}

The estimated return to education is `r round(betas[1:1]*100,1)`\%. The model has an $R^2$ of `r round(r2,4)`.

\vspace{0.3cm}
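
Multiplying the coefficient by 100 reads it as a percentage effect, which is the usual approximation for a log-wage equation when the coefficient is small. The sketch below compares that approximation with the exact change, $100(e^{\beta}-1)$, for a hypothetical coefficient of 0.05 (not an estimate from the data); the function names are illustrative.

```python
import math


def pct_approx(beta: float) -> float:
    """Log-point approximation of the percentage effect: 100 * beta."""
    return 100.0 * beta


def pct_exact(beta: float) -> float:
    """Exact percentage change implied by a coefficient in a log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)


beta = 0.05  # hypothetical coefficient, not an estimate from the data
print(round(pct_approx(beta), 1), round(pct_exact(beta), 1))  # 5.0 5.1
```

The gap between the two grows with the size of the coefficient, so the approximation is best kept for small estimates.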

## Computation time: run the HDFE with Stata and R and compare to Julia

```{stata HDFE with Stata, echo=TRUE,message=FALSE}

use C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/data_short, clear

timer on 1

	reghdfe lnrealwage education lnsales,absorb(workerid firmid year)

timer off 1
timer list 1
timer clear 1
```

```{r,eval=FALSE,echo=TRUE}
library(haven)  # read_dta()
library(lfe)    # felm()
data_short <- read_dta("C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/data_short.dta")

# Fixed effects go after the bar; data = avoids repeating data_short$ in the formula
system.time(est_hdfe <- felm(lnrealwage ~ education + lnsales | workerid + firmid + year, data = data_short))

summary(est_hdfe)
```


## Output Julia's table for HDFE

\begin{table}[ht]
\label{tab:hdfe}
  \input{logs/hdfe_output}
\end{table}


# Miguel's tests

## R

Table \@ref(tab3) ... See Section \@ref(sec:stata)

Example of an equation

$$\int_0^{2\pi} \sin x~dx$$

*Example of a matrix*


$$
\mathbf{X} = \left[\begin{array}
{rrr}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{array}\right]
$$

or

\begin{equation}
f\left(k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k} \label{eq:binom}
\end{equation}


See Equation \@ref(eq:binom).
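
As a quick numerical check of Equation \@ref(eq:binom), the sketch below evaluates the formula for illustrative values ($n = 4$, $p = 0.5$) using `math.comb` from the Python standard library. The function name `binom_pmf` is hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """f(k) = C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values: n = 4 trials with success probability p = 0.5
print(binom_pmf(2, 4, 0.5))                         # 0.375
print(sum(binom_pmf(k, 4, 0.5) for k in range(5)))  # 1.0
```

The second line confirms that the probabilities over $k = 0, \dots, n$ add up to one.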


\begin{align}
y_{ijt} = \beta x_{ijt} + \eta_i + \gamma_j + \lambda_t + \varepsilon_{ijt}
\end{align}


```{r echo=FALSE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}
library(stargazer)
library(Statamarkdown)
stataexe <- "C:/Program Files/Stata16/StataMP-64.exe"
knitr::opts_chunk$set(engine.path=list(stata=stataexe))

setwd("C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs")
rm(list = ls())
library(haven)
nlswork <- read_dta("../data/nlswork.dta")

auto <- read_dta("../data/auto.dta")

attach(nlswork)

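# Pass the data frame through data = so the coefficients are labelled
# "mpg", "weight", ... rather than "auto$mpg", and no attach() is needed
regs1 <- lm(price ~ mpg + weight, data = auto)
regs2 <- lm(price ~ mpg + weight + rep78, data = auto)
regs3 <- lm(price ~ mpg + weight + rep78 + trunk, data = auto)

regs4 <- lm(ln_wage ~ union, data = nlswork)
regs5 <- lm(ln_wage ~ union + collgrad, data = nlswork)
regs6 <- lm(ln_wage ~ union + collgrad + age, data = nlswork)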

##summary(auto)
##summary(regs1)

## https://www.jakeruss.com/cheatsheets/stargazer/

nls<-data.frame(nlswork)

stargazer(nls, summary.stat = c("n", "p75", "sd"), summary.logical = FALSE,
          title = "Summary table",
          label="tab23",
          table.placement = "ht",
          header=FALSE)


stargazer(regs1, regs2, regs3,
          title = "Regression table with stargazer",
          label="tab3",
          table.placement = "ht",
          column.labels = c("M1", "M2", "M3"),
          model.numbers = FALSE,
          header=FALSE,keep=c(0,1,2,3))

attach(auto)


library(naniar)
vis_miss(nlswork)

# plot(y=price,x=mpg)

```


```{r echo=TRUE, message=FALSE, warning=FALSE, paged.print=FALSE, results="asis"}
library(stargazer)
stargazer(cars,
          title = "Summary 24",
          label="tab24",
          table.placement = "ht",
          header=FALSE)
```


## Stata {#sec:stata}

This is a Stata example, @arellano2003panel. See also @arellano1991some and @bb98. While ... [check @arellano1995another].


```{stata, echo=FALSE,message=FALSE}
quiet sysuse auto
sum price

tab rep78

quiet cd "C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs"

quiet use ../data/nlswork, clear

twoway (kdensity ln_wage if collgrad == 0) || (kdensity ln_wage if collgrad == 1), scheme(sj) graphregion(color(white)) legend(label(1 "No College") label(2 "College")) legend(region(lwidth(none))) ytitle("Density")

graph export "C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs/density.pdf", replace

use ../data/data_full, clear
		quiet generate lngdp = ln(rgdpwok)
		quiet ge lnk = ln(capital)

		label var rgdpwok "Real GDP per worker"
		label var education "Education (in years)"
		label var capital "Capital"
		label var open "Degree of openness"

// # regression analysis

	quiet reg lngdp education
		estimates store r1

	quiet reg lngdp education lnk
		est store r2

	reg lngdp education lnk openk i.year
		est store r3

outreg, clear
	quiet estimates restore r1
		outreg using growth_analysis_frag, tex fragment replace rtitles("Education" \ "" \ "Capital" \ "" \ "Openness degree" \ "")  /*
				*/ drop(_cons) /*
				*/ ctitle("","Simple model") /*
				*/ nodisplay varlabels bdec(4) se starlevels(10 5 1) starloc(1) summstat(r2\rmse \ N) summtitle("R2"\"RMSE" \ "N")

	quiet estimates restore r2
		outreg using growth_analysis_frag, tex fragment merge rtitles("Education" \ "" \ "Capital" \ "" \ "Openness degree" \ "")  /*
				*/ drop(_cons) /*
				*/ ctitle("","Include capital") /*
				*/ nodisplay varlabels bdec(3) se starlevels(10 5 1) starloc(1) summstat(r2\rmse \ N) summtitle("R2"\"RMSE" \ "N")

	quiet estimates restore r3
		outreg using growth_analysis_frag, tex fragment merge rtitles("Education" \ "" \ "Capital" \ "" \ "Openness degree" \ "")  /*
				*/ drop(_cons 1975.year 1980.year 1985.year 1990.year) /*
				*/ ctitle("","Full model") /*
				*/ nodisplay varlabels bdec(1) se starlevels(10 5 1) starloc(1) summstat(r2\rmse \ N) summtitle("R2"\"RMSE" \ "N")

sum lngdp
```

## Grab Stata's output

```{stata , collectcode=TRUE} 
use C:/Users/mangelo.EEG/Documents/GitHub/prjs/data/data_full, clear
		quiet generate lngdp = ln(rgdpwok)
      summarize lngdp
```
		
```{stata for_inline, include=FALSE} 
file open myfile using example.txt, write replace
file write myfile `"`r(mean)'"'
file close myfile
```

\vspace{0.3cm}

The mean ln GDP is `r format(as.numeric(readLines('example.txt')), digits = 2)`.

\vspace{0.3cm}

```{r cleanup, include=FALSE} 
unlink("example.txt")
```

See https://www.ssc.wisc.edu/~hemken/Stataworkshops/stata.html#stata-and-r-markdown-the-statamarkdown-package

![Wage density](logs/density.pdf)


\begin{table}[ht]
\caption{Regression analysis}
\label{tab:stata}
  \input{logs/growth_analysis_frag}
\end{table}

## Use Stata to export statistics to Excel

We now export a set of statistics to an Excel file.

```{stata, echo=FALSE,message=FALSE}
version
//ado describe

findfile xtabond2.ado
checksum "c:/ado/plus/x/xtabond2.ado"

// PUTEXCEL

cd "C:/Users/mangelo.EEG/Documents/GitHub/prjs/logs"

quiet use ../data/graph_data, clear
	codebook, compact

			putexcel clear
			putexcel set descriptives.xlsx, sheet("Avg. Educ. & desc.") replace
			

gen first = substr(country,1,1)

	levelsof first,local(ff)
	
	foreach vv of local ff {
	
		di _new(3) "Country's first letter:	`vv'"
		
		preserve
		quiet keep if first == "`vv'"
		
		quiet unique country
			
			if r(unique) > 5 {
			di _new(2) "	Number of countries:	" r(unique) _new(1)
			quietly {
				collapse (mean) lngdp education,by(country)
					putexcel set descriptives.xlsx, sheet("FIRST LETTER `vv'") modify
					
					regress lngdp education
						
							matrix list r(table)
						
						matrix results = r(table)
							mat l results
						
						mat b = results[1,1...]'
						mat t = results[3,1...]'
						
						putexcel C2="Coef." F2="t"
						putexcel B3 = matrix(b), rownames nformat(number_d2) right
						putexcel D3 = matrix(t),nformat("0.00")
				}
			}
			
			if r(unique) <= 5 {
				// di _new(2) "	Insufficient number of countries; n countries = " r(unique) _new(1)
			}
			
		restore
}

// tabulate, summarize() -- EXAMPLE

tabulate first year, summarize(education) nost nof noob

collapse (mean) education,by(first year)

reshape wide education,i(first) j(year)
mkmat education*,matrix(mean_educ) rownames(first)

putexcel set descriptives.xlsx, sheet("Mean Education") modify

	putexcel C2="1960" D2="1965" E2="1970" F2="1975" G2="1980" H2="1985" I2="1990" J2="1995" K2="2000"
	putexcel B3 = matrix(mean_educ), rownames nformat(number_d2) right
```


See Figure \@ref(fig:fig-tmp).

```{r fig-tmp, echo=FALSE, fig.align="center", fig.cap="Scatterplot test MP", fig.pos="ht", message=FALSE, warning=FALSE, paged.print=FALSE}
plot(x = auto$mpg, y = auto$price,
     pch = 16, frame = FALSE,
     xlab = "mpg", ylab = "price", col = "#2E9FDF")  # axis labels now match the plotted variables
```


# Final remarks

Check the replication package for Bonhomme, Lamadon and Manresa (2019): https://github.com/tlamadon/blm-replicate

\newpage

# References {-}


<div id="refs"></div>


# Appendix: Chunk options {-}

## Software versioning

### R

```{r fig-versioning, echo=TRUE}
cat(paste("#", capture.output(sessionInfo()), "\n", collapse =""))
  # or use message() instead of cat()
```

### Python

```{python py2,eval=TRUE,echo=TRUE}
import sys
print(sys.version)
```

### Julia

```{julia echo=FALSE}
VERSION
```

### Stata

```{stata, echo=FALSE,message=FALSE}
version
//ado describe

findfile xtabond2.ado
checksum "c:/ado/plus/x/xtabond2.ado"
```

## All the code in the paper

To attach all the code used in the PDF file as an appendix, see the R chunk in the underlying `.rmd` file:

```{r, ref.label=knitr::all_labels(),echo=TRUE,eval=FALSE}
```

<!-- ## Exploratory data analysis report -->

<!-- [//]: # \includepdf[pages=-]{C:/Users/mangelo.EEG/Documents/GitHub/prjs/reports/eda_report.pdf} -->