-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.qmd
106 lines (68 loc) · 7.9 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
---
title: "Welcome!"
engine: knitr
---
I think of myself as a competent [{{< fa brands r-project >}} R]{.r} coder but am a total novice when it comes to [{{< fa brands python >}} Python]{.py}. This repository is my attempt at forcing myself to 'eat my vegetables' and gain basic competency in [{{< fa brands python >}} Python]{.py}. I think trying for a 1-to-1 [{{< fa brands r-project >}} R]{.r} translation to [{{< fa brands python >}} Python]{.py} will be an effective learning method (at least to start) and enshrining it in a Quarto website will keep me rigorous about documenting my process.
## Package Installation
Both coding languages rely on packages to install specific functions that are absent from the 'base' version of either [{{< fa brands python >}} Python]{.py} or [{{< fa brands r-project >}} R]{.r}. The following code chunks are not evaluated in the building of this website but you'll need to install these packages on your local machine (if you haven't already done so) in order to run the code in the rest of this translation tutorial.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
[{{< fa brands r-project >}} R]{.r} contains an `install.packages` function for installing packages desired libraries.
```{r r-pkg-install, eval = F}
# The 'tidyverse' meta-package contains many of the tools we'll need
install.packages("tidyverse")
# For SQL operations, DBI is needed
install.packages("DBI")
# For spatial operations we'll need sf and terra
install.packages("sf")
install.packages("terra")
```
## [{{< fa brands python >}} Python]{.py}
[{{< fa brands python >}} Python]{.py} packages must be installed via the `r fontawesome::fa(name = "terminal", fill = "#EB6534", a11y = "sem")` [command line]{.cli}.
```{bash py-pkg-install, eval = F}
# The numpy and pandas packages contain fundamental Python tools
python3 -m pip install pandas
python3 -m pip install numpy
# For plotting we'll want some other packages
python3 -m pip install matplotlib
python3 -m pip install plotnine
## plotnine is Python's ggplot2
# For SQL operations we'll need the sqlite3 package
python3 -m pip install sqlite3
# For spatial operations we'll need a few packages
python3 -m pip install geopandas
python3 -m pip install rasterio
python3 -m pip install rioxarray
```
:::
## Section Overviews
### Fundamentals
When learning a new programming language, it can be really helpful to begin with dramatically simplified examples to demonstrate crucial concepts. We can also build upon really core concepts into more nuanced fundamentals like automation or [string]{.py}/[character]{.r} methods.
### Data Wrangling
The beating heart of my day-to-day work revolves around data 'wrangling'. I view 'wrangling' as any _scripted_ data manipulations between the very first raw data entered digitally and the data being ready for any analysis/visualization. This covers a huge swath of operations and should allow me to explore [{{< fa brands python >}} Python]{.py} equivalents to many of the [{{< fa brands r-project >}} R]{.r} operations that I know and love.
### Visualization
Somewhat self-explanatory but this section is all about data visualization! While visualization can be an effective quality control tool it is also useful in data exploration and--eventually--to generate publication- or report-quality graphics. This section attempts to cover the fundamentals of data "viz" in both languages but is by no means exhaustive!
### Spatial
Spatial data are really widely-used and are often a particular reason why [R]{.r}-focused people are encouraged to shift to [Python]{.py}. While this section takes no particular stance on which language to use, some fundamental considerations for spatial data are addressed as well as some core tools that apply to raster or vector data (two major structures of spatial data).
### SQL
`r fontawesome::fa(name = "database", fill = "#283E82", a11y = "sem")` [SQL]{.sql} is a powerful programming language in its own right that is intended to work with relational databases. Relational databases include several data tables of various sizes/structures that share some common 'index' columns that allow them to be combined as needed. Both [{{< fa brands python >}} Python]{.py} and [{{< fa brands r-project >}} R]{.r} allow users to access these databases using `r fontawesome::fa(name = "database", fill = "#283E82", a11y = "sem")` [SQL]{.sql} syntax while still living in their preferred coding language. This section highlights some of the major considerations when working with databases through either language though it is _not_ a tutorial on `r fontawesome::fa(name = "database", fill = "#283E82", a11y = "sem")` [SQL]{.sql}'s syntax itself.
### Glossary
As the heading would suggest, I'm housing various term definitions here. As of now, it makes most sense to me to provide the definition for a concept and then give the term in [{{< fa brands python >}} Python]{.py} & [{{< fa brands r-project >}} R]{.r}. Note that I also give a more functional definition of major concepts in the code tutorial pages upon first mention.
## Contributing Guidelines
{{< include /collabs.qmd >}}
## Additional Resources
If this is of interest, consider checking out these useful coding language-specific tutorials and other attempts at [{{< fa brands r-project >}} R]{.r}/[{{< fa brands python >}} Python]{.py} bilingualism. See below:
### Bilingualism Resources
- The [EEOB-BioData](https://github.com/EEOB-BioData) faculty at [Iowa State University](https://www.iastate.edu/) offer a ["Computational Skills For Biological Data"](https://eeob-biodata.github.io/EEOB-BCB-546X/) course that covers R and Python (and Unix)
- [ESIIL](https://esiil.org/) (<u>E</u>nvironmental Data <u>S</u>cience <u>I</u>nnovation and <u>I</u>nclusion <u>L</u>ab) has created [an R-Python bilingualism tutorial](https://cu-esiil.github.io/pre-innovation-summit-training/2_R_and_Py_bilingualism/code/outline/outline/) that is framed for a more applied audience
- [Marie Rivers](https://github.com/marierivers) also made [a website](https://marierivers.github.io/marie_quarto/) for exploring Quarto more generally that I've found super helpful for the part of it that deals with R/Python bilingualism
- [Rebecca Barter](https://rebeccabarter.com/) made a nice blog post titled "[An Introduction to Python for R Users](https://rebeccabarter.com/blog/2023-09-11-from_r_to_python?utm_source=substack&utm_medium=email)" that focuses on comparing [{{< fa brands python >}} Python]{.py} operations to the _logic_ of [{{< fa brands r-project >}} R]{.r} (rather than side-by-side comparison)
### [{{< fa brands python >}} Python]{.py} Resources
- The Carpentries has a "[Data Analysis and Visualization in Python for Ecologists ](https://datacarpentry.org/python-ecology-lesson/index.html)" lesson that is really well put-together (as is characteristic of The Carpentries' content)
- The [Earth Lab](https://www.earthdatascience.org/) has a Python course that was recommended to me by an NCEAS employee (specifically [section 4](https://www.earthdatascience.org/courses/intro-to-earth-data-science/python-code-fundamentals/))
- Dr. Diba Mirza taught a UCSB Computer Science ([CS8](https://ucsb-cs8.github.io/w19-mirza/lectures_list/)) course on Python
- There's a nice [30 Days of Python](https://github.com/Asabeneh/30-Days-Of-Python) challenge on GitHub you can work through at your own pace that covers a breadth of Python skills
### [{{< fa brands r-project >}} R]{.r} Resources
- [R for Data Science](https://r4ds.hadley.nz/) is a free online book that does a really nice job covering a lot of powerful R topics
- I've created an "[R for Biologists](https://njlyon0.github.io/teach_r-for-biologists/)" course aimed at teaching undergraduates R which may be helpful if you have limited prior experience with this language and/or want a formal scaffold in which to tie your existing knowledge
- The Carpentries also has a "[Data Analysis and Visualization in Rfor Ecologists](https://datacarpentry.org/R-ecology-lesson/)" that is very well done