Commit

prepare for next version
leovan committed Jan 21, 2023
1 parent 8e7413f commit c850b10
Showing 367 changed files with 42,676 additions and 606 deletions.
2 changes: 0 additions & 2 deletions .gitattributes

This file was deleted.

27 changes: 20 additions & 7 deletions .gitignore
@@ -1,14 +1,27 @@
# R
# R Blogdown
blogdown
public
/resources/_gen
.hugo_build.lock

# R Common
.Rhistory
.Rproj.user
.RData
.Ruserdata

# vscode
# VS Code
.vscode

# Python
*.pyd
*.pyc
*.pyo
.ipynb_checkpoints

# JetBrains
.idea

# system
.DS_Store
*.log
@@ -18,11 +31,11 @@
base/fonts

# slide source
/*/slide/*_cache
/*/slide/*_files
/*/slide/libs
/*/slide/generated
/*/slide/*.html
/slides/**/*_cache
/slides/**/*_files
/slides/**/libs
/slides/**/generated
/slides/**/*.html

# others
*.h5
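
The slide patterns in this hunk move from `/*/slide/...` (one top-level directory per lesson, each with a `slide` folder) to `/slides/**/...` (everything under a single `slides` directory, at any depth). The following is a minimal Python sketch of that difference, not Git's actual matcher: it covers only root-anchored patterns with `*` and `/**/`, and the example paths such as `slides/clustering-algorithms/index_cache` are hypothetical.

```python
import re


def anchored_gitignore_to_regex(pattern: str) -> re.Pattern:
    """Translate a root-anchored gitignore pattern (leading '/') into a regex.

    Sketch only: handles '*' (any run of characters except '/') and
    '/**/' (zero or more intermediate directories).
    """
    assert pattern.startswith("/"), "sketch only covers root-anchored patterns"
    parts = pattern[1:].split("/")
    regex = ""
    for i, part in enumerate(parts):
        if part == "**":
            # '/**/' matches zero or more directory levels.
            regex += "(?:[^/]+/)*"
            continue
        # Escape literal text, then turn each '*' into "anything but a slash".
        regex += re.escape(part).replace(r"\*", "[^/]*")
        if i < len(parts) - 1:
            regex += "/"
    # A matched directory also covers everything beneath it.
    return re.compile(regex + r"(?:/.*)?$")


old_rule = anchored_gitignore_to_regex("/*/slide/*_cache")
new_rule = anchored_gitignore_to_regex("/slides/**/*_cache")

# Hypothetical paths, relative to the repository root.
for path in [
    "clustering-algorithms/slide/index_cache",
    "slides/clustering-algorithms/index_cache",
    "slides/index_cache",
]:
    print(path, bool(old_rule.match(path)), bool(new_rule.match(path)))
```

Under these assumptions, the old rule only matches the per-lesson `slide` layout, while the new rule matches cache directories at any depth below `slides`, including directly inside it.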
151 changes: 7 additions & 144 deletions README.md
@@ -1,36 +1,18 @@
# Data Science Introduction with R <img src="docs/images/web/data-science-introduction-with-r.png" align="right" alt="logo" height = "100" style = "border: none; float: right;">
# R 语言数据科学导论 <img src="images/data-science-introduction-with-r.png" align="right" alt="logo" height="100" style="border: none; float: right;">

![Release](https://img.shields.io/github/release/leovan/data-science-introduction-with-r.svg)
![License](https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-blue.svg)
![Issues](https://img.shields.io/github/issues/leovan/data-science-introduction-with-r.svg)

---

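## Introduction
## Overview

1. This project is an introductory data science course that uses R as the analysis language.
2. Hosted site: https://ds-r.leovan.tech
3. Git repository directory structure:
    - base directory: configuration files for the slides
    - docs directory: other materials
    - Other top-level directories:
        - Second-level directories:
            - *.pdf: slides for the lesson
            - data: data files required for the lesson
            - slide: slide source code for the lesson
4. This project is licensed under [CC BY-NC-SA 4.0](http://creativecommons.org/licenses/by-nc-sa/4.0/).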
This project is an introductory data science course based on the R language.

## Preparation
The online version is hosted at https://ds-r.leovan.tech, and the source code is stored on [Github](https://github.com/leovan/data-science-introduction-with-r).

1. Operating system: Windows 10+ (x64), macOS 10.12+, Ubuntu 16.04+
2. R: latest version ([download](https://cloud.r-project.org/))
3. RStudio: latest Preview version ([download](https://www.rstudio.com/products/rstudio/download/preview/); the Preview version has some useful new features)
4. Visual Studio Code: latest version ([download](https://code.visualstudio.com/); for browsing and editing code)
5. Python: latest Anaconda Python 3 ([download](https://www.anaconda.com/download/); base environment for Jupyter)
6. Visual Studio Code: latest version ([download](https://code.visualstudio.com/); for browsing and editing code)
7. Typora: latest version ([download](http://typora.io); for viewing Markdown)

## Reference
## References

1. *R in Action* (《R语言实战》), by Robert I. Kabacoff; translated by 王小宁, 刘撷芯, 黄俊文, et al.
2. *R for Data Science* (《R数据科学》), by Hadley Wickham & Garrett Grolemund; translated by 陈光欣
@@ -41,125 +23,6 @@
7. *Machine Learning* (《机器学习》), by Zhou Zhihua (周志华)
8. *Deep Learning* (《深度学习》), by Ian Goodfellow, Yoshua Bengio & Aaron Courville; translated by 赵申剑, 黎彧君, 符天凡, 李凯

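## Data Science Introduction

1. Data science concepts
    - Data science
    - Data products
    - Crossing disciplines
2. Data science toolbox
    - Common data science tools
    - The data science battle: R vs. Python
    - Which language to choose
3. Data science roles and process
    - Division of labor in data science
    - Data analysis and mining process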

## R Language Introduction

1. R environment setup
2. Basic R syntax
3. R objects, functions, and packages
4. R data structures
5. R coding style

## Data Analytics Introduction - Part 1

1. The guru's toolbox
2. Data import and export
3. Data transformation and tidying

## Data Analytics Introduction - Part 2

1. Relational data processing
2. Processing different data types
3. Functional programming

## Data Visualization

1. Data visualization
2. ggplot2
3. Web-based plotting libraries

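## Statistical Analytics Introduction

1. Exploratory analysis
    - Descriptive statistics
    - Common distributions
2. Experimental design
    - Hypothesis testing concepts
    - Common hypothesis tests
3. Linear regression
    - Simple linear regression
    - Multiple linear regression
    - Generalized linear regression
    - Least squares and gradient descent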

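## Feature Engineering

1. Data preprocessing
    - Data cleaning
    - Handling missing values, duplicates, and outliers
    - Data sampling and dataset splitting
2. Feature transformation and encoding
    - Scaling (nondimensionalization)
    - Binning
    - Dummy variable encoding
3. Feature extraction, selection, and monitoring
    - Feature extraction
    - Feature selection
    - Feature monitoring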

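## Model Evaluation & Hyperparameter Optimization

1. Model performance evaluation
    - Regression problems
    - Classification problems
    - Clustering problems
2. Model generation and selection
    - Overfitting
    - Evaluation methods
    - Bias and variance
3. Hyperparameter optimization
    - Search algorithms
    - Evolutionary and swarm algorithms
    - Bayesian optimization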

## Classification Algorithms - Part 1

1. Logistic regression
2. Decision trees

## Classification Algorithms - Part 2

1. Bagging
2. Boosting
3. Stacking

## Time Series Algorithms

1. Time series
2. ARIMA model
3. Seasonality analysis
4. Prophet

## Clustering Algorithms

1. K-means
2. Hierarchical clustering
3. Density-based clustering

## Reproducible Research

1. Reproducible research
2. Markdown
3. R Markdown
4. Jupyter
5. Version control

## Deep Learning Algorithms
## License

1. Artificial neural networks
2. Convolutional neural networks
3. Recurrent neural networks
4. Deep learning frameworks
This project is licensed under [CC BY-NC-SA 4.0](http://creativecommons.org/licenses/by-nc-sa/4.0/).
3 changes: 0 additions & 3 deletions _redirects

This file was deleted.

45 changes: 0 additions & 45 deletions base/scripts/link-base.sh

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-1/slide/assets

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-1/slide/css

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-1/slide/includes

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-2/slide/assets

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-2/slide/css

This file was deleted.

1 change: 0 additions & 1 deletion classification-algorithms-part-2/slide/includes

This file was deleted.

1 change: 0 additions & 1 deletion clustering-algorithms/slide/assets

This file was deleted.

1 change: 0 additions & 1 deletion clustering-algorithms/slide/css

This file was deleted.

1 change: 0 additions & 1 deletion clustering-algorithms/slide/includes

This file was deleted.

93 changes: 93 additions & 0 deletions config.yaml
@@ -0,0 +1,93 @@
baseURL: "https://ds-r.leovan.tech/"
languageCode: "zh-cn"
title: "R 语言数据科学导论 | Data Science Introduction with R"
googleAnalytics: "G-MCPZ7NTLEX"
ignoreFiles: [
  "\\.Rmarkdown$", "\\.Rmd$", "\\.knit\\.md$", "\\.utf8\\.md$",
  "_files$", "_cache$", "\\.txt$", "\\.csv$", "\\.tsv$",
  "\\.tgz$", "\\.tar$", "\\.gz$", "\\.zip$", "\\.npz$"]
hasCJKLanguage: true
enableEmoji: true
rssLimit: 100

disableKinds: ["taxonomy", "term"]

permalinks:
  lecture: "/lecture/:slug/"
  doc: "/doc/:slug/"

menu:
  main:
    - name: "讲义"
      url: "/lecture/"
      weight: 1
    - name: "文档"
      url: "/doc/"
      weight: 2
    - name: "关于"
      url: "/about/"
      weight: 3

params:
  title: "R 语言数据科学导论"
  subtitle: "Data Science Introduction with R"
  author: "范叶亮 | Leo Van"
  description: "R 语言数据科学导论 | Data Science Introduction with R"
  logo: "/images/data-science-introduction-with-r.png"
  logoWidth: 150
  footer: "Copyright &copy; 2017-{Year} [范叶亮 | Leo Van](https://leovan.me)"
  licenseURL: "https://github.com/leovan/data-science-introduction-with-r/blob/main/LICENSE"
  googleAdsense: "ca-pub-2608165017777396"

  socialIcons:
    - title: "版本"
      icon: "https://img.shields.io/github/release/leovan/data-science-introduction-with-r.svg"
    - title: "许可证"
      icon: "https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-blue.svg"
    - title: "Github Stars"
      icon: "https://img.shields.io/github/stars/leovan/data-science-introduction-with-r?style=social"
      url: "https://github.com/leovan/data-science-introduction-with-r"
  buttons:
    - title: "讲义"
      url: "/lecture/"
    - title: "文档"
      url: "/doc/"
    - title: "关于"
      url: "/about/"

  jsCookieVersion: "3.0.0"
  jsCookieCDN: "//cdnjs.cloudflare.com/ajax/libs"

  clipboardjsVersion: "2.0.11"
  clipboardjsCDN: "//cdnjs.cloudflare.com/ajax/libs"

  prismjsVersion: "1.29.0"
  prismjsCDN: "//cdnjs.cloudflare.com/ajax/libs"
  prismjsPluginJS: ["autoloader", "show-language", "toolbar"]
  prismjsPluginCSS: ["toolbar"]

  mathjaxVersion: "3.2.2"
  mathjaxCDN: "//cdnjs.cloudflare.com/ajax/libs"

  jQueryVersion: "3.6.3"
  jQueryCDN: "//cdnjs.cloudflare.com/ajax/libs"

  lazysizesVersion: "5.3.2"
  lazysizesCDN: "//cdnjs.cloudflare.com/ajax/libs"

  vanillaBackToTopVersion: "latest"
  vanillaBackToTopCDN: "//cdn.jsdelivr.net/npm"

  pdfjsVersion: "3.2.146"
  pdfjsCDN: "//cdnjs.cloudflare.com/ajax/libs"

markup:
  highlight:
    codeFences: false
  goldmark:
    renderer:
      unsafe: true
    parser:
      autoHeadingIDType: blackfriday
  tableOfContents:
    startLevel: 1
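
The `ignoreFiles` entries above are regular expressions that are matched against file paths. As a rough illustration of which files they would exclude, here is a small Python sketch. It assumes Python's `re.search` behaves like the site generator's matcher for these simple patterns, which is an approximation, and the `is_ignored` helper and the `content/lecture/...` paths are hypothetical.

```python
import re

# Patterns copied from ignoreFiles in config.yaml. In the YAML source each
# backslash is doubled ("\\.Rmd$"); after YAML unescaping the regex is \.Rmd$.
IGNORE_PATTERNS = [
    r"\.Rmarkdown$", r"\.Rmd$", r"\.knit\.md$", r"\.utf8\.md$",
    r"_files$", r"_cache$", r"\.txt$", r"\.csv$", r"\.tsv$",
    r"\.tgz$", r"\.tar$", r"\.gz$", r"\.zip$", r"\.npz$",
]


def is_ignored(path: str) -> bool:
    """Return True if any pattern matches the path (hypothetical helper)."""
    return any(re.search(pattern, path) for pattern in IGNORE_PATTERNS)


# Hypothetical paths used only for illustration.
for path in [
    "content/lecture/intro/index.Rmd",
    "content/lecture/intro/index.knit.md",
    "content/lecture/intro/index.md",
    "content/lecture/intro/data/iris.csv",
    "content/lecture/intro/index_files",
]:
    print(f"{path}: {is_ignored(path)}")
```

Because every pattern ends in `$`, each one only tests the end of the path: R Markdown sources and intermediate `.knit.md` files match, while the rendered `index.md` does not.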