# Built-in Layer Types
If you started at the beginning of the guide, then you have already met the basic [`Dense`](@ref) layer, and seen [`Chain`](@ref) for combining layers. These core layers form the foundation of almost all neural networks.
The `Dense` layer exemplifies several features (see the sketch after this list):
* It contains an [activation function](@ref man-activation-functions), which is broadcast over the output. Because this broadcast can be fused with other operations, doing so is more efficient than applying the activation function separately.

* It takes an `init` keyword, which accepts a function acting like `rand`. That is, `init(2,3,4)` should create an array of this size. Flux has [many such functions](@ref man-init-funcs) built-in. All make a CPU array, moved later with [`gpu`](@ref Flux.gpu) if desired.
* The bias vector is always initialised by [`Flux.zeros32`](@ref). The keyword `bias=false` will turn this off, i.e. keep the bias permanently zero.
* It is annotated with [`@layer`](@ref Flux.@layer), which means that [`params`](@ref Flux.params) will see the contents, and [`gpu`](@ref Flux.gpu) will move their arrays to the GPU.
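For instance, here is a minimal sketch putting these keywords together; the sizes, the `tanh` activation, and the explicit `glorot_uniform` init are arbitrary choices for illustration:

```julia
using Flux

# A Dense layer with a tanh activation, an explicit init function, and no bias:
layer = Dense(2 => 3, tanh; init = Flux.glorot_uniform, bias = false)

x = rand(Float32, 2, 5)        # a batch of 5 input vectors, each of length 2
y = layer(x)                   # size(y) == (3, 5)

# The activation is broadcast over the affine map, so (with bias = false):
y ≈ tanh.(layer.weight * x)    # true

# Thanks to @layer, params sees the trainable arrays -- here just the weight matrix:
Flux.params(layer)
```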
By contrast, `Chain` itself contains no parameters, but connects other layers together. The section on [dataflow layers](@ref man-dataflow-layers) introduces others like this.
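A small sketch of that contrast, with arbitrary layer sizes:

```julia
using Flux

model = Chain(Dense(10 => 5, relu), Dense(5 => 2))

# The Chain itself owns no arrays; everything params finds
# comes from the two Dense layers it wraps:
Flux.params(model)                 # 4 arrays: two weight matrices, two bias vectors

# Applying the Chain is just applying its layers in order:
x = rand(Float32, 10)
model(x) ≈ model[2](model[1](x))   # true
```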
## Fully Connected
Besides images (2D data), they also work with 1D data, where for instance a stereo sound recording with 1000 samples might have `size(x) == (1000, 2, 1)`. They will also work with 3D data, `ndims(x) == 5`, where again the last two dimensions are channel and batch.
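As a rough sketch of those shapes with convolutional layers (the filter sizes and channel counts here are arbitrary):

```julia
using Flux

# 1D data: a stereo recording, 1000 samples × 2 channels × batch of 1.
x1 = rand(Float32, 1000, 2, 1)
c1 = Conv((5,), 2 => 7, relu)        # length-5 filter, 2 channels in, 7 out
size(c1(x1))                         # (996, 7, 1)

# 3D data: ndims(x) == 5, channel and batch still come last.
x3 = rand(Float32, 16, 16, 16, 3, 1)
c3 = Conv((3, 3, 3), 3 => 8)
size(c3(x3))                         # (14, 14, 14, 8, 1)
```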
To understand how strides and padding work, the article by [Dumoulin & Visin](https://arxiv.org/abs/1603.07285) has great illustrations.
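For a quick feel of what `stride` and `pad` do to the output size, a small sketch (one 50×50 RGB image, arbitrary channel counts):

```julia
using Flux

x = rand(Float32, 50, 50, 3, 1)                       # one 50×50 RGB image

size(Conv((3, 3), 3 => 7)(x))                         # (48, 48, 7, 1) -- no padding trims the edges
size(Conv((3, 3), 3 => 7; pad = 1)(x))                # (50, 50, 7, 1) -- pad = 1 preserves the size
size(Conv((3, 3), 3 => 7; pad = 1, stride = 2)(x))    # (25, 25, 7, 1) -- stride = 2 halves it
```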
```@docs
Conv
Flux.Embedding
Flux.EmbeddingBag
```
## [Dataflow Layers, or Containers](@id man-dataflow-layers)
The basic `Chain(F, G, H)` applies the layers it contains in sequence, equivalent to `H ∘ G ∘ F`. Flux has some other layers which contain layers, but connect them up in a more complicated way: `SkipConnection` allows ResNet's residual connection.
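For example, a minimal residual block along those lines (layer sizes arbitrary):

```julia
using Flux

inner = Chain(Dense(5 => 5, relu), Dense(5 => 5))

# SkipConnection(layer, connection) computes connection(layer(x), x);
# with + as the connection this is a ResNet-style residual block.
block = SkipConnection(inner, +)

x = rand(Float32, 5)
block(x) ≈ inner(x) .+ x     # true
```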
These are all defined with [`@layer`](@ref)` :expand TypeName`, which tells the pretty-printing code that they contain other layers.
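As a minimal sketch of what that looks like for a container of your own (the `TwoStage` name and its fields are made up for illustration):

```julia
using Flux

# A hypothetical container holding two sub-layers:
struct TwoStage{A, B}
    first::A
    second::B
end

(m::TwoStage)(x) = m.second(m.first(x))

# :expand asks the pretty-printer to unfold the contents, as Chain does:
Flux.@layer :expand TwoStage

model = TwoStage(Dense(4 => 3, relu), Dense(3 => 2))
model(rand(Float32, 4))
```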