Commit 79c5b50: fixup
1 parent 5886dc4

3 files changed: +15 -12 lines

docs/src/models/activation.md (+1 -1)

@@ -1,5 +1,5 @@
 
-# Activation Functions from NNlib.jl
+# [Activation Functions from NNlib.jl](@id man-activation-functions)
 
 These non-linearities used between layers of your model are exported by the [NNlib](https://github.com/FluxML/NNlib.jl) package.
 
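As a quick reminder, not part of the diff, of how these exported functions are used; `relu` and `sigmoid` are among NNlib's exports:

```julia
using Flux  # re-exports NNlib's activation functions

relu.([-1.0, 0.0, 2.0])     # broadcast directly over an array: [0.0, 0.0, 2.0]
Dense(3 => 2, sigmoid)      # or pass one to a layer's constructor
```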

docs/src/models/layers.md (+13 -10)
@@ -1,18 +1,21 @@
 # Built-in Layer Types
 
-If you started at the beginning, then you have already met the basic [`Dense`](@ref) layer, and seen [`Chain`](@ref) for combining layers. These core layers form the foundation of almost all neural networks.
+If you started at the beginning of the guide, then you have already met the
+basic [`Dense`](@ref) layer, and seen [`Chain`](@ref) for combining layers.
+These core layers form the foundation of almost all neural networks.
 
-The `Dense` layer
+The `Dense` layer exemplifies several features:
 
-* Weight matrices are created ... Many layers take an `init` keyword, accepts a function acting like `rand`. That is, `init(2,3,4)` creates an array of this size. ... always on the CPU.
+* It contains an [activation function](@ref man-activation-functions), which is broadcast over the output. Because this broadcast can be fused with other operations, doing so is more efficient than applying the activation function separately.
 
-* An activation function. This is broadcast over the output: `Flux.Scale(3, tanh)([1,2,3]) ≈ tanh.(1:3)`
+* It takes an `init` keyword, which accepts a function acting like `rand`. That is, `init(2,3,4)` should create an array of this size. Flux has [many such functions](@ref man-init-funcs) built-in. All make a CPU array, moved later with [`gpu`](@ref Flux.gpu) if desired.
 
-* The bias vector is always intialised `Flux.zeros32`. The keyword `bias=false` will turn this off.
+* The bias vector is always initialised with [`Flux.zeros32`](@ref). The keyword `bias=false` will turn this off, i.e. keep the bias permanently zero.
 
+* It is annotated with [`@layer`](@ref Flux.@layer), which means that [`params`](@ref Flux.params) will see the contents, and [`gpu`](@ref Flux.gpu) will move their arrays to the GPU.
 
-* All layers are annotated with `@layer`, which means that `params` will see the contents, and `gpu` will move their arrays to the GPU.
-
+By contrast, `Chain` itself contains no parameters, but connects other layers together.
+The section on [dataflow layers](@ref man-dataflow-layers) introduces others like this.
 
 ## Fully Connected
 
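A minimal sketch, not part of the diff, of the features the new bullets describe; the layer sizes and the `init` choice are arbitrary:

```julia
using Flux

# Dense with an activation (fused broadcast), a custom init, and no bias:
layer = Dense(3 => 4, relu; init=Flux.glorot_normal, bias=false)

# Chain holds no parameters of its own; it only connects layers:
model = Chain(layer, Dense(4 => 1))

x = rand(Float32, 3, 8)   # 8 samples with 3 features each
size(model(x))            # (1, 8)
```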

@@ -32,7 +35,7 @@ They all expect images in what is called WHCN order: a batch of 32 colour images
 
 Besides images, 2D data, they also work with 1D data, where for instance stereo sound recording with 1000 samples might have `size(x) == (1000, 2, 1)`. They will also work with 3D data, `ndims(x) == 5`, where again the last two dimensions are channel and batch.
 
-To understand how `stride` ?? there's a cute article.
+To understand how strides and padding work, the article by [Dumoulin & Visin](https://arxiv.org/abs/1603.07285) has great illustrations.
 
 ```@docs
 Conv
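A short sketch, not in the commit, of the WHCN layout these layers expect; the image size and channel counts are illustrative:

```julia
using Flux

# A 5×5 convolution taking 3 input channels to 7 output channels:
layer = Conv((5, 5), 3 => 7, relu)

# WHCN order: width, height, channel, batch; here 32 colour images of 224×224 pixels.
x = rand(Float32, 224, 224, 3, 32)
size(layer(x))   # (220, 220, 7, 32) with the default stride=1 and pad=0
```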
@@ -77,9 +80,9 @@ Flux.Embedding
 Flux.EmbeddingBag
 ```
 
-## Dataflow Layers, or Containers
+## [Dataflow Layers, or Containers](@id man-dataflow-layers)
 
-The basic `Chain(F, G, H)` applies the layers it contains in sequence, equivalent to `H ∘ G ∘ F`. Flux has some other layers which contain layers, but connect them up in a more complicated way: `SkipConnection` allows ResNet's ??residual connection.
+The basic `Chain(F, G, H)` applies the layers it contains in sequence, equivalent to `H ∘ G ∘ F`. Flux has some other layers which contain layers, but connect them up in a more complicated way: `SkipConnection` allows ResNet's residual connection.
 
 These are all defined with [`@layer`](@ref)` :expand TypeName`, which tells the pretty-printing code that they contain other layers.
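A minimal sketch, not part of the commit, of the residual connection mentioned above; the inner layer and the use of `+` as the combining function are arbitrary choices:

```julia
using Flux

# A ResNet-style residual block: apply the inner layer, then add the input back on.
inner = Dense(8 => 8, relu)
block = SkipConnection(inner, +)

x = rand(Float32, 8, 4)
block(x) ≈ inner(x) .+ x   # true, since SkipConnection(f, +) computes f(x) + x
```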

docs/src/utilities.md (+1 -1)

@@ -1,4 +1,4 @@
-# Random Weight Initialisation
+# [Random Weight Initialisation](@id man-init-funcs)
 
 Flux initialises convolutional layers and recurrent cells with `glorot_uniform` by default.
 Most layers accept a function as an `init` keyword, which replaces this default. For example:
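The example that follows "For example:" lies outside the hunk shown above. As a hedged illustration of the `init` keyword, not necessarily the example in the file:

```julia
using Flux

# Swap the default glorot_uniform for another built-in initialiser:
layer = Dense(4 => 2; init=Flux.kaiming_normal)

# Or pass any function that acts like rand; here init(2, 4) makes the 2×4 weight matrix:
layer2 = Dense(4 => 2; init=(dims...) -> 0.01f0 .* randn(Float32, dims...))
```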
