README.md
+45-45
@@ -7,51 +7,6 @@ Rails HTML Sanitizer is only intended to be used with Rails applications. If you
7
7
8
8
## Usage
9
9
10
-
### A note on HTML entities
11
-
12
-
__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
13
-
14
-
Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
15
-
16
-
This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
17
-
18
-
19
-
#### A concrete example showing the problem that can arise
20
-
21
-
Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
22
-
23
-
If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
24
-
25
-
When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp; Co.".
26
-
27
-
Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
28
-
29
-
30
-
#### Suggested alternatives
31
-
32
-
You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
33
-
34
-
That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
35
-
36
-
If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
37
-
38
-
39
-
### A note on module names
40
-
41
-
In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
42
-
43
-
- `Rails::HTML` for general functionality (replacing `Rails::Html`)
44
-
- `Rails::HTML4` containing sanitizers that parse content as HTML4
45
-
- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
46
-
47
-
The following aliases are maintained for backwards compatibility:
48
-
49
-
- `Rails::Html` points to `Rails::HTML`
50
-
- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
51
-
- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
52
-
- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
53
-
54
-
55
10
### Sanitizers
56
11
57
12
All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
@@ -219,6 +174,51 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
180
+
181
+
Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
182
+
183
+
This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
184
+
185
+
186
+
#### A concrete example showing the problem that can arise
187
+
188
+
Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
189
+
190
+
If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
191
+
192
+
When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp; Co.".
193
+
194
+
Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
195
+
196
+
197
+
#### Suggested alternatives
198
+
199
+
You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
200
+
201
+
That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
202
+
203
+
If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
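
The double-escaping problem and the "persist the raw input" alternative described above can be illustrated with a small sketch. It deliberately does not call the Rails HTML sanitizers: Ruby's standard-library `CGI.escapeHTML` stands in for the entity-escaping step only, and the helper names (`escape_for_html`, `browser_text`) are hypothetical, introduced just for this illustration.

```ruby
require "cgi"

RAW_INPUT = "JPMorgan Chase & Co."

# Stand-in for the entity escaping a sanitizer performs (assumption:
# only the escaping step is modeled here, not tag or attribute filtering).
def escape_for_html(text)
  CGI.escapeHTML(text)
end

# Approximates what a browser shows: entities are decoded exactly once.
def browser_text(html)
  CGI.unescapeHTML(html)
end

# Problem: escape before persisting, then escape again at render time.
stored_escaped = escape_for_html(RAW_INPUT)      # what lands in the database
rendered_twice = escape_for_html(stored_escaped) # what the view emits
puts rendered_twice               # JPMorgan Chase &amp;amp; Co.
puts browser_text(rendered_twice) # JPMorgan Chase &amp; Co.

# Alternative: persist the raw input and escape once, at render time.
rendered_once = escape_for_html(RAW_INPUT)
puts browser_text(rendered_once)  # JPMorgan Chase & Co.
```

Because the stored value stays raw in the second approach, it can also be handed to a non-HTML channel (such as SMS) without stray entities, provided that channel applies its own appropriate sanitization.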
204
+
205
+
206
+
### A note on module names
207
+
208
+
In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
209
+
210
+
- `Rails::HTML` for general functionality (replacing `Rails::Html`)
211
+
- `Rails::HTML4` containing sanitizers that parse content as HTML4
212
+
- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
213
+
214
+
The following aliases are maintained for backwards compatibility:
215
+
216
+
- `Rails::Html` points to `Rails::HTML`
217
+
- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
218
+
- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
219
+
- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
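
To make "points to" in the alias list above concrete: in Ruby, a backwards-compatible alias of this kind is typically just a second constant bound to the same module or class object. The sketch below uses a hypothetical `AliasDemo` namespace with an empty placeholder class; it shows the general constant-aliasing technique and is not taken from the library's source.

```ruby
# Hypothetical namespace, used only to illustrate constant aliasing.
module AliasDemo
  module HTML4
    # Placeholder class; the real sanitizers are not reproduced here.
    class FullSanitizer; end
  end

  module HTML
    # Old-style name bound to the HTML4 implementation.
    FullSanitizer = HTML4::FullSanitizer
  end

  # Legacy capitalization bound to the new module.
  Html = HTML
end

# Both names resolve to the very same object, so existing code that
# references the old constant keeps working unchanged.
puts AliasDemo::Html.equal?(AliasDemo::HTML)                                 # true
puts AliasDemo::HTML::FullSanitizer.equal?(AliasDemo::HTML4::FullSanitizer)  # true
puts AliasDemo::Html::FullSanitizer.name  # AliasDemo::HTML4::FullSanitizer
```

One consequence of this approach is that an aliased class reports the name it was first defined under, which can matter when class names appear in logs or error messages.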