Commit ca29c20

doc: update README moving verbose notes after usage
1 parent 3b31be5 commit ca29c20

1 file changed

+45
-45
lines changed

README.md

@@ -7,51 +7,6 @@ Rails HTML Sanitizer is only intended to be used with Rails applications. If you

 ## Usage

-### A note on HTML entities
-
-__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
-
-Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
-
-This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
-
-
-#### A concrete example showing the problem that can arise
-
-Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
-
-If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
-
-When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp;amp; Co.".
-
-Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
-
-
-#### Suggested alternatives
-
-You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
-
-That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
-
-If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
-
-
-### A note on module names
-
-In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
-
-- `Rails::HTML` for general functionality (replacing `Rails::Html`)
-- `Rails::HTML4` containing sanitizers that parse content as HTML4
-- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
-
-The following aliases are maintained for backwards compatibility:
-
-- `Rails::Html` points to `Rails::HTML`
-- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
-- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
-- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
-
-
 ### Sanitizers

 All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
@@ -219,6 +174,51 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
 <%= sanitize @comment, scrubber: CommentScrubber.new %>
 ```

+### A note on HTML entities
+
+__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
+
+Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
+
+This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
+
+
+#### A concrete example showing the problem that can arise
+
+Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
+
+If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
+
+When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp;amp; Co.".
+
+Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
+
+
+#### Suggested alternatives
+
+You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
+
+That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
+
+If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
+
+
+### A note on module names
+
+In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
+
+- `Rails::HTML` for general functionality (replacing `Rails::Html`)
+- `Rails::HTML4` containing sanitizers that parse content as HTML4
+- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
+
+The following aliases are maintained for backwards compatibility:
+
+- `Rails::Html` points to `Rails::HTML`
+- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
+
+
 ## Installation

 Add this line to your application's Gemfile:
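
The double-escaping problem described in the relocated "A note on HTML entities" section can be reproduced with a minimal sketch. It uses Ruby's standard-library `CGI.escapeHTML` as a stand-in for the entity-escaping step only; it is not the Rails HTML sanitizer API, and `escape_for_html` is a hypothetical helper name introduced for illustration.

```ruby
require "cgi"

# Hypothetical helper: stands in for the entity-escaping part of HTML
# sanitization. This is an assumption for illustration, not the gem's API.
def escape_for_html(text)
  CGI.escapeHTML(text)
end

raw = "JPMorgan Chase & Co."

# Escaping once before persisting, then again at render time,
# escapes the ampersand of the first entity a second time.
stored   = escape_for_html(raw)
rendered = escape_for_html(stored)

puts stored   # JPMorgan Chase &amp; Co.
puts rendered # JPMorgan Chase &amp;amp; Co.

# Persisting the raw input and escaping only at render time avoids this.
puts escape_for_html(raw)
```

Keeping the raw string in storage also leaves it usable in non-HTML contexts such as SMS, where entities would be inappropriate.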
