
Commit

Merge pull request #14 from tablecheck/generic-formatter-wip
Add generic HTML Formatter
johnnyshields authored Mar 11, 2025
2 parents dcfb2f3 + 451904c commit acaed36
Showing 8 changed files with 466 additions and 89 deletions.
81 changes: 74 additions & 7 deletions README.md
Expand Up @@ -49,25 +49,92 @@ Output:
The <del class="diffdel">quick </del>fox <del class="diffmod">jumped</del><ins class="diffmod">hopped</ins> over the <ins class="diffins">lazy</ins> dog.
```

### Formatting Using HTML Spans
### Formatting the HTML Output

As an alternative to the default `<del>` and `<ins>` tags,
HTMLDiff provides an alternative formatter to use `<span>` tags:
HTMLDiff includes a customizable `HtmlFormatter` that lets you choose the HTML tags and CSS classes used for each kind of diff element: deleted, inserted, replaced, and (optionally) unchanged content.

```ruby
diff = HTMLDiff.diff(old_text, new_text, formatter: HTMLDiff::Formatters::SpanFormatter)
old_text = "The quick red fox jumped over the dog."
new_text = "The red fox hopped over the lazy dog."

diff = HTMLDiff.diff(old_text, new_text, html_format: {
tag: 'span',
class_delete: 'highlight removed',
class_insert: 'highlight added'
})
```

Output:
```html
The <span class="highlight removed">quick </span>red fox <span class="highlight removed">jumped</span><span class="highlight added">hopped</span> over the <span class="highlight added">lazy</span> dog.
```

#### Customization Options

The `html_format:` argument of `HTMLDiff.diff` supports the following options:

| Option | Description |
|-------------------------|------------------------------------------------------------------------------------------------------------|
| `:tag` | Base HTML tag to use for all change nodes (default: none) |
| `:tag_delete` | HTML tag for deleted content (overrides `:tag`, default: `"del"`) |
| `:tag_insert` | HTML tag for inserted content (overrides `:tag`, default: `"ins"`) |
| `:tag_replace` | HTML tag for replaced content (overrides `:tag_delete`, `:tag_insert`, `:tag`) |
| `:tag_replace_delete` | HTML tag for deleted content in replacements (overrides `:tag_replace`, `:tag_delete`, `:tag`) |
| `:tag_replace_insert` | HTML tag for inserted content in replacements (overrides `:tag_replace`, `:tag_insert`, `:tag`) |
| `:tag_unchanged` | HTML tag for unchanged content (optional) |
| `:class` | Base CSS class(es) for all change nodes |
| `:class_delete` | CSS class(es) for deleted content (overrides `:class`) |
| `:class_insert` | CSS class(es) for inserted content (overrides `:class`) |
| `:class_replace` | CSS class(es) for replaced content (overrides `:class_delete`, `:class_insert`, `:class`) |
| `:class_replace_delete` | CSS class(es) for deleted content in replacements (overrides `:class_replace`, `:class_delete`, `:class`) |
| `:class_replace_insert` | CSS class(es) for inserted content in replacements (overrides `:class_replace`, `:class_insert`, `:class`) |
| `:class_unchanged` | CSS class(es) for unchanged content (optional) |
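
The "overrides" column describes a most-specific-first fallback chain. The following standalone sketch restates that lookup for the deleted half of a replacement. It is not part of HTMLDiff: the helper name `first_set` and the sample option values are illustrative assumptions, and the chain order is taken from the table above.

```ruby
# Illustrative helper (hypothetical, not an HTMLDiff API): returns the first
# truthy option among `keys`, or `default` when none of them is set.
def first_set(options, keys, default = nil)
  keys.each do |key|
    value = options[key]
    return value if value
  end
  default
end

options = { tag: 'span', tag_replace: 'mark', class_delete: 'deleted' }

# Deleted half of a replacement: the most specific option is checked first.
tag_chain   = %i[tag_replace_delete tag_replace tag_delete tag]
class_chain = %i[class_replace_delete class_replace class_delete class]

puts first_set(options, tag_chain, 'del')          # :tag_replace wins over :tag
puts first_set(options, class_chain)               # falls back to :class_delete
puts first_set(options, %i[tag_delete tag], 'del') # plain delete uses :tag
```

With these sample options, a replaced-and-deleted node would resolve to the `mark` tag with the `deleted` class, while a plain deletion would use `span`.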

#### Example: Wrapping unchanged text in tags

```ruby
diff = HTMLDiff.diff(old_text, new_text, html_format: {
tag_unchanged: 'span',
class_unchanged: 'unchanged',
tag: 'span',
class_delete: 'deleted',
class_insert: 'inserted'
})
```

Output:

```html
The <span class="diff-del">quick </span>fox <span class="diff-mod diff-del">jumped</span><span class="diff-mod diff-ins">hopped</span> over the <span class="diff-ins">lazy</span> dog.
<span class="unchanged">The </span><span class="deleted">quick </span><span class="unchanged">red fox </span><span class="deleted">jumped</span><span class="inserted">hopped</span><span class="unchanged"> over the </span><span class="inserted">lazy</span><span class="unchanged"> dog.</span>
```

#### Example: Special handling for replacements

```ruby
diff = HTMLDiff.diff(old_text, new_text, html_format: {
tag_delete: 'span',
tag_insert: 'div',
tag_replace: 'mark',
class_delete: 'deleted',
class_insert: 'inserted',
class_replace_delete: 'replaced deleted',
class_replace_insert: 'replaced inserted'
})
```

Output:

```html
The <span class="deleted">quick </span>red fox <mark class="replaced deleted">jumped</mark><mark class="replaced inserted">hopped</mark> over the <div class="inserted">lazy</div> dog.
```
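
The class options are documented as accepting "CSS class(es)". In the `HtmlFormatter` source added by this commit, an Array of classes is joined with spaces and an empty class list produces no `class` attribute. The sketch below restates that rendering step as a standalone method; the name `render_tag` is a hypothetical stand-in for the formatter's private helper, and it omits the helper's stripping of `<` and `>` from the tag name.

```ruby
# Hypothetical standalone restatement of the tag-rendering step.
def render_tag(tag, css_class, content)
  return '' unless content

  # Accept either a String or an Array of class names.
  css_class = css_class.join(' ') if css_class.is_a?(Array)
  # Treat an empty String (or empty Array) as "no class".
  css_class = nil if css_class&.empty?

  class_attr = css_class ? %( class="#{css_class}") : ''
  "<#{tag}#{class_attr}>#{content}</#{tag}>"
end

puts render_tag('span', %w[highlight removed], 'quick ')
puts render_tag('ins', nil, 'lazy')
```

The first call prints `<span class="highlight removed">quick </span>`; the second prints `<ins>lazy</ins>` with no `class` attribute.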

### Using a Custom Output Formatter

You can customize the output format by creating your own formatter.
Your formatter can be any object that responds to the `#format` method,
If the HTML formatting options above aren't sufficient for your use case,
or if you'd like to output to an alternative format (e.g. XML, JSON, etc.),
you can further customize the output by creating your own formatter.

Your formatter may be any object that responds to the `#format` method,
and it can return whatever object type you'd like (typically a String).
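
A minimal sketch of such a formatter follows, assuming the `[action, old_string, new_string]` tuples documented in the formatter sources of this commit (actions `'='`, `'-'`, `'+'`, `'!'`). The module name `PlainTextFormatter`, the bracket markers, and the sample `changes` array are illustrative assumptions rather than part of HTMLDiff.

```ruby
# Hypothetical formatter: renders changes as plain text with wdiff-style markers.
module PlainTextFormatter
  extend self

  # @param changes [Array<Array>] Array of [action, old_string, new_string] tuples
  # @return [String] plain-text diff
  def format(changes)
    changes.each_with_object(+'') do |(action, old_string, new_string), out|
      case action
      when '=' then out << new_string.to_s
      when '-' then out << "[-#{old_string}-]"
      when '+' then out << "{+#{new_string}+}"
      when '!' then out << "[-#{old_string}-]{+#{new_string}+}"
      end
    end
  end
end

# Hand-written sample input; real input would come from the differ.
changes = [
  ['=', 'The ', 'The '],
  ['-', 'quick ', nil],
  ['=', 'fox ', 'fox '],
  ['!', 'jumped', 'hopped']
]

puts PlainTextFormatter.format(changes)
```

Because `HTMLDiff.diff` accepts a `formatter:` keyword, an object like this could presumably be passed as `HTMLDiff.diff(old_text, new_text, formatter: PlainTextFormatter)`.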

```ruby
Expand Down
15 changes: 11 additions & 4 deletions lib/html_diff.rb
Expand Up @@ -2,6 +2,7 @@

require 'html_diff/tokenizer'
require 'html_diff/differ'
require 'html_diff/formatters/html_formatter'
require 'html_diff/formatters/del_ins_formatter'
require 'html_diff/formatters/span_formatter'
require 'html_diff/diff_builder' # deprecated
Expand All @@ -17,16 +18,22 @@ module HTMLDiff
# @param new_string [String] The new string
# @option tokenizer [Object] An optional object which responds to `tokenize`,
# which is used to break the input strings into an Array of LCS-diffable tokens.
# @option html_format [Hash] An optional hash of options to pass to
# Formatters::HtmlFormatter. When given, the `formatter` option is ignored.
# @option formatter [Object] An optional object which responds to `format`,
# which renders the LCS-diff output.
# @return [String] Diff of the two strings with additions and deletions marked.
def diff(old_string, new_string, tokenizer: nil, formatter: nil)
def diff(old_string, new_string, tokenizer: nil, html_format: nil, formatter: nil)
tokenizer ||= Tokenizer
formatter ||= Formatters::DelInsFormatter

old_tokens = tokenizer.tokenize(old_string)
new_tokens = tokenizer.tokenize(new_string)

changes = Differ.diff(old_tokens, new_tokens)
formatter.format(changes)

if html_format
Formatters::HtmlFormatter.format(changes, **html_format)
else
formatter ||= Formatters::DelInsFormatter
formatter.format(changes)
end
end
end
35 changes: 7 additions & 28 deletions lib/html_diff/formatters/del_ins_formatter.rb
@@ -1,44 +1,23 @@
# frozen_string_literal: true

require_relative 'html_formatter'

module HTMLDiff
module Formatters
# The DelInsFormatter renders the diff as HTML with <del> and <ins> tags.
module DelInsFormatter
extend self

# Format a sequence of changes from LcsDiff into HTML
# Format a sequence of diff changes into HTML.
#
# @param changes [Array<Array>] Array of [action, old_string, new_string] tuples,
# where action is one of '=' (equal), '-' (remove), '+' (add), or '!' (replace)
# @return [String] HTML formatted diff
def format(changes)
changes.each_with_object(+'') do |(action, old_string, new_string), content|
case action
when '=' # equal
content << new_string if new_string
when '-' # remove
content << html_tag('del', 'diffdel', old_string)
when '+' # add
content << html_tag('ins', 'diffins', new_string)
when '!' # replace
content << html_tag('del', 'diffmod', old_string)
content << html_tag('ins', 'diffmod', new_string)
end
end
end

private

# Render an HTML tag
#
# @param tag_name [String] The name of the HTML tag to use
# @param css_class [String] The CSS class for the tag
# @param content [String] The words to insert
# @return [String] HTML markup with appropriate tags
def html_tag(tag_name, css_class, content)
return '' unless content

%(<#{tag_name} class="#{css_class}">#{content}</#{tag_name}>)
HtmlFormatter.format(changes,
class_delete: 'diffdel',
class_insert: 'diffins',
class_replace: 'diffmod')
end
end
end
Expand Down
88 changes: 88 additions & 0 deletions lib/html_diff/formatters/html_formatter.rb
@@ -0,0 +1,88 @@
# frozen_string_literal: true

module HTMLDiff
module Formatters
# Renders the diff as HTML with customizable tags and classes.
module HtmlFormatter
extend self

# Format a sequence of diff changes into HTML.
#
# @param changes [Array<Array>] Array of [action, old_string, new_string] tuples,
# where action is one of '=' (unchanged), '-' (delete), '+' (insert), or '!' (replace)
# @option tag [String] HTML tag to use for all delete, insert, and replace
# nodes. Can be overridden by other options.
# @option tag_delete [String] HTML tag to use for delete nodes (overrides :tag)
# @option tag_insert [String] HTML tag to use for insert nodes (overrides :tag)
# @option tag_replace [String] HTML tag to use for replace nodes (overrides
# :tag_delete, :tag_insert, and :tag)
# @option tag_replace_delete [String] HTML tag to use for deleted content
# in replace nodes (overrides :tag_replace, :tag_delete, and :tag)
# @option tag_replace_insert [String] HTML tag to use for inserted content
# in replace nodes (overrides :tag_replace, :tag_insert, and :tag)
# @option tag_unchanged [String] HTML tag to use for unchanged content.
# If not specified, unchanged content is not wrapped in a tag.
# @option class [String, Array<String>] The CSS class(es) to use for all
# deleted, inserted, and replace nodes. Can be overridden by other options.
# @option class_delete [String, Array<String>] The CSS class(es) to use for
# deleted nodes (overrides :class)
# @option class_insert [String, Array<String>] The CSS class(es) to use for
# inserted nodes (overrides :class)
# @option class_replace [String, Array<String>] The CSS class(es) to use for
# replace nodes (overrides :class_delete, :class_insert, and :class)
# @option class_replace_delete [String, Array<String>] The CSS class(es) to
# use for deleted content in replace nodes (overrides :class_replace,
# :class_delete, and :class)
# @option class_replace_insert [String, Array<String>] The CSS class(es) to
# use for inserted content in replace nodes (overrides :class_replace,
# :class_insert, and :class)
# @option class_unchanged [String, Array<String>] The CSS class(es) to use for
# unchanged content. If not specified, unchanged content is not wrapped in a tag.
# @return [String] HTML formatted diff.
def format(changes, **kwargs)
changes.each_with_object(+'') do |(action, old_string, new_string), content|
case action
when '=' # unchanged
next unless new_string

content << (kwargs[:tag_unchanged] ? html_tag(kwargs[:tag_unchanged], kwargs[:class_unchanged], new_string) : new_string)
when '-' # delete
tag = kwargs[:tag_delete] || kwargs[:tag] || 'del'
css_class = kwargs[:class_delete] || kwargs[:class]
content << html_tag(tag, css_class, old_string) if old_string
when '+' # insert
tag = kwargs[:tag_insert] || kwargs[:tag] || 'ins'
css_class = kwargs[:class_insert] || kwargs[:class]
content << html_tag(tag, css_class, new_string) if new_string
when '!' # replace
tag_delete = kwargs[:tag_replace_delete] || kwargs[:tag_replace] || kwargs[:tag_delete] || kwargs[:tag] || 'del'
css_class_delete = kwargs[:class_replace_delete] || kwargs[:class_replace] || kwargs[:class_delete] || kwargs[:class]
content << html_tag(tag_delete, css_class_delete, old_string) if old_string

tag_insert = kwargs[:tag_replace_insert] || kwargs[:tag_replace] || kwargs[:tag_insert] || kwargs[:tag] || 'ins'
css_class_insert = kwargs[:class_replace_insert] || kwargs[:class_replace] || kwargs[:class_insert] || kwargs[:class]
content << html_tag(tag_insert, css_class_insert, new_string) if new_string
end
end
end

private

# Render an HTML tag
#
# @param tag [String] HTML tag to use
# @param css_class [String] The CSS class(es) for the tag
# @param content [String] The words to insert
# @return [String] HTML markup with appropriate tags
def html_tag(tag, css_class, content)
return '' unless content

tag = tag.delete_prefix('<')
tag = tag.delete_suffix('>')
css_class = css_class.join(' ') if css_class.is_a?(Array)
css_class = nil if css_class&.empty?
"<#{tag}#{%( class="#{css_class}") if css_class}>#{content}</#{tag}>"
end
end
end
end
34 changes: 8 additions & 26 deletions lib/html_diff/formatters/span_formatter.rb
@@ -1,5 +1,7 @@
# frozen_string_literal: true

require_relative 'html_formatter'

module HTMLDiff
module Formatters
# The SpanFormatter renders the diff as HTML with <span> tags.
Expand All @@ -12,32 +14,12 @@ module SpanFormatter
# where action is one of '=' (equal), '-' (remove), '+' (add), or '!' (replace)
# @return [String] HTML formatted diff
def format(changes)
changes.each_with_object(+'') do |(action, old_string, new_string), content|
case action
when '=' # equal
content << new_string if new_string
when '-' # remove
content << span_tag('diff-del', old_string)
when '+' # add
content << span_tag('diff-ins', new_string)
when '!' # replace
content << span_tag('diff-mod diff-del', old_string)
content << span_tag('diff-mod diff-ins', new_string)
end
end
end

private

# Render an HTML tag
#
# @param css_class [String] The CSS class(es) for the tag
# @param content [String] The words to insert
# @return [String] HTML markup with appropriate tags
def span_tag(css_class, content)
return '' unless content

%(<span class="#{css_class}">#{content}</span>)
HtmlFormatter.format(changes,
tag: 'span',
class_delete: 'diff-del',
class_insert: 'diff-ins',
class_replace_delete: 'diff-mod diff-del',
class_replace_insert: 'diff-mod diff-ins')
end
end
end
Expand Down
12 changes: 0 additions & 12 deletions spec/html_diff/formatters/del_ins_formatter_spec.rb
Expand Up @@ -148,16 +148,4 @@
end
end
end

describe '.html_tag' do
it 'creates an HTML tag with the specified attributes' do
result = described_class.send(:html_tag, 'span', 'highlight', 'content')
expect(result).to eq('<span class="highlight">content</span>')
end

it 'returns empty string when content is nil' do
result = described_class.send(:html_tag, 'span', 'highlight', nil)
expect(result).to eq('')
end
end
end