Skip to content

content: Implement syntax hightlighting for code blocks #242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 17, 2023

Conversation

rajveermalviya
Copy link
Member

@rajveermalviya rajveermalviya commented Jul 25, 2023

Context

Implements the syntax highlighting for the code blocks by adapting the css styles from the zulip web frontend to flutter's TextStyle.

Implementation details

While parsing the HTML nodes we store the text along with the className in CodeSpanNode.
Both of them are strings, but className can probably be changed to dart enums with their conversion functions (string <-> enum), so that app doesn't unnecessarily store the short strings hogging the memory.

In the widgets we do pattern matching over the className to match the specific TextStyle. Then we use it to generate the TextSpans for the RichText widget.

There is also an assumption that a span node will only have a single class associated with it, please let me know if it's an invalid assumption & I will update the code accordingly.

Notes

  • There are some styles that are repeating in the css classes which could be consolidated and referenced multiple times. But for now they are all defined separately for easier review.
  • The .err class in web frontend uses border, it cannot be mapped directly in TextSpans so it's TODO for now. (I read the comment documenting the trouble for _buildInlineCode).
    • Should the .err class follow _buildInlineCode by using a suitable background color? Currently it just doesn't style it.

Testing

Didn't add any new tests, but updated the previous ones.

Fixes: #191

@rajveermalviya
Copy link
Member Author

Preview of how it looks:

Flutter Web
flutter web

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting! Thanks @rajveermalviya for the contribution.

I haven't done a full code review, but some high-level thoughts:

Didn't add any new tests, but updated the previous ones.

Cool, that's fine for a draft for early feedback. We'll want some tests for this functionality before we merge it, though.

(Some of the existing functionality around it doesn't have tests, because a lot of the content parsing was written when this was still an experimental prototype. But now that we're committed to building the Flutter app, we're generally writing tests for each new feature as we add it, and gradually going back and adding tests for existing features.)

There is also an assumption that a span node will only have a single class associated with it, please let me know if it's an invalid assumption & I will update the code accordingly.

That assumption sounds good, if it's what you're seeing in test messages.

What we should then do is just make sure that if the assumption is ever violated, the parser produces an UnimplementedNode — so probably an UnimplementedBlockContentNode instead of the CodeBlockNode. See #190 for background on our plans there.

Both of them are strings, but className can probably be changed to dart enums with their conversion functions (string <-> enum), so that app doesn't unnecessarily store the short strings hogging the memory.

Yeah, an enum sounds good. That will also let the pattern-match in the widgets code be exhaustive.

Should the .err class follow _buildInlineCode by using a suitable background color? Currently it just doesn't style it.

Yeah, probably the best option is to follow what we do there. Should definitely get a comment calling out that that's what's happening.

Comment on lines 328 to 329
// .gd { color: hsl(0deg 100% 31%); }
final _kCodeBlockStyleGd = TextStyle(color: const HSLColor.fromAHSL(1, 0, 1, 0.31).toColor(), fontStyle: FontStyle.italic);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These appear not to match.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

// .il { color: hsl(0deg 0% 40%); }
final _kCodeBlockStyleIl = TextStyle(color: const HSLColor.fromAHSL(1, 0, 0, 0.40).toColor());

class _CodeBlockHighlightedBuilder {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happily doesn't need a builder class, because it doesn't have any state. (The reason we have _InlineContentBuilder is in order to track _recognizer and _recognizerStack as they change over the course of the recursion.)

Instead these methods can just go on the widget class that needs them. Or as top-level functions if they end up needing to be shared by two widget classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 312 to 313
// .hll { background-color: hsl(60deg 100% 90%); }
final _kCodeBlockStyleHll = TextStyle(backgroundColor: const HSLColor.fromAHSL(1, 60, 1, 0.9).toColor());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I appreciate how the CSS is interleaved here as comments — it helps a lot in tracing the relationship of this code to the web app's styles.

Comment on lines 254 to 259
if (node.spans.isEmpty) {
return const CodeBlockBasic(text: '');
} else if (node.spans.length == 1 && node.spans.first.className == ''){
return CodeBlockBasic(text: node.spans.first.text);
} else {
return CodeBlockHighlighted(node: node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we unify the different cases here, and have just one CodeBlock widget that behaves like this CodeBlockHighlighted?

Or perhaps with a _CodeBlockContainer widget if that helps with the code organization — that one is fine, but what I really mean here is that ideally we'd have one CodeBlock instead of _buildCodeBlock, CodeBlockBasic, and CodeBlockHighlighted.

I know this resembles _buildBlockInlineContainer and its helpers. Those set a bit of a bad example, I think — I feel like they ended up somewhat messy, but I was paranoid about performance there. Specifically, every paragraph (and some other things too) in every message is going to go through that code path, and most of them don't contain any links (which are the thing that makes _BlockInlineContainer complicated and stateful), and I was worried about making every paragraph into a StatefulWidget when most of them don't need any state.

Here, code blocks aren't nearly so ubiquitous as paragraphs are, so I don't feel such a need to be paranoid; and in any case it's not clear that what CodeBlockHighlighted is doing (for a given number of spans) will be any more expensive than what CodeBlockBasic is doing. So we can cheerfully keep things simple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, unified them into single CodeBlock widget

class CodeBlockNode extends BlockContentNode {
// TODO(#191) represent the code-highlighting style spans in CodeBlockNode
const CodeBlockNode({super.debugHtmlNode, required this.text});
class CodeSpanNode extends BlockContentNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class CodeSpanNode extends BlockContentNode {
class CodeBlockSpan extends InlineContentNode {

(Or perhaps CodeBlockSpanNode.)

  • Just "code span" sounds like a description of what InlineCodeNode is. This node is specifically for a span found inside a code block.
  • This is a type of inline content node, not a block content node. See the dartdoc comments on InlineContentNode and BlockContentNode.

(Or maybe the base class should even be just ContentNode. While it's true that this is an inline content node, it's not a node we expect to find in any of the containers that have a generic list of InlineContentNodes, i.e. any of the subclasses of BlockInlineContainerNode or InlineContainerNode. So inheriting directly from ContentNode would express that for the benefit of code consuming those container nodes.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also let's put this after CodeBlockNode, rather than before, to match the organization of the other classes in this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, renamed CodeSpanNode to CodeBlockSpanNode

@rajveermalviya
Copy link
Member Author

Thanks for the review @gnprice, I have rebased the branch with suggested changes.

@rajveermalviya
Copy link
Member Author

rajveermalviya commented Jul 26, 2023

Some new things to note:

  • We don't have a style for all of the token types from pygments library, and zulip also doesn't style all of them. So how we currently handle it is: parser parses all of the token types possible, in ui-view/page/widget/drawing (idk what really to call this phase) we pattern match to style only some of the tokens (this list is same as web).
  • Also one weird thing to note, the .hll class is not in pygments' list of token types and moreover zulip's backend doesn't seem to generate html using it either. For now, I haven't included it.

@gnprice
Copy link
Member

gnprice commented Jul 26, 2023

Huh yeah, odd.

Searching the pygments repo, the hll class does appear. In particular it's emitted here:
https://github.com/pygments/pygments/blob/d0acfff1121f9ee3696b01a9077ebe9990216634/pygments/formatters/html.py#L915-L933
and elsewhere there's code to emit a CSS rule applying to it. It looks like the feature it applies to is "highlighted lines".

And then grepping the zulip repo for "hl_lines", which is the pygments option that activates that feature, it does look like we offer it. I think it can be exercised with a test message like… aha, yeah, like this:

```
::markdown hl_lines="2 4"
# he
## llo
### world
```

which will give a yellow highlight to the lines # he and ### world.

Kind of odd there with that :: syntax. I'm not sure we really intend to support that. But it's there, and it means there are existing messages that use it.

@gnprice
Copy link
Member

gnprice commented Jul 26, 2023

  • We don't have a style for all of the token types from pygments library, and zulip also doesn't style all of them. So how we currently handle it is: parser parses all of the token types possible, in ui-view/page/widget/drawing (idk what really to call this phase) we pattern match to style only some of the tokens (this list is same as web).

Sounds good.

I'd call that phase "the widgets layer", or "at build time".

(Sometimes I also say "rendering", vs. "parsing" for what's in lib/model/content.dart. But probably I should avoid that term, as it has its own meaning in a Flutter context and so is potentially confusing.)

@rajveermalviya
Copy link
Member Author

Searching the pygments repo, the hll class does appear. In particular it's emitted here:

Thanks for finding it @gnprice

I'm not sure we really intend to support that. But it's there, and it means there are existing messages that use it.

Added the token as highlightedLines and the corresponding styles for it.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revisions! Sorry for the bit of delay in replying.

Generally this looks great. The one substantial other piece we'll want before we can merge this is some tests, as mentioned at #242 (review) . If you'd like to discuss ideas for how to write good tests for this functionality, please mention it in the #mobile-team stream on chat.zulip.org and I'll be glad to talk about that.

I've just now gone and given this a full code review, and have a variety of smaller comments below.

Comment on lines 280 to 285
/// 'w'
whitespace,
/// 'esc'
escape,
/// 'err'
error,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move these long boring boilerplatey sections — all the places that contain a list of Pygments token types — into their own separate file. Or I guess their own two files: lib/model/code_block.dart and lib/widgets/code_block.dart.

I think that separation — boilerplate in a boilerplate file, and non-boilerplate code in a non-boilerplate file — helps keep things easy to read, because one uses very different reading styles for reading those two kinds of code.

The CodeBlockNode and CodeBlockSpanNode declarations can stay in this file, though.

Perhaps in the future if the number of files that exist to support the two content.dart files grows, we might organize them further, like lib/model/content/code_block.dart and so on. But let's leave aside that extra nesting for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just to be clear in case "boring" or "boilerplatey" sound like they're bad things — in some contexts they could be, but boring and boilerplatey are exactly the right approach for some things, and this is an excellent example. I'm quite happy to see this boring boilerplate for translating these CSS classes to enum values and then to text styles.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

//
// Note: If you update this list make sure to update the permalink
// and the `tryFromString` function below.
enum CodeBlockSpanToken {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A value of this enum isn't really a token — it identifies a type of token. So it'd be better to make the last word of this name be "type" rather than "token".

Conversely, I think we need only one of the words "span" or "token" in this name. Each of these spans corresponds to a token Pyjamas found… hmm or I guess perhaps a run of consecutive tokens of the same type. Given that latter point, plus that we sometimes have a CodeBlockSpanNode which doesn't correspond to anything from Pyjamas — what we parse from a code block with no span element, which is what the user gets if they enter a plain code block with no language — I think "span" is the right version of the concept to mention in this name and we can skip "token". So:

Suggested change
enum CodeBlockSpanToken {
enum CodeBlockSpanType {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 262 to 265
@override
List<DiagnosticsNode> debugDescribeChildren() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
@override
List<DiagnosticsNode> debugDescribeChildren() {
@override
List<DiagnosticsNode> debugDescribeChildren() {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 264 to 266
return spans
.map((node) => node.toDiagnosticsNode())
.toList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
return spans
.map((node) => node.toDiagnosticsNode())
.toList();
return spans.map((node) => node.toDiagnosticsNode()).toList();

That makes a bit more compact so it takes up less vertical space, letting the reader see more of the node classes above and below it on their screen at once. It also makes it more visibly parallel to the similar debugDescribeChildren implementations in neighboring classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 290 to 291
/// 'kc'
keywordConstant,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate having docs on these. Let's go a bit further and bring them up to our standard docs style, as complete sentences:

Suggested change
/// 'kc'
keywordConstant,
/// A code-block span with CSS class `kc`.
keywordConstant,

(Similarly for all the others. Given how successfully you've produced all the other boilerplate sections of this PR, I imagine you already know how to use find-and-replace or multiple-cursors or similar tools to efficiently make this change to all the enum values at once. For anything in that vein where you do have questions, though, I'd be happy to discuss in #mobile-team on chat.zulip.org.)

For docs style in general, we largely follow the same style as Flutter itself, so Flutter's style guide is a good reference:
https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo#use-correct-grammar
(We don't do the really high-effort parts of Flutter's docs style, though — sample code and images — because that level of effort on API docs makes sense only when lots of developers are going to consume it.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack


Widget _buildText() {
return switch (node.spans) {
[] => Text('', style: _kCodeBlockStyle),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to the other comment: can we just delete this case and let the default case handle it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

decoration: BoxDecoration(
color: Colors.white,
border: Border.all(
width: 1,
color: const HSLColor.fromAHSL(0.15, 0, 0, 0).toColor())),
color: const HSLColor.fromAHSL(0.15, 0, 0, 0).toColor()),
borderRadius: BorderRadius.circular(4)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good change, thanks. (I think this border radius must be one of the changes that happened in the recent web redesign; see #157.)

Let's put it in a separate commit, though, so it can get its own commit message. See the Zulip style guide for commits in a branch:
https://zulip.readthedocs.io/en/latest/contributing/commit-discipline.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, moved to a separate commit

Comment on lines +399 to +269
child: Padding(
padding: const EdgeInsets.fromLTRB(7, 5, 7, 3),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for moving this padding here instead of the Container? On a quick experiment it seems like I get the same output with or without this change.

(Probably it should also be its own commit, or else in the same commit with the borderRadius if the reason for it is related to that change.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a comparison:
(note the padding on ScrollView adds the spacing on all sides between the actual scrollable view and the code block borders)

Before After
before after

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. Definitely looks like a good change.

Here's a version for the commit message:

content: Move CodeBlock padding to inside scroll view

This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  https://github.com/zulip/zulip-flutter/pull/242#discussion_r1285096855

InlineSpan _buildNode(CodeBlockSpanNode node) {
final style = switch (node.tokenType) {
CodeBlockSpanToken.text => null, // styles not applied for Text token
CodeBlockSpanToken.highlightedLines => _kCodeBlockStyleHll,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so, here's how this looks for my highlighted-lines test message:
image

But here's how it looks on web:
image

Note that on web, the text colors and bold are still there — there's just also the background color for the line highlight.

It looks like in order to handle this feature, we'll need some more complication in the parser and the data structure. Here's how the HTML comes out (see test/model/content_test.dart for tips on conveniently getting the server's HTML):

<span class="hll"><span class="gh"># he</span>
</span><span class="gu">## llo</span>

Note the nesting, with a span.hll around another span.

Given that it'd require making our implementation more complicated, and that it's a feature Zulip doesn't support very nicely in the first place (#242 (comment)) and it's not clear we actually intend to support, I'd be happy to just not support it, at least for the present. In that case we can have the parser just produce UnimplementedBlockContentNode when it encounters one of these.

Let's do include a test case involving this feature, though. That'll help validate that we don't do anything wild when we run into it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the implementation to generically handle one level of nested span nodes in parsing the code block spans. For now, this is sufficient for span.hll nodes.

Also added a unit test to handle "highlighted lines" tokens.

Comment on lines 266 to 269
// .o { color: hsl(332deg 70% 38%); }
final _kCodeBlockStyleO = TextStyle(color: const HSLColor.fromAHSL(1, 332, 0.7, 0.38).toColor());
// .cm { color: hsl(180deg 33% 37%); font-style: italic; }
final _kCodeBlockStyleCm = TextStyle(color: const HSLColor.fromAHSL(1, 180, 0.33, 0.37).toColor(), fontStyle: FontStyle.italic);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(BTW this way of formatting these is great — I appreciate that it's so compact. No need to turn the comments into doc comments or whatever, which would inevitably make them less compact.)

As I write that, though, I realize there's one tweak which would I think improve the formatting further: add a blank line to separate each definition from the next one's comment. That just makes the grouping more unambiguous so it's quicker for the eye to go from a comment to the corresponding code and back.

It does make things a bit less compact, but so it goes.

That's also an advantage of moving the boilerplate to its own file — if it stayed here in this file, then making this section 50% longer would be a real cost of its own to the readability of the rest of the file, making it more of a trade-off.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@rajveermalviya rajveermalviya force-pushed the syntax-highlighting branch 2 times, most recently from d81888d to 9eaa9d7 Compare August 5, 2023 17:37
@rajveermalviya
Copy link
Member Author

Sorry for delayed revision @gnprice, I have updated & rebased with the suggested changes.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! Generally these changes look good, except I'd like to do without the complexity brought by supporting hll and nested spans. Details below.

if (type == CodeBlockSpanType.unknown) {
return (null, false);
}
return (CodeBlockSpanNode(text: text,type: type), true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
return (CodeBlockSpanNode(text: text,type: type), true);
return (CodeBlockSpanNode(text: text, type: type), true);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

// Further implementation is based on an assumption that there will
// be only a single class associated with the span element.
&& classes.length == 1:
final type = CodeBlockSpanTypeExt.fromClassName(className);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, awkward that this ends up needing to mention the name we come up with for the extension.

Seems like we can't put a static method on an enum, which is too bad.

Probably cleanest to just make it a top-level function of its own, like codeBlockSpanTypeFromClassName. Long name, but not longer than CodeBlockSpanTypeExt.fromClassName :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -161,4 +161,33 @@ void main() {
check(tester.takeException()).isA<AssertionError>();
});
});

group("CodeBlock", () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's squash the new tests into the same commit that introduces the code they're testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, squashed

Comment on lines +399 to +269
child: Padding(
padding: const EdgeInsets.fromLTRB(7, 5, 7, 3),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. Definitely looks like a good change.

Here's a version for the commit message:

content: Move CodeBlock padding to inside scroll view

This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  https://github.com/zulip/zulip-flutter/pull/242#discussion_r1285096855

Comment on lines 676 to 678
if (child is dom.Element && child.nodes.isNotEmpty) {
if (child.nodes.length == 1 && child.nodes.first is dom.Text) {
// Do nothing, fallthrough to the switch statement below.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
if (child is dom.Element && child.nodes.isNotEmpty) {
if (child.nodes.length == 1 && child.nodes.first is dom.Text) {
// Do nothing, fallthrough to the switch statement below.
if (child is dom.Element && child.nodes.isNotEmpty) {
if (child.nodes.length == 1 && child.nodes.first is dom.Text) {
// Do nothing, fallthrough to the switch statement below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -161,4 +161,33 @@ void main() {
check(tester.takeException()).isA<AssertionError>();
});
});

group("CodeBlock", () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this above the two existing groups

That way we keep the tests in the same order as the code they're testing, at least at a coarse level of granularity: CodeBlock/CodeBlockNode, then LinkNode which is handled by InlineContent, then RealmContentNetworkImage.

(That isn't always possible because a test doesn't always correspond neatly to a particular area of code. But in this case they do nicely correspond, so we can.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

expect(find.text('class A {}'), findsOneWidget);
});

testWidgets('without syntax highlighting', (WidgetTester tester) async {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: put "without" case before "with", because it's simpler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Comment on lines 178 to 179
'\n</code></pre></div>'
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: when an item in a comma-separated list (not only List, but any syntactic list of things) is followed by a newline, always use a comma:

Suggested change
'\n</code></pre></div>'
);
'\n</code></pre></div>',
);

One has to do that anyway if it's any list item other than the last; so this trailing comma makes the syntax more symmetric, which is helpful for editing and for keeping diffs less noisy.

The other option here is to follow it directly with the close-paren instead of a newline:

        '\n</code></pre></div>');

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

'\n</code></pre></div>'
);

expect(find.text('class A {}'), findsOneWidget);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're writing our tests with package:checks and its check function, rather than expect. See our README:
https://github.com/zulip/zulip-flutter/blob/3cba0582fc68d875872b0b0394d75eedef109bbb/README.md#writing-tests

For this particular use case, we don't have a great idiom yet; that's #232. You could use the idiom mentioned there, or this would also be OK:

Suggested change
expect(find.text('class A {}'), findsOneWidget);
expect(find.text('class A {}'), findsOneWidget); // TODO(#232): use package:checks

i.e., leave a comment that marks clearly that this is something we want to fix (which also helps avoid people taking it as an example for other situations) and that makes it easy to find this when we fix #232 and make a sweep of all the affected spots.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, now uses tester.widget()

Comment on lines 167 to 169
await tester.pumpWidget(MaterialApp(home: BlockContentList(nodes: parseContent(html).nodes)));
await tester.pump();
await tester.pump();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These latter pumps aren't needed:

Suggested change
await tester.pumpWidget(MaterialApp(home: BlockContentList(nodes: parseContent(html).nodes)));
await tester.pump();
await tester.pump();
await tester.pumpWidget(MaterialApp(home: BlockContentList(nodes: parseContent(html).nodes)));

In the other prepareContent, they're needed because GlobalStoreWidget and PerAccountStoreWidget each involve an async step. (To read from the database and fetch from the server, respectively. Actually we go to some trouble to make PerAccountStoreWidget after the data for a given account is first loaded — see _PerAccountStoreWidgetState.didChangeDependencies and the test "PerAccountStoreWidget immediate data after first loaded" in test/widgets/store_test.dart — but on first load, both those async steps are pretty fundamentally needed.)

But here, there's no async load involved, so it all builds on the first frame. You can verify that by just deleting these and seeing that the tests still pass, thanks to those find.text expectations which ensure the tests wouldn't spuriously pass if the widgets we want to test weren't yet built.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 10, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 10, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@rajveermalviya
Copy link
Member Author

Thanks for the review @gnprice, I have rebased and updated the changes based on the suggestions.

@rajveermalviya rajveermalviya requested a review from gnprice August 10, 2023 10:16
rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 10, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@rajveermalviya
Copy link
Member Author

(Just rebased to main)

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! This is getting close; various small comments below.

when localName == 'span'
// We assume that server will always provide us with a span element
// that has at most one class associated with the span element.
&& classes.length <= 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be == 1 instead? That is, is there some known case where we get a span at this spot that has no classes?

If so, that'd nicely tighten the logic further.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimming through pygments html formatter: It doesn't seem to emit a span element if css_class is empty. So, it's safe to say this will never happen; Removing that case.

Comment on lines 677 to 678
final CodeBlockSpanNode span;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
final CodeBlockSpanNode span;
final CodeBlockSpanNode span;

That way this variable is declared immediately before the switch that initializes it. It's naturally closely tied to that, so it's helpful to have it visually adjacent.

Comment on lines 694 to 699
final CodeBlockSpanType type;
if (classes.isEmpty) {
type = CodeBlockSpanType.text;
} else {
type = codeBlockSpanTypeFromClassName(classes.first);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this one is probably simpler in ?: form

(though it'll go away if we can tighten the logic above to require classes.length == 1)

Comment on lines 691 to 692
// We assume that server will always provide us with a span element
// that has at most one class associated with the span element.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can cut the comment — the code just around it expresses the same content

(These lines of code don't exactly directly say the "we assume" angle of it. But that's expressed by the fact that if these conditions fail, we return an UnimplementedNode.)

Comment on lines 713 to 708
}
if (span.text.isNotEmpty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
}
if (span.text.isNotEmpty) {
}
if (span.text.isNotEmpty) {

The switch above is a big multi-stanza piece of code, and this next bit is separate from that, so I think it's helpful to separate it as a new stanza.

Comment on lines 336 to 337
testParse('parse code blocks, unknown span type',
// A code block span with unknown className.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
testParse('parse code blocks, unknown span type',
// A code block span with unknown className.
testParse('parse code blocks, unknown span type',
// (no markdown; this test is for future Pygments versions adding new token types)

In particular that makes it explicit that this test intentionally doesn't have Markdown (that that's not just a mistake), and explains why.

Comment on lines 339 to 344
'<div class="codehilite"><pre>'
'<code>::markdown hl_lines="2 4"'
'<span class="hll"><span class="gh"># he</span>'
'</span><span class="gu">## llo</span>'
'<span class="hll"><span class="gu">### world</span>'
'</span></code></pre></div>', const [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Yeah, the browser inspector is very helpful for things like understanding what the web app's CSS does with the HTML, but it doesn't show quite the same HTML as we get from the server.

In some cases that's because the web app's JS code makes changes to the HTML:
https://chat.zulip.org/#narrow/stream/378-api-design/topic/Video.20markdown.20support.20.2320449/near/1620082
so those can make things arbitrarily different. And then I think some of these other differences may be the browser's doing.

]);

testParse('parse code blocks, with syntax highlighting and highlighted lines',
// "```\n::markdown hl_lines="2 4"\n# he\n## llo\n### world\n```"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not valid syntax :-) (continuing from #242 (comment) )

Try uncommenting the line, and you'll see parse errors. That's because there are double-quotes " inside the intended string, which interfere with the double-quotes used to enclose it.

I think the simplest fix is to enclose with single-quotes ' instead, as in my previous suggestion. Triple quotes would also work:

Suggested change
// "```\n::markdown hl_lines="2 4"\n# he\n## llo\n### world\n```"
// """```\n::markdown hl_lines="2 4"\n# he\n## llo\n### world\n```"""

In principle another solution would be to escape the internal quotes, like \". But I think that's less preferable because it makes it a bit harder to read — it takes a moment to work out that the backslashes aren't ultimately there in the string being represented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah of course, sorry I should've checked.

Comment on lines 682 to 683
// Parse one level of nested code block span nodes, for example for
// `span.hll` node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

}

InlineSpan _buildNode(CodeBlockSpanNode node) {
return TextSpan(text: node.text, style: codeBlockTextStyle(node.type));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also squash the commit that adds the styles:
fb44680 content: Add styles for each token type

into the commit that adds the parsing:
739faee content: Add CodeBlockSpanNode to parse span types

The parsing is only useful because of the styles, so I think it makes more sense to read that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

squashed

rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 12, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@rajveermalviya
Copy link
Member Author

Thanks for the review @gnprice, updated and rebased!

rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 12, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@rajveermalviya rajveermalviya requested a review from gnprice August 12, 2023 06:08
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just one ongoing thread now, and a couple of small other comments.

Comment on lines 714 to 711
if (span.text.isNotEmpty) {
// We don't want to emit unnecessary CodeBlockSpanNode with empty text
// which results in a TextSpan with empty text which results in
// nothing displayed in the end anyway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see. Yeah, it's that trailing newline — so it's not that we get an empty node in the HTML, but there's that one case where we make the text empty.

In the existing code in main, that just results in calling buffer.write with an empty-string argument. Which is fine; it basically does nothing, and it's unlikely that adding our own conditional there would be any kind of optimization.

I agree it's good to have an optimization like this now. To really have its full effect, it should avoid not just adding the CodeBlockSpanNode to the list, but allocating the CodeBlockSpanNode in the first place when we're not going to need it.

In previous revisions this if check was inside the dom.Text() case but I moved it out when adding the final CodeBlockSpanNode span; because dart doesn't like that we don't assign it in the else case.

The analyzer will be completely happy with not assigning span as long as that control-flow path can't ever end up trying to read the uninitialized span. So for example one can use continue, like this:

          if (text.isEmpty) {
            continue;
          }
          span = CodeBlockSpanNode(text: text, type: CodeBlockSpanType.text);

I think it's fine to not have anything for the dom.Element case. We're not altering the text of those, so we wouldn't get an empty span there unless the server actually emits one in the HTML.

Comment on lines 714 to 711
if (span.text.isNotEmpty) {
// We don't want to emit unnecessary CodeBlockSpanNode with empty text
// which results in a TextSpan with empty text which results in
// nothing displayed in the end anyway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also curious now to see the behavior on newlines that we don't strip out — that is, newlines on other than the last line. Let's add a small test case where the code in the code block is a couple of lines long, and has syntax highlighting.

Comment on lines 689 to 694
case dom.Element(:final text, :final classes, :final localName)
when localName == 'span'
&& classes.length == 1:
final CodeBlockSpanType type = codeBlockSpanTypeFromClassName(classes.first);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case dom.Element(:final text, :final classes, :final localName)
when localName == 'span'
&& classes.length == 1:
final CodeBlockSpanType type = codeBlockSpanTypeFromClassName(classes.first);
case dom.Element(localName: 'span', :final text, :final classes)
when classes.length == 1:
final CodeBlockSpanType type = codeBlockSpanTypeFromClassName(classes.first);

The interesting part of this is moving the localName check inside the pattern.

Also it gets confusing for the when to be at the same indentation level as the body — particularly once the when itself is on the line right before the body — so we can indent it a second level. (Similar to what we do with multiline if conditions, like in parseInlineContent above.)

'gt' => CodeBlockSpanType.genericTraceback,
_ => CodeBlockSpanType.unknown,
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: last line is missing its newline

(git diff or git log -p will highlight this for you:

+  };
+}
\ No newline at end of file

)

Comment on lines 694 to 698
case CodeBlockSpanType.unknown:
return UnimplementedBlockContentNode(htmlNode: divElement);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unimplemented case is an unusual one in that (a) we totally expect it to come up from time to time, as Pygments adds new token types and the server takes Pygments upgrades; (b) there's a natural sensible way to render it for the user when we do.

So let's add a comment to remind ourselves to handle this case specifically, as part of #194:

Suggested change
case CodeBlockSpanType.unknown:
return UnimplementedBlockContentNode(htmlNode: divElement);
case CodeBlockSpanType.unknown:
// TODO(#194): Show these as un-syntax-highlighted code, in production.
return UnimplementedBlockContentNode(htmlNode: divElement);

rajveermalviya added a commit to rajveermalviya/zulip-flutter that referenced this pull request Aug 15, 2023
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@rajveermalviya
Copy link
Member Author

Thanks for the review @gnprice, rebased with the suggested changes.

@rajveermalviya rajveermalviya requested a review from gnprice August 15, 2023 23:59
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rajveermalviya for all your work on this!

This all looks great, with one small nit. I'm going to fix that and merge.

]),
]);

testParse('parse code blocks, multiline, with sytax highlighting',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
testParse('parse code blocks, multiline, with sytax highlighting',
testParse('parse code blocks, multiline, with syntax highlighting',

- Implement parsing of all the pygments token types that can be emitted
  by pygments as span classes.

- Add styles adapted from web frontend's css for the code block spans.

Fixes: zulip#191
This way, when the text is wide and needs to scroll, the padding
scrolls with the text, rather than reducing the size of the viewport.
See screenshots:
  zulip#242 (comment)
@gnprice gnprice force-pushed the syntax-highlighting branch from 265d288 to 5e6a0ec Compare August 17, 2023 03:40
@gnprice gnprice merged commit 5e6a0ec into zulip:main Aug 17, 2023
@rajveermalviya rajveermalviya deleted the syntax-highlighting branch August 17, 2023 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Show highlighting in code blocks
2 participants