Skip to content

Conversation

pajot
Copy link

@pajot pajot commented Oct 13, 2019

We copy tags from shape.text to shape.name at template instantiation to make them resistant to user mischief :)

lysnikolaou and others added 7 commits October 12, 2019 15:00
Add _get_all_shapes internal method. The test simply checks to see if
the attributes we expect to use the shape for are present and whether
they can be coerced to strings.
Copy tags at template instantiation from shape.text to shape.name
to make them resistant to user clobbering.
pajot and others added 2 commits October 14, 2019 21:06
pajot added 3 commits October 15, 2019 10:55
Use the regex match group for shape.name. This way we also include the curly
braces and ignore non-matching text in shape.text.
return matched_shapes

def _get_all_shapes(self) -> Iterable[BaseShape]:
# Do we need all the shapes? Perhaps we should filter on tags here.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to iterate over each shape at least once and apply the re.match to identify whether there's some work for us to do. It's a decision of style if you want to first get all shapes and then match and work, or if you get all matching slides and then do some work. For a (extremely) big presentation, however, generating a list of all slides may be quite memory expensive and possibly slow. Both issues could be eliminated if you'd just replace the square brackets with regular brackets, thereby changing the list comprehension into a generator expression.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please inline...

Copy link
Author

@pajot pajot Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please inline...

Do you mean the comment? Isn't it a bit long to be inlined? (PEP8 says "use inline comments sparingly ;) )

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlining means to eliminate the function and use the code of the function body instead of the function call. This is reasonable in this particular case because

  • the function is only used once (and unlikely to be needed multiple times)
  • the inline version is almost equally short and equally descriptive.
    instead of
    all_shapes = self._get_all_shapes()
    
    just use
    all_shapes = (shape
                  for slide in self._presentation.slides
                  for shape in slide.shapes)
    

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying my comment, Tim. I now see that I could have been a little more verbose :-)

return all_shapes

def _copy_tags_to_name(self) -> None:
all_shapes = self._get_all_shapes()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please inline

def _copy_tags_to_name(self) -> None:
all_shapes = self._get_all_shapes()
# This regex matches on tags
regex_tag = re.compile(r'^\s*(\{\w+\})\s*$')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be defined as a class constant?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further, why are the curly brackets part of the group? Do we really want to set the shape.name to e.g. '{location}' instead of 'location'?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further, why are the curly brackets part of the group? Do we really want to set the shape.name to e.g. '{location}' instead of 'location'?

Tim and I had discussed this, yes. Leaving the curly braces in makes it clear that this is a tag.

Copy link
Owner

@timhoffm timhoffm Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be defined as a class constant?

Could do, but if so, please private. I don't want to make this user-configurable as of now. Strictly speaking, it would even be sufficient not to explicitly compile the regexp and just use inline if re.match(r'^\s*(\{\w+\})\s*$', shape.text):. re caches the last few (I belive to remember 100?) expressions. Therefore, the overhead for this basic call vs. the compiled regexp is just a dict lookup, which is negligible.

Do we really want to set the shape.name to e.g. '{location}' instead of 'location'?

I tend to say yes. While the curly brace feels a bit bulky here, it serves to highly reduce the likelihood of a name collision between a tag and an existing shape name.

def test_get_all_shapes(template):
shapes = template._get_all_shapes()
for shape in shapes:
assert str(shape.text) and str(shape.name)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So how does this test that I 'get all the shapes'? I'm afraid that you'd have to mock the _presentation to verify this...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply named the test after the function it is testing. But no, it is not actually testing that I have all the shapes. But code to test this is necessarily going to be as complex as the code that performs it.

At the very least, the code checks that what was obtained has at least some properties we would expect to find in slides :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test would be rather complex. At least you'd have to monkey patch the Presentation class (because we don't use dependency injection) and configure the mock (which seems tedious). Further, there are people out there that say "don't test private methods, just test the public interface", i.e., you would not write a test for _get_all_shapes() at all, but only test it indirectly by testing the public method it is used in.

Would probably the simplest to just give the test a more descriptive (cool slang for "longer") name ;-)

regex_tag = re.compile(r'\{[\s\w]*\}')
for shape in all_shapes:
if regex_tag.match(shape.text):
assert shape.text == shape.name
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really what you wan't to test? Taking a look at the regex you defined in the implementation, for a shape having the text ' {location} ', you would set the name as '{location}'. Hence, in this case, shape.text != shape.name.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this test needs to be fixed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants