Count bytes in wc -c example
#27
Bobronium commented Jul 20, 2022
```shell
$ echo 🐍 | wc -c
5
$ echo 🐍 | pyp 'len(stdin.read())'
2
$ echo 🐍 | pyp 'len(stdin.read().encode())'
5
```
hauntsaninja left a comment
Thanks for spotting this!
Technically, I think stdin.buffer.read() would be a little more accurate, since IIRC stdin isn't guaranteed to be UTF-8.
Also, would you mind making the corresponding change here as well?
Line 90 in 3f16ad5
example_cmd="wc -c | tr -d ' '", pyp_cmd="pyp 'len(stdin.read())'", input=example
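To illustrate the point about `stdin.buffer.read()`, here is a minimal Python sketch of why counting raw bytes is more robust than re-encoding decoded text. The helper name `char_and_byte_counts` and the UTF-16 input are hypothetical and chosen only for illustration; they are not part of pyp or its tests.

```python
def char_and_byte_counts(raw: bytes, encoding: str = "utf-8"):
    """Return (characters, re-encoded UTF-8 bytes, raw bytes) for some input.

    `raw` stands in for what stdin.buffer.read() would return, and
    `encoding` stands in for whatever encoding stdin happens to use.
    """
    text = raw.decode(encoding)      # roughly what stdin.read() yields
    reencoded = text.encode()        # str.encode() defaults to UTF-8
    return len(text), len(reencoded), len(raw)


# UTF-8 input: re-encoding matches the raw byte count, like `wc -c`.
print(char_and_byte_counts("🐍\n".encode("utf-8")))                   # (2, 5, 5)

# Hypothetical non-UTF-8 input: re-encoding no longer matches the raw bytes.
print(char_and_byte_counts("🐍\n".encode("utf-16-le"), "utf-16-le"))  # (2, 5, 6)
```

When the input really is UTF-8, `len(stdin.read().encode())` and `len(stdin.buffer.read())` agree; they only diverge when stdin uses another encoding, which is the case the review comment is worried about.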
Oh, I forgot to post the issue that answers why I didn't propose that. Probably should've opened it in the first place :)
Emoji represents non-ascii input.
This should make tests pass, but I still haven't cloned the project to run any linters, etc.
Interesting. Tests are failing because of the difference between sort on macOS and Ubuntu. How should we handle this, @hauntsaninja?

On Ubuntu:

$ echo "1\n🐍\n2\n3" | sort
🐍
1
2
3

On macOS:

$ echo "1\n🐍\n2\n3" | sort
1
2
3
🐍

With pyp (Python sort):

~/dev/contrib/pyp on patch-1 (.venv)
$ echo "1\n🐍\n2\n3" | pyp "sorted(lines)"
1
2
3
🐍
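The pyp result above can be reproduced in plain Python. This is a small sketch showing that `sorted` on strings compares Unicode code points, so the emoji lands after the ASCII digits. The platform difference in `sort` is presumably down to locale-dependent collation, but that is an assumption and not something this thread confirms.

```python
lines = ["1", "🐍", "2", "3"]

# Python compares strings by Unicode code point, independent of locale.
by_code_point = sorted(lines)
print(by_code_point)                         # ['1', '2', '3', '🐍']

# The code points make the ordering explicit: digits come before U+1F40D.
print([hex(ord(s)) for s in by_code_point])  # ['0x31', '0x32', '0x33', '0x1f40d']
```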
Oh, interesting. This kind of thing is why I like pyp ;-) Maybe we just remove the emoji from the tests. It doesn't really test the behaviour of pyp itself. The commands in the README are only approximately "like" the standard shell commands, so we don't need to have tests that enforce that they super faithfully match the standard shell commands.
Ok, I saw your comment way too late :) I woke up with an idea of how we can deal with it, keeping the emoji (an example of UTF-8 input) and shining a light on potential differences between platforms. Basically, I moved arguments from individual calls to