Support multiple regular expressions and danmaku merging #15

Open. Wants to merge 4 commits into base: main

Conversation

My-Responsitories

close #13, close #14

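  • Supports multiple regular expressions separated by \n. UTF-8 characters are still handled incorrectly, because neither std nor boost regex handles UTF-8 well
  • Supports danmaku merging
  • Minor code cleanup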
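
To illustrate the first item, here is a minimal sketch of how a filter string split on \n could be turned into several patterns. It is not the code from this PR: the helper names `compile_filters` and `is_filtered` are hypothetical, and it assumes `std::regex` with the default ECMAScript grammar and that blank lines are skipped.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: compile one std::regex per non-empty line of `patterns`.
std::vector<std::regex> compile_filters(const std::string& patterns) {
    std::vector<std::regex> filters;
    std::istringstream lines(patterns);
    std::string line;
    while (std::getline(lines, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        // Skip blank lines so an empty pattern never matches everything.
        if (!line.empty()) filters.emplace_back(line);
    }
    return filters;
}

// Hypothetical helper: true if any filter matches somewhere in `text`.
bool is_filtered(const std::string& text, const std::vector<std::regex>& filters) {
    for (const auto& re : filters) {
        if (std::regex_search(text, re)) return true;
    }
    return false;
}
```

Compiling the patterns once and reusing them keeps the per-danmaku cost down to the matching itself.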

@HFrost0
Owner

HFrost0 commented Feb 18, 2024

Thanks for the contribution. If this introduces new boost files, they need to be added accordingly, otherwise CI will fail.

@HFrost0
Owner

HFrost0 commented Feb 18, 2024

If regex is hard to do well inside C++, consider filtering in Python instead; it shouldn't cost much performance.

@My-Responsitories
Author

Thanks for the contribution. If this introduces new boost files, they need to be added accordingly, otherwise CI will fail.

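boost's regex doesn't handle UTF-8 well either, so I didn't add it; this still uses std.
https://github.com/google/re2 does handle UTF-8 characters correctly, but I don't know how to write the CI for it.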
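
One likely source of the UTF-8 trouble, shown as a small sketch rather than a diagnosis of this PR: `std::regex` over `std::string` matches `char` units, so a multi-byte UTF-8 character counts as several "characters" to the engine. The helper name `full_match_bytes` is hypothetical, and the example assumes the text is UTF-8 encoded.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: does `pattern` match the whole of `text`,
// with both treated as sequences of raw bytes by std::regex?
bool full_match_bytes(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with acute accent): two bytes.
// A single "." consumes only one byte, so full_match_bytes("\xC3\xA9", ".")
// is false, while ".." matches the same one-character string.
```

The same effect applies to character classes and quantifiers over non-ASCII text, which is why a UTF-8 aware engine behaves differently here.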

@My-Responsitories
Author

If regex is hard to do well inside C++, consider filtering in Python instead; it shouldn't cost much performance.

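Danmaku merging happens before regex filtering. Filtering in Python would run the regex check repeatedly on danmaku with identical content.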
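
The ordering argument can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `MergedDanmaku`, `merge_danmaku` and `merge_then_filter` are hypothetical names, merging is assumed to mean "group by exactly identical text" (the real merge may also consider timing), and a single `std::regex` stands in for the filter.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record: one unique danmaku text and how often it occurred.
struct MergedDanmaku {
    std::string text;
    std::size_t count;
};

// Group identical texts, preserving first-seen order.
std::vector<MergedDanmaku> merge_danmaku(const std::vector<std::string>& texts) {
    std::vector<MergedDanmaku> merged;
    std::unordered_map<std::string, std::size_t> index;  // text -> position in `merged`
    for (const auto& t : texts) {
        auto it = index.find(t);
        if (it == index.end()) {
            index.emplace(t, merged.size());
            merged.push_back({t, 1});
        } else {
            ++merged[it->second].count;
        }
    }
    return merged;
}

// Merge first, then run the regex once per unique text instead of once per danmaku.
std::vector<MergedDanmaku> merge_then_filter(const std::vector<std::string>& texts,
                                             const std::regex& blocked) {
    std::vector<MergedDanmaku> kept;
    for (auto& m : merge_danmaku(texts)) {
        if (!std::regex_search(m.text, blocked)) kept.push_back(std::move(m));
    }
    return kept;
}
```

With heavily repeated danmaku, the number of regex evaluations drops from the total count to the number of distinct texts.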

@HFrost0
Owner

HFrost0 commented Feb 19, 2024

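I'll take a look at the CI. If no new files were added, there should be no problem. Once you've tested it yourself and found no issues, it can be merged.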

@sherlcok314159

Any progress? If there's no plan to continue, I can open a new PR.

Development

Successfully merging this pull request may close these issues.

FR: remove duplicate danmaku
filter: support multiple regular expressions
3 participants