Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jsdomに戻し、brのみを改行とすることに変更 #149

Merged
merged 1 commit into from
Aug 3, 2024

Conversation

ttizze
Copy link
Owner

@ttizze ttizze commented Aug 3, 2024

Closes #148

Summary by CodeRabbit

### リリースノート

#### Refactor
- `extractNumberedElements`関数が`htmlparser2`から`jsdom`に置き換えられました。これにより、HTMLコンテンツの解析方法が改善され、`data-number`属性を持つ要素の抽出と整理が効率化されました。

Copy link

github-actions bot commented Aug 3, 2024

This PR is linked to issue #148

Copy link

github-actions bot commented Aug 3, 2024

Image description CodeRabbit

- **ウォークスルー**: この変更では、HTMLコンテンツの解析に使用するライブラリが`htmlparser2`から`jsdom`に置き換えられました。また、関数`extractNumberedElements`のシグネチャが簡素化され、`data-number`属性を持つ要素を抽出して番号付きリストを生成するロジックが導入されています。

- **変更点**:
  | ファイル | 要約 |
  | --- | --- |
  | web/app/feature/translate/utils/extractNumberedElements.ts | `htmlparser2`から`jsdom`への置き換え。`title`引数の削除とHTML解析方法の変更。`data-number`属性を持つ要素の抽出と整理。 |

Uplevel your code reviews with CodeRabbit Pro

CodeRabbit Pro

If you like this project, please support us by purchasing the Pro version. The Pro version has advanced context, superior noise reduction and several proprietary improvements compared to the open source version. Moreover, CodeRabbit Pro is free for open source projects.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

Commits Files that changed from the base of the PR and between 1736ee2 and 59266eb commits.
Files selected (1)
  • web/app/feature/translate/utils/extractNumberedElements.ts (1)
Review comments generated (3)
  • Review: 2
  • LGTM: 1

Tips

Chat with Image description CodeRabbit Bot (@coderabbitai)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @coderabbitai in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add @coderabbitai: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines +6 to +11
const doc = new JSDOM(content);
const numberedElements: Array<{ number: number; text: string }> = [];
// <br>のみを改行とする
doc.window.document.body.innerHTML = doc.window.document.body.innerHTML
.replace(/\n/g, "")
.replace(/<br\s*\/?>/gi, "\n");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

doc.window.document.body.innerHTMLの操作は、XSS攻撃に対して脆弱です。ユーザー入力を直接DOMに挿入する前に、適切なサニタイズ処理を行うべきです。

- 	doc.window.document.body.innerHTML = doc.window.document.body.innerHTML
+ 	const sanitizedContent = content.replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '');
+ 	doc.window.document.body.innerHTML = sanitizedContent

if (dataNumber !== null) {
numberedElements.push({
number: Number.parseInt(dataNumber, 10),
text: element.textContent?.trim() || "",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

Number.parseIntの使用は適切ですが、NaNチェックが不足しています。data-number属性が数値でない場合のエラーハンドリングを追加すべきです。

- 				number: Number.parseInt(dataNumber, 10),
+ 				number: Number.isNaN(Number.parseInt(dataNumber, 10)) ? 0 : Number.parseInt(dataNumber, 10),

@ttizze ttizze changed the title f jsdomに戻し、brのみを改行とすることに変更 Aug 3, 2024
@ttizze ttizze merged commit f8f7d6b into main Aug 3, 2024
5 checks passed
@ttizze ttizze deleted the ttizze/fix-issue-148 branch August 3, 2024 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

htmlparser2で自己閉じタグで問題が起きている
1 participant