Merged

baml #13

21 changes: 21 additions & 0 deletions bun.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion package.json
@@ -12,7 +12,7 @@
"docker:bg": "docker build -f docker/Dockerfile.bg -t r8y-bg .",
"prepare": "effect-language-service patch",
"db": "bun run --cwd packages/db src/cli.ts",
-"backfill": "bun run --cwd packages/channel-sync src/scripts/backfill.ts backfill"
+"backfill": "BAML_LOG=warn bun run --cwd packages/channel-sync src/scripts/backfill.ts backfill"
},
"devDependencies": {
"@effect/language-service": "^0.62.0",
2 changes: 2 additions & 0 deletions packages/channel-sync/.gitignore
@@ -10,6 +10,8 @@ dist
coverage
*.lcov

+baml_client/
+
# logs
logs
_.log
5 changes: 4 additions & 1 deletion packages/channel-sync/package.json
@@ -6,7 +6,9 @@
"scripts": {
"check": "tsc --noEmit",
"format": "prettier --write .",
-"lint": "prettier --check ."
+"lint": "prettier --check .",
+"gen:bml": "baml-cli generate --from ./src/baml_src",
+"postinstall": "bun run gen:bml"
},
"devDependencies": {
"@effect/cli": "^0.72.1",
@@ -20,6 +22,7 @@
},
"dependencies": {
"@ai-sdk/groq": "^2.0.33",
+"@boundaryml/baml": "^0.215.0",
"@doist/todoist-api-typescript": "^6.2.1",
"@openrouter/ai-sdk-provider": "^1.5.4",
"ai": "^5.0.115",
71 changes: 6 additions & 65 deletions packages/channel-sync/src/ai/index.ts
@@ -1,10 +1,8 @@
-import { generateObject } from 'ai';
-import z from 'zod';
import { Effect, Schedule } from 'effect';
import { TaggedError } from 'effect/Data';
-import { createOpenRouter } from '@openrouter/ai-sdk-provider';
+import { b } from '../baml_client';

-const retrySchedule = Schedule.intersect(Schedule.spaced('1 minute'), Schedule.recurs(3));
+const retrySchedule = Schedule.intersect(Schedule.spaced('1 minute'), Schedule.recurs(2));
Review comment (Contributor):

**style:** The retry count was reduced from 3 to 2, which could increase the failure rate for transient API issues.
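To put the retry count in context, here is a rough arithmetic sketch of the worst-case number of LLM requests per call. It assumes that `Schedule.recurs(n)` allows up to n retries after the first attempt, and that the BAML `retry_policy TryThreeTimes` (`max_retries 3`, added in `clients.baml` in this PR) allows up to three retries inside each call, so the two layers multiply. The helper `maxRequests` is hypothetical and not part of the PR.

```typescript
// Hypothetical helper, not part of the PR: worst-case number of LLM requests
// for one logical call when two retry layers are stacked.
// Assumption: each layer makes 1 initial attempt plus up to `retries` retries.
function maxRequests(outerRetries: number, innerRetries: number): number {
  const outerAttempts = 1 + outerRetries; // Effect.retry around the BAML call
  const innerAttempts = 1 + innerRetries; // BAML retry_policy inside the call
  return outerAttempts * innerAttempts;
}

// Schedule.recurs(2) around a client with max_retries 3: 3 * 4 = 12
const withTwoOuterRetries = maxRequests(2, 3);
// Schedule.recurs(3) around the same client: 4 * 4 = 16
const withThreeOuterRetries = maxRequests(3, 3);

console.log({ withTwoOuterRetries, withThreeOuterRetries });
```

Under these assumptions, lowering the outer count to 2 still leaves up to 12 requests per call, so the practical effect on transient failures may be smaller than the raw count suggests.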


class AiError extends TaggedError('AiError') {
constructor(message: string, options?: { cause?: unknown }) {
@@ -21,84 +19,27 @@ const aiService = Effect.gen(function* () {
return yield* Effect.die('OPENROUTER_API_KEY is not set');
}

-const openrouter = createOpenRouter({
-apiKey: openrouterApiKey,
-headers: {
-'HTTP-Referer': 'https://r8y.app',
-'X-Title': 'r8y'
-}
-});
-
-const hmm = openrouter('openai/gpt-oss-120b', {
-extraBody: {
-provider: {
-only: ['groq', 'cerebras']
-}
-}
-});
-
return {
classifyComment: (data: { comment: string; videoSponsor: string | null }) =>
Effect.gen(function* () {
-const classificationOutputSchema = z.object({
-isEditingMistake: z.boolean(),
-isSponsorMention: z.boolean(),
-isQuestion: z.boolean(),
-isPositiveComment: z.boolean()
-});
-
const result = yield* Effect.tryPromise({
-try: () =>
-generateObject({
-model: hmm,
-prompt: `Your job is to classify this youtube video's comment. You need to return a boolean true/false for each of the following criteria:
-
-- The comment is flagging an editing mistake
-- The comment is flagging a packaging mistake (typo in title/description/thumbnail, missing link in description, etc.)
-- The comment mentions the video's sponsor (or the channel's sponsors in general)
-- The comment is a question
-- The comment is a positive comment (the general sentiment of the comment is positive, this should be true unless the comment is a direct complaint/critique, if it's neutral it should be true)
-
-The video sponsor is:
-${data.videoSponsor || 'No sponsor'}
-
-The comment is:
-${data.comment}
-`,
-schema: classificationOutputSchema
-}),
+try: () => b.ClassifyComment(data.comment, data.videoSponsor),
catch: (err) => {
return new AiError('Failed to classify comment', { cause: err });
}
}).pipe(Effect.retry(retrySchedule));

-return result.object;
+return result;
}),

getSponsor: (data: { sponsorPrompt: string; videoDescription: string }) =>
Effect.gen(function* () {
-const sponsorOutputSchema = z.object({
-sponsorName: z.string(),
-sponsorKey: z.string()
-});
-
const result = yield* Effect.tryPromise({
-try: () =>
-generateObject({
-model: hmm,
-prompt: `Your job is to parse this youtube video's description to find the sponsor, and a key to identify the sponsor in the db. The following will tell you how to get each of those for this channel:
-
-${data.sponsorPrompt}
-
-The video description is:
-${data.videoDescription}
-`,
-schema: sponsorOutputSchema
-}),
+try: () => b.GetSponsor(data.sponsorPrompt, data.videoDescription.toLocaleLowerCase()),
Review comment (Contributor):

**logic:** `toLocaleLowerCase()` was added to the video description; the original implementation did not lowercase it. This changes the input to the LLM and could affect sponsor detection accuracy, especially for sponsors whose names have specific capitalization in descriptions. Was the lowercasing intended to improve sponsor matching, or is it an unintended change?
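If the lowercasing is meant to keep names and keys consistent in the db, one option is to leave the description untouched and normalize the model's output instead. The sketch below is a guess at that intent, not the PR's approach: `normalizeSponsor` is a hypothetical helper, and the `SponsorInfo` interface simply mirrors the BAML class of the same name from `video-sponsor.baml`.

```typescript
// Mirrors the SponsorInfo class declared in video-sponsor.baml.
interface SponsorInfo {
  sponsorName: string;
  sponsorKey: string;
}

// Hypothetical helper: lowercases the model's output rather than its input,
// so the LLM still sees the original capitalization of the description.
function normalizeSponsor(info: SponsorInfo): SponsorInfo {
  return {
    sponsorName: info.sponsorName.trim().toLowerCase(),
    sponsorKey: info.sponsorKey.trim().toLowerCase()
  };
}

// Example input shaped like a result for the g2i test description.
const normalized = normalizeSponsor({
  sponsorName: 'G2i',
  sponsorKey: 'https://soydev.link/G2i'
});
console.log(normalized.sponsorName, normalized.sponsorKey);
```

In the service this would be applied to the value returned by `b.GetSponsor(...)`. Lowercasing a URL path is only safe if the link host treats paths case-insensitively, which is not confirmed here.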

catch: (err) => new AiError('Failed to get sponsor', { cause: err })
}).pipe(Effect.retry(retrySchedule));

-return result.object;
+return result;
})
};
});
21 changes: 21 additions & 0 deletions packages/channel-sync/src/baml_src/clients.baml
@@ -0,0 +1,21 @@
retry_policy TryThreeTimes {
max_retries 3
}

client<llm> OpenrouterGptOssClient {
provider "openai-generic"
retry_policy TryThreeTimes
options {
base_url "https://openrouter.ai/api/v1"
api_key env.OPENROUTER_API_KEY
model "openai/gpt-oss-120b"
provider {
only ["groq", "cerebras"]
}
headers {
"HTTP-Referer" "https://r8y.app"
"X-Title" "r8y"
}
}
}
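
For readers unfamiliar with BAML client blocks, the config above can be read as a description of an OpenAI-style chat completion request. The sketch below is an illustration under assumptions, not BAML's actual implementation: it assumes the `openai-generic` provider posts to `base_url` plus `/chat/completions`, sends the key as a bearer token, and forwards unrecognized options such as `provider` in the request body (which matches the `extraBody` usage removed from `index.ts`). `buildRequest` and `ChatRequest` are hypothetical names, and the message role is a guess.

```typescript
// Hypothetical shape of the HTTP request described by OpenrouterGptOssClient.
interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    provider: { only: string[] };
    messages: { role: string; content: string }[];
  };
}

// Hypothetical helper: builds the request without sending it.
function buildRequest(apiKey: string, prompt: string): ChatRequest {
  return {
    // Assumption: base_url + the standard chat completions path.
    url: 'https://openrouter.ai/api/v1/chat/completions',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
      'HTTP-Referer': 'https://r8y.app',
      'X-Title': 'r8y'
    },
    body: {
      model: 'openai/gpt-oss-120b',
      // Restricts OpenRouter routing to these upstream providers.
      provider: { only: ['groq', 'cerebras'] },
      // Assumption: the role BAML assigns to a role-less prompt may differ.
      messages: [{ role: 'user', content: prompt }]
    }
  };
}

const request = buildRequest('sk-example', 'The comment is: nice video');
console.log(JSON.stringify(request.body.provider));
```

Nothing here performs a network call; it only makes the mapping from config keys to request fields explicit.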

58 changes: 58 additions & 0 deletions packages/channel-sync/src/baml_src/comment-sentiment.baml
@@ -0,0 +1,58 @@
class CommentSentiment {
isEditingMistake bool
isSponsorMention bool
isQuestion bool
isPositiveComment bool
}

function ClassifyComment(comment: string, videoSponsor: string | null) -> CommentSentiment {
client OpenrouterGptOssClient
prompt #"
Read this comment decide if it mentions an editing mistake (something wrong with the video's audio, video, title, thumbnail, or description), the video's sponsor, and/or is a question.
Review comment (Contributor):

**logic:** The prompt is missing the "packaging mistake" criterion from the original implementation. The original prompt checked for "typo in title/description/thumbnail, missing link in description, etc." alongside editing mistakes. This changes what gets flagged as an editing mistake.

```suggestion
        Read this comment decide if it mentions an editing mistake (something wrong with the video's audio, video, title, thumbnail, or description) or a packaging mistake (typo in title/description/thumbnail, missing link in description, etc.), the video's sponsor, and/or is a question.
```

Also decide if the comment is generally positive or negative.

The video's sponsor is {{ videoSponsor }}

The comment is:
{{ comment }}

{{ ctx.output_format }}
"#
}

test g2i_mention {
functions [ClassifyComment]
args {
comment "Can confirm, used g2i and have made a hire. BUT... they did not go from interview to first pull request in one week. It was 2 pull requests. :)"
videoSponsor "g2i"
}
@@assert({{ this.isSponsorMention == true}})
@@assert({{ this.isQuestion == false}})
@@assert({{ this.isEditingMistake == false}})
// need to optimize the prompt for this...
@@assert({{ this.isPositiveComment == true}})
}

test positive_comment {
functions [ClassifyComment]
args {
comment "AI therapist is so funny 😂 'You passed the test!' lmao"
videoSponsor null
}
}

test question_comment {
functions [ClassifyComment]
args {
comment "what's `renanme`"
videoSponsor "greptile"
}
}

test negative_question_comment {
functions [ClassifyComment]
args {
comment "Why are you paying for API usage? Just use Max for 100 bucks! Aren't you an investor who can afford $ 100 a month?"
videoSponsor "rork"
}
}
18 changes: 18 additions & 0 deletions packages/channel-sync/src/baml_src/generators.baml
@@ -0,0 +1,18 @@
// This helps us auto-generate libraries you can use in the language of
// your choice. You can have multiple generators if you use multiple languages.
// Just ensure that the output_dir is different for each generator.
generator target {
// Valid values: "python/pydantic", "typescript", "ruby/sorbet", "rest/openapi"
output_type "typescript"

// Where the generated code will be saved (relative to baml_src/)
output_dir "../"

// The version of the BAML package you have installed (e.g. same version as your baml-py or @boundaryml/baml).
// The BAML VSCode extension version should also match this version.
version "0.215.0"

// Valid values: "sync", "async"
// This controls what `b.FunctionName()` will be (sync or async).
default_client_mode async
}
99 changes: 99 additions & 0 deletions packages/channel-sync/src/baml_src/video-sponsor.baml
@@ -0,0 +1,99 @@
class SponsorInfo {
sponsorName string
sponsorKey string
}

function GetSponsor(sponsorPrompt: string, videoDescription: string) -> SponsorInfo {
client OpenrouterGptOssClient
prompt #"
Parse this youtube video's description to find the video's sponsor. You are looking for the sponsor's name and a key to identify the sponsor in the db. Both the key and name should be lowercase.

The following will tell you how to get each of those for this channel:
{{ sponsorPrompt }}

The video description is:
{{ videoDescription }}

{{ ctx.output_format }}
"#
}

test no_sponsor {
functions [GetSponsor]
args {
sponsorPrompt "The sponsor key for this channel is `https://soydev.link/${SPONSOR_NAME}`. There are often multiple soydev links in the description. The one for the sponsor will come after something similar to 'Thank you ${SPONSOR_NAME} for sponsoring!'. If it doesn't mention that the sponsor name is a sponsor, then there is no sponsor and you should set the sponsor name to 'no sponsor' and the sponsor key to 'https://t3.gg'"
videoDescription #"
There's a lot wrong with MCP, thankfully Anthropic seems to be finally doing something about it...

no sponsor today, go checkout t3 chat: https://soydev.link/chat

SOURCE
https://www.anthropic.com/engineering...

Want to sponsor a video? Learn more here: https://soydev.link/sponsor-me

Check out my Twitch, Twitter, Discord more at https://t3.gg

S/O Ph4se0n3 for the awesome edit 🙏
"#
}
}

test sevalla_sponsor {
functions [GetSponsor]
args {
sponsorPrompt "The sponsor key for this channel is `https://soydev.link/${SPONSOR_NAME}`. There are often multiple soydev links in the description. The one for the sponsor will come after something similar to 'Thank you ${SPONSOR_NAME} for sponsoring!'. If it doesn't mention that the sponsor name is a sponsor, then there is no sponsor and you should set the sponsor name to 'no sponsor' and the sponsor key to 'https://t3.gg'"
videoDescription #"
There's a lot wrong with MCP, thankfully Anthropic seems to be finally doing something about it...

Thank you Sevalla for sponsoring! Check them out at: https://soydev.link/sevalla

SOURCE
https://www.anthropic.com/engineering...

Want to sponsor a video? Learn more here: https://soydev.link/sponsor-me

Check out my Twitch, Twitter, Discord more at https://t3.gg

S/O Ph4se0n3 for the awesome edit 🙏
"#
}
}

test convex_sponsor {
functions [GetSponsor]
args {
sponsorPrompt "The sponsor key for this channel is `https://soydev.link/${SPONSOR_NAME}`. There are often multiple soydev links in the description. The one for the sponsor will come after something similar to 'Thank you ${SPONSOR_NAME} for sponsoring!'. If it doesn't mention that the sponsor name is a sponsor, then there is no sponsor and you should set the sponsor name to 'no sponsor' and the sponsor key to 'https://t3.gg'"
videoDescription #"
My journey with Arc was...weird. If you followed me here, I'm sorry. I hope this helps.

Thank you Convex for sponsoring! Check them out at https://soydev.link/convex

PLEASE PAY ZEN BROWSER, THEY NEED TO WIN
https://zen-browser.app/donate/

Check out my Twitch, Twitter, Discord more at https://t3.gg

S/O Ph4se0n3 for the awesome edit 🙏
"#
}
}

test g2i_sponsor {
functions [GetSponsor]
args {
sponsorPrompt "The sponsor key for this channel is `https://soydev.link/${SPONSOR_NAME}`. There are often multiple soydev links in the description. The one for the sponsor will come after something similar to 'Thank you ${SPONSOR_NAME} for sponsoring!'. If it doesn't mention that the sponsor name is a sponsor, then there is no sponsor and you should set the sponsor name to 'no sponsor' and the sponsor key to 'https://t3.gg'"
videoDescription #"
Claude Code just got a ton of new features, all of which are really really cool. I just wish they worked better...

Thank you G2i for sponsoring! Check them out at: https://soydev.link/g2i

Want to sponsor a video? Learn more here: https://soydev.link/sponsor-me

Check out my Twitch, Twitter, Discord more at https://t3.gg

S/O Ph4se0n3 for the awesome edit 🙏
"#
}
}
