feat: Audiobook tab with chunked generation, preview, and Story auto-save #259

Closed
jamiepine wants to merge 2 commits into main from audiobook-tab-pr
Conversation

@jamiepine (Owner) commented Mar 13, 2026

Summary

Rebased/merged version of #154 by @omercelik, updated to resolve conflicts with latest main.

  • Audiobook tab with chunked text-to-speech generation
  • Preview and playback of generated chunks
  • Auto-save generated audio to Stories

See #154 for full description and discussion.


Original PR: #154 by @omercelik. Merged with latest main to resolve conflicts.

Summary by CodeRabbit

  • New Features

    • Audiobook Tab: Import or paste text, preview content in a 5-sentence sample, automatically chunk large texts, and generate audio with real-time processing, pause/resume controls, and retry capability.
    • Enhanced scrollbar styling for improved UI consistency.
  • Navigation

    • Added Audiobook tab to sidebar for easy access to long-form audio generation workflow.

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

This PR adds a new Audiobook Tab feature enabling users to import text, chunk it intelligently, generate audio for each chunk, and create associated Story records. Changes include a comprehensive React component, text chunking utility, navigation updates, styling enhancements, and route configuration.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Audiobook Feature**<br>`app/src/components/AudiobookTab/AudiobookTab.tsx` | New 1173-line component implementing the end-to-end audiobook workflow: text intake (file/paste), analytics, chunking, preview generation, audio generation with retry logic, Story creation/linking, and real-time processing controls (pause/resume/stop). |
| **Navigation & Routing**<br>`app/src/components/Sidebar.tsx`, `app/src/router.tsx` | Added Audiobook tab to the sidebar with a BookText icon; introduced the AudiobookTab route at `/audiobook`, integrated it into the route tree, and updated active-tab detection using `useRouterState` for root-path handling. |
| **Text Processing & Styling**<br>`app/src/lib/utils/textChunking.ts`, `app/src/index.css` | Added a sentence-aware text chunking utility with a `TextChunk` interface and core `chunkText()` function supporting safe target and hard max size constraints; introduced a `scrollbar-visible` CSS utility for long-scroll panels. |
| **Documentation**<br>`CHANGELOG.md` | Updated the changelog documenting the Audiobook Tab feature, chunking utility, navigation changes, API updates, and styling additions. |
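The thread only names `chunkText()`'s signature, so here is a hedged, minimal sketch of what a sentence-aware chunker with a soft target and hard max might look like; the `TextChunk` shape and splitting regex below are assumptions for illustration, not the PR's code:

```typescript
// Hypothetical sketch of a sentence-aware chunker matching the
// chunkText(text, targetSize, maxSize) signature described above.
interface TextChunk {
  index: number;
  text: string;
  charCount: number;
}

// Naive splitter: break after ., !, ? followed by whitespace (assumed regex).
function splitIntoSentences(text: string): string[] {
  return (
    text.match(/[^.!?]+[.!?]+(\s+|$)|[^.!?]+$/g)?.map((s) => s.trim()).filter(Boolean) ?? []
  );
}

function chunkText(text: string, targetSize: number, maxSize: number): TextChunk[] {
  const sentences = splitIntoSentences(text.replace(/\r\n?/g, '\n').trim());
  const chunks: TextChunk[] = [];
  let current = '';
  const push = () => {
    if (current) {
      chunks.push({ index: chunks.length, text: current, charCount: current.length });
      current = '';
    }
  };
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length > maxSize && current) {
      push(); // hard cap: never exceed maxSize by appending
      current = sentence;
    } else {
      current = candidate;
      if (current.length >= targetSize) push(); // soft target reached
    }
  }
  push();
  return chunks;
}
```

A sentence never straddles two chunks: a chunk is flushed once it reaches the soft target, and a sentence that would push it past the hard max starts a new chunk instead. Note that in this sketch a single oversized sentence still becomes its own chunk.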

Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant AudiobookTab
    participant textChunking
    participant API as Speech API
    participant StoryAPI as Story Management
    participant UI as UI State

    User->>AudiobookTab: Load text file or paste
    AudiobookTab->>textChunking: chunkText(rawText, targetSize, maxSize)
    textChunking-->>AudiobookTab: TextChunk[] with metadata
    AudiobookTab->>UI: Display chunk summary and preview

    User->>AudiobookTab: Review & start generation
    AudiobookTab->>StoryAPI: Create Story with name
    StoryAPI-->>AudiobookTab: Story ID

    loop For each chunk
        AudiobookTab->>AudiobookTab: Set chunk state (pending)
        AudiobookTab->>API: Generate audio for chunk
        alt Generation success
            API-->>AudiobookTab: Audio result
            AudiobookTab->>StoryAPI: Create StoryItem + link audio
            StoryAPI-->>AudiobookTab: StoryItem created
            AudiobookTab->>UI: Update chunk state (done)
        else Generation fails
            API-->>AudiobookTab: Error
            AudiobookTab->>AudiobookTab: Retry (up to MAX_CHUNK_RETRIES)
            AudiobookTab->>UI: Update chunk state (failed/retry)
        end
    end

    AudiobookTab->>UI: Show completion + export option
    User->>AudiobookTab: Export/download Story
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 From paragraphs I weave with care,
Chunking prose into segments fair,
Each sentence sings with voice so bright,
A tale of code in digital night! 📖✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped; CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main feature added: an Audiobook tab with chunked generation, preview, and Story auto-save. |



@coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
app/src/lib/utils/textChunking.ts (1)

8-24: Export the shared text parsing helpers.

app/src/components/AudiobookTab/AudiobookTab.tsx reimplements the same normalization and sentence regex at Lines 99-123. Keeping two copies will make quick preview and real chunking drift the first time one matcher changes.

♻️ Minimal extraction
```diff
-function normalizeText(text: string): string {
+export function normalizeText(text: string): string {
   return text.replace(/\r\n/g, '\n').replace(/\r/g, '\n').trim();
 }

-function splitParagraphIntoSentences(paragraph: string): string[] {
+export function splitParagraphIntoSentences(paragraph: string): string[] {
```

Then import these helpers into AudiobookTab and delete the local splitTextIntoSentences copy.
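For illustration, the post-extraction usage might look like the following. `normalizeText` is copied from the diff above, while the `splitParagraphIntoSentences` body and the `previewSentences` helper are assumptions (the PR's actual regex is not shown in this thread); the functions are inlined here so the sketch is self-contained:

```typescript
// normalizeText matches the diff above; the splitter body is an assumption.
function normalizeText(text: string): string {
  return text.replace(/\r\n/g, '\n').replace(/\r/g, '\n').trim();
}

function splitParagraphIntoSentences(paragraph: string): string[] {
  return paragraph.match(/[^.!?]+[.!?]*/g)?.map((s) => s.trim()).filter(Boolean) ?? [];
}

// Once exported, AudiobookTab's 5-sentence quick preview could reuse the same
// helpers instead of carrying its own splitTextIntoSentences copy:
function previewSentences(raw: string, count = 5): string[] {
  return normalizeText(raw)
    .split(/\n{2,}/)                        // paragraph breaks
    .flatMap(splitParagraphIntoSentences)   // shared sentence splitter
    .slice(0, count);
}
```

With one splitter feeding both the quick preview and the real chunking, the two paths can no longer drift when the matcher changes.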

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/lib/utils/textChunking.ts` around lines 8 - 24, Export the shared
helpers normalizeText and splitParagraphIntoSentences from textChunking.ts (make
them exported functions) and update AudiobookTab to import and use these helpers
instead of its local splitTextIntoSentences implementation; then remove the
duplicate splitTextIntoSentences from AudiobookTab so normalization and
sentence-splitting logic is centralized in normalizeText and
splitParagraphIntoSentences.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 10a34d44-cde1-4611-bf41-d9be73b36a95

📥 Commits

Reviewing files that changed from the base of the PR and between 3e6513c and 53f640a.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • CHANGELOG.md
  • app/src/components/AudiobookTab/AudiobookTab.tsx
  • app/src/components/Sidebar.tsx
  • app/src/index.css
  • app/src/lib/utils/textChunking.ts
  • app/src/router.tsx

Comment on lines +200 to +203
```tsx
const preparedChunks = useMemo(
  () => chunkText(text, targetChunkSize, HARD_MAX_CHUNK_SIZE),
  [text, targetChunkSize],
);
```

⚠️ Potential issue | 🟠 Major

Avoid full-book re-analysis on every keystroke.

For the 2 MB inputs allowed at Line 46, each edit reruns chunkText, word/line counting, sentence splitting, and preview fingerprinting synchronously. On real book-length text this will make the editor stutter badly. Debounce/defer these derived values, or recalculate them only after typing pauses.

Also applies to: 244-260

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/AudiobookTab/AudiobookTab.tsx` around lines 200 - 203, The
computed heavy derivations (preparedChunks via useMemo calling chunkText, plus
the word/line counting, sentence splitting, and preview fingerprinting
referenced around the same area) run synchronously on every keystroke and must
be deferred; wrap those computations so they only recompute after typing pauses
or off the main thread — e.g., debounce the inputs used
(text/targetChunkSize/HARD_MAX_CHUNK_SIZE) or use React features like
useDeferredValue/useTransition to delay computing preparedChunks and the other
derived values, or move the logic into a web worker; update the code paths that
call chunkText, the sentence splitter, and the preview fingerprinter to read
from the deferred/debounced value instead of the raw text to prevent re-analysis
on each keystroke.

Comment on lines +279 to +311
```tsx
if (stopRequestedRef.current) {
  setRun((prev) => {
    if (!prev) {
      return prev;
    }
    return {
      ...prev,
      status: 'stopped',
      finishedAt: new Date().toISOString(),
    };
  });
  break;
}

if (pauseRequestedRef.current) {
  await wait(250);
  continue;
}

const nextChunkIndex = current.chunks.findIndex((chunk) => chunk.status === 'pending');
if (nextChunkIndex === -1) {
  const hasFailures = current.chunks.some((chunk) => chunk.status === 'failed');
  setRun((prev) => {
    if (!prev) {
      return prev;
    }
    return {
      ...prev,
      status: hasFailures ? 'completed_with_errors' : 'completed',
      finishedAt: new Date().toISOString(),
    };
  });
  break;
```

⚠️ Potential issue | 🟠 Major

Check for completion before applying pause/stop state.

If the user pauses while the final chunk is in flight, the next loop iteration hits the pause/stop branch before nextChunkIndex === -1. That can leave the run stuck in paused even though all chunks are done, and canResume will never reappear because there are no pending chunks left.
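The suggested reordering can be isolated as a pure decision helper, sketched below; the `ChunkStatus` and action shapes are simplified stand-ins for the component's real state, not the PR's code:

```typescript
// Sketch of the completion-first ordering suggested above.
type ChunkStatus = 'pending' | 'done' | 'failed';
type RunAction =
  | { kind: 'finish'; status: 'completed' | 'completed_with_errors' }
  | { kind: 'stop' }
  | { kind: 'pause' }
  | { kind: 'process'; nextChunkIndex: number };

function nextRunAction(
  chunks: Array<{ status: ChunkStatus }>,
  stopRequested: boolean,
  pauseRequested: boolean,
): RunAction {
  // 1) Completion wins: never leave a run paused/stopped when nothing is pending.
  const nextChunkIndex = chunks.findIndex((c) => c.status === 'pending');
  if (nextChunkIndex === -1) {
    const hasFailures = chunks.some((c) => c.status === 'failed');
    return { kind: 'finish', status: hasFailures ? 'completed_with_errors' : 'completed' };
  }
  // 2) Only then honor stop/pause, which now apply only to pending work.
  if (stopRequested) return { kind: 'stop' };
  if (pauseRequested) return { kind: 'pause' };
  return { kind: 'process', nextChunkIndex };
}
```

Because completion is computed before pause/stop are consulted, a pause requested while the final chunk was in flight resolves to `finish` rather than leaving the run parked in a paused state it can never exit.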

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/AudiobookTab/AudiobookTab.tsx` around lines 279 - 311, The
loop currently checks pause/stop before verifying if all chunks are complete,
which can leave a run stuck paused when the final chunk finished; change the
loop ordering in the handler in AudiobookTab (the block using stopRequestedRef,
pauseRequestedRef, wait, current.chunks, nextChunkIndex, and setRun) to
determine completion first: compute nextChunkIndex and hasFailures (using
current.chunks) and if nextChunkIndex === -1 update run.status/finishedAt and
break, then only after that handle pauseRequestedRef and stopRequestedRef (or
only pause when there are pending chunks), so paused state is never applied when
no pending chunks remain.

Comment on lines +377 to +387
```tsx
const generation = await apiClient.generateSpeech({
  profile_id: latest.profileId,
  text: chunk.text,
  language: latest.language,
  model_size: latest.modelSize,
  instruct: latest.instruct || undefined,
});

await apiClient.addStoryItem(latest.storyId, {
  generation_id: generation.id,
});
```

⚠️ Potential issue | 🔴 Critical

Don't retry generation when only Story linking failed.

apiClient.generateSpeech() and apiClient.addStoryItem() are both POST-backed calls, but they sit inside the same retry loop. If /generate succeeds and /stories/:id/items fails, the next attempt creates a second generation and leaves the first one orphaned in History. Persist the first generation.id and retry only the Story attachment, or add an idempotency key around generation creation.
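The fix the comment describes, persisting `generation.id` once and retrying only the Story attachment, can be sketched as follows; the `ApiClient` interface here is a hypothetical stand-in for the app's `apiClient` surface, reduced to the two calls involved:

```typescript
// Sketch: generate once, then retry only the Story link on failure.
interface ApiClient {
  generateSpeech(req: { text: string }): Promise<{ id: string }>;
  addStoryItem(storyId: string, item: { generation_id: string }): Promise<void>;
}

async function generateAndLink(
  api: ApiClient,
  storyId: string,
  text: string,
  maxLinkRetries = 3,
): Promise<string> {
  // The generation id is captured before any retry loop, so a linking
  // failure can never trigger a second (orphaned) generation.
  const generation = await api.generateSpeech({ text });
  let lastError: unknown;
  for (let attempt = 0; attempt < maxLinkRetries; attempt++) {
    try {
      await api.addStoryItem(storyId, { generation_id: generation.id });
      return generation.id;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

The idempotency-key alternative mentioned in the comment would instead make repeated `generateSpeech` calls safe server-side; this client-side split is the smaller change.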

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/AudiobookTab/AudiobookTab.tsx` around lines 377 - 387, The
retry loop currently calls apiClient.generateSpeech(...) and
apiClient.addStoryItem(... ) together so a failed addStoryItem causes
generateSpeech to run again and create duplicate orphaned generations; change
the logic so generateSpeech is called once and its generation.id is persisted
(e.g., store generation in a local variable or state) before any retries, then
only retry apiClient.addStoryItem(latest.storyId, { generation_id: generation.id
}) on failure; alternatively implement an idempotency key when calling
apiClient.generateSpeech (using a stable key derived from chunk or latest) so
repeated calls do not create new generations—update the code around the
generateSpeech/addStoryItem calls to reflect one of these approaches (refer to
generateSpeech, generation.id, addStoryItem, latest.storyId).

@jamiepine jamiepine closed this Mar 13, 2026