Protocol Handler: make it possible to set contentType based on content as it arrives #56

Closed
@lidel

Problem

Streaming protocols cannot set contentType, because the current Protocol API (@ 99a15b6) requires contentType to be provided before the response starts streaming:

browser.protocol.registerProtocol("dweb", request => {
  return {
    contentType: "text/html",
    content: asyncResponseStreamFor(request.url)
  }
})

This means we have to omit the contentType parameter and rely on the mime-sniffing built into Firefox itself.

This is far from ideal (SVG breakage is a known problem: ipfs-inactive/faq#224 (comment)) and may also be insecure, because there will be use cases where the same sequence of bytes can render as different things depending on the assumed contentType.

What we want is effectively the ability to support X-Content-Type-Options: nosniff scenarios, where the handler provides its own type and the browser does no sniffing.

Solution

The protocol handler should be able to set an (optional) contentType and contentEncoding based on the content it reads from its internal source, something along the lines of this pseudocode:

browser.protocol.registerProtocol("dweb", request => {
  return {
    // contentEncoding: "utf-8",
    // contentType: "text/html",
    // metadata is not fixed up front; it is resolved together with the stream
    content: asyncResponseStreamWithContentType(request.url)
  }
})

async function asyncResponseStreamWithContentType(url) {
  // responseStream and contentMetadata are placeholders for protocol-specific logic
  const response = await responseStream(url)
  const {contentType, contentEncoding} = await contentMetadata(response)
  return {
    contentType,
    contentEncoding,
    content: response
  }
}

Basically, the handler should be able to buffer a bit of data and extract contentType from it (be it from a static value in a content header, or from its own mime-sniffing heuristics) before passing the response to the browser.
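To illustrate the buffering idea, here is a minimal sketch that does not depend on any particular Protocol API. It assumes the response is available as an async iterable of byte chunks; `withSniffedContentType`, `sniffPng`, and the `text/plain` fallback are hypothetical names and choices made up for this example.

```javascript
// Hypothetical helper: peek at the first chunk of an async iterable,
// derive a content type from it, then replay that chunk ahead of the rest.
async function withSniffedContentType(source, sniff) {
  const iterator = source[Symbol.asyncIterator]();
  const first = await iterator.next();
  const head = first.done ? new Uint8Array(0) : first.value;
  // Fallback type is an assumption; a real handler may prefer something else.
  const contentType = sniff(head) || "text/plain";

  async function* content() {
    if (!first.done) yield head; // replay the buffered chunk first
    while (true) {
      const { done, value } = await iterator.next();
      if (done) return;
      yield value;
    }
  }

  return { contentType, content: content() };
}

// Toy sniffer (assumption): recognises only the PNG signature.
function sniffPng(bytes) {
  const signature = [0x89, 0x50, 0x4e, 0x47];
  return signature.every((b, i) => bytes[i] === b) ? "image/png" : null;
}
```

The consumer sees every byte of the original stream, in order, but only after the content type has been decided, which is the ordering the proposal asks the browser API to allow.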

@Gozala when we discussed this briefly last week, you mentioned it should be possible, but in case more context is needed, below are details from the IPFS side of things.

Additional Notes on IPFS Use Case

In the case of IPFS, we should be able to store the media type within IPLD structures, as one of the "Extended Attributes" in unixfs-v2 (ipld/legacy-unixfs-v2#11), or just mime-sniff it as soon as data starts arriving inside the handler itself (with the ability to account for edge cases such as SVG).

Below is a snippet with a working stream-based mime-type sniffer ported from our Brave PoC:

  const peek = require('buffer-peek-stream')
  // mime-sniff over initial bytes
  const { stream, contentType } = await new Promise((resolve, reject) => {
    peek(ipfs.files.catReadableStream(path), 512, (err, data, stream) => {
      if (err) return reject(err)
      const contentType = mimeSniff(data, path) || 'text/plain'
      resolve({ stream, contentType })
    })
  })
  console.log(`[ipfs-companion] [ipfs://] handler read ${path} and internally mime-sniffed it as ${contentType}`)

  return {stream, contentType}

This is as generic as it gets. I suspect other protocols would also find the ability to set contentType based on the beginning of incoming data very useful.
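For readers wondering what a `mimeSniff(data, path)` style helper might look like, below is a rough sketch. It is not the sniffer used in the PoC: `mimeSniffSketch`, its short list of signatures, and the SVG heuristic are assumptions chosen only to show how a handler could account for the SVG edge case before the browser sees any bytes.

```javascript
// Hypothetical stand-in for a mimeSniff(data, path) helper.
// Returns a media type string, or null when nothing matches.
function mimeSniffSketch(data, path) {
  const bytes = Uint8Array.from(data);
  const startsWith = (sig) => sig.every((b, i) => bytes[i] === b);

  // Binary signatures are checked first.
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return "image/gif";

  // Text heuristics over the peeked bytes.
  const text = new TextDecoder("utf-8").decode(bytes).trimStart().toLowerCase();

  // SVG edge case: an XML prolog followed by <svg>, or a .svg path hint.
  const looksLikeSvg =
    /^(<\?xml[^>]*\?>\s*)?(<!doctype svg[^>]*>\s*)?<svg[\s>]/.test(text);
  if (looksLikeSvg || /\.svg$/i.test(path)) return "image/svg+xml";

  if (text.startsWith("<!doctype html") || text.startsWith("<html")) {
    return "text/html";
  }
  return null; // caller decides the fallback, e.g. 'text/plain'
}
```

Returning `null` instead of a default keeps the fallback decision with the caller, matching the `|| 'text/plain'` pattern in the snippet above.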
