Description
Problem
Streaming protocols are unable to set contentType, because the current Protocol API (@ 99a15b6) requires contentType to be provided before we start streaming the response:
browser.protocol.registerProtocol("dweb", request => {
  return {
    contentType: "text/html",
    content: asyncResponseStreamFor(request.url)
  }
})
This means we need to skip the contentType parameter and rely on the mime-sniffing built into Firefox itself.
This is far from ideal (SVG breakage being a known problem: ipfs-inactive/faq#224 (comment)) and may also be insecure, as there are use cases where a specific array of bytes can render as different things depending on the assumed contentType.
What we want is effectively the ability to support X-Content-Type-Options: nosniff scenarios and provide our own type without sniffing in the browser.
Solution
A protocol handler should be able to set (optional) contentType and contentEncoding based on the content it reads from its internal source, something along the lines of this pseudocode:
browser.protocol.registerProtocol("dweb", request => {
  return {
    // contentEncoding: "utf-8"
    // contentType: "text/html",
    content: asyncResponseStreamWithContentType(request.url)
  }
})
async function asyncResponseStreamWithContentType(url) {
  const response = await responseStream(url)
  // buffer just enough of the response to determine its metadata
  const {contentType, contentEncoding} = await contentMetadata(response)
  return {
    contentType,
    contentEncoding,
    content: response
  }
}
Basically, all a handler should need is the ability to buffer a bit of data and extract contentType from it (be it from a static value in a content header, or its own heuristics for mime-type sniffing) before passing the response to the browser.
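To make the buffering step concrete, here is a minimal sketch that peeks at the first bytes of an async-iterable source and then hands back a stream that replays those bytes before the rest. It assumes the response is an async iterable of Uint8Array chunks; `peekBytes`, `sniffType` and `responseWithContentType` are hypothetical names and not part of the Protocol API, and `sniffType` is only a toy stand-in for real detection logic.

```javascript
// Sketch only: every name below is hypothetical, not an existing API.

// Join Uint8Array chunks into one contiguous Uint8Array.
function concat(chunks) {
  const total = chunks.reduce((n, c) => n + c.length, 0)
  const out = new Uint8Array(total)
  let offset = 0
  for (const c of chunks) {
    out.set(c, offset)
    offset += c.length
  }
  return out
}

// Read chunks until at least `size` bytes are buffered (or the source ends).
// Returns the buffered head plus a stream that replays it before the rest.
async function peekBytes(source, size) {
  const iterator = source[Symbol.asyncIterator]()
  const buffered = []
  let length = 0
  let done = false
  while (length < size && !done) {
    const next = await iterator.next()
    if (next.done) {
      done = true
    } else {
      buffered.push(next.value)
      length += next.value.length
    }
  }
  const head = concat(buffered).subarray(0, size)
  async function* replay() {
    yield* buffered // bytes already consumed while peeking
    while (!done) { // then the untouched remainder of the source
      const next = await iterator.next()
      if (next.done) return
      yield next.value
    }
  }
  return { head, content: replay() }
}

// Toy detector: recognises HTML only, everything else is treated as binary.
function sniffType(head) {
  const text = new TextDecoder().decode(head).trimStart().toLowerCase()
  return text.startsWith('<!doctype html') || text.startsWith('<html')
    ? 'text/html'
    : 'application/octet-stream'
}

// Resolve the content type first, then pass the full stream along.
async function responseWithContentType(source) {
  const { head, content } = await peekBytes(source, 512)
  return { contentType: sniffType(head), content }
}
```

The point of the design is that no bytes are lost: the consumer of `content` sees exactly what the source produced, while `contentType` is known before the first chunk is forwarded.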
@Gozala when we discussed this briefly last week, you mentioned it should be possible, but in case more context is needed, below are details from the IPFS side of things.
Additional Notes on IPFS Use Case
In the case of IPFS we should be able to store the media type within IPLD structures, as one of the "Extended Attributes" in unixfs-v2 (ipld/legacy-unixfs-v2#11), or just mime-sniff it as soon as data starts arriving inside the handler itself (with the ability to account for edge cases such as SVG).
Below is a snippet with a working stream-based mime-type sniffer ported from our Brave PoC:
const peek = require('buffer-peek-stream')
// mime-sniff over initial bytes
const { stream, contentType } = await new Promise((resolve, reject) => {
  peek(ipfs.files.catReadableStream(path), 512, (err, data, stream) => {
    if (err) return reject(err)
    const contentType = mimeSniff(data, path) || 'text/plain'
    resolve({ stream, contentType })
  })
})
console.log(`[ipfs-companion] [ipfs://] handler read ${path} and internally mime-sniffed it as ${contentType}`)
return {stream, contentType}
This is as generic as it gets; I suspect other protocols could also find the ability to set contentType based on the beginning of incoming data very useful.
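The `mimeSniff(data, path)` helper called in the snippet above is not shown in this issue. As a rough illustration of what handler-side sniffing with an SVG edge case could look like, here is a dependency-free sketch. It is a hypothetical stand-in, not the actual helper from the Brave PoC; the magic-number table, the extension map and the SVG pattern are assumptions chosen to keep the example short.

```javascript
// Hypothetical stand-in for `mimeSniff(data, path)`; not the real helper.

// A few binary formats identified by their leading magic bytes.
const MAGIC = [
  { type: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47] },
  { type: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }
]

// Fallback used when the bytes alone are inconclusive.
const BY_EXTENSION = new Map([
  ['svg', 'image/svg+xml'],
  ['html', 'text/html'],
  ['css', 'text/css'],
  ['json', 'application/json']
])

// Returns a media type string, or null when nothing matched,
// so callers can apply their own default (e.g. `|| 'text/plain'`).
function mimeSniff(data, path) {
  // 1. Binary formats with a well-known magic number.
  for (const { type, bytes } of MAGIC) {
    if (bytes.every((b, i) => data[i] === b)) return type
  }
  // 2. Text formats. SVG is checked explicitly because generic sniffing
  //    tends to classify it as XML or plain text.
  const text = new TextDecoder().decode(data).trimStart().toLowerCase()
  if (/^(<\?xml[^>]*>\s*)?(<!doctype svg[^>]*>\s*)?<svg[\s>]/.test(text)) {
    return 'image/svg+xml'
  }
  if (text.startsWith('<!doctype html') || text.startsWith('<html')) {
    return 'text/html'
  }
  // 3. Last resort: the file extension taken from the path.
  const ext = path.includes('.') ? path.split('.').pop().toLowerCase() : ''
  return BY_EXTENSION.get(ext) || null
}
```

Because the handler knows both the bytes and the path, it can combine the two signals, which is something a browser-side sniffer working only on the response body cannot do.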