rasl.html

<!DOCTYPE html><html lang="en"><head><!--
!!!!! GENERATED SPEC — DO NOT EDIT. Look for the .src.html instead !!!!
-->
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>RASL — Retrieval of Arbitrary Structures &amp; Links</title>
  <link rel="stylesheet" href="spec.css"><link rel="icon" href="data:image/svg+xml,<svg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 100 100%22><rect x=%220%22 y=%220%22 width=%22100%22 height=%22100%22 fill=%22%2300ff75%22></rect></svg>"><meta name="twitter:card" content="summary_large_image"><meta name="twitter:title" property="og:title" content="DASL: RASL — Retrieval of Arbitrary Structures &amp; Links"><meta name="twitter:description" property="og:description" content="RASL is a URL scheme used to identify content-addressed DASL resources         along with a simple HTTP-based retrieval method."><meta name="twitter:image" property="og:image" content="https://dasl.ing/rasl.png"><meta name="twitter:image:alt" content="Very colourful stripes, so colourful it hurts"><meta name="twitter:url" property="og:url" content="https://dasl.ing/"><meta property="og:site_name" content="DASL"><meta property="og:locale" content="en"><meta name="theme-color" content="#00ff75"></head>
  <body><div class="nav-back">A specification of the <a href="/">DASL Project</a>.</div><main><header><h1>RASL — Retrieval of Arbitrary Structures &amp; Links</h1><table><tbody><tr><th>date</th><td>2025-03-17</td></tr><tr><th>editors</th><td><a href="https://berjon.com/">Robin Berjon</a> &lt;<a href="mailto:robin@berjon.com">robin@berjon.com</a>&gt;<br><a href="https://bumblefudge.com/">Juan Caballero</a> &lt;<a href="mailto:bumblefudge@learningproof.xyz">bumblefudge@learningproof.xyz</a>&gt;</td></tr><tr><th>issues</th><td><a href="https://github.com/darobin/dasl.ing/issues">list</a>, <a href="https://github.com/darobin/dasl.ing/issues/new">new</a></td></tr><tr><th>abstract</th><td><div id="abstract">
      <p>
        RASL is a URL scheme used to identify content-addressed DASL resources
        along with a simple HTTP-based retrieval method.
      </p>
    </div></td></tr></tbody></table></header>
    
    <!--
      XXX FOR FUTURE WORK
      - API for publishing & looking up over AT, with storage in repos
      - do we want hosts gossipping to one another or is AT enough?
    -->
    <section>
      <h2>Introduction</h2>
      <p>
        Content-addressed resources are "self-certifying," which is to say that
        you don't need any external authority to certify that the content you
        have when you resolve the identifier is correct: because the identifier
        contains a hash, you can (and should) verify that you obtained the right
        content yourself ([<a href="#ref-ipfs-principles" class="ref">ipfs-principles</a>]). The identifier is enough to
        certify the content. This has several implications, but two are
        particularly relevant for this specification:
      </p>
      <ul>
        <li>
          When resolving a content-addressed identifier, you can obtain the content
          from anyone. It doesn't have to be the content's author. You can even
          obtain it from entirely untrusted sources — given that you can always
          certify it, you don't need to trust whoever gives it to you. As a result,
          the authority part of a URL — the part that can certify the content you
          get, which is the domain part in an <code>https</code> URL — is the
          CID itself ([<a href="#ref-cid" class="ref">cid</a>]).
        </li>
        <li>
          Because it doesn't matter where you get content from, content-addressed
          URLs are inherently transport-independent. There are benefits to agreeing
          on transport (if only so that people can find one another's content)
          but as a client, if you know of several potential ways of obtaining a
          CID you are free to use whichever you prefer or to try several in
          whatever order.
        </li>
      </ul>
      <p>
        Taking these aspects into consideration, this specification defines a URL
        scheme in which the CID is the authority, along with optional hints of
        potential look-up locations, and defines a retrieval method but does not
        mandate that RASL retrieval rely on it.
      </p>
      <section>
        <h2>The <code>web+rasl</code> URL Scheme</h2>
        <p>
          RASL URLs look like this:
          <code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com,bsky.app/</code>.
          This breaks down into the following components:
        </p>
        <ul>
          <li>
            The <code>web+rasl</code> scheme. This uses the <code>web+</code> prefix
            to facilitate registration of the scheme in browser contexts so that
            RASL URLs can be used on the web directly.
          </li>
          <li>
            An authority composed of:
            <ul>
              <li>a DASL CID <code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code> and</li>
              <li>
                (optionally) a comma-separated list of URI-encoded hostnames (any authority parts that are
                acceptable in HTTP) that can be used to attempt retrieval from.
              </li>
            </ul>
          </li>
          <li>
            A path (here just <code>/</code>) that is empty or <code>/</code> for raw
            CIDs like the above but can contain a complete path if the CID resolves
            to MASL data ([<a href="#ref-masl" class="ref">masl</a>]).
          </li>
        </ul>
        <p>
          Use the following steps to <dfn id="dfn-parse-a-rasl-url">parse a RASL URL</dfn>:
        </p>
        <ol>
          <li>Accept a string <var>url</var> and parse it according to the <a href="https://url.spec.whatwg.org/#url-parsing">URL Standard</a> ([<a href="#ref-url" class="ref">url</a>]).</li>
          <li>If that's a failure, return the failure.</li>
          <li>Read the <code>host</code> part of the parsed URL up to either the first <code>;</code> character or to the end of the string. Store that in <var>cid</var>.</li>
          <li>If <var>cid</var> is not a valid CID ([<a href="#ref-cid" class="ref">cid</a>]), return failure.</li>
          <li>
            If there was no <code>;</code> then <var>hints</var> is an empty array. Otherwise:
            <ol>
              <li>Split the remainder of the string on <code>,</code>.</li>
              <li>Apply <code>decodeURIComponent()</code> to each part.</li>
              <li>The results is the <var>hints</var> array.</li>
            </ol>
          </li>
          <li>Return the URL's parts as well as <var>cid</var> and <var>hints</var>.</li>
        </ol>
      </section>
      <section>
        <h2>Fetching RASL</h2>
        <p>
          A user agent may retrieve a CID in whichever way it prefers. This section
          provides a simple standard for HTTP-based CID retrieval, to make it
          easy for authors to publish content to their own sites and have it
          retrieved, without having to worry about operating any infrastructure
          beyond the web server they already have.
        </p>
        <p>
          Use the following steps to <dfn id="dfn-fetch-a-rasl-url">fetch a RASL URL</dfn>:
        </p>
        <ol>
          <li>Accept a string <var>url</var> and parse it according to the steps to <a href="#dfn-parse-a-rasl-url" class="dfn-ref">parse a RASL URL</a>.</li>
          <li>
            Construct a <var>request</var> using <var>cid</var> from the <var>url</var> as well as <var>hints</var> that may
            be from the URL or from elsewhere (this is entirely up to you):
            <ol>
              <li>
                For each hint, construct a request URL that is the concatenation of <code>https://</code>,
                the hint as host, <code>/.well-known/rasl/</code>, and the <var>cid</var>.
              </li>
              <li>
                Prepare the request such that it has a method of either <code>GET</code> or <code>HEAD</code>,
                that it is stateless (no cookies, no credentials of any kind), and that it uses no content
                negotiation.
              </li>
            </ol>
          </li>
          <li>
            Fetch the <var>request</var>s. How these get prioritised is entirely up to the implementation. It
            is common to run them all in parallel and abort them with the first success response.
            Note that the <code>.well-known</code> path may redirect, so be ready to handle
            that. This makes it possible to create sites that are published
            the usual way and to have a RASL that is simply a redirect to the
            resource. So for instance, you may have an existing
            <code>https://berjon.com/kitten.jpg</code> the CID for which is
            <code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
            This can be published as this RASL URL:
            <code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
            A client can retrieve it by constructing the a request to this URL:
            <code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
            In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
            (Yes, this is HTTP with extra steps, but the extra steps get you
            self-certifying content.)
          </li>
          <li>
            If the response is a redirect but not a 307, the client should treat it as if it
            had been a 307 anyway.
          </li>
          <li>
            If none of the responses are successful, return failure.
          </li>
          <li>
            Set the response's media type to <code>application/octet-stream</code>. (The server should have
            done that already, but may not have done so, notably if it relied on a redirect.) The purpose
            of RASL is to retrieve data in ways that are independent of the server — any media type
            processing must therefore take place at another layer. Without this, we lose the self-certifying
            nature of the system. (Note that servers are encouraged to enforce that so as not to have their
            RASL endpoints used for general-purpose web serving, which can be a security vector depending on
            where the data being served came from.)
          </li>
          <li>
            Produce a CID for the retrieved data. If that CID does not match the requested <var>cid</var>,
            return failure.
          </li>
          <li>
            Return the data.
          </li>
        </ol>
      </section>
    </section>
    <section>
      <h2>RASL Pathing</h2>
      <p>
        The <code>pathname</code> component of a RASL URL can only be interpreted in the context of the
        content that is retrieved. It is never transmitted to the RASL server and is purely interpreted
        on the client side.
      </p>
      <p>
        As of this time, the only case in which pathing is defined is if the CID has a dCBOR42 type
        ([<a href="#ref-dcbor42" class="ref">dcbor42</a>]) <em>and</em> the content is MASL ([<a href="#ref-masl" class="ref">masl</a>]) <em>and</em> it has a
        <code>resources</code> map. But carrying that resolution happens outside of RASL fetching.
      </p>
    </section>
  

<section><h2>References</h2><dl><dt id="ref-cid">[cid]</dt><dd>Robin Berjon &amp; Juan Caballero. <a href="https://dasl.ing/cid.html"><cite>Content IDs (CIDs)</cite></a>. 2025-03-17. URL:&nbsp;<a href="https://dasl.ing/cid.html">https://dasl.ing/cid.html</a></dd><dt id="ref-dcbor42">[dcbor42]</dt><dd>Robin Berjon &amp; Juan Caballero. <a href="https://dasl.ing/dcbor42.html"><cite>Deterministic CBOR with tag 42 (dCBOR42)</cite></a>. 2025-03-17. URL:&nbsp;<a href="https://dasl.ing/dcbor42.html">https://dasl.ing/dcbor42.html</a></dd><dt id="ref-ipfs-principles">[ipfs-principles]</dt><dd>Robin Berjon. <a href="https://specs.ipfs.tech/architecture/principles/"><cite>IPFS Principles</cite></a>. march 2023. URL:&nbsp;<a href="https://specs.ipfs.tech/architecture/principles/">https://specs.ipfs.tech/architecture/principles/</a></dd><dt id="ref-masl">[masl]</dt><dd>Robin Berjon &amp; Juan Caballero. <a href="https://dasl.ing/masl.html"><cite>MASL — Metadata for Arbitrary Structures &amp; Links</cite></a>. 2025-03-17. URL:&nbsp;<a href="https://dasl.ing/masl.html">https://dasl.ing/masl.html</a></dd><dt id="ref-url">[url]</dt><dd>WHATWG. <a href="https://url.spec.whatwg.org/"><cite>URL</cite></a>. Living Standard. URL:&nbsp;<a href="https://url.spec.whatwg.org/">https://url.spec.whatwg.org/</a></dd></dl></section></main></body></html>