SPICE-0012: URL standard library #13
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,336 @@ | ||||||
| = URL standard library module | ||||||
|
|
||||||
| * Proposal: link:./SPICE-0012-url-standard-library-module.adoc[SPICE-0012] | ||||||
| * Author: https://github.com/bioball[Dan Chao] | ||||||
| * Status: TBD | ||||||
| * Implemented in: TBD | ||||||
| * Category: Standard Library | ||||||
|
|
||||||
| == Introduction | ||||||
|
|
||||||
| This proposal introduces a new standard library module for managing and describing URLs. | ||||||
|
|
||||||
| == Motivation | ||||||
|
|
||||||
| A URL (URI) is a common type used within service configuration. | ||||||
|
|
||||||
| Examples: | ||||||
|
|
||||||
| * Website addresses | ||||||
| * Database connection strings | ||||||
| * Binary objects (data URIs) | ||||||
|
|
||||||
| The base module has a `Uri` typealias, but it is defined as a plain `String` and provides no validation. | ||||||
|
|
||||||
| Currently, there exists an https://pkl-lang.org/package-docs/pkg.pkl-lang.org/pkl-pantry/pkl.experimental.uri/current/URI/index.html[experimental URI library]. | ||||||
| Much of this design draws on lessons learned from that library. | ||||||
|
|
||||||
| == Proposed Solution | ||||||
|
|
||||||
| A new standard library module will be added, called `pkl.Url`. | ||||||
|
|
||||||
| A new external property on `String` will be added, called `isValidUrl`. | ||||||
|
|
||||||
| The `Uri` typealias will be changed to `typealias Uri = String(isValidUrl)`. | ||||||
|
|
||||||
| == Detailed design | ||||||
|
|
||||||
| Pkl's URL implementation will follow the rules described in the https://url.spec.whatwg.org[WHATWG URL standard]. | ||||||
|
|
||||||
| Following the standard, it will be called "URL", and not "URI" nor "IRI". | ||||||
| The https://url.spec.whatwg.org/#goals[rationale] for this naming: | ||||||
|
|
||||||
| > Standardize on the term URL. URI and IRI are just confusing. In practice a single algorithm is used for both so keeping them distinct is not helping anyone. URL also easily wins the https://trends.google.com/trends/explore?q=url,uri[search result popularity contest]. | ||||||
|
|
||||||
| === Module-level properties | ||||||
|
|
||||||
| The following properties make up the `pkl.Url` module: | ||||||
|
||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
Contributor
How would a URI like In go, this is parsed as
Member
Author
This would be parsed as See https://url.spec.whatwg.org/#example-url-components Actually, I wonder how we can better support opaque URLs. The env URL
Contributor
I'd like to proffer Go's URL implementation as a carrot and Swift's as the stick:

Go

Go's
In practice, I've found this API to be extremely usable and flexible enough for working with the varied URIs found in the Pkl ecosystem. This may be a good design to reference that departs somewhat from the literal spec in the name of usability.

Swift

Having worked with pkl-swift, I've found Foundation URL type's lack of similar support makes working with opaque URIs like

Pkl

Given that opaque URIs are already commonplace in the Pkl ecosystem (and used for several of the runtime's built-in resources), I think having first-class support for them in this proposal is necessary. A similar model to Go's could be adopted, but with added type constraints to ensure that A case like opaque: String = "<rendered authority/path>"
Thoughts?
Member
Author
You actually need to do wrangling too with Go's

    func getSchemeSpecificPart(u *url.URL) (string, error) {
        return url.PathUnescape(strings.Split(u.String(), ":")[1])
    }

Whether a URL is opaque or not is really up to the scheme, and a simple rule like "if the char after the colon is |
||||||
|
|
||||||
| /// The scheme component. | ||||||
| scheme: AsciiString | ||||||
|
|
||||||
| /// The username component. | ||||||
| /// | ||||||
| /// If the URL does not require a username, set to the empty string. | ||||||
| username: AsciiString | ||||||
|
Contributor
What's the reason to make the empty string the "null" value instead of proper
Member
Author
Good question. According to the spec, this field (and several other ones) do not admit null: https://url.spec.whatwg.org/#url-representation There are some fields that can be null, but this is not one of them. Practically, I don't know if this makes any difference. |
||||||
|
|
||||||
| /// The password component. | ||||||
| /// | ||||||
| /// If the URL does not require a password, set to the empty string. | ||||||
| password: AsciiString | ||||||
|
Contributor
Same question as above. |
||||||
|
|
||||||
| /// A domain name, IPv4 address, IPv6 address, or an otherwise opaque host. | ||||||
| hostname: String? | ||||||
|
||||||
|
|
||||||
| /// The port component. | ||||||
| port: UInt16? | ||||||
|
|
||||||
| /// The path component. | ||||||
| /// | ||||||
| /// It typically refers to a directory or a file, but has no predefined meaning. | ||||||
| path: String? | ||||||
|
|
||||||
| /// The query string component. | ||||||
| query: String? | ||||||
|
|
||||||
| /// The fragment component. | ||||||
| fragment: AsciiString? | ||||||
|
|
||||||
| /// A string whose characters are in the printable ASCII range (code points `0x20` through `0x7e`). | ||||||
| local typealias AsciiString = String(matches(Regex("[ -~]*"))) | ||||||
| ---- | ||||||
|
|
||||||
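As a rough illustration of the component model above, the following TypeScript sketch splits a URL with the built-in `URL` class, which implements the WHATWG standard this proposal follows. It is not part of the proposal: the `UrlParts` interface and the `toParts` helper are hypothetical names that only mirror the proposed Pkl properties, and the example URL is made up. The mapping of empty strings to `null` is an approximation, because the JavaScript API does not distinguish an absent query or fragment from an empty one.

```typescript
// Hypothetical shape mirroring the proposed pkl.Url properties (assumption, not the Pkl API).
interface UrlParts {
  scheme: string;
  username: string;
  password: string;
  hostname: string | null;
  port: number | null;
  path: string | null;
  query: string | null;
  fragment: string | null;
}

// Splits `input` into components using the WHATWG-conformant `URL` class.
function toParts(input: string): UrlParts {
  const u = new URL(input);
  return {
    scheme: u.protocol.replace(/:$/, ""), // `protocol` includes the trailing colon
    username: u.username,                 // empty string when absent, never null
    password: u.password,                 // empty string when absent, never null
    hostname: u.hostname === "" ? null : u.hostname,
    port: u.port === "" ? null : Number(u.port),
    path: u.pathname === "" ? null : u.pathname,
    query: u.search === "" ? null : u.search.slice(1), // drop the leading "?"
    fragment: u.hash === "" ? null : u.hash.slice(1),  // drop the leading "#"
  };
}

const parts = toParts("https://alice@example.com:8443/foo.txt?key=730d67#top");
console.log(parts);
```

Note that `username` and `password` come back as empty strings rather than `null` when absent, which matches the convention used for those two properties above.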
| === Parser API | ||||||
|
|
||||||
| A parser API will be introduced for parsing string inputs into URLs. This parser is a class within module `pkl.Url`. | ||||||
|
|
||||||
| The parser will follow the steps as described in https://url.spec.whatwg.org/#concept-basic-url-parser[WHATWG]. | ||||||
|
|
||||||
| The base URL, as per the specification, is used to help resolve relative-URL strings. | ||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
|
||||||
| import "pkl:Url" | ||||||
|
|
||||||
| // etc | ||||||
|
|
||||||
| /// A URL parser. | ||||||
| /// | ||||||
| /// Follows the specification in <https://url.spec.whatwg.org/#concept-basic-url-parser>. | ||||||
| class Parser { | ||||||
| /// The base URL, if any. | ||||||
| base: Url? | ||||||
|
|
||||||
| /// Parses [source] into a URL. | ||||||
| /// | ||||||
| /// Throws if [source] is an invalid URL. | ||||||
| external function parse(source: String): Url | ||||||
|
Contributor
I'm fine with having one version that throws on failure, for when users know their URL is correct. But we should have
Member
Author
Our other parsers don't have a If you need to recover from errors, you can use
Member
Author
After chatting a little bit more on this, I'm convinced that we should just have a The rationale is: we also have similar "OrNull" methods in other places. It makes it much easier for users; e.g. We should have the YAML and JSON parsers conform to this as a future task. |
||||||
| } | ||||||
| ---- | ||||||
|
|
||||||
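To make the throwing and null-returning behaviours discussed in the review thread concrete, here is a small TypeScript sketch built on the WHATWG `URL` constructor. The `parseOrNull` helper is a hypothetical name chosen for illustration and is not the method name proposed for Pkl. The optional second argument plays the role of the parser's `base`.

```typescript
// Hypothetical helper: returns null instead of throwing when parsing fails.
function parseOrNull(source: string, base?: string): URL | null {
  try {
    return new URL(source, base);
  } catch {
    // `new URL` throws a TypeError when `source` cannot be parsed
    return null;
  }
}

// An absolute URL parses without a base.
const absolute = parseOrNull("https://example.com/foo.txt");

// A relative-URL string is resolved against the base URL.
const relative = parseOrNull("bar.txt", "https://example.com/dir/foo.txt");

// The same relative-URL string fails when no base is supplied.
const invalid = parseOrNull("bar.txt");

console.log(absolute?.href); // https://example.com/foo.txt
console.log(relative?.href); // https://example.com/dir/bar.txt
console.log(invalid);        // null
```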
| === `SearchParams` API | ||||||
|
|
||||||
| A search params API will be introduced for working with `application/x-www-form-urlencoded` encoded query strings. | ||||||
|
|
||||||
| Additionally, a `hidden fixed` property is added representing the parsed search params of the current URL's query string. | ||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
|
||||||
| // etc | ||||||
|
|
||||||
| /// The parsed query as search params. | ||||||
| hidden fixed searchParams: SearchParams? = | ||||||
| if (query != null) SearchParams(query) | ||||||
| else null | ||||||
|
|
||||||
| /// Creates a [SearchParams] from the given form encoded string. | ||||||
| const function SearchParams(input: String): SearchParams = // etc | ||||||
|
Contributor
Is it possible
Member
Author
Good question; I don't think so. Here is the parsing algorithm: https://url.spec.whatwg.org/#urlencoded-parsing None of these steps involve failure. |
||||||
|
|
||||||
| /// A representation of data encoded in `application/x-www-form-urlencoded` format. | ||||||
| class SearchParams { | ||||||
| values: Mapping<String, Listing<String>> | ||||||
|
|
||||||
| function toString() | ||||||
| } | ||||||
| ---- | ||||||
|
|
||||||
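The multi-valued shape of `values` can be illustrated with the built-in `URLSearchParams` class, which implements the same `application/x-www-form-urlencoded` rules. This is a sketch under assumptions: `groupParams` is a hypothetical helper, and a JavaScript `Map<string, string[]>` stands in for `Mapping<String, Listing<String>>`.

```typescript
// Groups form-encoded pairs by key; repeated keys accumulate in order of appearance.
function groupParams(query: string): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  new URLSearchParams(query).forEach((value, key) => {
    const existing = grouped.get(key);
    if (existing) {
      existing.push(value);
    } else {
      grouped.set(key, [value]);
    }
  });
  return grouped;
}

// "+" decodes to a space and "%26" decodes to "&".
const grouped = groupParams("key=730d67&key=abc&q=a+b%26c");
console.log(grouped.get("key")); // [ '730d67', 'abc' ]
console.log(grouped.get("q"));   // [ 'a b&c' ]

// Serializing applies the inverse encoding.
const serialized = new URLSearchParams([["key", "730d67"], ["q", "a b&c"]]).toString();
console.log(serialized); // key=730d67&q=a+b%26c
```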
| === Percent encoding API | ||||||
|
|
||||||
| Several new methods will be introduced for working with percent encoding. | ||||||
|
|
||||||
| The `encode` method follows the `encodeURI` method as described in https://262.ecma-international.org/5.1/#sec-15.1.3.3[ECMA-262 15.1.3.3]. | ||||||
|
|
||||||
| The `encodeComponent` method follows the `encodeURIComponent` method as described in https://262.ecma-international.org/5.1/#sec-15.1.3.4[ECMA-262 15.1.3.4]. | ||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
|
||||||
| /// The [percent-encoding](https://en.wikipedia.org/wiki/Percent-encoding) of the UTF-8 bytes of | ||||||
| /// [source]. | ||||||
| /// | ||||||
| /// Example: | ||||||
| /// ``` | ||||||
| /// percentEncode(" ") == "%20" | ||||||
| /// percentEncode("/") == "%2F" | ||||||
| /// ``` | ||||||
| const external function percentEncode(source: String): String | ||||||
|
|
||||||
| /// The [percent-decoding](https://en.wikipedia.org/wiki/Percent-encoding) of [source], interpreting the decoded bytes as UTF-8. | ||||||
| /// | ||||||
| /// Example: | ||||||
| /// ``` | ||||||
| /// percentDecode("%20") == " " | ||||||
| /// percentDecode("%2F") == "/" | ||||||
| /// ``` | ||||||
| const external function percentDecode(source: String): String | ||||||
|
|
||||||
| /// Encodes [value] using percent-encoding to make it safe for literal use as a URI. | ||||||
| /// | ||||||
| /// All characters except for alphanumeric characters, and the characters `!#$&'()*+,-./:;=?@_~` | ||||||
| /// are percent-encoded. | ||||||
| /// | ||||||
| /// Follows the rules for the `encodeURI` function as described by | ||||||
| /// [ECMA-262](https://262.ecma-international.org/5.1/#sec-15.1.3.3). | ||||||
| /// | ||||||
| /// Facts: | ||||||
| /// ``` | ||||||
| /// encode("https://example.com/some path/") == "https://example.com/some%20path/" | ||||||
| /// ``` | ||||||
| const external function encode(value: String): String | ||||||
|
Contributor
Suggested change
Member
Author
The module's name is already By the way, one of the suggestions of this SPICE is to prefer "URL" over "URI". |
||||||
|
|
||||||
| /// Encodes [value] using percent-encoding to make it safe for literal use as a URI component. | ||||||
| /// | ||||||
| /// All characters except for alphanumeric characters, and the characters `-_.!~*'()` are | ||||||
| /// percent-encoded. | ||||||
| /// | ||||||
| /// Follows the rules for the `encodeURIComponent` function as described by | ||||||
| /// [ECMA-262](https://262.ecma-international.org/5.1/#sec-15.1.3.4). | ||||||
| /// | ||||||
| /// Facts: | ||||||
| /// ``` | ||||||
| /// encodeComponent("https://example.com/some path") == "https%3A%2F%2Fexample.com%2Fsome%20path" | ||||||
| /// ``` | ||||||
| const external function encodeComponent(value: String): String | ||||||
| ---- | ||||||
|
|
||||||
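Because `encode` and `encodeComponent` are specified in terms of `encodeURI` and `encodeURIComponent`, the expected results can be checked against any ECMAScript implementation. The TypeScript sketch below uses only those built-ins; it illustrates the reference behaviour and is not the Pkl implementation.

```typescript
// encodeURI leaves URI delimiters such as ":" and "/" intact and escapes the space.
const whole = encodeURI("https://example.com/some path/");
console.log(whole); // https://example.com/some%20path/

// encodeURIComponent also escapes the delimiters, so the result is safe inside a single component.
const component = encodeURIComponent("https://example.com/some path");
console.log(component); // https%3A%2F%2Fexample.com%2Fsome%20path

// Decoding the component restores the original string.
const roundTrip = decodeURIComponent(component);
console.log(roundTrip); // https://example.com/some path
```

The difference matters when a full URL is embedded as a query parameter value: only the component form keeps the embedded `/`, `:` and `?` from being read as delimiters of the outer URL.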
| === Method `toString()` | ||||||
|
|
||||||
| The `toString()` method will be overridden to return the serialized URL. | ||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
|
||||||
| // etc | ||||||
|
|
||||||
| function toString() = // implementation | ||||||
| ---- | ||||||
|
|
||||||
| ==== Sample usage: | ||||||
|
|
||||||
| [source,pkl] | ||||||
| ---- | ||||||
| myUrl: Url = new { | ||||||
| scheme = "https" | ||||||
| hostname = "example.com" | ||||||
| path = "/foo.txt" | ||||||
| } | ||||||
|
|
||||||
| result = myUrl.toString() // <1> | ||||||
| ---- | ||||||
| <1> `result = "\https://example.com/foo.txt"` | ||||||
|
|
||||||
| === Method `resolveUrl()` | ||||||
|
|
||||||
| A method, `resolveUrl()`, accepts another URL and resolves it as a reference to this URL. | ||||||
|
|
||||||
| It follows the rules described in https://www.rfc-editor.org/rfc/rfc3986#section-5.2[RFC-3986 Section 5.2]. | ||||||
|
|
||||||
| .pkl.Url | ||||||
| [source,pkl] | ||||||
| ---- | ||||||
| module pkl.Url | ||||||
|
|
||||||
| import "pkl:Url" | ||||||
|
|
||||||
| // etc | ||||||
|
|
||||||
| /// Resolves [other] as a URL reference to this URL. | ||||||
| /// | ||||||
| /// Follows the rules described in | ||||||
| /// [RFC-3986 Section 5.2](https://www.rfc-editor.org/rfc/rfc3986#section-5.2). | ||||||
| function resolveUrl(other: Url) = // implementation | ||||||
| ---- | ||||||
|
|
||||||
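Reference resolution can be sketched with the two-argument `URL` constructor. The `resolve` function below is a hypothetical stand-in for `resolveUrl()`, and it relies on the WHATWG algorithm rather than RFC 3986 itself; for the ordinary cases shown here (taken from the examples in RFC 3986 Section 5.4) the two agree.

```typescript
// Hypothetical stand-in for `resolveUrl()`: resolves `reference` against `base`.
function resolve(base: string, reference: string): string {
  return new URL(reference, base).href;
}

// Base URI used by the reference resolution examples in RFC 3986 Section 5.4.
const base = "http://a/b/c/d;p?q";

console.log(resolve(base, "g"));    // http://a/b/c/g
console.log(resolve(base, "../g")); // http://a/b/g
console.log(resolve(base, "?y"));   // http://a/b/c/d;p?y
console.log(resolve(base, "#s"));   // http://a/b/c/d;p?q#s
```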
| === Sample usage | ||||||
|
|
||||||
| URLs can be constructed either by using the parser or by setting properties directly on a `Url` object. | ||||||
|
|
||||||
| [source,pkl] | ||||||
| ---- | ||||||
| import "pkl:Url" | ||||||
|
|
||||||
| myUrl: Url = new { // <1> | ||||||
|
Contributor
Member
Author
We can ensure that you cannot construct invalid Urls through property types and constraints. |
||||||
| scheme = "https" | ||||||
| hostname = "example.com" | ||||||
| path = "/foo.txt" | ||||||
| } | ||||||
|
|
||||||
| local parser: Url.Parser = new {} | ||||||
|
|
||||||
| myUrl2: Url = parser.parse("https://example.com/foo.txt") // <2> | ||||||
|
|
||||||
| myUrl3: Url = new { // <3> | ||||||
| local sp: Url.SearchParams = new { | ||||||
| values { | ||||||
| ["key"] { "730d67" } | ||||||
| } | ||||||
| } | ||||||
| scheme = "https" | ||||||
| hostname = "example.com" | ||||||
| path = "/foo.txt" | ||||||
| query = sp.toString() | ||||||
| } | ||||||
|
|
||||||
| myUrl4: Url = // <4> | ||||||
| let (parsed = parser.parse("https://example.com/foo.txt?foo=bar")) | ||||||
| (parsed) { | ||||||
| query = (super.searchParams) { | ||||||
| values { | ||||||
| ["qux"] { "corge" } | ||||||
| } | ||||||
| }.toString() | ||||||
| } | ||||||
| ---- | ||||||
| <1> Constructing a URL directly | ||||||
| <2> Constructing a URL using `Url.Parser.parse()` | ||||||
| <3> Constructing a URL query using the `SearchParams` API | ||||||
| <4> Constructing a URL from an existing URL, and adding to its query string via the `SearchParams` API | ||||||
|
|
||||||
| == Compatibility | ||||||
|
|
||||||
| This is purely a new API, and is backwards compatible with existing Pkl. | ||||||
|
|
||||||
| == Future directions | ||||||
|
|
||||||
| === IP Address Library | ||||||
|
|
||||||
| A URL's host may be an IPv4 or IPv6 address. | ||||||
| To make such URLs easier to work with, Pkl could introduce an IP address library in the future. | ||||||
|
|
||||||
| With an IP address library, it is possible to provide better constraints on the `hostname` property (either ASCII string or IP address). | ||||||
|
|
||||||
| === Modifying other standard library properties | ||||||
|
|
||||||
| There are some other places throughout the standard library that make use of URIs. | ||||||
|
|
||||||
| These include: | ||||||
|
|
||||||
| * `pkl.reflect.Module.uri` | ||||||
| * `pkl.reflect.Module.imports` | ||||||
| * `pkl.Project.projectFileUri` | ||||||
| * `pkl.EvaluatorSettings.Proxy.address` | ||||||
|
|
||||||
| Currently, these are typed using typealias `Uri`. | ||||||
| A possible future direction is to change these types to `pkl.Url`. | ||||||
|
|
||||||
| == Alternatives considered | ||||||
|
|
||||||
| Instead of introducing a new module, we can add these as types to `pkl.base`. | ||||||
| However, any name added to the base module is a breaking change (a variable resolved off implicit `this` will break). | ||||||
|
|
||||||
| Additionally, adding new classes adds more overhead to the evaluation of any module. | ||||||