
Data Integrity -> external resources #323


Open
alenhorvat opened this issue Dec 4, 2024 · 8 comments
Assignees
Labels
editorial This issue or PR constitutes an editorial change. future security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response.

Comments

@alenhorvat

Dear,

the following references that some issuers are using in their VCs are NOT available, meaning that verification of those credentials is impossible if the contexts are not stored locally by the verifier.

Furthermore, if the verifier wants to include them, the references cannot be easily found.

References:
https://w3c-ccg.github.io/did-resolution/contexts/did-resolution-v1.json
https://w3id.org/did-resolution/v1

This is a critical issue for the signature design. I urge the authors to consider defining a method of how the contexts CAN/SHOULD/MUST be included in the protected or unprotected signature part and what are the rules for 3rd parties hosting the contexts to ensure longevity.

Thank you for your understanding.

Alen

@msporny msporny added CR1 This item was processed during the first Candidate Recommendation phase. editorial This issue or PR constitutes an editorial change. labels Dec 8, 2024
@msporny
Member

msporny commented Dec 8, 2024

Hi Alen, we have discussed the issue you are raising before, and it took 6+ months to debate in the group. The previous issue can be seen here:

#272

... and the changes we made can be found here:

#272 (comment)

the following references that some issuers are using in their VCs are NOT available, meaning that verification of those credentials is impossible if the contexts are not stored locally by the verifier.

Yes, in this case, a verifier MUST NOT verify a VC that it doesn't understand in a production setting. We have text to this effect in the spec today:

https://www.w3.org/TR/vc-data-integrity/#validating-contexts
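As a rough illustration of that rule, a verifier might refuse any document whose `@context` entries are not on a locally reviewed allowlist. This is a minimal sketch, not the algorithm from the spec: the `KNOWN_CONTEXTS` set and both function names are hypothetical, and inline (object) contexts are simply treated as unknown here.

```python
from typing import Any

# Hypothetical allowlist: context URLs this verifier has reviewed and bundled locally.
KNOWN_CONTEXTS = frozenset({
    "https://www.w3.org/ns/credentials/v2",
})


def unknown_contexts(document: dict, known=KNOWN_CONTEXTS) -> list:
    """Return every @context entry that is not a known context URL."""
    contexts: Any = document.get("@context", [])
    if not isinstance(contexts, list):
        contexts = [contexts]
    # Anything that is not a known string URL (including inline objects) is reported.
    return [c for c in contexts if not (isinstance(c, str) and c in known)]


def check_contexts(document: dict, known=KNOWN_CONTEXTS) -> None:
    """Raise ValueError instead of verifying a document with unrecognized contexts."""
    unknown = unknown_contexts(document, known)
    if unknown:
        raise ValueError(f"refusing to verify: unrecognized @context entries {unknown!r}")
```

With this sketch, a credential that references `https://w3id.org/did-resolution/v1` would be rejected until that URL is added to the allowlist along with a local copy of the context.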

I urge the authors to consider defining a method of how the contexts CAN/SHOULD/MUST be included in the protected or unprotected signature part and

This is application specific, for example, the rules for VCs can be found here:

https://w3c.github.io/vc-data-model/#integrity-of-related-resources

specifically, this text:

A conforming verifier implementation that makes use of a resource based on the id of a relatedResource object inside a conforming document with a corresponding cryptographic digest appearing in a relatedResource object value MUST compute the digest of the retrieved resource. If the digest provided by the issuer does not match the digest computed for the retrieved resource, the conforming verifier implementation MUST produce an error.

So, if an issuer includes the hash for a context, then a verifier MUST ensure that the value it has matches or throw an error.
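A minimal sketch of that check, assuming the `relatedResource` entry carries a `digestSRI` value holding a single hash of the form `<algorithm>-<base64>`. The function names are hypothetical, and real SRI strings may carry several hashes or options that this sketch does not handle.

```python
import base64
import hashlib

SUPPORTED = ("sha256", "sha384", "sha512")


def sri_digest(content: bytes, algorithm: str = "sha384") -> str:
    """Compute an SRI-style digest string such as 'sha384-<base64>'."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    raw = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(raw).decode('ascii')}"


def check_related_resource(related: dict, retrieved: bytes) -> None:
    """Raise ValueError if the retrieved bytes do not match the issuer's digest."""
    expected = related["digestSRI"]
    algorithm = expected.split("-", 1)[0]
    actual = sri_digest(retrieved, algorithm)
    if actual != expected:
        raise ValueError(f"digest mismatch for {related.get('id')}")
```

The same comparison applies whether the bytes came from the network or from a local cache; what matters is that they match what the issuer hashed.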

what are the rules for 3rd parties hosting the contexts to ensure longevity.

The rules are the same for any long-lived content on the Web: Publish it to a URL that won't change and don't change the content. We do have some other guidance that we might put in an implementation guide, but we don't have consensus yet on exactly what we should say (other than what should already be obvious to developers that want to ensure long-lived content at long-lived URLs).

Does the above answer your questions, @alenhorvat?

@alenhorvat
Author

Hi. Thank you for the exhaustive answer, but I would expect the specification to define how to embed the context (either in a protected or unprotected way) if it is unavailable. In an open ecosystem, where a verifier may ask for a set of claims without referring to a specific VC type, a holder may present a VC whose full context is not cached by the verifier.

The main issue is that a signature can be invalidated due to unavailable remote content, even though the signature/credential are actually valid.

AdES signature profiles, for example, define how to embed additional content/metadata. If possible, I would kindly ask the WG to define how to embed remote content when connectivity might be limited.

@msporny
Member

msporny commented Jan 18, 2025

I would kindly ask the WG to define how to embed remote content when connectivity might be limited.

To be clear, your use case presumes that:

  1. The verifier is receiving data that is arbitrary (it doesn't know about the context it's receiving). This could happen w/ an extension context used on top of a base credential (and context) that the verifier does understand.
  2. The issuer/ecosystem has failed to publish (or keep operational) a well known context on the web. This could happen if the issuer goes out of business or is being subjected to a DDoS attack.
  3. There is no other location, software library, or specification that the verifier can retrieve the context from. This could happen for a barely used extension.

Are those the general use case presumptions you'd like the WG to discuss?

One solution could be to say that it is ok to include a fully expanded JSON-LD Context in a secured document like a VC if the above three things are a possibility for the use case. This might best go in the implementation guide. This is possible to do today, but the WG decided that telling implementers to do that could be error prone and lead to interoperability issues -- we had a few WG members that objected to the use of expanded context values in VCs.
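To make the inline option concrete, here is a minimal sketch of how an issuer might swap a context URL for the context's contents before securing the document. The `inline_context` helper is hypothetical, and it assumes the published context file has the usual `{"@context": {...}}` shape. As noted above, this approach can be error prone and was not endorsed by the WG.

```python
import copy


def inline_context(document: dict, url: str, context_document: dict) -> dict:
    """Return a copy of `document` with `url` in @context replaced by the inline context."""
    result = copy.deepcopy(document)
    contexts = result.get("@context", [])
    if not isinstance(contexts, list):
        contexts = [contexts]
    # The published file wraps the definitions in a top-level "@context" key.
    embedded = copy.deepcopy(context_document["@context"])
    result["@context"] = [embedded if entry == url else entry for entry in contexts]
    return result
```

The original document is left untouched, so the issuer can compare both forms before deciding which one to secure.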

If this is an archival use case, like what AdES tries to do, then the guidance could be to fetch the remote context immediately in the archival system and store it in the archival system.
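A minimal sketch of that archival guidance: at intake, fetch each context URL once and store its content, a digest, and the fetch time alongside the credential. The record layout and the injected `fetch` callable are assumptions for illustration, not anything defined by the spec.

```python
import hashlib
from datetime import datetime, timezone
from typing import Callable, Optional


def snapshot_contexts(
    document: dict,
    fetch: Callable[[str], bytes],
    now: Optional[datetime] = None,
) -> dict:
    """Fetch every context URL in `document` and return an archival record keyed by URL."""
    contexts = document.get("@context", [])
    if not isinstance(contexts, list):
        contexts = [contexts]
    fetched_at = (now or datetime.now(timezone.utc)).isoformat()
    record = {}
    for entry in contexts:
        if isinstance(entry, str):  # inline contexts are already part of the document
            body = fetch(entry)
            record[entry] = {
                "fetched_at": fetched_at,
                "sha256": hashlib.sha256(body).hexdigest(),
                "content": body.decode("utf-8"),
            }
    return record
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client and lets the archive replay verification later from the stored content alone.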

I will also note that this will only work for JSON-LD contexts, which can be embedded. It is true that all other things linked to can be embedded as well, so it's really a judgement call on the issuer's part of what to embed and what not to embed. I'll also note that it's pretty common for the Web and Internet to depend on links to work in the general case. We do speak to this here:

https://w3c.github.io/vc-data-integrity/#network-requests

I'll also mention that the WG is effectively done with the v1.0 specification at this point in time; we are waiting for a few reviews to come in and then will transition to the global standard publication.

We'll try to get this discussion scheduled with the group, but if we are not able to in time, this suggestion might need to go in v1.1 of the specification (the Working Draft would be published immediately after the v1.0 release).

@msporny msporny added the discuss This item needs WG discussion to resolve. label Jan 18, 2025
@alenhorvat
Author

Congratulations on the v1.0 release!

Short answer to

Are those the general use case presumptions you'd like the WG to discuss?
yes.

The use case is actually very simple:

  • split the technical and business processing
  • technical processing: signature verification, issuer/subject information, binding, revocation, ...: all properties that are common to VCs
  • business processing: VC-dependent business logic

Due to the coupling of the business/tech layer via the LD context, it's not possible to verify the VC signature (a very basic verification) if the business context (extension context) cannot be resolved.

Attaching remote content is a solved problem in the AdES world: protect the remote content by protecting its digest, and put the remote content itself into the unsigned payload. This applies to any remote content.
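The pattern described here could be sketched as follows. Data Integrity does not define these structures today: `signed_digests` and `unsigned_attachments` are hypothetical containers standing in for the protected and unprotected parts, and SHA-256 hex digests are an arbitrary choice.

```python
import hashlib


def context_from_attachment(
    url: str,
    signed_digests: dict,
    unsigned_attachments: dict,
) -> bytes:
    """Return the attached copy of `url` only if it matches the digest in the signed part."""
    if url not in signed_digests or url not in unsigned_attachments:
        raise LookupError(f"no protected attachment available for {url}")
    content = unsigned_attachments[url]
    actual = hashlib.sha256(content).hexdigest()
    if actual != signed_digests[url]:
        # The attachment is unsigned, so it is trusted only via the signed digest.
        raise ValueError(f"attached content for {url} does not match the signed digest")
    return content
```

Because the digest is covered by the signature, a verifier with no connectivity can still use the attached copy without trusting whoever supplied it.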

Thank you for your time and consideration.

@selfissued
Contributor

It's the job of profiles to normatively specify ways to archive this data and to access the archive. I believe this issue should be closed on this basis.

@msporny msporny removed the discuss This item needs WG discussion to resolve. label Jan 22, 2025
@msporny msporny self-assigned this Jan 22, 2025
@iherman
Member

iherman commented Jan 22, 2025

The issue was discussed in a meeting on 2025-01-22

  • no resolutions were taken
View the transcript

2.3. Data Integrity -> external resources (issue vc-data-integrity#323)

See github issue vc-data-integrity#323.

Ivan Herman: this one is on data integrity.

Manu Sporny: this issue was about what happens when URLs that are linked to later disappear.
… the concern is mostly about archival systems.
… there are specifications that define how you fully encapsulate things for archival.
… and I think the hope is that we deal with some of those scenarios in our spec vs. downstream in archival specs.
… there are scenarios like extensions or the loss of the document being linked to from all systems.
… such as an issuer shutting down and no archives being made for the issuer's documents.
… if there is nothing anywhere that one could get the context, then what do archives need to do?

Dave Longley: -1 to make any of our specs create and provide a process that everyone has to follow to fetch and cache URLs.

Manu Sporny: one idea is to expand it inline for archival.
… or they could download it into the archival record.
… there are several ways to address it.
… do we want to explain that in the spec? or do we want to defer that to other specs?

Dave Longley: I'd not be opposed to stating you could store any URL and its value for archival purposes, etc.
… this would apply to any URL, so we'd also need to update JOSE/COSE to reflect that.

Ivan Herman: is there anything here that is VC specific?
… this is a 404 problem which happens all the time.

Dave Longley: +1 to Ivan, if you are fetching URLs and you want to be able to "replay what happened" you need to do this.

Michael Jones: it's not useful to describe something informatively if that text could result in normative changes.

Dave Longley: +1 to selfissued that we don't need to say anything here.

Manu Sporny: agreed with selfissued. I don't think there's much more to say here beyond what we already say.
… we could perhaps add something to the long term security considerations section.
… maybe we just ask for language this person would like to see.
… and then consider that.

Dave Longley: "if your system fetches URLs at time X, you'll want to save that content from that time if you want to replay exactly what happened.".

Ivan Herman: sounds good. can someone help manu with that?

@alenhorvat
Author

It's the job of profiles to normatively specify ways to archive this data and to access the archive. I believe this issue should be closed on this basis.

I agree about the archiving, but the issue is not about archiving. It is about an important feature of Data Integrity: the proofs (at least those that rely on LD expansion) require a valid context in order to even construct the content to be validated (we're talking about a simple data integrity check).

I hope the topic can be addressed.

@msporny
Member

msporny commented Jan 26, 2025

I hope the topic can be addressed.

The WG discussed this during the last meeting and noted that we could add some language to the Security Considerations section to address your concern. Could you provide the language you'd like to see in the section? If not, could you at least provide a bulleted list of topics that you'd like to see addressed? If you do that, the Editors can try to construct some language that might address your concerns.

@brentzundel brentzundel added future and removed CR1 This item was processed during the first Candidate Recommendation phase. labels Feb 19, 2025
@plehegar plehegar added the security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response. label Mar 10, 2025

6 participants