The identity model explained in the text is probably incomplete #5

Closed
ehsan opened this issue Aug 22, 2019 · 7 comments

@ehsan

ehsan commented Aug 22, 2019

The text summarizes the web's "identity model" into the following:

  • Per-domain state, especially cookies, which let one eTLD+1 maintain a consistent notion of a visitor's identity. This identity extends across top-level sites due to 3p cookies, storage within iframes, etc.
  • In-browser passing of information, among the parties co-occurring on a web page (via mechanisms like shared state in DOM or JS, or HTTP redirects, or postMessage).

However, this model is overly simplistic if the aim is to explain the data sharing/leakage that makes current online surveillance practices possible.

For example, it doesn't take into consideration data exchanged between sites as part of navigation (e.g. through the navigation destination URL), PII exfiltrated from the browser, PII exfiltrated from embedded content, behavioural fingerprints, etc.
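
To make the navigation case concrete, here is a minimal sketch of passing an identifier through the destination URL (often called link decoration). It is an illustration only: the `uid` parameter name, the `site-b.example` host, and both helper functions are hypothetical and do not come from the proposal or from any real tracker.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def decorate_link(destination: str, param: str, identifier: str) -> str:
    """Append an identifier to a navigation URL (hypothetical helper)."""
    parts = urlsplit(destination)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, identifier))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_decoration(url: str, param: str) -> Optional[str]:
    """Recover the identifier on the destination site, if present."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)


# Site A rewrites an outgoing link; site B reads the value on arrival.
# No cookie or shared storage is involved in the transfer.
link = decorate_link("https://site-b.example/article?ref=home", "uid", "abc123")
print(link)
print(read_decoration(link, "uid"))
```

Under these assumptions the identifier crosses the site boundary purely through navigation, so it is not covered by either of the two bullets above.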

@michaelkleber
Owner

You're quite right that my two-line summary isn't exhaustive. But those two mechanisms alone are enough to get a global identity, and indeed the global identities used by the display ads ecosystem have historically been built on those two capabilities.

Basically, since lots of this document is about the scope of an identifier, I wanted to be clear that the scope of identifiers historically was "global".

@ehsan
Author

ehsan commented Aug 30, 2019

> You're quite right that my two-line summary isn't exhaustive. But those two mechanisms alone are enough to get a global identity, and indeed the global identities used by the display ads ecosystem have historically been built on those two capabilities.

Sure.

> Basically, since lots of this document is about the scope of an identifier, I wanted to be clear that the scope of identifiers historically was "global".

I fail to understand how, for example, we can talk about sharding the user's identity if we can't even define what a user's identity actually means. FWIW, if you read the examples I gave carefully, one of them can very easily be used to jeopardize the partitioning you described in the proposal.

Anyway, if you think closing this issue is the right way to address it, it's your repo I guess...

@ehsan
Author

ehsan commented Aug 30, 2019

#12 is one such example...

michaelkleber added a commit that referenced this issue Aug 30, 2019
Per #5, clarify that per-domain state and information-passing are only some pieces of how identity has worked, by no means all of it.
@michaelkleber
Owner

Sorry, I didn't mean to snub your point by closing this issue; rather, I was agreeing with you that the identity model is incomplete! Will continue discussion on your new issue.

@jyasskin

jyasskin commented Aug 30, 2019

Part of the problem might be that the document never says "The web's identity model is ..." or "In our vision of the web, an identity is ...". Once we have a declarative definition, we can talk about ways an identity can be stored and transferred between origins/sites/parties.

The first bullet in https://github.com/michaelkleber/privacy-model#identity-is-partitioned-by-first-party-site comes fairly close to this kind of definition, and probably deserves to be promoted to somewhere more visible.

As I mention in #10 (comment), we might even want to use some core term other than "identity", to avoid the connotation baggage that has from real-world identities.

@ehsan
Author

ehsan commented Sep 3, 2019

> Sorry, I didn't mean to snub your point by closing this issue; rather, I was agreeing with you that the identity model is incomplete! Will continue discussion on your new issue.

No problem! As long as the discussion happens I'm happy, wherever it may occur.

> Part of the problem might be that the document never says "The web's identity model is ..." or "In our vision of the web, an identity is ...". Once we have a declarative definition, we can talk about ways an identity can be stored and transferred between origins/sites/parties.
>
> The first bullet in https://github.com/michaelkleber/privacy-model#identity-is-partitioned-by-first-party-site comes fairly close to this kind of definition, and probably deserves to be promoted to somewhere more visible.

Yes, I agree. I also think it is worth trying to formulate that more precisely than with those two example sites (e.g. what about my identity on google.com vs youtube.com? what about my identity on accounts.google.com vs www.google.com? what about my identity when I log into my browser to turn on sync vs www.google.com? etc. etc. etc.)
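
As a rough illustration of why these pairs are hard to classify, the sketch below compares them under a plain eTLD+1 ("same site") rule and under a same-origin rule. It rests on loud assumptions: `PUBLIC_SUFFIXES` is a tiny hypothetical stand-in for the real Public Suffix List, and the helper names are made up for this example.

```python
from urllib.parse import urlsplit

# ASSUMPTION: a tiny stand-in for the Public Suffix List, for illustration only.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registrable_domain(host: str) -> str:
    """Return the eTLD+1 of a host, using the stand-in suffix set above."""
    labels = host.lower().rstrip(".").split(".")
    # Scanning from the left finds the longest matching public suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host!r} is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {host!r}")


def same_site(url_a: str, url_b: str) -> bool:
    """True if both URLs share an eTLD+1."""
    host_a, host_b = urlsplit(url_a).hostname, urlsplit(url_b).hostname
    return registrable_domain(host_a) == registrable_domain(host_b)


def same_origin(url_a: str, url_b: str) -> bool:
    """True if scheme, host, and port all match."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


print(same_site("https://accounts.google.com/", "https://www.google.com/"))
print(same_origin("https://accounts.google.com/", "https://www.google.com/"))
print(same_site("https://www.google.com/", "https://www.youtube.com/"))
```

Under these assumptions, accounts.google.com and www.google.com are the same site but different origins, while google.com and youtube.com are different sites even though one party operates both. The browser-sync case has no URL at all, so a rule of this kind cannot express it.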

As I mention in #10 (comment), we might even want to use some core term other than "identity", to avoid the connotation baggage that has from real-world identities.

Absolutely.

At the very least, there is the user's aspect to this, and there is the publisher's aspect to this (commonly known as "data leakage" in the parlance used among the publisher folks).
