Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify a metadata format for external release descriptions #436

Merged
merged 5 commits into from
Jan 16, 2023

Conversation

ximion
Copy link
Owner

@ximion ximion commented Sep 29, 2022

Hi!
This PR contains the specification for a "new" metadata format that contains release descriptions. Having them split out of the main metainfo file allows them to be updated independently and separately, and even allows us to fetch updated release information automatically at catalog build time.

This PR follows the discussion at #354

Short summary of the changes:

  • To use external release metadata, the main metainfo component must contain a releases tag with the property external="yes" (note: maybe type="external" is better / more consistent?). This is done so we don't need to stat() around too much and to have the metadata author communicate their intent clearly (good for validation!).
  • External release metadata files follow the existing specification for AppStream releases, the releases tag is the root node of their XML.
  • External release metadata files must be installed as /usr/share/metainfo/releases/%{cid}.releases.xml (cid being the component ID they belong to).
  • External releases may be downloaded from a HTTPS web URL set in the metainfo file's releases tag.
  • The external release data must be installed locally, even if it is incomplete. Only having the web URL is not permitted.
  • Any artifact present in a release file downloaded from a remote source must not be trusted without further verification for now, until release metadata signing is implemented.

As you can see, I cut out the signing & verification issue for now, which, given that we only allow HTTPS links for release metadata, is probably also only a real problem in case of domain hijacking. That way we can make progress on this feature and add signing later (I have ideas as for how that may work too, but taking one step at a time feels like the better approach).

Feedback is very welcome before this gets implemented! (Especially from @pwithnall and @Pointedstick :-))

@JakobDev
Copy link
Contributor

external="yes" (note: maybe type="external" is better / more consistent?).

I would suggest using type, so more types (e.g. automaticity import releases from other format) could be added in the future.

External release metadata files must be installed as /usr/share/metainfo/releases/%{cid}.releases.xml (cid being the component ID they belong to).

It is part of the AppStream specification, but as it is made to be used for other purposes, where Program,s may want to get a changelog, too, I would suggest a extra directory like share/releaseinfo.

Any artifact present in a release file downloaded from a remote source must not be trusted without further verification for now, until release metadata signing is implemented.

Is the artifact tag actually used by anyone?

As you can see, I cut out the signing & verification issue for now, which, given that we only allow HTTPS links for release metadata, is probably also only a real problem in case of domain hijacking.

I personally think this is a little bit overengineering. The changelog of a program for software centers is not security critical. Nothing bad happens if a Domain would be hijacked and someone would provide another changelog, this would not do something else than trolling Users.

A big problem that I see is that the changelog is not mirrored. Screenshots are mirrored, so we can be sure that they are always available and the Screenshots are always the same file until next compose. But as I read, the purpose of the external release URL is that the changelog can be changed after release without a new version. So this would not be mirrored and and the software centers would access the URL each time someone is viewing the App. This brings a few problems:

  1. What happens if the URL disappear/changed?
  2. What about privacy? The Server owner could log every person who views the App with the IP. Especially under Linux, there are a lot people who really care about privacy, even if it's just the IP.

And what about the Flathub website. They need to download the file each time someone visits the App page. you can cache i for let's say a day, but the backbend needs to download and parse a file from some random URL. I'm not sure if the devs are a fan of this.

Another problem would be the backwards compatibility. First, everyone (including Flatpak) needs to switch to appstreamcli from appstream-glib. I'm not sure how much distros still use appstream-glib. This would also break the experience for Users which use older Distros. If AppStream adds a new feature it is usually a addition like the new internet tag and will be just ignored by appstream-glib and older versions of libappstream. But this would be a breaking change for some existing and every time used feature.

@ximion
Copy link
Owner Author

ximion commented Sep 30, 2022

It is part of the AppStream specification, but as it is made to be used for other purposes, where Program,s may want to get a changelog, too, I would suggest a extra directory like share/releaseinfo.

It's still very much part of the metainfo specification though, which projects should already have. I don't think scattering files across too many places makes sense, tbh...

Is the artifact tag actually used by anyone?

Yes, it's used a lot actually, there's a bunch of internal uses I know about but publicly fwupd and KDE make heavy use of it. Whether it's used or not, as long as it is a supported part of the specification we need to support it anyway.

I personally think this is a little bit overengineering. The changelog of a program for software centers is not security critical.

It may not just be a changelog though, it may also contain links to the actual program data.

A big problem that I see is that the changelog is not mirrored.

It is. This change only makes modifications to the metainfo spec, not to the catalog spec. That means for aggregated catalog data, a tool reading that data will never see an external changelog, it is always downloaded by the catalog metadata generator and embedded into the catalog data. That also ensures quite a great deal of backwards compatibility.

What happens if the URL disappear/changed?

A warning is emitted and the generator tool will likely fall back to the locally provided data (I could probably add some recommendations for good behavior to the spec, but I don't want to mandate what implementations do for that case - you could also argue that the component should be rejected entirely if the URL vanished).

What about privacy? The Server owner could log every person who views the App with the IP.

Not the case, software centers should never hit the original source unless they read the metainfo file directly. In every other case, a tool like appstream-generator will have take care of the task (of course, there's always Arch Linux which doesn't mirror anything, but for them the privacy issue already exists for screenshots and the problem could be trivially mitigated by only reading the locally provided data instead of the remote one. But that's a fringe case, for most people the catalog generator will have been the only thing that downloaded the metadata.)

Another problem would be the backwards compatibility. First, everyone (including Flatpak) needs to switch to appstreamcli from appstream-glib. I'm not sure how much distros still use appstream-glib.

AFAIK only Fedora and OpenSUSE. But yeah, it will take a while before projects can rely on this feature, same for when we introduced em and code highlights in descriptions and appstream-glib took forever to support it and appstream needed to percolate through all distros first. But that is precisely why I want to add this now, before AppStream 1.0 is released.

This would also break the experience for Users which use older Distros.

That would not be the case, the catalog format remains fully compatible. In some odd cases, distros may need to update the generator server-side, but most do that already anyway :-)

@ximion ximion force-pushed the wip/mak/external-releases branch from 4665d91 to 3df299d Compare October 2, 2022 02:33
@CarlSchwan
Copy link
Contributor

I like the idea and from KDE point of view I don't see any issue in implementing and we already provide rss feeds with the release information from the appstream data. See for example:

curl https://apps.kde.org/neochat/index.xml | xmllint --format -

That we will probably need to do is to copy this release feeds to a cdn so that we don't end up with ddos-ing of our poor servers once discover and co will fetch the metadata ;)

@ximion ximion force-pushed the wip/mak/external-releases branch from 3df299d to 0980de7 Compare November 27, 2022 00:17
@ximion ximion force-pushed the wip/mak/external-releases branch from 0980de7 to 894da43 Compare January 4, 2023 01:22
@ximion
Copy link
Owner Author

ximion commented Jan 6, 2023

I am currently working on an implementation of this feature as described. Once merged, I will release AppStream 0.16 early next month, which will be the last series before the 1.0 release, and likely will receive support for quite a while. It will also include code to make the transition to AppStream 1.0 easier.
The goal is to get 0.16.x into the next Debian release (and therefore Ubuntu and all the derivatives too), which means 0.16.x needs to be released before the soft freeze early next month.

So, any feedback on this feature is highly appreciated! Fortunately, with the more limited scope set for it, it actually is a lot easier to implement and much less disruptive than I originally assumed :-)

@ximion ximion force-pushed the wip/mak/external-releases branch 5 times, most recently from c3391e0 to 4db736d Compare January 12, 2023 01:42
@ximion
Copy link
Owner Author

ximion commented Jan 12, 2023

If there are no more objections or comments on this, I will merge the change this week (in a few days).

Copy link
Contributor

@pwithnall pwithnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow reply on this one. The spec looks great! Thanks for putting some much work into this and driving it to completion.

I’ve left a few minor comments, but nothing significant apart from one query about file naming.

docs/xml/metainfo-component.xml Outdated Show resolved Hide resolved
docs/xml/metainfo-component.xml Outdated Show resolved Hide resolved
<title>Local Release Data</title>
<para>
Please note that even if release data is external and also provided on a remote location, it also <emphasis>must</emphasis> be
available locally, installed as a file into <filename>/usr/share/metainfo/releases/%{cid}.releases.xml</filename>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be collisions over /usr/share/metainfo/releases/%{cid}.releases.xml if %{cid} is listed in multiple appstream files? For example, the situation where org.gnome.Builder.desktop is provided in both the flathub and flathub-beta flatpak repos, as two different versions with two different changelogs.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any collision, as the local release metadata will always be alongside the MetaInfo file (so e.g. /usr/share/metainfo/releases/org.gnome.Builder.releases.xml and /usr/local/share/metainfo/releases/org.gnome.Builder.releases.xml don't collide).
For catalog data, we currently don't allow remote releases files (but might do that in future - the code would already read those if we decide to add them later), so each source would have its own URL and we wouldn't have any collisions again.

compose/asc-utils-metainfo.c Show resolved Hide resolved
src/as-component-private.h Outdated Show resolved Hide resolved
ximion added a commit that referenced this pull request Jan 16, 2023
@ximion ximion force-pushed the wip/mak/external-releases branch from 4db736d to 28877f5 Compare January 16, 2023 02:25
@ximion
Copy link
Owner Author

ximion commented Jan 16, 2023

Thank you very much @pwithnall for the review! It's very appreciated :-)
I addressed all comments, so I think this is safe to merge now. Once the code is in master, we'll see if anything breaks (hopefully not, the feature is written in a way that is fully backwards compatible and should be safe to use once the new libappstream has received some more widespread adoption. But of course it's also certainly one of the more invasive changes done on AppStream...)

Thanks to everyone for the feedback!

@ximion ximion force-pushed the wip/mak/external-releases branch from 28877f5 to f563650 Compare January 16, 2023 02:34
@ximion ximion merged commit e4df93c into master Jan 16, 2023
@ximion ximion deleted the wip/mak/external-releases branch January 16, 2023 02:46
@ximion ximion restored the wip/mak/external-releases branch December 16, 2023 17:44
@ximion ximion deleted the wip/mak/external-releases branch December 16, 2023 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants