Hierarchical source code structures #259

@magnusbaeck

Description

There are a number of scenarios where source code repositories form hierarchies, i.e. the state of one repository links to the states of one or more other subrepositories that should be checked out together:

  • Git submodules
  • Git subtree?
  • Repo

There's currently no obvious way of representing this with Eiffel. You could use a composition that flattens the hierarchy by linking to all SCC/SCS in one level,

[Figure: Flat CD hierarchy]

or use multiple CDs to model the original hierarchy:

[Figure: Multilevel CDs]
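
To make the two alternatives concrete, here is a minimal sketch using heavily simplified stand-ins for Eiffel events (only id, type and links are modeled). The event ids, the `event` and `source_states` helpers, and the exact shape of the multilevel variant are assumptions made for illustration; the figures may lay the CDs out differently.

```python
SCS = "EiffelSourceChangeSubmittedEvent"
CD = "EiffelCompositionDefinedEvent"


def event(event_id, event_type, links=()):
    """Build a simplified event; real Eiffel events carry much more metadata."""
    return {
        "meta": {"id": event_id, "type": event_type},
        "links": [{"type": t, "target": target} for t, target in links],
    }


# Hypothetical ids: one parent repository commit and two submodule commits.
scs_events = [event("scs-parent", SCS), event("scs-sub-a", SCS), event("scs-sub-b", SCS)]

# Flat: a single CD links to every SCS on one level.
flat = [
    event("cd-flat", CD, [
        ("ELEMENT", "scs-parent"),
        ("ELEMENT", "scs-sub-a"),
        ("ELEMENT", "scs-sub-b"),
    ]),
]

# Multilevel (one possible layout): a child CD groups the submodules and the
# top CD links to the parent SCS plus that child CD.
multilevel = [
    event("cd-subs", CD, [("ELEMENT", "scs-sub-a"), ("ELEMENT", "scs-sub-b")]),
    event("cd-top", CD, [("ELEMENT", "scs-parent"), ("ELEMENT", "cd-subs")]),
]


def source_states(cd_id, events):
    """Follow ELEMENT links from a CD and collect the ids of all reachable SCS events."""
    by_id = {e["meta"]["id"]: e for e in events}
    found, seen, stack = set(), set(), [cd_id]
    while stack:
        current_id = stack.pop()
        if current_id in seen:
            continue
        seen.add(current_id)
        current = by_id[current_id]
        if current["meta"]["type"] == SCS:
            found.add(current_id)
            continue
        for link in current["links"]:
            if link["type"] == "ELEMENT":
                stack.append(link["target"])
    return found


flat_states = source_states("cd-flat", scs_events + flat)
multilevel_states = source_states("cd-top", scs_events + multilevel)
```

Both variants resolve to the same set of source code states when you start from the CD. The difference is only in how much of the original hierarchy survives in the event graph.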

On the surface this wouldn't be unreasonable. When you construct an ArtC and want to express what source code was built, you don't link directly to the SCC and SCS but rather to a CD, and that CD could very well have multiple ELEMENT links to SCSes of parent commits and the submodule commits. In fact, it could make sense since recursive checkouts are optional, i.e. you could build an artifact without the submodules.

However, say you have a commit id and locate its SCS from the ER (which I think is a non-contrived use case). Don't you want to be able to get the complete source code state from that commit? You could search for a CD that links to that SCS, but that search could turn up hundreds of events. Besides, how would you even know when to perform that extra search? For every SCS? Having to search for the "official source code state CD" of an SCS just to be able to reproduce the full source code state doesn't sound great to me.
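
The ambiguity of that reverse lookup can be sketched as follows. This is an illustration only: the in-memory list stands in for an ER query, and the helper names and event ids are hypothetical.

```python
CD = "EiffelCompositionDefinedEvent"


def composition(event_id, element_targets):
    """Build a simplified CD event with ELEMENT links to the given targets."""
    return {
        "meta": {"id": event_id, "type": CD},
        "links": [{"type": "ELEMENT", "target": t} for t in element_targets],
    }


# Hypothetical CDs created for different purposes; all of them include the
# same parent SCS, and nothing marks any of them as the "official" one.
stored_events = [
    composition("cd-nightly", ["scs-parent", "scs-sub-a", "scs-sub-b"]),
    composition("cd-release", ["scs-parent", "scs-sub-a"]),
    composition("cd-experiment", ["scs-parent"]),
    composition("cd-unrelated", ["scs-other"]),
]


def compositions_linking_to(scs_id, events):
    """Return the ids of all CD events with an ELEMENT link to the given SCS."""
    return [
        e["meta"]["id"]
        for e in events
        if e["meta"]["type"] == CD
        and any(
            link["type"] == "ELEMENT" and link["target"] == scs_id
            for link in e["links"]
        )
    ]


candidates = compositions_linking_to("scs-parent", stored_events)
```

Here `candidates` contains three CDs with three different sets of elements, so the caller still has to guess which one describes the full source code state of the commit.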

A cleaner option would be to introduce a new link type for SCC that can be used to link to one or more other SCCs representing the submodules (or equivalent). That would allow us to correctly model the actual source code hierarchy while maintaining a unidirectional event graph. Unfortunately I'm struggling to come up with a good name for that link type, which is a bad sign.
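
A rough sketch of how such a link could be consumed is shown below. The link type name `SUBSTATE_PLACEHOLDER` is purely a placeholder since no name has been proposed, and the event ids and helper functions are likewise hypothetical; events are reduced to id, type and links.

```python
SCC = "EiffelSourceChangeCreatedEvent"

# Placeholder only: the issue does not settle on a name for the new link type.
SUBSTATE_LINK = "SUBSTATE_PLACEHOLDER"


def scc(event_id, sub_targets=()):
    """Build a simplified SCC event linking to the SCCs of its submodules."""
    return {
        "meta": {"id": event_id, "type": SCC},
        "links": [{"type": SUBSTATE_LINK, "target": t} for t in sub_targets],
    }


# Hypothetical hierarchy: a parent commit with two submodules, one of which
# has a nested submodule of its own.
scc_events = [
    scc("scc-parent", ["scc-sub-a", "scc-sub-b"]),
    scc("scc-sub-a", ["scc-nested"]),
    scc("scc-sub-b"),
    scc("scc-nested"),
]


def full_source_state(scc_id, events):
    """Collect the given SCC and every SCC reachable through the placeholder link."""
    by_id = {e["meta"]["id"]: e for e in events}
    seen, stack = set(), [scc_id]
    while stack:
        current_id = stack.pop()
        if current_id in seen:
            continue  # guard against accidental cycles
        seen.add(current_id)
        for link in by_id[current_id]["links"]:
            if link["type"] == SUBSTATE_LINK:
                stack.append(link["target"])
    return sorted(seen)


state = full_source_state("scc-parent", scc_events)
```

Starting from a single commit, the traversal only follows links away from the event it started at, so no reverse search for a CD is needed and the graph stays unidirectional.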

Motivation

Multi-repository codebases are extremely common and there should be a good way of modeling them with Eiffel. Using a composition to group SCCs is likely to lead to ambiguity.

Exemplification

N/A

Benefits

Better modeling of multi-repository codebases.

Possible Drawbacks

When would this be used besides submodules? Should it apply to similar forms of hierarchical source code structures like Repo's manifest gits? It's conceptually the same thing, it's just not built into the VCS. The SCS and SCC are typically sent by the server-side VCS (or some tightly coupled sidecar service), but while we can ask for it to support "native" hierarchy builders like submodules I don't think we can expect it to have knowledge of e.g. Repo.

Labels

protocol (All protocol changes)
