If/how to include ‘ancillary results’ and ‘quality measures’ modeling patterns in the core IM #144
Comments
I will defer to @ahwagner on this for the final word, but I think these two specific data structures may exceed what I (we) envisioned for the I understand that this is all non-standard, but then again, the profile is meant to be based on the standard, and if implementers want to make additions in any way they see fit, then so be it. The gnomAD cohort allele frequency data needed these ancillary results for our immediate use. And, yes, we could have gone back to the standard's drawing board to attempt to model this in a more democratized way, but we just didn't have the bandwidth, time, and resources to do that.
Again, deferring to @ahwagner for the final word. I'm fine with moving
No, this is not about importance. These are important properties. Arguably, the content of these fields is at least as important as the CAF result itself. The problem is that there is no consensus across resources on what types of quality measures or ancillary results should be used. We are starting with an open approach, and (down the road) can add in common quality measures or ancillary results as they are identified across resources.
I agree with the first half of this statement: these properties provide the semantics of quality measures and ancillary results. I disagree that this achieves essentially the same end as using Extensions on the parent object. I do not think these should be moved down to the gnomAD profile; they should be useful across CAF implementations and should stay with the CAF standard profile. We may consider a new parent class that includes these, as I expect this pattern will be useful for other evidence types.
I left these two attributes in the standard profile for @mbrush Let me know if you think this issue is worth continuing to discuss and make changes. In the spirit of finding a good compromise so that we can move forward we may want to close this out and revisit when/if another CAF-like implementation arises.
To be clear, and as indicated in the title and description of this issue, the question is about whether we want to add these This is probably an issue of lower priority relative to others, so let's table it for now.
Seeing that this issue is relevant to a question raised by @Mrinal-Thomas-Epic for the Connect Implementation Warrior session, I will add one more thought here. I wanted to note that the Core-IM The ancillary results and quality measures that are captured by the attributes in question (e.g. What this means is that implementation CAF StudyResult models like the one for gnomad gk-pilot can just go ahead and create new named properties in their implementation schema for these ancillary or quality measure data items directly, at the same level / alongside the specific data item attributes defined in the schema. These are just additional 'specializations' of the core-im 'dataItem' attribute. No need to bucket them in nested structures. The gk pilot schema for this would simply import the standard CAFStudyResult profile, and add a few more attributes to this class. Something like:
And gk pilot CAF data would look something like:
Here, the implementation-specific attributes sit at the same level as the ones from the standard CAF profile - but this is allowed by the va-spec, as again, conceptually they are just additional specializations of the That's it . . . just wanted to put this out there as an option, not to say it is better or worse than other approaches.
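To make the flat pattern from this comment concrete, here is a minimal sketch in plain JSON Schema written as YAML, followed by a matching data instance. It is not the actual gk-pilot schema: the `$ref` target, the class name `GnomadCafStudyResult`, the added properties `homozygotes` and `qcFilters`, and all values are hypothetical placeholders chosen for illustration.

```yaml
# Hypothetical implementation schema (illustration only, not the gk-pilot source).
$schema: "https://json-schema.org/draft/2020-12/schema"
title: GnomadCafStudyResult              # assumed class name
allOf:
  # Reuse the standard CAF profile; this reference is a placeholder.
  - $ref: "caf-study-result.json"
properties:
  # Project-specific data items declared at the same level as the
  # standard ones, i.e. as further specializations of 'dataItem'.
  homozygotes:                           # assumed ancillary result
    type: integer
    description: Count of individuals homozygous for the focus allele.
  qcFilters:                             # assumed quality measure
    type: array
    items:
      type: string
    description: Names of quality-control filters that the allele failed.
---
# Hypothetical data instance; every value here is made up.
type: CohortAlleleFrequencyStudyResult
focusAlleleCount: 10
locusAlleleCount: 1000
focusAlleleFrequency: 0.01
homozygotes: 1
qcFilters: []
```

In this sketch nothing in the data distinguishes standard properties from implementation-specific ones; that distinction lives only in the schema that declares them.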
Removing this commented-out text from the source as part of clean-up; dropping code here for subsequent discussion about reintegration.
# Note that the `dataItem` property below is a *placeholder* that should not be inherited or used
# as is in derived StudyResult profiles. It is meant to be specialized into more specific properties
# that are defined to capture a specific kind of data item relevant to a particular StudyResult profile.
# For example, in the CohortAlleleFrequencyStudyResult profile, it is specialized into 'focusAlleleCount',
# 'focusAlleleFrequency', and 'locusAlleleCount' attributes.
# We are commenting out this property for now to avoid its unwanted inheritance in StudyResult profiles
# that import the gks-core. We will re-instate this property once we determine how the modeling framework
# and tools can support formal specification of the conceptual specialization that is happening here.
#
# dataItem:
#   type: object
#   description: >-
#     One or more data items that are included in the StudyResult because they pertain to the 'focus' of the result.
#     This can be data that directly describes this 'focus' (e.g. the population frequency of an allele focus), or
#     be metadata about how data about the 'focus' were generated (e.g. the quality measures for the sequencing run
#     used to determine this allele frequency).
#   comment: >-
#     Note that in profiles of the StudyResult class, 'dataItem' is typically specialized into one or more
#     datatype-specific attributes that are defined to capture a specific kind of data, e.g. 'focusAlleleCount'
#     and 'focusAlleleFrequency' in a CohortAlleleFrequencyStudyResult profile.
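As a rough sketch of the specialization this comment describes, a derived profile would leave out the generic `dataItem` property and declare typed properties in its place. The three property names come from the comment above; the types and description wording are assumptions, not copied from the published profile.

```yaml
# Sketch of a derived profile fragment (types and descriptions assumed).
CohortAlleleFrequencyStudyResult:
  type: object
  properties:
    # Each property below is a conceptual specialization of 'dataItem'.
    focusAlleleCount:
      type: integer
      description: Number of times the focus allele was observed in the cohort.
    locusAlleleCount:
      type: integer
      description: Number of alleles observed at the locus in the cohort.
    focusAlleleFrequency:
      type: number
      description: focusAlleleCount divided by locusAlleleCount.
```

The open question in the comment is that this link back to `dataItem` exists only in prose; nothing in a schema like this one records it formally.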
If I understand correctly, the ‘ancillary results’ and ‘quality measures’ modeling patterns provide empty buckets in which implementations can create properties to capture project-specific types of data and quality measures in a Study Result that don't belong in the standard profile. More specifically, they are simply properties that take an object of an unspecified type with additional properties allowed. e.g. from the ga4gh standard caf profile:
A project-specific implementation profile/schema may define specific properties within these untyped objects, to capture ancillary results or quality measures they want to report that are specific to their project. These properties are not, however, generally useful enough to be considered for inclusion in a ga4gh 'standard' profile (e.g. the caf-profile). Is this right?
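The "empty bucket" shape described above can be sketched as follows. This is a reconstruction from the prose, not a quote from the caf profile: the property names `ancillaryResults` and `qualityMeasures` and the description wording are assumed.

```yaml
# Sketch of the two bucket properties (names and wording assumed).
properties:
  ancillaryResults:
    type: object
    additionalProperties: true    # untyped: any project-specific keys allowed
    description: Additional result data that a project chooses to report.
  qualityMeasures:
    type: object
    additionalProperties: true    # untyped: any project-specific keys allowed
    description: Quality information about how the result was generated.
```

An implementation profile would then narrow each bucket by listing its own `properties` inside it, while the standard profile stays silent about their contents.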
A few thoughts/questions:
It strikes me that these offer an alternative to using the Extension mechanism that is built into the va model. The rationale/value add here is that defining these specific properties (‘ancillary results’ and ‘quality measures’) gives a bit more semantics/guidance about what types of extended content may be collected here. But the Extension mechanism could be used by an implementation profile here to achieve essentially the same end. Is this right?
One question to address here is if/how/where we want to include this modeling pattern in the v1 va-spec release (i.e. at what level do we define ‘ancillary results’ and ‘quality measures’ properties)?
Consider also whether our profiling conventions/methodology would even allow this in a caf ‘standard’ profile (defining new named properties that do not extend core-im properties).
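For comparison, here is a hedged sketch of what the Extension route mentioned in the first point might look like in data. It assumes an Extension is an object with a `name` and a `value`, held in an `extensions` list on the parent object; the extension names and all values are hypothetical.

```yaml
# Hypothetical CAF data using Extensions instead of named bucket properties.
type: CohortAlleleFrequencyStudyResult
focusAlleleCount: 10
locusAlleleCount: 1000
focusAlleleFrequency: 0.01
extensions:                 # assumed shape: list of name/value pairs
  - name: homozygotes       # made-up ancillary result
    value: 1
  - name: qcFilters         # made-up quality measure
    value: []
```

Under this assumed shape, a consumer cannot tell from structure alone which entries are ancillary results and which are quality measures, which is the extra semantics the named properties are meant to supply.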