Use Case: Pathology Whole Slide Imaging (WSI) and Their Morphological Features #13

Open
ebremer opened this issue Nov 7, 2024 · 0 comments

Pathology Whole Slide Imaging (WSI) is the digitization of entire pathology slides at high resolution. By converting glass slides into digital images, WSI enables pathologists to view, analyze, and share slide images, facilitating remote consultations, collaborative research, and improved diagnostic accuracy. High-throughput scanners capture entire tissue sections in full detail, which can then be reviewed on a computer screen or analyzed through computational methods, including AI-based techniques. WSIs are very large, often in the realm of 100,000 x 100,000 pixels. Using deep learning models, vast numbers of morphological features can be extracted, represented as polygons, and classified by feature type. Engineered features derived from this data, including polygon perimeter, area, and texture values, can be annotated on these polygons. A single WSI can yield more than a million polygons.
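As an illustration of the engineered features mentioned above, here is a minimal sketch of deriving area and perimeter from one polygon ring in pixel coordinates. The function name `polygon_area_perimeter` is hypothetical and not part of any pipeline referenced in this issue; it assumes a closed ring (first vertex repeated as the last), as in WKT.

```python
import math

def polygon_area_perimeter(ring):
    """Return (area, perimeter) of a closed ring of (x, y) pixel coordinates.

    Hypothetical helper for illustration; assumes the ring is closed,
    i.e. the first vertex is repeated as the last one, as in WKT.
    """
    area2 = 0.0
    perimeter = 0.0
    # Pair each vertex with its successor around the ring.
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        area2 += x0 * y1 - x1 * y0                  # shoelace term
        perimeter += math.hypot(x1 - x0, y1 - y0)   # edge length
    return abs(area2) / 2.0, perimeter

# A 10 x 10 pixel square, closed like a WKT ring.
square = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
area, perimeter = polygon_area_perimeter(square)
print(area, perimeter)
```

The results are in pixel units (pixels squared for area); converting to physical units needs the scanner's pixel size.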

It is of interest to run spatial searches for features that lie within a certain distance of, overlap, or are contained in other features. WSIs can also appear as registered slices forming a "3D" (really ~2.5D) volume, which requires an x,y,z Cartesian coordinate system in addition to x,y.
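To make the kinds of spatial search described above concrete, here is a rough sketch of a distance test and a containment test on pixel-coordinate rings. It is a simplification, not the GeoSPARQL semantics: distance is measured between vertex-average centres rather than between geometries, and all function names are hypothetical. A real store would also use a spatial index rather than testing every pair.

```python
import math

def centre(ring):
    """Vertex average of a closed ring (last vertex repeats the first)."""
    pts = ring[:-1]
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

def within_distance(ring_a, ring_b, max_dist):
    """True if the two ring centres are at most max_dist pixels apart."""
    (ax, ay), (bx, by) = centre(ring_a), centre(ring_b)
    return math.hypot(ax - bx, ay - by) <= max_dist

def contains_point(ring, x, y):
    """Ray-casting point-in-polygon test for a closed ring."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

# Illustrative data: a large region and a small feature inside it.
region = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
feature = [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]
print(within_distance(region, feature, 1))
print(contains_point(region, *centre(feature)))
```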

Often, derived polygonal features use exactly the same coordinates as the reference WSI image. Different scanners can have different physical pixel sizes for each image, and the X and Y pixel sizes are not always equal.

GeoSPARQL offers a way to represent most of this information (see below). The example below uses http://www.opengis.net/def/crs/EPSG/0/4087, which is Cartesian. When the data is loaded into a GeoSPARQL-enabled Virtuoso instance, using "units:GridSpacing" in spatial functions yields the right numbers, but there isn't an obvious way to indicate that this is a grid with a certain SizeX, SizeY for grid spacing. For WSI pathology imaging, the spacing would be in the realm of micrometers. Although the integer pixel x,y values could be converted to meters, it is far more readable and convenient to express them as integers and to specify a CRS that uses a specific unit and a size that can differ between the X and Y axes. A link to a custom CRS in RDF, allowing a particular CRS for each image, would help homogenize queries across different images.
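The conversion that a per-image CRS would have to encode can be sketched as follows. This is only an illustration of the arithmetic: the function names are hypothetical, and the pixel sizes used (0.25 and 0.5 micrometers per pixel) are made-up values, not taken from any scanner or from the image in this issue.

```python
import math

def pixels_to_micrometers(ring, size_x_um, size_y_um):
    """Scale integer pixel coordinates by per-axis physical pixel sizes."""
    return [(x * size_x_um, y * size_y_um) for x, y in ring]

def physical_distance_um(p, q, size_x_um, size_y_um):
    """Distance in micrometers between two pixel positions.

    The X and Y offsets are scaled separately because the pixel
    sizes may differ between the two axes.
    """
    dx = (q[0] - p[0]) * size_x_um
    dy = (q[1] - p[1]) * size_y_um
    return math.hypot(dx, dy)

# Assumed, illustrative pixel sizes in micrometers per pixel.
SIZE_X_UM, SIZE_Y_UM = 0.25, 0.5

print(pixels_to_micrometers([(4, 2), (8, 6)], SIZE_X_UM, SIZE_Y_UM))
print(physical_distance_um((0, 0), (12, 8), SIZE_X_UM, SIZE_Y_UM))
```

With unequal pixel sizes, a distance computed in plain grid units no longer maps to a single physical distance, which is why the spacing needs to be attached to the CRS per image.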

Further, due to the volume of data, it is also convenient and somewhat necessary to pre-generate different scales of the WSI images and their related data in order to have performant viewing. This is often done as an image pyramid with 1/4 resolution between layers, with a matching scaled polygon pyramid for derived features.
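A minimal sketch of how such a pyramid and its scaled polygons could be computed is shown below. It assumes that "1/4 resolution" means one quarter of the pixel count, that is, a factor of 2 per axis; if a factor of 4 per axis is intended, pass `factor=4`. The function names and the `min_side` stopping rule are hypothetical choices for illustration.

```python
def pyramid_levels(width, height, factor=2, min_side=256):
    """List (level, width, height) until the longer side is <= min_side.

    Assumption: each level divides both axes by `factor`, rounding up.
    """
    levels = []
    level = 0
    while True:
        levels.append((level, width, height))
        if max(width, height) <= min_side:
            break
        width = max(1, -(-width // factor))    # ceiling division
        height = max(1, -(-height // factor))
        level += 1
    return levels

def scale_ring(ring, level, factor=2):
    """Scale full-resolution polygon coordinates down to a pyramid level."""
    scale = factor ** level
    return [(x / scale, y / scale) for x, y in ring]

# Dimensions of the example image described in this issue.
for level, w, h in pyramid_levels(112231, 82984):
    print(level, w, h)

# The same polygon expressed at level 2 of the pyramid.
print(scale_ring([(8, 4), (16, 12)], level=2))
```

Keeping the polygon pyramid aligned with the image pyramid means every level effectively has its own grid spacing, which is another reason a per-level CRS description would be useful.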

@prefix dc:   <http://purl.org/dc/terms/> .
@prefix exif: <http://www.w3.org/2003/12/exif/ns#> .
@prefix geo:  <http://www.opengis.net/ont/geosparql#> .
@prefix hal:  <https://halcyon.is/ns/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sno:  <http://snomed.info/id/> .
@prefix so:   <https://schema.org/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<urn:md5:a923c8367e61792f531e65d966d4cb78>
        a            so:ImageObject;
        exif:height  "82984"^^xsd:int;
        exif:width   "112231"^^xsd:int .

[ a                    geo:FeatureCollection;

  dc:creator           "http://orcid.org/0000-0003-0223-1059";
  dc:date              "2023-11-09T19:48:15.406625700Z"^^xsd:dateTime;
  dc:description       "Nuclear segmentation of TCGA cancer types";
  dc:publisher         <https://ror.org/01882y777> , <https://ror.org/05qghxh33>;
  dc:references        "https://doi.org/10.1038/s41597-020-0528-1";
  dc:title             "cnn-nuclear-segmentations-2019";
  prov:wasGeneratedBy  [ a                       prov:Activity;
                         prov:used               <urn:md5:a923c8367e61792f531e65d966d4cb78>;
                         prov:wasAssociatedWith  <https://github.com/SBU-BMI/quip_cnn_segmentation/releases/tag/v1.1>
                       ];					   
					   
  rdfs:member          [ a                   geo:Feature;
                         geo:hasGeometry     [ geo:asWKT  "<http://www.opengis.net/def/crs/EPSG/0/4087> POLYGON ((69379 61479, 69378 61480, 69373 61480, 69370 61483, 69370 61492, 69371 61493, 69371 61494, 69372 61495, 69373 61495, 69374 61496, 69375 61496, 69376 61495, 69379 61495, 69380 61494, 69381 61494, 69383 61492, 69384 61492, 69385 61491, 69385 61490, 69386 61489, 69386 61488, 69387 61487, 69387 61483, 69384 61480, 69383 61480, 69379 61479))" ];
                         hal:classification  sno:48512009;
                         hal:measurement     [ hal:classification  sno:48512009;
                                               hal:hasProbability  "1.0"^^xsd:float
                                             ]
                       ];
  rdfs:member          [ a                   geo:Feature;
                         geo:hasGeometry     [ geo:asWKT  "<http://www.opengis.net/def/crs/EPSG/0/4087> POLYGON ((87135 28142, 87134 28143, 87133 28143, 87133 28144, 87132 28145, 87132 28146, 87131 28147, 87131 28160, 87132 28161, 87132 28163, 87134 28165, 87135 28165, 87137 28167, 87138 28167, 87140 28169, 87141 28169, 87142 28170, 87147 28170, 87147 28169, 87148 28168, 87148 28158, 87149 28157, 87149 28156, 87150 28155, 87151 28155, 87152 28154, 87152 28151, 87147 28146, 87146 28146, 87145 28145, 87144 28145, 87135 28142))" ];
                         hal:classification  sno:48512009;
                         hal:measurement     [ hal:classification  sno:48512009;
                                               hal:hasProbability  "1.0"^^xsd:float
                                             ]
                       ];
  rdfs:member          [ a                   geo:Feature;
                         geo:hasGeometry     [ geo:asWKT  "<http://www.opengis.net/def/crs/EPSG/0/4087> POLYGON ((90041 34682, 90040 34683, 90035 34683, 90031 34687, 90031 34688, 90030 34689, 90030 34690, 90031 34691, 90031 34694, 90034 34697, 90034 34698, 90036 34700, 90037 34700, 90039 34702, 90040 34702, 90041 34703, 90042 34703, 90043 34704, 90044 34704, 90045 34705, 90046 34705, 90047 34706, 90048 34706, 90049 34707, 90062 34707, 90063 34706, 90065 34706, 90066 34705, 90067 34705, 90067 34704, 90068 34703, 90068 34700, 90067 34699, 90067 34693, 90066 34692, 90065 34692, 90062 34689, 90060 34689, 90059 34688, 90057 34688, 90056 34687, 90054 34687, 90053 34686, 90052 34686, 90051 34685, 90050 34685, 90041 34682))" ];
                         hal:classification  sno:48512009;
                         hal:measurement     [ hal:classification  sno:48512009;
                                               hal:hasProbability  "1.0"^^xsd:float
                                             ]
                       ];
  rdfs:member          [ a                   geo:Feature;
                         geo:hasGeometry     [ geo:asWKT  "<http://www.opengis.net/def/crs/EPSG/0/4087> POLYGON ((49207 26483, 49206 26484, 49205 26484, 49202 26487, 49201 26487, 49199 26489, 49199 26490, 49198 26491, 49198 26493, 49197 26494, 49198 26495, 49198 26498, 49199 26499, 49199 26500, 49201 26502, 49202 26502, 49203 26503, 49204 26503, 49205 26504, 49211 26504, 49213 26502, 49214 26502, 49215 26501, 49215 26500, 49216 26499, 49216 26496, 49217 26495, 49217 26487, 49214 26484, 49213 26484, 49207 26483))" ];
                         hal:classification  sno:48512009;
                         hal:measurement     [ hal:classification  sno:48512009;
                                               hal:hasProbability  "1.0"^^xsd:float
                                             ]
                       ]
] .

Loading this data into a GeoSPARQL-enabled Virtuoso instance allows a query like this to work:

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX units: <http://www.opengis.net/def/uom/OGC/1.0/>
# Select geometries within 100000 grid units of the image origin (0 0).
SELECT * WHERE {
  ?s geo:asWKT ?wkt
  FILTER(geof:distance(?wkt, "POINT(0 0)"^^geo:wktLiteral, units:GridSpacing) < 100000)
}