Working Group Meeting 4 May 2022

Planned agenda:

Our focus this week is on the quality and provenance information that you need to know about a dataset before you use it. We have done some research, including wading through the ISO Geospatial Data Quality standards, to understand what has been done before. See Issue #27.

A quick summary:

Many of the ISO data quality specifications are more appropriate to a country-scale data set that is being formally reviewed for completeness and accuracy. For our purpose at farm scale, there seem to be three possible routes:

Track the effective scale of the imagery or coverage used to create the map (e.g. metres per pixel), recorded as a scale or possibly a topology level.
Have a set of known values that are placed in a "Derivation" and/or "Acquisition" field (what might these values be?).
Define a set of "suitable for xxxxxx" usages and declare conformance with these usages (suitable for planning is different from suitable for precision application, for instance).
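
As a rough illustration only, the three routes could sit side by side in one metadata record. This is a minimal sketch: the class name, field names and example values below are all hypothetical and are not part of any agreed schema, and the open question about which "Derivation" and "Acquisition" values to allow is not answered here.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class QualityMetadata:
    """Hypothetical quality record; every name and value here is an assumption."""

    # Route 1: effective scale of the source imagery, in metres per pixel.
    ground_sample_distance_m: Optional[float] = None
    # Route 2: known-value fields. The allowed values are still undecided,
    # so the strings used below are placeholders only.
    derivation: Optional[str] = None
    acquisition: Optional[str] = None
    # Route 3: usages the dataset declares conformance with.
    suitable_for: Set[str] = field(default_factory=set)

    def is_suitable_for(self, usage: str) -> bool:
        """Return True if the dataset declares itself suitable for this usage."""
        return usage in self.suitable_for


# Example: a boundary digitised by hand from aerial imagery.
boundary = QualityMetadata(
    ground_sample_distance_m=0.3,
    derivation="manual",
    acquisition="aerial-imagery",
    suitable_for={"planning"},
)

print(boundary.is_suitable_for("planning"))
print(boundary.is_suitable_for("precision-application"))
```

A consuming tool could then refuse, or warn about, a dataset that does not declare the usage it needs.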

As we viewed issue #27 we asked the following questions:

1. How do you measure that a dataset is suitable for your purpose?

Key points brought up:

A date stamp for aerial and satellite images.
The source and how the data was recorded.
Imagery used for machine learning was given as an example of where suitable data matters.
Keeping a lineage of records, both human- and machine-readable, as they pass from system to system.
Legitimacy of the data, i.e. that it has not been modified.
Temporal consistency and the history of the data and changes over time.
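
One possible way to make lineage machine-readable and to check that data has not been modified is to have each system append a lineage step carrying a hash of the feature as it was handed on. The sketch below assumes features are exchanged as JSON; the function names, the record layout and the system name are hypothetical, and hashing Python's sorted-key JSON output is a simplification that other languages may not reproduce byte for byte.

```python
import hashlib
import json


def content_hash(feature: dict) -> str:
    """SHA-256 of the feature serialised as JSON with sorted keys."""
    canonical = json.dumps(feature, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def add_lineage_step(record: dict, system: str, action: str, timestamp: str) -> dict:
    """Return a new record with one more lineage step appended."""
    step = {
        "system": system,
        "action": action,
        "timestamp": timestamp,
        "hash": content_hash(record["feature"]),
    }
    return {
        "feature": record["feature"],
        "lineage": record.get("lineage", []) + [step],
    }


def is_unmodified(record: dict) -> bool:
    """True if the feature still matches the hash in the latest lineage step."""
    lineage = record.get("lineage", [])
    if not lineage:
        return False
    return lineage[-1]["hash"] == content_hash(record["feature"])


# Example with made-up values.
record = {"feature": {"type": "Feature", "properties": {"name": "Paddock 1"}, "geometry": None}}
stamped = add_lineage_step(
    record, system="farm-mapping-tool", action="exported", timestamp="2022-05-04T00:00:00Z"
)
print(is_unmodified(stamped))
```

The list of steps doubles as a history of the data over time, which touches on the temporal consistency point as well.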

2. What challenges have been faced when trying to get data from another system?

The definition of terms used.
Data that has been aggregated from another system's data.
Which standard has been complied with, e.g. ISO, NZTM.
How the data was derived, e.g. manually versus automatically.
Geometry being modified by a system.
A mixture of IoT data and manually read data, for example rainfall records, dam levels and food on offer records.

3. When you are provided with data, how do you know that it is a good dataset to use in your software tool?

Topology issues, e.g. knowing whether features are allowed to cross over; paddocks, for example, should not.
There are no standards, so participants often rely on scripts to spot mistakes and clean them up.
The number of decimal places that are used.
One challenge is getting enough metadata about data collected in the field.
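
As an example of the kind of checking script mentioned above, the sketch below reports the smallest number of decimal places used by any coordinate in a geometry. It assumes the geometry arrives as GeoJSON text; the function names are hypothetical. Parsing with `parse_float=Decimal` keeps the digits exactly as written, so trailing zeros count towards the precision.

```python
import json
from decimal import Decimal


def decimal_places(value) -> int:
    """Number of digits written after the decimal point."""
    if isinstance(value, int):
        return 0
    return max(0, -value.as_tuple().exponent)


def min_coordinate_precision(geometry_json: str) -> int:
    """Smallest decimal-place count across all coordinates of a GeoJSON geometry."""
    geometry = json.loads(geometry_json, parse_float=Decimal)

    def walk(node):
        # Coordinates are nested lists of numbers; flatten them.
        if isinstance(node, list):
            for child in node:
                yield from walk(child)
        else:
            yield node

    return min(decimal_places(v) for v in walk(geometry["coordinates"]))


point = '{"type": "Point", "coordinates": [174.76331, -36.8485]}'
print(min_coordinate_precision(point))
```

A tool could compare the result against a threshold of its own choosing and flag datasets whose coordinates have been rounded too coarsely for the intended use.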

Summary

The starting point for any additions needs to be the Australia and New Zealand metadata standards.

We also need:

A measure of ground sample distance or resolution.
The dates of data acquisition and any details about the acquisition device or source.
A definition of terms.
Coordinate reference systems and rounding.

It would be nice to have:

A mechanism to maintain lineage and have a machine-readable version of lineage.
A way to capture temporal consistency and legitimacy of the data.
Electronic lineage to detect if data was manually or automatically captured.

Crossovers and snapping are likely to be things that a system will need to apply QA to when consuming the data.

Homework for us:

We will review some standards to see if there are any components that are useful to us. We will either provide this as comments against issue #27 or at a future meeting.

Homework for the participants:

A request for those of you who have some experience using JSON: there is a pull request, "Restructure hierarchy and introduced PlotZoneResource" (#28), and we would like you to review the commits. Let us know so you can be named as a reviewer.

Pull Request #28

At the next meeting we will pick up the discussion about observations and events that have been recorded against spatial entities, that is, what has happened to a spatial feature: for example a fertiliser application, a spray record or an observation of a species. There is a useful data model for this called the ADAPT common object model. Please take a look using the link to see if this model is useful for representing the observations and events that are used in your business.

ADAPT model
