Capture Matching Function - Possible AI supported outputs
Challenge Brief:
Captures as images of trees and other data are collected with smartphone apps and are used to verify environmental work. Users often return to the same plants/trees over time resulting in "layers of images" of the same tree / same location. These images must be linked together.
Capture matching is currently done with a front-end React web app backed by a RESTful API. In this 'capture matching administration panel', users match captures manually: the interface is supplied with images from an API and displays them based on GPS coordinates/distances and other filters that the user sets, including time range and organization. If several matches are found, the capture matching system displays the GPS-related images in order of distance. These captures are then matched by an operator. The process is slow, has room for improvement in accuracy, and needs a level of automation to work at scale.
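The distance ordering described above could be reproduced roughly as follows. This is a minimal sketch, not the admin panel's actual implementation: the `Capture` class, its field names, and the 50 m default cut-off are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


@dataclass
class Capture:
    # Hypothetical shape; the real API payload may use different field names.
    id: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidates_by_distance(target, others, max_distance_m=50.0):
    """Return (distance_m, capture) pairs within the cut-off, nearest first."""
    scored = [
        (haversine_m(target.lat, target.lon, c.lat, c.lon), c)
        for c in others
        if c.id != target.id  # never offer a capture as a match for itself
    ]
    within = [(d, c) for d, c in scored if d <= max_distance_m]
    return sorted(within, key=lambda pair: pair[0])
```

Ordering by distance alone is only as good as the GPS fix behind each capture, so a list like this is a starting point for an operator rather than a match decision.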
Besides being prone to error, the current operation doesn't account for data related to identified species, leaf morphology, trees already capture matched, track files, tracking seasons and other attributes.
The goal of this ticket is to identify and test methods and solutions that augment, verify, or replace the current capture matching process.
How to contribute
Just go for it and solve it.
If you get stuck, you can ask questions on this ticket, via Greenstand's Slack, or by emailing the ticket contact below. (Join on Slack and introduce yourself in the community_intro channel. From there you will be invited to the project-specific channels.)
Note: If you believe there is insufficient information or infrastructure provided to solve this critical issue, please reach out.
Deliverable:
- Any integration improvement to the Greenstand stack as a pull request to the appropriate Greenstand repository as a script/airflow function that feeds the user interfaces.
- Any integration improvement to the Greenstand stack as a service based on current API functions that "pre-tags" captures.
- An ADR that you open and lead, with recommendations to change or improve a process, data collection, etc. Note: Viable data collection recommendations cannot increase workload for users or apps.
Full Challenge Narrative:
The underlying value proposition of the Greenstand Token Model is the ability for individuals to earn and trade tokens linked to work surrounding ecological restoration, which is often based on the growth of plants or trees over time. Identifying repeat captures of (visits to) individual trees is critical to the success of many projects using the model, which encourages the re-tracking of trees to document maintenance, tree health and growth over time as a means of employment, poverty alleviation and ground verification of successfully implemented carbon and reforestation projects.
Solving this challenge will:
- Drive more community based engagement in carbon offset projects
- Support the identification of successful and unsuccessful restoration methods.
- Increase tree survival rates (most tree planting is a plant and forget model)
- Identify duplicated images and scammers.
- Add value to the Greenstand Token Model
Each capture contains a geotagged image collected from a mobile app. It enters the Greenstand system and is tagged with various attributes (such as species) by a number of different microservices and manual operators. The first capture of a tree is unique to its location and context. A re-tracked tree, however, creates data points that are similar to the initial capture of that tree.
Users tend to double track trees, intentionally or unintentionally, in a single tracking session or at later dates. Multiple users also overlap their tracking at different times, especially when implementing larger operations (hundreds or thousands of trees).
GPS inaccuracy is an issue. Most user phones are cheap models and limited in their ability to pinpoint locations and many trees are often collected within the "area of GPS error." The GPS data alone is not accurate enough to match the images. Trees are often planted a meter or less apart, while GPS accuracy is often 10 meters or more.
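To put those two figures side by side, the sketch below counts how many trees could sit inside one GPS error circle. It assumes an idealized square planting grid with the reported position at a grid point, which is a simplification for illustration and not a description of any real planting site; `trees_within_error` is a hypothetical helper name.

```python
def trees_within_error(spacing_m: float, error_radius_m: float) -> int:
    """Count grid points (trees) inside a circle of radius error_radius_m.

    Assumes trees on a square grid with the given spacing and the circle
    centered on one tree. Both are illustrative assumptions.
    """
    steps = int(error_radius_m // spacing_m)
    count = 0
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            if (i * spacing_m) ** 2 + (j * spacing_m) ** 2 <= error_radius_m ** 2:
                count += 1
    return count


# Trees 1 m apart, 10 m GPS error radius.
print(trees_within_error(1.0, 10.0))  # → 317
```

Under these assumptions a single GPS reading is consistent with a few hundred candidate trees, which is why other signals (species, images, track files) are needed to narrow the match.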
Related operational issues
- Trees die and are often replaced in the same geolocation.
- Users and tree growers are incentivized to take duplicate images of trees, and some have tried to scam the system by taking multiple images of the same plant from various directions.
- User- and phone-specific data is not considered here, since the same user or different users return to the same cluster of trees at undefined times.
- The GPS accuracy radius often covers multiple candidate trees.
- Physical tree tags and RFID tags have been tried and ruled out as not a scalable option for our users.
Solutions:
No single solution is expected to solve this challenge completely. Rather, a solution is expected to be built from many incremental improvements and tools added to the process from different sources.
Possible solutions:
- GPS coordinate accuracy enhancement using filtering algorithms.
- Object recognition coupled with GPS to link trees across the maintenance period.
- ML image verification.
- "Pre-match" as many captures as possible before showing them to users.
- Put in place a machine learning process that will create algorithms that automatically match the captures.
- Utilize other layers of data - track files, species data.
- Scrub data prior to evaluation (adjusting inaccurate GPS data)
- Statistically match based on total number of possibilities.
- Use image-based attributes to match (such as background rocks and unique environmental attributes).
- Redesign the UX of the capture matching tool in the admin panel
- Enhancements to GPS accuracy (see issue)
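As one very simple example of GPS scrubbing, the sketch below takes several position fixes recorded for the same capture and returns their component-wise median, which discards isolated outliers. It assumes that multiple fixes per capture are available (for instance from track files), which may not hold for the current data; `median_fix` is a hypothetical helper, and a Kalman or similar filter would be a more capable alternative.

```python
from statistics import median


def median_fix(fixes):
    """Return the component-wise median (lat, lon) of repeated GPS fixes.

    fixes: list of (lat, lon) tuples in degrees, all for the same capture.
    Suitable only for small areas away from the poles and the antimeridian.
    """
    if not fixes:
        raise ValueError("at least one GPS fix is required")
    lats = [lat for lat, _ in fixes]
    lons = [lon for _, lon in fixes]
    return median(lats), median(lons)


# One wild outlier among five fixes does not move the estimate.
fixes = [(1.0000, 2.0000), (1.0002, 2.0001), (1.0001, 2.0002), (5.0000, 9.0000), (1.0001, 2.0001)]
print(median_fix(fixes))  # → (1.0001, 2.0001)
```

A median is robust to a few bad fixes but cannot correct a consistent offset, so it reduces noise without removing the underlying GPS error radius.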
Supporting ideas include:
- Identifying and matching image background.
- Using species diversification to limit options shown to admin operator
- Using leaf morphology to limit options shown to admin operator
- Users tend to travel in relatively predictable paths.
- Each tree is unique
- Many projects have multiple updates of individual trees.
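The idea of using species to limit the options shown to an operator could look something like the sketch below. It is an illustration under stated assumptions, not an existing Greenstand service: the `Candidate` class, the `shortlist` function, and the 10 m default radius are hypothetical, and distances are assumed to be computed elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    # Hypothetical record for a previously captured tree.
    tree_id: str
    distance_m: float        # distance from the new capture, computed elsewhere
    species: Optional[str]   # None when the species has not been tagged


def shortlist(
    capture_species: Optional[str],
    candidates: List[Candidate],
    max_distance_m: float = 10.0,
) -> Tuple[List[Candidate], Optional[str]]:
    """Filter candidates by distance and species, nearest first.

    Returns (options, pre_match). pre_match is a tree_id only when exactly
    one candidate remains; otherwise it is None and an operator decides.
    """

    def species_compatible(c: Candidate) -> bool:
        # Unknown species on either side cannot rule a candidate out.
        return capture_species is None or c.species is None or c.species == capture_species

    options = sorted(
        (c for c in candidates if c.distance_m <= max_distance_m and species_compatible(c)),
        key=lambda c: c.distance_m,
    )
    pre_match = options[0].tree_id if len(options) == 1 else None
    return options, pre_match
```

Pre-matching only when a single candidate survives keeps the automated step conservative: ambiguous cases still go to an operator, but with a shorter list.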
Barriers to completion:
- User Privacy issues limit access to some data.
- Lack of curated or accurately matched data sets
- Testing solutions may require setting up a number of Greenstand's microservices.
- Limited Feedback on solutions due to limited organizational capacity
- Limited organizational capacity to quickly review and integrate new solutions into the full stack
- Can be challenging to visualize results (Greenstand admin tools can be quite helpful for these and can be set up independently, however production services are not set up for open access with real data.)
Resources:
Links
- Greenstand.org
- Github.com/Greenstand
- Capture Matching ADR
- Capture Matching Production Readiness Project Tickets
- Digital Herbarium Repo
- Figma Design Files for the Capture matching UI
Data Resources
- Data provided: data sample CSV download
- Track files links (link to be added)
Suggested data sets
There are data sets of trees with repeat captures with sticks painted with colored stripes (In Haiti) which can provide an extra layer of support creating curated sets.
The Freetown City data has been mostly manually matched (although there has not been much quality control on that data set)
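A manually matched set like these could serve as a reference for scoring an automated matcher. The sketch below computes precision and recall over unordered capture pairs. The pair representation and the `precision_recall` name are assumptions for illustration, and because the reference data has had little quality control, the resulting numbers should be read as approximate.

```python
def precision_recall(predicted, reference):
    """Compare predicted matches against reference (manually matched) pairs.

    Each item is a pair of capture ids; order within a pair does not matter.
    Returns (precision, recall), using 0.0 when a set is empty.
    """
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical capture ids, for illustration only.
predicted = [("a", "b"), ("c", "d"), ("e", "f")]
reference = [("b", "a"), ("c", "d"), ("g", "h"), ("i", "j")]
precision, recall = precision_recall(predicted, reference)
print(round(precision, 3), recall)  # → 0.667 0.5
```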
Greenstand respects our users' privacy. For more data needs, please contact the issue lead and articulate why you need the data, and be prepared to provide a government-issued ID and sign a legally binding data privacy policy.
- Test-based admin tools can be set up independently or accessed via the Greenstand community Slack.
- Capture matching related links: https://github.com/search?q=org%3AGreenstand+capture+matching&type=code
- Current Capture Matching tool access via slack, or linked upon request.
Related Projects/tools:
Related Issues:
Greenstand/treetracker-admin-client#568
Greenstand/treetracker-admin-client#1029
Greenstand/treetracker-admin-client#949
Greenstand/treetracker-admin-client#781
https://github.com/Greenstand/Greenstand-Overview/issues/54
https://github.com/Greenstand/Greenstand-Overview/issues/52
https://github.com/Greenstand/treetracker-android#197
#75
Contacts on this issue
Primary: Xinyi Hu xinyi.hu@greenstand.org
Secondary: Info (at) Greenstand.org
To do:
- Add Data download file
- Add track Files