# Software Engineering Assignment
### by Joris Verhagen

## 1. Introduction
My research is on motion planning and control of multi-agent robotic systems in extreme environments. I specifically look at planning and control under complex spatio-temporal specifications, going beyond simple reach-avoid behavior. Concretely, such specifications are captured in a formal specification language which, interestingly, originates from theoretical computer science: temporal logics. \
For example, one might wish to have a robot recharge at least every hour while, in arbitrary order, visiting certain regions of interest for environmental monitoring. Alternatively, a team of robots may need to survey a large field under the requirement that every 10 seconds the communication graph is connected, so that gathered information can be passed back to a central computer.\
My research further focuses on the kind of metrics that are maximized during motion planning and control. Without going into too much detail, the current state of the art maximizes model-agnostic robustness metrics: if an area needs to be surveyed, the distance to its borders is maximized, or the time of arrival before the deadline is maximized. Instead, I am interested in model-based robustness metrics, where we seek the motion plan or controller that maximizes the permissible disturbances acting on the robot (which charging port should I dock to so that the largest disturbance during operation can be tolerated?) or the probability of completing the mission under actuation failure (which region should I visit so that I can return home with the highest probability if one of my wheels fails?). In the future, I hope to involve humans in these decision-making processes, where a human's perceived safety may be maximized while satisfying the mission, or minimal requests are put on the human in order to make the robot's overall plan more robust or more pleasant to the human and their environment.\
These model-based robustness metrics capture the interplay between the environment, the robot, and the controllers, and may lead to qualitatively and quantitatively different results compared to model-agnostic robustness metrics.
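As a concrete illustration, the model-agnostic robustness metrics mentioned above can be evaluated directly on a trace. The sketch below uses hypothetical trace values and function names (it is not an implementation from my research): the robustness of "always stay inside the surveyed region" is the worst-case margin to the border, while "eventually be inside" takes the best-case margin.

```python
# Minimal sketch of model-agnostic STL robustness over a discrete-time trace.
# Trace values are hypothetical, for illustration only.

def always_robustness(signal):
    """Robustness of G (s > 0): the worst-case margin over the trace."""
    return min(signal)

def eventually_robustness(signal):
    """Robustness of F (s > 0): the best-case margin over the trace."""
    return max(signal)

# Signed distance to the border of a surveyed region at each time step:
# positive = inside the region, negative = outside.
dist_to_border = [0.5, 1.2, 0.8, 0.3, 0.9]

# "Always remain inside the region": the planner maximizes the minimum margin.
rho_always = always_robustness(dist_to_border)          # 0.3
# "Eventually reach deep inside the region": maximize the best margin.
rho_eventually = eventually_robustness(dist_to_border)  # 1.2
```

A model-based metric would instead couple these margins to the robot's dynamics, e.g. translating the 0.3 m worst-case margin into the largest disturbance the controller can reject.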




## 2. Lecture Principles
I have chosen QA for ML systems and Verification vs. Validation, as I feel both are closely related to my research.\
**QA in ML**: First, machine learning is starting to take a place in motion planning and control. Whereas early ML techniques in the field mostly related to feature detection and high-level decision making, current work also tries to address the classical trajectory optimization problem with spatio-temporal constraints using ML techniques [1]. A key challenge is that while classic motion planners and controllers are white-box and deterministic, current ML-based planners are often black-box and non-deterministic. An example is [1], where diffusion is used to create permissible trajectories for a system.\
QA for these systems relies on two techniques: probabilistic (risk-aware) guarantees and statistical verification. Probabilistic techniques place confidence intervals on the outputs of an ML-based planner or controller and use these intervals to define a probabilistic or risk-aware notion of, e.g., safety. While such works exist [2], it is difficult to obtain valid confidence intervals over decision variables. Statistical verification such as conformal prediction has seen a surge of interest in robotics. Using a calibration dataset, confidence regions around future predictions can be generated, under the strong assumption that the calibration data and the deployment environment follow the same distribution [3]. This assumption is very hard to verify, and while robust methods exist that allow for a shift in distribution, capturing this shift in real-world environments remains difficult. \
As such, I would conclude that QA for ML in robotics planning and control has not yet addressed its fundamental challenges. My guess is that, given the impressive performance of LLMs and VLAs, QA is lagging behind because its results generate less excitement. \
**Verification vs. Validation**: In my research on motion planning with temporal logic constraints, verification and validation are both extremely important. While ideally we would like to synthesize motion plans that are correct by construction, this can be intractable in general [4]. Instead, a motion plan can be synthesized in any other way and then verified against the specification afterwards.\
Validation becomes important when we try to capture the desires of a human operator in a spatio-temporal specification. It is then key that the resulting trajectory or controller not only satisfies the spatio-temporal specification, but also that the specification itself fulfills the requirements of the human (e.g., feeling safe).
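The split conformal prediction step discussed above can be sketched in a few lines. This is a minimal illustration under the exchangeability assumption; the error values and the pedestrian-prediction framing are hypothetical, not taken from [3]:

```python
import math

def conformal_radius(calib_errors, alpha=0.1):
    """Split conformal prediction: given nonconformity scores (e.g. prediction
    errors) on a held-out calibration set, return the radius of a region that
    covers a fresh error with probability >= 1 - alpha, ASSUMING calibration
    and deployment data are exchangeable (the strong assumption noted above)."""
    n = len(calib_errors)
    # Conformal quantile index: ceil((n + 1) * (1 - alpha)), converted to 0-based.
    k = math.ceil((n + 1) * (1 - alpha)) - 1
    return sorted(calib_errors)[min(k, n - 1)]

# Hypothetical prediction errors of a learned pedestrian-motion model (meters):
errors = [0.12, 0.05, 0.31, 0.08, 0.22, 0.15, 0.27, 0.09, 0.18, 0.11]
radius = conformal_radius(errors, alpha=0.2)
# A planner would then keep the robot at least `radius` away from the predicted
# pedestrian position to inherit the probabilistic safety guarantee.
```

The guarantee collapses exactly where the text says it does: if the deployment distribution shifts away from the calibration data, the computed radius no longer covers future errors at the stated rate.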




## 3. Guest-Lecture Principles
I have chosen Goal Modeling and Requirements Elicitation.\
**Goal Modeling**: For robotics, goal modeling might seem trivial at first (create a plan and controller to guide the robot through rooms while being safe and stable), but as robots enter the real world more and more, it becomes more of a challenge. For example, goals for the end-user might differ from those of the researcher or the business: while the end-user in an office environment wants to feel safe around a robot, the researcher or business owner wants to make it as fast as possible. Additionally, the end-user wants their privacy protected, while the researcher or business owner wants to achieve the best performance, which might require retraining on image data of the office environment that thus needs to be stored somewhere.\
**Requirements Elicitation**: Requirements for system goals have an obvious relation to my research, as end-users often specify an abstract task that the robot needs to achieve while still having strict requirements that are difficult to put into words. For example, an end-user may wish the robot to clean all rooms while not being in the way of humans. This could easily be captured in a formal specification (eventually clean all the rooms AND never collide with humans); however, this is likely not what the human intended. The safety aspect could instead be decomposed into measurable quality requirements which could be relayed to the end-user (for example, a room may not be cleaned when more than 3 humans are present, and the robot should leave the room upon any remark from a human). Iterating on this process between the robot and the end-user can elicit implicit requirements and create tangible, measurable requirements that the robot can take into account.
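The cleaning example can be sketched in temporal logic. Below, a hypothetical refinement of the naive specification into the decomposed, measurable requirements; predicate names are illustrative and $\tau$ is an assumed response deadline:

```latex
% Naive specification: eventually clean every room, never collide with humans.
\varphi_{\mathrm{naive}} \;=\; \Big( \bigwedge_{r \in \mathit{Rooms}} \mathbf{F}\, \mathit{clean}(r) \Big) \;\wedge\; \mathbf{G}\, \neg \mathit{collide}

% Refined: only clean sparsely occupied rooms, and leave within \tau upon a remark.
\varphi_{\mathrm{refined}} \;=\; \Big( \bigwedge_{r \in \mathit{Rooms}} \mathbf{F} \big( \mathit{clean}(r) \wedge \mathit{humans}(r) \le 3 \big) \Big)
\;\wedge\; \mathbf{G} \big( \mathit{remark}(r) \rightarrow \mathbf{F}_{[0,\tau]}\, \mathit{leave}(r) \big)
```

Each conjunct in the refined formula is individually measurable, so it can be relayed back to the end-user during elicitation.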





## 4. Data Scientists versus Software Engineers
- **Do you agree on the essential differences between data scientists and software engineers put forward in these chapters? Why or why not?**
Yes, I agree with these differences, and I also agree with the observation that their expertise is complementary rather than overlapping. Data scientists, to my knowledge, explore relationships and patterns in data and, within ML research, exploit these to generate models that can predict or synthesize. This is very much a research-oriented way of generating a product. Software engineers, however, focus on robust, maintainable, and scalable products.
- **Do you think these roles will evolve and specialise further or that “both sides” will need to learn many of the skills of “the other side” and that the roles somehow will merge? Explain your reasoning**
I think that in the future many of these skills will merge. While ML has permeated both software engineering and data science, the current generation of practitioners was still taught with distinct courses on writing maintainable and robust code and extracurricular courses on ML (or vice versa). As ML becomes more prevalent in industry, the demand on universities shifts from offering ML as extracurricular courses to absorbing it into the core program (whether for software engineering or data science students).






## 5. Paper analysis
* Privacy and Copyright Protection in Generative AI: A Lifecycle Perspective [5]
1. The paper's main contribution is mapping privacy and copyright challenges across the entire life-cycle of an AI system. It identifies seven key challenges that span multiple stages rather than existing in isolation. The authors argue that traditional approaches are insufficient because they target specific aspects of the AI system rather than taking a holistic view of the entire life-cycle of the product.
These aspects are important because privacy and copyright are not concerns that matter only during deployment or data collection; they warrant design decisions that must be taken into account throughout the entire development process. Issues such as consent management and transparency require coordinated solutions across formulation, data collection, training, and deployment.
2. For planning and control in robotics, this is relevant when we consider learning-based approaches that are trained in the lab and deployed in the real environment. It is then extremely important that the training data (especially when training on data observed in the field) has the proper consent from the participants, and that stakeholders and end-users have insight into how decisions are made. Additionally, continuous monitoring for privacy breaches is important in robotics with AI, as onboard data becomes valuable and bad actors may inject data that could significantly alter the decision-making process of the robot. These concerns need to be considered throughout the entire design process as well: from the robot's hardware, to software (encryption, communication), to the underlying algorithms, and to the deployment of the robot.
3. Consider a company that wants to create a robot that entertains residents of an elderly home. It might tell stories, display interesting videos, or simply stay close to comfort them. The company is involved in building the robot (from hardware components from different vendors), writing the entire software stack (minimally relying on external libraries), and gathering data and deploying the robots in the elderly homes. The customer is the elderly home itself, which may be interested in purchasing such a robot to make the stay for the elderly more pleasant. As this project involves the entire robotics pipeline, and the deployment needs to deal with both privacy and copyright issues, this paper argues that these issues should be addressed from the very start of the project in the form of a life-cycle approach. From the start, the company must consider what data it needs and the legal basis for collection. Stories and videos require copyright licensing, while personalization features need consent for collecting elderly residents' data. The life-cycle approach means establishing legal frameworks before any data gathering begins. Elderly residents may have varying degrees of cognitive decline, making informed consent tricky. The paper's emphasis on clear communication about data use therefore becomes critical: families and care staff need transparent information about what the robot learns and remembers about each resident. If the robot learns to tell stories and show videos, the company needs licenses built into the AI architecture (such that stories can be taken out of the training data if the license is revoked), not added as an afterthought.
Lastly, the robot has to ensure that personal conversations with residents are not shared or used without consent. This might require real-time monitoring of the robot.
4. My research does not directly involve the entire robotics pipeline from design to coding and to deployment. However, in the space of spatio-temporal logic planning, which contains task specification, high-level decision-making, lower-level planning, and low-level control, there is certainly room for improvement on privacy conservation. I came up with three concrete ways that the ideas of the paper may influence my research:
- Consent-Responsive Re-planning: When consent is withdrawn (as emphasized in the paper), a robot needs to dynamically re-plan such that the privacy of the human is preserved while the robot still satisfies its spatio-temporal specification. If a property owner revokes consent for robots to traverse their land, the temporal logic framework must quickly generate alternative plans that maintain mission success while respecting these new constraints. In the elderly home scenario, re-planning might occur when residents have indicated that they do not want to share specific aspects (e.g., how does the robot navigate when certain residents do not want to be captured on video?).
- Temporal Logic for Data Lifecycle Compliance: We could look at extending the formal specification language I use in my research (Signal Temporal Logic) to encode data handling requirements alongside motion planning requirements. A specification might state "survey region X every 30 minutes AND delete any collected data from region Y within 24 hours AND ensure communication graph connectivity when data deletion occurs." This treats privacy compliance as a planning constraint, which ensures that privacy is preserved during planning and execution, not just during training and deployment.
- Privacy-Aware Spatio-Temporal Planning: My research on complex spatio-temporal specifications could incorporate privacy-critical areas where certain robots (or certain components of robots) cannot operate or share data. For example, temporal logic specifications might require "robot team surveys field A completely while ensuring no robot with facial recognition capabilities enters privacy zone B." This would require developing new planning algorithms that respect both mission objectives and privacy boundaries. This closely relates to what I mentioned with the consent-responsive re-planning.
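The data-lifecycle specification quoted in the second point can be sketched in STL syntax as follows (time in minutes, predicates illustrative):

```latex
\varphi \;=\; \mathbf{G}\, \mathbf{F}_{[0,30]}\, \mathit{survey}(X)
\;\wedge\; \mathbf{G} \big( \mathit{collect}(Y) \rightarrow \mathbf{F}_{[0,1440]}\, \mathit{delete}(Y) \big)
\;\wedge\; \mathbf{G} \big( \mathit{delete}(Y) \rightarrow \mathit{connected} \big)
```

Written this way, data deletion becomes a timed obligation that the planner must discharge, on the same footing as the surveying task itself.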

* Green Runner: A tool for efficient deep learning component selection [6]
1. The paper’s main contribution is an automated, resource-efficient approach to deep learning component selection that combines LLM reasoning with a multi-armed bandit. Given natural language descriptions of application scenarios, it automatically generates appropriate configurations (metrics, weights, etc.) to minimize computational waste during model evaluation. These aspects are important from a software engineering perspective, as resource optimization is an overlooked topic in ML. Additionally, it enables end-users and software engineers without domain or ML expertise to express requirements in natural language.
2. This relates to my research through resource-constrained planning, multi-objective optimization, and the translation of natural language not just into spatio-temporal specifications but also into desired robustness metrics.
3. Consider a smart traffic management system that coordinates autonomous vehicles. For example, the company might have a fleet of autonomous taxis in a city that need to be allocated to certain regions of the city. This allocation is highly time-, weather-, and date-dependent. Green Runner's contribution could be that the company has several ML components ready that allocate vehicles to certain regions. There might be a dataset of a rainy day and the locations where people are requesting rides. Green Runner could automatically select optimal models for different scenarios from a set of lightweight low-accuracy models and heavy high-accuracy models. My research could be integrated into this framework via text-to-specification tools, where verbal or textual instructions are translated into a complex spatio-temporal specification before being passed to the LLM. This could narrow the gap between the user's text and the selection of the ML components.
4. I could use the ideas of Green Runner in my own research in several ways.
- LLM-Guided Specification Generation: Similar to how Green Runner uses GPT-4 to infer the best metrics and weights from natural language descriptions, I could develop a system where domain experts describe motion planning scenarios in natural language (e.g., "autonomous drone delivery in urban environment with battery constraints"), and an LLM automatically generates corresponding STL specifications with appropriate resource-awareness weights. This could democratize formal methods for non-expert users.
- Resource-Aware Temporal Logic: I could extend specification languages to include computational resource constraints alongside the spatio-temporal requirements I currently consider. For example, specifications could have requirements on the memory footprint of the models used to complete certain tasks. In a high-level decision framework, this could provide planners that are optimal not just with respect to the spatio-temporal part but also with respect to computational complexity and energy usage (which is especially relevant on remote robots). Building on Green Runner's multi-objective approach, I could formalize these trade-offs using weighted temporal logic formulas where computational metrics (energy, memory, latency) are first-class constraints rather than afterthoughts.
- Formal Verification of Model Trade-offs: In robotics, formal verification is used to check whether plans or assignments satisfy complex specifications. It would be interesting to examine the compromises on probabilistic success rates made by Green Runner's framework: does it properly weigh liveness, safety, and computational load? I could adapt their multi-armed bandit approach to formally verify that chosen motion planning algorithms meet both STL specifications and resource constraints with quantifiable confidence bounds.
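The bandit-based selection underlying Green Runner can be sketched with a simple epsilon-greedy bandit over candidate models. The model names, reward profiles, and the epsilon-greedy strategy here are illustrative assumptions; the tool's actual selection policy may differ:

```python
import random

def epsilon_greedy_select(models, reward_fn, rounds=200, eps=0.1, seed=0):
    """Pick among candidate models by trading off exploration (random choice
    with probability eps) against exploitation (best empirical mean reward).
    Sketch only; Green Runner's actual bandit strategy may differ."""
    rng = random.Random(seed)
    counts = {m: 0 for m in models}
    means = {m: 0.0 for m in models}
    for _ in range(rounds):
        if rng.random() < eps:
            m = rng.choice(models)                       # explore
        else:
            m = max(models, key=lambda x: means[x])      # exploit
        r = reward_fn(m, rng)
        counts[m] += 1
        means[m] += (r - means[m]) / counts[m]           # incremental mean
    return max(models, key=lambda x: means[x])

# Hypothetical reward: accuracy minus an energy penalty, plus evaluation noise.
profiles = {"light_model": (0.80, 0.05), "heavy_model": (0.90, 0.25)}

def reward(model, rng):
    accuracy, energy = profiles[model]
    return accuracy - energy + rng.gauss(0, 0.02)

best = epsilon_greedy_select(list(profiles), reward)
```

With these assumed profiles the bandit converges on the lightweight model, since its accuracy-minus-energy reward dominates: exactly the kind of resource-aware trade-off I would want a verifier to certify against an STL specification.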
## 6. Research Ethics & Synthesis Reflection
I obtained the papers in Question 4 of Section 5 either from memory or by searching Google Scholar with appropriate terms. I screened the selected papers by reading the title and the abstract, and then deciding whether they warranted further reading.
I did not find misleading titles or abstracts. However, some papers oversell their contribution, as always.
My work is original and I have not copied text from LLMs or from sources (cited or not).

References:

[1]: Liu, R., Hou, A., Yu, X., & Yin, X. (2025). Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks. arXiv preprint arXiv:2501.13457.

[2]: Li, S., Liu, F., Cui, L., Lu, J., Xiao, Q., Yang, X., ... & Wang, X. (2025, April). Safe planner: Empowering safety awareness in large pre-trained models for robot task planning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 39, No. 14, pp. 14619-14627).

[3]: Lindemann, L., Cleaveland, M., Shim, G., & Pappas, G. J. (2023). Safe planning in dynamic environments using conformal prediction. IEEE Robotics and Automation Letters, 8(8), 5116-5123.

[4]: Zhang, C., Kapoor, P., Meira-Goes, R., Garlan, D., Kang, E., Ganlath, A., ... & Ammar, N. (2023). Investigating robustness in cyber-physical systems: Specification-centric analysis in the face of system deviations. arXiv preprint arXiv:2311.07462.

[5]: Zhang, D., Xia, B., Liu, Y., Xu, X., Hoang, T., Xing, Z., ... & Zhu, L. (2024, April). Privacy and copyright protection in generative AI: A lifecycle perspective. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI (pp. 92-97).

[6]: Kannan, J., Barnett, S., Simmons, A., Selvi, T., & Cruz, L. (2024, April). Green Runner: A tool for efficient deep learning component selection. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI (pp. 112-117).