app_control/agent_app_training.md at main · dhanaway/app_control

I'd like to create a base-level agent that can control my computer. The end goal is to have two parties at play: an orchestrator (which also acts as evaluator and student) and an actor. Basically, I'd like to implement https://github.com/karpathy/autoresearch, but for agents that use applications.

How it will work: the agent tries to use an app to construct an artifact. Each app has some list of artifacts it can produce, e.g. a song for GarageBand, a photo for Photoshop, or a playlist for Spotify. An evaluation layer then compares how good the agent's artifact is relative to the artifact that was its target. These artifacts can be any digital piece of media. The orchestrator keeps track of how this goes, decides whether or not the cycle made progress, and either keeps the commit or throws it out.

The agent is always a computer-use agent focused on a single application (not a workflow). There will be many artifacts to train on and many cycles, each running for some set amount of time. I'd like to use the Claude SDK to make the base agent, but have it generalizable to any type of app or directory. For example, think of a really good macOS Settings agent, where the artifact is whether a particular setting was turned on, or a Mac App Store agent, which tests whether an app was downloaded.

The orchestrator should be responsible for editing the agent's code, prompt, config, etc. to try to make it better for the next cycle, using git to keep track of changes (either thrown out or kept).
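The keep-or-discard cycle described above could be sketched roughly as follows. This is a minimal illustration, not a settled design: `run_cycles`, `CycleResult`, `propose_edit`, `run_and_score`, `git_keep`, and `git_discard` are all hypothetical names, the score is assumed to be a single float where higher is better, and the actor (the Claude-based computer-use agent) and the evaluation layer are abstracted behind plain callables so that no particular SDK call is implied.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CycleResult:
    cycle: int
    score: float
    kept: bool


def git_keep(message: str) -> None:
    """Hypothetical helper: commit the orchestrator's edit to the agent repo."""
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)


def git_discard(message: str) -> None:
    """Hypothetical helper: throw away uncommitted changes to tracked files."""
    subprocess.run(["git", "reset", "--hard", "HEAD"], check=True)


def run_cycles(
    propose_edit: Callable[[int], str],     # orchestrator edits code/prompt/config
    run_and_score: Callable[[str], float],  # actor builds artifact, evaluator scores it
    baseline_score: float,
    cycles: int,
    keep: Callable[[str], None] = git_keep,
    discard: Callable[[str], None] = git_discard,
) -> List[CycleResult]:
    """Run a fixed number of cycles, keeping only edits that beat the best score."""
    best = baseline_score
    history: List[CycleResult] = []
    for i in range(cycles):
        edit = propose_edit(i)
        score = run_and_score(edit)
        if score > best:
            # Assumption: "progress" means strictly beating the best score so far.
            keep(edit)
            best = score
            kept = True
        else:
            discard(edit)
            kept = False
        history.append(CycleResult(i, score, kept))
    return history
```

Passing `keep` and `discard` as parameters lets the loop be exercised without touching a real repository, for example by substituting `list.append` for both during a dry run. A per-cycle time budget, mentioned above, is left out of this sketch.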
