I'd like to create a base-level agent that can control my computer. The end goal is to have two parties at play: an orchestrator / evaluator / student, and an actor. I'd like to basically implement https://github.com/karpathy/autoresearch, but for agents who use applications.

How it will work: the agent tries to use an app to construct an artifact. Each app will have some list of artifacts it can produce (ex. a song for GarageBand, a photo for Photoshop, or a playlist for Spotify). These artifacts can be any digital piece of media. Then there will be an evaluation layer that scores how good the agent's artifact is compared to the artifact that was its target. The orchestrator will keep track of how this goes, decide whether or not it was progress, and either keep the commit or throw it out.

The agent is always going to be a computer use agent that is focused on a single application (not a workflow). There will be many artifacts to train on and there will be many cycles (each cycle running for some set amount of time).

I'd like to use the Claude SDK to make the base agent, but have it generalizable to any type of app or directory. For example, think of a really good macOS Settings agent, where the artifact is whether a particular setting was turned on or not, or a Mac App Store agent, which tests whether an app was downloaded. The orchestrator should be responsible for making the edits to the agent's code, prompt, config, etc. to try to make it better for the next cycle (use git to keep track of changes, either thrown out or kept).
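To make the keep-or-discard cycle concrete, here is a minimal Python sketch of the orchestrator loop. Everything named in it is an assumption for illustration, not part of any existing codebase: `Task`, `CycleResult`, `run_cycles`, `keep_change`, and `discard_change` are hypothetical names, the evaluator is assumed to return a float where higher is better, and the actor (which would wrap the Claude-based computer use agent) is passed in as a plain callable so the loop stays app-agnostic. The per-cycle time limit is left out for brevity.

```python
import subprocess
from dataclasses import dataclass
from statistics import mean
from typing import Any, Callable, List


@dataclass
class Task:
    app: str      # hypothetical: the single application the actor drives
    target: str   # hypothetical: path or description of the target artifact


@dataclass
class CycleResult:
    cycle: int
    score: float
    kept: bool


def _git(repo: str, *args: str) -> None:
    # Run a git command inside the agent's repository; raise on failure.
    subprocess.run(["git", "-C", repo, *args], check=True)


def keep_change(repo: str, message: str) -> None:
    # Commit whatever the orchestrator edited (code, prompt, config).
    _git(repo, "add", "-A")
    _git(repo, "commit", "--allow-empty", "-m", message)


def discard_change(repo: str) -> None:
    # Throw the edit away and return to the last kept commit.
    _git(repo, "reset", "--hard", "HEAD")
    _git(repo, "clean", "-fd")


def run_cycles(
    tasks: List[Task],
    propose_edit: Callable[[], None],       # orchestrator edits the agent
    run_actor: Callable[[Task], Any],       # actor produces an artifact
    evaluate: Callable[[Any, Task], float], # higher score = closer to target
    keep: Callable[[str], None],
    discard: Callable[[], None],
    cycles: int,
) -> List[CycleResult]:
    def score_agent() -> float:
        # Average the evaluator's score over every training artifact.
        return mean(evaluate(run_actor(t), t) for t in tasks)

    best = score_agent()  # baseline score before any edits
    history: List[CycleResult] = []
    for cycle in range(1, cycles + 1):
        propose_edit()
        score = score_agent()
        improved = score > best
        if improved:
            keep(f"cycle {cycle}: score {best:.3f} -> {score:.3f}")
            best = score
        else:
            discard()
        history.append(CycleResult(cycle, score, improved))
    return history
```

With a real repository, `keep` and `discard` could be bound as `lambda msg: keep_change(repo, msg)` and `lambda: discard_change(repo)`. Passing them in as callables is a design choice made here so the loop can be exercised with stand-ins before any real app or git repository is involved.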