I've been studying AI for about 10 years, since before it was cool. Some of my most satisfying aha moments happened before things spiraled out of control in the public narrative. Learning how gradient descent finds the bottom of a loss curve, or working with an Adam optimizer? Pure fun.
But now it's all about how many tiers ChatGPT has, how many numbered model variants are out there, and what the "o" in the name stands for. Or - and I find this ridiculous, honestly - how good a model is at "reasoning". An LLM agent doesn't reason; it simply tries to guess the most plausible next token, given some input token sequence. Of course, if you narrow the training to just coding, the agent becomes very good at guessing the next token, and you, the builder of the model, can share some of the steps taken in the guessing process and label them as "reasoning"... But the model doesn't really "reason", and it certainly isn't "sentient".
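To make "guessing the most plausible next token" concrete, here is a toy sketch. It is deliberately not how a real LLM works: a real model is a neural network conditioned on the whole input sequence, while this one only counts which word tends to follow which. The corpus and the function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        counts[current][nxt] += 1
    return counts


def guess_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Tiny made-up corpus, just enough to have something to count.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
```

Calling guess_next(model, "the") returns "cat", simply because "cat" followed "the" more often than "mat" did. No reasoning involved, just counting.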
Anyway, in this marketing and advertising nightmare, there is one area that is still nice to hang around in. It's not pristine, and the noise keeps growing these days, but it's still fun to build in.
It's about agents.
Introducing Lord of the Flips
I've been playing with the Coinbase AgentKit SDK for a few months now, and the most interesting result is an agent called Lord of the Flips. It is linked to the Flippando game I talked about before, and it does a number of interesting things.
First of all, it can play the game autonomously. That was the most time-consuming part. It didn't involve any training; it was about building specific tools for the agent. Think of them as "routines" or modules in traditional software. You give the agent a wallet and then teach it the specific operations it can do with that wallet. It needs exact parameters, like contract addresses and function names. But once you've wired everything in place, it looks like pure magic. It can start from nothing and present you with a fully solved puzzle board, recovering from failed transactions, recognizing when the game is done, etc. Fun.
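Here is a minimal sketch of what such a tool and its play loop could look like. It does not use the real AgentKit API or the real Flippando contracts: the Tool class, the zero contract address, the "flipTiles" function name, and the in-memory FakeGame are all hypothetical stand-ins, so the example runs without a wallet or a chain.

```python
from dataclasses import dataclass
from typing import Callable


class TransactionFailed(Exception):
    """Raised by the stand-in game when a simulated transaction reverts."""


@dataclass
class Tool:
    """One operation the agent is allowed to perform (all fields are placeholders)."""
    name: str
    contract_address: str
    function_name: str
    run: Callable[[], object]


class FakeGame:
    """In-memory stand-in for the on-chain game: solve `pairs` pairs to finish."""

    def __init__(self, pairs, fail_every=3):
        self.remaining = pairs
        self.calls = 0
        self.fail_every = fail_every

    def flip(self):
        self.calls += 1
        if self.calls % self.fail_every == 0:
            raise TransactionFailed("simulated revert")
        self.remaining -= 1
        return self.remaining

    def is_solved(self):
        return self.remaining == 0


def play_until_solved(tool, is_solved, max_retries=3):
    """Call the tool until the game reports done, retrying failed transactions."""
    moves = 0
    while not is_solved():
        for _ in range(max_retries):
            try:
                tool.run()
                moves += 1
                break
            except TransactionFailed:
                continue  # recover from the failure and try the same move again
        else:
            raise RuntimeError(f"{tool.name} failed {max_retries} times in a row")
    return moves


game = FakeGame(pairs=4)
flip_tool = Tool(
    name="flip_tiles",
    contract_address="0x0000000000000000000000000000000000000000",  # placeholder
    function_name="flipTiles",  # hypothetical function name
    run=game.flip,
)
moves = play_until_solved(flip_tool, game.is_solved)
```

The loop itself is boring on purpose. The "magic" is just a retry around a failed call and a check for the done state.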
Second, it can observe some interesting patterns across chains. Again, this needs specific inputs, like chain IDs and contract addresses on each chain. When it has all this, the agent can autonomously query the contracts, find interesting metrics and - behold, the magic! - share them on Twitter or Farcaster. That includes arbitrage opportunities, such as which chain is the most profitable one to play on. Cool.
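A rough sketch of the comparison step, once the per-chain numbers have been read from the contracts. Everything here is an assumption for illustration: the chain IDs and names are placeholders, the figures are invented, and "profit" is simplified to reward minus cost per game. The actual contract queries and the posting to Twitter or Farcaster are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainStats:
    """Per-chain numbers the agent would read from the contracts (values invented)."""
    chain_id: int
    name: str
    reward_per_game: float
    cost_per_game: float

    @property
    def profit(self):
        # Simplified metric: what a game pays out minus what it costs to play.
        return self.reward_per_game - self.cost_per_game


def most_profitable_chain(stats):
    """Pick the chain with the highest net profit per game."""
    if not stats:
        raise ValueError("no chain stats to compare")
    return max(stats, key=lambda s: s.profit)


def format_post(best):
    """Build the text the agent would share on social media."""
    return (
        f"Most profitable chain to play on right now: {best.name} "
        f"(net {best.profit:.2f} per game)"
    )


# Placeholder chains and numbers, not real measurements.
stats = [
    ChainStats(chain_id=1001, name="chain-a", reward_per_game=1.00, cost_per_game=0.40),
    ChainStats(chain_id=1002, name="chain-b", reward_per_game=0.90, cost_per_game=0.10),
    ChainStats(chain_id=1003, name="chain-c", reward_per_game=1.20, cost_per_game=0.75),
]
best = most_profitable_chain(stats)
post = format_post(best)
```

With these made-up numbers, chain-b wins even though it has the lowest reward, because it is by far the cheapest to play on.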
And third, it can support players with complete instructions and answers to their questions. It uses a simple RAG mechanism, and it has some limitations, but it works.
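For a feel of what "simple RAG" can mean, here is a bare-bones sketch: score each help snippet by how many words it shares with the question, then paste the best match into the prompt. This is an assumption about the general shape, not the agent's actual code. A real setup would more likely use embeddings, and the snippets below are placeholders rather than real game instructions.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the `top_k` documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Put the retrieved context in front of the question for the LLM."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Placeholder help snippets.
docs = [
    "How to start a new game: placeholder instructions.",
    "How to connect a wallet: placeholder instructions.",
    "How rewards work: placeholder instructions.",
]
question = "How do I connect my wallet?"
prompt = build_prompt(question, docs)
```

The limitations show up quickly: a question phrased with different words than the docs will retrieve the wrong snippet, and the model can only answer from whatever was retrieved.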
What did I learn building this agent?
Agents still need code. They need tools to know exactly what they have to do; otherwise they are "blind".
Agents "look" like magic, but they really aren't. It's just a lot of plumbing and boilerplate wired together, which, at the end of the day, gives the impression of autonomy.
Agents don't exist in a vacuum. They need deployment, just like any other piece of software. If the deployment fails, poof, there is no agent: they live only as long as the server they're on.