May Musings
Quick, raw thoughts
Simulation is taking off. More and more companies are being started or coming out of stealth. Ansatz Labs is one, Sooth came out of stealth, Prior Computers just raised. Demis Hassabis said this is an area he is excited about as the next frontier in AI. This is not an easy area to go after, however. Which variables and equations do you need to specify? How do you set the parameters when all you have is data? What is the maximum likelihood estimate? These are tough questions but potentially solvable with some effort. Automatic differentiation via JAX lets you compute the gradient of a likelihood with respect to the model's parameters, given a dataset. Bayesian inference gets you closer by putting uncertainty on those parameters. The real challenge is setting the right confidence level and getting buy-in from the customers you sell to that a model being 90% accurate is good enough to make decisions on. But maybe a slight edge is all that people are looking for today.
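To make the maximum likelihood question concrete, here is a minimal sketch in JAX. Everything in it is an assumption for illustration: the toy simulator (exponential decay), the Gaussian noise model, the synthetic data, and the names `simulate`, `neg_log_likelihood`, and `fit`. It is not how any of the companies above do it, just the basic pattern of differentiating a likelihood through a simulator and following the gradient.

```python
import jax
import jax.numpy as jnp


def simulate(params, t):
    # Hypothetical toy simulator: x(t) = x0 * exp(-k * t).
    x0, k = params
    return x0 * jnp.exp(-k * t)


def neg_log_likelihood(params, t, y, sigma=0.1):
    # Assumes independent Gaussian noise with known std sigma,
    # so the NLL (constants dropped) is a scaled sum of squared residuals.
    resid = y - simulate(params, t)
    return 0.5 * jnp.sum((resid / sigma) ** 2)


# Autodiff gives the gradient of the NLL with respect to the parameters.
grad_nll = jax.jit(jax.grad(neg_log_likelihood))


def fit(t, y, init, lr=2e-5, steps=5000):
    # Plain gradient descent on the NLL; the minimizer is the MLE.
    params = init
    for _ in range(steps):
        params = params - lr * grad_nll(params, t, y)
    return params


# Synthetic data from known parameters, so the fit can be sanity-checked.
t = jnp.linspace(0.0, 5.0, 50)
true_params = jnp.array([2.0, 0.7])
noise = 0.1 * jax.random.normal(jax.random.PRNGKey(0), t.shape)
y = simulate(true_params, t) + noise

estimate = fit(t, y, init=jnp.array([1.0, 0.3]))
print(estimate)  # should land near the true values [2.0, 0.7]
```

A real simulator would be far more expensive and less well behaved than this, and a Bayesian treatment would replace the single point estimate with a posterior over `params`.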
AI science. Simulation is a key part of sciences like physics, and it's no surprise that AI science is taking off in parallel. A neural-network-based surrogate for a complex simulation is a classic shortcut that AI enables. A neural network's stack of linear and non-linear transformations is truly a spectacular tool for helping humans understand the world. It's done a great job of capturing multidimensional data. Will it neatly capture even more complex problems and find patterns in chaos? Even if it's not perfect, I think it will get us closer. We've been surprised many times over the last century: models somehow get from memorization to generalization. You would expect oversaturation given the increasing amount of data pushed through the neural network, but it hasn't happened. A neural network feels like fire from Prometheus for humans.
Automated research. Within AI science, automated research is a trend we're seeing to automate the flywheel and achieve (as much as I hate this phrase) recursive self-improvement. The biggest challenges in AI research seem to be getting the model to think longer, managing context, and avoiding slop. Emulating how human physicists and mathematicians work is hard. Humans perform a basic form of recursion all the time. I can write down a problem to solve and try to create a mini theory to help me solve that problem. I have called upon myself in the form of this mini theory, which I can then use to solve the initial problem. As problems get more and more complicated, I can create multiple theories to try to solve them. These theories themselves will get more complicated, of course. Models seem to struggle with this because their context length limits the theory building. The models are also lazy and may want to stop prematurely or find a way to cheat the process. Even if you manage to prompt the model into thinking longer, you eventually hit a wall. Solving both problems at once seems to be a big issue. But say you achieve this long thinking and mini theory development by getting over the laziness and context-length issues. When you have a hundred-page explanation, how do you check it? Verifiers may be used, along with surrogates that replicate expensive simulators far more cheaply and quickly. But the compounding error may still be an issue.
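A quick back-of-the-envelope on why compounding error bites. This is a sketch under a strong assumption that I'm making up for illustration: the long explanation is a chain of steps, each step is correct independently with the same probability, and one bad step breaks the chain. Real reasoning chains are not independent like this, and the function names are hypothetical.

```python
def chain_success_probability(per_step_accuracy: float, num_steps: int) -> float:
    # Under the independence assumption, every step must be correct,
    # so the probabilities multiply.
    return per_step_accuracy ** num_steps


def required_step_accuracy(target: float, num_steps: int) -> float:
    # Invert the formula above: per-step accuracy needed to hit
    # a target probability for the whole chain.
    return target ** (1.0 / num_steps)


# 100 steps at 99% each leaves only about a 37% chance the chain is fully correct.
print(round(chain_success_probability(0.99, 100), 3))  # 0.366

# To get a 90% chance over 100 steps, each step needs roughly 99.9% accuracy.
print(round(required_step_accuracy(0.90, 100), 4))  # 0.9989
```

A verifier that catches bad steps along the way is one way out of this multiplication, since the chain no longer has to be right on the first pass.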
Is further architectural innovation needed? That's a key question I've been thinking about for the last two years. I think it's interesting because answering yes assumes a massive new breakthrough, similar to the impact that the adoption of neural networks, and then deep learning, had on AI. A new architecture would have to deliver an undeniable improvement in performance, especially given the resources that have been invested in the transformer architecture. I do wish that we had invested in SSI and TMI when we had the chance. TMI's Tinker product is absolutely ripping in revenue. SSI would be one of the most radical bets that this breakthrough will happen, considering Ilya's pedigree. Nonetheless, a key thing I hear from some of the top AI researchers is that another architecture innovation is not needed.
Execution. No more architecture innovation means execution becomes key. I wrote a post over three years ago arguing that speed kills in AI. Execution seems to be the key factor that the aforementioned researchers emphasize. Don't waste time and resources on research, or at the very least, don't waste massive amounts of money on it by blindly scaling. Instead, execute and sell to customers to establish the right direction for the company. I don't want to beat a dead horse here, but we have already made so much progress that it's really about implementation. It's the same thing that Jimmy Ba and Liam Fedus have echoed.
What's left for finance? The panic over white-collar job displacement seems to have hit Wall Street. I caught up with a friend and he mentioned the uneasiness that analysts are feeling, especially after Ken Griffin's recent interview, where he stated that work that used to take people with master's degrees and PhDs weeks or months is now being done in a few days by AI. Using AI for things like sentiment analysis in HFT has been around for a decade. But now, the role of an analyst investing on a medium time frame of a few quarters is being disrupted. I wrote a private essay to a firm earlier this year talking about the emergence of a new type of investment organization. We saw the golden age of fundamental analysis during the late 70s and 80s when Peter Lynch and George Soros were top of mind. Jim Simons disrupted that regime at Renaissance with his quantitative approach. My thesis was that we are approaching an age when AI will create single-PM shops. We seem to be heading into that era headfirst.