I feel like the trend in AI is shifting from model construction/pre-training to a post-training/pipelining/engineering phase, where we use systems-level knowledge to run models as fast and efficiently as possible.

Of course, new models/architectures (world models, System 2, GFlowNet, JEPA, etc.), domains (robotics, biology, etc.), and paradigms (reinforcement learning, energy-based models, etc.) will keep coming, but the “verification” phase of AI seems done after the huge success and impact of ChatGPT.

Now real-world deployment is getting larger and larger. Big tech companies and AI startups have already noticed and are working hard on this: more efficient LLM inference, faster and better image/video generation (e.g. Nano Banana), robot-learning simulation, and smart glasses. Still, I think A LOT MORE is about to be deployed in various fields, in various forms, in the upcoming years.

Thus I believe GPU programming and efficient inference is one of the highest-leverage yet still somewhat underrated (compared to AI research) fields to work in, both in 2025 and in the decades ahead.

Similar past post: https://www.junupark.xyz/posts/the-io-and-pipelining-era-of-ml/