Sunday, July 20, 2025

Key Takeaways from 6.5 Years in Machine Learning

Navigating the Machine Learning Landscape: Insights from My Journey

I started learning machine learning more than six years ago, when the field was just beginning to gain substantial traction. Around 2018, while I was taking my first university courses on traditional machine learning methods, pivotal techniques were being developed behind the scenes; they would eventually fuel the explosive growth of artificial intelligence in the early 2020s. The release of the GPT models set off a race among companies to push the limits of model performance and parameter counts. It was an exhilarating time to dive into machine learning: the rapid pace of advancement meant there was always something new to discover and learn.

Every six to twelve months, I find myself reflecting on my journey, mentally fast-forwarding from university lectures to my current work in commercial AI research. This reflection often surfaces fundamental principles that have guided my learning. One of the most striking realizations has been the importance of homing in on one narrow topic. Beyond this, I’ve identified three additional principles that shape my approach to machine learning. They may not be strictly technical insights, but they concern mindset and methodology, which are equally crucial.

The Importance of Deep Work

Winston Churchill’s quick wit is legendary, highlighted in a well-known exchange with Lady Astor, the first woman in British Parliament. When she remarked, “If I were your wife, I’d put poison in your tea,” Churchill retorted, “And if I were your husband, I’d drink it.” While such clever banter is admired, quick wit isn’t the key to success in machine learning research and engineering. What truly propels you forward is the ability to concentrate deeply.

Engaging in machine learning, especially on the research side, is not fast-paced in the traditional sense. It demands long stretches of uninterrupted, intense thought. Developing algorithms, debugging data issues, and formulating hypotheses require focused effort over sustained periods.

When I talk about “deep work,” I’m referring to two aspects:

  • The skill to concentrate deeply for extended periods
  • Creating an environment that fosters and encourages sustained focus

Over the past couple of years, I’ve come to view deep work as vital for making significant strides in my projects. Hours of focused immersion have proven far more productive than fragmented moments of distracted labor. Fortunately, deep work is a skill that can be cultivated, and an environment that encourages it can be deliberately designed.

Some of my most rewarding experiences come in the stretch before paper submission deadlines. These periods offer a rare opportunity to immerse yourself fully in one project and enter a state of flow. As Richard Feynman observed, doing real, good work in physics demands absolutely solid lengths of time and deep concentration. Replace “physics” with “machine learning,” and the sentiment remains just as powerful.

Choosing Depth Over Trendy Hype

Large language models are currently all the rage, with names like GPT, LLaMA, Gemini, and Claude becoming household names in tech discussions. For those just starting out, however, chasing trends can keep you from building steady momentum.

I recall a former colleague, whom I’ll call John, who eagerly dove into the burgeoning field of retrieval-augmented generation (RAG) to enhance language model outputs. However, the rapid evolution of these models complicated his efforts. By the time he got an older model operational, a newer one had already made headlines. The fast pace of change and ambiguous evaluation criteria proved challenging for someone still navigating the early stages of research.

This experience isn’t a criticism of John; it’s a cautionary tale about the pitfalls of relying on the latest trends for progress in machine learning. Instead of becoming swept up in the latest hype, focus on developing a deep understanding of solid foundational concepts and techniques.

Doing Boring Data Analysis (Over and Over)

Every time I transition to training a model, I experience a sense of relief. Why? Because I’ve successfully navigated the often tedious and complex task of data analysis.

Let’s break it down:

  1. You start with a project.
  2. You acquire real-world datasets.
  3. You envision training ML models.
  4. But first, you must prepare the data.

It sounds straightforward, but step 4 is where projects tend to stall: real-world data almost never arrives in the shape your model expects.

For instance, while working with ERA5 weather data, a colossal gridded reanalysis dataset, my goal was to predict NDVI (the Normalized Difference Vegetation Index, a measure of vegetation density) from historical weather patterns. After merging the ERA5 data with NDVI satellite data from NOAA, I thought I was all set to train my Vision Transformer.
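
To make this concrete, here is a rough sketch of that kind of merge using xarray. The file names, variable layout, and regridding choices are illustrative stand-ins, not my exact pipeline:

```python
import xarray as xr

# ERA5 reanalysis fields (e.g., temperature, precipitation) on a lat/lon grid.
# File names here are hypothetical placeholders.
era5 = xr.open_dataset("era5_weekly.nc")

# NDVI derived from NOAA satellite data, on its own (different) grid
ndvi = xr.open_dataset("noaa_ndvi_weekly.nc")

# Regrid NDVI onto the ERA5 latitude/longitude grid by interpolation
ndvi_on_era5 = ndvi.interp(latitude=era5.latitude, longitude=era5.longitude)

# Keep only time steps present in both datasets and merge for training
merged = xr.merge([era5, ndvi_on_era5], join="inner")
merged.to_netcdf("training_data.nc")
```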

However, when I visualized the model’s predictions days later, I encountered a shocking revelation: the model had concluded that the Earth was upside down. Despite hours of preparation, I had overlooked an essential detail: regridding the NDVI data to a different resolution had flipped its latitude orientation.
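
A cheap sanity check run before training would have caught it. Something along these lines (again with hypothetical file names) verifies that both datasets order latitude the same way:

```python
import numpy as np
import xarray as xr

def check_latitude(ds: xr.Dataset, name: str) -> bool:
    """Return True if latitude ascends; fail loudly if it is not monotonic."""
    lat = ds["latitude"].values
    ascending = bool(np.all(np.diff(lat) > 0))
    if not ascending and not np.all(np.diff(lat) < 0):
        raise ValueError(f"{name}: latitude axis is not monotonic")
    print(f"{name}: latitude runs from {lat[0]:.2f} to {lat[-1]:.2f}")
    return ascending

era5 = xr.open_dataset("era5_weekly.nc")
ndvi = xr.open_dataset("noaa_ndvi_weekly.nc")

# If the two grids disagree on orientation, flip one before merging
if check_latitude(era5, "ERA5") != check_latitude(ndvi, "NDVI"):
    ndvi = ndvi.isel(latitude=slice(None, None, -1))
```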

This oversight stemmed from my eagerness to skip directly to the exciting part—machine learning. The hard truth is that in real-world ML applications, mastering data preparation is integral to successful outcomes.

Yes, academic research may provide curated datasets like ImageNet, CIFAR, or SQuAD, but real-world models necessitate:

  • Cleaning, aligning, normalizing, and validating data
  • Debugging unusual edge cases
  • Visually inspecting intermediate data

This cycle repeats, often several times, until the data is genuinely ready for training. I learned the hard way how much diligence data engineering demands.
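
In practice, the validation step can be as lightweight as a few assertions and summary statistics re-run after every preprocessing change. A minimal sketch of what I mean, with illustrative thresholds and stand-in data:

```python
import numpy as np

def validate_array(x: np.ndarray, name: str, lo: float, hi: float) -> None:
    """Cheap checks worth re-running after every preprocessing change."""
    if np.isnan(x).any():
        raise ValueError(f"{name}: contains NaNs")
    if x.min() < lo or x.max() > hi:
        raise ValueError(f"{name}: values outside [{lo}, {hi}]")
    print(f"{name}: shape={x.shape}, mean={x.mean():.3f}, std={x.std():.3f}")

# NDVI is defined as (NIR - Red) / (NIR + Red), so valid values lie in [-1, 1]
ndvi = np.random.uniform(-1.0, 1.0, size=(180, 360))  # stand-in data
validate_array(ndvi, "NDVI", lo=-1.0, hi=1.0)
```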

(Machine Learning) Research Is a Specific Kind of Trial and Error

From an outsider’s perspective, scientific progress often appears to be a streamlined process:

Problem → Hypothesis → Experiment → Solution

Yet, the reality is considerably messier. Mistakes will occur—some minor, others downright embarrassing, like misorienting Earth. The key lies in how you handle these mistakes.

Missteps are a natural part of the process, but insightful mistakes offer valuable lessons. To expedite my learning from perceived failures, I maintain a simple lab notebook. Before I run an experiment, I document:

  1. My hypothesis
  2. My expectations
  3. The rationale behind those expectations

This simple act lets me reflect on results, even unfavorable ones, and judge whether my assumptions were sound.
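
Nothing about this requires special tooling; a plain text file works. I just find it helps to keep entries machine-readable so they can be searched later. A minimal sketch, with a made-up experiment as the example:

```python
import json
from datetime import datetime, timezone

def log_entry(path: str, hypothesis: str, expectation: str, rationale: str) -> None:
    """Append one pre-experiment notebook entry as a JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "hypothesis": hypothesis,
        "expectation": expectation,
        "rationale": rationale,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

log_entry(
    "lab_notebook.jsonl",
    hypothesis="A longer input window improves NDVI forecasts",
    expectation="Validation MSE drops by at least 5%",
    rationale="Vegetation responds to weather integrated over weeks, not days",
)
```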

Transforming errors into feedback accelerates learning. After all, as Niels Bohr put it, “An expert is a person who has made all the mistakes that can be made in a very narrow field.” That’s the essence of research.

Key Takeaways for Aspiring ML Practitioners

Through my years in machine learning, I’ve realized that excelling in this domain isn’t merely about riding the waves of the latest technological advancements. Instead, it involves:

  • Carving out time and space for deep work
  • Choosing depth over trendy hype
  • Committing to serious data analysis
  • Embracing the inevitable messiness of trial and error

For anyone embarking on this path, whether just starting out or with some experience under their belts, these lessons hold immense value. They may not be the highlights of keynote speeches at conferences, but they play a crucial role in shaping your growth and success in the field.
