Advice for Early Career Data Scientists
Coming out of college, I had some ideas about how I was going to become successful and what my career was going to look like. Of course, I was all wrong. Here is the advice I would offer a young me. Hopefully it is useful to someone early in their career!
Data Science is Interdisciplinary
Data Science is only as valuable as the problems (outside of data science) that it solves. One of the things I love most about data science is how many other fields it touches: e-commerce, marketing, finance, engineering, medicine—the list goes on and on. Most data scientists do a lot more than build and evaluate models. What you learn in school is really just the beginning. You can know the latest deep learning frameworks, you can know the dozen assumptions of linear regression, but if you want to be successful in your career you need domain expertise.
Learn your business. How does your company make money? What value does it provide to its customers? Where does your company spend money? If your company is publicly traded, it has a publicly available Profit and Loss (P&L) statement. Read it! How does your department support the business? How does your team support your department? How do you support your team?
You probably already know who the top data scientists are. But who are the leaders of your field? (Your field is not data science!) For example, if you’re doing data science for marketing, who are the best marketers in the world? What challenges are they facing? You’re really limiting your career if you say you don’t care about any of that stuff.
Organize your finances
Odds are you’re making more money, possibly a lot more money, than you were a few years ago. Congratulations! It’s more important than ever to be disciplined with your finances: discipline equals freedom (that’s a quote from Jocko Willink).
Does your company offer a 401(k)? It’s never too early to start saving for retirement. Learn about the differences between a 401(k), a traditional IRA, and a Roth IRA. Learn about FSAs and HSAs (accounts for medical expenses). Learn about index funds, diversification, and dollar cost averaging. Make a budget and get your spending under control. Identify your financial goals and devise a strategy for getting there. I’m not a financial advisor, but educate yourself and take control of your finances. Thanks to compound interest, the impact of decisions you make today gets magnified exponentially over the rest of your life, so make good decisions.
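To make the compounding point concrete, here is a minimal sketch in Python. The $500 monthly contribution and the 7% annual return are assumptions chosen purely for illustration, not a forecast or a recommendation, and `future_value` is a hypothetical helper name.

```python
def future_value(monthly_contribution: float, annual_return: float, years: int) -> float:
    """Balance after contributing at the end of every month, compounding monthly."""
    monthly_rate = annual_return / 12
    balance = 0.0
    for _ in range(years * 12):
        # Last month's balance grows, then this month's contribution is added.
        balance = balance * (1 + monthly_rate) + monthly_contribution
    return balance


# Hypothetical numbers for illustration only: $500/month at an assumed 7% annual return.
early = future_value(500, 0.07, 40)  # e.g. saving from age 25 to 65
late = future_value(500, 0.07, 30)   # e.g. saving from age 35 to 65

print(f"40 years of saving: ${early:,.0f}")
print(f"30 years of saving: ${late:,.0f}")
```

Under these assumed numbers, the person who starts ten years earlier contributes a third more money but ends up with more than twice the balance. Real returns vary from year to year, so treat this as arithmetic, not a prediction.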
No one is going to manage your career except for you
You might be thinking, “I’m going to do what I’m told and then I’ll get raises and promotions every few years.” As if you’re on a conveyor belt to success. If you’re lucky, you have a manager or a mentor who can offer you advice, but it’s up to you to take ownership of your own success.
What are your career goals? Everyone thinks they want money, but miserable rich people are cliché. You probably want some combination of autonomy, authority, work/life balance, to like your coworkers, and to work on interesting problems that make a difference in the world. Decide for yourself what’s important to you, and take work assignments that bring you closer to your goals. Take stock every few months: are you closer to your objectives than you were before?
What could you do in the next 6 months to advance your career? It’s not just about raises and promotions, which are not really in your control anyway. What do you want to learn? What experiences do you want to have? What kind of résumé would it take to get your dream job, and how can you get to that résumé?
No one cares how smart you are
Despite the last section, if all you care about is your own career advancement, you’re not going to go very far. There’s a saying, “If you want to go fast, go alone; if you want to go far, go together.” Good teams are more than the sum of their parts: we can accomplish more together than any of us can accomplish alone. To be successful, you need a track record of accomplishment, and that only happens as part of a high-performing team.
Put your team first. If your team succeeds, you will succeed. If your team fails, you will fail. What does success look like for your team? What can you do to make your team more successful? No problem is beneath you.
No one cares how smart you are. They care that you are smart enough to do your job. Past that point, they care about how pleasant you are to work with. Don’t try to be the smartest person in the room: be the kindest person in the room. The most gracious. The most collaborative. Share credit. Lift others up. All the stuff you learned in kindergarten is still true.
At some point you’re going to get thrown under the bus, because not everyone is going to follow this advice. This too shall pass. Do everything you can to help your team, but if your teammates, and especially your leaders, do not do the same, it’s a bad team.
Most problems are not classification problems
This is my only technical advice, and I include it because I see so many early career data scientists make this mistake (myself included). You probably focused mostly on prediction/classification problems in school. You did this not because those problems are important, or common, but because they are easy and showcase many of the relevant ideas. They are the sorts of problems that can be solved in a few hours for a homework assignment, not the real-world problems that take weeks or months to solve.
No one has ever asked me to classify a bunch of pictures of cats and dogs. Allegedly there are people who build systems for classifying X-rays for detecting cancer and such. Those problems have clear objectives and clean data. Most problems are not like that.
One of my first real-world projects was investigating why people convert to a subscription service my company offered. I thought, “some people convert, some people don’t, so I’ll build a classifier that predicts who will convert!” It didn’t work at all. The technical explanation of why it didn’t work is data imbalance: the overwhelming majority of people don’t convert, so my classifier achieved really good accuracy by just predicting that no one would convert. The real problem is that classification is the wrong approach here.
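Here is a small sketch of that accuracy trap. The numbers are hypothetical (1,000 visitors with a 2% conversion rate), not data from the project described above.

```python
# Hypothetical data: 1,000 visitors, of whom 20 (2%) convert.
labels = [1] * 20 + [0] * 980

# A "classifier" that simply predicts that no one converts.
predictions = [0] * len(labels)

# Accuracy: fraction of visitors labeled correctly.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall: fraction of actual converters the classifier identified.
converters_found = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = converters_found / sum(labels)

print(f"accuracy = {accuracy:.1%}")  # 98.0%
print(f"recall   = {recall:.1%}")    # 0.0%
```

The accuracy looks excellent, yet the model has not identified a single converter, so it says nothing about why people convert.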
The answer is to build a model that predicts propensity to subscribe. Some people are more likely to subscribe than others: who are those people? Logistic regression is a good tool for this. (I have seen multiple sources say something like, “logistic regression is a misnomer; it actually is a tool for classification”. This is not correct.)
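As a rough sketch of what a propensity model looks like, the example below fits a one-feature logistic regression in plain Python. Everything specific in it is an assumption made for illustration: the feature (visits in the last month), the invented counts in `groups`, and the choice to fit with Newton's method. In practice you would more likely use a library such as scikit-learn or statsmodels.

```python
import math

# Hypothetical data: (visits last month, users, converters). Invented for illustration.
groups = [(0, 400, 2), (2, 300, 4), (4, 150, 5), (6, 100, 8), (8, 50, 8)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(groups, iterations: int = 25):
    """Fit P(convert | visits) = sigmoid(w * visits + b) by Newton's method."""
    w, b = 0.0, 0.0
    for _ in range(iterations):
        g_w = g_b = h_ww = h_wb = h_bb = 0.0
        for x, n, k in groups:
            p = sigmoid(w * x + b)
            resid = n * p - k         # expected minus observed converters
            curv = n * p * (1.0 - p)  # curvature (Hessian) weight for this group
            g_w += resid * x
            g_b += resid
            h_ww += curv * x * x
            h_wb += curv * x
            h_bb += curv
        # Newton step: solve the 2x2 system H * step = gradient by hand.
        det = h_ww * h_bb - h_wb * h_wb
        w -= (h_bb * g_w - h_wb * g_b) / det
        b -= (h_ww * g_b - h_wb * g_w) / det
    return w, b


w, b = fit_logistic(groups)
for visits in (0, 4, 8):
    print(f"{visits} visits -> propensity {sigmoid(w * visits + b):.3f}")
```

With these made-up counts, every fitted propensity stays below 0.5, so thresholding at 0.5 would again predict that no one converts. The useful output is the propensity itself: which users are more likely to subscribe, and by how much.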
Once you frame it like that, problems like this one are actually pretty straightforward. They have a clear objective, at least. Many of the problems I’ve faced in my career are much more complicated. Even so, I often see new teammates make the exact same mistake and approach a propensity problem as if it were a classification problem. Scrutinizing this problem even further leads you to causal inference, so check out the other articles on my blog to learn more!