Thursday, September 11, 2025

5 Portfolio Mistakes That Keep Data Scientists From Getting Hired


 

A strong portfolio is often the difference between making it and breaking it. But what exactly makes a portfolio strong? Numerous complicated projects? Slick design? Impressive data visualization? Yes and no. A great portfolio needs all of these, but they’re so obvious that everyone already knows you can’t do without them.

However, many data scientists make mistakes when trying to go beyond that. As a result, they’re interviewing with portfolios that nominally have everything but are actually not that great.

 

The Framework

 
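Here’s the framework that will help you avoid common mistakes when building a great portfolio. It has five steps, the same ones used in the project example later in this article: pick a domain, source the data, frame the problem, build end-to-end, and finish with an action.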
 

The Mistakes

 
Let’s now talk about the portfolio-building mistakes and how to avoid them using that framework.

 

// Mistake #1: Building Projects You Don’t Care About

Many portfolios give the impression that the projects are there just to tick a box: Titanic survival, Iris dataset, MNIST digits. You know, the typical stuff. Not only will you drown among thousands of similar portfolios, but these autopilot projects also show a lack of originality and of interest in what you’re doing.

Fix: Start with domains that interest you, e.g., sports, finance, music. When the topic interests you, you’ll go deeper without even trying. If you’re a sports fan, you might analyze shot efficiency in the NBA or choose from these cool project ideas for practice. A music fan might model playlist recommendations.

 

// Mistake #2: Using Whatever Data Falls Into Your Lap

Candidates often grab the first clean CSV they can find. The problem is that real data science doesn’t work that way.

Fix: Demonstrate that you know how to find real data, access it, and reshape it for the modeling stages. In your projects, use APIs (e.g., Twitter/X API), open government datasets (e.g., data.gov), curated dataset lists (e.g., Awesome Public Datasets on GitHub), and web-scraped sources. Use as many data sources as you can, evaluate their quality, merge them into one dataset, and prepare it for modeling.
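
To make the "evaluate, merge, prepare" step concrete, here is a minimal sketch using only the Python standard library. The two inputs (a CSV export of daily sales and a list of weather records shaped like an API response) are made-up stand-ins, and the function names are hypothetical. In a real project you would likely do the same thing with a dataframe library.

```python
import csv
import io

# Hypothetical raw inputs standing in for two different sources:
# a CSV export of daily sales and API-style weather records.
SALES_CSV = """date,store,sales
2024-01-01,A,120
2024-01-02,A,
2024-01-03,A,150
"""

WEATHER_RECORDS = [
    {"date": "2024-01-01", "temp_c": 3.5},
    {"date": "2024-01-03", "temp_c": -1.0},
]


def load_sales(text):
    """Parse the CSV and drop rows with a missing sales value."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["sales"].strip() == "":
            continue  # evaluate the data: skip incomplete records
        rows.append(
            {"date": row["date"], "store": row["store"], "sales": float(row["sales"])}
        )
    return rows


def merge_on_date(sales_rows, weather_rows):
    """Left-join weather onto sales by date; temp_c is None when there is no match."""
    weather_by_date = {w["date"]: w["temp_c"] for w in weather_rows}
    return [{**s, "temp_c": weather_by_date.get(s["date"])} for s in sales_rows]


merged = merge_on_date(load_sales(SALES_CSV), WEATHER_RECORDS)
for row in merged:
    print(row)
```

Dropping incomplete rows is only one possible cleaning choice. Whatever you choose, say why in the project write-up.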

 

// Mistake #3: Treating Projects Like Kaggle Competitions

Kaggle competitions focus on optimizing for a single metric. This is great for practice but doesn’t cut it in the real world. Accuracy in itself isn’t a goal. You’ll have to make a trade-off between the technical aspects of your model and the actual business or social impact.

Fix: Even if you use common datasets from Kaggle, always offer a different angle and frame the problem so it has business or social value. For example, don’t just classify fake vs. real news; show which words, phrases, or topics drive misinformation. Another example: don’t just predict churn; show how a 10% reduction in churn could save $2M in annual revenue.
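
A claim like the churn one is easy to back up with a few lines of arithmetic. The sketch below is illustrative only: the customer count, revenue per customer, and baseline churn rate are assumptions chosen so the numbers land on the $2M figure, and "10% reduction" is read as a relative reduction of the churn rate.

```python
def annual_churn_savings(customers, revenue_per_customer, churn_rate, relative_reduction):
    """Revenue retained per year when the churn rate drops by a relative fraction."""
    lost_before = customers * churn_rate * revenue_per_customer
    lost_after = customers * churn_rate * (1 - relative_reduction) * revenue_per_customer
    return lost_before - lost_after


# Assumed inputs, not taken from any real company:
savings = annual_churn_savings(
    customers=100_000,           # assumed customer base
    revenue_per_customer=1_000,  # assumed annual revenue per customer, in dollars
    churn_rate=0.20,             # assumed baseline annual churn
    relative_reduction=0.10,     # the 10% reduction from the text
)
print(f"Estimated annual savings: ${savings:,.0f}")
```

Stating the inputs explicitly lets a reviewer challenge the assumptions instead of the conclusion.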

 
 

// Mistake #4: Showing Only Models, Not Workflows

A lot of projects read like a run through a Jupyter notebook: import libraries, preprocess data, fit models, report accuracy. That’s incomplete and boring. What’s missing is a demonstration of how you handle the different stages of a project and why you make certain decisions.

Fix: Make them end-to-end projects. Show every stage, from data collection to deployment and everything in between. Explain why you made key choices, e.g., why you picked one model over another, or why you engineered a certain feature. Use tools like Streamlit, Flask, or Power BI to build apps and dashboards that others can use. All this will make your projects look like applied problem-solving (e.g., Arch Desai’s portfolio), not a code walkthrough (e.g., this one).

 

// Mistake #5: Ending With a Model, Not Action

Data scientists often stop at the technical result, e.g., an accuracy score. OK, but what do you do with it? What matters is the model’s practical use. Technical quality is just one part of that; the other is business or social impact.

Fix: Finish the project with a recommendation of what to do. For example, “This model suggests prioritizing inspections in restaurants serving high-risk cuisines during winter.”

 

Project Example: Forecasting City Energy Demand to Cut Costs

 
In this section, I’ll create a mock project walkthrough to show you how the framework can be used in practice.

Domain: The domain I picked is energy consumption and sustainability. Living in a big city made me aware of how cities worldwide struggle with high electricity demand during peak hours. Forecasting demand more accurately can help utilities balance the grid, reduce costs, and cut emissions.

Data: The main source could be the U.S. Energy Information Administration (EIA). In addition, I could use the NOAA Weather API (e.g., for temperature and humidity), and holiday/event calendars (for spikes in demand).

Framing the Problem: Instead of framing the problem as “Predict electricity demand over time”, I’ll frame it as “How much money could the city save if it shifted peak loads using better demand forecasts?” With that, I turn a technical forecasting problem into a resource allocation and cost-saving problem.

Building End-to-End: The project would include these stages.

  1. Data Cleaning: Handle missing hours, align timestamps, normalize weather variables.
  2. Feature Engineering:
    • Lag features: demand in previous hours/days
    • Weather features: temperature, humidity
    • Calendar features: weekday, holiday flag, major events
  3. Modeling:
  4. Deployment: For example, I could create a dashboard showing 24-hour forecast vs. actual demand and simulate “what if” scenarios, e.g., adjusting demand by shifting industrial loads.
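
The cleaning and feature engineering stages above can be sketched in a few lines of standard-library Python. Everything here is an assumption made for illustration: the hourly readings, the forward-fill strategy for missing hours, the one-day holiday calendar, and the function names. Weather features are left out to keep the example short.

```python
from datetime import datetime, timedelta

# Hypothetical hourly demand readings (MW); the 02:00 reading is missing.
raw = {
    datetime(2024, 7, 4, 0): 310.0,
    datetime(2024, 7, 4, 1): 295.0,
    datetime(2024, 7, 4, 3): 285.0,
    datetime(2024, 7, 4, 4): 300.0,
}
HOLIDAYS = {datetime(2024, 7, 4).date()}  # assumed holiday calendar


def fill_missing_hours(series):
    """Reindex to a complete hourly grid, carrying the last known value forward."""
    start, end = min(series), max(series)
    filled, last, t = {}, None, start
    while t <= end:
        last = series.get(t, last)
        filled[t] = last
        t += timedelta(hours=1)
    return filled


def build_features(series, lags=(1, 2)):
    """Return one feature row per hour that has all requested lags available."""
    rows = []
    for t, demand in sorted(series.items()):
        lagged = {f"lag_{k}h": series.get(t - timedelta(hours=k)) for k in lags}
        if None in lagged.values():
            continue  # not enough history yet for this hour
        rows.append(
            {
                "timestamp": t,
                "demand": demand,
                **lagged,
                "weekday": t.weekday(),  # Monday is 0
                "is_holiday": t.date() in HOLIDAYS,
            }
        )
    return rows


features = build_features(fill_missing_hours(raw))
for row in features:
    print(row)
```

Forward fill is a simple default. For longer gaps, interpolation or dropping the affected days may be more defensible, and that choice is worth explaining in the project.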

Action: We won’t stop at “the forecast has low RMSE”. Instead, let’s give a recommendation that has business and social impact, e.g., “If the city incentivized large businesses to shift 5% of consumption away from peak hours (predicted by the model), it could save $3.5M annually in grid costs.”
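
A recommendation like this should come from a "what if" calculation, not just a sentence. The following sketch shifts 5% of peak-hour load to off-peak hours and compares costs. The forecast profile, the peak window, and both prices are invented for illustration, so the result does not reproduce the $3.5M figure above.

```python
PEAK_HOURS = set(range(17, 21))  # assumed peak window, 17:00 to 20:59
PEAK_PRICE = 120.0               # assumed grid cost in $/MWh during peak hours
OFFPEAK_PRICE = 50.0             # assumed grid cost in $/MWh otherwise


def daily_cost(profile):
    """Cost of a 24-value hourly load profile (MWh) under the two-tier prices."""
    return sum(
        load * (PEAK_PRICE if hour in PEAK_HOURS else OFFPEAK_PRICE)
        for hour, load in enumerate(profile)
    )


def shift_peak_load(profile, fraction):
    """Move a fraction of each peak hour's load evenly onto the off-peak hours."""
    shifted = list(profile)
    moved = 0.0
    for hour in PEAK_HOURS:
        delta = profile[hour] * fraction
        shifted[hour] -= delta
        moved += delta
    offpeak_hours = [h for h in range(len(profile)) if h not in PEAK_HOURS]
    for hour in offpeak_hours:
        shifted[hour] += moved / len(offpeak_hours)
    return shifted


# Hypothetical 24-hour forecast: flat 600 MWh with a 1,000 MWh evening peak.
forecast = [600.0] * 17 + [1000.0] * 4 + [600.0] * 3
scenario = shift_peak_load(forecast, 0.05)

daily_savings = daily_cost(forecast) - daily_cost(scenario)
annual_savings = daily_savings * 365  # assumes every day looks like this one
print(f"Daily savings: ${daily_savings:,.0f}, annualized: ${annual_savings:,.0f}")
```

In a real project the forecast would come from the model and the prices from utility tariffs, and the same function could drive the dashboard’s scenario slider.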

 

Bonus: Resources

 
As a bonus, here are some suggestions on what platforms you can use for practice and where to find the data.

 

// Platforms for Practicing

 

// Open Data Sources

 

// APIs for Real-Time Data

 

Conclusion

 
You probably noticed that none of the mistakes mentioned are technical. That’s not accidental; the biggest mistake is forgetting that a portfolio is a demonstration of how you solve problems.

Focus on those two aspects — demonstration and problem-solving — and your portfolio will finally start looking like proof you can do the job.
 
 

Nate Rosidi is a data scientist who also works in product strategy. He’s an adjunct professor teaching analytics and the founder of StrataScratch, a platform that helps data scientists prepare for interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.


