Data scientist interview questions

picked.ai/hire/data-scientist/interview-questions

30 data scientist interview questions that actually work.

Pulled from the Neuroworx item bank: nine years of calibration against twelve-month performance outcomes on 14,083 data scientists. Sorted by stage (screen, assessment, on-site) and level (IC1 to IC5). Each question comes with what to listen for, what to ignore, and the failure mode it is designed to catch.

30 questions · 3 stages · 5 levels · 14k hires of validity data

Stage 01 · Screen

Twelve minutes. Ten questions.

The screening conversation. Picked runs this with an AI voice; this is what a human screen would look like with the same rubric. Time-box hard. 60 seconds per answer.

10 questions

Walk me through the last model you shipped end-to-end. From problem framing to monitoring.

scope · specificity

Listen for

A specific problem, a specific baseline, the post-launch behaviour. How they knew it was working.

Ignore

A list of algorithms. We want the shape of the work.

catches · Candidates who can describe the notebook but not the deployment.

Tell me about a project where you decided not to build a model.

judgement

Listen for

A specific project. The moment in EDA when the answer became obvious. The conversation with the sponsor.

Ignore

"Sometimes a heuristic is enough." Generic.

catches · Candidates whose every problem becomes a model.

What baseline did you beat on your last project, and by how much?

evaluation honesty

Listen for

A specific baseline. A specific delta. Honesty about whether the delta was worth the complexity.

Ignore

"My model performed well." Without a number, noise.

catches · Candidates who never run a baseline.
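As a point of reference for what "a specific baseline, a specific delta" can look like, here is a minimal sketch. It assumes a binary classification task and a majority-class baseline; the labels, predictions, and function names are hypothetical and only illustrate the shape of the answer.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most common training label for every one of n test rows."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


# Hypothetical labels and model predictions, for illustration only.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
model_pred = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]

baseline_pred = majority_baseline(y_train, len(y_test))
baseline_acc = accuracy(y_test, baseline_pred)
model_acc = accuracy(y_test, model_pred)
delta = model_acc - baseline_acc

print(f"baseline={baseline_acc:.2f} model={model_acc:.2f} delta={delta:+.2f}")
```

A strong answer states all three numbers and then says whether the delta justified the extra complexity.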

Tell me about an EDA that killed a project.

curiosity

Listen for

A specific finding. The early signal that mattered. The sponsor conversation.

Ignore

A pure exploration story with no decision.

catches · Candidates who treat EDA as a warm-up.

Describe the last time you had to communicate uncertainty to a non-analyst.

comms · honesty

Listen for

A specific stakeholder. The words they actually used. Whether the stakeholder changed their decision.

Ignore

"I always show error bars." Generic.

catches · Candidates whose only language is point estimates.

What is a model in production at your last company that drifted, and what did you do?

operability · reality

Listen for

A specific model. The first signal of drift. The investigation. What changed.

Ignore

Candidates who never owned anything in production.

catches · Candidates who think the job ends at the notebook.

What is one paper or technique you tried recently and rejected?

taste

Listen for

A real evaluation. A real reason to reject. What they used instead.

Ignore

Papers they are "exploring".

catches · Candidates whose technique stack is just whatever is fashionable.

How do you choose between two models that perform similarly?

taste · judgement

Listen for

Inference cost, interpretability, the maintenance load, the failure mode the simpler one has. A real tie-breaker.

Ignore

"It depends." A non-answer.

catches · Candidates who default to the more complex option.

How do you onboard onto a new modelling codebase?

generality

Listen for

A sequence. Run the training script, find the most recent eval, find the engineer who wrote it, ask one stupid question.

Ignore

"I read the docs first."

catches · Candidates who freeze without a tutorial.

Name one thing you want from your next role, without which you would not have applied.

stage fit

Listen for

A specific something. A specific problem domain. A specific scale.

Ignore

"Impact." Vague.

catches · Candidates unsure why they are looking.

Stage 02 · Role-fit assessment

A scoped task. A scored rubric.

One realistic task. We score the writeup, not the polish. The candidate has the take-home equivalent of 60 minutes.

8 questions

Here is a one-paragraph product problem. Write the simplest baseline you would ship before any model. Justify in 100 words.

judgement · IC2+

Listen for

A baseline that is actually shippable. A justification that names the cost of the model.

Ignore

A baseline that is itself a model.

catches · Candidates who cannot resist building the model first.

Pick a metric to optimise for the product problem in question 1. Defend it in three sentences.

evaluation honesty · IC2+

Listen for

A metric tied to the business outcome, not the easy one. They name the failure mode of the metric.

Ignore

AUC because AUC.

catches · Candidates who pick the metric the loss function makes easy.

Here is a sample dataset with a known data leak. Find it. Write the post-mortem in three paragraphs.

craft · IC2+

Listen for

How they reach the leak. The features they audit first. A clean writeup.

Ignore

A leak hunt that becomes a feature engineering essay.

catches · Candidates who trust the columns.
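One way a candidate might start the audit is a single-feature screen: flag any column that predicts the target almost perfectly on its own. The sketch below is an illustration under stated assumptions, not the rubric's method. The column names, the churn framing, and the 0.95 threshold are all hypothetical, and a correlation screen only catches the bluntest leaks.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def suspicious_features(columns, target, threshold=0.95):
    """Return (name, correlation) for features that look too good to be true."""
    flagged = []
    for name, values in columns.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            flagged.append((name, round(r, 3)))
    return flagged


# Hypothetical churn dataset, for illustration only.
target = [0, 1, 0, 1, 1, 0]
columns = {
    "tenure_months": [24, 3, 18, 12, 30, 9],
    "support_tickets": [1, 4, 0, 3, 5, 2],
    "refund_issued": [0, 1, 0, 1, 1, 0],  # recorded after churn, so it leaks the label
}

flagged = suspicious_features(columns, target)
print(flagged)
```

A flagged column is a prompt to ask when the value was recorded relative to the label, which is the question the post-mortem should answer.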

Write the one-paragraph summary you would send the PM after running the baseline from question 1.

comms · IC2+

Listen for

A paragraph the PM acts on. A recommendation. Honesty about the confidence interval.

Ignore

Five-paragraph emails.

catches · Candidates who cannot edit themselves down.

Read this 3-page modelling design doc. Write three questions for the author and one push-back.

judgement · IC3+

Listen for

Questions that show they read the doc. A push-back that engages with the modelling choice, not the writing.

Ignore

Style edits.

catches · Candidates who cannot engage with someone else's design.

A PM asks for "an ML feature" with no clear problem. Write the email you would send back.

question-shaping · IC2+

Listen for

A reframe that finds the underlying decision. A specific next step.

Ignore

"Happy to help, let's schedule a call."

catches · Candidates who say yes to vague asks.

Sketch a monitoring plan for the model in question 1. Three signals you would watch and the action threshold for each.

operability · IC3+

Listen for

Specific signals (input drift, label availability, calibration). A real action threshold.

Ignore

"Monitor performance." Generic.

catches · Candidates who cannot imagine a model after launch.
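To make "three signals and an action threshold for each" concrete, here is a minimal sketch of the three signals named above: input drift (measured here with a population stability index), label availability, and calibration. The thresholds, bin edges, sample values, and function names are assumptions chosen for illustration. The PSI cut-off of 0.2 is a common rule of thumb, not a value from the item bank.

```python
import math


def psi(expected, actual, edges):
    """Population stability index between two samples over fixed bin edges."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        # Floor each share so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def label_coverage(n_predictions, n_labels):
    """Share of predictions whose ground-truth label has arrived."""
    return n_labels / n_predictions if n_predictions else 0.0


def calibration_gap(probs, labels):
    """Absolute gap between mean predicted probability and observed rate."""
    return abs(sum(probs) / len(probs) - sum(labels) / len(labels))


# Hypothetical action thresholds; a real plan would justify each one.
THRESHOLDS = {"psi": 0.2, "label_coverage": 0.8, "calibration_gap": 0.05}


def check(signals):
    """Map each breached threshold to the action it triggers."""
    alerts = []
    if signals["psi"] > THRESHOLDS["psi"]:
        alerts.append("input drift: audit upstream features")
    if signals["label_coverage"] < THRESHOLDS["label_coverage"]:
        alerts.append("labels late: treat offline metrics as stale")
    if signals["calibration_gap"] > THRESHOLDS["calibration_gap"]:
        alerts.append("miscalibrated: recalibrate or retrain")
    return alerts


# Hypothetical training and live samples of one input feature.
train = [1, 2, 3, 4, 5, 6, 7, 8]
live = [5, 6, 7, 8, 9, 10, 11, 12]
signals = {
    "psi": psi(train, live, edges=[4, 8]),
    "label_coverage": label_coverage(1000, 650),
    "calibration_gap": calibration_gap([0.9, 0.8, 0.7, 0.9], [1, 0, 0, 1]),
}
alerts = check(signals)
print(alerts)
```

The point of the exercise is the pairing: each signal has a number that triggers it and a named action that follows.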

In 200 words: why might the modelling approach you proposed in question 1 be the wrong choice?

humility · IC4+

Listen for

A real engagement with the alternative. A specific failure mode.

Ignore

A second pitch for the original approach.

catches · Candidates who cannot question their own design.

Stage 03 · On-site (after Picked)

Twelve questions you will still want to ask in person.

Picked screens, scores, and shortlists. These are the questions worth asking with a human in the room: the calibration questions, the dealbreakers, the chemistry probes.

12 questions

Where, in modelling, do you want to grow most this year?

growth

Listen for

A specific gap. A plan. A name of someone they would learn from.

Ignore

"I want to be a staff scientist." Title-laddering.

catches · Candidates without a learning agenda.

Tell me about a time you disagreed with an engineer on a feature contract.

authority · cross-team

Listen for

A real disagreement. The mechanics. How it resolved.

Ignore

"I always defer to engineering." Suspicious.

catches · Candidates who cannot hold a position across function lines.

What is the most uncomfortable feedback you have received on a model?

self-awareness

Listen for

A specific piece. The change they made.

Ignore

"I take feedback well."

catches · Defended self-narrative.

Walk me through a project you wish you had killed earlier.

judgement

Listen for

A specific moment they could have called it. What stopped them.

Ignore

A pitch for the project being secretly worth it.

catches · Sunk-cost scientists.

What is a strong opinion you have recently changed about ML practice?

intellectual humility

Listen for

A specific opinion. A specific reason.

Ignore

"My mind is always open."

catches · Closed-loop thinkers.

Pick two scientists you admire from your last role. What do they do differently?

taste

Listen for

Concrete habits. Habits adopted. Habits not.

Ignore

Pure praise.

catches · Candidates without taste for other scientists.

Tell me the last paper or talk you read outside your direct work.

curiosity

Listen for

A specific paper. What they thought of it.

Ignore

A textbook they always mean to get to.

catches · Candidates who do not read.

When are you most productive?

operating model

Listen for

A specific time-of-day. A self-aware answer about energy.

Ignore

"I am always productive."

catches · Candidates without self-instrumentation.

Where would you want to be in three years?

career · retention

Listen for

A direction (deeper IC vs research vs leadership) and a reason.

Ignore

"Wherever the company needs me."

catches · Drifting candidates.

If you join, what would you want to spend your first week doing?

agency · onboarding

Listen for

A specific plan. Often: read the last production post-mortem, run the training pipeline, ship a 50-line PR.

Ignore

"Whatever you suggest."

catches · Candidates without onboarding instinct.

What is the thing that would make you leave us within six months?

dealbreaker

Listen for

A specific irritant.

Ignore

"As long as the work is good."

catches · Hidden dealbreakers.

What would you want to ask our most cynical engineer about our model?

probing · curiosity

Listen for

A real question. "Where do you not trust the input?"

Ignore

A softball.

catches · Candidates who do not want to know what is wrong.

The anti-pattern set

Eight questions that look smart but tell you nothing.

"What is your biggest weakness?"

You will get a strength-shaped weakness. We have asked this 47,000 times. It catches no-one. Replace with: "What is the most uncomfortable feedback you have received?"

"Where do you see yourself in five years?"

Either a rehearsed answer or a stalled one. Both useless. Replace with: "Where would you want to be in three years?"

"Tell me about yourself."

Wastes the first three minutes on the CV they already gave you. Replace with: "Walk me through the most recent thing you shipped end-to-end."

"Why this company?"

Generates polished mission-talk. Replace with: "What about this role made you apply that would not have made you apply elsewhere?"

"Are you a team player?"

No-one says no. Replace with: "Tell me about a time a teammate disagreed with you and how you handled it."

"How do you handle stress?"

No-one says badly. Replace with: "Tell me about your last production incident and your precise role."

"How would you reverse a linked list?"

Probes nothing we care about. We removed it from the bank in 2019. Replace with: "Refactor this 200-line file and tell me what you changed and why."

"If you were an animal, which animal would you be?"

You know what we are going to say. Replace with: anything else.

Or, let us ask

We will ask these for you. By Friday.

Picked runs the screen, the assessment, and the first-round interview against this exact item bank. You meet the three finalists in person, with these on-site questions in hand.

$0.99 per AI-vetted candidate. First 50 free.
