picked.ai/hire/data-scientist/interview-questions
30 data scientist
interview questions that actually work.
Pulled from the Neuroworx item bank: nine years of calibration against twelve-month performance outcomes on 14,083 data scientists. Sorted by stage (screen, assessment, on-site) and level (IC1 to IC5). Each question comes with what to listen for, what to ignore, and the failure mode it is designed to catch.
30
questions
4
stages
5
levels
14k
hires of validity data
ScreenRole-fitOn-siteAnti-pattern questions
Stage 01 · Screen
Twelve minutes. Ten questions.
The screening conversation. Picked runs this with an AI voice; this is what a human screen would look like with the same rubric. Time-box hard. 60 seconds per answer.
10 questions
01
Walk me through the last model you shipped end-to-end. From problem framing to monitoring.
scopespecificity
Listen for
A specific problem, a specific baseline, the post-launch behaviour. How they knew it was working.
Ignore
A list of algorithms. We want the shape of the work.
catches · Candidates who can describe the notebook but not the deployment.
02
Tell me about a project where you decided not to build a model.
judgement
Listen for
A specific project. The moment in EDA when the answer became obvious. The conversation with the sponsor.
Ignore
"Sometimes a heuristic is enough." Generic.
catches · Candidates whose every problem becomes a model.
03
What baseline did you beat on your last project, and by how much?
evaluation honesty
Listen for
A specific baseline. A specific delta. Honesty about whether the delta was worth the complexity.
Ignore
"My model performed well." Without a number, noise.
catches · Candidates who never run a baseline.
04
Tell me about an EDA that killed a project.
curiosity
Listen for
A specific finding. The early signal that mattered. The sponsor conversation.
Ignore
A pure exploration story with no decision.
catches · Candidates who treat EDA as a warm-up.
05
Describe the last time you had to communicate uncertainty to a non-analyst.
commshonesty
Listen for
A specific stakeholder. The words they actually used. Whether the stakeholder changed their decision.
Ignore
"I always show error bars." Generic.
catches · Candidates whose only language is point estimates.
06
What is a model in production at your last company that drifted, and what did you do?
operabilityreality
Listen for
A specific model. The first signal of drift. The investigation. What changed.
Ignore
Candidates who never owned anything in production.
catches · Candidates who think the job ends at the notebook.
07
What is one paper or technique you tried recently and rejected?
taste
Listen for
A real evaluation. A real reason to reject. What they used instead.
Ignore
Papers they are "exploring".
catches · Candidates whose technique stack is just whatever is fashionable.
08
How do you choose between two models that perform similarly?
tastejudgement
Listen for
Inference cost, interpretability, the maintenance load, the failure mode the simpler one has. A real tie-breaker.
Ignore
"It depends." A non-answer.
catches · Candidates who default to the more complex option.
09
How do you onboard onto a new modelling codebase?
generality
Listen for
A sequence. Run the training script, find the most recent eval, find the engineer who wrote it, ask one stupid question.
Ignore
"I read the docs first."
catches · Candidates who freeze without a tutorial.
10
One thing you want from the next role you would not have applied for if not.
stage fit
Listen for
A specific something. A specific problem domain. A specific scale.
Ignore
"Impact." Vague.
catches · Candidates unsure why they are looking.
Stage 02 · Role-fit assessment
A scoped task. A scored rubric.
One realistic task. We score the writeup, not the polish. The candidate has the take-home equivalent of 60 minutes.
8 questions
01
Here is a one-paragraph product problem. Write the simplest baseline you would ship before any model. Justify in 100 words.
judgementIC2+
Listen for
A baseline that is actually shippable. A justification that names the cost of the model.
Ignore
A baseline that is itself a model.
catches · Candidates who cannot resist building the model first.
02
Pick a metric to optimise for the product problem in question 1. Defend it in three sentences.
evaluation honestyIC2+
Listen for
A metric tied to the business outcome, not the easy one. They name the failure mode of the metric.
Ignore
AUC because AUC.
catches · Candidates who pick the metric the loss function makes easy.
03
Here is a sample dataset with a known data leak. Find it. Write the post-mortem in three paragraphs.
craftIC2+
Listen for
How they reach the leak. The features they audit first. A clean writeup.
Ignore
A leak hunt that becomes a feature engineering essay.
catches · Candidates who trust the columns.
04
Write the one-paragraph summary you would send the PM after running the baseline from question 1.
commsIC2+
Listen for
A paragraph the PM acts on. A recommendation. Honesty about the confidence interval.
Ignore
Five-paragraph emails.
catches · Candidates who cannot edit themselves down.
05
Read this 3-page modelling design doc. Write three questions for the author and one push-back.
judgementIC3+
Listen for
Questions that show they read the doc. A push-back that engages with the modelling choice, not the writing.
Ignore
Style edits.
catches · Candidates who cannot engage with someone else's design.
06
A PM asks for "an ML feature" with no clear problem. Write the email you would send back.
question-shapingIC2+
Listen for
A reframe that finds the underlying decision. A specific next step.
Ignore
"Happy to help, lets schedule a call."
catches · Candidates who say yes to vague asks.
07
Sketch a monitoring plan for the model in question 1. Three signals you would watch and the action threshold for each.
operabilityIC3+
Listen for
Specific signals (input drift, label availability, calibration). A real action threshold.
Ignore
"Monitor performance." Generic.
catches · Candidates who cannot imagine a model after launch.
08
In 200 words: why might the modelling approach you proposed in question 1 be the wrong choice?
humilityIC4+
Listen for
A real engagement with the alternative. A specific failure mode.
Ignore
A second pitch for the original approach.
catches · Candidates who cannot question their own design.
Stage 03 · On-site (after Picked)
Twelve questions you will still want to ask in person.
Picked screens, scores, and shortlists. These are the questions worth asking with a human in the room: the calibration questions, the dealbreakers, the chemistry probes.
12 questions
01
Where, in modelling, do you want to grow most this year?
growth
Listen for
A specific gap. A plan. A name of someone they would learn from.
Ignore
"I want to be a staff scientist." Title-laddering.
catches · Candidates without a learning agenda.
02
Tell me about a time you disagreed with an engineer on a feature contract.
authoritycross-team
Listen for
A real disagreement. The mechanics. How it resolved.
Ignore
"I always defer to engineering." Suspicious.
catches · Candidates who cannot hold a position across function lines.
03
What is the most uncomfortable feedback you have received on a model?
self-awareness
Listen for
A specific piece. The change they made.
Ignore
"I take feedback well."
catches · Defended self-narrative.
04
Walk me through a project you wish you had killed earlier.
judgement
Listen for
A specific moment they could have called it. What stopped them.
Ignore
A pitch for the project being secretly worth it.
catches · Sunk-cost scientists.
05
What is a strong opinion you have recently changed about ML practice?
intellectual humility
Listen for
A specific opinion. A specific reason.
Ignore
"My mind is always open."
catches · Closed-loop thinkers.
06
Pick two scientists you admire from your last role. What do they do differently?
taste
Listen for
Concrete habits. Habits adopted. Habits not.
Ignore
Pure praise.
catches · Candidates without taste for other scientists.
07
Tell me the last paper or talk you read outside your direct work.
curiosity
Listen for
A specific paper. What they thought of it.
Ignore
A textbook they always mean to get to.
catches · Candidates who do not read.
08
When are you most productive?
operating model
Listen for
A specific time-of-day. A self-aware answer about energy.
Ignore
"I am always productive."
catches · Candidates without self-instrumentation.
09
Where would you rather be in three years?
careerretention
Listen for
A direction (deeper IC vs research vs leadership) and a reason.
Ignore
"Wherever the company needs me."
catches · Drifting candidates.
10
If you join, what would you want to spend your first week doing?
agencyonboarding
Listen for
A specific plan. Often: read the last production post-mortem, run the training pipeline, ship a 50-line PR.
Ignore
"Whatever you suggest."
catches · Candidates without onboarding instinct.
11
What is the thing that would make you leave us within six months?
dealbreaker
Listen for
A specific irritant.
Ignore
"As long as the work is good."
catches · Hidden dealbreakers.
12
What would you want to ask our most cynical engineer about our model?
probingcuriosity
Listen for
A real question. "Where do you not trust the input?"
Ignore
A softball.
catches · Candidates who do not want to know what is wrong.
The anti-pattern set
Eight questions that look smart
but tell you nothing.
"What is your biggest weakness?"
You will get a strength-shaped weakness. We have asked this 47,000 times. It catches no-one. Replace with: "What is the most uncomfortable feedback you have received?".
"Where do you see yourself in five years?"
Either a rehearsed answer or a stalled one. Both useless. Replace with: "Where would you want to be in three years?"
"Tell me about yourself."
Wastes the first three minutes on the CV they already gave you. Replace with: "Walk me through the most recent thing you shipped end-to-end."
"Why this company?"
Generates polished mission-talk. Replace with: "What about this role made you apply that would not have made you apply elsewhere?"
"Are you a team player?"
No-one says no. Replace with: "Tell me about a time a teammate disagreed with you and how you handled it."
"How do you handle stress?"
No-one says badly. Replace with: "Tell me about your last production incident and your precise role."
"How would you reverse a linked list?"
Probes nothing we care about. We removed it from the bank in 2019. Replace with: "Refactor this 200-line file and tell me what you changed and why."
"If you were an animal, which animal would you be?"
You know what we are going to say. Replace with: anything else.
Or, let us ask
We will ask these for you.
By Friday.
Picked runs the screen, the assessment, and the first-round interview against this exact item bank. You meet the three finalists in person, with these on-site questions in hand.
$0.99 per AI-vetted candidate. First 50 free.
Data scientist interview questions · Picked.ai