Playbook

The Engineering hiring playbook.

The funnel, the rubric, the screen, the assessment, the interview, the decision, and the offer, for engineering roles. Written for hiring managers who own at least one engineering role and want it filled in 30 days.

Picked Team · 2 December 2026 · 38 min read · 14 printed pages · Engineering

Contents · 12 sections

01  Why engineering is the hardest funnel.  ~3 min
02  The funnel, end to end.  ~3 min
03  The role brief.  ~4 min
04  The rubric.  ~4 min
05  Sourcing.  ~3 min
06  Screening and assessment.  ~5 min
07  The behavioural interview.  ~4 min
08  Code samples and take-homes.  ~3 min
09  Reading the three finalists.  ~4 min
10  The on-site half-day and the offer.  ~4 min
11  The first thirty days.  ~3 min
12  TL;DR and a one-page checklist.  ~2 min

Section 01

~3 min

Why engineering is the hardest funnel.

Engineering hiring is the hardest funnel in the company. Three reasons, none of which are negotiable.

One, the supply is loud. A backend role at a 200-person company gets 250 to 400 applicants in the first ten days. The signal in any single CV is low; the noise across the stack is high. Manual triage at this volume is theatre.

Two, the assessment is contested. Every approach has a credible counter-argument. Take-homes select for free time on a Sunday. LeetCode selects for interview prep, not engineering judgement. Pair-programming on the call selects for performance under pressure. The community is not aligned on what good looks like.

Three, the wrong hire is expensive. An engineering misfire at a 30-person company is two to three months of wasted runway plus the team morale tax. Recovering from a bad senior hire takes a quarter. Recovering from a bad tech lead takes a year.

What this playbook does.

This playbook is a step-by-step method for running an engineering hire end-to-end with Picked. It assumes you have one role open, a working role brief in your head, and about six hours of your own time to invest across the next 30 days. It does not assume you have a recruiter, an ATS, or a previous engineering hire under your belt.

Read it once end-to-end. Then come back to the section you need on the day you need it.

If you skip ahead, skip to section 04 (The rubric) first. The rest of the playbook only works if the rubric is right.

Section 02

~3 min

The funnel, end to end.

The engineering funnel has eight stages. Picked owns six of them. You own two: write the role brief, and make the offer.

  Stage                         Owner         Typical drop-off
  --------------------------------------------------------------
  01  Role brief and rubric     You           n/a
  02  Posting and syndication   Picked        n/a
  03  Triage (intake)           Picked        ~40% pass
  04  AI screen (voice)         Picked        ~35% pass
  05  Role-fit assessment       Picked        ~25% pass
  06  Behavioural interview     Picked        ~20% pass
  07  Three finalists arrive    Picked → You  3 vetted
  08  On-site half-day + offer  You           1 of 3 typical

For a typical backend role you should expect 247 applicants to produce 3 vetted finalists, of whom 1 receives an offer. The headline number is 1 hire per 247 applicants; the operational number is 1 hire per 25 vetted candidates (the 25 who clear triage and reach the AI screen). The unit cost at $0.99 per AI-vetted candidate is about $25 of Picked spend per role.

  247    typical applicants per role
  25     reach the AI screen
  3      finalists you read
  ~$25   Picked spend per hire
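
The headline figures can be checked with a few lines of arithmetic. This is a minimal sketch that only restates the numbers quoted in this section; the constant and function names are illustrative and are not part of any Picked API.

```python
# Figures quoted in this section. Names are illustrative assumptions.
APPLICANTS = 247          # typical applicants per role
AI_VETTED = 25            # candidates who reach the AI screen
FINALISTS = 3             # finalist cards you read
HIRES = 1                 # offers made per role
PRICE_PER_VETTED = 0.99   # USD per AI-vetted candidate


def spend_per_role(vetted: int, price: float = PRICE_PER_VETTED) -> float:
    """Picked spend in USD for one role, rounded to cents."""
    return round(vetted * price, 2)


def per_hire(count: int, hires: int = HIRES) -> float:
    """How many of `count` it takes to produce one hire."""
    return count / hires


if __name__ == "__main__":
    print(spend_per_role(AI_VETTED))   # 24.75, i.e. about $25
    print(per_hire(APPLICANTS))        # 247.0 applicants per hire
    print(per_hire(AI_VETTED))         # 25.0 vetted candidates per hire
```

Swap in your own applicant and vetted counts to estimate spend for a role that runs hotter or colder than the typical case.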

The two stages you own are stage 01 and stage 08. Everything in between runs without manager input. Spend your time well on the two stages you own; the playbook walks each in detail.

Pin this section. Every other section in the playbook maps back to one of these eight stages.

Section 03

~4 min

The role brief.

The role brief is the load-bearing artefact of the entire funnel. If the brief is vague, the rubric is vague, the AI screen is vague, the assessment is vague, the interview is vague, and the three finalists are not a match. The brief is also the cheapest place to invest your judgement; one hour here saves twelve hours of bad interviews later.

What a good brief contains.

The role title in plain English. "Senior backend engineer" beats "Software engineer III". The title drives the role family and the default rubric.
The first one or two systems the engineer will own end to end. Name them. "The notifications pipeline (~80M events per day)" beats "high-scale systems".
The team shape. Number of engineers, who else writes code, who they will report to, who they will pair with.
The deploy and review cadence. Daily deploys? CI on every commit? Release branches? Trunk based?
Three to five real problems the engineer will work on in their first 60 days. Specific, not aspirational.
Compensation band. Post the band. We mean it; if you cannot post the band the brief is not finished.
Location policy. Remote, hybrid, on-site, and the city if hybrid or on-site.
Must-haves. Three to five things, no more. Each must-have is a yes-or-no question, not a preference.
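
If you keep briefs in a repo or a script, the eight fields above map onto a small record with a completeness check. This is a sketch under stated assumptions: the class name, field names, and the check method are hypothetical, and Picked's own brief format is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoleBrief:
    """The eight brief fields from this section. All names are hypothetical."""
    title: str
    owned_systems: List[str]
    team_shape: str
    deploy_cadence: str
    first_60_day_problems: List[str]
    compensation_band: str
    location_policy: str
    must_haves: List[str]

    def unfinished_reasons(self) -> List[str]:
        """Return why the brief is not finished; an empty list means ready."""
        reasons = []
        if not self.title.strip():
            reasons.append("give the role a plain-English title")
        if not 1 <= len(self.owned_systems) <= 2:
            reasons.append("name the first one or two systems")
        if not 3 <= len(self.first_60_day_problems) <= 5:
            reasons.append("list three to five real problems")
        if not self.compensation_band.strip():
            reasons.append("post the compensation band")
        if not 3 <= len(self.must_haves) <= 5:
            reasons.append("keep must-haves to three to five")
        return reasons
```

The check only counts and looks for blanks; it cannot tell whether a problem is specific or aspirational. That part is still the readback with a teammate.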

What a good brief does not contain.

A laundry list of technologies. "TypeScript, React, Node, Postgres, Redis, Kafka, k8s, AWS, Terraform" tells the candidate nothing about the work. Pick the two that matter and explain why.
Hype vocabulary. "Rockstar", "10x", "ninja", "guru", "passionate" do not survive contact with the rubric. Cut them.
A degree requirement. We do not select on credential. The rubric does not see the school.
A "minimum years of experience" floor. Replace it with a competency signal in the must-haves.

The brief is editable after posting. Edits to the brief regenerate the rubric proposal; you can accept the regenerated rubric or keep the prior one. Re-editing the brief mid-funnel is fine, but every edit invalidates a small percentage of the candidates already screened, so try to get the brief right before you publish.

Spend 45 minutes on the brief. Get a teammate or a friend to read it back to you in plain English. If the readback does not match the role in your head, the brief is not done.

Section 04

~4 min

The rubric.

The rubric is what the system scores on. Get it right and every downstream artefact gets it right too. Get it wrong and you will spend the next month interviewing on the wrong dimensions.

The default engineering rubric.

Picked proposes a default rubric for every engineering role family. The default rubric for a senior backend engineer scores on five competencies, weighted as below. You can accept the default, tune the weights, or replace any competency with one of your own.

Systems thinking (25%). Comfort with eventual consistency, capacity reasoning, failure modes. Not jargon.
Pragmatism (20%). Reaches for the boring, working answer first. Aware of trade-offs they made.
Ownership (20%). Has fought for reliability when no-one asked. Has owned a thing past v1.
Communication (20%). Can write a one-paragraph explanation of a complex bug. Can read code the same way.
Curiosity (15%). Has a side-investigation. Reads outside their stack. Asks one good question per interview.

Tuning the rubric for your role.

Three places where the default is almost always wrong for a specific role.

One, if the role is a first-three-engineers role at an early-stage company, raise Pragmatism to 30% and lower Systems thinking to 15%. The constraint is shipping the boring, working answer, not designing for capacity you do not have yet.

Two, if the role is a tech lead or staff role, raise Communication to 30% and lower Curiosity to 5%. The cost of a tech lead who cannot explain their reasoning to the team is higher than the cost of one who does not read outside their stack.

Three, if the role is a specialist (ML, security, devops), add a sixth competency for the specialist domain, weight it at 25%, and lower the other five proportionally. The default rubric is for generalists; specialists need the specialist dimension explicit.
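
The specialist adjustment is the one tuning move that involves arithmetic, so here is a worked sketch. It assumes the default weights listed above and assumes each competency is scored out of 100; the function names are illustrative, not a Picked API.

```python
# Default senior backend weights from this section, in percent.
DEFAULT_RUBRIC = {
    "Systems thinking": 25,
    "Pragmatism": 20,
    "Ownership": 20,
    "Communication": 20,
    "Curiosity": 15,
}


def add_specialist(weights, name, specialist_weight=25.0):
    """Add a competency at `specialist_weight` percent and scale the
    existing weights proportionally so the total stays at 100."""
    scale = (100.0 - specialist_weight) / sum(weights.values())
    tuned = {competency: w * scale for competency, w in weights.items()}
    tuned[name] = specialist_weight
    return tuned


def weighted_score(weights, scores):
    """Rubric-weighted score, assuming per-competency scores out of 100."""
    total = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total


if __name__ == "__main__":
    security_rubric = add_specialist(DEFAULT_RUBRIC, "Security")
    for competency, weight in security_rubric.items():
        print(f"{competency}: {weight}%")
```

With a 25% specialist competency, the five defaults scale by 0.75: Systems thinking drops to 18.75%, the three 20% competencies to 15%, and Curiosity to 11.25%.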

Do not skip rubric tuning. Five minutes of weight tuning is the difference between three finalists you actually want to meet and three finalists you scroll past.

Section 05

~3 min

Sourcing.

Picked syndicates every role to the major engineering channels by default. You do not need a recruiter to source. Your job is to confirm the channel mix and add the two or three places that only you know about.

The default channel mix.

Public job boards: LinkedIn, Indeed, Otta, Hacker News Who Is Hiring (where applicable to the role and seniority).
Engineering-specific: GitHub Jobs, Stack Overflow Talent (sunset note: replaced by the Reuters board for most regions; we syndicate to the active equivalent), Dev.to, dev community Slack and Discord boards we have partnerships with.
Aggregators: Wellfound, Built In, Authentic Jobs, RemoteOK for remote roles, EU Remote Jobs for EU residency-only roles.
Underground: subreddits where engineers actually post about hiring (r/cscareerquestions, r/ExperiencedDevs, r/forhire), well-known engineering Discord servers, the role-specific LinkedIn groups.

You do not configure any of this. Picked posts on your behalf and mirrors the listing to your own careers page (or hosts it if you do not have one).

What only you know about.

Three sourcing moves that only the hiring manager can make. Do these on the day you post the role; the marginal effort is low and the marginal yield is high.

Post the role yourself on your own LinkedIn and Twitter. Tag your team. Two or three of the strongest applications will come from your own network referrals; treat them as first-class citizens of the funnel.
Email three engineers in your network who are not looking but who would be a fit. Two will say no; one will know someone. Do this once; do not repeat-mail.
Send the role to any underground community you are personally in (a Slack of ex-colleagues, a Discord of practitioners). Picked cannot post to these for you; you can.

A referral applicant goes through the same screen, assessment, and interview as a public applicant. Picked does not have a "back door" for referrals. The pipeline is uniform.

Section 06

~5 min

Screening and assessment.

The AI screen and the role-fit assessment run back-to-back, in the candidate's own time, with no involvement from you. This section explains what runs, what it scores, and what the candidate experiences.

The AI screen.

A 12 to 15 minute voice conversation between the candidate and Picked. Adaptive, role-aware, scored against the rubric in flight. Runs on LiveKit; transcription via OpenAI Whisper; reasoning via Anthropic Claude. The candidate sees a calendar invite within an hour of applying, picks a slot, and runs the screen from any browser or phone.

The screen asks about the candidate's most recent shipped project, the trade-offs they made, the failure mode they did not anticipate, and the smallest thing they would change in retrospect. The questions adapt: a candidate who answers shallowly gets a probe; a candidate who answers deeply gets the next question escalated.

About 35% of triaged applicants pass the AI screen. The rest get a structured response (which competency was below bar, what the score was, an offer to retake in 90 days if circumstances change). No silent rejections.

The role-fit assessment.

Twelve to eighteen items, depending on seniority. Each item is a short forced-choice scenario calibrated against the role family. Time-boxed (about 25 minutes), browser-based, no proctoring software. The item bank is sourced from the Neuroworx psychometric battery (validated against on-the-job performance for over 40,000 hires since 2018).

Example item for a senior backend role: "You have a flaky test that fails 1 in 20 runs. The release is in 36 hours. Your options are (a) skip the test in CI, (b) investigate and fix, (c) revert the change the test covers, (d) hand it off to the on-call engineer to triage. Which do you do, and why?" Scored on the reasoning, not the answer.

About 25% of candidates who pass the AI screen pass the assessment. The 4/5ths fairness check runs on every batch, per protected group, per role family.
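
For readers who have not met the four-fifths rule: as commonly stated, a group's pass rate should be at least 80% of the pass rate of the group that passes most often. The sketch below shows that comparison on one batch. It is an illustration only; the function name, the input shape, and the example counts are assumptions, and how Picked implements its own check is not described here.

```python
def four_fifths_check(passed, total, threshold=0.8):
    """Compare each group's pass rate to the highest group's pass rate.

    `passed` and `total` map group name -> counts for one batch.
    Returns {group: (impact_ratio, meets_threshold)}.
    Groups with no candidates in the batch are skipped.
    """
    rates = {g: passed.get(g, 0) / n for g, n in total.items() if n > 0}
    if not rates:
        return {}
    best = max(rates.values())
    if best == 0:
        # Nobody passed, so no group is favoured over another.
        return {g: (1.0, True) for g in rates}
    return {g: (r / best, r / best >= threshold) for g, r in rates.items()}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    result = four_fifths_check(
        passed={"group_a": 30, "group_b": 20},
        total={"group_a": 100, "group_b": 100},
    )
    for group, (ratio, ok) in result.items():
        print(group, round(ratio, 2), "ok" if ok else "below four-fifths")
```

In the made-up batch, group_b passes at two-thirds of group_a's rate, so it falls below the threshold and the batch would be flagged for review.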

What the candidate experiences.

A single coherent flow: apply, get a calendar invite within an hour, run the screen at the chosen slot, complete the assessment immediately after the screen (or within 48 hours), get a result within 24 hours of the assessment. The candidate never wonders what is happening; the candidate never waits more than 48 hours for a signal.

You do not read screen transcripts unless you want to. The score sheet that arrives with the finalist card includes the relevant transcript spans inline; the full transcript is one click away.

Section 07

~4 min

The behavioural interview.

The behavioural interview is the third gate. A 20 to 25 minute live voice conversation, adaptive, scored against the role rubric. Runs after the assessment; only candidates who pass the assessment get to the interview. About 20% of candidates who reach this stage pass.

What the interview asks.

Five anchor questions, each with adaptive follow-ups. The anchors are the same across every senior backend interview; the follow-ups depend on the candidate's answers and the rubric weights you set in section 04.

Walk me through the most recent system you owned end to end. What was the failure mode you did not anticipate, and what did you change?
Describe a time you disagreed with a more senior engineer on a technical decision. What was the disagreement, what did you do, what was the outcome?
Tell me about a piece of code you wrote that you would write differently today. What changed in your thinking?
You have inherited a service with 8% test coverage and a bug report every other day. You have one sprint. What do you do?
You have shipped a feature that turned out to be the wrong call. Engineering was fine; the call was wrong. How did you spot it, and what did you do next?

No LeetCode, no whiteboard, no pair-programming. The interview is voice-only by default with a video option; we do not weight tone of voice in the score; we do not use facial recognition.

Why this works.

Behavioural signal correlates with on-the-job performance at r=0.49 in our held-out cohorts, against r=0.20 for unstructured interviews and r=0.34 for work-sample tests. The combination of structured anchors plus adaptive probes is the part that lifts the correlation; either alone is weaker.

The interview is scored against the rubric in flight. The candidate sees a transcript and a competency-level score sheet within 24 hours, whether or not they advance. The hiring manager sees the same artefacts on the finalist card.

If you want to add a technical deep-dive, save it for the on-site half-day (section 10). Do not duplicate the behavioural interview with a second one of your own.

Section 08

~3 min

Code samples and take-homes.

Picked does not run take-homes. We have opinions on why; this section is the short version.

Why take-homes select on the wrong thing.

A 6-hour take-home selects for candidates with 6 hours of free time on a Sunday. That set under-represents parents, carers, candidates with second jobs, and candidates who are interviewing at three other companies that week. It is a quiet bias filter applied early and never measured.

A take-home also rewards a different skill from the one you want. Spending two days polishing a side project optimises for thoroughness and presentation. The job you are hiring for usually rewards judgement under constraint and shipping past v1. These are not the same skill.

And take-homes are reviewed badly. A tired engineer with 15 minutes per submission cannot reliably distinguish a good solution from an over-engineered one. The signal-to-noise on the review side matches the signal-to-noise on the submission side; both are low.

When code samples are still useful.

Three cases where asking for a code sample (not a take-home) is worth it. Ask for an existing public sample, not new work. The candidate sends a link.

For a tech-lead or staff role, ask for one public repo or one open-source contribution they want to be associated with. Read it for taste, not feature count.
For a specialist role (ML, security, devops), ask for one piece of writing (a blog post, a conference talk, a GitHub discussion) where they reasoned in public about a domain-specific decision.
For a junior or early-career role, ask for nothing. The signal-to-effort ratio of a code sample at that career stage is the lowest; trust the rubric and the interview.

If your team insists on a take-home, do not lengthen the Picked pipeline; replace the on-site half-day with a paid two-hour pair-programming session and a debrief. That is a higher-fidelity signal than any take-home will ever be.

No take-home is not the same as no code. The on-site half-day in section 10 includes a one-hour pair-programming round if you want one.

Section 09

~4 min

Reading the three finalists.

Three vetted finalists arrive in your inbox on Friday morning. Each one is a single-page finalist card with the score, the evidence, the override path, and the transcripts one click away. Reading three cards takes about 10 minutes. This is the only step of the funnel where your judgement is required and the only step where the playbook cannot make the decision for you.

The finalist card.

Each card has four parts. Read in this order.

The headline. Candidate name, candidate-side public role title, candidate location, role-fit rank (1, 2, or 3), and the rubric-weighted score (out of 100).
The competency breakdown. Five rows, one per competency from the rubric. Each row has the score, the percentile against the role-family bank, and one or two transcript spans that drove the score.
The narrative. Three paragraphs: what this candidate is strong on, what this candidate is weaker on, and the one thing the Picked screener flagged as worth a follow-up question on the on-site.
The next step. A single button to advance to the on-site, a button to park, and a button to override the rank with a note.

How to read three cards efficiently.

Read the headline of all three first; do not read in depth yet. You are looking for the rank-1 candidate to be obviously the right person on the headline. About 60% of the time they are; about 40% of the time you will want to dig in.

Then read the competency breakdown of the rank-1 candidate. If one of the competencies is below your bar, read the transcript span; if the span confirms the score is right, the rank-1 candidate is correct. If you disagree with the score after reading the span, override the rank and note your reasoning.

Then read the narrative of the rank-2 candidate. The narrative is where the screener flags "this person is rank 2 on score but might be rank 1 for your specific role". When this flag fires, override.

Then read nothing else. Make the call. Advance the chosen finalist to the on-site half-day. Park the other two with a structured response that goes back to them within 24 hours.

The override path is a feature, not a failure mode. We expect about 15% of finalist ranks to be overridden by the hiring manager; that rate is healthy. A 0% override rate suggests you are not reading the cards.

Section 10

~4 min

The on-site half-day and the offer.

The on-site half-day is the part of the funnel that Picked does not run. You meet the candidate, the candidate meets the team, the team meets the candidate. By the end of the half-day you are 80% of the way to a yes or no.

A good half-day, in five blocks.

Welcome and team intros (30 minutes). The candidate meets two to four engineers from the team. No interview vibe; just conversation. The candidate asks the team what the company is actually like.
Technical deep-dive (60 minutes). The hiring manager walks the candidate through one real system or one real bug. The candidate asks questions. You score on how they reason in the moment.
Pair-programming or paired-design (60 minutes). Optional. Use a real production repo or a real upcoming feature. The point is to see how the candidate works, not to ship code.
Team lunch (60 minutes). Off the record. The candidate meets the people they will sit next to. The hiring manager is not in this room.
Wrap with the hiring manager (30 minutes). The candidate asks the questions they did not ask earlier. The hiring manager flags anything from the Picked finalist card narrative that they want to follow up on.

The offer.

Make the call within 48 hours of the on-site. Make the offer within 72 hours of the call. The longer the gap, the higher the no-show rate.
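
The two deadlines chain: the offer clock starts when you make the call, not when the on-site ends. A minimal sketch of the date arithmetic, with an illustrative function name and made-up dates:

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def offer_deadlines(
    onsite_end: datetime, call_made: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Return (call_by, offer_by).

    The call is due 48 hours after the on-site ends. The offer is due
    72 hours after the call; if the call has not been made yet, the
    latest allowed call time is used as the starting point.
    """
    call_by = onsite_end + timedelta(hours=48)
    start = call_by if call_made is None else call_made
    offer_by = start + timedelta(hours=72)
    return call_by, offer_by


if __name__ == "__main__":
    onsite_end = datetime(2026, 12, 4, 13, 0)  # example date only
    call_by, offer_by = offer_deadlines(onsite_end)
    print("Make the call by:", call_by)
    print("Make the offer by:", offer_by)
```

Put both dates on your calendar before the candidate arrives for the half-day.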

Three things to put in the offer that are easy to miss. One, a single number for total compensation in the first sentence, no jargon. Two, the start date you are actually planning for, not "to be agreed". Three, a one-paragraph "why we picked you" note from the hiring manager, written by hand. The last one does more for the accept rate than any pay band can.

When the counter-offer comes.

About 30% of senior engineering offers attract a counter-offer from the current employer. The data is unambiguous: candidates who accept a counter-offer leave their current employer within 12 months in over 80% of cases anyway. The counter is usually not a real recalibration of value; it buys the current employer time.

When the counter-offer arrives, the right response is a 24-hour offer to talk it through. Not to match. Not to negotiate. To listen. Most counter-offers fail in the second conversation; the candidate names the things their current employer is not solving, and your offer becomes the cleaner choice.

If you have not handled a counter-offer before, the Founder's first ten hires playbook (Batch 15) has a walkthrough of the exact wording.

Section 11

~3 min

The first thirty days.

Hiring is not done at the offer signature. The first thirty days of an engineering hire is when the bet you made is either confirmed or refuted. Most companies under-invest here and then complain that the hire is not working out by month four.

Three things to set up before day one.

A working machine, working logins, working repo access. If any of these are not done on day one, the message you are sending is "you are not really expected yet". Fix it.
A named onboarding buddy who is not the manager. The buddy is on the hook for the new hire's questions for the first 14 days. The manager is on the hook for the rubric-shaped feedback at 30, 60, 90 days.
A first ticket that is real and finishable in five days. Not a tutorial. Not a hello-world. A real ticket that, when shipped, lands in production with the new hire's name on it.

The 30-day review.

At day thirty, sit down for a structured 45-minute review. The review uses the same rubric you hired against. Score the hire on each competency. Share the scores back. Discuss the deltas openly.

About 90% of hires score within one rubric point of their pre-hire scores. About 8% score more than a point above and 2% score more than a point below. The 2% below is the early warning; address it now, in a structured conversation, while there is still time to correct.
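
The comparison behind the review is a per-competency delta against the pre-hire score. A minimal sketch, assuming both sets of scores use the same numeric rubric scale (the scale itself is not specified here) and using an illustrative function name and made-up scores:

```python
def review_deltas(pre_hire, day_30, tolerance=1.0):
    """Label each competency by how the day-30 score moved.

    Returns {competency: (delta, label)} where label is "within" when the
    change is at most `tolerance` rubric points, otherwise "above" or "below".
    """
    result = {}
    for competency, before in pre_hire.items():
        delta = day_30[competency] - before
        if delta > tolerance:
            label = "above"
        elif delta < -tolerance:
            label = "below"
        else:
            label = "within"
        result[competency] = (delta, label)
    return result


if __name__ == "__main__":
    # Made-up scores on an assumed 1 to 10 scale.
    pre = {"Systems thinking": 8, "Pragmatism": 7, "Ownership": 8}
    now = {"Systems thinking": 8, "Pragmatism": 9, "Ownership": 6}
    for competency, (delta, label) in review_deltas(pre, now).items():
        print(competency, delta, label)
```

Any "below" row is the conversation to have in the review itself, not a note to file for later.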

A new hire who is below bar at day 30 and unaware of it will still be below bar at day 90 and unaware of it. The hiring loop ends at the 30-day review; do not skip it.

Block the 30-day review on your calendar on the day the offer is accepted. The probability you remember to do it later, unprompted, is low.

Section 12

~2 min

TL;DR and a one-page checklist.

The whole playbook in one page. Print this section; pin it to whatever surface you write on; come back to it every time you open a role.

TL;DR.

Engineering hiring is the hardest funnel. Two stages are yours: the role brief and the offer. Picked owns the other six.
The brief is load-bearing. One hour of brief work saves twelve hours of bad interviews.
The rubric is what the system scores on. Tune it for the specific role. Do not skip the tuning.
No take-homes. Code samples only for tech-lead or specialist roles, and only as existing public work.
Three vetted finalists arrive on Friday. Reading the cards takes 10 minutes. Override the rank when the narrative flags a fit reason the score did not.
Make the call within 48 hours of the on-site. Make the offer within 72 hours of the call.
Block the 30-day review on the day the offer is accepted.

The checklist.

Write the role brief.
45 minutes. Eight fields. Read it back to a teammate in plain English.
Tune the rubric.
5 minutes. Confirm or adjust the five default weights for your specific role.
Post the role.
8 minutes. Picked syndicates to the default channels.
Share on your own LinkedIn and Twitter the same day.
Tag your team. Send to three engineers in your network.
Wait. Read your inbox on Friday morning.
About 12 to 18 days from post to finalist card for a typical senior role.
Read the three finalist cards.
10 minutes. Override the rank if the narrative flags a reason.
Run the on-site half-day.
Five blocks; about 4 hours of your time including lunch.
Make the call within 48 hours.
Make the offer within 72 hours of the call.
Handle the counter-offer if it comes.
A 24-hour offer to talk. Listen, do not negotiate.
Set up day one before day one.
Machine, logins, repo access, named buddy, first real ticket.
Block the 30-day review on your calendar today.
Use the same rubric. Share the scores back. Address deltas in the conversation.

The whole loop, end to end, is about 30 days from post to start date for a typical senior backend role. About 6 hours of your time across that window. The rest runs without you.

About the author

Picked Team

Engineering and research

The people building Picked. Method posts, model cards, fairness audits, product opinions. Edited and signed off by the engineering and research leads.
