Engineering hiring is the hardest funnel in the company. Three reasons, none of which are negotiable.
One, the supply is loud. A backend role at a 200-person company gets 250 to 400 applicants in the first ten days. The signal in any single CV is low; the noise across the stack is high. Manual triage at this volume is theatre.
Two, the assessment is contested. Every approach has a credible counter-argument. Take-homes select for free time on a Sunday. LeetCode selects for interview prep, not engineering judgement. Pair-programming on the call selects for performance under pressure. The community is not aligned on what good looks like.
Three, the wrong hire is expensive. An engineering misfire at a 30-person company is two to three months of wasted runway plus the team morale tax. Recovering from a bad senior hire takes a quarter. Recovering from a bad tech lead takes a year.
This playbook is a step-by-step method for running an engineering hire end-to-end with Picked. It assumes you have one role open, a working role brief in your head, and three to five hours of your own time to invest across the next 30 days. It does not assume you have a recruiter, an ATS, or a previous engineering hire under your belt.
Read it once end-to-end. Then come back to the section you need on the day you need it.
The engineering funnel has eight stages. Picked owns six of them. You own two: write the role brief, and make the offer.
Stage                          Owner          Typical drop-off
--------------------------------------------------------------
01 Role brief and rubric       You            n/a
02 Posting and syndication     Picked         n/a
03 Triage (intake)             Picked         ~40% pass
04 AI screen (voice)           Picked         ~35% pass
05 Role-fit assessment         Picked         ~25% pass
06 Behavioural interview       Picked         ~20% pass
07 Three finalists arrive      Picked → You   3 vetted
08 On-site half-day + offer    You            1 of 3 typical
For a typical backend role you should expect 247 applicants to produce 3 vetted finalists, of whom 1 receives an offer. The headline number is 1 hire per 247 applicants; the operational number is 1 hire per 25 vetted candidates (the 25 who clear triage and reach the AI screen). The unit cost at $0.99 per AI-vetted candidate is about $25 of Picked spend per role.
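The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch that uses only the numbers quoted above; the function name `picked_spend` is made up for illustration and is not part of any Picked API.

```python
# Figures quoted in the paragraph above.
APPLICANTS = 247          # applicants for a typical backend role
VETTED = 25               # candidates who reach the AI screen
PRICE_PER_VETTED = 0.99   # USD per AI-vetted candidate
HIRES = 1


def picked_spend(vetted: int, price: float = PRICE_PER_VETTED) -> float:
    """Return the Picked spend for one role in USD, rounded to cents."""
    return round(vetted * price, 2)


if __name__ == "__main__":
    print(f"Picked spend per role: ${picked_spend(VETTED):.2f}")
    print(f"Applicants per hire: {APPLICANTS // HIRES}")
    print(f"Vetted candidates per hire: {VETTED // HIRES}")
```

Running it gives $24.75 of spend, which is the "about $25" quoted above.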
You own stage 01 and stage 08. Everything in between runs without manager input, so spend your time on those two; the playbook walks through each in detail.
The role brief is the load-bearing artefact of the entire funnel. If the brief is vague, the rubric is vague, the AI screen is vague, the assessment is vague, the interview is vague, and the three finalists are not a match. The brief is also the cheapest place to invest your judgement; one hour here saves twelve hours of bad interviews later.
The brief is editable after posting. Edits to the brief regenerate the rubric proposal; you can accept the regenerated rubric or keep the prior one. Re-editing the brief mid-funnel is fine, but every edit invalidates a small percentage of the candidates already screened, so try to get the brief right before you publish.
The rubric is what the system scores on. Get it right and every downstream artefact gets it right too. Get it wrong and you will spend the next month interviewing on the wrong dimensions.
Picked proposes a default rubric for every engineering role family. The default rubric for a senior backend engineer scores on five competencies, weighted as below. You can accept the default, tune the weights, or replace any competency with one of your own.
Three places where the default is almost always wrong for a specific role.
One, if the role is a first-three-engineers role at an early-stage company, raise Pragmatism to 30% and lower Systems thinking to 15%. The constraint is shipping the boring, working answer, not designing for capacity you do not have yet.
Two, if the role is a tech lead or staff role, raise Communication to 30% and lower Curiosity to 5%. The cost of a tech lead who cannot explain their reasoning to the team is higher than the cost of one who does not read outside their stack.
Three, if the role is a specialist (ML, security, devops), add a sixth competency for the specialist domain, weight it at 25%, and lower the other five proportionally. The default rubric is for generalists; specialists need the specialist dimension explicit.
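The specialist adjustment is a proportional rescale, and it is easy to get the arithmetic wrong by hand. The sketch below shows one way to do it. The starting weights are placeholders (five equal shares), not Picked's actual defaults, the fifth competency name is a stand-in, and `add_specialist` is a hypothetical helper rather than a Picked function.

```python
def add_specialist(weights: dict[str, float], name: str,
                   share: float = 0.25) -> dict[str, float]:
    """Add a specialist competency at `share` and shrink the existing
    weights proportionally so the total still sums to 1.0."""
    if not 0 < share < 1:
        raise ValueError("share must be between 0 and 1")
    if name in weights:
        raise ValueError(f"{name!r} is already in the rubric")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("existing weights must sum to a positive number")
    scale = (1 - share) / total
    rescaled = {comp: w * scale for comp, w in weights.items()}
    rescaled[name] = share
    return rescaled


if __name__ == "__main__":
    # Placeholder weights for illustration only; substitute your own rubric.
    generalist = {
        "Systems thinking": 0.20,
        "Pragmatism": 0.20,
        "Communication": 0.20,
        "Curiosity": 0.20,
        "Fifth competency": 0.20,
    }
    for comp, w in add_specialist(generalist, "Security").items():
        print(f"{comp:<20} {w:.0%}")
```

With equal starting weights, each of the five original competencies drops from 20% to 15% and the specialist dimension takes 25%. Unequal starting weights keep their relative proportions.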
Picked syndicates every role to the major engineering channels by default. You do not need a recruiter to source. Your job is to confirm the channel mix and add the two or three places that only you know about.
You do not configure any of this. Picked posts on your behalf and mirrors the listing to your own careers page (or hosts it if you do not have one).
Three sourcing moves that only the hiring manager can make. Do these on the day you post the role; the marginal effort is low and the marginal yield is high.
The AI screen and the role-fit assessment run back-to-back, in the candidate's own time, with no involvement from you. This section explains what runs, what it scores, and what the candidate experiences.
A 12 to 15 minute voice conversation between the candidate and Picked. Adaptive, role-aware, scored against the rubric in flight. Runs on LiveKit; transcription via OpenAI Whisper; reasoning via Anthropic Claude. The candidate sees a calendar invite within an hour of applying, picks a slot, and runs the screen from any browser or phone.
The screen asks about the candidate's most recent shipped project, the trade-offs they made, the failure mode they did not anticipate, and the smallest thing they would change in retrospect. The questions adapt: a candidate who answers shallowly gets a probe; a candidate who answers deeply gets the next question escalated.
About 35% of triaged applicants pass the AI screen. The rest get a structured response (which competency was below bar, what the score was, an offer to retake in 90 days if circumstances change). No silent rejections.
Twelve to eighteen items, depending on seniority. Each item is a short forced-choice scenario calibrated against the role family. Time-boxed (about 25 minutes), browser-based, no proctoring software. The item bank is sourced from the Neuroworx psychometric battery (validated against on-the-job performance for over 40,000 hires since 2018).
Example item for a senior backend role: "You have a flaky test that fails 1 in 20 runs. The release is in 36 hours. Your options are (a) skip the test in CI, (b) investigate and fix, (c) revert the change the test covers, (d) hand it off to the on-call engineer to triage. Which do you do, and why?" Scored on the reasoning, not the answer.
About 25% of candidates who pass the AI screen pass the assessment. The 4/5ths fairness check runs on every batch, per protected group, per role family.
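For readers unfamiliar with the four-fifths rule: it compares each group's pass rate with the highest group's pass rate and flags any group whose ratio falls below 0.8. The sketch below shows that conventional calculation on one batch. How Picked implements its own check is not described here, so the function name, the input shape, and the sample numbers are all illustrative assumptions.

```python
def four_fifths_check(outcomes: dict[str, tuple[int, int]],
                      threshold: float = 0.8) -> dict[str, dict]:
    """Apply the four-fifths rule to one batch.

    `outcomes` maps a group label to (passed, total). A group is flagged
    when its pass rate is below `threshold` times the highest pass rate.
    """
    if not outcomes:
        raise ValueError("outcomes must contain at least one group")
    rates = {}
    for group, (passed, total) in outcomes.items():
        if total <= 0:
            raise ValueError(f"group {group!r} has no candidates")
        rates[group] = passed / total
    best = max(rates.values())
    report = {}
    for group, rate in rates.items():
        # If nobody passed in any group, the ratio is undefined.
        ratio = rate / best if best > 0 else None
        report[group] = {
            "rate": rate,
            "impact_ratio": ratio,
            "flagged": ratio is not None and ratio < threshold,
        }
    return report


if __name__ == "__main__":
    # Made-up batch for illustration: (passed, total) per group.
    batch = {"group_a": (30, 100), "group_b": (20, 100), "group_c": (27, 100)}
    for group, row in four_fifths_check(batch).items():
        print(group, row)
```

In the made-up batch, group_b passes at two-thirds of group_a's rate and is flagged; group_c passes at nine-tenths and is not.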
A single coherent flow: apply, get a calendar invite within an hour, run the screen at the chosen slot, complete the assessment immediately after the screen (or within 48 hours), get a result within 24 hours of the assessment. The candidate never wonders what is happening; the candidate never waits more than 48 hours for a signal.
The behavioural interview is the third gate. A 20 to 25 minute live voice conversation, adaptive, scored against the role rubric. Runs after the assessment; only candidates who pass the assessment get to the interview. About 20% of candidates who reach this stage pass.
Five anchor questions, each with adaptive follow-ups. The anchors are the same across every senior backend interview; the follow-ups depend on the candidate's answers and the rubric weights you set in section 04.
No LeetCode, no whiteboard, no pair-programming. The interview is voice-only by default with a video option; we do not weight tone of voice in the score; we do not use facial recognition.
Behavioural signal correlates with on-the-job performance at r=0.49 in our held-out cohorts, against r=0.20 for unstructured interviews and r=0.34 for work-sample tests. The combination of structured anchors plus adaptive probes is the part that lifts the correlation; either alone is weaker.
The interview is scored against the rubric in flight. The candidate sees a transcript and a competency-level score sheet within 24 hours, whether or not they advance. The hiring manager sees the same artefacts on the finalist card.
Picked does not run take-homes. We have opinions on why; this section is the short version.
A 6-hour take-home selects for candidates with 6 hours of free time on a Sunday. That set under-represents parents, carers, candidates with second jobs, and candidates who are interviewing at three other companies that week. It is a quiet bias filter applied early and never measured.
A take-home also rewards a different skill from the one you want. Spending two days polishing a side project optimises for thoroughness and presentation. The job you are hiring for usually rewards judgement under constraint and shipping past v1. These are not the same skill.
And take-homes are reviewed badly. A tired engineer with 15 minutes per submission cannot reliably distinguish a good solution from an over-engineered one. The signal-to-noise on the review side matches the signal-to-noise on the submission side; both are low.
Three cases where asking for a code sample (not a take-home) is worth it. Ask for an existing public sample, not new work. The candidate sends a link.
If your team insists on a take-home, do not lengthen the Picked pipeline; replace the on-site half-day with a paid two-hour pair-programming session and a debrief. That is a higher-fidelity signal than any take-home will ever be.
Three vetted finalists arrive in your inbox on Friday morning. Each one is a single-page finalist card with the score, the evidence, the override path, and the transcripts one click away. Reading three cards takes about 10 minutes. This is the only step of the funnel where your judgement is required and the only step where the playbook cannot make the decision for you.
Each card has four parts. Read in this order.
Read the headline of all three first; do not read in depth yet. You are looking for the rank-1 candidate to be obviously the right person on the headline. About 60% of the time they are; about 40% of the time you will want to dig in.
Then read the competency breakdown of the rank-1 candidate. If one of the competencies is below your bar, read the transcript span; if the span confirms the score is right, the rank-1 candidate is correct. If you disagree with the score after reading the span, override the rank and note your reasoning.
Then read the narrative of the rank-2 candidate. The narrative is where the screener flags "this person is rank 2 on score but might be rank 1 for your specific role". When this flag fires, override.
Then read nothing else. Make the call. Advance the chosen finalist to the on-site half-day. Park the other two with a structured response that goes back to them within 24 hours.
The on-site half-day is the part of the funnel that Picked does not run. You meet the candidate, the candidate meets the team, the team meets the candidate. By the end of the half-day you are 80% of the way to a yes or no.
Make the call within 48 hours of the on-site. Make the offer within 72 hours of the call. The longer the gap, the higher the no-show rate.
Three things to put in the offer that are easy to miss. One, a single number for total compensation in the first sentence, no jargon. Two, the start date you are actually planning for, not "to be agreed". Three, a one-paragraph "why we picked you" note from the hiring manager, written by hand. The last one does more for accept rates than any pay band can.
About 30% of senior engineering offers attract a counter-offer from the current employer. The data is unambiguous: candidates who accept a counter-offer leave their current employer within 12 months in over 80% of cases anyway. The counter is usually not a real recalibration of value; it is the employer buying time.
When the counter-offer arrives, the right response is a 24-hour offer to talk it through. Not to match. Not to negotiate. To listen. Most counter-offers fail in the second conversation; the candidate names the things their current employer is not solving, and your offer becomes the cleaner choice.
Hiring is not done at the offer signature. The first thirty days of an engineering hire are when the bet you made is either confirmed or refuted. Most companies under-invest here and then complain that the hire is not working out by month four.
At day thirty, sit down for a structured 45-minute review. The review uses the same rubric you hired against. Score the hire on each competency. Share the scores back. Discuss the deltas openly.
About 90% of hires score within one rubric point of their pre-hire scores. About 8% score more than a point above and 2% more than a point below. The 2% below is the early warning; address it now, in a structured conversation, while there is still time to correct.
A new hire who is below bar at day 30 and unaware of it will still be below bar at day 90 and unaware of it. The hiring loop ends at the 30-day review; do not skip it.
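The day-30 comparison described above can be kept honest with a small script that computes the delta per competency. This is a sketch under assumptions: the score scale, the competency names, and the helper `review_deltas` are hypothetical, and the one-point tolerance is borrowed from the "within one rubric point" figure above.

```python
def review_deltas(pre_hire: dict[str, float], day_30: dict[str, float],
                  tolerance: float = 1.0) -> dict[str, dict]:
    """Compare day-30 scores with pre-hire scores, competency by competency.

    Status is "within" when the change is no larger than `tolerance`,
    otherwise "above" or "below".
    """
    missing = set(pre_hire) - set(day_30)
    if missing:
        raise ValueError(f"day-30 scores missing for: {sorted(missing)}")
    report = {}
    for comp, before in pre_hire.items():
        delta = day_30[comp] - before
        if delta < -tolerance:
            status = "below"
        elif delta > tolerance:
            status = "above"
        else:
            status = "within"
        report[comp] = {"delta": delta, "status": status}
    return report


if __name__ == "__main__":
    # Made-up scores on an assumed 1-5 scale, for illustration only.
    pre = {"Systems thinking": 4, "Pragmatism": 4, "Communication": 3}
    now = {"Systems thinking": 4, "Pragmatism": 2, "Communication": 5}
    for comp, row in review_deltas(pre, now).items():
        print(f"{comp:<18} {row['delta']:+} {row['status']}")
```

Any competency marked "below" is the one to raise in the structured conversation.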
The whole playbook in one page. Print this section; pin it to whatever surface you write on; come back to it every time you open a role.