Data has taken over the world.
With 90% of all the data that has ever existed produced in just the last two years, the demand for Data Scientists is also growing exponentially. Pretty much every company is confronted with the challenge of harnessing and utilizing the power of data today.
It’s no surprise that recruiters have set 295% more data-science-related tasks in interview processes over the past year.
With demand for Data Science professionals exploding and a shortage of qualified candidates, you’re not alone if you’re finding it challenging to fill this role. However, traditional hiring methods may not be enough to snag a coveted data science specialist from this newly-emerged field.
That’s why we created this guide, which covers the complete process of hiring a data scientist: finding, evaluating, and hiring a candidate who fits your organization’s needs.
Hiring for other tech positions? Have valuable recruiting information at your fingertips with our handy technical recruiting cheatsheet.
How do you recruit a Data Scientist: 4 strategies to consider first
Before we get into the finer details of the hiring process, there are four key strategies you should consider when starting to recruit Data Scientists. Here are the main takeaways from our experience helping hiring managers source candidates and make hires.
#1: Design your process to sell to the candidate
Just 2% of Data Scientists are unemployed, according to federal jobs data. When recruiting Data Scientists, it’s important to remember that other organizations are competing for attention.
So give candidates powerful reasons to choose you.
That means being able to articulate why this opportunity fits into their career trajectory and what your company can offer them in terms of culture, values, and personal growth.
It means designing a thoughtful and streamlined hiring process without burdensome requirements that make candidates drop out.
And, it means finding the right balance between moving quickly enough that you don’t lose your candidate to the competition, and taking the time to build a meaningful connection.
#2: Use data to optimize your hiring process
Iterating is the most effective method for hiring managers to consistently achieve great results, allowing you to keep the best ideas and quickly discard those that don’t work.
- If you try outreach messaging version A and version B, and version B performs 20% better, that translates to huge productivity gains over a month of outreach.
- Examine your process to identify bottlenecks. If 30% of candidates disappear when you request a resume prior to an intro call, you might try moving the ask to later, to minimize the chances of losing great candidates.
- Use an objective evaluation such as a skills test or take-home Data Science problem to generate a consistent and comparable performance metric among all candidates to help identify the best.
- When great candidates are scarce, focusing on actual skill (rather than degree, resume experience or personal characteristics) can help you to identify undervalued candidates
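To make the outreach comparison above concrete, here is a minimal sketch of checking whether version B's better reply rate is more than random noise. The counts are hypothetical, and the two-proportion z-test is just one reasonable way to compare reply rates, not a prescribed method.

```python
import math

def two_proportion_z_test(replies_a, sent_a, replies_b, sent_b):
    """Return (relative_lift, two_sided_p_value) for version B versus version A."""
    rate_a = replies_a / sent_a
    rate_b = replies_b / sent_b
    relative_lift = (rate_b - rate_a) / rate_a
    # Pooled reply rate under the assumption that A and B perform the same.
    pooled = (replies_a + replies_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF computed with the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return relative_lift, p_value

# Hypothetical counts: 400 emails per version, 40 replies for A and 48 for B.
lift, p_value = two_proportion_z_test(40, 400, 48, 400)
print(f"Relative lift: {lift:.0%}, p-value: {p_value:.2f}")
```

With these made-up numbers, the 20% relative lift comes with a p-value well above the conventional 0.05 threshold, so it would be worth continuing the test with more emails before retiring version A.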
#3: Build a strong, data-driven culture
- Can you articulate how your organization uses data analysis to drive business decisions?
- Is there someone on your leadership team with a good understanding of Data Science?
- Is your company’s data transparent to stakeholders and your Data Science team well-integrated into important conversations around business operations?
A “no” to any of these questions will be a major red flag for candidates.
Brand, culture and benefits are increasingly valued by the Millennial and Gen X professionals coming to dominate the workforce. A dedicated careers page for your Data Science roles and short, consumable video ads can help signal to these potential candidates that you’re serious about data and aligned with their values.
A staggering 97% of Data Scientists currently feel burnt out, and 70% plan to leave in the next year. Showcasing your organization’s work-life balance can be particularly effective when hiring Data Scientists.
#4: Understand whom you want to hire
- Are you a lean and fast-moving startup that needs a generalist, or an established company looking to fill a specialized skillset?
- How will the candidate fit into your organizational structure?
- What is the specific value you’re looking for them to deliver?
Once you have a clear understanding of the ideal hiring profile, you’ll be able to identify the skills and characteristics a candidate will need — and craft your interview process to identify and assess these skills.
What are the roles of a Data Science team?
As Data Science is an emerging field in real-time evolution, the various strands and specializations aren’t always easy to pick apart. Here are the most common Data Science titles as they currently stand in 2022.
Data Scientist
Your standard Data Scientist is part Mathematician, part Statistician, and part Software Engineer. Their job is to take vast amounts of noisy data, both structured and unstructured, and build models to extract patterns, insights, and meaningful conclusions.
Data Engineer
Data Engineers are software engineers who build the infrastructure necessary to process and access data. They build the virtual “pipelines” between data systems which allow Data Scientists to access this data for analytical or operational purposes.
Data Architect
As a traditional architect draws up the plans for a building and oversees its construction, Data Architects design the data management blueprints for Data Engineers to build. They visualize and define how data will be stored, integrated, and accessed by different users, applications, and IT systems.
Data Analyst
While Data Scientists focus on new ways of capturing and analyzing data, Data Analysts are generally concerned with structured, existing data and how it can be used to solve tangible problems. They analyze these data sets, extract insights, and present the results to aid and influence decision-making for businesses.
Machine Learning Engineer
These professionals are focused on building and producing Artificial Intelligence (AI) and Machine Learning (ML) models that “learn”, or leverage data to improve performance over time. Machine Learning Engineers build not just the models, but the underlying systems and infrastructure for these models.
Statistician
Some argue that Data Science is just a rebranding of Statistics that integrates newer tools from Computer Science.
In general, Statisticians frequently operate in a research capacity, designing experiments, collecting data, and building models to draw relevant conclusions and influence an organization’s future actions.
Business Intelligence Developer
BI Developers are focused on integrating and processing all of the different sources of data in an organization. They use visualization, reports and dashboards to make this data understandable and accessible to everyone else in the organization, facilitating data-driven decision-making and improved operations.
Data Science Skills and qualifications to look for
Here are the hard and soft skills Data Scientists need.
|Skill area|Examples|
|---|---|
|Programming Languages|e.g. Python, Java, Scala|
|Databases|e.g. SQL, MySQL, PostgreSQL and NoSQL databases (MongoDB, CouchDB, Redis)|
|Data Visualization|e.g. Tableau, Power BI, Sisense, Excel|
|Machine Learning|e.g. NumPy, SciPy, Scikit-learn, TensorFlow, Keras, Pandas|
|Big Data|e.g. Hadoop, Spark, Storm, Hive, Flink|
|Statistics|e.g. Probability Theory, Bayesian Statistics, Modeling|
|Mathematics|e.g. Calculus, Linear Algebra|
|Data Analytics|e.g. R, SAS, Stata|
|Soft Skills|Critical Thinking, Communication, Flexibility, Adaptability, Teamwork, Perseverance, Creativity, Problem-Solving|
Data Scientist resume examples — and how to read them
Here are some sample resumes to give you an idea of typical Data Scientist projects, experiences and accomplishments.
Data Scientist Resume #1
Data Scientist at Etsy (2019-)
Developed framework for rapid training and productionizing of machine learning models
Built targeting models for personalized retrieval, ranking, revenue optimization, seller fairness, and seasonality
Built an automated system for targeted client email outreach
Designed and executed experiments to inform business decisions leading to 50% lift in conversions
Used data insights showing unfavorable tradeoffs to enable business partners to avoid rolling out problematic initiatives
Mentor other scientists and guide them during their projects
Principal Data Scientist at Visa (2018-2019)
Senior Data Scientist (2016-2018)
Built models for prediction of monthly customer spend
Built credit models to demonstrate value of a digital initiative, influencing business decisioning around the initiative
Analysis of customer call data and browsing behavior to identify opportunities for customer experience personalization
Implementation in production of Machine Learning backend in Python and Spark on AWS for in-app personalized product recommendations, resulting in 3x more campaign conversions
Built performance forecaster to identify credit models requiring remediation
The University of California, PhD in Physics (2016)
The University of Chicago, BS in Mathematics and Physics (2010)
Python, Machine Learning, Personalization, React, AWS, Mathematics, Linux, Jenkins, Spark, Kafka, ElasticSearch, Computational Physics, Visualization, Java
Like many Data Scientists, this potential candidate has an advanced degree in a quantitative field. This resume reflects good career progression and a mix of hard and soft skills — overall a generalist.
Tip: You can’t always tell seniority by title. An entry-level and very senior professional may both be called “Data Scientist”. The significance of projects undertaken is a better gauge of seniority level.
Data Scientist Resume #2
Data Scientist at Trello (2020-)
Collaborate as a product data scientist on the customer onboarding team
Run randomized experiments and work with product management to make key product decisions based on test results
Create and maintain core data tables and key metrics related to new user engagement using Scala, Python, and SQL
Design and maintain AB testing interface UI, and collaborate with the experimentation team to improve our testing process
Intern at Airbnb (Summer 2019)
Built a model to identify the responsible party after an Airbnb cancellation using logistic regression
Consolidated data from tables using SQL and Presto
Derived a loss function to optimize the model based on business impact of each type of misclassification
Opportunity-sized the model for the business in terms of annual revenue
Delivered numerous presentations to data scientists as well as cross-functional teams to advocate for changes to the cancellation policy page
Research Assistant at The Brookings Institution (2016-2018)
Co-authored economic analyses, including “Closing the Gender Gap in Software Engineering” and “Housing Inequality in the Greater Boston Area”
Analyzed data in Stata to create figures and graphics for a range of papers on topics such as housing, infrastructure, land use, and climate change
Wrote first drafts of accessible policy briefs from expert policy proposals
Fact-checked and proofread documents for errors before publication
Stanford University, MS in Statistics (2020)
Cornell University, BS in Economics (2016)
Python, R, SQL, Data Science, Scala, Project Management, Multivariate Analysis, Machine Learning, Economic Research, Market Research, Data Analysis, Stata, Linear Algebra, Regression Models, Algorithms, Presto
This potential candidate comes from a research background. Currently, they are focused on product analysis: interpreting data and building models to improve customer experience and product quality.
Data Scientist vs Machine Learning Engineer
As Data Scientists have increasingly adopted Machine Learning modeling and open-source tools, there can be quite a bit of overlap between the two roles.
However, Data Scientists are generally a better fit when you need a modeling specialist to uncover the information required to determine technical solutions for your business problems.
When the solution is already decided and implementation and scalability are the critical issues, Machine Learning Engineers are your choice. They will put the models into production and fit them into restricted computational resources.
Here’s a quick chart summing up the common differences.
| |Data Scientist|Machine Learning Engineer|
|---|---|---|
|College Degree|Quantitative Field (Math, Statistics, Economics…)|Computer Science|
|Level of Education|often Masters or PhD|College Degree|
|Work responsibilities|Build (ML) models; Data Mining and Cleansing|Deploy and productionize models; Enable data processing; Build ML infrastructure and backend|
|Business role|Find technical solutions to unresolved problems|Implement and scale technical solutions once determined|
|Languages|Python, SQL, Java, R|Python, Java, C++|
|Keywords|Analytics, Data Cleansing, Data Mining, Data Visualization, Mathematics, Modeling, Regression, Statistics, Querying|Architecture, Backend, Data Pipelines, Development, ETL, Infrastructure, Platform|
How to Source Data Scientists
Once you know what kind of professional you need, it’s time to fill your recruitment funnel with qualified candidates. Here’s how:
#1: Seek out passive candidates
The “post and pray” method of sourcing isn’t likely to get you very far here, as essentially all Data Scientists are already employed by the competition.
Proactively engaging passive job seekers can feel like a heavy lift. But don’t despair — passive candidates are worth your time and effort!
- Because they are pre-qualified by their current employers, their knowledge and experience are rarely in doubt.
- You can target candidates who more closely fit your recruiting requirements
- They are less likely to be considering competing offers.
A strategic focus on passive candidates is your secret weapon for filling challenging roles like Data Scientist (and we’ll get into exactly how to do it below).
#2: Consider creative talent pools
Unresponsive candidates, outdated LinkedIn profiles, dry talent pools — if these challenges sound familiar to you, it might be worth checking out some creative sources of tech talent.
- Consider hosting a competition on Kaggle, an online Data Science and ML community, or TopCoder, a coding and Data Science crowdsourcing site
- You can find techies in their natural habitat on sites like GitHub, far and away the most popular place for maintaining code, and Stack Overflow, another popular gathering spot
- Many skilled Data Scientists have a research background — consider those who are employed not just by other tech companies, but by universities with great Computer Science, Math, Statistics, Physics and Computational Biology programs
- Check out cutting-edge work done at Data Science conferences that may be relevant to your organization
#3: Take advantage of the economic turbulence to snap up talent from top data companies
Big tech players like Microsoft, Netflix and Meta — hit with over 3 billion in losses this year — are slowing or freezing hiring, rescinding offers and even laying off staff.
For growing startups, it’s a great time to take advantage of the shakeup to lure talented employees — who might ordinarily be monopolized by larger companies — with flexible culture, career growth, and stock options.
#4: Beware recruiting agency pitfalls
With average fees for hiring Data Scientists topping $15,000 and search times lasting as long as 40 weeks, traditional methods of recruiting like agencies may not be the best fit for this difficult role.
Unscrupulous agencies may also burn up your time flooding you with unqualified candidates in the hopes that something sticks.
#5: AI sourcing automation can help you find the best matches
Celential.ai’s Virtual Recruiter service is perfect for hiring managers who need to make key Data and Engineering hires but aren’t gaining any traction in their sourcing efforts.
Our AI taps into vast talent networks of candidates and millions of data points unavailable from public sources to analyze candidates’ online presence in technical communities, professional and social networks, company websites, and personal websites to find the best match for your job description — even for challenging, specialized roles like Data Scientist.
AI sourcing tools are gaining popularity in the recruiting world. Check out the list of best AI sourcing tools in the market.
#6: Conduct personalized outreach
Persuading candidates to reply to your cold recruiting emails can feel almost impossible. However, we’ve found that it’s possible to boost your reply rate to 30% — even for those elusive Data Scientists.
How do you do it?
By telling Data Scientists a personalized story about why they should work with you.
- Explain how this opportunity fits into their career trajectory and how your values are aligned with theirs.
- Reference the candidate’s domain experience, technical skills, work history, and other background details. And don’t forget to personalize your subject line for up to 50% better results.
- Mention common backgrounds with your team such as education, career history, geographic location and mutual connections
- Send 2-3 follow ups to connect with many more candidates — according to our own company recruitment data, two thirds of replies come from follow ups
Passive candidate cold outreach template: ultimate edition
To make it easy, here is a template we’ve put together that we’ve used to source and hire real candidates. You can plug in your own company pitch and candidate background details to supercharge your reply rates.
After creating intrigue with a vague yet personalized subject line, this passive candidate email template keeps the focus on the candidate throughout the email – envisioning their state of mind and feelings about making a move, calling out their specific accomplishments, and establishing common background in the pitch.
Check out our blog for more free passive candidate email templates.
Why You Should Try Hiring a Junior Data Scientist
Given that 88% of Data Scientists have a graduate degree, “entry-level” candidates often actually have significant experience at internships and in university research positions.
Sky-high starting salaries and quick career progression to the Senior level are not uncommon in such scenarios, and it’s advisable to avoid branding these positions as “Junior Data Scientist” to make them more appealing to candidates.
However, with the shortage of Data Scientists and the current economic uncertainty, hiring and upskilling a true entry-level candidate without an advanced Data Science degree or work experience could be a cost-effective way of filling your role.
To identify a candidate who will likely learn quickly and grow into a role, look for positive signals such as:
- STEM degree or degree from a selective school
- Quick promotions, prizes, distinctions, or other past accomplishments in previous roles or extracurricular activities
- Exposure to your industry or domain
- Internships or experience at selective startups and top companies like Pinterest, Uber, and Amazon
- Data Science projects on sites like GitHub that demonstrate creativity and knowledge of coding languages and technologies
How to Hire a Data Scientist: Interviews
Many candidates feel that the technical interview is broken and doesn’t truly measure the skills necessary to succeed in a job. On the other hand, a well-designed set of problems and challenges can help you to objectively evaluate candidates at scale and quickly eliminate those who won’t be a fit.
- Base problems on actual challenges your team faces and try out your assessment on your current team to ensure that it’s a reasonable test of the skills used on the job and to achieve consensus on what a good answer looks like
- Keep it short – 2-3 hours max
- An open-ended challenge will reveal more about a candidate’s thought process and skill level than a quiz of technical minutiae
Unsure how to evaluate candidates for the skills you need? These questions can help.
What to ask when hiring a Data Scientist
- Which Data Science tools and skills have you used? Which are you the most experienced in?
Asking about skill strengths will also help you to gain a sense of how long it will take a potential employee to ramp up and start making meaningful contributions.
- Explain overfitting and underfitting in modeling.
This is just one example of a specific technical question you might ask to assess both hard and soft skills.
Data Scientists will often need to communicate key findings to leadership or project stakeholders with various levels of data fluency. Beyond knowing technical information, is the candidate able to articulate this knowledge in a concise and comprehensible way?
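For interviewers who want a concrete reference point when listening to answers, here is a minimal sketch of overfitting and underfitting. It uses a tiny k-nearest-neighbors regressor on synthetic data. The data, the choice of model, and the values of k are all illustrative assumptions, not part of any standard interview rubric.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the y-values of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared error of the k-NN model (fit on `train`) over `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

def make_data(n, rng):
    """Noisy samples of sin(x) at n evenly spaced points in [0, 2*pi)."""
    xs = [2 * math.pi * i / n for i in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, 0.3)) for x in xs]

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(40, rng)
test = make_data(200, rng)

# k=1 memorizes the training set, k=40 averages everything, k=5 sits in between.
for k in (1, 5, 40):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.3f}  "
          f"test MSE={mse(train, test, k):.3f}")
```

With k=1 the model memorizes the training data: its training error is exactly zero because every point is its own nearest neighbor, yet it still makes errors on fresh data, which is the signature of overfitting. With k=40 the model predicts the same overall average everywhere and does poorly on both sets, which is underfitting. A strong candidate should be able to explain this trade-off in plain language.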
- Walk me through one of the models you’re most proud of from ideation to implementation. What was your approach, and what was the result?
Having a candidate talk through a project is a good way to gauge their overall experience level and give insight into their thought process.
Perhaps a candidate created a new model to predict customer churn after a pricing change, or an ML algorithm for personalized podcast recommendations.
Great answers will reflect the use of metrics to measure success, incorporation of feedback, and a focus on results and overall business impact.
Perhaps the implementation of the model prevented significant customer churn. Or, perhaps a bad decision was averted because a predictive model warned of dire results.
- Tell me about a challenging project you tackled. How did you handle bottlenecks and setbacks? What did you learn — was there anything you could have done better, or did you pick up a new language, technology or skill?
For example, perhaps a candidate was tasked with analyzing thousands of records of customer purchases and browsing behavior to identify opportunities for upselling. This data is messy and complicated, and there are many possible approaches and many pitfalls to avoid.
Though errors are inevitable, the way those errors are addressed makes all the difference.
Great Data Scientists are thoughtful — using past experience to reflect and iterate on their own processes. This question will help you gauge a candidate’s judgment in handling errors and ability to problem-solve and adapt.
- Tell me about a time you influenced a colleague, customer or stakeholder with data.
The smartest Data Scientist in the world won’t add value to your organization if they can’t explain why their insights should impact business decisions.
On the other hand, a Data Scientist who can convincingly articulate why decision X will lead to Y% better results or $Z cost savings is likely to start producing a strong impact quickly.
How much does it cost to hire a Data Scientist?
Here is some market salary data to help hiring teams make a competitive offer and plan headcount targets.
Note that pay may vary significantly depending on location and company. For example, Data Scientists in the San Francisco Bay Area (CA) earn $130,300 annually, while those in Montgomery (AL) make $116,941.
|Title|Years of Experience|Base Compensation|Additional Compensation|
|---|---|---|---|
|Data Scientist|1-3 years|$99,453|$20,404|
|Senior Data Scientist|4-6 years|$123,106|$24,848|
|Principal Data Scientist|7+ years|$135,378|$22,237|
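For headcount planning, the figures above can be combined into total compensation per level and a rough annual budget. This is a minimal sketch: the compensation numbers come from the table, while the hiring plan (two Data Scientists and one Senior Data Scientist) is a hypothetical example.

```python
# Base and additional compensation figures copied from the salary table.
compensation = {
    "Data Scientist": (99_453, 20_404),
    "Senior Data Scientist": (123_106, 24_848),
    "Principal Data Scientist": (135_378, 22_237),
}

def total_compensation(base, additional):
    """Total annual compensation is base plus additional compensation."""
    return base + additional

# Hypothetical hiring plan: number of planned hires per level.
hiring_plan = {"Data Scientist": 2, "Senior Data Scientist": 1}

budget = sum(
    count * total_compensation(*compensation[title])
    for title, count in hiring_plan.items()
)

for title, (base, additional) in compensation.items():
    print(f"{title}: ${total_compensation(base, additional):,}")
print(f"Planned annual budget: ${budget:,}")
```

The result covers cash compensation only. Benefits, equity, recruiting fees, and location adjustments would need to be added separately.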
Learn how much it costs to hire a software developer.
Celential’s sourcing solution leverages a vast talent network of 5M+ to deliver top-quality talent with zero effort or learning curve on your part. It takes just three days on average to receive warm candidates ready to interview.
Through the power of cutting-edge AI and ML, our Virtual Recruiter will find the best matches and engage candidates at scale — even for specialized roles like Data Scientist.
With your recruiting team’s time freed up to nurture and engage candidates, you’ll soon close hires for this competitive role.
Sign up for a free trial today!