

How to successfully recruit a data scientist?



As is sometimes pointed out, “big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it”. If you feel your company needs to join the biggest players and hire data scientists, make sure both you and your company are ready for it. This article tries to help you avoid the most obvious traps and set up an efficient team of talented individuals.

  1. The most obvious first step is to honestly answer the question: do I really need a data scientist? A ‘yes’ here should be backed by a ‘yes’ in several areas:
    •  My data is too big to handle in a simple tool like Excel. Otherwise, just hire a regular analyst to dig into it.
    •  My data is too complex to derive insights from with a reporting tool. If you can get the insights you need just by walking around the data, hire a good reporting analyst (QlikView, Tableau) to visualize it.
    •  My business needs an intelligent ‘robot’ to do something: sort the results on your first page, propose a product for a given basket, or set the PPC bid. In short, whenever economies of scale call for an intelligent algorithm to replace a human.
  2. Will the data scientist have anything to do in my company? Again, an honest ‘yes’ should follow each of these points:
    • I do believe in the data I store. My data is reliable enough to be given the power to influence my decisions. If you have tracking issues or database errors beyond a reasonable level, fix them first. The output of a data scientist is always conditional on the data she/he has at their fingertips; it will never be better than that. Also, a data scientist won’t fix your tracking and won’t do the admin job on your Hadoop cluster. He might have the skills to do that, but you won’t keep him for long in such an environment. If the answer here is no, there is no point in going any further. Get the basics right.
    • I can freely access the data I store. It might seem obvious, but double-check that the key data is not guarded by a data dragon, i.e. a person who sits on the database and builds his power in the company by controlling access to it. Make sure the admin(s) will be helpful not only in giving access, but also in explaining what is stored inside. Otherwise you will start a fight that brings no results.

Advertise the role

With a yes to everything in the previous section, we can start looking for the right person. If we are building a team from scratch, the first hire will determine the quality of the later team. At this point, cutting corners is not allowed. If we don’t have the internal resources to verify the required competences, we need to ask either an experienced recruiting agency or a proven domain guru for help. Look at blogs and career paths, and seek proof. The person doing the interviewing must have mastered the desired skill set.

Once you are sure you have someone to conduct a proper interview, you can start advertising your new position. At this stage, take care to advertise correctly:

  1. Do not list every skill you know of as required: tailor the ad to your actual needs. Don’t ask for Hadoop experience if you don’t have a cluster installed now or planned for the near future. This will broaden your audience.
  2. Do not ask for the obvious, like “expert knowledge of MS Office”. Anyone who can handle 3 programming languages will be able to insert a pie chart into a PowerPoint presentation. Such requirements often discourage professionals.
  3. Do not ask for someone creative, problem-solving and data-driven. These are obvious. Look for these qualities in the career history and the interview, but don’t put them in the ad.
  4. Provide a problem description. What problem do you want a data scientist to solve? Create curiosity. An example: “The role will be responsible for creating and developing a basket optimization algorithm and supporting the quantitative side of CRM”. It states clearly what you need now and in the future, and how broad the responsibilities are. The job will not be boring: this is what you’re trying to sell here.

See the candidate in action

Once you have the CVs, the fun part begins. After reviewing them, pick only the few best and do not interview immediately. First, give the candidates an opportunity to simply show off. Send a task that takes no less than 30 minutes and no more than 4-6 hours to complete. It can’t be too short, as you want to test the motivation to work. The task should be about solving a simple problem related to the role, written in business (not technical!) language. Give some blinded data (e.g. normalized to the value of \(\pi\) by \(\widetilde{x} = \frac{x}{\max{x}}\pi\)) and ask for an analysis. Keep it simple (because of the upper time limit), but aim for answering an important question. Example:

These are the purchases of products X, Y and Z, with descriptions of the customers. Who among those who never purchased Z would you recommend as a target group for an email campaign promoting Z?

This simple question requires:

  • Translating it into a purchase probability maximization problem
  • Data manipulation: reading a raw file into a program like R
  • Running a statistical model, like logistic regression or clustering (be open to creative solutions)
  • Discussing the results in business language: “So, you recommend targeting males who purchase X during Xmas to sell Z. How do you think it might affect our brand perception?”
  • Proving programming skills to some extent: this will eliminate ‘theoretical data scientists’ from the interview process, i.e. people who might be bright and knowledgeable but have no hands-on experience
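
To make the task more concrete, here is a minimal sketch in Python (rather than R) of how the data could be blinded and how a first, crude answer might look. Everything in it is an assumption for illustration: the customer records are made up, `blind` is a hypothetical helper implementing the \(\pi\) normalization mentioned earlier, and the per-segment purchase rate is only a stand-in for a proper model such as logistic regression.

```python
import math
from collections import defaultdict


def blind(values):
    """Rescale values so that the largest becomes pi: x_tilde = x / max(x) * pi."""
    top = max(values)
    return [v / top * math.pi for v in values]


# Made-up customers: (id, segment, yearly spend, has bought Z?)
customers = [
    (1, "A", 1200.0, True),
    (2, "A", 300.0, True),
    (3, "A", 800.0, False),
    (4, "B", 150.0, False),
    (5, "B", 950.0, False),
    (6, "B", 400.0, True),
    (7, "A", 600.0, True),
    (8, "B", 700.0, False),
]

# Blind the spend column before sending the data to candidates.
blinded_spend = blind([spend for _, _, spend, _ in customers])

# Share of Z buyers per segment: a crude proxy for purchase probability.
bought = defaultdict(int)
total = defaultdict(int)
for _, segment, _, has_z in customers:
    total[segment] += 1
    bought[segment] += has_z
rate = {segment: bought[segment] / total[segment] for segment in total}

# Customers who never bought Z, ranked by their segment's purchase rate.
never_bought = [c for c in customers if not c[3]]
targets = sorted(never_bought, key=lambda c: rate[c[1]], reverse=True)
target_ids = [c[0] for c in targets]
```

A candidate's real solution would go further, but even a toy version like this gives you something specific to discuss: why this segment, and what would change with a better model.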

Face to face interview

If the result is good, it’s time for a face-to-face interview. Now make the most of it:

  • Ask for details. You have the task solution, which is a great starting point: “In the read.table function, you used stringsAsFactors=F. Could you please tell me what it means and why you did it like that?”
  • If someone claims to know a language, ask about tiny yet important things. Knows Python? Ask for a fast way to do linear regression (never having heard of scipy or numpy is a red flag). Knows R? Ask for the difference between \(a*b\) and \(a\%*\%b\). Knows SQL? Ask for an efficient way to get the intersection of two sets. And so on. This will help you separate real gems from “CVs on steroids”.
  • Always ask “what have you done”, not “what would you do”. This is the power of behavioral interviewing.
  • Avoid the obvious: “where do you see yourself in 5 years”. You have a person with a high IQ in front of you; don’t play these tricks.
  • Verify that he will understand the business specifics. Give some background, explain a problem, and see how he thinks.
  • Always look at the soft skills, but in the background: ultimately, is this a person I would enjoy working with? Will he add value to the company? Will he bring joy to the team and raise morale with positive energy?
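
For interviewers who want a reference point for the quick language checks above, here is a small Python sketch of the kind of answers one might expect. It is an illustration under assumptions, not a scoring key: the numbers and the table names `bought_x` and `bought_y` are made up, NumPy's `*` and `@` are used as the analogue of R's `a*b` and `a%*%b`, and SQLite stands in for whatever database you actually run.

```python
import sqlite3

import numpy as np

# Linear regression: least-squares fit of y = slope * x + intercept.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])  # constructed as y = 2x + 1
slope, intercept = np.polyfit(x, y, deg=1)

# Element-wise product vs. matrix product.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
elementwise = a * b  # multiplies matching entries, like a*b in R
matrix = a @ b       # rows times columns, like a%*%b in R

# Set intersection in SQL: customers present in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bought_x (customer_id INTEGER);
    CREATE TABLE bought_y (customer_id INTEGER);
    INSERT INTO bought_x VALUES (1), (2), (3);
    INSERT INTO bought_y VALUES (2), (3), (4);
    """
)
rows = conn.execute(
    "SELECT customer_id FROM bought_x "
    "INTERSECT "
    "SELECT customer_id FROM bought_y "
    "ORDER BY customer_id"
).fetchall()
both = [row[0] for row in rows]
conn.close()
```

A good candidate will usually also mention alternatives, for example an inner join instead of `INTERSECT`, and explain when each one is preferable.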

Anything else? What made you feel good or bad during interviews? Please comment below!

2 responses to “How to successfully recruit a data scientist?”

  1. charles

    the best way to avoid hiring the wrong person is to hire someone who loves what they do and has been doing it for a long time for some great firms and has great recommendations

  2. Hashmi Dawakhana

    Help full post
