Lead Quality Problems: How We Use Data and Machine Learning to Solve Them

This post originally appeared on Clearbit's Blog.

When Simon Whittick joined Geckoboard as its first VP of Marketing, he took all the standard steps to attract more visitors to their site, convert them, and grow the SaaS company’s revenue. He and his team wrote content for their popular blog, ran paid advertising campaigns, and set up email nurture campaigns. At the end of his first year, he was as successful as almost any other marketing executive in the industry. The site was attracting hundreds of thousands of visitors every month, and the business was booking millions in annual recurring revenue. But unknowingly, his success was driving one of his coworkers crazy.

While 10,000 leads a month earned Whittick applause at the company’s weekly all-hands meeting, it was keeping Geckoboard’s only sales development rep (SDR), Alex Bates, at the office on nights and weekends. Many of the inbound leads were self-serve customers who required no conversation with sales, or tire kickers who were not ready to buy. This left Bates qualifying leads by hand, which consumed a large share of his time.

As a result, Geckoboard’s sales efficiency—one of the most critical metrics for any company—was slumping. In other words, Whittick wasn’t only driving a junior sales rep crazy; he was leaving money on the table.

Over the course of the next year, Whittick built a data-backed machine learning process to solve his company’s lead-qualification problems. In the process, he turned Bates into not only an adoring fan of his, but a one-man sales team as efficient as a typical ten-person SDR team. Without any technical background, Whittick figured out a way to change the shape of his company using data and a bit of machine learning.

One day toward the end of last year, Bates and Whittick sat down to discuss how they could solve their lead-quality problem. They had close to 10,000 leads coming in each month, but they needed to figure out which of those leads to send to sales. Their first instinct was to refine their ideal customer profile. They’d both read all the sales and marketing blogs preaching its importance. They started with an ideal customer profile based on some simple audience rules.

On paper, Geckoboard’s ideal customer was a software company with more than 100 employees; they typically sold to a director or VP. But the truth was that a lot of companies outside that explicit profile would be great customers. For example, their initial model excluded a company with 95 employees even if it looked almost identical to one of their best customers. When they looked at their past data, they learned that leads matching what they believed to be their ideal customer profile converted at twice the rate of other leads, but accounted for only 0.7% of conversions. They needed a more nuanced and flexible inbound model.

[Figure: results of basic ideal-customer-profile lead qualification]

Prior to joining the Geckoboard team, Whittick had worked for Marin Software. While he was there, he began to notice a shift in the way companies approached marketing. The most forward-thinking companies had begun to hire technical employees with degrees in mathematics instead of business. He heard stories of companies that were replacing entire teams and doubling revenue by using publicly (or privately) available information and crunching it to their advantage. As time went on, many of those employees left their jobs to provide the same level of automation to smaller companies without the budget to hire data scientists.

Between his time at Marin Software and Geckoboard, dozens of startups popped up to help larger companies embrace the data revolution. Companies like Clearbit mined the web for demographic and firmographic data that could be used to better filter leads. My own company, MadKudu, makes it possible to pull insights from that data without having a PhD in computer science. By 2016, the entire marketing technology landscape had shifted. With an executive team that embraced innovation and big bets, Whittick decided to make it Geckoboard’s competitive advantage.

The first step Whittick took was to develop his own flexible point-based scoring system. Previously, a lead was given either a 1 or a 0. A lead was either a software company with 100 or more employees or it wasn’t. It was binary. The obvious problem with this model was that a company with 95 employees would be excluded. In addition, a company with 100 employees was given the same score as a company with 1,000 employees, even though the latter was twice as valuable.

In his new model, Whittick gave leads a score based on multiple criteria. For example, he’d give a lead 50 points for having 500 employees or more, and negative 2 points if it had fewer than 10 employees. A director-level job title would receive 10 points, whereas a manager would only receive 3. This created a much wider range of scores, which meant that Bates could prioritize leads. Once he had called the top-scoring leads, he had the option of calling B-tier leads. The model was weighted toward the large accounts Geckoboard knew could significantly impact revenue. For example, a US-based real estate company with 500 employees and a director buyer would be routed to the top of Bates’ lead list, even though it didn’t fit the software industry criteria.
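A point-based model of this kind can be sketched in a few lines of Python. This is only an illustration: the four point values are the ones quoted above, while the `Lead` class, the `score_lead` function, and the choice to give unlisted titles zero points are assumptions made for the example, not Geckoboard's actual rules.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    employees: int
    title: str  # e.g. "Director" or "Manager"


# Title points quoted in the post; any other title scores 0 here (an assumption).
TITLE_POINTS = {"director": 10, "manager": 3}


def score_lead(lead: Lead) -> int:
    """Add up points for company size and job title."""
    score = 0
    if lead.employees >= 500:
        score += 50  # quoted in the post
    elif lead.employees < 10:
        score -= 2   # quoted in the post
    score += TITLE_POINTS.get(lead.title.lower(), 0)
    return score


# Rank a small, made-up batch of leads so the best ones are called first.
leads = [Lead(95, "Manager"), Lead(500, "Director"), Lead(5, "Manager")]
ranked = sorted(leads, key=score_lead, reverse=True)
```

Because the output is a number rather than a yes or no, a 95-person company is no longer discarded outright; it simply lands lower in the queue than a 500-person one.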

This new model was similar to the way SDRs have scored leads for over a decade, only more efficient. Prior to automated lead scoring, sales reps were told by their managers to prioritize leads based on four criteria: budget, authority, need, and timing (or as it’s commonly referred to, BANT). This method is more flexible than a rigid ideal customer profile, but it is only as strong as the rep behind it. Human error, irrational judgment, and varying levels of experience lead to a process with little rhyme or reason. That’s why Whittick chose to automate the task and take humans out of the process entirely.

[Figure: results of advanced point-based lead scoring]

Immediately the company began to see results from their lead-scoring investment. Within the first month, leads were converting at twice the rate. As a result, Bates was spending less time to close more deals. Sales efficiency—revenue collected divided by the time and resources to earn it—rose significantly. Still, Whittick knew he could improve the results and save Bates even more time.

One of the biggest shifts that Whittick saw in the technology industry was the speed at which data could be put to use as a result of new tools. In the old world that he inhabited, a lead couldn’t be scored until it hit a company’s CRM. Enrichment software took hours to append valuable data to a lead.

With the new tools, that data could be appended, sent to the CRM, and the lead scored accordingly before the visitor began typing in the next text box.

After his first lead scoring success, Whittick decided to make another bet. Bates frequently complained about leads that were obviously bad fits—the type of conversation that takes 30 seconds to know there isn’t a mutual fit. Many of the companies were too small to need sophisticated dashboards yet. Whittick enlisted one of the company’s front-end developers to help him solve the problem. They built logic into the product demo request page that would ask for a visitor’s email address and then, before sending them to the next page, score the lead. On the back end, additional information would be appended to the lead using Clearbit, and it would be run through MadKudu’s scoring algorithm. If it received a high-enough score, the next page would ask for the lead’s phone number and tell them to expect a call shortly; if the score was low, they’d be routed through a low-touch email cadence. It was radically successful.
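The routing logic described above might look roughly like the following sketch. The enrichment and scoring steps are passed in as plain functions so that nothing here pretends to be the Clearbit or MadKudu API; `next_step`, `fake_enrich`, `fake_score`, the threshold of 50, and the `.example` domains are all hypothetical stand-ins.

```python
from typing import Callable, Dict

# Hypothetical cutoff; the post does not say what counted as "high enough".
SCORE_THRESHOLD = 50


def next_step(email: str,
              enrich: Callable[[str], Dict],
              score: Callable[[Dict], int],
              threshold: int = SCORE_THRESHOLD) -> str:
    """Decide what a demo-request visitor sees after entering an email."""
    profile = enrich(email)      # stand-in for the enrichment lookup
    lead_score = score(profile)  # stand-in for the scoring service
    if lead_score >= threshold:
        return "ask_for_phone_number"
    return "low_touch_email_cadence"


# Stub implementations so the sketch runs without any external service.
def fake_enrich(email: str) -> Dict:
    domain = email.split("@")[-1]
    employees = 800 if domain == "bigco.example" else 4
    return {"domain": domain, "employees": employees}


def fake_score(profile: Dict) -> int:
    return 60 if profile["employees"] >= 500 else 0
```

Calling `next_step("ann@bigco.example", fake_enrich, fake_score)` returns `"ask_for_phone_number"`, while a four-person company is sent to the email cadence.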

Before implementing their real-time lead scoring solution, only about 15% of Bates’ conversations were meaningful. The new website logic meant that he could cut 85% of the calls he took every day and focus on higher quality accounts. Once again, sales efficiency increased significantly.

In addition to the speed at which information could be appended, processed, and acted on, Whittick saw another change in the marketing technology world: there was suddenly more data than most companies knew what to do with. Marketers could know what CRM, email server, and chat service a company used. They could know when a company was hiring a new employee, when they were written about by a major news outlet, and how much money they’d raised. It was overwhelming. But thanks to tools like Segment, marketers could pipe all that data into a CRM or marketing automation system and act on it. Then they could combine it with information like how frequently someone visited their own site, how often they emailed sales or support, and when they went through a specific part of the onboarding process. For a data-driven marketer like Whittick, this new world was utopia.

In conversations with Bates, Whittick learned that the best leads were the ones that had completed the onboarding process before their first sales conversation. During the Geckoboard free trial, users were prompted to build their first dashboard, connect a data source, and upload their company’s logo and color palette. As is the case with many SaaS solutions, most users dropped off before completing all the steps. Those users weren’t ready for a conversation with sales. But when Bates was looking at his lead list, he had no way of knowing whether or not a free trial user had completed onboarding. As a result, he was spending at least half of his time with people who weren’t ready to talk or buy.

Combining usage data from the website and their app, Whittick set out to refine the lead scoring model even further. Each time a free-trial user completed a step in the onboarding process, it was recorded and sent back to the CRM using Segment. The model would then give that lead a couple of additional points. If the user completed all of the steps, bonus points would be added and the lead would be raised to the top of Bates’ lead queue in Salesforce. Again, Bates began spending less time talking to customers prematurely and more time having high-quality conversations that led to revenue. Whittick had figured out how to save the sales team time and increase sales efficiency further.
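As a rough sketch of that behavioral scoring, the function below awards points per completed onboarding step and a bonus when every step is done. The three steps mirror the ones named earlier in the post; the event names, the 2 points per step, and the 10-point bonus are assumed values chosen for illustration.

```python
from typing import Iterable

# Event names are hypothetical labels for the three onboarding steps.
ONBOARDING_STEPS = {
    "built_dashboard",
    "connected_data_source",
    "uploaded_branding",
}
POINTS_PER_STEP = 2    # "a couple of additional points"; exact value assumed
COMPLETION_BONUS = 10  # assumed


def onboarding_points(completed_events: Iterable[str]) -> int:
    """Score a trial user from the onboarding events recorded so far."""
    done = ONBOARDING_STEPS & set(completed_events)  # ignore unrelated events
    points = POINTS_PER_STEP * len(done)
    if done == ONBOARDING_STEPS:
        points += COMPLETION_BONUS
    return points
```

These points would be added to the firmographic score, so a well-fitting company that has also finished onboarding rises to the top of the queue.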

But while Whittick and Bates were celebrating their improved conversion rates, a new problem was emerging. By the summer of 2016, they had enlisted my team at MadKudu to automate their lead scoring. Rather than manually analyzing conversion data and adjusting their lead scoring model accordingly, our machine learning tool was built to do all the work for them. There was one catch. Today, machine learning algorithms are only as strong as the humans instructing them. In other words, they are incredibly efficient at analyzing huge sets of data and optimizing toward an end result, but a human is responsible for setting that end result. Early on, Whittick set up the model so that it would optimize for the shortest possible sales cycle and the highest account value. He didn’t, however, instruct it to account for churn, an essential metric for any SaaS company. As a result, the model was sending Bates leads that closed quickly but also canceled the service quickly. Fortunately, the solution was simple.

After learning about the problems with his model, Whittick instructed MadKudu’s algorithm to analyze customers by lifetime value (LTV) and adjust the model to optimize for that. He also instructed it to analyze the accounts that churned quickly and assign negative scores to leads that resembled them.

Example: For Geckoboard, digital agencies were very likely to convert, so the old scoring algorithm scored them highly. However, agencies were five times as likely to churn after 3 months, when the project they were working on ended.
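One simple way to surface segments like this is to compare each segment's early churn rate with the overall rate and penalize the outliers. The sketch below is not MadKudu's algorithm; the function names, the cutoff of 2.0, the 30-point penalty, and the toy customer list are all invented for illustration.

```python
from collections import defaultdict


def early_churn_ratio(customers):
    """customers: iterable of (segment, churned_early) pairs.

    Returns {segment: segment churn rate / overall churn rate}.
    """
    totals = defaultdict(int)
    churned = defaultdict(int)
    for segment, churned_early in customers:
        totals[segment] += 1
        churned[segment] += int(churned_early)
    overall = sum(churned.values()) / sum(totals.values())
    if overall == 0:
        return {segment: 0.0 for segment in totals}
    return {s: (churned[s] / totals[s]) / overall for s in totals}


def churn_penalty(segment, ratios, cutoff=2.0, penalty=-30):
    """Negative points for segments that churn far more than average."""
    return penalty if ratios.get(segment, 1.0) >= cutoff else 0


# Toy data, not Geckoboard's: 4 of 5 agencies churn early, 1 of 15 software firms.
toy_customers = (
    [("agency", True)] * 4 + [("agency", False)]
    + [("software", True)] + [("software", False)] * 14
)
ratios = early_churn_ratio(toy_customers)
```

On this toy data the agency segment churns at 3.2 times the overall rate and receives the penalty, while software companies do not.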

At this point, the leads being sent to Bates were significantly better in aggregate than the leads he had previously been receiving. However, there were still false positives that would throw him off. While the overall stats on scored leads were looking great, the mistakes the model made hurt trust between sales and marketing and were hard to accept. To combat this and make the qualification model close to perfect, Whittick had Bates start flagging any highly scored leads that turned out to be poor fits.

Through this process, they found that many of the bad leads that made it through were students (student@devbootcamp.com), fake signups (steve@apple.com), or more traditional companies that did not have the technology profile of a company that would likely use Geckoboard (tractors@acmefarmequipment.com). Whittick was then able to add specific, derived features to their scoring system to effectively filter these leads out and yet again improve the leads making it to Bates.
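Derived features of this sort are often simple flags computed from the raw signup data. The following sketch shows one possible shape for them; the domain lists, the `.edu` rule, and the `detected_tech_count` input (imagined as coming from an enrichment step) are assumptions, not the features Geckoboard actually built.

```python
# All lists below are hypothetical placeholders.
EDUCATION_DOMAINS = {"devbootcamp.com"}
IMPLAUSIBLE_DOMAINS = {"apple.com"}  # big brands unlikely to sign up this way


def derived_features(email: str, detected_tech_count: int) -> dict:
    """Turn a signup email and a technology count into boolean flags."""
    local, _, domain = email.lower().partition("@")
    return {
        "is_student": (
            local == "student"
            or domain in EDUCATION_DOMAINS
            or domain.endswith(".edu")
        ),
        "is_suspected_fake": domain in IMPLAUSIBLE_DOMAINS,
        "low_tech_profile": detected_tech_count == 0,
    }


def should_filter(features: dict) -> bool:
    """Keep a lead away from sales if any red flag is set."""
    return any(features.values())
```

In practice each flag could carry its own negative weight in the score instead of acting as a hard filter; the all-or-nothing version is used here only to keep the example short.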

At this point, Geckoboard can predict 80% of their conversions from just 12% of their signups. By increasing sales efficiency with machine learning, Whittick found a way to enable Bates to do the work an average sales team of five could typically handle.
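A figure like "80% of conversions from 12% of signups" is a capture rate: rank leads by score, take the top slice, and count what share of all conversions falls inside it. The sketch below shows that calculation on made-up data; `conversions_captured` and the toy leads are illustrative, and only the 80% and 12% figures come from the post.

```python
def conversions_captured(scored_leads, top_fraction):
    """Share of all conversions found in the top-scored fraction of leads.

    scored_leads: list of (score, converted) pairs.
    """
    ranked = sorted(scored_leads, key=lambda lead: lead[0], reverse=True)
    cutoff = int(round(len(ranked) * top_fraction))
    total = sum(converted for _, converted in ranked)
    if total == 0:
        return 0.0
    captured = sum(converted for _, converted in ranked[:cutoff])
    return captured / total


# Toy data: 50 leads scored 1..50, five of which convert.
converting_scores = {50, 49, 47, 46, 20}
toy_leads = [(s, s in converting_scores) for s in range(1, 51)]
share = conversions_captured(toy_leads, 0.12)  # 0.8 on this toy data

# Implied lift from the post's figures: the top group converts at roughly
# 0.80 / 0.12, or about 6.7 times, the average rate across all signups.
lift = 0.80 / 0.12
```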

From self-driving trucks to food delivery robots, this is the story of twenty-first-century business. Companies like Geckoboard are employing fewer people and creating more economic value than enterprises ten times their size. Leaders like Whittick are center stage in this revolutionary tale, figuring out how to optimize sales efficiency or conversion rates or any other metric given to them, just like the artificial intelligence they now employ. But of course, this has been happening over many years, even decades. The difference—and this cannot be overstated—is that Whittick doesn’t have a PhD in applied math or computer science. The technology available to marketers today enables companies to generate twice the revenue with half the people.
