#23 - Confessions of a fraud CTO: My million-dollar algorithm mistake

A couple of weeks ago I had the chance to speak at a panel titled “My biggest mistake in fraud”.

During our prep call, we had a quick round between the speakers so each one could briefly tell their story.

Somehow my mind always goes blank in such conversations. Don’t get me wrong, it’s not that I have a flawless track record – oh lord, not at all.

It’s just that usually when we talk about mistakes in fraud, we tend to refer to operational mistakes:

Activating the wrong rule at the wrong time;

Deactivating the wrong rule at the wrong time;

Introducing some data quality issues, etc.

And even though I made plenty of these mistakes myself, they all seem like boring stories to me.

(Or maybe I’m very good at suppressing the details of my mistakes, I would not put it beyond my treacherous mind…)

Anyway, back to the conversation: it suddenly hit me. My worst mistake wasn’t operational in nature, and maybe that’s why it never occurred to me.

But not only did it cost me millions (!!!), it also set me back by at least a couple of years.

Now isn’t that a story worth sharing?

Planting Bad Seeds

When we started Fraugster, my previous company, the idea was to detect fraud in real-time using a new Machine Learning algorithm we developed ourselves.

Now let me preface this by saying that I’m sure you’ve heard this claim before. However, it’s rarely true.

When teams say they have “homebrewed” their own algorithms, what they usually mean is that they’ve combined a couple of known algorithms, tuned the hyperparameters (i.e. the algorithm’s configuration), or built some unique data features.

All of these might enable you to train your own models, but it’s almost never the case that the algorithms themselves were written from scratch.

Without going too much into boring details, our idea was this:

We would build an in-memory database that would enable us to query it thousands of times within nanoseconds.

This would allow us to write a very compute-heavy algorithm that would basically “research” each transaction we got in real time.

Since the queries would happen in real time, it also meant that as long as we kept the database “fresh” with new data, the algorithm would always be at peak performance.

Essentially, this is how it would self-learn.

And while there were a couple of general-purpose in-memory databases that were similar in nature, we wanted to create a fraud-specific one that would also have its own scoring algorithm.
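To make the idea concrete, here's a toy sketch of that pattern (not Fraugster's actual engine, and all the key names are made up): an in-memory store of historical transaction facts that stays "fresh" with every labeled transaction, and that a scoring routine can query many times per incoming transaction.

```python
# Toy sketch of the concept: an in-memory store of per-key fraud
# counters that a compute-heavy scorer can query many times per
# transaction. All keys and fields here are illustrative.
from collections import defaultdict

class InMemoryFraudStore:
    """Keeps simple per-key counters entirely in RAM for fast lookups."""
    def __init__(self):
        self.counts = defaultdict(lambda: {"total": 0, "fraud": 0})

    def ingest(self, key, is_fraud):
        # Keeping the store "fresh": every labeled transaction updates it
        self.counts[key]["total"] += 1
        self.counts[key]["fraud"] += int(is_fraud)

    def fraud_rate(self, key):
        c = self.counts[key]
        return c["fraud"] / c["total"] if c["total"] else 0.0

store = InMemoryFraudStore()
store.ingest(("card_country", "DE"), is_fraud=False)
store.ingest(("card_country", "DE"), is_fraud=True)

# A real scorer would issue thousands of such lookups per transaction:
features = [store.fraud_rate(("card_country", "DE")),
            store.fraud_rate(("ip_country", "NG"))]
print(features)  # [0.5, 0.0]
```

The whole point of keeping everything in RAM is that each lookup is cheap enough to afford thousands of them within a real-time latency budget.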

Sounds cool, right?

In reality, implementing this vision was very challenging and took us the better part of four years, even though we had a working prototype within a couple of months.

But you’ve got to ask yourself - other than claiming to be doing something different, is the investment worth it?

Early on, we ran a couple of tests where we pitted our algorithm against a couple of algorithms that were commonly used for fraud detection - Logistic Regression and Random Forests.

It outperformed both.
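For readers who haven't run such a head-to-head: with today's open-source tooling, benchmarking those two baselines takes only a handful of lines. This is a minimal sketch using scikit-learn on synthetic imbalanced data (the dataset, features, and fraud rate are stand-ins, not our actual test setup):

```python
# Minimal head-to-head of two common fraud-detection baselines on a
# synthetic, heavily imbalanced dataset. Any candidate algorithm would
# be scored on the exact same split for a fair comparison.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# ~1% positive ("fraud") class, mimicking payments-style imbalance
X, y = make_classification(n_samples=20_000, n_features=20,
                           weights=[0.99], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                          test_size=0.3, random_state=42)

baselines = {
    "logreg": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200,
                                            random_state=42),
}
for name, model in baselines.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

That ease of setup is exactly the out-of-the-box tooling advantage I'll come back to later.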

We were of course happy to see our vision validated, and continued onwards at full force.

Reaping a Storm

A few years passed, and one of our data scientists was doing some exploratory research. He was trying to find out whether we could get better performance from slower (>200 millisecond) algorithms.

In one of his experiments he accidentally had an out-of-the-box Logistic Regression algorithm run against our main algorithm. Mind you, our algorithm was very mature at that time.

The LogReg algorithm crushed ours, without any special tuning.

Not only that, it was faster.

You would think that no better accident could happen. What better R&D outcome could you hope for? A better, cheaper AND faster algorithm? Yes sir!

But I want you to understand what went through my mind when I heard about it:

My mouth went dry, my stomach clenched. Did I just burn millions of my investors’ funds, and a few good years chasing a pipe dream?

As a CTO, that was my worst nightmare. If true, it would be the biggest mistake I could possibly have made.

I mustered my critical thinking skills and started to hypothesize:

Maybe it was a specific dataset that was randomly skewed?

Maybe the algorithm performed well but degraded very fast?

Maybe the fraud rate was too high in the sample, and it wouldn’t work with real-life ratios?
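The fraud-rate hypothesis, at least, is directly testable: re-sample the same test set to different fraud ratios and check whether the model's ranking quality actually collapses. Here's a hedged sketch of that check with a stand-in model and synthetic data (not the experiments we actually ran):

```python
# Sketch of testing the fraud-rate hypothesis: score one fixed model
# against test sets down-sampled to different fraud ratios and see
# whether AUC degrades. Model and data are illustrative stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50_000, n_features=20,
                           weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                          test_size=0.5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
fraud_idx = np.flatnonzero(y_te == 1)
legit_idx = np.flatnonzero(y_te == 0)

for target_rate in (0.05, 0.01, 0.002):   # 5%, 1%, 0.2% fraud
    # Down-sample the fraud class to hit the target ratio
    n_fraud = min(len(fraud_idx),
                  int(len(legit_idx) * target_rate / (1 - target_rate)))
    keep = np.concatenate([rng.choice(fraud_idx, n_fraud, replace=False),
                           legit_idx])
    auc = roc_auc_score(y_te[keep],
                        model.predict_proba(X_te[keep])[:, 1])
    print(f"fraud rate {target_rate:.1%}: AUC = {auc:.3f}")
```

A metric like AUC is insensitive to class ratio by construction, which is part of why this particular hypothesis was unlikely to save us.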

I immediately devised a few research questions and sent my team on an endless quest to run even more tests to prove the initial one wrong.

It had to be wrong.

But every time they came back with more proof that this algorithm was better, and every time I refused to believe it.

The tests I asked them to run became more and more outlandish.

Finally, I had to give in.

I had to admit to myself, to my data team, and to my leadership team that I had been wrong all along.

We productionalized the new algorithm and it finally went live a full year after the first incidental finding.

Another full year of going in the wrong direction just because my ego wouldn't let me admit my mistake.

What I’ve Learned

There is another failure to consider: after all, we did test LogReg when we started down our R&D path. And back then, it proved to be inferior to our algorithm.

What changed?

It’s hard to say. But I also need to consider the fact that nothing really changed.

After all, we were out to prove our idea was pure genius, and it’s very easy to bias your experiments with such an attitude in mind.

The point is that it doesn’t matter much. If I could go back in time, I wouldn’t even bother with experimenting. Today I would always opt for proven, supported technology.

Why? Here’s the thing:

In the few months it took us to productionalize the LogReg algorithm, some crucial aspects of it came to light.

It wasn’t just the performance itself. It was also how easy it was to test and implement it.

You see, when you design your own algorithm from scratch you don’t get out-of-the-box tooling.

You don’t get to hire data scientists with experience in utilizing it.

You don’t get open-source code libraries.

And you definitely don’t get decades of academic research publications.

You’re on your own.

And so, even if I knew the absolute truth to be that a homebrewed algorithm has better performance, I would still not work with it. It’s just too much hassle.

Hassle = slow = expensive = low ROI

In fact, I’m quite convinced that without scientific and technological community collaboration and knowledge-sharing, you can’t expect better performance in the first place.

What an expensive lesson it has been for me.

But you live and learn.

What was your biggest mistake in fraud? Hit reply and let me know. Confidentiality guaranteed. :)

In the meantime, that’s all for this week.

See you next Saturday.

P.S. If you feel like you're running out of time and need some expert advice with getting your fraud strategy on track, here's how I can help you:

Free Discovery Call - Unsure where to start or have a specific need? Schedule a 15-min call with me to assess if and how I can be of value.
Schedule a Discovery Call Now »

Consultation Call - Need expert advice on fraud? Meet with me for a 1-hour consultation call to gain the clarity you need. Guaranteed.
Book a Consultation Call Now »

Fraud Strategy Action Plan - Is your Fintech struggling with balancing fraud prevention and growth? Are you thinking about adding new fraud vendors or even offering your own fraud product? Sign up for this 2-week program to get your tailored, high-ROI fraud strategy action plan so that you know exactly what to do next.
Sign-up Now »

 

Enjoyed this and want to read more? Sign up to my newsletter to get fresh, practical insights weekly!
