The Search AI Wars

April 17, 2023

15 minutes

OpenAI's ChatGPT product and GPT-4 model, which represent the current state of the art of Large Language Models (LLMs), have taken the tech world by storm and captivated the imaginations of technologists and investors alike. The prevailing mood seems to be that every company or product that can be disrupted by AI will be, and startup founders and big tech executives are both scrambling to catch this next wave of innovation early.

There are many facets of this new technology and business landscape to analyze, but the one I want to focus on in this post is the impact of LLMs (I will use LLM and AI interchangeably) on Google's search business. I do own Google stock, but this is not financial advice, just my attempt at making sense of the discourse around this topic. Frankly, I think too much of the discussion around AI has focused on far-future, extreme predictions: AI is coming for your job, the singularity is upon us, AI chatbots will fill the web with spam. I wish there were more thoughtful discussions about how AI will impact search, and which companies will come out of this potential shakeup as winners. This topic deserves more attention because google.com is the most visited website in the world, and the workings of Google search affect the vast majority of us multiple times every day (often in ways we don't even realize). When Google makes changes to search, it matters. Microsoft's integration of OpenAI's GPT-4 model into Bing, and Google's recent responses, capture the here and now of AI's impact on technology.

The common argument I've seen in the wake of OpenAI's splashy demos and recently announced partnership with Microsoft, compared with Google's seemingly reactionary and underwhelming response with its own chatbot, Bard, is that LLMs carry the potential to massively disrupt the search experience as we know it today, and Google will not be the one to lead that dance. The Google doom case predicts that instead of having to issue several queries to Google to find the website(s) that contain what you're looking for (clicking through to advertisers that bid on real estate in Google's search results is, of course, key to their business model), you'll be able to send one larger query to an LLM-powered search engine, which will deeply understand your intent and respond with a comprehensive answer. LLM-powered search would be like interacting with an intelligent chatbot that scours the internet on your behalf. This mode of search would disintermediate Google, removing its wedge between users and information and inhibiting its ability to serve ads and sell ad space. The argument continues that this is a classic Innovator's Dilemma problem. The Innovator's Dilemma, in short, is that companies won't kill their golden goose, even if they have the technical acumen to do so. The inertia of a working business model keeps incumbents entrapped in what works, until it doesn't.

It's a compelling argument, and parts of it are true. I do believe that LLM-enabled search will be different than search is today. Google also certainly has a vested interest in keeping advertisers involved in search, even if that doesn't always benefit the user. Finally, there are technical and economic reasons that Google can't just turn on LLM-powered search, even though it has its own LLMs that are competitive with OpenAI's. Most discussion around the costs of LLMs focuses on the cost of training these huge models. But Google can absorb that cost. The cost-prohibitive part of these massive models is using them to serve queries (so-called "inference mode"). Google search is arguably the highest-volume product on the internet, and Google makes only a few pennies of revenue on each query -- anything that meaningfully increases the cost per query is financially untenable for Google (this article gives a fantastic breakdown of the costs involved). Therefore, it is easier for a scrappy upstart to experiment with alternative modes of search that may not be economically viable long term than it is for Google.
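To make the scale problem concrete, here is a rough back-of-the-envelope sketch in Python. The numbers are my own illustrative assumptions (commonly cited ballparks for query volume and search ad revenue, plus a hypothetical per-query LLM inference cost), not figures from the cost breakdown linked above.

```python
# Back-of-the-envelope: why per-query cost matters at Google's scale.
# All figures are rough public ballparks or outright assumptions; this is an
# illustration of the argument, not a real cost model.

QUERIES_PER_DAY = 8.5e9              # commonly cited estimate of daily Google searches
SEARCH_AD_REVENUE_PER_YEAR = 160e9   # rough annual Google search ad revenue, USD

revenue_per_query = SEARCH_AD_REVENUE_PER_YEAR / (QUERIES_PER_DAY * 365)
print(f"Revenue per query: ~${revenue_per_query:.3f}")  # on the order of a few cents

# Hypothetical incremental cost of running LLM inference on every query.
llm_cost_per_query = 0.01  # assumed: one extra cent per query

extra_cost_per_year = llm_cost_per_query * QUERIES_PER_DAY * 365
print(f"Extra annual cost at 1 cent/query: ~${extra_cost_per_year / 1e9:.0f}B")
```

Even an extra cent of inference cost per query, under these assumed numbers, adds tens of billions of dollars per year -- a meaningful bite out of margins at Google's volume, but irrelevant to an upstart serving a tiny fraction of that traffic.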

That said, I am bullish on Google and think they will emerge from the AI craze stronger than before. I think the Innovator's Dilemma argument against Google overlooks several important but subtle dynamics at play:

  • It overstates Google's business model risk
  • It ignores Google's true moat in search
  • It underweights Google's technical head start
  • It ignores the real disruption


Google's Business Model Risk

We need to be very precise when we talk about how Google makes money. As it relates to search (Google, of course, has other sources of revenue beyond search, such as YouTube and Google Cloud Platform), the main way Google makes money is when a user clicks on an ad in the search results. The advertiser bids to have their ads displayed for that query, and then pays Google for the click. Google claims that 80% of queries don't show ads at the top of search results, and that the vast majority of ads you see in Google search results appear on searches with commercial intent. Given that, it seems to me that the next generation of LLM-powered search won't look too different from search as we know it today for queries that express commercial intent. If a user wants to buy something, they should be shown websites of merchants that want to sell something. If today that transaction is brokered by a traditional search engine, and tomorrow it's brokered by an LLM-powered search engine, I see that not so much as a paradigm shift but as an improvement in UX. In light of all that, I argue that LLM-powered search doesn't pose as direct an attack on Google's moneymaker as many people claim.


Google's Real Search Moat

LLM-powered search may, and hopefully will, disrupt what we have come to expect from the search experience. But any search competitor looking to unseat Google has an extra-large hill to climb, because Google's moat in search extends much further than building a world-class search engine. Google's hidden moat in search lies in its convenience. Google has spent many years and several billion dollars to ensure that it takes as few clicks as possible to issue a Google search. Google Chrome is the world's most popular browser. Android is the world's most popular mobile operating system, and for the last eight years, more Google searches have been performed on mobile devices than on desktop. Google also pays around half a billion dollars to be the default search engine on Firefox, and they pay many times that amount to be the default search engine on Apple devices. This "convenience moat" may seem flimsy, but it is extremely well fortified.

In fairness, Google's moat in search is also due to it being an incredibly full-featured product. Google has relentlessly expanded the volume and variety of queries it can effectively service, and the result of all that effort and improvement is that users are habituated to go to Google with many types of search queries and expect a good result. Scrolling through the history of search, I'm struck by just how many categories of queries are out there, including but not limited to: using Google as a calculator, a dictionary, checking the weather, checking financial info, checking election info, accessing information on COVID-19, and much more. Many of these use cases are what I'd call timely lookups, and LLMs, with their inability to be rapidly updated, are ill-suited to handle these types of queries.

More generally, just a few years ago SparkToro reported that over half of Google searches are "clickless queries," in which the user doesn't need to go to a website to find the information they are looking for. The breakdown in the report is interesting: as of 2019, 50.33% of queries don't require a click, 45.25% result in an organic click, and 4.42% result in an ad click. Over half of Google searches don't even have the potential to be monetized! That may seem like a rare act of corporate charity, but Google makes the easy queries as easy as possible to reinforce the habit of going to Google. Any search upstart is going to have the hard work of breaking that habit.


Google's Technical Head Start

I find much of the discussion around how OpenAI's technical superiority will leave Google in the dust to be somewhat curious. GPT literally stands for "Generative Pre-trained Transformer", and transformers were invented by Google in a landmark paper six years ago. Now, just because Google invented the concept doesn't mean they'll be best at operationalizing it. After all, they didn't create the first search engine. In any case, many are conceptualizing LLMs as a Whole New Search Paradigm when the reality is that Google has been integrating AI and language models into search for over a decade.

In 2012, Google announced their Knowledge Graph, which touted Google's ability to recognize and search across entities instead of just textual data. More to the point, Google started taking the first steps toward determining what an article was really about, even if the article barely mentioned that particular thing. The Google Research blog gave the following example:

Using background knowledge, we might be able to infer that the WNBA is a salient entity in the Becky Hammon article even though it only appears once.
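As a rough present-day analogue, this kind of entity salience scoring is exposed through Google's Cloud Natural Language API. The sketch below is a minimal, hypothetical usage example (it assumes you have Cloud credentials configured and the google-cloud-language package installed); it is not the 2012 system the blog post describes, but it illustrates the same idea of scoring which entities a piece of text is "really about."

```python
# Minimal sketch of entity salience scoring via the Google Cloud Natural
# Language API. Assumes GOOGLE_APPLICATION_CREDENTIALS is set up. This is a
# present-day analogue for illustration, not the 2012 Knowledge Graph system.
from google.cloud import language_v1

text = "Becky Hammon spent years as a point guard before moving into coaching."

client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content=text, type_=language_v1.Document.Type.PLAIN_TEXT
)
response = client.analyze_entities(request={"document": document})

# Each detected entity comes with a salience score in [0, 1] indicating how
# central it is to the text as a whole.
for entity in sorted(response.entities, key=lambda e: e.salience, reverse=True):
    print(f"{entity.name:>20}  salience={entity.salience:.2f}")
```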

In 2015, Google introduced RankBrain, which used machine learning models to associate web pages with concepts. In 2019, they introduced Neural Matching, which enabled Google to more deeply understand the intent behind a user's query. Taken together, these improvements allowed Google to keep returning results that satisfy a user's query while reducing the burden on the user of having to specify exact matching terms.

Later in 2019, Google introduced BERT, which enabled Google to understand queries more like a human would. BERT is, in some sense, the true predecessor to the LLMs that have come out recently. The core innovation of BERT is that it understands each word in a query in the context of the words around it. The Google Research blog gives a salient example: understanding the different meanings of the word "bank" depending on whether it's used in "bank account" or "bank of the river".
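To see what "understanding a word in context" means in practice, here is a minimal sketch using the open-source bert-base-uncased checkpoint from Hugging Face (an assumption on my part; it is not Google's production search model). The same surface word "bank" gets noticeably different vectors depending on its neighbors.

```python
# Contextual embeddings with an open-source BERT checkpoint: the vector for
# "bank" depends on the surrounding words, unlike a static word embedding.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embedding_of(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

acct = embedding_of("i opened a bank account", "bank")
river = embedding_of("we sat on the bank of the river", "bank")
same = embedding_of("i closed my bank account", "bank")

cos = torch.nn.CosineSimilarity(dim=0)
print(f"bank(account) vs bank(river):   {cos(acct, river):.2f}")  # typically lower
print(f"bank(account) vs bank(account): {cos(acct, same):.2f}")   # typically higher
```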

More recently, in 2021, Google introduced yet another innovation called the Multitask Unified Model (MUM). MUM represents a substantial improvement upon BERT: it can understand not just natural language, but also multimodal data and world events.

Each of these models has helped to lay the conceptual and practical groundwork for the latest crop of large language models. Google may not have been first to market with their own LLM, but I have a hard time seeing why they'll be left behind by the latest in a long line of artificial intelligence evolutions.


The Real Disruption

The above analysis is not meant to imply that LLMs won't affect the status quo and that Google's prospects are therefore secure. I think LLMs are going to hugely impact how we navigate the internet, and they are already spawning an exciting new class of AI-powered applications.

But look at the names we ascribe to the products we use today - search engine, browser. The names imply that we use these tools to find or explore information. And then look at the magic moments that ChatGPT and other LLMs produce. People seem most wowed when they use these products to co-create something. The real promise of generative AI is when it's...generative.

So, while I do think LLMs will change the search experience we're used to, I don't think the primary disruption of LLMs will be to search; rather, I think it will be the creation of new kinds of products that let users do more than passively locate information. I don't know exactly what these products will look like. But I do have a hunch that personalization will be a key vector for these products going forward. Right now, we marvel at ChatGPT's ability to produce human-quality content. At some point, I think we'll come to expect this ability of LLMs and will instead crave content that's uniquely tailored to each user. In a recent episode of the Hard Fork podcast, Alphabet CEO Sundar Pichai spoke of getting everybody "their own personalized assistant". Google has a head start over most other companies, except maybe Apple, in building personalized services, given how much it knows about you.

---

So, I predict that Google will be one of the winners of the search AI wars. At the very least, it'll be a fun story to follow! I hope you had as much fun reading this as I did researching and writing it.