[Opinion] Bing AI: has Microsoft lost its mind?

In case you haven’t heard, Microsoft is “Reinventing search with a new AI-powered Microsoft Bing and Edge, your copilot for the web”. Surfing on the ChatGPT craze, they have also announced the integration of chatbots into Skype, so that you can send a quick question to your AI pal in the middle of a conversation with friends and family.

Let’s put aside the fact that it necessarily means giving full access to the content of your conversations to Microsoft (no end-to-end encryption if Bing needs to listen in to offer its advice!).

I just want to quickly review the examples that Microsoft has decided to highlight in its own blog posts to showcase the capabilities of its new superpowered AI. Because… I’m starting to doubt that anyone even reads the chatbot outputs before sharing them and saying how wonderful they are.

Let’s start with the first example, from the Bing AI announcement.

I am planning a trip for our anniversary in September. What are some places we can go that are within a 3 hour flight from London Heathrow?

Relatively simple and straightforward request. We need destinations that are:

Less than a 3-hour flight from London Heathrow.
Suitable / popular for a romantic getaway.

If we ask Google for a “romantic trip less than 3 hours flight from London”, we get several articles with a lot of possible choices, from inspiringtravel.co.uk, thegentlemansjournal.com, or travelrepublic.co.uk. Not all of those are suitable for the request, but it’s fairly easy to sift through (almost all include some estimation of the flight time and what to do there).

What’s Bing AI’s take?

First one is Malaga. The provided information seems correct, and it fits the request, depending on what you consider suitable for an anniversary trip. So far, so good.

Second one is… Annecy, in France. Certainly a very nice city, but it doesn’t have an airport served by flights from Heathrow, and is therefore absolutely not within a 3 hour flight from London Heathrow. Ok. One mistake, let’s move on.

Third one is… Florence, Italy. Which as far as I can tell requires a minimum flight time of four hours.

That’s the first example they chose to highlight: two of the three proposed destinations do not fit the prompt. Among the suggestions from the Google search, you can find Paris, Amsterdam, and plenty of other destinations which arguably fit the prompt better. Also, you know, instead of the Venice of France, there is actually a 2h20 flight from London to actual Venice…

Let’s see what Bing AI in Skype can do. Here, they provide three conversation examples. In the first one, they ask for some vegetarian recipes, and it delivers. I don’t really see how it’s easier than Google in this case, but fine. Second one is about cleaning up a full mailbox, and here again the results seem fine, but you could put the same thing in Google (or probably even Bing!) and get it in the first result.

The third one is where we finally see a request that requires some level of “intelligence”: “what should we do during a layover in Spain? Food an beaches ideally”. That’s a relatively easy question: Bing AI can choose from all over Spain, and information about beaches and restaurants is typically not difficult to find. Can you guess how well it performs?

Bing AI in Skype example from Microsoft’s blog

The first suggestion is fine… Almost. At first glance. As far as I can tell (because the reference is hidden in the screenshot) La Mallorquina in Barcelona is not a pastry shop, it’s a textile shop. There is a pastry shop called “La Mallorquina Formentor”, but it doesn’t seem particularly famous. There is a famous pastry shop called “La Mallorquina” in Spain… but it’s in Madrid. I’m sure Spanish – and Catalan – users will be delighted to know Bing AI confuses Madrid with Barcelona…

The second suggestion is incredible. “If you are in Port of Spain”. Port of Spain. A city in Trinidad and Tobago, around 6000 km from Spain, is Bing AI’s second option for a layover in Spain. In the example cherry-picked by Microsoft to showcase how good their system is. Seriously.
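For anyone who wants to check that distance rather than take my word for it, here is a minimal sketch of a great-circle (haversine) calculation. The coordinates are approximate city-centre values that I am assuming for illustration, and Madrid is just one possible reference point in Spain; neither comes from Microsoft’s demo.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for illustration): (latitude, longitude)
MADRID = (40.42, -3.70)
PORT_OF_SPAIN = (10.65, -61.50)

distance = haversine_km(*MADRID, *PORT_OF_SPAIN)
print(f"Madrid to Port of Spain: about {distance:.0f} km")
```

Measured from Madrid, the result lands somewhat above 6000 km. Picking another reference point in Spain shifts the number, but not the conclusion: this is not a layover in Spain.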

This is absurd. Microsoft may be getting some people to Bing with the hype, but I really don’t see how they will retain customers if they have such a high miss rate in their answers.

A comparatively much smaller mistake made by Bard in Google’s demo of their competing system apparently had a huge impact on their stock, while Microsoft’s mistakes (and there are others in their other demos) don’t seem to have much impact on the hype. That suggests either that tech investors are completely irrational, or that everyone is so used to Bing being bad that “bad, but with style” is really seen as an improvement.

Anyway. I can’t wait for this hype cycle to be over so that we can move beyond large generative models, which seem more and more like a scientific dead end.

8 Mar 2023 | Adrien Foucart | adrien@adfoucart.be