Google: Bard Now 30% Better at Computation-Based Problems

As Microsoft, OpenAI and several other tech firms add new features and enhancements to their generative AI models, Google is following suit with new improvements to Bard that strengthen the chatbot’s math and coding capabilities, as well as an export feature.

The company says the updates have raised Bard’s accuracy on computation-based word and math problems by approximately 30%.

According to Google, the company is introducing a new technique called “implicit code execution” to help Bard detect computational prompts and run code in the background. The intended result is a more accurate response to mathematical tasks, coding questions and string manipulation prompts. The update also adds a new feature that allows users to export a table to Google Sheets.

In a blog post, Google leaders overseeing Bard say the improvements will make the generative AI chatbot better at answering questions such as:

What are the prime factors of 15683615?
Calculate the growth rate of my savings
Reverse the word “Lollipop” for me
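
For illustration, each of these prompts maps onto a few lines of ordinary code. The sketch below is not Bard's generated code, which Google has not published; the function names and the savings figures are assumptions made up for this example.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    divisor = 2
    # Trial division: only divisors up to sqrt(n) need to be checked.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors


def annual_growth_rate(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two balances (hypothetical inputs)."""
    return (end / start) ** (1 / years) - 1


def reverse_word(word: str) -> str:
    """Reverse a string character by character."""
    return word[::-1]


print(prime_factors(15683615))                   # [5, 151, 20773]
print(round(annual_growth_rate(1000, 1210, 2), 4))  # 0.1
print(reverse_word("Lollipop"))                  # popilloL
```

A language model that only predicts likely next words can easily get any of these wrong, while the code returns the same exact answer every time.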

In the blog, Google says large language models (LLMs) are like prediction engines. Essentially, LLMs generate a response to prompts by predicting what words are likely to come next.
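
As a rough illustration of "predicting what words are likely to come next," the toy sketch below counts which word most often follows another in a tiny sample text. Real LLMs use neural networks trained on vast amounts of text rather than simple counts; the sample sentence and function names here are assumptions for the example only.

```python
from collections import Counter, defaultdict
from typing import Optional


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts: dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigrams("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # cat
```

The predictor picks whatever looked most likely in its training text. It performs no calculation, which is why this style of generation struggles with arithmetic.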

“As a result, they’ve been extremely capable on language and creative tasks, but weaker in areas like reasoning and math,” write Google Bard leaders. “In order to help solve more complex problems with advanced reasoning and logic capabilities, relying solely on LLM output isn’t enough.”

This new method, however, allows Bard to generate and execute code to boost its reasoning and math abilities.

According to Google, this approach is inspired by “a well-studied dichotomy in human intelligence, notably covered in Daniel Kahneman’s book ‘Thinking, Fast and Slow’”: the separation of “System 1” and “System 2” thinking.

“System 1 thinking is fast, intuitive and effortless,” the Bard experts write. “When a jazz musician improvises on the spot or a touch-typer thinks about a word and watches it appear on the screen, they’re using System 1 thinking. System 2 thinking, by contrast, is slow, deliberate and effortful. When you’re carrying out long division or learning how to play an instrument, you’re using System 2.”

LLMs have essentially been operating under System 1, producing responses quickly but without deep thought, which leads to errors on tasks such as complex math problems.

Meanwhile, traditional computation aligns more closely with System 2 thinking: it is formulaic and inflexible, but it can produce impressive results with the “right sequence of steps,” Google says.

With the latest update, Google is combining the capabilities of LLMs and traditional code, which it compares to combining System 1 and System 2 thinking.

“Through implicit code execution, Bard identifies prompts that might benefit from logical code, writes it ‘under the hood,’ executes it and uses the result to generate a more accurate response,” Google says. “So far, we’ve seen this method improve the accuracy of Bard’s responses to computation-based word and math problems in our internal challenge datasets by approximately 30%.”
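
The detect, execute and respond flow Google describes can be sketched in a few lines. This is a simplified illustration, not Google's implementation: Bard writes code on the fly, whereas this sketch uses hand-written pattern rules and fixed solver functions, and every name in it (including the stand-in `llm_only_answer`) is hypothetical.

```python
import re
from typing import Callable, Optional


def llm_only_answer(prompt: str) -> str:
    """Hypothetical stand-in for a plain language-model response."""
    return f"[model-generated text for: {prompt}]"


def _reverse(match: re.Match) -> str:
    return match.group(1)[::-1]


def _multiply(match: re.Match) -> str:
    return str(int(match.group(1)) * int(match.group(2)))


# Assumed detection rules: each pattern flags a computational prompt
# and pairs it with the code that should handle it.
COMPUTATIONAL_ROUTES: list[tuple[re.Pattern, Callable[[re.Match], str]]] = [
    (re.compile(r"reverse the word\s+\W*(\w+)", re.IGNORECASE), _reverse),
    (re.compile(r"what is\s+(\d+)\s*(?:times|\*|x)\s*(\d+)", re.IGNORECASE), _multiply),
]


def run_computation(prompt: str) -> Optional[str]:
    """Return a computed result if the prompt looks computational, else None."""
    for pattern, solver in COMPUTATIONAL_ROUTES:
        match = pattern.search(prompt)
        if match:
            return solver(match)
    return None


def answer(prompt: str) -> str:
    """Use executed code when a rule matches; otherwise fall back to the model."""
    computed = run_computation(prompt)
    if computed is not None:
        return f"The result is {computed}."
    return llm_only_answer(prompt)


print(answer("Reverse the word “Lollipop” for me"))  # The result is popilloL.
print(answer("Write a poem about autumn"))
```

The design point is the fallback: prompts that match no computational rule still go to the language model, so the code path supplements the model's output rather than replacing it.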
