Boston startup's search engine uses AI to make dense search more accessible - The Boston Globe

“We really built this for people like us,” said Christian Salem, who played quarterback at Northwestern University in suburban Chicago from 2012 to 2016 and earned a degree in economics. “We are tech professionals who secretly want to be scientists. . . but have neither the time nor the attention needed to read scientific articles.”

Co-founder Eric Olson, a Sudbury native, played right tackle at Northwestern and earned a master’s degree in predictive analytics. Olson then worked as a data scientist at DraftKings, a Boston-based sports betting company, while Salem became a product manager for the National Football League.

Olson and Salem aren’t the first to try their hand at a scientific search engine. Since 2015, the Allen Institute for AI in Seattle has operated Semantic Scholar, a website that summarizes the content of academic research articles.

And in mid-November, tech titan Meta, Facebook’s parent, launched a public beta of a tool called Galactica that had been trained to summarize 48 million academic papers. But Galactica was shut down after just two days because its results were often meaningless gibberish.

“At launch, we pointed out the limitations of large language models such as Galactica, including the potential for it to generate inaccurate and unreliable output,” Meta said in a statement. “Given the propensity of large language models such as Galactica to generate text that may seem authentic, but is inaccurate. . . we have chosen to remove the demo from public availability.”

Carl Bergstrom, a biology professor at the University of Washington who tried Galactica, said the AI often produced answers that sounded believable but were completely wrong. “If you give it something it doesn’t know,” Bergstrom said, “it just makes stuff up out of whole cloth.”

Instead of trying to summarize academic papers, Consensus simply identifies and highlights key findings. Bergstrom said the results are much more useful. “I really, really, really don’t like Galactica,” he said, “and I love [Consensus].”

Olson and Salem had been thinking about a scientific search engine for years. But they got serious when they saw how easily a traditional search engine like Google could spread misinformation. The problem, Olson said, is that Google’s algorithm tends to favor the most popular internet sources, not the most trusted ones.

“Looking for what the experts think. . . is really, really, really hard,” Olson said. “Google just isn’t built to do that for us.”

Meanwhile, Olson and Salem realized that AI systems had become much more powerful in recent years and were now able to understand words and phrases in their larger contexts. This convinced them that an AI could learn to break a document down into its key sections and display only the most important parts.

In 2021, the pair hired three software engineers and raised $1.3 million in funding. The biggest chunk came from Winklevoss Capital, a company founded by Tyler and Cameron Winklevoss, best known for their bitter legal row with Meta chief executive Mark Zuckerberg over the founding of Facebook.

Olson and Salem started with a pre-existing AI model that had already been trained on scientific papers written by humans. Then they hired scientists to read and annotate about 100,000 academic papers in various fields. These papers and the scientists’ notes were then used to train the AI to recognize specific features found in all academic papers, especially the parts that summarize the authors’ conclusions.

The Consensus search engine is linked to a database of 200 million academic articles in the public domain. When a user asks a question, Consensus takes about five seconds to respond with a long list of excerpts from scientific journal articles. The system does not seek to provide simple answers, but to give the user a range of academic research on the subject. Ask Consensus “is nuclear power safe?” and it serves up quotes from several articles, some pro-nuclear and some more skeptical. But all quotes are from reputable academic publications, rather than from knowledgeable amateurs.

At this time, Consensus cannot access several million scholarly articles locked behind paywalls by subscription-only publishers. Salem said Consensus plans to work with publishers to negotiate a solution. Additionally, the Biden administration announced in August that all research papers produced with federal funding must be made available to the public beginning in 2025.

Since Consensus opened to the public in September, around 15,000 people have logged on. Salem said the service gets many requests from health and fitness enthusiasts, parents looking for advice on raising their kids, and students looking for homework help. Salem and Olson plan to offer a premium version that will provide more detailed summaries of each article, such as information about the organizations that funded the research. But right now, Olson said, “we’re totally free and we’d like some of the product to stay free forever.”

Follow Hiawatha Bray on Twitter @GlobeTechLab.
