A startup company called Powerset gained a slew of headlines last week when it launched a beta version of its search engine, which, like other offerings, employs natural language processing, allowing users to search sets of information by posing questions.
But the future of search, particularly within enterprises, will go well beyond processing queries or parsing content. Future search systems will get to know the user -- and communities of users -- as much as the content they crawl, analyze and index, observers say.
"Relevance is in the eye of the beholder -- what's relevant for me may not be relevant for you. Consequently, what's needed is a profile of the user (interests, vocabulary, previous searches, job title, etc.) and a profile of the content (author, subject, date, who's read it, etc.) Great search matches the two up," said Guy Creese, an analyst with Burton Group, via e-mail.
"To do that, these profiles need to be equally sophisticated. Enterprise search vendors for a long time have spent a lot of effort on profiling content, but not profiling users. This will change over time, as systems such as Amazon.com make it clear that knowing a lot about the user makes it easier to find and suggest relevant content."
For example, Creese said, if a user was a network engineer and entered "ATM" as a query, a smart search system could rank results for "asynchronous transfer mode" more highly than "automated teller machine."
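The matching Creese describes can be illustrated with a minimal sketch in Python. Everything in it is a hypothetical stand-in and not any vendor's implementation: the `UserProfile` and `Document` classes, the `score` and `rank` functions, the `boost` weight and the sample data are all assumptions made for illustration. It counts query-term matches as a base score, then adds a bonus for each subject the content profile shares with the user profile.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical user profile: job title plus a set of interest tags.
    job_title: str
    interests: set[str] = field(default_factory=set)


@dataclass
class Document:
    # Hypothetical content profile: title, body text and subject tags.
    title: str
    text: str
    subjects: set[str] = field(default_factory=set)


def score(query: str, doc: Document, user: UserProfile, boost: float = 1.0) -> float:
    # Base relevance: how often the query terms appear in the document text.
    words = doc.text.lower().split()
    base = sum(words.count(term) for term in query.lower().split())
    if base == 0:
        return 0.0  # No textual match, so the profile cannot rescue the document.
    # Profile match: subjects shared by the content profile and the user profile.
    overlap = len(doc.subjects & user.interests)
    return base + boost * overlap


def rank(query: str, docs: list[Document], user: UserProfile) -> list[Document]:
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(docs, key=lambda d: score(query, d, user), reverse=True)


docs = [
    Document("Automated Teller Machine",
             "Find an ATM (automated teller machine) near your branch",
             {"banking"}),
    Document("Asynchronous Transfer Mode",
             "ATM (asynchronous transfer mode) carries traffic in fixed-size cells",
             {"networking"}),
]
engineer = UserProfile("network engineer", {"networking", "telecom"})
banker = UserProfile("branch manager", {"banking"})

for person in (engineer, banker):
    top = rank("ATM", docs, person)[0]
    print(person.job_title, "->", top.title)
```

Both documents match the query "ATM" equally on text alone, so the ordering is decided entirely by the overlap between the two profiles. A production system would presumably weigh many more signals, such as previous searches, vocabulary and who has read a document.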
While many companies have a role to play and products that work, Google is the company to watch in the long term if you want to know where enterprise search is headed, according to analyst Stephen Arnold.
"When you hear the big companies saying, we are doing an enterprise solution and Google isn't a problem, you have to ask yourself, are these guys connected to reality?" he said during a recent speech at the Infonortics Search Engine Meeting in Boston. "Buying into the Wall Street crowd's [contention] that this is an advertising company is crazy."
In the meantime, the search market has fragmented into a few distinct size classes, analysts say: offerings from major vendors like IBM, Oracle and, with its recent acquisition of FAST Search & Transfer, Microsoft; larger independents such as Autonomy; and smaller, specialized vendors.
Arnold recently wrote a nearly 300-page study for Gilbane Group, "Beyond Search," that takes a deep dive into the facets of the enterprise search market. While search-focused companies fall into only a handful of categories in terms of size, they vary widely in their technological focus. These are among the sub-segments Arnold identified:
Database-centric systems, such as Teratext and Intelligenx. "Because of this, these systems are adept at handling data management, content repurposing, and generating reports from the content that reside in the system's database," he wrote.
Companies involved in "deep analysis" of content, which include Attensity and Siderean Software. "The use of multiple processes in iterative cascades point to the direction search and content processing is moving. Simple key word indexing is a Model-T Ford to these vendors' finely tuned machines."
"Tools" companies like SchemaLogic sell software that helps customers organize and prepare their content to be searched, according to Arnold. "Most licensees of search systems don't know what they don't know," he wrote. "Once you have some experience with behind-the-firewall search, you have a better understanding of the importance of controlling and managing metadata."
There are also "building block," "linguistic processing" and "pattern analysis" vendors, Arnold wrote.
Though a plethora of companies are vying for market share, there may be plenty to go around. Analyst firm Gartner recently predicted search technology will locate and analyze more than 90 per cent of the data in more than half of the Global 2000 by the end of 2012.