IBM To Corporations: 'Search Me'

Later this week, IBM Software is expected to announce general availability of its DB2 Information Integrator, which it has said will search not only HTML data prevalent on the Web but all the structured and unstructured data that is the lifeblood of corporate IT.

That would include the whole gamut of Microsoft Word documents, Excel spreadsheets, PDF files and calendar entries that fuel business activity, observers said.

The software, code-named Masala, has been in beta since June and the company has previously said it would ship in the fourth quarter. IBM would not comment for this story.

IBM Software execs have long said that a truly effective corporate search engine needs to handle both the rows and columns of information in structured databases and the reams of free-form data in desktop applications. Unlike Web-based documents, which are typically in HTML format, these internal documents are not usually interlinked, so Google's relevancy engine is not a factor, observers said.

Google, the current Internet search kingpin, scours terabytes of Internet data, but the bulk of that is in HTML format, observers said. A Google spokesman, however, said the company's technology searches 12 main file formats, including HTML, Acrobat/PDF and Microsoft Office, as long as the relevant documents are posted to the Web.

It was unclear what IBM would charge for the latest DB2 Information Integrator, but the current version starts at $5,000 per CPU for the replication edition, which is aimed at users who want to replicate data across multiple databases. The standard edition, with additional federation capabilities, runs $15,000 per CPU. An advanced edition, which includes all of the above plus the core IBM DB2 database, costs $40,000 per CPU. Connectors for tapping into non-IBM repositories are extra, but enterprises also can get an "unlimited" edition, with all necessary connectors, for $125,000 per CPU.