Informatica Offers Data Integration Links To Hadoop, Social Media Networks


Data integration software vendor Informatica is shipping a new release of its flagship product designed to handle the "big data" generated by today's transaction processing and social media systems.

Informatica 9.1 also supports Hadoop, the open-source framework for distributed, data-intensive applications.

"Big [data] is about more than just volume," said James Markarian, Informatica CTO, in a phone interview. "It's the unprecedented velocity, complexity and variety of data. The problem is more than just speeds and feeds."

IT departments today are wrestling with large amounts of transaction data, data generated by social media systems such as Twitter, data from sensor devices such as RFID tags, and data from large-scale Hadoop-based systems.

Informatica 9.1 provides adaptors for connectivity to social networking services such as Twitter, Facebook and LinkedIn, allowing businesses to collect data generated by those networks. The release also features a new adaptor for the Hadoop file system that moves data into Hadoop for processing and back out for storage and analysis.

The software also offers new connectors to such data sources as the Greenplum, Vertica and Netezza databases.

The new release provides two styles of master data management (MDM): just-in-time Registry style and point-in-time Hub style. New data quality alerting capabilities and the ability to reuse data quality policies across data profiling, data cleansing and MDM projects also make data more authoritative and trustworthy, Markarian said.

Also new are self-service facilities tailored for specific roles such as IT analysts, developers, business users, data stewards and project owners. The new Point-of-Use Data capability, for example, lets business users enrich business applications with data using embedded MDM controls. Application-Aware Accelerators let project owners reuse packaged application metadata to speed up data integration projects.

Informatica 9.1 also offers new adaptive data services, including multi-protocol data provisioning for data virtualization, integrated data quality for data governance, and policy-driven enforcement for data governance.