Tech 10: Big Developments In Big Data

Information Please

The big data space remains one of the fastest-moving in the IT industry, with rapid product updates and improvements from established companies and a continuous wave of innovative products from startups.

Many of the most recent big data developments center on the Hadoop platform, including better ways to access and use data stored in Hadoop, and on the Apache Spark engine that's revolutionizing big data processing. Cloud-based business analytics is hot, as are new technologies for integrating data from disparate sources.

Here are 10 new products and product upgrades that solution providers should be aware of.

1. Hortonworks DataFlow

Hortonworks DataFlow provides a way to collect and curate "data in motion," meaning information streams from Internet of Things sources such as sensors, geo-location devices and machines, and even from social networks, and to load that data into Hadoop or other data management systems for real-time analysis. DataFlow is based on technology originally developed by the National Security Agency that is now an Apache Software Foundation project (Apache NiFi). Hortonworks recently acquired Onyara, which sold a commercial version of the technology.

2. Alteryx Analytics 10.0

Alteryx Analytics data integration and advanced analytics software gives line-of-business analysts tools to access, blend and analyze data without help from data scientists or the IT department. The new release supports a wider range of database systems with its ability to shift data blending and processing into the database, rather than computer memory, for faster processing.
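The general idea of shifting processing into the database can be illustrated with a minimal sketch. It does not use Alteryx's software or API; it relies only on Python's standard sqlite3 module, and the table, column names and figures are made up for illustration. One function pulls every row out and aggregates in application memory, while the other asks the database to do the aggregation and returns only the summarized result.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a small, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("west", 250.0), ("east", 50.0), ("west", 25.0)],
    )
    return conn


def totals_in_memory(conn):
    """Fetch every row, then aggregate in application memory."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Let the database aggregate; only one row per region is returned."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_demo_db()
    print(totals_in_memory(conn))
    print(totals_in_database(conn))
```

Both functions return the same totals. The difference is where the work happens, which matters more as the table grows beyond what fits comfortably in memory.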

3. Birst Networked BI

Developed on Birst's multitenant cloud architecture, the Birst Networked BI technology creates a network of business intelligence instances that share a common analytical fabric that can be used across departments and multiple geographic regions -- essentially a virtualized BI system that users can apply to their local data. Birst's approach gives users a degree of self-service capability while complying with corporate data governance policies. It also eliminates the data silos and reporting chaos that local business intelligence initiatives can create.

4. Platfora Big Data Discovery 5.0

Platfora's Big Data Discovery is a Hadoop- and Spark-native analytics platform that helps business users and data scientists visually interact with petabyte-scale data for customer, security and Internet of Things analytics. The company's new Big Data Discovery 5.0 release includes advanced self-service data preparation capabilities that reduce the amount of time it takes business teams to ready data sets for analysis. It supports SQL-based data transformations and Microsoft Excel for users who need results in Excel format.

5. Arcadia Data Enterprise

Arcadia Data Enterprise is a unified data discovery, visual analytics and business intelligence platform that runs natively in Hadoop. The software can handle thousands of simultaneous queries and lets business analysts and information workers analyze billions of data records within Hadoop. Arcadia eliminates the need for intermediate data stack technologies and appliances such as data warehouses, OLAP servers and data marts. The company began offering a free, downloadable version of the visual analytics portion of the product in June.

6. Talend 6

Talend 6 introduces native support for Apache Spark and Spark Streaming in the new Talend Real-Time Big Data software. That boosts the speed of the platform's integration capabilities for real-time analytical applications. Talend said converting MapReduce jobs to Spark provides a 5X performance increase. Release 6 also has a built-in Lambda Architecture that creates a single environment for working with bulk and batch, real-time, streaming and Internet of Things data.
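For readers unfamiliar with the term, a Lambda Architecture generally combines a batch layer that recomputes views from the full data history, a speed layer that handles newly arriving events, and a serving step that merges the two. The following is a minimal sketch of that general pattern in plain Python. It is not Talend's implementation, and the function names, the `SpeedLayer` class and the `device` field are hypothetical.

```python
from collections import Counter


def build_batch_view(historical_events):
    """Batch layer: recompute per-device counts from the full event history."""
    return Counter(event["device"] for event in historical_events)


class SpeedLayer:
    """Speed layer: incrementally count events that arrive after the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def ingest(self, event):
        self.realtime_view[event["device"]] += 1


def query(batch_view, speed_layer, device):
    """Serving step: merge the batch view with the real-time view."""
    return batch_view[device] + speed_layer.realtime_view[device]


if __name__ == "__main__":
    batch_view = build_batch_view(
        [{"device": "sensor-a"}, {"device": "sensor-a"}, {"device": "sensor-b"}]
    )
    speed = SpeedLayer()
    speed.ingest({"device": "sensor-a"})  # arrives after the batch run
    print(query(batch_view, speed, "sensor-a"))
```

In a production system the batch view would typically be rebuilt on a schedule and the speed layer reset to cover only events since the latest rebuild. The sketch omits that bookkeeping.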

7. Tableau 9.1

The new release of the Tableau business analytics software offers a completely redesigned mobile application that makes it easier for users to find, interact with and manage business intelligence content from mobile devices. Also new is a Web data connector that developers use to connect Tableau to a greater number of data sources including Facebook, Twitter and Google Sheets. And Tableau 9.1 provides native connectors to Google Cloud SQL, Amazon Aurora and Microsoft Azure SQL Data Warehouse.

8. Looker Blocks

Looker's Web-based business intelligence platform provides access to data that resides either in a database or in the cloud. The new Looker Blocks are reusable, customizable components of business logic, such as churn prediction or lifetime value metrics, that can be assembled to create complete business analysis queries and speed up the business analytics process. For example, the blocks can be used to analyze a business' sales funnel, monitor customer relationship health, optimize an online storefront or conduct sophisticated Web analytics.

9. Trifacta v3

Trifacta develops software for "data wrangling," the company's term for transforming raw, complex data into clean data with structured formats for analysis. Trifacta v3 offers enhanced capabilities to meet enterprise data governance requirements in security, metadata and data lineage. The release improves the product's user interface, including new "transformation suggestion cards" that provide visual representations of data transformation suggestions. Trifacta v3 also expands connectivity to additional data sources such as Amazon Web Services S3, XLS files and Hive.
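As a rough illustration of what turning raw data into clean, structured data can involve, here is a minimal sketch in plain Python. It is unrelated to Trifacta's product or interfaces. The `wrangle` function, the field names and the assumed input date format (MM/DD/YYYY) are hypothetical choices made for the example.

```python
from datetime import datetime


def wrangle(raw_rows):
    """Normalize messy text records into typed, consistently formatted rows.

    Rows with a missing name or an unparseable amount or date are skipped.
    """
    clean = []
    for row in raw_rows:
        name = row.get("name", "").strip().title()
        amount_text = row.get("amount", "").replace("$", "").replace(",", "").strip()
        date_text = row.get("date", "").strip()
        try:
            amount = float(amount_text)
            # Assumed input format: MM/DD/YYYY; output is ISO 8601 (YYYY-MM-DD).
            date = datetime.strptime(date_text, "%m/%d/%Y").date().isoformat()
        except ValueError:
            continue  # skip rows that cannot be parsed
        if not name:
            continue  # skip rows with no usable name
        clean.append({"name": name, "amount": amount, "date": date})
    return clean


if __name__ == "__main__":
    raw = [
        {"name": "  acme corp ", "amount": "$1,200.50", "date": "09/15/2015"},
        {"name": "bad row", "amount": "n/a", "date": "09/15/2015"},
    ]
    print(wrangle(raw))
```

Silently dropping bad rows keeps the sketch short. A real pipeline would more likely log or quarantine them so the data lineage stays auditable.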

10. Altiscale Data Cloud 4.0

Altiscale provides a Big Data-as-a-Service offering, based on the Hadoop platform, for data scientists and application developers. Altiscale Data Cloud 4.0 features major upgrades to core Hadoop components such as HDFS and YARN. Also new is a comprehensive Spark-as-a-Service capability that supports all major versions of the Apache Spark cluster computing framework for processing large volumes of data.