15 Big Data Technology Developments You Should Know About

Big Doings In Big Data

It's been a busy couple of weeks in the big data universe as startups and established companies launch new business analytics and data management products and unveil updates to existing products with new features and capabilities.

While these technologies are all over the map, some common themes run throughout them: providing business users with easier access to more data, better ways to manage huge volumes of data and prepare it for analysis, and working with emerging big data technologies like Spark, HAWQ and Geode.

Here's a look at 15 big data announcements that caught our eye. Many, but not all, were unveiled at the recent Strata + Hadoop World in San Jose, Calif.

Altiscale Insight Cloud

Altiscale, a provider of Big Data-as-a-Service, recently launched Altiscale Insight Cloud, a self-service analytics service that allows business analysts to rapidly query a data lake using familiar BI tools like Tableau and Excel, without heavy involvement from the IT department.

The Altiscale Insight Cloud can power SQL queries, dynamic visualizations, real-time dashboards and other reporting and analytics capabilities, according to the company. It eliminates the need for a separate relational data store for aggregated data, bypassing the need for expensive, proprietary data warehouse systems.

Altiscale also established a strategic alliance with Tableau through which Altiscale customers can use Tableau's data visualization software in conjunction with Altiscale's services for data discovery applications.

AtScale Intelligence Platform 4.0

AtScale's software provides a way to use popular business intelligence tools, including Tableau and Qlik, to access data stored in Hadoop clusters. The software creates a semantic layer between Hadoop and third-party tools, essentially turning Hadoop into an online analytical processing server that can be tapped for multidimensional analysis.
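To illustrate what "multidimensional analysis" means in practice, here is a minimal Python sketch of an OLAP-style roll-up: the same fact rows are aggregated along different combinations of dimensions. This is not AtScale's implementation or API; the `facts` rows, the `DIMENSIONS` names and the `roll_up` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, revenue).
facts = [
    ("West", "Widget", "Q1", 100.0),
    ("West", "Widget", "Q2", 150.0),
    ("West", "Gadget", "Q1", 80.0),
    ("East", "Widget", "Q1", 120.0),
    ("East", "Gadget", "Q2", 60.0),
]

# Dimension names, in the same order as the leading columns of each fact row.
DIMENSIONS = ("region", "product", "quarter")


def roll_up(rows, group_by):
    """Sum revenue (the last column) grouped by the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


# The same data viewed at two levels of detail.
by_region = roll_up(facts, ["region"])
by_region_quarter = roll_up(facts, ["region", "quarter"])

print(by_region)
print(by_region_quarter)
```

An OLAP server typically precomputes or caches aggregates like these so that BI tools can switch between views quickly; the sketch simply recomputes them on demand.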

The 4.0 release offers some 100 new features and system enhancements, many relating to enterprise security and performance.

Topping the list is the new AtScale Hybrid Query Service, which natively supports both the SQL and MDX query languages used by business intelligence tools. Because many businesses use multiple BI tools across their organizations, AtScale's support for both SQL and MDX means they don't have to load new client software or custom drivers onto users' computers.

BlueData EPIC Spring Release

BlueData's EPIC is a Big Data-as-a-Service platform that's designed to reduce the complexity of implementing big data technologies such as Hadoop and Spark.

The spring release of EPIC (Elastic Private Instant Clusters) offers dozens of new features and functionality enhancements, including a number that improve the service's security and quality of service for multitenant deployments. The list includes more granular resource management controls, QoS-based allocation, performance optimization and quota enforcement.

The new release also supports a greater range of big data applications and tools, including Cloudera Navigator and Ranger for data governance and security administration, HAWQ for massively parallel processing analytics, and Geode, Cassandra and Kafka for real-time analytics.

Domo Business Cloud

The Domo Business Cloud is an ecosystem of business management applications, including free and premium prepackaged software, that provide decision makers with data, insights and access to people -- all to help find answers to business questions specific to their role or industry. The company offers a free edition of the cloud; users pay to store more data or to access more advanced enterprise functionality such as administrative controls.

Domo also recently launched the Domo Appstore with more than 1,000 business management applications that expand the Business Cloud ecosystem, and started the Domo App Publisher Partner program for third-party ISVs to develop software for the Business Cloud.

Domo also debuted Buzz, a social collaboration platform that works with the Business Cloud (see image), and Domo Mobile for accessing the Business Cloud through any mobile device.

Kyvos Insights With Azure HDInsight

Kyvos Insights' flagship product is a massively scalable online analytical processing (OLAP) system that runs on Hadoop systems and allows business users to visualize, explore and analyze big data that's stored within Hadoop.

The Kyvos software now works with Azure HDInsight, Microsoft's cloud-based Hadoop platform, making it possible for Azure HDInsight users to deploy Kyvos for analytical tasks.

Looker Blocks For IBM Cloud Data Services

Looker's Web-based business intelligence platform provides access to data that resides either in a database or in the cloud. Last year the company debuted reusable, customizable components of business logic called Looker Blocks that can be assembled to create complete business analysis queries.

Looker has formed a partnership with IBM to develop a suite of Looker Blocks that can be used to simplify and customize data analysis for businesses utilizing IBM's Cloud Data Services. The combination will allow customers to deploy a full data platform in days, according to the two companies.

MapD Technologies GPU-Powered Database

MapD Technologies debuted its new database and visual analytics software that use graphics processing unit (GPU) chips to help data analysts interactively explore large data sets.

By tapping into the power of GPUs, the database can execute queries in parallel across nearly 40,000 cores per server, according to the company, offering faster performance than leading in-memory databases. Using the database with the MapD Immerse analytics front-end tool delivers rapid visual insights into complex data sets (such as data about political donations, seen here in this map).

MemSQL 5

MemSQL develops its namesake database for transactions and real-time analytics. The company's latest release, MemSQL 5, offers a range of new technologies and enhanced capabilities to improve the software's database, data warehouse and streaming workload performance.

With the new release, transactions and analytics can be merged into a single database through the hybrid transaction/analytical processing technology that supports OLTP and OLAP queries. Users can perform real-time queries under heavy write loads. The new release utilizes pluggable authentication modules with tools like Kerberos for improved security. And users can deploy Apache Spark through the use of MemSQL Streamliner to create real-time data pipelines through a graphical user interface and eliminate batch ETL tasks.

Paxata Spring '16 Release

Paxata's Adaptive Data Preparation platform, built on Apache Spark and optimized to run in Hadoop environments, provides data integration, data quality, semantic enrichment, collaboration and governance capabilities.

The latest release improves the software's ability to provide users with connected information through advanced filtergrams for comprehensive data profiling, granular searching across columns of wide data sets, new options for data discovery with statistical selections, and integration of complex nested JSON/XML data with Hadoop compressed files.

The release also includes new IT controls to improve system governance, security and scale.

Platfora Big Data Discovery 5.2

Platfora is a big data discovery platform that's built natively on Apache Hadoop and Spark. The new release "democratizes" big data by making it easier to use existing business intelligence tools to access Hadoop data.

The new release provides native Tableau integration by directly exporting prepared and enriched data in TDE (Tableau Data Extract) format to Tableau Desktop and Tableau Server. Other front-end BI tools can access Platfora data through lens-accelerated SQL, processing queries via SparkSQL and ODBC.

Platfora 5.2 also runs directly on Hadoop clusters, in addition to the traditional dedicated configuration, making it easier to utilize existing hardware and repurpose computing resources. And Platfora Vizboards for data visualization have been enhanced with "smarter" default visualizations.

Ryft One Cluster

Ryft Systems develops hardware/software appliances that use the parallel processing power of FPGA processors to accelerate terabyte- and petabyte-scale data analysis.

The new Ryft One Cluster uses a hybrid FPGA/x86 compute architecture with an open API that the company said can accelerate big data ecosystems by a factor of 100 while reducing costs by 70 percent.

The new system scales data analysis performance and storage linearly, processing analytics at rates of 200-plus GB per second. It can operate as a standalone cluster or as part of an existing Apache Spark system or other big data ecosystem.

Tableau 9.3

Tableau announced the general availability of Tableau 9.3, the latest edition of its popular data visualization software, which adds an "always connected" desktop feature and a connection to the Snowflake Elastic Data Warehouse.

The "always connected" feature in Tableau Desktop makes it easier to share results with others while staying within the flow of the analysis, according to the company. The software's global map coverage functionality is significantly improved and new data has been added to its Geocoding Database and Tableau Map Service.

The native connectivity to Snowflake Computing's cloud data warehouse system in the 9.3 edition makes it easy for Tableau users to perform simple and complex data exploration and analysis (see graphic). The two companies also will collaborate on ways to help customers move their business analytics processes to the cloud.

Talena ActiveRx

Talena's software is used to optimize data across test and development, backup and recovery, archive and compliance, and disaster recovery systems.

Talena recently unveiled ActiveRx, new predictive analytics software that incorporates machine-learning algorithms and data visualization to better administer big data management workloads and more accurately predict data availability.

ActiveRx software also offers "active copy analytics" capabilities that businesses can use to turn idle backed-up data into useful assets.

Tamr Apache Spark Compatibility

Tamr's data unification platform unifies and enriches a business' data -- from hundreds and even thousands of data sources both inside and outside of an organization -- for analysis.

Tamr announced that its software is compatible with Apache Spark, the in-memory processing engine for scalable machine learning that Tamr said complements its machine-driven approach to enterprise data preparation.

Tamr is also developing open interfaces and core components to support data curation systems powered by Spark.

Trifacta Photon

Trifacta, which develops "data wrangling" software, unveiled the Photon Compute Framework, new technology at the core of its user interface that provides users with a rich, interactive data exploration and transformation experience when working with large, in-memory data sets.

Data wrangling is the process of transforming raw, complex data into clean, structured formats for analysis -- one of the biggest challenges in data analysis processes.
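As a rough illustration of that definition, the following Python sketch cleans a handful of messy records into a consistent, typed structure. It is not Trifacta's software or the Photon engine; the sample rows, the accepted date formats and the helper names (`parse_date`, `parse_amount`, `wrangle`) are assumptions made up for this example.

```python
from datetime import datetime

# Hypothetical raw records: inconsistent casing, stray whitespace,
# mixed date formats and a missing amount.
raw_rows = [
    {"name": "  alice SMITH ", "signup": "2016-03-01", "spend": "1,200.50"},
    {"name": "Bob Jones", "signup": "03/15/2016", "spend": ""},
    {"name": "carol  lee", "signup": "2016-04-02", "spend": "87"},
]

# Date layouts this sketch assumes may appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def parse_amount(text):
    """Strip thousands separators; treat an empty field as missing (None)."""
    cleaned = text.replace(",", "").strip()
    return float(cleaned) if cleaned else None


def wrangle(row):
    """Normalize one raw record into a clean, typed record."""
    return {
        "name": " ".join(row["name"].split()).title(),
        "signup": parse_date(row["signup"]),
        "spend": parse_amount(row["spend"]),
    }


clean_rows = [wrangle(row) for row in raw_rows]
for row in clean_rows:
    print(row)
```

Real data sets usually need many more rules than this, which is part of why the step is considered such a large share of the work in analysis projects.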

Photon, which is compliant with the Apache Arrow in-memory data structure specification, provides immediate feedback to users as they interact with data content. The Photon engine makes it possible to explore more data at higher levels of computation.