
Qubole Review – Facebook Power for Your Data

by Rakesh Sharma
Published on 4 September 2013

What if you had the power of Facebook's data infrastructure at your disposal? This week's solution offers just that. Led by Ashish Thusoo, the former head of Facebook's data infrastructure team, Qubole is big data analytics software and a managed big data service rolled into one.

In this week's Qubole review, let's look at its features and interface and see how it can be of use to you.

A Managed Data Service Company

Founded in 2011, Qubole is designed to explore, model, integrate and manage data. It builds data pipelines to your data stores, enabling you to manage and process large amounts of data in one go. The main idea behind Qubole is to analyze tons and tons of data, says Gil Allouche, vice president of marketing at the company.

According to him, there are three parts to the solution:

  • Managed Cluster
  • Data Connectors Library
  • Graphical User Interface

The first part consists of a scalable infrastructure, which means that you can scale your data processing up or down as the state of your business changes. "Technical organizations don't have to calculate and know their data needs ahead of time, but rather connect to the data and let QDS figure out the right resources for the task," says Allouche. Complementing the scalable infrastructure is the company's graphical user interface, which focuses on simplicity and usability.


Apart from making it easier for you to navigate through multiple menu structures, Qubole's graphical user interface also makes it easier for you to multitask by configuring multiple jobs through the Qubole Data System (QDS). Finally, the library of connectors within the application makes it extensible, enabling you to connect with other cloud-based applications and data stores (such as Google Analytics, AppNexus, S3, MongoDB and many others).


According to Allouche, regular companies with a significant technology component and data crunching operations use Qubole. The main benefit of using Qubole for such companies is that they do not have to worry about infrastructure. Instead, they can focus on data crunching operations such as building algorithms and financial models.

Working With Qubole

The good thing about Qubole is that it uses popular open source technology frameworks such as Hadoop and Hive. This makes it easier for companies to hit the ground running. The service's browser-based front-end interface is known as QPal. There are multiple tabs in the front end. For example, the Composer tab enables you to run commands across a range of platforms, from Hive queries to Hadoop jobs to workflows. These commands can be run in multiple modes, such as a test run (which limits the size of the resulting data sets) and a constrained run (which runs the test between pre-defined time intervals). The Cluster tab shows you the health of multiple clusters. Thus, you can run multiple jobs at the same time and track results for each job. You can also view schedules of multiple operations related to data retrieval and manipulation using the Schedules tab.

The Control Panel helps you manage account settings. These cover both your account profile and your data computation. For example, in addition to changing regular settings such as your account username and password, you can also update the type of computation and the details of your Amazon Web Services (AWS) settings. Finally, you can also configure commands in sequence as part of a process using the Workflow tab. As an example, you can combine a Hive query, a data import, a data export, and a Hadoop operation into one workflow.
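To make the idea of a workflow concrete, here is a minimal Python sketch of commands chained in sequence. It is not Qubole's API: the `Step` class, the `run_workflow` function, and the step names are all hypothetical, and the rule that a workflow stops at the first failed step is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One command in a workflow (hypothetical model, not a Qubole type)."""
    kind: str                    # e.g. "hive_query", "data_import"
    name: str
    action: Callable[[], bool]   # returns True on success


def run_workflow(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for step in steps:
        if not step.action():
            break
        completed.append(step.name)
    return completed


# Placeholder actions stand in for real imports, queries, and jobs.
workflow = [
    Step("data_import", "load_orders", lambda: True),
    Step("hive_query", "aggregate_orders", lambda: True),
    Step("hadoop_job", "score_customers", lambda: True),
    Step("data_export", "publish_scores", lambda: True),
]

completed = run_workflow(workflow)
```

Each step runs only after the previous one succeeds, which is the point of bundling an import, a query, a Hadoop operation, and an export into a single unit.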

One of the more interesting features of the solution is its integration with multiple data stores. Known as DBTap, this functionality enables you to work with a broader set of databases beyond those on Amazon. DBTap works with five endpoints, ranging from the commonly used MySQL to MongoDB, a database that is fast becoming popular among startups.

Pros & Cons

The biggest pro in favor of Qubole is its antecedents. Qubole's founders have held leadership positions at reputed companies such as Facebook and Oracle, and they have brought the same level of technology expertise and orientation to this solution. The solution is packed with features and integrations that make it attractive for organizations that would like to invest in big data solutions but do not have the financial resources to assemble a team to build one. Qubole's SaaS pricing further makes it an attractive proposition because it puts this computing power within reach of startups and small businesses.


Qubole has four pricing tiers, ranging from free to enterprise. While pricing for the first three tiers is fixed, the enterprise tier has flexible pricing that is custom-fit to an organization's needs.

Each tier comes with its own set of features. For example, the free version of Qubole provides you with a single user license, the opportunity to scale API for further integrations, a limited number of invocations, and built-in connectors. The Enterprise solution, on the other hand, has unlimited invocations and user licenses.

The Bottom Line

I would definitely recommend the solution for organizations with websites that have substantial big data analysis needs. For those organizations, Qubole helps you make sense of that data and monetize it. It is, as described in another discussion, a "turnkey cloud solution" to help you sift through your data needs and make sense of large amounts of multi-structured data from different sources.

