Alternative to BigQuery for medium-sized data

This is a follow-up to the question “Why doesn’t BigQuery perform as well on small data sets”.

Let’s suppose I have a data set of ~1M rows. In the database we currently use (MySQL), aggregation queries run quite slowly, taking perhaps ~10s on complex aggregations. On BigQuery, the required initialization time might make this query take ~3s: better than MySQL, but still the wrong tool for the job if we need to return queries in 1s or under.

My question, then, is: what would be a good alternative to BigQuery for running aggregate queries on moderate-sized data sets, such as 1-10M rows? An example query might be:

SELECT studio, territory, count(*)
FROM mytable
GROUP BY studio, territory
ORDER BY count(*) DESC

Possible solutions I’ve thought of are Elasticsearch (https://github.com/NLPchina/elasticsearch-sql) and Redshift (Postgres is too slow). What would be a good option here that can be queried via SQL?

Note: I’m not looking for why or how BQ should be used; I’m looking for an alternative for data sets under 10M rows where the query can be returned in under ~1s.

Answer

Here are a few alternatives to consider for data of this size:

  1. Single Redshift small SSD node
    • No setup. Easily returns answers on this much data in under 1s. 
  2. Greenplum on a small T2 instance
    • Postgres-like. Similar perf to Redshift. Not paying for storage you won’t need. Start with their single node “sandbox” AMI.
  3. MariaDB Columnstore
    • MySQL-like. Used to be called InfiniDB. Very good performance. Supported by MariaDB (the company).
  4. Apache Drill
    • Drill has a very similar philosophy to BigQuery but can be used anywhere (it’s just a jar). Queries will be fast on data of this size.

If low admin / quick start is critical, go with Redshift. If money / flexibility is critical, start with Drill. If you prefer MySQL, start with MariaDB Columnstore.
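
Whichever option you try, it is worth measuring the example query against the 1s budget before committing. Below is a minimal timing sketch, not a benchmark of any of the products above. It uses Python's built-in sqlite3 module only so that it runs without setup; SQLite is not one of the suggested candidates. The helper names (build_sample_table, time_query), the synthetic studio/territory values, and the two-column layout of mytable are all assumptions made for illustration. time_query only relies on the standard DB-API cursor interface, so it should work with a connection from another driver as well.

```python
import random
import sqlite3
import statistics
import time

# The example query from the question, unchanged.
QUERY = """
SELECT studio, territory, count(*)
FROM mytable
GROUP BY studio, territory
ORDER BY count(*) DESC
"""


def build_sample_table(conn, n_rows, seed=0):
    """Fill a hypothetical `mytable` with synthetic studio/territory rows."""
    rng = random.Random(seed)
    studios = [f"studio_{i}" for i in range(50)]          # assumed cardinality
    territories = [f"territory_{i}" for i in range(100)]  # assumed cardinality
    cur = conn.cursor()
    cur.execute("CREATE TABLE mytable (studio TEXT, territory TEXT)")
    cur.executemany(
        "INSERT INTO mytable VALUES (?, ?)",
        ((rng.choice(studios), rng.choice(territories)) for _ in range(n_rows)),
    )
    conn.commit()


def time_query(conn, sql, repeats=5):
    """Run `sql` `repeats` times; return (rows of last run, median seconds)."""
    timings = []
    rows = []
    for _ in range(repeats):
        cur = conn.cursor()
        start = time.perf_counter()
        cur.execute(sql)
        rows = cur.fetchall()  # include fetch time, since the client waits for it
        timings.append(time.perf_counter() - start)
    return rows, statistics.median(timings)
```

To try it at the size discussed in the question, open conn = sqlite3.connect(":memory:"), call build_sample_table(conn, n_rows=1_000_000), then rows, median_s = time_query(conn, QUERY) and compare median_s with the 1s target. The median is used rather than the mean so that a single slow first run (cold caches, connection warm-up) does not dominate the result.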

User contributions licensed under: CC BY-SA