The Google Sheets Query Function A weapon of mass laziness

The Google Sheets Query replaces so many other spreadsheet functions it’s not even funny – FILTERs, AVERAGEs, and SUMs all go out the window when it enters the picture.

There are a few nuances to queries though, so let’s go step-by-step.  This post contains eight videos, with cliff’s notes embedded beneath each:

Let’s dive in!

If you’re already familiar with basic Sheets querying, and looking to master it – I’d recommend heading straight to the CIFL Querying course.  It includes detailed walkthroughs and quizzes that you won’t find here.

Why we Query

QUERY combines all of the capabilities of arithmetic functions (SUM, COUNT, AVERAGE) with the filtering abilities of a function like FILTER.

In this example, we’ll walk through a few QUERY examples, to analyze a sampling of Twitter data.

The basic structure of a QUERY is to take a data range and a query string:

=query( ‘data tab’!A:C, “select A where C = ‘twitter’”)

Let’s answer a few questions, all using QUERY functions:

Q: What was the average number of retweets from the sample?

A: =query(D1:G, “select avg(G)”)

Q: What was the average number of retweets on Tweets that contained an @mention but weren’t a retweet?

A: =query(D1:G, “select avg(G) where E contains ‘@’ and not E contains ‘RT’”)

Q: How many retweets did each user in the list receive?

A: =query(D1:G, “select F, sum(G) where E != ‘’ group by F label sum(G) ‘total retweets’”)

For a full cheat sheet on QUERY syntax, grab a copy of the CIFL Google Sheets formula guide or check out the official docs.

Date comparisons in Google Sheets queries: why so complex?

When I first started writing date comparisons within Google queries (pulling data before / after a certain date), it was a pain to figure out the correct syntax.

pasted image 0 2

Unfortunately it’s not really mentioned anywhere in the Google Sheets documentation, so I had to figure it out for myself.

You have to do two things correctly:

  1. Write the string ‘date’ before the actual date, to declare your intentions
  2. Type your date in the format ‘yyyy-mm-dd’

If you want to compare a date dynamically from another cell, it’s a bit of a different story.  I like to keep date ranges (last 7 days, last 30 days, year-to-date, etc), then reference them in a query like this:

pasted image 0 6

Notice the syntax:

  1. To combine the string with an outside value, close the query double-quotes, and use ‘&’
  2. Use the TEXT function to convert the date into the correct ‘yyyy-mm-dd’ format

Once you get the hang of it, you’ll never think about it again.  But if you do ever forget the syntax, the CIFL cheat sheet has your back.

Debugging common Query errors

Errors are never fun, but Sheets’ error messages will always point you in the right direction:

#REF

1) Array result was not expanded because it would overwrite data in D6

There’s data in cell D6 blocking your query from expanding – remove it!

2) Unresolved sheet name…

The sheet that you’re referencing in your query doesn’t exist – probably due to a typo or having deleted it :(.

#VALUE

3) PARSE_ERROR

Caused by a misformed query string – it will always tell you exactly where to look:

query error

Would mean there’s a syntax error somewhere surround my ‘where’ clause – a hanging comma or a typo:

=query(tweets!A1:D,”select C, sum(D), where D >= 0 group by C limit 2″,0)

4) LABEL_COL_NOT_IN_SELECT

Caused by a mismatch between columns in your ‘select’ clause and your ‘label’ clause:

=query(tweets!A1:D,”select C, sum(D) where D >= 0 group by C limit 2 label sum(C) ””,0)

Here sum(D) is mislabeled as sum(C).

5) NO_COLUMN

When you’re selecting a column that’s outside of your range:

=query(tweets!A1:C,”select C, sum(D) where D >= 0 group by C limit 2″,0)

Remedied by expanding your data range to include the column (tweets!A1:D in this case).

6) ADD_COL_TO_GROUP_BY_OR_AGG

Pops up when you’re selecting a column (B in this case) that isn’t present in your ‘group by’ clause:

=query(tweets!A1:D,”select C, B, sum(D) where D >= 0 group by C limit 2″,0)

If you’re not summing or aggregating column B, then you have to group by it.

#N/A

7) Query completed with an empty output.

Simply means there’s no results from your query as currently written – potentially due to a too-restrictive condition:

=query(tweets!A2:D,”select C, D where D = 100 limit 2″,0)

Try simplifying your query to be less restrictive (removing the D = 100 in this case).

Query multiple Google Sheets: importrange + {} = friends

There comes a time in every life, where you want to combine two data ranges within a Google Sheets query.

Let’s talk about how to combine data ranges from within the same spreadsheet (or from two different sheets), to run one query – working through examples using sample Twitter data.

To query a combined range from within the same sheet:

pasted image 0

  1. Enclose both ranges within { } to combine them, and separate with a ‘;’ to stack one on top of the other
  2. Instead of referencing columns with their letter (A, B), refer to them as Col1, Col2 depending on their order

To query a combined range from different sheets:

  1. First run the importrange function on the outside Sheet you’d like to pull from.  The first element is the spreadsheet ID, which is contained within the URL of the sheet (https://docs.google.com/spreadsheets/d/1ekTIZ40VVT9p9kE80wNaiy6Y2xslo4K9KVmmg-Dx654/edit).pasted image 0 3
  2. You’ll be asked to allow access to the sheet.pasted image 0 7
  3. Embed the range from your importrange function within the query itself.

pasted image 0 4

To be honest, my goal is to never have to combine ranges from within a query itself (I prefer to do that type of aggregation before writing a query), but sometimes it’s unavoidable.

Pivot tables + query for time series reporting

If you’re working with time-series data in Sheets (like most of us do), then pivoting in queries will be your new favorite move.

It lets you build full report in one Query, by grouping values down the left column, and transposing date ranges (or any second column) to the right.

pivot tables in google sheets query

You pivot with a very similar syntax to grouping – this query produces the result above:

=query(tweets!A1:D,”select C, sum(D) where D >= 0 group by C pivot A”,0)

As you can see, it’s as simple as adding ‘pivot A’ after the ‘group by’ clause in a query.

When you’re building a report in Google Sheets, a solid goal is to not make the people *using* the report think.

Dropdown menus are great for this, since they allow you to present a limited number of options to report users.

google sheets query dropdown menu

Data validations in Sheets automatically create dropdown menus when you create them – pretty slick right?

And blending the selected value from your dropdown menu with a Query is simple:

=query(data!A:B, “select A where B = ‘” & ‘dropdown tab location’!C2 & “‘”, 0)

It only takes a few keystrokes:

  1. Wrap your dropdown value in single quotes (”), if it’s a string (none if it’s a number)
  2. Close (and then re-open) your query with double quotes (“”)
  3. Use ampersands (&) to add the dropdown value to the query string
  4. Reference the actual cell of the dropdown value (‘data’!A2) to add it to the query string

That’s it! Your report users will thank you.

Google Sheets QUERY vs SQL: what’s the difference?

If you’re used to writing SQL queries, it can be off-putting at first to work with Google Sheets queries (and vice-versa).

If you don’t know any SQL yet, learn Query first.  It’ll make working with databases *much* easier when you get there.

Let’s walk through a number of ways in which they’re similar & different, and how you can adapt each of them in your work.

Basic querying

SQL: SELECT * FROM table WHERE column_name = ‘xyz’ ORDER BY column_name desc LIMIT 10
Sheets: =query( ‘tab’!A:D, “SELECT * WHERE A = ‘xyz’ ORDER BY A desc LIMIT 10”)

The basic query syntax is roughly the same.  Google Sheets queries use the same SELECT statement to choose columns, WHERE / AND / OR to set logic, ORDER BY to arrange results, and LIMIT to pull only a certain number of results (see a full list of language clauses in the Google docs).

The key difference is that there’s no ‘FROM’ statement in a Sheets query – instead of referencing a table, you reference a range of cells (‘tab’!A:D in this example).

Also, you’ll notice that, instead of referencing column names (like in SQL), in Sheets queries you reference columns by their letter (A, B, C, etc) or by their order (Col1, Col2, Col3).

Doing math on columns

SQL: SELECT column_name, sum(other_column_name) FROM table GROUP BY column_name
Sheets: =query( ‘tab’!A:D, “SELECT A, sum(B) GROUP BY A”)

The same avg(), count(), sum(), max(), min() aggregations functions that you’re used to using in SQL are available in Sheets queries, as are basic +, -, *, / arithmetic functions.

Labeling columns

SQL: SELECT column_name AS “label” FROM table
Sheets: =query( ‘tab’!A:D, “SELECT A, B LABEL A ‘label1’, B ‘label2’ “)

In SQL, to label a column you simply add an ‘AS’ to your ‘SELECT’ statement: column_name AS “label.”

In Sheets, this is done at the end of a query, with the ‘label’ statement.

Labeling the sum(A) column as blank removes the automatic sum() header from being displayed, returning a nice clean single metric.

Date comparisons

SQL:        SELECT column_name FROM table WHERE date_column > ‘8/22/2016’
Sheets: =query( ‘tab’!A:D, “SELECT A WHERE C > date ‘2016-08-22’ “)

Dates in Google queries are tricky to figure out the first time, if you’re used to working in SQL.

Unlike SQL, which can handle multiple date formats (‘20160826’, ‘8/26/2016’, ‘2016-08-26’), Sheets can only take dates in format ‘yyyy-mm-dd’ – so you sometimes have to use the TEXT function to transform a date like 8/26/2016 into 2016-08-26.

In Sheets, you also have to specify that you’re comparing a date, by adding the phrase ‘date’ before your date string.

Fuzzy phrase matching

SQL: SELECT column_name FROM table WHERE column_name LIKE “%phrase%”
Sheets: =query( ‘tab’!A:D, “SELECT A WHERE A CONTAINS ‘value’ “)

To pull text that contains a value, you can simply use ‘contains,’ versus the ‘like’ statement in SQL.  Sheets queries do have a ‘like’ statement, but I find myself using ‘contains’ much more frequently because of its simplicity.

One added wrinkle though, if you’re looking to add a ‘not contains’ parameter, you have to place the ‘not’ before the column letter.

Joining tables

SQL: SELECT a.column_name, b.column_name FROM table1 a, table2 b WHERE a.column_name = b.column_name
Sheets: Not supported 🙁

My biggest pet peeve with Sheets queries is that they don’t support joining two tables together by a unique key.  I’ve tried every workaround I can think of, and so far haven’t found a way.

Google Sheets team, if you’re reading this – help!

Match from a list of options

SQL: SELECT column_name FROM table WHERE column_name IN (‘match1’, ‘match2’, ‘match3’)
Sheets: =query( ‘tab’!A:D, “SELECT A WHERE A = ‘match1’ OR A = ‘match2’ OR A = ‘match3’ “)

Unfortunately in Google Sheets, there’s no such thing as the ‘IN’ phrase in SQL, so you have to list out values you’re looking to match for one-by-one.

Nesting Sheets queries like SQL

As I mentioned above, not being able to join queries in Sheets is a major drawback.

Say you have data in two different tabs, and you want to pull all of the values from one table that exist in the other.

I thought it wasn’t possible. I was wrong.

It’s a hacky way of imitating the way that SQL queries can nest, but it works.

You combine arrayformula and concatenate, to produce a long string within your query ( OR A = ‘value1’ OR A = ‘value2’, etc).

The final formula looks like this:

=query(tab1!A:Z,”select A,sum(B) where A != ” and (B = ‘blah’ ” & concatenate(arrayformula(” OR B = ‘” & tab2!C2:C & “‘”)) & “) group by B”,0)

This takes all of the values that are present in Tab 2, column C, and looks for them in column B of Tab 1.

It’s a poor man’s nesting query, but it works :).

Phew!

That was a lot of Google Sheets querying, but I hope it’s starting to sink in for you.

Make sure to grab a free copy of the Sheets formula cheat sheet below – but if you’ve made it this far, you’re probably ready to master Sheets by investing a few hours in the advanced CIFL Querying course (which includes detailed walkthroughs and quizzes not found here).

23 Comments

  1. Very nice page! Congratulations!! I love using funcions to create my MVP’s, but this chalange bellow is giving me a hard time. Do you know if there is a limit for query function in google sheets?

    This is working when I request 2220 rows.
    =query(IMPORTRANGE(“Key”;”Nivel 4 Isapa!H1:AB2220″);”SELECT Col1,Col21 WHERE Col21=’A'”)

    This is working when I request 2400 rows it DOES NOT work..
    =query(IMPORTRANGE(“Key”;”Nivel 4 Isapa!H1:AB2220″);”SELECT Col1,Col21 WHERE Col21=’A'”)

    Have you faced something like this?

    • Hey Rubens, I actually have never run into size limitations with queries (beyond the 2mm cell limitation of Google Sheets).

      Have you tried first importing the entire range you’re looking to query into a tab, then running it? Might be a size limitation of the importrange function.

      • You are right, I did found out, the limitation was really on the IMPORTRANGE fuction, it would not bring 47.000 cells. This was exceeding the limits.

        I really love google sheets but I have to find a workaround to reach at least 15.000 rows to work with…

        • One idea – you could nest 3 importrange functions together – place them within { }, and separate the three function calls with ; – ie { importrange1(first 15k rows) ; 2nd 15k ; 3rd 15k }. That’ll achieve the same result of returning ~45000 rows.

  2. Hi there. I’ve been using gsheets for some months and faced how powerful it is. Anyway, surfing around I found your blog and just loved your comparisions and methods.
    I’m one of these guys who like to unlock all resources of a tool before deep into codes.
    I have a suggestion about join tables with query formula. Despite google doesnt support joining tables with query there is actually a way to do a subquery. It is not the same but can help with lots of troubles

    In the sample you gave above …
    ** SQL: SELECT a.column_name, b.column_name FROM table1 a, table2 b WHERE a.column_name = b.column_name
    We can use:
    Sheets: =query( ‘tab’!A:D, “SELECT B WHERE A contains ‘”& query( ‘secondtab’!E:E, “SELECT E WHERE E=1 label E ””) &”‘ “)

    We only have to be careful with spaces, quotation marks, parenthesis and other special caracters.
    I’m pretty sure you will do a better job describing and deploying the function.

    • That’s an interesting approach Allan, thanks for sharing – joining in Sheets queries to me has been the holy grail, the cup of gold of Sheets functions. Your idea doesn’t **exactly** work as you scoped it – you’d have to concatenate a long string of ‘OR A contains X or A contains Y or A contains Z’ – rather than referencing the entire range X:Z in one shot.

      If I can make it work, I’ll share a video about it – thanks for the inspiration.

  3. This was VERY helpful to me. Thank you!

    I used to use SQL a LONG time ago, but now on Sheets I get frustrated by QUERY’s closeness yet… not so much.

    Thank you for putting this together and continue please.

  4. Warning to those who plan on using QUERY a lot: don’t use it a lot.

    Essentially, you’re pinging an API each time you use the QUERY function. This API has a limit – you can’t actually tell what that limit is (there’s no way to know how close you are to the limit until you’ve passed it), but when you pass it, your calculations will start to lag out (and eventually will fail/return nothing). You’re then pretty severely rate limited until your quota refills (I’m not sure how long that takes, but I’d say somewhere in the several hours/12 hours range).

    The limit is somewhere in the thousand cell range (i.e., if you have a thousand QUERY functions, you’re gonna have a bad time). This might sound like a lot, but if you have 50 rows of dates and 20 columns of metrics to calculate (as I did), you’ll discover it’s not as much as you thought.

    If you’re not calculating that much, though, you should be fine.

    • Thanks for sharing Tom, that’s a great point. Google Sheets can definitely hit you with rate limits if you overuse functions like query, importrange, or importrange.

      The purpose of query, IMO, is to use accomplish a lot with *very few* formulas – I try to limit any Sheet to just a handful.

      In any case where I’ve seen folks using upwards of 50 queries in a Sheet, query is generally not the right tool for the job.

  5. you can replace the construction of your “IN” substitute with use of JOIN() and Unique(). It’s a little cleaner:

    join(” OR “,UNIQUE(F2:F))

    You can augment this with Filter(), as well

  6. Another difference you should note between SQL and the Query function is the absence of a concatenation operator or function. It really is something they should add.
    This sounds small, as the explanation given is that you just have to add an extra column in your dataset to perform the concatenation, however this ignores the headache this involves when dealing with remote datasources (importRange, e.g.). It requires the maintenance of a second, modified copy of the dataset where one would otherwise not be necessary, and the annoyance of the resulting revision churn on the file.

    • Agreed, that is a drawback of query – I usually use arrayformula() on columns to concatenate them before querying, which is painful when combined with importrange().

  7. Hey!
    So I’ve got my query function all setup but then I add columns to the sheet and the function doesn’t update (changing to the new column letter) like it normally would without the query. Is there a workaround for this?

    • Good question Parker – the query text won’t adjust if you add new columns (it’s not like vlookup or other formulas). So I’d recommend only writing a query once your columns are relatively set.

  8. Awesome post – very helpful and much needed!

    Is there any hack or correct way of using “case when” in Query – there are places when you have to use them in a select statement for multiple conditions. Let me know if you find a way – I’ve been trying and wasn’t able to figure it out – you can even add it to your post if you find it.

    • Hmm I know you can’t use actual CASE logic in Sheets queries, but you could potentially use an ARRAYFORMULA and IF function when building the data range (before the select statement) to have conditional formulas run.

  9. Hi all,
    Im trying to build the Query formula based in other cells.. let me explain it to you:

    This is the formula that is now working to me:
    query({importrange($G$1,”Disney – Current!A:G”);importrange($G$1,”Disney – TVOD!A:G”);importrange($G$1,”Lionsgate – Current!A:G”);importrange($G$1,”Lionsgate – TVOD!A:G”);importrange($G$1,”Paramount – Current!A:G”)},”select * where Col1””,1)

    As you can see the Importrange is calling the G1 to get the Key from the sheet. That Cell is a vlookup connected to a drop down. (Basically, I want to choose different sheets.)
    But I have a problem, As you can see the Importrange formula is manually set by the name and range of the sheet that I’m calling. But not all the sheets are having the same tabs…
    So, I build a table with all the tabs from the different sheets and create in a table the range part conected to the same drop down that I’m using to the key.
    Basically what I need at the end is something like this:

    =query({importrange($G$1,concatenate(char(34),B5,Char(34)));importrange($G$1,concatenate(char(34),B6,Char(34)));importrange($G$1,concatenate(char(34),B7,Char(34)));importrange($G$1,concatenate(char(34),B8,Char(34)));importrange($G$1,concatenate(char(34),B9,Char(34)))},”select * where Col1””,1)

    Where Char(34) = ”
    This is delivering #REF!

    Any idea? A better way to do what i’m trying to do?

    Thank you!

    • Okey.. easy Answer here:
      I was obsessed by the ” (char(34).
      So at the end, the formula is working like this:

      query({importrange($G$1,B5);importrange($G$1,B6);importrange($G$1,B7);importrange($G$1,B8);importrange($G$1,B9)},”select * where Col1””,2)

  10. I have been trying to use the “drop down” query without success. My data is on “Sheet 1” and is named “NTBA_2018_Roster”. My “validated range” drop down menu is located on “Sheet 2” in “Cell A2″. I am attempting to use the following query:

    =QUERY(NTBA_2018_Roster,”Select B,D,E,I where B=‘”&Sheet2!A2&”’”)

    I am getting the following error message:

    Error
    Unable to parse query string for Function QUERY parameter 2: PARSE_ERROR: Encountered “B = \u2018” at line 1, column 22. Was expecting one of: “(” … … … … “-” … “true” … “false” … “date” … “timeofday” … “datetime” … “timestamp” … “min” … “max” … “count” … “avg” … “sum” … “year” … “month” … “day” … “hour” … “minute” … “second” … “millisecond” … “now” … “dateDiff” … “lower” … “upper” … “quarter” … “dayOfWeek” … “toDate” … “*” … “/” … “%” … “+” … “-” … “is” … “=” … “=” … “=” … “=” “-” … “=” “true” …

    I can’t seem to get this to work. Any ideas?

    Thanks,

    • Hey Jack, I think your issue is that the range (NTBA_2018_Roster) needs to also contain a cell range (ie NTBA_2018_Roster!A:Z).

      Other than that it looks good, just make sure you’re closing out the dropdown string (Sheet2!A2) with single quotes:

      =QUERY(Sheet1!A:Z,”Select B,D,E,I where B='”&Sheet2!A2&”‘”)”)

Submit a comment