To query from another tab within the same spreadsheet in Google Sheets, you can use the following syntax:
This returns columns A and B from the cell range A1:C9 within the tab named stats. The 1 specifies that there is 1 header row at the top of the dataset being queried.
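Written out, a formula matching this description would look like the sketch below. The tab name stats and the range A1:C9 come from the description; the exact spacing and letter case are a matter of taste.

```
=QUERY(stats!A1:C9, "select A, B", 1)
```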
To query from another spreadsheet entirely, you can use the following syntax:
This returns the first two columns from the cell range A1:C9 within the tab named stats within the Google Sheets spreadsheet with a specific URL.
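A sketch of a formula matching this description is shown below. The URL is a placeholder (SPREADSHEET_ID is hypothetical), and the first time a sheet pulls from another spreadsheet with importrange() Google Sheets asks you to allow access before any data appears.

```
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/SPREADSHEET_ID", "stats!A1:C9"), "select Col1, Col2", 1)
```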
The following examples show how to use these functions in practice.
Example: Query from Tab in Same Spreadsheet
Suppose we have the following Google Sheets spreadsheet with two tabs:
To perform a query on the data in the stats tab and return the results of the query in the new_sheet tab, we can type the following formula in cell A1 of the new_sheet tab:
Example: Query from Another Spreadsheet
Now suppose that we would like to query from another spreadsheet entirely. To do so, we simply need to identify the URL of the Google Sheets spreadsheet that we’d like to query from.
For example, suppose the data we’re interested in is located at the following URL:
We can use the importrange() function to query data from this spreadsheet:
Notice the subtle difference between this example and the previous example:
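Two differences follow from the syntax described earlier: the data argument is wrapped in importrange(), with the spreadsheet URL and the range both passed as text, and the query refers to columns by position (Col1, Col2) rather than by letter (A, B). A side-by-side sketch, where the URL is a placeholder rather than the one from the example:

```
Same spreadsheet:
=QUERY(stats!A1:C9, "select A, B", 1)

Another spreadsheet:
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/SPREADSHEET_ID", "stats!A1:C9"), "select Col1, Col2", 1)
```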
One of the most powerful tools to manipulate your data
The QUERY function lets you pull information from a range or entire sheet of data using flexible query commands. Learning how to use the Google Sheets QUERY function gives you access to a powerful lookup tool.
If you’ve ever written SQL queries to get data out of a database, then you’ll recognize the QUERY function. If you don’t have database experience, the QUERY function is still very easy to learn.
What Is the QUERY Function?
The function has three main parameters:
=QUERY(data, query, headers)
These parameters are fairly straightforward.
- Data: The range of cells that contain the source data
- Query: A search statement describing how to extract what you want from the source data
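- Headers: An optional number giving how many header rows sit at the top of the source data. If you leave it out, Google Sheets guesses from the content; if you specify more than one header row, they are combined into a single header in the destination sheet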
The flexibility and power of the QUERY function comes from the Query argument, as you’ll see below.
How to Create a Simple QUERY Formula
The QUERY formula is especially useful when you have a very large data set from which you need to extract and filter data.
The following examples use U.S. SAT high school performance statistics. In this first example, you’ll learn how to write a simple QUERY formula that returns all high schools and their data where “New York” is in the name of the school.
Create a new sheet for placing the query results. In the upper left cell type =Query(. When you do this, you’ll see a pop-up window with required arguments, an example, and helpful information about the function.
Next, assuming you have the source data in Sheet1, fill in the function as follows:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B LIKE '%New York%'")
This formula includes the following arguments:
- Range of Cells: The range of data in A1 to F460 in Sheet1
- SELECT Statement: A SELECT statement that calls for any data in columns B, C, D, E, and F where column B contains text that has the word “New York” in it.
The “%” character is a wildcard that matches any run of zero or more characters in a LIKE comparison, so the pattern above matches any school name containing “New York”. Leaving “%” off the front of the string would return only school names that start with the text “New York”.
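As a sketch of the wildcard variants, using the same Sheet1 range as above: the first formula keeps only names that start with “New York”, and the second uses the underscore wildcard, which matches exactly one character. The pattern in the second formula is a made-up illustration, not a value taken from the data set.

```
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B LIKE 'New York%'")

=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B LIKE 'P.S. 0__'")
```

LIKE comparisons are case-sensitive, so 'new york%' would not match “New York Harbor High School”.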
If you wanted to find the name of an exact school from the list, you could type the query:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B = 'New York Harbor High School'")
Using the = operator finds an exact match and can be used to find matching text or numbers in any column.
Because the Google Sheets QUERY function reads much like plain English, you can pull the rows and columns you need out of a large data set using simple query statements like the ones above.
Use the QUERY Function With a Comparison Operator
Comparison operators let you use the QUERY function to filter out data that doesn’t meet a condition.
You have access to all of the following operators in a QUERY function:
- =: Values match the search value
- <: Values are less than the search value
- >: Values are greater than the search value
- <=: Values are less than or equal to the search value
- >=: Values are greater than or equal to the search value
- <> and !=: Search value and source values are not equal
Using the same SAT example data set above, let’s take a look at how to see which schools had an average mathematics mean above 500 points.
In the upper left cell of a blank sheet, fill in the QUERY function as follows:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE E > 500")
This formula calls for any data where column E contains a value that’s greater than 500.
You can also include logical operators like AND and OR to search for multiple conditions. For example, to pull scores only for schools with over 600 test takers and a critical reading mean between 400 and 600, you would type the following QUERY function:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE C > 600 AND D > 400 AND D < 600")
Comparison and logical operators provide you with many different ways to pull data from a source spreadsheet. They let you filter out important pieces of information from even very large data sets.
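The examples above only use AND. As a sketch of OR, assuming the same column layout (C for test takers, D for critical reading mean, E for math mean), the formula below returns schools with over 600 test takers where either mean is above 500. The parentheses make sure the OR is evaluated before the AND.

```
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE (D > 500 OR E > 500) AND C > 600")
```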
Advanced Uses of QUERY Function
There are a few other features you can add to the QUERY function with some additional commands. These commands let you aggregate values, count values, order data, and find maximum values.
Using GROUP BY in a QUERY function allows you to aggregate values in multiple rows. For example, you can average test grades for each student using the GROUP BY clause. To do this, type:
=Query(Sheet1!A1:B24,"SELECT A, AVG(B) GROUP BY A")
Using COUNT in a QUERY function, you can count how many writing mean scores are recorded for each school using the following QUERY function:
=QUERY(Sheet1!A2:F460,"SELECT B, COUNT(F) GROUP BY B")
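To get a single number of schools whose writing mean score is over 500, the condition has to go into a WHERE clause. A minimal sketch, assuming column F holds the writing mean and the range starts at the first data row (hence the 0 for headers); the LABEL clause only blanks out the automatic "count" heading:

```
=QUERY(Sheet1!A2:F460,"SELECT COUNT(B) WHERE F > 500 LABEL COUNT(B) ''", 0)
```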
Using ORDER BY in a QUERY function, you can find the maximum math mean score for each school and order the list by those scores.
=QUERY(Sheet1!A2:F460,"SELECT B, MAX(E) GROUP BY B ORDER BY MAX(E)")
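ORDER BY sorts in ascending order unless told otherwise. As a sketch under the same column assumptions, the formula below lists the ten schools with the highest math mean first by adding DESC and LIMIT; the limit of 10 is an arbitrary choice for illustration.

```
=QUERY(Sheet1!A2:F460,"SELECT B, E WHERE E IS NOT NULL ORDER BY E DESC LIMIT 10", 0)
```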
I have a Google Sheets database that I want to query. I want the query to be based on data in columns B, C, and D and return all data if the conditions are met. I have input cells for the search criteria for those three columns, and I’d like the query to do the following:
- Ignore search criteria where there is no value entered in the search cell
- Use AND logic where there is a value in a given cell
For example, search criteria in input cells:
- B = “1”
- C = Blank
- D CONTAINS “the”
I want the query to return data where C is anything and B = “1” and D CONTAINS “the”.
- B = “1”
- C = “E”
- D CONTAINS “the”
I want the query to return data where all three conditions are met
I’d like D to not be null in all cases.
I’m having trouble incorporating the empty criteria into the query. I’ve tried dynamically using AND/OR depending on whether there’s data in a given search cell, but I can’t get it to work.
So, I created a crazy-long query to account for all possible combinations with B, C, and D. This works, but I want to incorporate more search criteria, and this equation becomes exponentially more complicated with 4, 5 or more search criteria.
Here is an image of the query input and query results for the first example above (I’m not sure why some of the values in the Name column don’t contain “the”). Image of Query Sheet
And here’s an image of the database that I’m querying against: Image of Database Sheet
Here is the equation I’m using:
My ask: I’d like to simplify this equation so that it can be extended to multiple search criteria (more than 3). Thank you for your help!
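One approach that grows by one line per criterion is to build the where clause by concatenation, adding each condition only when its input cell is filled in. The sketch below is an assumption-laden illustration, not the original equation: it supposes the data lives on a tab named Database in columns A:D with one header row, and that the search inputs for B, C and D sit in cells B1, C1 and D1 of the query tab (all hypothetical locations).

```
=QUERY(Database!A:D,
  "select * where D is not null"
  & IF(B1="", "", " and B = '" & B1 & "'")
  & IF(C1="", "", " and C = '" & C1 & "'")
  & IF(D1="", "", " and D contains '" & D1 & "'"),
  1)
```

The single quotes treat B and C as text columns, which matches the “1” and “E” in the examples; for a numeric column the quotes around the cell value would be dropped. A fourth or fifth criterion is one more IF line rather than a new set of combinations.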
Query() is a powerful Google Sheet tool. But what if you don’t want to learn coding?
If you have ever wanted to pull data from multiple spreadsheets, you would have come across the query() function. The formula looks something like this:
QUERY(data, query, [headers])
If you know your query languages, you would write something like this:
QUERY(A2:E6,"select avg(A) pivot B")
But if you’re like me, you have to do a Google search every time you’re creating a complex query to make sure the syntax is legit. Here are a few common ones I use often:
Import selected columns from another Google Sheet
I often have to import a few columns from large datasheets that have dozens of columns. In those cases, I find the query() function faster than the traditional INDEX/MATCH. Usually, they look something like this:
QUERY(MASTER!A:X,"select A, B, C, Q, R")
Import selected columns from another Google Sheet, based on criteria
Once in a while, I import data from another sheet that meets certain criteria. Again, far easier than a complex INDEX/MATCH:
QUERY(MASTER!A:X,"select A, B WHERE F='FALSE'")
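The two formulas above read from a tab called MASTER in the same file. If the data sits in a separate Google Sheets file, a sketch of the equivalent would wrap the range in IMPORTRANGE and switch to positional column names (F is the sixth column, so Col6). The URL is a placeholder, not a real spreadsheet.

```
QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/SPREADSHEET_ID", "MASTER!A:X"), "select Col1, Col2 where Col6 = 'FALSE'")
```

If column F holds real checkbox or TRUE/FALSE values rather than the text FALSE, the condition would be written without quotes: where Col6 = false.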
This is all good and fine if you are familiar with SQL querying. But if you are a business user, you likely don’t know querying languages. Or, at least, don’t have them at the tip of your fingers.
How to run query formulas using Airboxr instead?
The cool thing about using Airboxr is that you don’t need to bother with complex formulas. You definitely don’t need to learn how to write raw queries. In the gif below, I highlight how to select a source and choose the columns to import. You simply choose the Google Sheet you’d like to pull data from and click on the columns to import.
In this example, I want to import a list of SaaS companies and a link to their startup decks from another Google Sheet: you can add this GSheet as a source yourself and give it a shot.
I would like to add a filter to include only SaaS companies. So I simply hit the Filter button and tell Airboxr what to do.
That’s it! When you hit the import button, Airboxr will do its thing behind the scenes and import the data into your GSheet for you!
Easy-peasy, isn’t it?!
It really is that easy. Our public beta is now live: get your GSheets plug-in here.
QUERY is a function in Google Sheets that allows you to ask questions of your data. You can use it to get the sum, average, or count of a range of cells, or to find a specific value. You can also use it to return data from a specific row or column. QUERY is a powerful tool for data analysis.
What is the syntax of QUERY in Google Sheets?
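The syntax of the QUERY function in Google Sheets is as follows:
=QUERY(data, query, [headers])
data – The range of cells you want to return results from.
query – The text string containing the query to run on that range.
headers – Optional. The number of header rows at the top of the range.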
What is an example of how to use QUERY in Google Sheets?
QUERY is a powerful function that can be used to extract data from a range of cells in Google Sheets. You can use it to return specific values, or to return a range of values based on certain criteria. For example, if you want to extract the total sales for a particular product from a list of sales data, you can use the QUERY function to return the value from a specific cell in the range. The function takes the following form:
where range is the range of cells that you want to extract data from, and criteria is the criteria that you want to use to return a specific value or range of values.
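As a sketch of the total-sales case, suppose (hypothetically) that A1:C100 holds sales data with one header row, product names in column A and sale amounts in column C, and that the product of interest is called Widget. None of these names come from a real sheet.

```
=QUERY(A1:C100, "select sum(C) where A = 'Widget' label sum(C) ''", 1)
```

The label clause only removes the automatic "sum" heading so that the formula returns a single cell.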
When should you not use QUERY in Google Sheets?
There are a few instances when QUERY may not be the best choice in Google Sheets. One is when you only need to keep the rows that meet a simple condition; in that case the FILTER function is often enough. Another is when you want an interactive pivot table. Google Sheets has no PIVOT function; pivot tables are created with the Pivot table menu command, although QUERY itself supports a pivot clause.
What are some similar formulae to QUERY in Google Sheets?
QUERY is a formula in Google Sheets that allows you to search for data in a range of cells. A number of other formulas cover related ground, including SEARCH, FIND, and VLOOKUP. SEARCH returns the position at which a text string first appears within another text string, ignoring case. FIND does the same but is case-sensitive. VLOOKUP looks up a value in the first column of a table and returns the value from another column in the same row.
I need to use a “VLOOKUP”-type formula to match values across two data sets, but instead of using 1 key value as you would normally with VLOOKUP, I need to use 2 key values.
Please see below screenshot of example data. What I need to do is match the “TAX” value from the “Merchant Report” to the “Purchase Report” where “DATE” and “ORDER VALUE” match across the two data tables.
I use the term “VLOOKUP” in inverted commas, as I don’t think this can be done with VLOOKUP (without a helper column), so instead I was trying to use the following QUERY formula (learnt from this video):
But it keeps returning “N/A – ERROR – Query completed with an empty output”
I’ve made an example of the spreadsheet here in Google Sheets (please use File > Make a copy if you would like to make a copy) – https://docs.google.com/spreadsheets/d/1VuVuSIiuLQLVrf5dhTwKV368pvrs-PyT4cnx7wLRdyE/edit#gid=0
A) Any idea why this isn’t working?
B) Any idea how I would deal with the lines highlighted in yellow, where both the key values are the same, but the data to be matched differs?
2 Answers
See my comment directly below your post. Assuming that entry is not a possibility in your real data set (i.e., that two or more line items would have the same date and order value yet a different tax amount), delete everything from Col C (including the header) and place the following formula in C1:
This will generate the header and all results. You can change the header text within the formula itself.
VLOOKUP , as you can see here, is capable of using concatenated range data to find matches.
You’ll need to format Col C as currency.
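For readers who want to see the general shape of a two-key VLOOKUP, here is a sketch of the technique. It is not the formula from this answer, and the layout is assumed: DATE and ORDER VALUE of the Purchase Report in columns A and B from row 2 down, and DATE, ORDER VALUE and TAX of the Merchant Report in columns E, F and G. The "|" separator is added so that two different date and value pairs cannot run together into the same key.

```
=ArrayFormula({"TAX"; IF(A2:A="", , IFERROR(VLOOKUP(A2:A & "|" & B2:B, {E2:E & "|" & F2:F, G2:G}, 2, FALSE)))})
```

Placed in C1, this writes the header and then one result per row. Where a date and value pair appears more than once in the Merchant Report, VLOOKUP returns the first match only.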
Addendum (after comments):
Apparently, part of the original goal was to know why the original QUERY formula used by the OP wasn’t working. (This was not clear, given the opening statement: ‘I need to use a “VLOOKUP”-type formula to match values across two data sets, but instead of using 1 key value as you would normally with VLOOKUP, I need to use 2 key values.’)
The OP’s original QUERY formula that does not work:
The version of the above that will work:
The original formula doesn’t work for a number of reasons:
1.) There are far too many quotes, used incorrectly, for the formula to have worked, even if the data to be matched were all strings.
2.) Quotation marks of any type are used to search strings. Typically, those are single quotes when used within the Select clause, which itself is already enclosed in double quotes.
3.) However, neither piece of data being compared is a string. One is a currency amount and the other is a date. Both are numbers, and each is treated differently. But neither would be enclosed directly within quotes in a QUERY .
4.) Numbers such as currency are simply searched, without being enclosed in quotes. So the dollar amount concatenation looks like this: "select G where F = "&B3 .
5.) Dates within QUERY formulas must be set up a specific way; that is, they must be converted to text in a specific format that matches how the query language sees them. So the date-match concatenation looks like this: "select G where ... E = date '2021-11-23'" Notice the use of date and the single quotes within the Select clause that come before and after the date-string that will be formed by the TEXT function (which must be in the format yyyy-mm-dd only). Since your dates are variables, they must be interposed like this: "select G where ... E = date '"&TEXT(A3,"yyyy-mm-dd")&"'"
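Putting points 4 and 5 together, a complete single-row formula might look like the sketch below. The cell references A3 (date) and B3 (order value) and the columns E, F and G come from the fragments above; the range $E$3:$G and the 0 for headers are assumptions about where the Merchant Report data starts.

```
=QUERY($E$3:$G, "select G where F = " & B3 & " and E = date '" & TEXT(A3, "yyyy-mm-dd") & "'", 0)
```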
All of that said, using QUERY this way for the purposes in the original post would have required a separate QUERY formula per row. And then there is the added complication that a single date-amount combination may have more than one tax amount assigned to it, meaning that the QUERY for the first of those would return two rows of information, not one. That means that even if there were a separate QUERY dragged down into each cell of a column, this would result in an error at each point where the return would be more than one cell/row, because room would not have been left below each such QUERY for its expanded results.
All said, QUERY is the wrong function here. On the other hand, as I provided above, VLOOKUP is a concise, single-cell formula that can produce the results simply and without such conflicts.
With the Query Builder for Google Sheets, you can create Custom Metrics to sync your Google Sheets data with Databox.
Navigate to Metrics > Query Builder to access the Query Builder for Google Sheets. Click the green + Create Custom Metric button and select your connected Google Sheet from the Data Source drop-down list.
Connecting Google Sheets and accessing Query Builder is available on the Professional and Performer plans. Request a free trial of Google Sheets by following these steps.
A Value: Select a cell or a range of cells from your Google Sheet that stores the Metric Value(s). These cells must contain numerical values (Currency and other Unit formats are supported). Select the cell(s) and navigate to Format > Number > Number in your Google Sheet to verify that the entries are entered as Numbers and not Strings.
A cell or a range of cells can be entered using A1 notation or by highlighting the selection directly in the Spreadsheet Preview section at the bottom of the Query Builder. Learn more about A1 notation here.
If you select an entire row or column as the Value selection, any new entries in the selected row or column will automatically sync the new data to the Custom Metric in Databox.
How to Create a Custom Google Sheets Metric without Dimensions [Example]
Let’s say we want to create a Custom Metric to track Impressions from the Google Sheet below.
How to Create a Custom Google Sheets Metric with Dimensions [Example]
Let’s say we want to create a Custom Metric to track Impressions by Platform from the Google Sheet below.
In the Google Sheet Preview, we’ll select column C as the Value. Since we are selecting the full column, any additional Values added to the Worksheet will be automatically synced with Databox
Further, we’ll select the Date column from our Google Sheet, which is column A
We want to view Impressions based on the Platform they were tracked in, so we will select column B, Platform, as the Dimension
We want to get the sum of all values per selected Date Range. Therefore, we’ll leave preselected SUM as an Aggregation function
Next, we’ll select the visualization that we would like associated with the pre-built Datablock for this Custom Metric from the Data Preview section. We are interested in analyzing the distribution of Impressions across the Platforms, so we will select a Pie Chart
We’ll enter Impressions by Platform as the Custom Metric Name
Click Save to save the Custom Metric
How to Add a Custom Google Sheets Metric to a Databoard from the Metric Library
Click on the Metric Library icon on the lefthand side of the Designer
Select the Google Sheet for which you want to view Datablocks from the Data Source drop-down list. The Custom Metrics that have been created for the Google Sheet will populate the Metric Library
How to Add a Custom Google Sheets Metric to a Databoard from the Visualization Library
- Each Google Sheets Custom Metric supports only one data Aggregation function. To view the same Google Sheets Custom Metric with different Aggregation functions selected (i.e., “SUM” on one vs. “AVG” on another), duplicate Custom Metrics and create the views you desire by following these steps.
The QUERY function is probably one of the most powerful functions in Google Sheets. It is very versatile and can be applied to both simple and complex problems. However, for a newbie, it can seem a bit complicated. The good news is that if you know the rules of the function and take a look at some examples, you will surely find hundreds of ways to use it.
In this tutorial, we will help you understand the Google Sheets QUERY function, its syntax, and how to use it. To help you learn how to apply the function in different scenarios, we will take it step by step and explain it with examples, from simple queries to more complex queries.
What is the QUERY function?
The function has three main parameters:
=QUERY(data, query, headers)
These parameters are pretty straightforward.
- Data: The range of cells that contain the source data
- Query: A search statement that describes how to extract what you want from the source data
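- Headers: An optional number giving how many header rows sit at the top of the source data. If you leave it out, Google Sheets guesses from the content; if you specify more than one header row, they are combined into a single heading in the destination sheet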
The flexibility and power of the QUERY function comes from the Query argument, as you’ll see below.
How to create a simple query formula
The QUERY formula is especially useful when you have a very large data set that you need to extract and filter data from.
The following examples use U.S. SAT high school performance statistics. In this first example, you will learn how to write a simple QUERY formula that returns all high schools and their data where “New York” is in the name of the school.
Create a new sheet to place the query results. In the top left cell, type =Query(. When you do this, you will see a popup with required arguments, an example, and useful information about the function.
Next, assuming you have the source data on Sheet1, complete the function as follows:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B LIKE '%New York%'")
This formula includes the following arguments:
- Cell range: The data range A1 to F460 in Sheet1
- SELECT statement: A SELECT statement that requests any data in columns B, C, D, E, and F where column B contains text that has the words “New York” in it
If you want to find an exact school name from the list, you can write the query:
=Query(Sheet1!A1:F460,"SELECT B,C,D,E,F WHERE B = 'New York Harbor High School'")
The = operator finds an exact match and can be used to find matching numbers or text in any column.
Because the QUERY function of Google Sheets reads much like plain English, you can extract the data you need from a large data set using simple query statements like the ones above.
Using the QUERY function with a comparison operator
Comparison operators allow you to use the QUERY function to filter data that does not meet a condition.
You have access to all of the following operators in a QUERY function:
- =: Values match the search value
- <: Values are less than the search value
- >: Values are greater than the search value
- <=: Values are less than or equal to the search value
- >=: Values are greater than or equal to the search value
- <> and !=: Search value and source values are not equal
Using the same SAT sample data set from above, let’s take a look at how to see which schools had a mathematics mean above 500 points.
The QUERY formula is a great way of being able to perform basic SQL-like commands on a data range. The signature of the QUERY formula is as follows:
=QUERY(data, query, [headers])
The first parameter of the QUERY formula is the data being operated on, with the second parameter being the SQL-like statement for what is needed, and the third optional parameter is how many rows of headers to include. But how do you add multiple ranges to the first parameter?
To use multiple ranges in a QUERY formula wrap the ranges in the data set notation {} .
For example, QUERY({A2:A, D2:D}, …) gets the range of cells from A2 to the bottom, and the range from D2 to the bottom, and combines them as one data set for use in the QUERY formula.
|   | A    | B     | C            | D   | E    | F   |
| 1 | John | Smith | 1 Lane St    | NSW | John | NSW |
| 2 | Jane | Doe   | 2 Martin Pl  | NSW | Jane | NSW |
| 3 | John | Doe   | 3 Query Lane | NSW | John | NSW |
Formula in cell E1: =QUERY({…}, "SELECT *")
Using multiple ranges in QUERY data parameter
How Do You Reference Multiple Ranges In QUERY?
If you have multiple ranges in your QUERY formula by using the data set annotation above, how do you reference these in your QUERY statement (the second parameter)?
To refer to a range when multiple ranges are being referenced in the QUERY formula’s first parameter, use the syntax Col followed by the column number according to the order of the ranges inserted into the first parameter.
For example, in QUERY({A2:A, D2:D}, …) you would use Col1 when referring to the range in A2:A, and Col2 when referring to the range in D2:D.
|   | A    | B     | C            | D   | E    |
| 1 | John | Smith | 1 Lane St    | NSW | John |
| 2 | Jane | Doe   | 2 Martin Pl  | NSW | Jane |
| 3 | John | Doe   | 3 Query Lane | NSW | John |
Formula in cell E1: =QUERY({…}, "SELECT Col1")
To reference each of the columns in the QUERY statement use Col followed by the column number (e.g. Col1 )
Be mindful when using the Col annotation when referencing your ranges in the data parameter that you correctly capitalise Col – with a capital C at the beginning. The case is important!
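As a small sketch that combines the data set notation with Col references, the formula below joins columns A and D and keeps only the rows where the second range equals NSW. The filter value is taken from the sample table; the 0 tells QUERY that the ranges contain no header row, which is an assumption about the sheet.

```
=QUERY({A2:A, D2:D}, "SELECT Col1, Col2 WHERE Col2 = 'NSW'", 0)
```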
How To Handle Range Errors With Multiple QUERY Ranges
When using multiple ranges in your data parameter you may find you get an error such as the following:
Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 3. Actual: 2.
QUERY formula error
When using multiple ranges in your data parameter, they need to contain the same length of rows.
For example, QUERY({A1:A3, D1:D2}, …) will not work as the row height for both ranges is different. The first range has 3 rows, whereas the second range has 2 rows.
Make sure the row heights are the same.
Each range in the data parameter needs to be of the same row height, otherwise you will get the error shown above.
Therefore, check the values inserted into your data parameter contain the same row height. If you are using functions such as SPLIT it returns values that span over columns, therefore, to convert that range to rows you need to wrap it with TRANSPOSE .
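A sketch of the TRANSPOSE fix: SPLIT on its own returns three values across one row, which cannot sit next to the three-row range A1:A3, while TRANSPOSE turns them into three rows so the heights match. The text being split is a made-up example.

```
=QUERY({A1:A3, TRANSPOSE(SPLIT("NSW,VIC,QLD", ","))}, "SELECT Col1, Col2", 0)
```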
Using multiple ranges in a QUERY is possible provided each range is inserted in the first parameter, is wrapped in the data set notation {}, and all ranges imported have the same row height.
With Connected Sheets, you can access, analyze, visualize, and share billions of rows of BigQuery data from your Sheets spreadsheet.
You can also do the following:
Collaborate with partners, analysts, or other stakeholders in a familiar spreadsheet interface.
Ensure a single source of truth for data analysis without additional spreadsheet exports.
Streamline your reporting and dashboard workflows.
Connected Sheets runs BigQuery queries on your behalf either upon your request or on a defined schedule. Results of those queries are saved in your spreadsheet for analysis and sharing.
Example use cases
The following are just a few use cases that show how Connected Sheets lets you analyze large amounts of data within a sheet, without needing to know SQL.
Business planning: Build and prepare datasets, then allow others to find insights from the data. For example, analyze sales data to determine which products sell better in different locations.
Customer service: Find out which stores have the most complaints per 10,000 customers.
Sales: Create internal finance and sales reports, and share revenue reports with sales reps.
Direct access to BigQuery datasets and tables is still controlled within BigQuery. If you want to give a user Sheets access only, share a spreadsheet and do not grant BigQuery access.
A user with Sheets-only access can perform analysis in the sheet and use other Sheets features, but the user will not be able to perform the following actions:
Manually refresh the BigQuery data in the sheet.
Schedule a refresh of the data in the sheet.
Before you begin
First, make sure that you meet the requirements for accessing BigQuery data in Sheets, as described in the “What you need” section of the Google Workspace topic Get started with BigQuery data in Google Sheets. An enterprise workspace account is required to use Connected Sheets with BigQuery.
If you do not yet have a Google Cloud project that is set up for billing, follow these steps:
Sign in to your Google Cloud account. If you’re new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
When you finish this topic, you can avoid continued billing by deleting the resources you created. See Cleaning up for more detail.
Use Connected Sheets with BigQuery
The following example uses a public dataset to show you how to connect to BigQuery from Sheets.
Create or open a Sheets spreadsheet.
Click Data, click Data connectors, and then click Connect to BigQuery.
Click Get connected.
Select a Google Cloud project that has billing enabled.
Click Public datasets.
In the search box, type chicago and then select the chicago_taxi_trips dataset.
Select the taxi_trips table and then click Connect.
Your spreadsheet should look similar to the following:
Start using the spreadsheet. You can create pivot tables, formulas, and charts using familiar Sheets techniques.
Although the spreadsheet shows a preview of only 500 rows, any pivot tables, formulas, and charts use the entire set of data. You can also extract the data to a sheet. For more information, see the Connected Sheets tutorial.
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial:
If you plan to explore multiple tutorials and quickstarts, reusing projects can help you avoid exceeding project quota limits.
Get more information from the Google Workspace Get started with BigQuery data in Google Sheets topic.
View videos from the Using Connected Sheets playlist on YouTube.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
In my previous articles, I’ve mainly focused on in-depth, lengthy content. Hereby I am starting a new series of more digestible articles – delicious bites of marketing & tech delicacies.
Today, this one is about Google Sheets’ Query Function. This function is nothing short of amazing. For me, it’s the nail in the coffin for MS Excel. But decide for yourself!
- simplifies complex functions and hence improves adaptability & readability
- it’s using the same principles as SQL – so no need to re-think!
- it’s often faster than using complex INDEX/MATCH or VLOOKUP
- You can’t easily ‘sort’ selections which use query(), the sort needs to be part of the query() itself
- If you have never used SQL before, it might appear more complex first. But don’t give up here!
Alright, I won’t go into more detail, but if you’re interested in how to use Query() – here’s a great article from CodingIsForLosers – ‘A weapon of mass laziness’.
Today, I’d like to present a nifty solution to a problem I recently encountered – querying column names!
The problem: You can’t Query the Column Header by Name in Google Spreadsheets
What do I mean by that?
Let’s use a sample table with the following information:
In a normal function, Google Sheets would shift the reference accordingly – but not here. It will break the function. Not fun!
The solution: Query by Column Name
1. First, we need a formula that returns the position of the column.
We can use the ADDRESS() function in combination with MATCH() for that. ADDRESS() returns the cell position as a string. The structure is as follows:
ADDRESS(row, column, [absolute_relative_mode], [use_a1_notation], [sheet])
- row : this can be “1”, even if your header row is not in row 1 – because it just depends on the range that you provide.
- column : this is unknown, so we’ll use the MATCH() function to find the number of the column we want to reference (e.g. “salary” in our example).
- [absolute_relative_mode] – optional, “1” by default. 1 is row and column absolute (e.g. $A$1), 2 is row absolute and column relative (e.g. A$1), 3 is row relative and column absolute (e.g. $A1), and 4 is row and column relative (e.g. A1). We are using 4 in this example so that the returned address contains no $ signs.
- use_a1_notation – optional, “TRUE” as default.
- sheet – optional, absent by default. Needs to be changed if you’re referencing a different sheet (in our case we don’t).
What do we need to set for MATCH()?
Here’s the MATCH() structure: MATCH(search_key, range, [search_type])
- search_key – The value to search for. We’re using “salary” in our example.
- range – This is the range of all the possible headers you want to reference. Note, it must be one-dimensional (e.g. A1:F1 ). In our case, it’s just $1:$1 ($ to make the reference absolute).
- search_type – optional, 1 by default. We want to make sure the function is searching for an exact match and the range is not sorted, so we’re using 0.
Ok, so here’s our function:
2. Remove the row number from the returned cell.
We’ll just wrap the function with a simple SUBSTITUTE() function to replace “1” with nothing: =SUBSTITUTE(ADDRESS(1, MATCH(D10, $1:$1, 0), 4), "1", "")
Note: This only works if your header row is in the first row. There are ways to make this more flexible, e.g. by searching for the first number in the string (the row) and removing everything from that point on (with the LEFT() function). Add a comment if you’re having problems with that!
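For readers who want to see the lookup logic outside of a spreadsheet, here is a minimal Python sketch of what steps 1 and 2 compute: find the 1-based position of a header (like MATCH() with search type 0) and turn that position into a column letter (like ADDRESS() with the row number stripped). The function names and the sample headers are made up for illustration; this is not the Sheets formula itself.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while index > 0:
        # Shift to 0-based so that 26 maps to "Z" instead of rolling over.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def letter_for_header(headers: list[str], name: str) -> str:
    """Exact-match lookup of a header, returning its column letter."""
    position = headers.index(name) + 1  # 1-based, like MATCH(name, range, 0)
    return column_letter(position)


# Hypothetical header row, only for illustration.
sample_headers = ["name", "age", "salary"]
salary_column = letter_for_header(sample_headers, "salary")
```

Because the letter is computed from the position directly, this version does not depend on the header sitting in row 1.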
3. Combine it back into the Query() function
Adding this into the query function looks a bit complicated, but there’s a standard format for this:
- The referenced column must be wrapped in single quotes ( ' ) for strings (if you’re referencing numbers, you don’t need that)
- Close and re-open the query with double quotes ( " )
- Use ampersands ( & ) to add the referenced cell to the query string
Nice… so what’s so cool about this? Well, I mentioned that it makes the whole spreadsheet more versatile and less error prone. However, there’s more.
Bonus: How to use query() with drop down fields
Wouldn’t it be cool to have drop downs to get different values from our data range in case we need them?
Let’s assume we’re interested not only in the average salary, but also the average age. How do we get that?
We just need to change the value of our input cell ( D10 in my example). And we can use a dropdown by going to Data>Data validation and use the “List of Items” functionality.
We can even use this to change the aggregate function. Maybe we’re interested in the MAX, MIN and SUM as well? Well, here you go. We’re just using another cell to provide these as drop downs, and then reference it in the query:
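The idea of building the query text from two input cells can be sketched in Python as plain string assembly. This is only an illustration under assumptions: the function name and the allowed list are hypothetical, and the column letter is assumed to have been looked up already.

```python
# Aggregate functions offered in the hypothetical drop down.
ALLOWED_AGGREGATES = {"AVG", "MAX", "MIN", "SUM"}


def build_aggregate_query(aggregate: str, column_letter: str) -> str:
    """Assemble a query string such as 'select MAX(C)' from two inputs."""
    if aggregate.upper() not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    if not column_letter.isalpha():
        raise ValueError(f"not a column letter: {column_letter}")
    return f"select {aggregate}({column_letter})"


query_text = build_aggregate_query("MAX", "C")
```

Restricting the inputs to a known list plays the same role as data validation on the drop down cells: the query can only ever contain values you expect.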
Here’s my sample spreadsheet if you’d like to play around or make a copy for yourself: Click to see sample spreadsheet
All working? Congrats! Now go and make some awesome spreadsheets, and if it was useful, please leave a comment, tweet me at @joradig or drop me a line on LinkedIn. It really keeps me going and motivated.
Whether or not it was one of my resolutions, one of the things I want to do more this year is try to make more use of stuff that’s already out there, and come up with recipes that hopefully demonstrate to others how to make use of those resources.
So today’s trick is prompted by a request from @paulbradshaw about “how to turn a spreadsheet into a form-searchable database for users” within a Google spreadsheet (compared to querying a google spreadsheet via a URI, as described in Using Google Spreadsheets as a Database with the Google Visualisation API Query Language).
I’m not going to get as far as the form bit, but here’s how to grab details from a Google spreadsheet, such as one of the spreadsheets posted to the Guardian Datastore, and query it as if it was a database in the context of one of your own Google spreadsheets.
This trick actually relies on the original Google spreadsheet being shared in “the right way”, which for the purposes of this post we’ll take to mean – it can be viewed using a URL of the form:
(The &hl=en on the end is superfluous – it doesn’t matter if it’s not there…) The Guardian Datastore folks sometimes qualify this link with a statement of the form Link (if you have a Google Docs account).
If the link is of the form:
just change pub to ccc
So for example, take the case of the 2010-2011 Higher Education tables (described here):
The first thing to do is to grab a copy of the data into our own spreadsheet. So go to Google Docs, create a new spreadsheet, and in cell A1 enter the formula:
When you hit return, the spreadsheet should be populated with data from the Guardian Datastore spreadsheet.
So let’s see how that formula is put together.
Firstly, we use the =ImportRange() formula, which has the form:
This says that we want to import a range of cells from a sheet in another spreadsheet/workbook that we have access to (such as one we own, one that is shared with us in an appropriate way, or a public one). The KEY is the key value from the URL of the spreadsheet we want to import data from. The SHEET is the name of the sheet the data is on:
The RANGE is the range of the cells we want to copy over from the external spreadsheet.
Enter the formula into a single cell in your spreadsheet and the whole range of cells identified in the specified sheet of the original spreadsheet will be imported to your spreadsheet.
Give the sheet a name (I called mine ‘Institutional Table 2010-2011’; the default would be ‘Sheet1’).
Now we’re going to treat that imported data as if it was in a database, using the =QUERY() formula.
Create a new sheet, call it “My Queries” or something similar and in cell A1 enter the formula:
=QUERY('Institutional Table 2010-2011'!A1:K118,"Select A")
What happens? Column A is pulled into the spreadsheet is what. So how does that work?
The =QUERY() formula, which has the basic form =QUERY(RANGE,DATAQUERY), allows us to run a special sort of query against the data specified in the RANGE. That is, you can think of =QUERY(RANGE,) as specifying a database; and DATAQUERY as a database query language query (sic) over that database.
So what sorts of DATAQUERY can we ask?
The simplest queries are not really queries at all, they just copy whole columns from the “database” range into our “query” spreadsheet.
- =QUERY('Institutional Table 2010-2011'!A1:K118,"Select C") to select column C;
- =QUERY('Institutional Table 2010-2011'!A1:K118,"Select C,D,G,H") to select columns C, D, G and H;
So, looking at the copy of the data in our spreadsheet, to import the columns relating to the Institution, Average Teaching Score, Expenditure per Student and Career Prospects, I’d select columns C, D, F and H:
=QUERY('Institutional Table 2010-2011'!A1:K118,"Select C,D, F,H")
to give this:
(Remember that the column labels in the query refer to the spreadsheet we are treating as a database, not the columns in the query results sheet shown above.)
All well and good. But suppose we only want to look at institutions with a poor teaching score (column D), less than 40? Can we do that too? Well, yes, we can, with a query of the form:
"Select C,D, F,H where D < 40"
(The spaces around the less than sign are important… if you don’t include them, the query may not work.)
Here’s the result:
(Remember, column D in the query is actually the second selected column, which is placed into column B in the figure shown above.)
Note that we can order the results according to other columns too. So for example, to order the results according to increasing expenditure (column F), we can write:
"Select C,D, F,H where D < 40 order by F asc"
(For decreasing order, use desc.)
Note that we can run more complex queries too. So for example, if we want to find institutions with a high average teaching score (column D) but low career prospects (column H) we might ask:
"Select C,D, F,H where D > 70 and H < 70"
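To make the select / where / order by behaviour concrete, here is a small Python sketch that does by hand what a query such as Select C,D,F,H where D < 40 order by F asc asks for. The three rows are invented for illustration and are not taken from the Guardian data.

```python
# Hypothetical rows keyed by the source column letters.
rows = [
    {"C": "Institution A", "D": 35, "F": 6, "H": 60},
    {"C": "Institution B", "D": 72, "F": 5, "H": 65},
    {"C": "Institution C", "D": 38, "F": 3, "H": 55},
]


def low_teaching_score(data):
    """Filter on D < 40, order by F ascending, then keep columns C, D, F, H."""
    matching = [row for row in data if row["D"] < 40]      # where D < 40
    matching.sort(key=lambda row: row["F"])                # order by F asc
    return [(row["C"], row["D"], row["F"], row["H"]) for row in matching]


result = low_teaching_score(rows)
```

Note that the filter and the sort both refer to the source columns, while the output simply lists the selected values left to right, which is why column D of the source ends up in the second output column.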
Over the next week or two, I’ll post a few more examples of how to write spreadsheet queries, as well as showing you a trick or two about how to build a simple form-like interface within the spreadsheet for constructing queries automatically; but for now, why not try having a quick play with the =QUERY() formula yourself?
A travel guide for technical marketing – get inspired and start exploring yourself.
Google Sheets is the cloud-based alternative to Microsoft Excel. Although Excel might still be the choice for some business needs, I’d strongly advise everybody to know your way around Google Spreadsheets. It’s fast, free and pretty easy to use.
We’re not going to talk about the basic stuff here (you should know the basic functions in spreadsheets) but about two advanced functions that are very useful and will probably move you well above the crowd of people that say they know Excel.
1) Index Match
Most people will use vlookup much too often. Really try to master INDEX(MATCH()) instead. It’s way more flexible. Just a bit harder to remember.
I always just remember one thing. You start with =INDEX( and the first thing to enter is the cell reference to the column that holds the desired results. Once you’ve put that in, say like this: =INDEX(C:C , just open the MATCH formula and start off with the “search phrase” like so: =INDEX(C:C;MATCH(A1 , then refer to the column that you want to search in, e.g. INDEX(C:C;MATCH(A1;F:F . Super important – always end the MATCH formula with a 0 so that your finished function looks something like INDEX(C:C;MATCH(A1;F:F;0))
I’ve prepared a google sheet to show you a working example also visualising one of the advantages of Index-Match: the order of columns is not important – with VLOOKUP you need to worry about that stuff.
Just try to always use INDEX(MATCH()) instead of VLOOKUP() – you’ll understand the advantages soon enough.
2) Query Function
This one is a bit more complex but well worth your time. The Query-Function uses the Google Query Language to make data transformations quick and easy. The Google Query Language resembles MySQL a lot and just as MySQL it allows you to quickly drill down into large Datasets / Tables.
I’ve prepared another google sheet to show you a few examples. In the document you’ll see a simple dataset and a few options for a drilldown with the query function.
The Query Function looks like this:
In this example A:Z would represent your dataset – just reference the whole dataset you want to grab data from.
The next parameter always starts with select and is enclosed in double quotation marks. I’ll try to explain two different queries as an example here:
=QUERY(A:Z;"select B where A = '2016'")
So, let’s say you have revenue in column B and the year in column A . With the query you’ll just get the values from column B where column A is exactly 2016 . Pretty easy, right?
Off to something a bit more complex:
=QUERY(A:Z;"select A,sum(B) group by A")
We’re looking at the same dataset, but this time we want the sum of revenue grouped by the year. With select A,sum(B) we’re basically telling the function that we want A (the year) in the first column and sum(B) (the sum of revenue) in the second column. As there could be two rows with data for the same year, we need to tell the function over which variable we want to calculate the sum. That’s what group by A is for.
We don’t want the result to look like this:
2011 | 2000
2011 | 1000
2012 | 5000
But instead what we want to get is this:
2011 | 3000
2012 | 5000
That’s why we need to group the result – in this case by A (the year)
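The effect of group by can be reproduced in a few lines of Python, using the same three rows as above. This is just a sketch of the grouping idea, not how Sheets implements it; the function name is made up.

```python
from collections import defaultdict


def sum_by_year(rows):
    """Add up revenue per year, like 'select A,sum(B) group by A'."""
    totals = defaultdict(int)
    for year, revenue in rows:
        totals[year] += revenue
    # Sort by year so the output order is predictable.
    return sorted(totals.items())


ungrouped = [(2011, 2000), (2011, 1000), (2012, 5000)]
grouped = sum_by_year(ungrouped)
```

Each year appears once in the result, with the revenue of all its rows added together.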
Check out the documentation of the Google Query Language for a lot more examples and many more options. These examples really only show you a starting point of all the advantages.
If you are a hard-core Excel only user this article is not for you. While Excel is powerful especially combined with Power BI, sometimes Google Sheets is the answer. In an industry such as digital marketing where everything we do is online, highly collaborative, and shared with outside clients Google Sheets has its benefits. We have a lot of clients with special needs when it comes to reporting and we respond to those needs with custom built Sheets dashboards. The two largest benefits of doing this are that the data can be automatically pulled in and it refreshes automatically so the client always has up-to-date insights.
The best instrument for building PPC reports is Google Sheets query function. I will cover some of the most important initial items to learn when starting out. This is not a lesson in the overall syntax of the query function so some familiarity is required.
Referencing Another Cell
It’s possible to create dynamic query data based on information in another cell. I use this when creating reports to have the option to view the metric data one by one instead of laying all metrics out at once. This is especially beneficial when linking the data to a chart. When referencing another cell within a query function you’ll need to get the syntax correct based on the referenced cell format.
For the context of the example below, I have a data validation in C4 (Google) for my platform options and a vlookup in D3 (D) pulling in the column letter of the metric in C3 (impressions) from a separate table. When I change the metric in C3 the letter in D3 automatically changes as well.
Looking at the query function you can see that I am referencing both C4 and D3 in the formula but with slightly different syntax. The data that D3 is referencing is a value which requires a double quote and an ampersand on either side. The C4 cell is referencing data that is text and requires a single quote on the outside in addition to the double quote and ampersand. If you want to know more about the why behind this check out this document, Google Sheets Query Functions.
Another common theme in PPC reports is utilizing date conditions from another cell when querying data. Firstly, your date has to be in the format YYYY-MM-DD. You can accomplish this with a text function.
Furthermore, when referencing this cell within a query, you must include the date function prior to the single quote, double quote, and ampersand.
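The three quoting rules described above (plain for numbers, single quotes for text, the date keyword plus a quoted YYYY-MM-DD string for dates) can be summarised in a short Python sketch. The helper name and the column letters in the usage lines are assumptions for illustration only.

```python
import datetime


def query_literal(value) -> str:
    """Format a value the way it should appear inside a query string."""
    if isinstance(value, datetime.date):
        # Dates need the 'date' keyword and the YYYY-MM-DD format.
        return f"date '{value.strftime('%Y-%m-%d')}'"
    if isinstance(value, str):
        # Text values are wrapped in single quotes.
        return f"'{value}'"
    # Numbers are inserted as they are.
    return str(value)


# Hypothetical columns: A holds the platform, B a date, D a metric.
platform_clause = "select D where A = " + query_literal("Google")
date_clause = "select D where B >= " + query_literal(datetime.date(2020, 1, 31))
number_clause = "select A where D > " + query_literal(1000)
```

In the spreadsheet the same joining is done with ampersands; the sketch only shows which characters have to surround each kind of value.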
Combining Multiple Data Sheets
Want to bring data together across multiple sheets? You can do that with a query function. The syntax is easy: add curly brackets at the beginning and end of the data ranges and use semicolons to separate them. Keep in mind that your columns must be in the same order across all sheets, which may require adding in some dummy columns.
Importrange with Query
Sometimes I work with a lot of data and Google Sheets cannot handle all of it in one doc. This means that I have to pull in data from other sheets as I need it. This is fairly easy with an importrange function inside of a query function: the importrange function pulls the data from the external doc’s set data range, and the query runs on the result. However, one very annoying thing to note is that you cannot use column letters within queries based on imported ranges. Instead, you must use the column number formatted as Col4.
Querying Calculated Metrics
This last one is almost common sense, but definitely worth noting as it once was a foreign concept to me. If you are using a query function to pull in metrics such as clicks or conversions, you can use sum() within a query. However, if you need to pull in rates such as CTRs or conversion rates you can’t simply take the sum(), because that doesn’t make sense, or the avg(), because this is inaccurate.
You must re-create the formula used to calculate the metric. For example, if you want to query CTR it will look like this: sum(clicks)/sum(impressions), replacing the text with the column letter of course.
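A quick Python sketch shows why averaging the per-row rates is inaccurate. The two rows of clicks and impressions are invented numbers, chosen only to make the gap obvious.

```python
# Hypothetical rows of (clicks, impressions).
rows = [(10, 100), (10, 1000)]

total_clicks = sum(clicks for clicks, _ in rows)
total_impressions = sum(impressions for _, impressions in rows)

# Correct: re-create the metric from the summed components.
overall_ctr = total_clicks / total_impressions

# Inaccurate: average the per-row rates, ignoring how large each row is.
average_of_rates = sum(clicks / impressions for clicks, impressions in rows) / len(rows)
```

Here the overall CTR is 20 / 1100, roughly 1.8%, while the average of the two row rates (10% and 1%) is 5.5%, because the small row is given the same weight as the large one.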
Google Sheets query function isn’t perfect and it is missing some functionalities, but it can be a really good tool for manipulating data sets. If you get the query function down pat it will become your new flame and sumifs will only be an occasional fling.
If you use Google Docs spreadsheets in your business operations, perhaps as sales or customer information records, you know that there are some powerful tools available to you for organizing and filtering your data. But, despite these features, maybe you wish that the spreadsheets acted a little more like SQL databases, so you can apply your querying knowledge to them. If that’s the case then you’re in luck, because Google’s QUERY function offers you this very ability.
Log in to your Google Docs account and open the spreadsheet that you want to query.
Open a text editor like Notepad and write out your query. Note that the query language used by Google Docs uses the “Gviz” syntax and there are limitations to the queries that can be made. For example, you cannot “INSERT,” “UPDATE” or “DELETE,” and the columns must be referenced by the letter of the column, not the header you have given it.
Encode the query for use in URLs. In most cases, you’ll only have to swap out spaces for “%20” (without quotations) and commas for “%2C”. If you’re unsure about how to do this, go to the “Query Language Reference,” scroll to “Setting the Query in the Data Source URL” and paste your query into the text box. Then click “Encode” to produce the encoded version.
Add “&tq=” to the end of your spreadsheet’s URL and then copy and paste your encoded query on to the end. The result will look something like this:
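If you prefer to do the encoding step yourself, Python’s standard library can do it with urllib.parse.quote. The sample query below is made up for illustration, and only the “&tq=” suffix is built here, since the spreadsheet URL itself depends on your document.

```python
from urllib.parse import quote


def encode_query(query: str) -> str:
    """Percent-encode a query so it can be appended to a URL."""
    # safe="" makes sure commas and other punctuation are encoded too.
    return quote(query, safe="")


# Hypothetical query, only for illustration.
sample_query = "select A, B where C > 10"
url_suffix = "&tq=" + encode_query(sample_query)
```

Spaces become %20 and commas become %2C, as described above; other characters such as > are encoded as well.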
In today’s technology-driven society, if you’re older than 7 years of age and you aren’t intimately familiar with basic spreadsheet functions like SUM(), AVERAGE(), MAX(), MIN(), and maybe even the occasional COUNTIF(), you were probably born into an Amish community and won’t be reading this anyway. If you consider yourself an expert of the VLOOKUP() with nested INDEX MATCH functions, prepare to have your mind blown.
Like the title says, this a function specific to Google Sheets. Sorry, Excel-lovers…you’re out of luck.
The function we’re talking about today is QUERY(). This function allows the user to create pseudo-natural language look-ups within the spreadsheet. The inputs given within the brackets when using the QUERY() function are similar in structure to the database language called SQL.
If you want to learn by doing, I’ve created a Google Sheet that you can use to practice and follow along. Jerod’s Awesome Query Google Sheet (Full Disclosure: I copied the sample data from this page and modified it very slightly just to give us something to work with.)
As you can see, we have a quick & easy sales table with all the pertinent details. We’ll be performing our magic on this data. Let’s dive in with a relatively basic query. In cell J3 enter the following:
=query(A3:G46,"select * where D='Pencil'", 1)
This can be dissected as:
- A3:G46 — our data table including headers
- "select * where D='Pencil'" — We must include the double quotes here to delineate this section of the command. We’re telling the query, “We want to see everything (that’s the select * part) as long as it says ‘Pencil’ in column D (the where D='Pencil' part).”
- 1 — The “1” at the end simply tells the query command that we included one header row in our table. If you had two header rows this would be a “2,” three header rows would be a “3,” etc. If your data has no header rows, set this to 0. If you omit the argument or set it to -1, Sheets guesses the number of header rows from the data.
The output should look like this (in the area J3:P16):
Well now, that’s pretty damn useful…but as they say in infomercials…WAIT! THERE’S MORE!
Let’s say that we want to know the total sales of pencils during this entire time period. We simply modify our function to read:
=query(A3:G46,"select sum(G) where D='Pencil'", 1)
When we execute this query, the spreadsheet returns sum Total 2135.14. All we did was replace our asterisk (the “everything” selector) with the normal spreadsheet function we wanted to execute on any line that had to do with pencils. At this point, if you’re not seeing the immense power and utility of the QUERY() function then you need to dream a little bigger. Here we go with bigger…
=query(A3:G46,"select C where (B<>'Central' and G>1000) or (E<10)", 1)
This query should give us the names of any sales reps who do not work in the central region (B<>'Central') and whose line item sales are greater than $1,000 (G>1000), or any sales rep who sold fewer than 10 units of any particular item (E<10). I have no idea why you would want to get this particular information…but you can…and it works!
If you want to get the average units sold per salesperson, simply create a pivot:
=query(A3:G46,"select AVG(E) pivot C", 1)
…and you get these results…
Other options available include group by and order by for various kinds of grouping and sorting. In fact, with just a handful of commands, the QUERY() function can provide astounding results. Here’s a link to the simple Google Docs Support Page and a link to the Google Developers Page–both for your light reading and query-function-learning pleasure.
The QUERY() function can even be nested within itself to run exceedingly complex lookups. Spoiler alert! You have to use ARRAYFORMULA() and some other fanciness, but it can be done. I won’t pretend I’m good enough to walk you through one of those examples but there are plenty of them online.
This is a quite simple function to deploy at its most basic level but it retains the ability to provide immensely powerful results when used in a more advanced manner.
If you make something awesome with the QUERY() function, let me know about it. I’d love to see what others are doing with this powerful tool!
Jerod Karam is Vice President of Technical Operations at Netvantage Marketing, an online marketing company specializing in SEO, PPC and social media. Jerod consults with internal teams and external clients on all manner of technical projects, manages the flow of information surrounding the company’s online objectives, manages relationships with external partners and suppliers, and is a constant bother to everyone in terms of maintaining online security.
By: Justin DeWind
Many of us at Atomic Object leverage spreadsheets for various purposes (estimates, hours tracking, finances, etc.), and since we have strong technical backgrounds, we tend to leverage a lot of the functions that spreadsheets provide (avg, max, min, ceiling, sum, etc.). We also tend to push the boundaries of spreadsheets by leveraging multiple functions in one cell and doing some complex filtering.
Effectively, many of us are trying to analyze data in the same way we would analyze data in a database leveraging SQL. Historically, spreadsheet applications did not have a query language to support easier data analysis. But recently, I stumbled across Google Sheets’ QUERY function, which supports querying a spreadsheet dataset using an SQL-like query language.
When to Use QUERY
The query language provides a lot of the same functionality that already exists via spreadsheet functions like AVG, MAX, MIN, SUM, etc. In many cases it won’t provide any more value than those functions and is more verbose than the built-in spreadsheet functions. However, when the analysis requires filtering, grouping, pivoting, and ordering, the query language is a more effective and succinct approach than spreadsheet functions.
The QUERY Function
The QUERY function takes 2 required parameters: the dataset to query (A2:E6, for example), and a text (string) to represent the query itself. The query language is analogous to SQL, but with a limited function set and no need to use the “FROM” keyword since the data set is already defined. You can learn more about the QUERY language on the Google Developers site.
Below is an example spreadsheet I put together with a dataset that represents salaries across different ages and professions. To the right of the dataset there are the following statistics that leverage the QUERY function:
In a nested Query formula in Google Sheets, a Query is written inside another Query. The result of the Subquery or we can say the inner Query is used to execute the outer Query.
Where is Advanced Settings in Google Sheets?
To access these settings, click the ‘Share’ button in the upper right corner of a Google Docs, Sheets, or Slides file, and then select ‘Advanced’.
How do I use advanced filter in Google Sheets?
Filter your data
- On your computer, open a spreadsheet in Google Sheets.
- Select a range of cells.
- Click Data > Create a filter.
- To see filter options, go to the top of the range and click Filter. Filter by condition: Choose conditions or write your own.
- To turn the filter off, click Data > Turn off filter.
What is query formula?
The format of a formula that uses the QUERY function is =QUERY(data, query, headers) . You replace “data” with your cell range (for example, “A2:D12” or “A:D”), and “query” with your search query. The optional “headers” argument sets the number of header rows to include at the top of your data range.
Does Excel have query like Google Sheets?
You can then work with live Google Sheets data in Excel. In Excel, open the Data tab and choose From Other Sources -> From Microsoft Query. The Filter Data page allows you to specify criteria. For example, you can limit results by setting a date range.
How do I sort a Google spreadsheet query?
Just change the “Asc” to “Desc” to sort the column B in descending order. The B is the column indicator. If you want to sort the column that contains the first name, change column identifier B to A.
How do I sort Google spreadsheet by date?
Below are the steps to sort by date:
- Select the data to be sorted.
- Click the Data option in the menu.
- Click on ‘Sort range’ option.
- In the ‘Sort range’ dialog box: Select the option Data has header row (in case your data doesn’t have a header row, leave this unchecked)
- Click on the Sort button.
What is Google spreadsheet filter?
A Guide To The Google Sheets Filter Function. The Google Sheets Filter function is a powerful function we can use to filter our data. The Google Sheets Filter function will take your dataset and return (i.e. show you) only the rows of data that meet the criteria you specify (e.g. just rows corresponding to Customer A).
Can you filter in Google Sheets?
In Google Sheets, open the spreadsheet where you want to create a filter view. Click a cell that has data. Create new filter view. After you select the data to filter, click OK.
How do you filter a formula in a spreadsheet?
FILTER can only be used to filter rows or columns at one time. In order to filter both rows and columns, use the return value of one FILTER function as range in another. If FILTER finds no values which satisfy the provided conditions, #N/A will be returned.
Can you filter by month in Google Sheets?
As said, you can use this custom formula to filter by month using the filter menu in Google Sheets. Just change the month number in the formula to filter any months. Anybody who wants to filter the column B using a formula in a new range, here is that Filter formula.
Why can’t I get Vlookup to work?
Solution: You can try to fix this by adjusting your VLOOKUP to reference the correct column. If that’s not possible, then try moving your columns. Another solution is to use a combination of the INDEX and MATCH functions, which can look up a value in a column regardless of its position in the lookup table.
The QUERY() function in Google Sheets gives you the ability to quickly filter and sort your data similar to how you might get data from a database. If you write SQL queries, the QUERY() function feels easy and natural to use. There are a few caveats as I discuss in this episode. If you want to follow along with the exercises I discuss in this episode, make a copy of this Google Sheet which contains the QUERY() functions I mention in the episode.
Basic query to find confirmed cases greater than 50,000
Our data set is from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The data shows confirmed cases, deaths, and recovered cases by country (188 countries) on May 1st:
The first query simply pulls back the list of countries and confirmed cases where the number of confirmed cases is greater than 50,000. Notice how you reference the column letter name versus the actual name of the column in the header row:
The first parameter is covid_data which is a named range in Google Sheets. In this case, it references cells A1:E188 in our data set.
More SQL-like commands
You can do many database-like commands with the QUERY() function. The next example shows how you can use the ORDER BY command to find countries with deaths between 0 and 5 and the resulting list is sorted in descending order:
Check out Ben Collins’ blog post about the QUERY() function to see some of the other SQL commands you can use.
Adding in new calculated columns
In the third query, we get a little more advanced and use the LABEL command to create a new “column” called Case Fatality Rate . This calculation is simply Deaths / Confirmed . Unlike SQL, you put the LABEL at the end of the command instead of in the beginning of the SELECT statement:
Coming from SQL, you’ll need to account for the difference in the order of commands in the query in order for it to work correctly.
Inability to select column names
You’ll notice that you don’t put the actual names of the columns in your header row in the query. This can be a pro or con of the QUERY() function depending on how your underlying data set is structured.
Columns are changing a lot
If your underlying data is constantly “shuffling”, with columns moving around and the structure of the data not set in stone, the QUERY() function will most likely break because you’re referencing the column letter instead of the column name like in a traditional SQL query.
Columns are fixed
If your columns are not shuffling around a lot, this syntax of selecting the column letter may actually be easier for you. This is because you don’t have to type out the long column name in the QUERY() function. If data is simply getting appended to the bottom of your data set, then the QUERY() function should work fine for you because the letters of the columns will always reference the correct columns of data.
PivotTables vs. the QUERY() function
One of the reasons I don’t use the QUERY() function too often is because I find PivotTables to be easy enough to use to filter, sort, and aggregate my data to do my analysis. Additionally, your columns can move around in the underlying data set and the PivotTable will still work since it’s not referencing columns by letter but rather by the name in your header row.
Plotting trend lines for COVID-19
One of the articles I discuss in this episode is this Vox article about how the Council of Economic Advisers may have applied a stock trendline in Excel to “forecast” deaths as a result of COVID-19. The article discusses the concept of “smoothing out” volatile data versus prescribing a forecast, and that line between these two concepts is a bit blurry. This is the cubic chart in Excel which you can easily build from the trendline features in Excel:
And then this is the chart from a CEA Tweet that appears to show the cubic trendline as a potential forecast:
Other Podcasts & Blog Posts
In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:
To give you some background, I created a Microsoft Form to collect responses from individuals in my company that feeds into an Excel spreadsheet (this is a Forms for Excel form, so I can access the spreadsheet and manipulate the data via Excel Online). The form branches into 2 sections with different sets of data. Instead of having 1 spreadsheet with all the responses (which will result in half-blank rows depending on the responses in the Form), I’d like to separate and compile the responses in 2 separate sheets within the same Excel file.
This is my first time back to a company that uses Microsoft in a while so I’m not as familiar. I had been using Google and was able to find some help to do what I wanted in Google Sheets (and before you ask, the reason why I’d like to not use Google Sheets is that it would not be compatible with the other Microsoft-centric tools we’re using. It could not be shared on the various tools that people use here so it would be a total mess):
First, the responses from a Google Form were compiled in a MASTER sheet:
Next, I was able to use a QUERY function to select just the rows that had responses in certain cells (based off the response to “Is this a. ” question/column):
Idea for new event sheet:
As you can see, I was able to copy over only the rows that had responses to certain cells (essentially the cutoff was Columns “S” & “T”) by asking to return cells that had data (i.e. “not null”). Also, there are no blank rows for either sheet as it would just add new rows to each of the separate spreadsheets. This makes it so much easier to read without having to scroll all the way to the right for responses to the “Idea for new event” sheets.
Basically, I wanted to know if something like this was possible in Excel. I have tried PowerQuery with the “remove blank rows” option, but that does not allow me to make changes to the queried data without it editing over it once the query runs again. You would think I could do something simpler with the IF function, but I can’t figure out how that might work. I may very well be missing something simple but, again, it’s been a while since I’ve used Excel.
The ability to query Google Sheets as though it were a database is absolutely awesome. There’s just one small challenge:
You can’t reference columns by header labels, i.e. the names you add to the first row of each column.
This limitation exists probably because the first row of a spreadsheet is no different from all the other rows. It’s just a row. It serves no pre-defined special function such as to specify names of columns.
Fortunately, though, Google Sheets is insanely awesome in a million other ways. And with a little Google Sheets trickery, you can easily query a Google Sheet by the column names in your header row. Here’s how.
As you no doubt know, the Google Sheets QUERY function requires that you reference a column by its letter.
The problem is… what if you don’t necessarily know which column your desired data is going to appear in? For example, maybe you’re pulling in data from a CSV feed using IMPORTDATA. The data could get moved around in that external file, columns might drop in and out, etc.
What we want is to be able to look up columns by a value in the header row of our sheet. Something like:
We’ll achieve our desired result by figuring out the letter name of each column where the first row’s value equals the label we want.
But before we get started, let’s set up an example. Here’s a table of data.
The first step is to search the first row for the desired column name and return the column’s position. To do this, we’ll use MATCH.
This will return the value “3”. In other words, the formula has found the value “Year” in the third column of the first row.
So, now we know the column number… and, because your header is pretty much always going to be the first row, we know the row number.
We can take these two numbers (1 and 3) and pass them to the ADDRESS function.
The ADDRESS function gives us the actual cell reference as a string. In this case, we get “C1”.
Hey! There’s that column letter that we’re looking for. If only there were a function to get rid of that pesky row ID… 😉
As you’ve probably guessed, we’ll apply SUBSTITUTE to get rid of the “1”, leaving just “C”.
All that remains is to plug this into our QUERY function and we can dynamically reference columns by the column names in the first/header row.
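The MATCH, ADDRESS and SUBSTITUTE chain described above can also be traced outside of Sheets. The Python sketch below mirrors those three steps so the logic is easy to follow. The helper names and the header row are illustrative assumptions (only “Year” in the third column comes from the example), and the address helper assumes the relative form of ADDRESS, which is what yields “C1” rather than “$C$1”.

```python
def match(label, header_row):
    """Like MATCH(label, row, 0): 1-based position of the label in the header row."""
    return header_row.index(label) + 1


def address(row, column):
    """Like ADDRESS(row, column, 4): a relative A1-style reference such as 'C1'."""
    letters = ""
    while column > 0:
        column, remainder = divmod(column - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row}"


def column_letter(label, header_row):
    """Strip the row number from the address, as SUBSTITUTE(..., "1", "") does."""
    return address(1, match(label, header_row)).replace("1", "")


# Hypothetical header row; only "Year" in the third column comes from the text.
headers = ["Title", "Director", "Year"]
query = f"select {column_letter('Year', headers)}"
print(query)  # select C
```

Because the letter is looked up each time, the query string keeps pointing at the right column even if “Year” later moves to a different position in the header row.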
QUERY is one of the most versatile and effective functions in Google Spreadsheets, and it is unique to them. I am documenting this function to understand it properly, and for those who do not have a programming background. I have no previous experience with queries, so I will try to illustrate with examples.
If you get confused or stuck somewhere, please post your questions in the Google Docs forum at the following link: https://productforums.google.com/forum/#!categories/docs/spreadsheets. There are experts in this area waiting for your questions.
The Google Spreadsheet query language is designed to be similar to SQL, with a few exceptions. It is a subset of SQL with a few features of its own. If you are familiar with SQL, it will be easy to learn.
The syntax of the function is as follows:
QUERY(data, query, headers)
DATA: the data you want to query. It can be whole columns (open ranges such as A:C), a range of cells such as A1:C10, or the result of a function such as IMPORTRANGE or INDEX.
QUERY: the query string. It is similar to SQL, with one notable exception: there is no FROM clause, since DATA itself acts as the FROM clause.
HEADERS: the number of header rows at the top of your data. For example, if your first row contains headers, specify 1.
Columns have to be referenced by capital letters such as A, B, C if you are querying raw data within the same spreadsheet.
If you are using array formulas to manipulate the data (INDEX, FILTER, IMPORTRANGE, to name a few), then the columns are referenced as Col1, Col2, Col3, etc. Note the capital C in Col1: you have to follow this syntax exactly, otherwise you will get a parse error.
Parse error: an error resulting from code that does not conform to the syntax of the programming language; “syntax errors can be recognized at compilation time”.
Data types: the supported data types are string, number, boolean, date, datetime and timeofday. All values in a column have a data type that matches the column type, or are null.
The syntax of the query language is composed of the following clauses. Each clause starts with one or two keywords. All clauses are optional. Clauses are separated by spaces. The order of the clauses must be as follows:
select: Selects which columns to return, and in what order. If omitted, all of the table’s columns are returned, in their default order.
where: Returns only rows that match a condition. If omitted, all rows are returned.
group by: Aggregates values across rows.
pivot: Transforms distinct values in columns into new columns.
order by: Sorts rows by values in columns.
limit: Limits the number of returned rows.
offset: Skips a given number of first rows.
label: Sets column labels.
format: Formats the values in certain columns using given formatting patterns.
options: Sets additional options.
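Since the clause order is fixed, one way to avoid ordering mistakes is to assemble the query string programmatically. The following is only a Python sketch of that idea: the build_query helper is hypothetical and not part of Google Sheets, and it assumes the clause keywords are select, where, group by, pivot, order by, limit, offset, label, format and options, in that order.

```python
# Clause keywords in the order the query language requires.
CLAUSE_ORDER = [
    "select", "where", "group by", "pivot", "order by",
    "limit", "offset", "label", "format", "options",
]


def build_query(**clauses):
    """Join the supplied clauses in the required order.

    Keyword arguments use underscores for spaces, e.g. group_by="B".
    """
    parts = {name.replace("_", " "): value for name, value in clauses.items()}
    unknown = set(parts) - set(CLAUSE_ORDER)
    if unknown:
        raise ValueError(f"unknown clause(s): {sorted(unknown)}")
    return " ".join(f"{name} {parts[name]}" for name in CLAUSE_ORDER if name in parts)


# Arguments are given out of order on purpose; the output is still ordered.
print(build_query(limit=3, order_by="sum(C) desc", group_by="B", select="B, sum(C)"))
# select B, sum(C) group by B order by sum(C) desc limit 3
```

The resulting string is what would go in the second argument of QUERY.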
Now let’s try to understand these clauses with an example.
Our data set looks like this:
Try this formula:
=QUERY(A:D;"select A,B";1)
The result will be as shown in the image below.
The query language also has special keywords called functions. Functions are bits of code that perform an operation on a value or values. The first kind we will look at performs a mathematical operation on a column: the Sum function, which totals the values in the column named in parentheses.
Suppose we want to sum column C where column B is Nicole:
=QUERY(A:D;"select B, Sum(C) where B = 'Nicole' group by B";1)
the result will be
(1) Put the column you want to total in parentheses right after Sum. The function name is not case-sensitive, so sum(C) works too.
(2) Whenever you use the group by clause, every column in the select clause must either appear in the group by clause or be wrapped in an aggregation function such as Sum; otherwise you may get a value error.
(3) The text you compare column B against has to be in single quotes.
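To make the effect of where plus group by concrete, here is a small Python sketch that computes the same kind of result by hand. The rows are invented stand-ins for columns B and C, and sum_by_name is a hypothetical helper, not something Sheets provides.

```python
from collections import defaultdict

# Hypothetical rows standing in for columns B (name) and C (sales).
rows = [
    ("Nicole", 100),
    ("Adam", 250),
    ("Nicole", 50),
    ("Adam", 25),
]


def sum_by_name(rows, only=None):
    """Total column C per value of column B, optionally filtered like a where clause."""
    totals = defaultdict(int)
    for name, sales in rows:
        if only is None or name == only:
            totals[name] += sales
    return dict(totals)


print(sum_by_name(rows, only="Nicole"))  # {'Nicole': 150}
```

Calling sum_by_name(rows) without a filter corresponds to dropping the where clause: every name gets its own total.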
The result can be sorted using the order by clause:
=QUERY(A:D;"select B, Sum(C) where B <> '' group by B order by B asc";1)
the result will be
(1) Observe that the first column has been sorted in ascending order.
(2) To get descending order, use desc instead of asc.
You can also sort based on the result of Sum(C):
=QUERY(A:D;"select B, Sum(C) where B <> '' group by B order by Sum(C) desc";1)
the results will be
(1) We have sorted the data in descending order based on the sum of sales.
(2) You have to follow the order of clauses listed above.
We can also limit the results to the top 3 or top 2 using the limit clause:
=QUERY(A:D;"select B, Sum(C) where B <> '' group by B order by Sum(C) desc limit 3";1)
We can also rename the column “sum Sales” to “Top 3 Sales” using the label clause:
=QUERY(A:D;"select B, Sum(C) where B <> '' group by B order by Sum(C) desc limit 3 label Sum(C) 'Top 3 Sales'";1)
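The combination of order by ... desc, limit and label can be pictured as sort, slice and rename. The Python sketch below walks through those steps on invented totals; the names, numbers and the top_n helper are assumptions for illustration only.

```python
# Hypothetical per-name totals, as a group by on column B might produce.
totals = {"Nicole": 150, "Adam": 275, "Maria": 90, "Sam": 310}


def top_n(totals, n):
    """Sort by total descending (order by ... desc), then keep n rows (limit n)."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


header = ("Name", "Top 3 Sales")  # the label clause only renames the column
for row in [header, *top_n(totals, 3)]:
    print(row)
```

Note that the label step changes only the header text; the rows themselves are decided entirely by the sort and the limit.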
We can also pivot the data based on the date in column D:
=QUERY(A:D;"select B, Sum(C) where B <> '' group by B pivot D";1)
(1) You might have observed that column D is not listed in the select clause.
(2) Dates are formatted in yyyy-mm-dd format.
(3) The pivot clause is not part of standard SQL; it is one of the query language’s own features.
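A pivot turns each distinct value of the pivoted column into its own output column. As a rough illustration in Python, with invented rows for columns B, C and D and a hypothetical pivot_sum helper, the reshaping looks like this:

```python
from collections import defaultdict

# Hypothetical rows: (name from column B, sales from column C, date from column D).
rows = [
    ("Nicole", 100, "2015-01-01"),
    ("Adam", 250, "2015-01-01"),
    ("Nicole", 50, "2015-01-02"),
]


def pivot_sum(rows):
    """Return {name: {date: total}}: one row per name, one column per date."""
    table = defaultdict(lambda: defaultdict(int))
    for name, sales, date in rows:
        table[name][date] += sales
    return {name: dict(by_date) for name, by_date in table.items()}


print(pivot_sum(rows))
```

In this sketch a name with no sales on a given date simply has no entry for that date, which is comparable to an empty cell in the pivoted sheet.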
In the next blog post we will see how to manipulate data using a cell value.