Query JSON data in postgreSQL faster with JSONB
In my daily job as a Product Manager, I often have to do some data exploration to better understand our customers.
In our data, stored in a postgreSQL database, there are many fields formatted as JSON, and I often have to segment our users by those fields.
To do so, I use JSONB, and I wanted to share with you the little tips and shortcuts I use to query JSON in postgres. But first, what is JSONB?
What is JSONB?
JSONB is a structured format for storing JSON. Instead of storing data as plain text, JSONB stores data in a binary format.
Both JSON and JSONB will allow you to store valid JSON values. The main difference between JSON and JSONB will be writing and reading efficiency. As JSON stores data in full text, writing requires no overhead and is therefore slightly faster than with JSONB, that requires to convert text to binary. However, processing JSONB is way faster than processing JSON, as it doesn't require reparsing the data.
You can learn more in the PostgreSQL documentation.
Querying JSONB in postgreSQL
As our postgreSQL database involves a lot of processing on JSON data, we store it as JSONB. Therefore I need to use JSONB to manipulate this JSON data. So how does that work? Well, as it would do for JSON, because the beauty of JSONB is that you manipulate it essentially as you would manipulate JSON.
Let's consider a field myfield
in your table mytable
. (how original!)
This field is a JSON field, with values such as :
{
"key1":"valueA",
"key2":"valueB"
}
I want to select only the rows where "key1" exists and is equal to "valueA".
This is usually used for segmenting based on a value.
Option 1 : use the @> operand
SELECT *
FROM mytable
WHERE myfield @> '{"key1":"valueA"}'
Option 2 : use the ->>
operand
SELECT *
FROM mytable
WHERE myfield->>'key1'='valueA'
The ->>
operand will return, as a text, the value of the key key1
. You can use the operand ->
to return an array.
When accessing values within a JSON field, here are my go-to operands:
Operand | Right Operand Type | Description |
-> | text | Returns the value associated with the right-operand key as an array |
->> | text | Returns the value associated with the right-operand key as a text |
#> | array | Returns the value as an array at a specific JSON path. |
#>> | array | Returns the value as a text at a specific JSON path. |
@> | JSON field | Does the left JSON contain the right JSON? |
<@ | JSON field | Does the right JSON contain the left JSON? |
I want to select only the rows where the top-level 'key3' exists.
The operand ? is used to check the existence of a top-level key.
SELECT *
FROM mytable
WHERE myfield?'key3'
When checking for existence and containment n a JSON field, here are my go-to operands:
Operand | Right Operand Type | Description |
? | text | Is this key a top-level key in my JSON field? |
?| | array of texts | Is any of these keys a top-level key in my JSON field? |
?& | array of texts | Are all of these keys top-level keys in my JSON field? |
I want to see all the values key1 can have.
SELECT DISTINCT myfield->'key1'
Or, if you want them as text
SELECT DISTINCT myfield->>'key1'
That's it for today
You can find more info on the official PostgreSQL documentation.
Looking for more PostgreSQL tips? Check out our Postgres Cheat Sheet
Need to build admin panels or a GUI tool for PostgreSQL? Check out Forest Admin for PostgreSQL