Neiro - Functional programming, software architecture
21 Jan 2018

Postgres full-text search using Ecto

PostgreSQL is one of the most popular, stable and common relational database. It’s widely used in Elixir infrastructure and has a great integration with Ecto library.

If you’re into web development you can often face the search problem. You have a large stable database, you have tons of useful information, great tools, but your users desire to find something by arbitrary questions. You can’t use your favourite plain SELECT … WHERE … queries because you need to search for the words, not columns or records.

This if what the full text search stands for.

Fortunately, PostgreSQL has a built-in support of the full-text search. It allows you to parse your data into tokens, convert these tokens to lexemes - normalised forms of words - and, finally, search and make search optimisations.

2 Create a search module

Our next step will be creating a search module. Let’s suppose that you already have a User schema with a username, first name and last name defined. In that case we can implement a simple search query that can be used in your contexts or controllers:

  defmodule App.Users.Search do
    @moduledoc """
    Implementation of the full-text user search

    import Ecto.Query

    @spec run(Ecto.Query.t(), any()) :: Ecto.Query.t()
    def run(query, search_term) do

As you can see here, we’re defining the run function that will accept another Ecto.Query with the search term and will return Ecto.Query though.

We need to escape all of non-words characters from user’s input:

  String.replace(term, ~r/\W/u, "")

Also we need to allow to search by prefix - beginning of the word. Let’s add :* to our search term*:

  defp prefix_search(term), do: String.replace(term, ~r/\W/u, "") <> ":*"

Now we need to implement the search by using Ecto’s fragment macro and totsquery PostgreSQL function:

  def run(query, search_term) do
          "to_tsvector('english', username || ' ' || first_name || ' ' || coalesce(last_name, ' ')) @@

Let’s take a look at the implementation. At first, we need to compose a search token by joining columns:

  to_tsvector('english', username || ' ' || first_name || ' ' || coalesce(last_name, ' ')) @@ to_tsquery(?)

Created tsvector will be tested by operator @@ with tsquery and will return the result if the matching was sucessful.

But this result won’t be as fast as we wanted. Let’s suppose that we have 11K users:

  User |>"meta") |> Repo.explain

  Filter: (to_tsvector('english'::regconfig,
  (((((username)::text || ' '::text) || (first_name)::text) || ' '::text)
  || (COALESCE(last_name, ' '::character varying))::text)
  @@ to_tsquery('meta:*'::text))
  Rows Removed by Filter: 11285
  Planning time: 1.640 ms
  Execution time: 62.235 ms

3 Create an index

Auspiciously, we have opportunity to use the PostgreSQL trigram index in order to improve performance of full-text queries. Let’s modify our transaction:

  defmodule App.Repo.Migrations.IntroducePgSearch do
    @moduledoc """
    Create postgres extension and indices

    use Ecto.Migration

    def up do
      execute("CREATE EXTENSION pg_trgm")

      CREATE INDEX users_trgm_idx ON users USING GIN (to_tsvector('english',
        username || ' ' || first_name || ' ' || coalesce(last_name, ' ')))

    def down do
      execute("DROP INDEX users_trgm_idx")
      execute("DROP EXTENSION pg_trgm")

Now run mix do ecto.rollback, eco.migrate and try to run the search:

  User |>"meta") |> Repo.explain

  Recheck Cond: (to_tsvector('english'::regconfig, (((((username)::text || ' '::text) || (first_name)::text) || ' '::text) || (COALESCE(last_name, ' '::character varying))::text)) @@ to_tsquery('meta:*'::text))
    Heap Blocks: exact=57
    ->  Bitmap Index Scan on users_trgm_idx  (cost=0.00..20.74 rows=65 width=0) (actual time=0.093..0.093 rows=65 loops=1)
          Index Cond: (to_tsvector('english'::regconfig, (((((username)::text || ' '::text) || (first_name)::text) || ' '::text) || (COALESCE(last_name, ' '::character varying))::text)) @@ to_tsquery('meta:*'::text))

  Planning time: 1.348 ms
  Execution time: 0.457 ms

Voila! Now that’s the speed we wanted. Our users will be so happy with this search!

4 Conclusion

What’s next? You can try using the functionality of pgtrgm module, search for similar words or making the search not just prefix-only. See the docs for PostgreSQL full-text search and trigram .

Happy hacking!

Tags: elixir query postgres search