Josh Bavari's Thoughts

Thoughts on technology and philosophy

Adding PostgreSQL Geometric Type Support to Elixir

about a 6 minute read

In the last week or so, I’ve had a blast playing around with basic Postgres geometric types to do basic earth distance queries.

From my favorite blog, Datachomp shows how to use radius queries in postgres to find the closest place to get a burrito fix. Since I’ve been on an Elixir kick lately, I figured it was time to contribute back to the open source world by adding first class burrito, err, geometric type support.

Initial reaction

I immediately made an Ecto model trying to use the point type in my model:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
defmodule MyApp.LocationPoint do
  use MyApp.Web, :model

  schema "location_point" do
    field :name, :string
    field :date, Ecto.DateTime
    field :location, :point
    timestamps
  end

  @required_fields ~w(name date)
  @optional_fields ~w(location)

  @doc """
  Creates a changeset based on the `model` and `params`.

  If no params are provided, an invalid changeset is returned
  with no validation performed.
  """
  def changeset(model, params \\ :empty) do
    model
    |> cast(params, @required_fields, @optional_fields)
  end
end

Right away, when I ran the commands to retrieve this location from iex, it gave me some errors:

1
2
3
4
5
$ iex -S mix
$ alias MyApp.Repo
$ alias MyApp.LocationPoint
$ Repo.all(LocationPoint)
$ ** (ArgumentError) no extension found for oid `600`

Right away, I knew this mission was up to me to get point support into Postgrex.

In this post, I’ll outline how to add type support to Postgres via the Elixir package, postgrex. We will walk through adding the Point data type in Postgres.

This post will cover:

  • How to see how postgres stores its types (built in and composite)
  • How postgrex does its type lookups
  • Finding the source type – adding it to postgres senders
  • Looking up postgres source code for data mapping
  • Adding new type Point type
  • Adding built in Type structs
  • Adding encode method
  • Adding decode method

How Postgres stores its types

Postgres stores its types in a special system table called pg_type (docs). It defines a few things about the type:

  • Its typelem – how the type is stored – array, or otherwise
  • Its typsend – Output conversion function (binary format), or 0 if none
  • Its typarray – an oid to another array type that has its send method

How Postgrex does type lookups

Postgrex at it’s core is a simple data adapter into PostgreSQL from Elixir. It’s an awesome library, and if you’re using Ecto, you’re already using it!

First, let’s look at how they are loading most types, by looking them up in the pg_type table in postgres:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
  ### BOOTSTRAP TYPES AND EXTENSIONS ###

  @doc false
  def bootstrap_query(m, version) do
    {rngsubtype, join_range} =
      if version >= {9, 2, 0} do
        {"coalesce(r.rngsubtype, 0)",
         "LEFT JOIN pg_range AS r ON r.rngtypid = t.oid"}
      else
        {"0", ""}
      end

    """
    SELECT t.oid, t.typname, t.typsend, t.typreceive, t.typoutput, t.typinput,
           t.typelem, #{rngsubtype}, ARRAY (
      SELECT a.atttypid
      FROM pg_attribute AS a
      WHERE a.attrelid = t.typrelid AND a.attnum > 0 AND NOT a.attisdropped
      ORDER BY a.attnum
    )
    FROM pg_type AS t
    #{join_range}
    WHERE
      t.typname::text = ANY ((#{sql_array(m.type)})::text[]) OR
      t.typsend::text = ANY ((#{sql_array(m.send)})::text[]) OR
      t.typreceive::text = ANY ((#{sql_array(m.receive)})::text[]) OR
      t.typoutput::text = ANY ((#{sql_array(m.output)})::text[]) OR
      t.typinput::text = ANY ((#{sql_array(m.input)})::text[])
    """
  end

You can see that under the hood, we’re querying Postgres and asking it for it’s types, so we can do OID lookups and call the appropriate encoder/decoder methods. From here, we can match up our newly added types encoding/decoding methods.

Finding the source type – adding it to postgres senders

Find information about the geometrics category:

SELECT * from pg_type where typcategory = 'G';

We will see the point type has an oid of 600, which is using a send specification of point_send. Other notable send types for geometries: point_send lseg_send path_send box_send poly_send line_send circle_send.

Thus, we’ll update the send types in postgrex, located in the binary.ex file:

1
2
3
4
5
6
@senders ~w(boolsend bpcharsend textsend varcharsend byteasend
            int2send int4send int8send float4send float8send numeric_send
            uuid_send date_send time_send timetz_send timestamp_send
            timestamptz_send interval_send enum_send tidsend unknownsend
            inet_send cidr_send macaddr_send point_send
            ) ++ @oid_senders

Boom, that gets us the oid to encode/decode off of!

Looking up postgres source code for data mapping

I hopped into the Postgres source code and looked up the struct type for point, found here.

1
2
3
4
5
typedef struct
{
  double    x,
        y;
} Point;

Great, its just two floats, no big deal.

Adding the point struct

Let’s craft our Postgrex stuct type in builtins.ex then!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
defmodule Postgrex.Point do
  @moduledoc """
  Struct for Postgres point.

  ## Fields
    * `x`
    * `y`
  """
  require Decimal
  @type t :: %__MODULE__{x: float, y: float}

  defstruct [
    x: nil,
    y: nil]
end

Adding the encode method

Now since we are sending PostgreSQL binary data, we need to take our data and map it to it’s binary form, via an encode method.

However, postgrex is going to do a type look up, based on the types that we used in that query above.

We’ll add the methods to encode, that does some pattern matching to decipher we are getting the correct sender value.

1
2
def encode(%TypeInfo{type: "point", send: "point_send"}, %Postgrex.Point{} = point, _, _),
  do: encode_point(point)

As you can see, we are encoding, when a TypeInfo tuple is passed with type point and send point_send! Great, we just pass that to this method to parse out the two floats passed in the binary object:

1
2
defp encode_point(%Postgrex.Point{x: x, y: y}),
  do: <<x::float64, y::float64>>

It just takes those two values, and serializes them down to their binary counterparts.

That now handles the test we’ve got to keep us honest:

1
2
3
test "encode point", context do
  assert [[%Postgrex.Point{x: -97, y: 100}]] == query("SELECT $1::point", [%Postgrex.Point{x: -97, y: 100}])
end

This test as promised, takes a Postgrex.Point type, and encodes it to the binary form, and sends it off to Postgres. How beautiful.

Adding the decode method

Now, when we get binary values from Postgres, we need to map that to our Point type we’ve created.

Adding the functions to decode in binary.ex:

1
2
3
4
5
6
7
8
def decode(%TypeInfo{type: "point"}, binary, _, _),
  do: decode_point(binary)

# ..snip..

defp decode_point(<<x::float64, y::float64>>) do
  %Postgrex.Point{x: x, y: y}
end

The real meat and potatoes is, receiving our binary parameter, mapping its individual segmets as two floats, sized 8 bytes, and then with the pattern matching mapping those to our Postgrex.Point struct. QED.

And the test:

1
2
3
4
test "decode point", context do
  assert [[%Postgrex.Point{x: -97, y: 100}]] == query("SELECT point(-97, 100)::point", [])
  assert [[%Postgrex.Point{x: -97.5, y: 100.1}]] == query("SELECT point(-97.5, 100.1)::point", [])
end

Conclusion

Once I finally figured out what pieces were what, I was able to run and create the point type, its mappings, and its senders it required, easily mapping to the struct in Elixir.

I plan to keep working on postgrex, to add first class support for Postgres geometric types.

Cheers!

Comments