Speeding up checking of IP address membership in CIDR ranges, for large datasets
Asked Answered
M

4

6

In a Postgres DB, I need to filter a set of several hundred thousand rows in a table A by including only those rows for which an IP address column (of type inet) in the row matches any of several thousand IP address blocks (of type cidr) in another table B. I've tried various indexes on the inet addresses in the first table and the cidr ranges in the second, but no matter what I do, the planner does a nested sequential scan, applying the << operator to every pair of IP addresses and prefixes.

Is there a way to speed this up with indexes or other clever tricks? (I can resort to external procedural scripting, but I was wondering if it's doable within Postgres.)

Thanks!

Melancon answered 11/9, 2013 at 23:26 Comment(3)
Curious to hear a solution. Some searching turned up some mailing list threads, which in turn led me to this demo implementation of rtrees for inet/cidr types. Might be handy reading.Ringent
Look up optimizations for member-in-range searches and convert your cidr table into a first-address/last-address representation with two columns.Pseudohemophilia
Thanks caskey, it looks as though your suggestion, combined with the use of ip4r as suggested by bma, does the trick. I'll formulate a full answer shortly.Melancon
M
2

Case closed. To make things fast, do the following:

Given the above, if you're using type ip4 (assuming you're dealing with v4 addresses) for all compared addresses, then the planner will leverage indexes on those columns.

Thanks for the help, guys!

Melancon answered 12/9, 2013 at 0:20 Comment(1)
Note: current Postgres versions do support indexed access for the built-in inet operations (this wasn't supported when the answer was posted).Supernatural
S
12

This is an old question but prominent in Google results, so posting my 2 cents here:

With Postgres 9.4 and later you can use GIST indexes for inet and cidr: https://www.postgresql.org/docs/current/static/gist-builtin-opclasses.html

E.g. the following query will use the gist index (assuming a table from MaxMind's free dataset):

create index on geolite2_city_ipv4_block using gist (network inet_ops);

select * from geolite2_city_ipv4_block where network >>= '8.8.8.8';
Supernatural answered 17/10, 2016 at 15:21 Comment(0)
C
2

Have you looked at ip4r? http://pgfoundry.org/projects/ip4r. IIRC, it is really fast for INET-related lookups.

Crick answered 11/9, 2013 at 23:38 Comment(1)
I'm aware of it, yes. Can you clarify whether it would still use nested sequential scans in which each individual comparison is faster, or something more fundamental? They do say something about indexability on their site, but I don't know whether this would be indexing that helps in my context.Melancon
M
2

Case closed. To make things fast, do the following:

Given the above, if you're using type ip4 (assuming you're dealing with v4 addresses) for all compared addresses, then the planner will leverage indexes on those columns.

Thanks for the help, guys!

Melancon answered 12/9, 2013 at 0:20 Comment(1)
Note: current Postgres versions do support indexed access for the built-in inet operations (this wasn't supported when the answer was posted).Supernatural
S
0

having already deployed rules that ensure that the CIDRs are non-overlapping, I just used an index btree tablename(network(addr))

given target as an input check like this:

SELECT target <<= addr FROM tablename WHERE network(addr) <= target ORDER BY network(addr) DESC LIMIT 1;

it returns true on success false or nothing on failure.

Sabina answered 15/6, 2023 at 22:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.