How to sort a list of lists with IP subnets python

I am a complete noob and have put together my first Python script with the help of Google.

I am opening two files and removing the entries in list 1 from list 2.

Once list 2 has been modified to remove what was in list 1, I want to sort it numerically by IP network. For example:

1.1.1.1/24
1.1.1.1/32
5.5.5.5/20
10.10.11.12/26
10.11.10.4/32

Currently it sorts them like this:

1.1.1.1/24
1.1.1.1/32
10.10.11.12/26
10.11.10.4/32
5.5.5.5/20

code:

import os
import sys
import random
import re

text_file = open("D:/file/update2.txt", "rt")
lines = str(text_file.readlines())
text_file.close()
ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3}/\d{1,2})'
foundip = re.findall( ip_address, lines )


text_file2 = open("D:/file/Block.txt", "rt")
lines2 = str(text_file2.readlines())
text_file2.close()
foundip2 = re.findall( ip_address, lines2 )


test =(list(set(foundip2) - set(foundip)))

items = sorted(test)
print (*items, sep = "\n")

Thanks in advance.

Moue answered 18/3, 2019 at 19:32 Comment(1)
IP addresses are not your text representation of them. IPv4 addresses are 32-bit unsigned integers, and they should be stored and treated that way, then you would not have that problem. - Pakistan
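
The comment's point can be sketched as follows: pack each dotted-quad address into a single 32-bit integer and sort on that number, using the prefix length as a tie-breaker. The helper names `ipv4_to_int` and `cidr_sort_key` are made up for this illustration, and the sketch assumes every entry is a well-formed IPv4 `address/prefix` string.

```python
import ipaddress


def ipv4_to_int(address: str) -> int:
    """Pack a dotted-quad string into one 32-bit unsigned integer."""
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def cidr_sort_key(cidr: str) -> tuple:
    """Sort by numeric address first, then by prefix length."""
    address, _, prefix = cidr.partition("/")
    return ipv4_to_int(address), int(prefix)


subnets = ["1.1.1.1/24", "1.1.1.1/32", "10.10.11.12/26", "10.11.10.4/32", "5.5.5.5/20"]
by_integer = sorted(subnets, key=cidr_sort_key)
print(*by_integer, sep="\n")

# The standard library performs the same conversion.
assert ipv4_to_int("10.10.11.12") == int(ipaddress.IPv4Address("10.10.11.12"))
```

Because the key is numeric, 5.5.5.5 sorts before 10.10.11.12 regardless of how many digits each octet has.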

The default sort compares the strings character by character, so "10..." sorts before "5...". You need to generate a list of integers from each IP address and use it as the sort key. I use re.findall with a "digits" expression and convert each match to int (there are other solutions, with split for instance).

import re

ip_list = """1.1.1.1/24
1.1.1.1/32
10.10.11.12/26
10.11.10.4/32
5.5.5.5/20""".splitlines()

print(sorted(ip_list, key=lambda x: [int(m) for m in re.findall(r"\d+", x)]))

prints:

['1.1.1.1/24', '1.1.1.1/32', '5.5.5.5/20', '10.10.11.12/26', '10.11.10.4/32']
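
The split-based alternative mentioned in this answer might look like the sketch below. The function name `subnet_key` is hypothetical, and it assumes every entry contains a `/prefix` part; a bare address would raise `ValueError` at `int(prefix)`.

```python
def subnet_key(cidr):
    """Return [octet1, octet2, octet3, octet4, prefix] as integers."""
    address, _, prefix = cidr.partition("/")
    return [int(octet) for octet in address.split(".")] + [int(prefix)]


ip_list = ["1.1.1.1/24", "1.1.1.1/32", "10.10.11.12/26", "10.11.10.4/32", "5.5.5.5/20"]
split_sorted = sorted(ip_list, key=subnet_key)
print(split_sorted)
```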
Medrano answered 18/3, 2019 at 19:40 Comment(0)

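You can avoid complex regexes and hand-written key functions by using the ipaddress module from the Python standard library. It converts your strings into IPv4Network objects, which are sortable. Note that ip_network raises ValueError when the address has host bits set (as 1.1.1.1/24 in the question does) unless you pass strict=False.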

from ipaddress import ip_network

ip_list = ["3.4.5.0/24", "1.2.3.0/24", "10.10.10.0/24", "5.6.7.0/24"]
print(sorted(ip_list, key=lambda x: ip_network(x)))

prints:

['1.2.3.0/24', '3.4.5.0/24', '5.6.7.0/24', '10.10.10.0/24']
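
The addresses in the question are host addresses with a prefix rather than clean network addresses, so a sketch of the same sort for that data needs `strict=False`. The name `network_key` is made up for this example. One caveat of this approach: two entries that differ only in their host bits get the same key and simply keep their original relative order.

```python
from ipaddress import ip_network

ip_list = ["1.1.1.1/24", "1.1.1.1/32", "10.10.11.12/26", "10.11.10.4/32", "5.5.5.5/20"]

# With the default strict=True, host bits in the address are rejected.
try:
    ip_network("1.1.1.1/24")
    strict_ok = True
except ValueError:
    strict_ok = False


def network_key(cidr):
    """Mask off the host bits and return a sortable network object."""
    return ip_network(cidr, strict=False)


sorted_list = sorted(ip_list, key=network_key)
print(sorted_list)
```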

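You can also use the ipaddress module to detect whether a string is a valid IP network and replace your regex. Unlike the regex in the question, it rejects out-of-range values such as 999.1.1.1/24. Regex matching can also be slow in Python, so if speed matters it is worth measuring both approaches.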

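import ipaddress

string_list = ["1.2.3.0/24", "not an ip", "10.10.10.0/24"]  # example input
ip_list = []

for ip in string_list:
    try:
        # ip_network raises ValueError for anything that is not a valid network
        ip_list.append(ipaddress.ip_network(ip))
    except ValueError:
        pass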
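
Putting the pieces together for the workflow in the question (parse two inputs, subtract one from the other, sort the rest) could look roughly like this. It is only a sketch: `parse_interfaces` is a hypothetical helper, the two lists stand in for the contents of the files, entries are assumed to be IPv4 and separated by whitespace, and `ip_interface` is used instead of `ip_network` so that host bits are kept as written.

```python
import ipaddress


def parse_interfaces(lines):
    """Collect every whitespace-separated token that parses as address/prefix."""
    found = set()
    for line in lines:
        for token in line.split():
            if "/" not in token:
                continue  # skip bare addresses and other text
            try:
                found.add(ipaddress.ip_interface(token))
            except ValueError:
                pass  # not a valid address/prefix pair
    return found


# Stand-ins for the lines read from the two files.
update_lines = ["10.10.11.12/26"]
block_lines = ["1.1.1.1/24", "10.10.11.12/26 junk", "5.5.5.5/20", "999.1.1.1/24"]

remaining = parse_interfaces(block_lines) - parse_interfaces(update_lines)
result = [str(item) for item in sorted(remaining)]
print(*result, sep="\n")
```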

Rosaline answered 27/4, 2022 at 10:32 Comment(0)

Your problem stems from the fact that alphabetic sorting does not do what you want (it finds 10.x to be 'smaller' than 5.x). You therefore need to pass a key function that transforms each IP address string into numbers, so that the ordering follows your intuition of what should come first.
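
A two-line illustration of the difference, using values from the question (the variable names are only for this example):

```python
# Strings are compared character by character: "1" < "5", so the 10.x entry wins.
as_text = "10.10.11.12/26" < "5.5.5.5/20"

# Tuples of integers are compared element by element: 10 < 5 is false.
as_numbers = (10, 10, 11, 12, 26) < (5, 5, 5, 5, 20)

print(as_text)     # True
print(as_numbers)  # False
```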

Solution: I first create a type for an IP subnet, parsing the string into its 4 octet groups and the prefix length (the number after the slash), and compare based on these tuples (see Python tuple sorting).

from collections import namedtuple

# One field per octet plus the prefix length; namedtuples compare field by field.
ip_type = namedtuple("IP", 'g1 g2 g3 g4 prefix')

def to_ip(string: str) -> ip_type:
    groups, prefix = string.split('/')
    g1, g2, g3, g4 = [int(g) for g in groups.split('.')]
    return ip_type(g1, g2, g3, g4, int(prefix))


array = [   '1.1.1.1/24',
            '1.1.1.1/32',
            '10.10.11.12/26',
            '10.11.10.4/32',
            '5.5.5.5/20'    ]

print(sorted(array, key=to_ip))
Comanche answered 18/3, 2019 at 19:52 Comment(0)
