I recently found this paper describing a sniffing mechanism for iOS using Apple's `NEPacketTunnelProvider` extension, and I got curious and wanted to understand it from a technical point of view. Since I don't usually work at such a low network layer, I can't follow it in the detail I'd like. Because Charles Proxy for iOS must do something very similar without requiring supervised devices, I assume the approach the author presented in 2016 still works today.
The author claimed that "Everything like IP packet parsing, building an IP packet or parsing a DNS response had to be implemented ourselves." To fully understand that, I tried to build it myself. I built a NetworkExtension and a message loop for the `packetFlow` of the `NEPacketTunnelProvider`. I was able to obtain the IP datagrams and tried to parse them. I used unsigned integers of the corresponding size for the source and target IP, the transport protocol and the IP version, but I'm unsure how to handle the payload. My parser uses `ptr.load(fromByteOffset: <offset>, as: <DataType>.self)`, where `ptr` is an `UnsafeRawPointer`, to access the packet flow information. Since the `data` might exceed the storage of a `UInt64`, I don't know how to access and store the payload properly.
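For illustration, the header parsing described above could look roughly like the sketch below. It is not the exact code from my extension: the names `IPv4Header`, `parseIPv4Header`, `readUInt32BE` and `dottedQuad` are hypothetical, and it assumes an IPv4 datagram with the standard header layout (version and header length in byte 0, protocol in byte 9, source address at offset 12, destination address at offset 16). It deliberately stops at the header, because the payload is the part I'm unsure about.

```swift
import Foundation

/// Hypothetical container for the header fields mentioned above.
struct IPv4Header {
    let version: UInt8
    let headerLength: Int          // in bytes, derived from the IHL field
    let transportProtocol: UInt8   // 1 = ICMP, 6 = TCP, 17 = UDP
    let source: UInt32
    let destination: UInt32
}

/// Reads a big-endian (network byte order) UInt32 byte by byte,
/// which sidesteps any alignment requirement of a typed load.
func readUInt32BE(_ raw: UnsafeRawBufferPointer, at offset: Int) -> UInt32 {
    var value: UInt32 = 0
    for i in 0..<4 {
        value = (value << 8) | UInt32(raw[offset + i])
    }
    return value
}

/// Parses the fixed part of an IPv4 header; returns nil for anything
/// that is too short or not IPv4.
func parseIPv4Header(_ packet: Data) -> IPv4Header? {
    guard packet.count >= 20 else { return nil }
    return packet.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> IPv4Header? in
        let firstByte = raw.load(fromByteOffset: 0, as: UInt8.self)
        let version = firstByte >> 4
        let headerLength = Int(firstByte & 0x0F) * 4   // IHL counts 32-bit words
        guard version == 4, headerLength >= 20, raw.count >= headerLength else {
            return nil
        }
        return IPv4Header(
            version: version,
            headerLength: headerLength,
            transportProtocol: raw.load(fromByteOffset: 9, as: UInt8.self),
            source: readUInt32BE(raw, at: 12),
            destination: readUInt32BE(raw, at: 16)
        )
    }
}

/// Formats an address in dotted-quad notation for logging.
func dottedQuad(_ address: UInt32) -> String {
    return [24, 16, 8, 0]
        .map { String((address >> UInt32($0)) & 0xFF) }
        .joined(separator: ".")
}
```

Each `Data` element handed back by the packet flow would go through `parseIPv4Header`; everything from byte `headerLength` onwards is the payload in question.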
Furthermore, I noticed that the source IP is always `192.168.20.1` (set as my interface's `NEIPv4Settings` address) and the target IP is always `192.168.2.1` (my dummy `NEDNSSettings` server). This leads me to my first questions: Are those DNS queries? Does the datagram contain any further information about the actual target? Would that mean that I have to somehow execute the request against the DNS server myself and reroute the packet to the target which I obtain from that DNS query?
The next step would be to implement TCP/UDP handling, right? My current parsing approach is able to distinguish between `UDP`, `TCP` and `ICMP` (even though I haven't investigated the last one yet). I would therefore iterate over the datagrams, look up whether they require a `UDP` or `TCP` session/connection, and transfer the datagram. The problem I currently see there from a conceptual point of view: How do I know which source/target port to use for the TCP/UDP connections/sessions? As far as I know, this information is not part of the IP packet itself (since it's information we need at the transport layer, not at the network layer).
Additionally, I found a project called Specht on GitHub. It uses a self-written library called NEKit, which also uses the `NEPacketTunnelProvider` approach in some way. If I understand their approach correctly, they managed to build a local proxy server by writing some observer mechanisms to handle the requests. However, since I'm relatively new to networking and Swift, I'm not sure whether I understand that correctly or whether I just haven't found all of the TCP/UDP and/or DNS logic. Is this project comparable to the approach of the paper and of Charles Proxy?
One last question: Charles Proxy is in most cases able to show the hostname of the target. I'm currently only able to see destination IP addresses (which aren't the real destination IP addresses, but the address of my DNS server). How can I see the hostname as human-readable text? Does Charles do an `nslookup` somehow? Does Charles obtain that information from the datagrams?
I know it's quite ambitious of me to build something similar for testing purposes with so much knowledge missing in this area, but I'm still motivated to look deeper into the topic. I also have the feeling that I have already understood some key points, but unfortunately not enough to solve the puzzle... Maybe you're able to give me some more hints to get a better understanding. If there is an easier way to achieve similar behavior (seeing outgoing connections at the hostname level), I'd be interested in that as well :-)