Private Internet

Many widespread internet protocols were written at a time when internet security wasn’t much of a consideration. From the lack of From-address verification in email to NTP reflection, there are lots of protocols that are now considered badly designed. When they were authored there was a lot of implicit trust (for example, because only a few universities were connected to the network and you could just kick off any bad actors), but now the internet is full of bad actors and these designs are harming us.

I think IP is generally considered well-designed: IPv4 refuses to die and IPv6 is working great. But these protocols have some fundamental flaws, or at the very least need more layers on top to work well on today’s internet.

The most obvious sign of this is that many non-technical people know what an IP address is. I don’t think the average user should need to care how their networking works, but due to issues with the protocol many people are aware that their IP address is sensitive information that they need to protect.

Two of the biggest flaws of IP are privacy concerns and DoS susceptibility. These were likely not even considered when the protocol was designed. But much like From-address forgery was not a concern when designing SMTP, I think these issues should be considered deficiencies that need addressing.

Metadata absolutely tells you everything about somebody’s life. If you have enough metadata you don’t really need content.

Stewart Baker, NSA General Counsel

Valuable Properties

Here I’ve brainstormed a few properties that I would like to see in a private and DoS-resistant network protocol.

Non-sensitive Client Addresses

If I make two connections, the remote end should not be able to tell if I am the same person. This property should hold whether I connect to the same host twice or to two hosts that coordinate.
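
As a minimal sketch of what this could mean at the endpoint, suppose addresses are drawn from a large flat random space. The 128-bit size and every name below are assumptions for illustration, not part of any real protocol:

```python
import secrets

ADDRESS_BITS = 128  # assumption: an IPv6-sized, flat random address space

def fresh_client_address() -> bytes:
    """Draw an unlinkable, single-use source address for one connection.

    Each address is sampled independently, so two connections share no
    identifier, whether they go to one host or two coordinating hosts.
    """
    return secrets.token_bytes(ADDRESS_BITS // 8)

# Two connections, two unrelated source addresses.
conn_a = fresh_client_address()
conn_b = fresh_client_address()
assert conn_a != conn_b  # a collision is astronomically unlikely
```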

DoS Resistant

All methods of connecting to me must be revocable. That is, at some point they stop functioning and you need new information to connect. Ideally this could be done at any time, immediately invalidating an address, but it could be time-based (addresses expire after a period of time).

This revocation must not just cause the endpoints to ignore traffic. The traffic should be dropped as close to the source as possible, for example at the first “well-behaved” network. On the internet this would be at least a tier-1 ISP, if not the attacker’s own ISP.
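
One way to get the time-based variant is to embed an expiry in the address itself and authenticate it with a key shared with whichever network does the filtering. This is a hypothetical scheme, not a real protocol; the shared key, sizes, and names are all assumptions:

```python
import hmac
import hashlib
import struct
import time

# Assumption: the host pre-shares this key with the filtering network.
KEY = b"shared-with-filtering-network"

def make_address(lifetime_s: int = 3600) -> bytes:
    """Mint an address that self-expires after lifetime_s seconds."""
    expiry = int(time.time()) + lifetime_s
    header = struct.pack(">Q", expiry)
    tag = hmac.new(KEY, header, hashlib.sha256).digest()[:8]
    return header + tag  # 16 bytes: expiry timestamp + truncated MAC

def still_valid(address: bytes) -> bool:
    """What a filtering router checks before forwarding a packet.

    Expired or forged addresses are dropped here, far from the target,
    rather than at the endpoint.
    """
    header, tag = address[:8], address[8:]
    (expiry,) = struct.unpack(">Q", header)
    expected = hmac.new(KEY, header, hashlib.sha256).digest()[:8]
    return hmac.compare_digest(tag, expected) and time.time() < expiry
```

Immediate revocation doesn’t fall out of this scheme; it would need something extra, like a revocation list pushed to the filtering network.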

No Leak of Outgoing Connection Server Address

If I open a connection, a local observer must not be able to identify which server I am connecting to. This property should hold for as many hops as possible. (At some point the traffic has to reach the server, and someone who can observe every hop on the path will be able to correlate it.)

Independent Services

A server should be able to host any number of public services and no client should be able to tell if these services are co-located.

No Leak of Incoming Connection Server Address

If I receive a connection, a local observer must not be able to identify which address was used to create it. (If the address is publicly known they can likely probe this by making connections and comparing traffic, but I’ll consider that out of scope for this property.)

Stretch Goals

Encryption

I didn’t require encryption because, while it would be nice (my default stance is to encrypt everything; more encryption never hurts), it may make the protocol significantly more expensive. Furthermore, fundamental protocols like IP tend to evolve very slowly, so once the encryption included in the protocol is no longer state-of-the-art it will be very hard to update. For these reasons encryption is likely best managed by a higher layer.

That being said, it would still be a great stretch goal, especially if the traffic were encrypted differently on each hop. This would prevent attacks where traffic is sniffed at multiple locations on the network to identify the two ends of a connection. That is a feature that would be hard to implement on the endpoints alone.

Possible Implementations

I haven’t actually designed a protocol. These requirements are more of a discussion point. I’m sure other properties will be desirable as well. But I did think of ways to accomplish some of them.

Onion Routing

My first thought would be some form of onion routing. The client builds a route and encrypts the packet for each layer in turn. Each router then decrypts a layer and forwards it. This has worked well for Tor but seems to have some fundamental downsides.
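
A toy version of the layering, with made-up router names and keys generated on the spot (a real client would need to learn both ahead of time), illustrates the mechanics and foreshadows the concerns below:

```python
from cryptography.fernet import Fernet

# Hypothetical three-hop route. A real client must somehow learn each
# router's key and address up front (concern 1 below).
keys = {name: Fernet.generate_key() for name in ("router-a", "router-b", "exit")}
path = ["router-a", "router-b", "exit"]

def wrap(payload: bytes) -> bytes:
    """Encrypt once per hop, innermost layer first.

    Every layer names the next hop, so routing metadata plus per-layer
    ciphertext overhead eat into the MTU (concern 2), and the repeated
    encryption costs CPU (concern 3).
    """
    packet = payload
    hops = path + ["deliver"]  # what each router should do next
    for name, next_hop in zip(reversed(path), reversed(hops[1:])):
        packet = Fernet(keys[name]).encrypt(next_hop.encode() + b"|" + packet)
    return packet

def peel(name: str, packet: bytes) -> tuple[str, bytes]:
    """One router's job: strip its own layer and learn where to forward."""
    next_hop, _, inner = Fernet(keys[name]).decrypt(packet).partition(b"|")
    return next_hop.decode(), inner

packet = wrap(b"hello")
hop = path[0]
while hop != "deliver":
    hop, packet = peel(hop, packet)  # each router sees only one layer
assert packet == b"hello"
```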

Some concerns with this approach:

  1. The client needs to know about various routers along the path. On the public internet this is a lot of data that is frequently changing.
  2. Every layer needs metadata on where to forward the packet next, eating into the MTU.
  3. Encrypting data many times is expensive.
  4. How is the connection initially routed while keeping the server private?

The first 3 could be mitigated by reducing the number of hops.

  1. Individual ISPs could act as one large hop, rather than multiple routers. This could reduce the number of hops for each connection and the total number of logical routers that clients need to know about.
  2. Instead of having a layer for every hop, the packet can be wrapped only as far as the first large network (for example, a major ISP). At that point the network can absorb any attacks and your traffic is well mixed with other sources.

I don’t know how to solve 4. Tor Onion Services solve this by having the server maintain a few fixed routes leading to it. The client then adds one of these (encrypted) routes to the end of its own route. However, this results in a non-optimal path, which may not be a good tradeoff for general internet use. Maybe the client could evaluate them all and use whichever one is fastest?

Fine-grained Routing

Another option may be implementing something like BGP but on a much finer-grained scale. Servers generate random values as addresses and announce them. These are then broadcast across routers the way BGP routes are today. When a client connects it generates a random “return” address and sends its traffic. Routers along the route also remember the return address to establish the reverse path of the connection.
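
A toy model of one such router, assuming flat 128-bit random addresses and a BGP-like flood of announcements (all names and sizes here are hypothetical):

```python
import secrets

class Router:
    """Toy router for the scheme above.

    Announcements flood through the network like BGP routes, except the
    table is keyed on flat random addresses rather than prefixes, which
    is exactly the scaling problem raised in concern 2 below.
    """

    def __init__(self):
        self.routes = {}   # server address -> neighbor to forward toward
        self.returns = {}  # client return address -> neighbor it came from

    def learn(self, server_addr: bytes, via: str):
        """Record an announcement that arrived from a neighbor."""
        self.routes[server_addr] = via

    def forward(self, src_return: bytes, dst: bytes, came_from: str) -> str:
        # Remember where the return address entered so replies can
        # retrace the path without any global route existing for it.
        self.returns[src_return] = came_from
        return self.routes[dst]

    def forward_reply(self, dst_return: bytes) -> str:
        return self.returns[dst_return]

# A server announces a fresh random address; a client picks a random
# return address for this one connection.
server_addr = secrets.token_bytes(16)
client_return = secrets.token_bytes(16)

r = Router()
r.learn(server_addr, via="peer-east")
assert r.forward(client_return, server_addr, came_from="peer-west") == "peer-east"
assert r.forward_reply(client_return) == "peer-west"
```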

There are also concerns with this approach.

  1. How can we avoid leaking the server address?
  2. Internet routing tables are already huge. To help combat this, there are minimum network sizes that are publicly routable. Having many unique routes to each host would explode routing table size.

We can avoid leaking the client address by having each router along the path transform it (for example, generate a new random address and remember the mapping). However, this trick doesn’t work for the server address, because every client needs to be able to connect to the same server address.
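
That per-hop transformation could look something like this, in the style of the toy router above (again, entirely hypothetical):

```python
import secrets

class RewritingRouter:
    """Replace the client's return address at every hop, NAT-style.

    Each hop only ever sees the random address chosen by the previous
    hop, so no observer past the first hop learns the client's own
    return address.
    """

    def __init__(self):
        self.mapping = {}  # our fresh address -> (previous hop, their address)

    def rewrite(self, src_return: bytes, came_from: str) -> bytes:
        fresh = secrets.token_bytes(16)
        self.mapping[fresh] = (came_from, src_return)
        return fresh  # the forwarded packet carries this instead

    def unrewrite(self, our_return: bytes) -> tuple[str, bytes]:
        """On the reply path, restore the previous hop's address."""
        return self.mapping[our_return]
```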

I have no idea how to solve 2. If addresses are random values it is very hard to route to an arbitrary address from anywhere else on the network.