UDP isn't unreliable, it's a convertible (2024)

38 points by mlhpdx 6 days ago

tptacek 12 hours ago

This is something that the classic TCP/IP books (both Comer and Stevens) got wrong, inexplicably, because David Reed was very clear about it. UDP isn't a transport protocol at all. It's an extension interface for building new transport protocols. It is the thing you would use (and weirdly didn't) to ship SCTP, and did use to build QUIC. It does not make sense to compare UDP to TCP; it's like saying TCP is worse at URL handling than HTTP.

Talking about what UDP provides is a little bit like talking about what IP provides. It's almost a category error. Is "UDP" "reliable"? It's as reliable as whatever transport you build on it is!

jeroenhd 11 hours ago

I disagree. Raw UDP is about as useful as raw TCP if your application matches the protocol's features and weaknesses. Plenty of protocols work by spamming packets around the network where it doesn't matter if half of them go missing or arrive out of order.
TCP is also barely enough to be a real "protocol". It does a bunch of useful stuff but in the end raw text is pretty useless as a protocol. That's why everything from IRC to HTTP has its own protocol on top.
SCTP is a bit of a weird outlier to be placed at the level of TCP and UDP, but there's no need to go all QUIC and reinvent TCP if you want to use UDP. DNS, NTP, and plenty of other protocols don't need to rebuild reliability and streams to be useful.
- tptacek 11 hours ago
  
  I mean, OK, but the inventor of UDP disagrees. I agree you don't have to build a real transport at all if you don't want to. But the moment you add so much as a sequence number that lets you detect that you've missed some packets, you've built a transport, and you're not just using UDP.
  - zamadatix 11 hours ago
    
    The inventor of UDP could have intended one thing 45 years ago, but that does not actually mean that's the only way it happened to be. UDP is great for building a transport protocol on top of. Most uses of UDP do treat it as such. That is not evidence UDP must not be able to qualify or be used as a transport protocol itself.
    
    tptacek 11 hours ago
    
    I mean, sure, but then by that logic so is IP. UDP is just IP with an extra checksum and a 32-bit multiplexing field instead of an 8-bit one.
    
    zamadatix 11 hours ago
    
    All transport protocols are "just" adding things on top of a network protocol so you can multiplex to applications instead of hosts. That's what makes them transport protocols instead of alternate network protocols. TCP is "just" adding a few more things. I kinda wish UDP didn't have the checksum field, it would have been a bit nicer to build other transports with if one could decide to validate their headers with one's desired checksum approach (or not bother, if so desired) rather than have 2 separate checksums for the transport layer role. You could at least blank it for wasted space riding on v4, then v6 mandated it for some reason.
    UDP also isn't the only transport protocol used extensibly either, it's just far more common because it has fewer assumptions. E.g. TCP has MPTCP as a bolt-on transport extension.
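To make the "IP plus ports plus a checksum" point concrete, here is a minimal sketch of the 8-byte UDP header from RFC 768. The helper names `build_udp_segment` and `parse_udp_header` are made up for illustration, and the checksum is simply left at zero (which over IPv4 means "not computed") rather than calculated.

```python
import struct

# RFC 768 header: four 16-bit big-endian fields, 8 bytes in total.
UDP_HEADER = struct.Struct("!HHHH")  # source port, destination port, length, checksum


def build_udp_segment(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build a UDP segment. A checksum of 0 means "not computed" over IPv4."""
    length = UDP_HEADER.size + len(payload)  # the length field covers header + payload
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_header(segment: bytes) -> dict:
    """Split a UDP segment into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": segment[UDP_HEADER.size:length],
    }
```

Everything else a transport might offer (sequencing, acknowledgements, retransmission, congestion control) has no field here and has to live in the payload.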
    
    tptacek 11 hours ago
    
    OK. But if you put a 32-bit header with a sequence number and an "ACK requested" flag on top of UDP, and arrange to send NACKs when you see skips in the sequence numbers, you're not using UDP; you're using a transport you built on top of UDP.
    This isn't just a semantic argument. People getting this conceptually wrong has broken the deployment story for things like SCTP, which for no good reason rides on top of IP directly and thus gets blocked by middleboxes.
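A rough sketch of the kind of tiny transport described above, assuming a hypothetical layout where the top bit of a 32-bit word is the "ACK requested" flag and the remaining 31 bits are the sequence number. The names `encode`, `decode`, and `Receiver` are invented here, and sequence wraparound is deliberately ignored.

```python
import struct

HEADER = struct.Struct("!I")   # one 32-bit big-endian word
ACK_REQUESTED = 0x8000_0000    # top bit: sender wants an acknowledgement
SEQ_MASK = 0x7FFF_FFFF         # low 31 bits: sequence number


def encode(seq: int, payload: bytes, ack_requested: bool = False) -> bytes:
    """Prefix the payload with the 32-bit header."""
    word = (seq & SEQ_MASK) | (ACK_REQUESTED if ack_requested else 0)
    return HEADER.pack(word) + payload


def decode(datagram: bytes):
    """Return (sequence number, ack_requested flag, payload)."""
    (word,) = HEADER.unpack_from(datagram)
    return word & SEQ_MASK, bool(word & ACK_REQUESTED), datagram[HEADER.size:]


class Receiver:
    """Tracks the next expected sequence number and reports gaps."""

    def __init__(self):
        self.next_expected = 0

    def on_datagram(self, datagram: bytes) -> list:
        """Return the sequence numbers that were skipped and should be NACKed."""
        seq, _ack_requested, _payload = decode(datagram)
        missing = list(range(self.next_expected, seq))  # empty if in order or late
        if seq >= self.next_expected:
            self.next_expected = seq + 1
        return missing
```

The datagrams would still travel over an ordinary UDP socket; the header and the gap tracking are the part that UDP itself does not provide.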
    
    zamadatix 9 hours ago
    
    The disagreement is over the claim that the ONLY way UDP can be used is to build a more complex transport layer, not over whether building a more complex transport layer on top of UDP means it's still just UDP alone. Any transport protocol can be built upon as such, even including transport protocols already built atop UDP. That UDP can be layered on top of does not redefine UDP's ability to be a complete transport layer protocol in itself. UDP cannot be used as a network layer protocol, so it is not one, but UDP provides multiplexing (source+dest based), datagram sizing independent of the network layer, and even header checksumming on top of a network layer protocol - making it a (relatively) minimal transport protocol. Whether something is a transport layer protocol is not determined by whether a more complex transport layer can be built on top of it.
    A classic practical example of "plain Jane UDP transport" might be a traditional SNMP trap (not one of the newer fancy flavors). No ACK, no bidirectional session setup, no message ID/sequence number, no retransmission, no congestion control - just a one shot blast of info to the given destination - split into datagrams and multiplexed to the receiving service via the transport layer over an arbitrary network layer.
    A lot of things broke SCTP... but I'd rather not get into a side debate about what those reasons are. Just that it's not because UDP alone is unusable as a transport layer protocol.
    
    tptacek 8 hours ago
    
    I honestly think a lot of stuff like SNMP's UDP semantics are based on faulty path dependence from people believing they had basically two options, either TCP's rigid sequential delivery and HOLB, or yolo-mode UDP.
    
    zamadatix 6 hours ago
    
    On the contrary, I think it was clearly a very intentional and deliberate decision to not add complexity (it is, after all, "Simple" Network Management Protocol) to the transport layer of SNMP. At the very least, the people involved certainly had alternative transports in consideration rather than a mistaken view of TCP vs UDP alone. Additional transport requirements were just not relevant to the use cases of SNMP. Select excerpts from RFC 1157:
    > Consistent with the goal of minimizing complexity of the management agent, the exchange of SNMP messages requires only an unreliable datagram service, and every message is entirely and independently represented by a single transport datagram. While this document specifies the exchange of messages via the UDP protocol [11], the mechanisms of the SNMP are generally suitable for use with a wide variety of transport services.
    From this, the authors intentionally kept things within a single datagram across any unreliable datagram service - UDP was just an obvious choice to define for the needs.
    > In the text that follows, the term transport address is used. In the case of the UDP, a transport address consists of an IP address along with a UDP port. Other transport services may be used to support the SNMP. In these cases, the definition of a transport address should be made accordingly.
    They continued to account for and allow for how other generic transport protocols could be used (at the time, not as many), rather than assume they only had the options of TCP or UDP.
    > In cases where an unreliable datagram service is being used, the RequestID also provides a simple means of identifying messages duplicated by the network.
    This shows other portions of SNMP did account for which specific features may need to be built on top of a minimal transport protocol and only added those to the specific PDUs which needed it. E.g. for the request-based PDUs used by Get/GetNext/GetBulk etc. they intentionally added an ID to handle message duplication, just not to every PDU, like traps, because it was unnecessary.
    > A limited number of unsolicited messages (traps) guide the timing and focus of the polling. Limiting the number of unsolicited messages is consistent with the goal of simplicity and minimizing the amount of traffic generated by the network management function.
    This shows the design of traps was heavily focused around simplicity and minimization of traffic, not based on what TCP or UDP could specifically offer. In fact, you won't find a mention of "checksum" or "hash" anywhere either in the RFC - UDP just had it as extra cruft on top of that generic "unreliable datagram service" they were designing against!
    SNMPv3 did eventually add TCP as an option for traps a couple decades later, and hardly anyone has opted to use it since, as there really isn't much benefit from other transports for the use case. More have used the TLS option, but even more have just relied on the minimal, purpose-defined encryption and HMACs added instead.
    Thanks for this discussion by the way, there is nothing I love more than working with or talking about network protocol design and history :).
ekr____ 12 hours ago

I agree with you about the category error.
In all fairness, though, there are quite a few application protocols which are built directly on top of UDP with no explicit intermediate transport layer. DNS, RTP, and even sometimes SIP come immediately to mind.
- tptacek 11 hours ago
  
  I'd argue that DNS has an implicit transport protocol by dint of the QIDs. It's a very simple transport! But that's the point, right.
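As a sketch of that "very simple transport": the 16-bit ID in the 12-byte DNS header is what lets a client pair a response datagram with the query that caused it. The `PendingQueries` class below is hypothetical, and it only looks at the ID and the QR bit; real resolvers typically also check the source address, port, and question section.

```python
import secrets
import struct

# DNS header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (six 16-bit fields).
DNS_HEADER = struct.Struct("!HHHHHH")
QR_RESPONSE = 0x8000  # top flag bit is set on responses
RD_QUERY = 0x0100     # standard query with "recursion desired"


class PendingQueries:
    """Matches incoming responses to outstanding queries by query ID."""

    def __init__(self):
        self._pending = {}  # query ID -> name that was asked about

    def new_query_header(self, name: str) -> bytes:
        """Pick an unused random ID, remember it, and return the query header."""
        qid = secrets.randbelow(1 << 16)
        while qid in self._pending:
            qid = secrets.randbelow(1 << 16)
        self._pending[qid] = name
        return DNS_HEADER.pack(qid, RD_QUERY, 1, 0, 0, 0)

    def match_response(self, message: bytes):
        """Return the name for a matching response, or None if unknown or not a response."""
        qid, flags, *_counts = DNS_HEADER.unpack_from(message)
        if not flags & QR_RESPONSE:
            return None
        return self._pending.pop(qid, None)
```

A duplicated or unsolicited response finds no pending entry and is dropped, which is about as much transport state as this scheme keeps.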
  - ekr____ 11 hours ago
    
    I think we agree here: protocols built on top of UDP often build their own ad hoc reliability layer, as opposed to having an explicit one like TCP, QUIC, or SCTP.
- juliusdavies 9 hours ago
  
  SIP still has a reliability layer built on top: the humans will say “I didn’t catch that, can you repeat yourself?” on the video call.
atoav 11 hours ago

There are legit use cases for "fire-and-forget" protocols, e.g. anything that is more about "hey, here is a new value, just like the past 100,000 I sent within the last second" than about "here is a critical command where it would really suck if we didn't know that it never arrived at its target system".
Not caring about whether every single transmission arrives is a totally legitimate technological choice and it can have some tangible benefits when it comes to throughput, latency etc.
As with every piece of technology UDP gets problematic when you are ignorant towards its limitations and misapply it.

SoftTalker 12 hours ago

I don't like the convertible analogy. It doesn't have anything to do with reliability.

I'd compare it to the mail. UDP is regular first-class mail. You send a letter, it probably gets to the address on the envelope. But it might get delayed, or even simply lost, and you'll never know.

TCP is like certified mail. It is sent by the same physical systems, but it is tracked and you get an acknowledgement that the recipient received it.

mlhpdx 9 hours ago

Author here, and I don't really like it that much either. The problem is finding some new way to pose it so the only content out there isn't misleadingly trite.
EvanAnderson 12 hours ago

I always use the mail analogy when talking about unreliable packet delivery with non-technical people. People seem to grok it very quickly.
(It makes spoofing and amplification attacks easy to communicate, too.)

eptcyka 13 hours ago

Reliability is only achieved when you’ve spilled blood at least tens of times figuring out the next reliability hurdle. TCP does MTU detection, bandwidth detection and all this at relative real time, not once per connection.

MoltenMan 13 hours ago

Am I going insane or is this an LLM post with all the em dashes replaced by double en dashes??

shiomiru 12 hours ago

It's deep in the uncanny valley. Even ignoring the writing style, the entire post is about the incredible insight of "you can build TCP on top of UDP" using a forced analogy. No further analysis, no real world example, just UDP and the convertible and the convertible and UDP...
abound 12 hours ago

I felt the same way, which is terrible news for me because I explicitly use double dashes in my (100% human) writing instead of em/en dashes -- explicitly because of the LLM association, but they're evolving
weepinbell 12 hours ago

https://www.pangram.com/history/f119a8ee-7e60-442a-8d0c-3fa0...
I show this more as a discussion point than as a definitive answer. There is more research on this tool than others: https://www.nature.com/articles/d41586-025-02936-6
And I think it's interesting that it flags it as confidently AI generated. I also got a whiff of AI and I'm never sure how to take confirmation bias from AI detectors - though that said, I've gotten a whiff of AI before and had this detector say it's confidently human.
Reading this I kept waiting for the... point? I feel like the whole thing was more like saying "UDP is like a convertible because you can strap a tarp on top when it rains". Like... sure? But that tarp is going to be crappy compared to a real roof. And the idea that that tiny layer is "the best of both worlds" is frankly ridiculous to me.
- mlhpdx 9 hours ago
  
  Author here. I've been called a "robot" in the past so perhaps my writing reflects that, but yes there was some autocomplete involved here.

klysm 12 hours ago

This is such a bizarre read... it's clearly LLM written, at least in outline. It somehow manages to not do the most basic comparison of guarantees that UDP and TCP provide. Ordering is only mentioned in passing, and the important nuances are completely bypassed.

jeroenhd 12 hours ago

It's written by someone trying to sell specialized cloud services targeted towards UDP based protocols.
Talking up UDP like it's something special is part of the business strategy.
And, to be fair, a lot of TCP-based or even HTTP-based application protocols could probably have been UDP without any trouble.
Maxatar 12 hours ago

Even the title uses one of ChatGPT's signature lines "It's not X, it's Y."
- mlhpdx 9 hours ago
  
  Sadly, that's almost an expected way of writing online content now. I'll do better next time!

diath 13 hours ago

> It's so common it's almost an instinct. But that description doesn't quite hold up once you look closer. UDP isn't unreliable in the sense that it fails to do its job -- it's unreliable in the same way a convertible is "un-dry".

Well, no, it's unreliable in the sense that if I ordered a package online, the courier may get lost on the way and never come with my package, and then I have to order it again if I want it delivered, and then hope that this time it actually arrives. In that case, it kind of does fail to do its job if you want things to be delivered.

pklausler 12 hours ago

If you want your packages delivered in order, without duplication, whenever it is not impossible to get them to you -- don't use a fulfillment company that might send you multiple boxes, or get them out of order, or not deliver them at all, and is really no cheaper anyway.

throw0101c 12 hours ago

Sadly the proliferation of middleboxes means that deploying a new Layer 4 is all but impossible:

* https://en.wikipedia.org/wiki/Datagram_Congestion_Control_Pr...

* https://en.wikipedia.org/wiki/Stream_Control_Transmission_Pr...

Curious to know if things can ever move towards a point of allowing this. I know many firewall 'languages' have something like "allow tcp/443" or "allow udp/53" for nomenclature: perhaps something like "allow stream/443" and "allow dgram/53" to include (e.g.) TCP/SCTP and UDP/DCCP would allow more flexibility.

tsimionescu 12 hours ago

I never really understood this complaint. UDP adds such a tiny and useful header on top of IP that it hardly seems meaningful to reinvent the wheel instead of just using it.

cestith 13 hours ago

UDP doesn’t back off sending for congestion. How reliable is that?

UDP over the network does not on its own guarantee packet retries or proper delivery order. How reliable is that?

UDP on Linux to localhost or hair-pinned between two public IPs on the same host can result in reordered packets when the kernel is busy with a lot of traffic or the CPU is context switching enough. It happens when a UDP queue is handled by a certain core and gets swapped to another. I’ve had to set up two machines with a cable between them because the ordering is actually more stable than over localhost. A colleague is working on a kernel patch to bypass part of the UDP stack to restore proper ordering, which will probably need to be maintained in-house or at least hidden behind a kernel config knob upstream since this reordering is considered acceptable under the guarantees for UDP. How reliable is that?

So, yeah, you can say UDP is reliable but compared to TCP or QUIC it actually really isn't.

Kiboneu 13 hours ago

The point of the article is that calling a protocol reliable doesn't make sense. You can (and should) use UDP if you don't care about (or you have a special way of handling) packet loss, and instead need to receive the latest packet as soon as possible. In the context of multiplayer FPS games and remote controlling drones -- it is actually more reliable.
- eptcyka 12 hours ago
  
  The protocol is reliable in the sense that the protocol provides guarantees where either party can know that the other party received something. UDP provides no such guarantees. I'm a fan of using UDP where applicable, I loathe TCP-in-TCP, not at all a fan of head of line blocking, but I don't need to call UDP reliable just because it is a great protocol with its own use cases.
  - Kiboneu 12 hours ago
    
    TCP does not guarantee delivery, and there are cases where getting confirmation of receipt actually makes the /channel/ less reliable.
    Again, the point is not to call UDP reliable, it's that it doesn't make sense for the reasons I stated above. If you choose the wrong tool for a usecase then the communication channel will be unreliable in either case. There's a lot of confusion about this -- it is valid to call a communication channel unreliable for the system to respond accordingly, not the protocol itself.
    
    eptcyka 11 hours ago
    
    TCP does not guarantee delivery, but it guarantees that if you receive an ACK, then the acknowledged bytes were received. UDP provides no such guarantee. Yes, it is OK to call it unreliable, because you can't rely on it to reliably deliver packets - the sender will never know if the packets it sent were ever delivered without any external validation. Game servers and clients work around this by being fault-tolerant. QUIC runs on UDP but the way it achieves reliability is by implementing reliability of TCP in the application layer. You cannot expect to receive all UDP packets you send to yourself on localhost, I think that it is reasonable to call such a network protocol unreliable, in the sense that one cannot reliably know if sent packets have ever been delivered within the confines of the protocol itself.
    
    tsimionescu 11 hours ago
    
    Note that TCP doesn't actually provide that guarantee to the application layer. There's no way to know if some bytes you sent on a TCP socket ever arrived. So, if you want a reliable packet delivery, you have to build that at the application layer yourself anyway. That's why HTTP requires the 204 No Content response code, for example - instead of relying on a TCP ACK to achieve the same.
    The only real guarantee that TCP provides is in-order delivery. That is, it really guarantees that if you sent two requests A and B to a server, the server will never see request B unless it saw request A first. Again looking at HTTP, this is why request pipelining theoretically works without including any request ID in the response: the client knows that if it sent reqA first and reqB second, the first response it gets from the server will be respA, and the second will be respB (if either of them ever arrives, of course).
    Note that the fact ACK is not shown to the application layer is not an implementation limitation, it is a fundamental part of the design of TCP. The ACK is generated by the TCP stack once the packet was received. This doesn't guarantee in any way that the application will ever see this packet: the application could crash, for example, and never receive (and thus act on) this message. The in-order guarantee works instead because the transport layer can simply avoid ever showing reqB to the application layer if it knows that there was a reqA before it that it didn't receive. So the transport can in fact ensure that it's impossible for the application to see out-of-order data, but it can't ensure that the application has actually received any data.
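The "never show reqB before reqA" behaviour can be sketched as a small reorder buffer. This is a simplification that assumes whole messages numbered from 0, whereas TCP actually numbers bytes; the `InOrderBuffer` name is invented for illustration.

```python
class InOrderBuffer:
    """Holds out-of-order messages back until every earlier one has arrived."""

    def __init__(self):
        self.next_seq = 0  # lowest sequence number not yet handed to the application
        self.held = {}     # seq -> data received ahead of a gap

    def receive(self, seq: int, data):
        """Accept one message; return the messages that are now deliverable, in order."""
        if seq < self.next_seq or seq in self.held:
            return []  # duplicate of something already delivered or already held
        self.held[seq] = data
        ready = []
        while self.next_seq in self.held:
            ready.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Note that nothing here tells the sender whether the application ever consumed what was delivered, which is the distinction being drawn above.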
    
    threatofrain 8 hours ago
    
    For the case examples being discussed, like drones, TCP is unreliable in the sense that it may be death penalty for your app. Isn't death penalty a severe kind of unreliable? In that case UDP is much more reliable in not murdering your app.
    
    Kiboneu 11 hours ago
    
    (I don't know if you've seen my edit. I re-read your response to answer more relevantly to your point about ACK. But to be honest I still don't think it's correct.)
    
    eptcyka 11 hours ago
    
    I see, and I still believe that as far as network protocols go, UDP allows for unreliable communication and TCP allows for reliable communication, at least at a naive first glance. I don't see anything wrong with UDP being labeled as unreliable. Is there a particular reason you believe it is reliable? Say if UDP was a reliable network protocol, what would be an unreliable network protocol?
    
    Kiboneu 9 hours ago
    
    > Is there a particular reason you believe it is reliable?
    I don't, nor do I think it is unreliable. Protocols exist in an abstract space, but they are instantiated in a system that works within physical /constraints/ (noise, faults, oxidation, etc.). These constraints vary because different systems work in different environments.
    The systems we make also have to provide utility for the /purposes/ that it's made. Reliability is necessary to trust the utility of a system.
    If I'm playing a multiplayer FPS game and due to backoff I see players snapping back to the position of a lost packet once it is received, I will not think that the game is reliable (and I will be especially sad since this is a known issue with known improvements).
    If I don't care about old packets and only care about the latest packet, I don't work with old information -- but the information is incomplete. Latency will still be noticeable on a noisy connection, but my brain can at least anticipate -- guess where to look next -- once the last packet is received.
    That incomplete information can be used to extrapolate movements. So the final step is to use a model to predict movements in (short) moments when a packet is not received. You can't do this when the information isn't relevant anymore. When that's done, I -- the player -- will feel like that game is more reliable than the first TCP case.
    (Games, and real-time interfaces in general, have some interesting aspects that I think are well covered in Game Feel by Steve Swink).
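One simple form of the extrapolation described above is linear dead reckoning from the last two position updates. This is only a sketch: the `Snapshot` type and `extrapolate` function are hypothetical, and a real game would probably cap how far ahead it is willing to predict.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    t: float  # time the position was sampled, in seconds
    x: float
    y: float


def extrapolate(prev: Snapshot, last: Snapshot, now: float):
    """Predict (x, y) at time `now`, assuming constant velocity since `last`."""
    dt = last.t - prev.t
    if dt <= 0:
        return (last.x, last.y)  # no usable velocity estimate; hold the last position
    vx = (last.x - prev.x) / dt
    vy = (last.y - prev.y) / dt
    ahead = now - last.t
    return (last.x + vx * ahead, last.y + vy * ahead)
```

When the next packet does arrive, the prediction is replaced (or blended) with the real position, so a lost update costs a brief guess instead of a stall.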
- jeroenhd 12 hours ago
  
  The article willfully misinterprets the way "reliable" is used to talk about the protocol, as if people use the word to insult UDP.
  Either that or it's slop that somehow made it to the front page.
toast0 13 hours ago

> A colleague is working on a kernel patch to bypass part of the UDP stack to restore proper ordering, which will probably need to be maintained in-house or at least hidden behind a kernel config knob upstream since this reordering is considered acceptable under the guarantees for UDP.
My understanding is that in Linux, TCP on localhost (including packets sent to a public address on a local interface) bypasses the stack; I don't see why it would be a problem for UDP to do the same.
This is in contrast with FreeBSD, where TCP to localhost is actually packetized and queued on the loopback interface, and you can experience congestion collapse on loopback if you do things right(wrong).
threatofrain 13 hours ago

Some of these things turn out to be death penalty depending on the kind of app you're doing, and it can be quite surprising when it happens too. For example, redelivery with ordering with bad internet can lead to unbounded backup.
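A toy model of that backup, under made-up numbers: a sender produces more updates per tick than a degraded link can deliver. With ordered, reliable delivery every update must wait its turn, so the backlog grows; a "latest value only" slot stays at one item. The `simulate` function is purely illustrative.

```python
from collections import deque


def simulate(ticks: int, produced_per_tick: int, delivered_per_tick: int):
    """Return (ordered backlog size, latest-only queue size) after `ticks` ticks."""
    ordered_backlog = 0            # updates still waiting for in-order delivery
    latest_only = deque(maxlen=1)  # keeps just the newest update, drops older ones
    for tick in range(ticks):
        for i in range(produced_per_tick):
            ordered_backlog += 1
            latest_only.append((tick, i))
        # The link drains only so much per tick; the rest keeps queueing.
        ordered_backlog = max(0, ordered_backlog - delivered_per_tick)
    return ordered_backlog, len(latest_only)
```

For example, `simulate(100, 10, 6)` leaves 400 updates queued in the ordered case and a single one in the latest-only case.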

chasing0entropy 10 hours ago

I'm not a fan of the 'convertible' metaphor, UDP would be better equated to driving at 100mph on a 55mph TCP road. You might get there safely with no harm at twice the TCP speed... or you might lose some parts or you may never arrive.

That said the old battle.net online and Quake/Unreal multiplayer system was a widespread implementation of UDP transport with reliability checks. Whenever your UDP stream desynced the game clock accelerated replaying the buffered moves back to back until you were back in sync.

mlhpdx 9 hours ago

This video walkthrough of the Quake netcode shows some of the brilliance that can be applied over UDP, too.
https://www.youtube.com/watch?v=b8J7fidxC8s

blackcatsec 11 hours ago

As with any protocol and application, there's a big, massive "it depends" on pretty much everything. There are pros, cons, and tradeoffs: sometimes it's easier to have the underlying platform make those tradeoffs, and sometimes it's better to surface those problems to the application layer so developers can decide how they want to handle them.

Folks shouldn't necessarily argue "one is better" more so than they should consider all engineering aspects of why you should use one technology over another.

elevation 12 hours ago

UDP is perfectly reliable when you control both sides of the link. With adjustments to ring buffers, IRQ/core affinity, and receive buffer size, you can receive a saturated 40GbE stream with no packet loss.

It's only when you pass through a router that's specifically instructed "drop these first" that you run into drops.

jeroenhd 12 hours ago

UDP has a checksum for a reason, bitflips happen and packets get mangled or lost, even if you control both sides of the link.
Your chances are better but you can't assume packet loss or data errors can't happen just because the fiber is right in front of you.
- jandrese 11 hours ago
  
  The UDP checksum is optional and weak. It's mostly a waste of time.
  It's far more likely if you get a bit flip that the layer 1 hardware will just drop the packet because it has a much better checksum that detected the error first. Even if you know every part of the hardware and software stack you have to plan for an occasional packet drop.
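To illustrate what "weak" can mean here: the checksum is a 16-bit one's-complement sum, so it cannot notice 16-bit words arriving in a different order. The sketch below implements just the summing algorithm (in the style of RFC 1071) over raw bytes; the real UDP checksum also covers a pseudo-header, which is left out, and the `internet_checksum` name is my own.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd"
swapped = b"\xab\xcd\x12\x34"  # same 16-bit words, different order

# Different payloads, identical checksum: addition does not care about order.
assert original != swapped
assert internet_checksum(original) == internet_checksum(swapped)
```

A CRC of the kind used at the link layer would catch this swap, which is consistent with the point that link-level checks usually reject a damaged frame before the UDP checksum is consulted.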
ekr____ 12 hours ago

I would argue with "only". Full queues are only one reason you get packet drops. For example, packets can be damaged in transmission. This isn't common but that's not the same as saying it doesn't happen.
klysm 12 hours ago

Okay, but the important part of a protocol is what happens when things don't go perfectly to plan.

BinaryIgor 12 hours ago

Interesting perspective, agreed 100%; having UDP available in the browser would open up a plethora of interesting new use cases - games, video/audio calls without WebRTC overhead - and techniques to use. Ohh wait - we will soon have it! https://developer.mozilla.org/en-US/docs/Web/API/WebTranspor...

ekr____ 12 hours ago

Well, WebTransport is built on top of QUIC, and so the overhead is actually reasonably comparable to that for WebRTC.
- mlhpdx 9 hours ago
  
  But QUIC is another case of "one size fits all" reliability and ordering (etc.) that doesn't allow a great deal of flexibility.

igravious 12 hours ago

Semantically something is unreliable if it's meant to be reliable I'd say.

UDP is guaranteed to not be reliable.

wat10000 12 hours ago

This is a rather long-winded way of saying "unreliable means it doesn't retransmit for you, it doesn't mean it's shit." Which, duh? I don't know, maybe there are people out there who need to hear this, but I'd think that anyone who knows what UDP is at all will also understand the sense in which it's "unreliable."

mlhpdx 9 hours ago

Fair. But there are legions of developers blindly doing REST over HTTP and creating bad experiences. I know it's fine for many, many cases -- but not all.