This advice is for Linux-based servers.
Run modern kernels. Linux kernels after 5.0 switched to EDT (earliest departure time) scheduling, which helps.
Make sure BQL is working for your ethernet device.
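One rough way to check on BQL is to read the per-queue counters the kernel exposes under sysfs. This is a minimal sketch, not a definitive tool: the function name `read_bql` and the `sysfs_root` parameter are mine, and the interpretation (a `limit` and `inflight` that stay at 0 while the link is busy suggest the driver is not using BQL) is a rule of thumb.

```python
from pathlib import Path


def read_bql(dev, sysfs_root="/sys/class/net"):
    """Return {tx_queue_name: {counter: value}} for one network device.

    Reads the byte_queue_limits files for every TX queue of `dev`.
    `sysfs_root` is only a parameter so the function can be pointed
    at a fake tree for testing (an assumption of this sketch).
    """
    result = {}
    queues = Path(sysfs_root) / dev / "queues"
    for bql_dir in sorted(queues.glob("tx-*/byte_queue_limits")):
        stats = {}
        for name in ("limit", "limit_min", "limit_max", "inflight"):
            counter = bql_dir / name
            if counter.is_file():
                stats[name] = int(counter.read_text().strip())
        result[bql_dir.parent.name] = stats
    return result


if __name__ == "__main__":
    import sys

    dev = sys.argv[1] if len(sys.argv) > 1 else "eth0"  # "eth0" is a placeholder
    for queue, stats in read_bql(dev).items():
        print(queue, stats)
```

Run it a few times while the interface is under load and compare the numbers between runs.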
Kill pfifo_fast as your underlying qdisc everywhere. If your server workload is primarily TCP, switch the qdisc to sch_fq, which not only successfully multiplexes up to millions of flows, but also applies packet pacing, which is a huge win. However, if your server (this includes whatever underlies your virtual machine) is acting as a "router" for VPNs, QUIC/UDP, or tunnels of all sorts, fq_codel is presently the better (and often nowadays the default) choice. Monitor sch_fq with ebpf, or use the fq_codel statistics. If you are consistently seeing drops or marks, the server itself is queuing too much internally and it's time to get a server with more cpu and network bandwidth.
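For the statistics side, here is a small sketch that pulls the drop and ECN-mark counters out of `tc -s qdisc show` text. The function names are mine, and the regexes assume the usual human-readable `tc` output ("dropped N" on the Sent line, "ecn_mark N" in the fq_codel details); treat it as illustrative, since that layout is not a stable interface. The default qdisc itself is usually chosen with the `net.core.default_qdisc` sysctl (for example `fq` or `fq_codel`).

```python
import re
import subprocess


def parse_qdisc_stats(tc_output):
    """Extract kind, handle, dropped and ecn_mark for each qdisc in the text."""
    qdiscs = []
    for line in tc_output.splitlines():
        header = re.match(r"qdisc (\S+) (\S+)", line)
        if header:
            # A new qdisc section starts; counters default to 0.
            qdiscs.append({"kind": header.group(1), "handle": header.group(2),
                           "dropped": 0, "ecn_mark": 0})
            continue
        if not qdiscs:
            continue
        dropped = re.search(r"dropped (\d+)", line)
        if dropped:
            qdiscs[-1]["dropped"] = int(dropped.group(1))
        marked = re.search(r"ecn_mark (\d+)", line)
        if marked:
            qdiscs[-1]["ecn_mark"] = int(marked.group(1))
    return qdiscs


def read_qdisc_stats(dev):
    """Run `tc -s qdisc show dev <dev>` and parse its output."""
    out = subprocess.run(["tc", "-s", "qdisc", "show", "dev", dev],
                         capture_output=True, text=True, check=True).stdout
    return parse_qdisc_stats(out)


if __name__ == "__main__":
    for q in read_qdisc_stats("eth0"):  # "eth0" is a placeholder device name
        print(q)
```

The counters are cumulative, so "consistently seeing drops or marks" means comparing successive samples, not looking at a single reading.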
MEASURE. Take packet captures of your services from various vantage points, especially slow networks. Instrument your server-side applications to monitor TCP_INFO statistics. Report on loss, marking, retransmits, and (especially) rtt. This measures how you are doing in the network path. ALSO instrument the proxies you are using - it's not unheard of to end up with a megabyte of buffering between the server application and the proxy. Also set the TCP_NOTSENT_LOWAT socket option where you can.
Evaluate TCP BBRv1. For some workloads - notably long-running, sender-side-limited, youtube-like traffic - it's the best thing out there. It also does well for single-flow streaming uploads or downloads. For sharded web traffic, it has a tendency to over-compete with itself. YMMV. See item 3.
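One low-risk way to evaluate it is per socket, leaving the system default alone. A minimal sketch, assuming Linux and that the bbr module is loaded and permitted for your process; the function names are mine:

```python
import socket

# Fall back to the Linux constant if this Python build lacks the name.
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", 13)

AVAILABLE_PATH = "/proc/sys/net/ipv4/tcp_available_congestion_control"


def available_algorithms(path=AVAILABLE_PATH):
    """List the congestion control algorithms the kernel currently offers."""
    with open(path) as f:
        return f.read().split()


def use_congestion_control(sock, name="bbr"):
    """Select an algorithm for this one socket and return what is in effect."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, name.encode())
    raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, 16)
    return raw.split(b"\x00", 1)[0].decode()
```

Setting it per socket lets you run BBR on one service, or on a fraction of connections, and compare against cubic with the same measurements you already collect.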
Many cloud services today make no guarantees as to the actual performance of the underlying network. If you have a latency-sensitive service (like videoconferencing), dedicated cpu instances or bare metal are better. You can also easily cap your outgoing bandwidth with sch_cake, setting it below what the cloud provider can reliably provide.
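A sketch of the sch_cake cap, for illustration only: it builds a `tc qdisc replace ... root cake bandwidth ...` command with the shaped rate set some percentage below a rate you have measured as reliable. The function names, the 10% headroom default, and the device name are all my assumptions, and applying it needs root.

```python
import subprocess


def cake_command(dev, reliable_mbit, headroom_pct=10):
    """Build the tc argv that shapes egress on `dev` below `reliable_mbit`.

    `headroom_pct` is how far below the reliable rate to shape; 10 is an
    arbitrary starting point, not a recommendation from any vendor.
    """
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in [0, 100)")
    rate_kbit = reliable_mbit * 1000 * (100 - headroom_pct) // 100
    return ["tc", "qdisc", "replace", "dev", dev, "root",
            "cake", "bandwidth", f"{rate_kbit}kbit"]


def apply_cake(dev, reliable_mbit, headroom_pct=10):
    """Run the command; raises CalledProcessError if tc rejects it."""
    subprocess.run(cake_command(dev, reliable_mbit, headroom_pct), check=True)


if __name__ == "__main__":
    # Print instead of applying, so the sketch is safe to run as-is.
    print(" ".join(cake_command("eth0", 500)))
```

The point of shaping below the provider's rate is that the queue then forms in cake, where it is managed, instead of somewhere in the provider's network.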
MEASURE MORE. Good tools for generating workloads are arriving - networkQuality should be out soon for OSX, and flent and irtt are also common.
Check out the mailing lists at lists.bufferbloat.net.
This answer is for home routers.
Much of the bufferbloat on the internet is on the home router and ISP head ends. There is only so much you can do end-to-end - and the advice in the video is primarily targeted at LTE <-> server interactions. Definitely take a hard look at your server behaviors as I wrote above! I've written a lot and given many presentations on the bufferbloat problems in the typical home or office network. One of the more amusing is here:
https://blog.apnic.net/2020/01/22/bufferbloat-may-be-solved-but-its-not-over-yet/
To improve the ISP link... encourage your customers to get better home routers with "Smart Queue Management" (SQM), and fq_codel for the wifi. SQM is universally available from third party router firmwares like openwrt, dd-wrt, merlin, and tomato. It's also available for most gamer routers (netduma calls it "anti-bufferbloat"), and it's the premier feature of the evenroute (eero also). ubnt's edgerouter products and the udm pro also have sqm, as do IPFire and tangledOS. It's a really long list; you just have to turn it on.
As originally written, the reference "sqm-scripts" - which work on any Linux - were based on htb + fq_codel. Since 2018 most work on them has moved to the sch_cake implementation, which is heavily optimized for sub-300Mbit bandwidths. Cake entered the mainline Linux kernel as of version 4.19.
PFSENSE also has a fq_codel implementation. Mikrotik is testing theirs (and cake).
If you are on comcast cable, they have fully rolled out the pie AQM on all their docsis 3.1 modems and cmts's. Upgrade to a docsis 3.1 modem, and if not on comcast, nag your cable co to finally just turn that on.
To be clear, however, after a link cracks about 40Mbit, the bufferbloat shifts to the wifi. Routers based on the ath9k, ath10k, intel, and mt76 chipsets have support for fq_codel there if they are based on a recent enough Linux kernel and/or have the right offloads enabled. A comprehensive list is impossible... evenroute/ubnt/eero are the leaders here.
With SQM (and/or the PIE AQM), the difference in videoconferencing quality in particular, on a network that is even slightly busy with "other stuff", is really remarkable. Although I'm one of the founders of bufferbloat.net and more than a little biased, I'm delighted apple is tackling these issues with new tools users can use to diagnose their bufferbloat. They seem to be not only fixing up their server networks, but advising customers on how to do so.
I'm not really big on recommending specific products as I just did here, preferring to run openwrt wherever I can, but it's my hope that now that more people understand why their network is jittery or feels slow, they'll go looking for the solutions, and tell others.
Jim Gettys laid out a call to arms, before retiring, here:
https://gettys.wordpress.com/2018/02/11/the-blind-men-and-the-elephant/