Possible Ethernet TCP/IP Fix Using OT Advanced Tuner!


Subject: Possible Ethernet TCP/IP Fix Using OT Advanced Tuner!
From: Robert Frezza (bfrezza@stanford.edu)
Date: Wed May 31 2000 - 02:31:17 MDT


This potential fix applies to people who:

1) Are using mol 0.9.45 with TCP/IP and Ethernet
2) Are using a dedicated Mac IP address, rather than IP forwarding
3) Are experiencing weird problems with TCP crashes/slowdowns/failures
4) Get those <*> Can't allocate Packet Datas error messages

This problem has kept me from using the Internet via mol for quite a long
while. So far I've found a few other people reporting it, and no fix. What
I've found does not fix the problem entirely, but it makes OT usable by most
definitions - I'm able to maintain about 100K/sec within mol without
generating those <*> Can't Allocate Packet Datas errors at all. (I did notice
that in 0.9.45 these messages get cut off after the first few appear, which is
a nice relief.)

Here's what I've done to make it work on my machine. If this helps you, or if
you think it might help you, let me know and I'd be glad to do further tests or
answer questions (bfrezza@stanford.edu). It essentially involves modifying OT
to send and receive data in smaller segments.

1) Upgrade to the newest mol (0.9.45)
2) Upgrade to OS 9.0.4 (although earlier OS versions might work)
3) Upgrade to the newest OT that comes with OS 9 or 9.0.4 - not the one from 8.6
4) Download OT Advanced Tuner from www.sustworks.com
5) Create a configuration file with the following changes to the defaults:
        Set the Upper Limit on the Max Segment Size (tcp_mss_max) to 2K
        Set the rwin_mss_multiplier to 1
6) Reboot
7) Apply your configuration file before establishing any TCP connections

After doing this, the first part of my data (approximately 20-50K?) still comes
through jaggedly and seems to be as problematic as before. After this short
initial period, however, OT seems to stabilize to a state where it can
consistently deliver 100K/sec until the next reboot.

What I believe is going on is that TCP detects that I'm on a fast connection,
and after the first few packets decides that it can maximize performance by
raising the segment size to the default maximum of 63K. This puts a huge
burden on some buffer on the Linux side, which keeps overflowing and behaving
erratically, and that leads to the slowdowns and crashes. This would explain
why people with slower connections don't seem to have this problem (someone on
a modem would never be playing with segments this big), and why people using
fake IPs, which take another route, don't have this trouble either. By
changing this setting, we're sending smaller segments (at a small performance
loss, of course), but these smaller segments are handled better by whatever is
delivering the data to OT on mol.
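
To make the guess above a bit more concrete, here is a toy sketch in Python.
It is not a model of mol or OT internals, which I haven't looked at: the
16K buffer, the 2K-per-tick drain rate and the function name are all made-up
assumptions, chosen only to show how a fixed-size buffer copes with 2K
segments but not with 63K ones.

```python
def simulate(segment_size, buffer_capacity, drain_per_tick, ticks):
    """Toy bounded buffer: one segment is offered per tick.

    All sizes are in bytes. A segment that does not fit in the free space
    is counted as dropped. Returns (delivered_bytes, dropped_segments).
    """
    queued = 0
    delivered = 0
    dropped = 0
    for _ in range(ticks):
        # The sender offers one whole segment; it is accepted only if it fits.
        if queued + segment_size <= buffer_capacity:
            queued += segment_size
        else:
            dropped += 1
        # The receiving side drains a fixed amount per tick.
        drained = min(queued, drain_per_tick)
        queued -= drained
        delivered += drained
    return delivered, dropped


if __name__ == "__main__":
    HYPOTHETICAL_BUFFER = 16 * 1024  # assumption, not a measured mol value
    HYPOTHETICAL_DRAIN = 2 * 1024    # assumption, bytes drained per tick
    for mss in (2 * 1024, 63 * 1024):
        print(mss, simulate(mss, HYPOTHETICAL_BUFFER, HYPOTHETICAL_DRAIN, 100))
```

With these made-up numbers the 2K segments all get through, while every 63K
segment is rejected because it never fits in the buffer. The real behavior is
surely messier, but this is the kind of failure I have in mind.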

Also note that applying the configuration file after some data has already been
sent with the default settings does not work as well - I've found that once the
buffers have become clogged with the large packets, applying it gives results
that are not as smooth, and performance is significantly lower (although it's
still stable). I'll repeat here - apply the configuration file before any TCP
activity. (I believe there's an option in OT Advanced Tuner to do this on each
reboot for you.)

(Btw, the rwin_mss_multiplier needs to be changed too because the default gives
a 16K window, which seems to be too large - with the 2K segment size limit, a
setting of 1 implies an initial window of 2K.)
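
The window arithmetic can be sketched in a few lines of Python. This assumes
the window is simply the segment size times rwin_mss_multiplier, which is what
the 2K figure above implies; the function names are mine, not anything from
OT. The second function uses the usual rule of thumb that a sender can have at
most one window of data in flight per round trip, so the ceiling it gives only
holds while the window stays at its initial size.

```python
def receive_window(mss_bytes, rwin_mss_multiplier):
    """Initial receive window in bytes, assuming window = MSS * multiplier."""
    return mss_bytes * rwin_mss_multiplier


def throughput_ceiling(window_bytes, rtt_ms):
    """Upper bound in bytes/sec if one window is sent per round trip."""
    return window_bytes * 1000 / rtt_ms


if __name__ == "__main__":
    window = receive_window(2 * 1024, 1)
    print(window)                          # 2048
    print(throughput_ceiling(window, 20))  # 102400.0
```

So under these assumptions, a 2K window could only carry about 100K/sec if the
round trip stayed at or below roughly 20 ms.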

If any of this applies to you or makes sense to you, feel free to drop me a
line. I could be imagining it all, and I don't know that much about OT, but
this sounds reasonable :) So far, it's making all the difference for me!

Best,

Bob


