Subject: ATM network management

(This message is being sent to all CPSC 826 students)

It is acceptable to discuss the health (or lack thereof) of the atm
network with other teams.

To see if atm networking is alive, first do

   /sbin/ifconfig

If the atm net is up, you should see (among other interfaces)

lec0      Link encap:Ethernet  HWaddr 00:04:AC:6C:2B:C7
          inet addr:192.168.8.60  Bcast:192.168.8.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1492  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100

ping should be used to test reachability between two hosts.

ATM networking is NOT started automatically on reboot. It must be
manually started using the following procedure:

   login as root
   cd /home/westall/atm/init

Load the proper driver. On robert, the driver is ia.up-2.4.0 and it is
loaded via

   /sbin/insmod ia.up-2.4.0

On the 25 Mbps systems the driver is atm.up-2.4.0

After loading the driver, start the network via

   ./net.init.new start

---------------------------------------------------

One precaution is in order: Make SURE the driver is NOT already loaded
before you try to insmod it. Loading a second copy may well crash the
system.

If you are convinced that atm networking is hung in some way you may
restart it by:

   ./net.init.new stop
   ./net.init.new start

In some instances it could be necessary to stop it on <> and THEN
restart it on all 5 machines.

Subject: ATM network management

(This message is being sent to all CPSC 826 students)

Unfortunately I have not been able to reproduce all of the problems
that have been reported. The atm network is now up and appears to be
functioning normally.

In order to reduce contention for resources and the potential for
problems, please follow these policies COMPLETELY.

1 - Do initial development and THOROUGH testing on the SUNS

2 - Do initial port to linux and THOROUGH testing using the lab
    machines NOT CONNECTED to the atm switch.
3 - When testing on the atm network it is never necessary to send more
    than 12.5 MB (100 Mb) in a single run. Thus a single run should
    never take more than a few seconds. It is also the case that I
    don't need or want hundreds of data points reported. Eight to
    sixteen is more appropriate.

4 - If you suspect a problem with an atm host, look at the tail of
    /var/log/messages. If you see "lec0 shutdown!", you do have a
    problem.

5 - To restart, perform the following steps IN THE FOLLOWING ORDER

       ./net.init.new stop
       /sbin/rmmod atm-2.4.0
       /sbin/insmod atm-2.4.0
       ./net.init.new start

    If you inadvertently remove the driver while networking is still
    up you will likely cause a crash or render your system unstable.
    I do recommend reloading the driver before networking is
    restarted.

mw

Subject: Assignment 1 submission

(This message is being sent to all CPSC 826 students)

Apparently some of you failed to understand or failed to heed my
admonition regarding the submission of M$ ware based reports.
Offenders in this category submitted some manner of Word generated
postscript which unfortunately also contained some Mr. Bill generated
header which caused our laser printer to print out a <<>> of raw
postscript.

Since this was the first assignment, and I may not have made myself
clear, I will accept writeups in the correct format (latex) by noon
midnight monday with no penalty. A sample latex template is in
subdirectory writeup of the 826 directory.

One team also failed to include their NAMES on the write up. That
should also be corrected.

mw

Subject: Assignment 2

(This message is being sent to all CPSC 826 students)

Phase 1 of assignment 2 is now available as assn2.s01

You will do code development on system glint2. I have created an
account for each of you there. The accounts have no password at
present. Any accounts still having no passwords by noon Wednesday will
be given one by me. (This will make it difficult for you to work while
I am out of town, so please don't let that happen.)
You will not be able to test on glint2. All testing can be done on any
of the lab machines. To test on machines other than the original 5
(which is recommended) you will need to copy over the 2.4.0 kernel and
install it using lilo. It is also the case that glint2 has 2.4.1
kernel sources, meaning that you will have to use the -f flag when you
insmod your module.

WARNING: Broken modules can and will crash systems. Please be
considerate of others who might be using the system when you try to
test. We don't want to be inventing any new "rages" here.

Subject: Assignment 2 Team numbers

(This message is being sent to all CPSC 826 students)

See: teams.a2

Subject: Last call

(This message is being sent to all CPSC 826 students)

I must protect all accounts on glint2 before leaving town. Thus if
your account is not protected by 6:30 today you will be unable to
access it until next week.

mw

Subject: Irresponsible testing...

It is a REALLY BAD IDEA to simply abandon modules that are filling up
the system log and potentially breaking networking. The output below
was generated by an abandoned module that ran for MANY hours last
night. There is ABSOLUTELY no need to run any test at this point for
more than 2 or 3 minutes.

There is another version running right now which also appears to have
been abandoned (/var/syslog/messages is now over 500000 lines long!)

Anyone who sees stuff like this going on is encouraged to remove the
module, send a nasty note to the offender and a copy to me. People who
INSIST on doing business this way will be given dedicated test times
(between 5 and 8 am on Sat and Sunday!!!!!)

mw

107063 Feb 26 22:42:06 darryl kernel: Evil return from c481c0a0(4).
107064 Feb 26 22:42:06 darryl kernel: Outgoing Packet Port=386, Wind=-1071860068
107065 Feb 26 22:42:06 darryl kernel: Evil return from c481c0a0(4).
107066 Feb 26 22:42:06 darryl kernel: Outgoing Packet Port=15412, Wind=-10718600
107067 Feb 26 22:42:06 darryl kernel: Evil return from c481c0a0(4).
107068 Feb 26 22:42:06 darryl kernel: Incoming Packet Port=17856, Wind=-10718600
107069 Feb 26 22:42:06 darryl kernel: Evil return from c481c07c(0).
107070 Feb 26 22:42:06 darryl kernel: Incoming Packet Port=17856, Wind=-10718600
107071 Feb 26 22:42:06 darryl kernel: Evil return from c481c07c(0).
107072 Feb 26 22:42:06 darryl kernel: Incoming Packet Port=17856, Wind=-10718600
107073 Feb 26 22:42:06 darryl kernel: Evil return from c481c07c(0).
107074 Feb 26 22:42:06 darryl kernel: Outgoing Packet Port=769, Wind=-1071860068
107075 Feb 26 22:42:06 darryl kernel: Evil return from c481c0a0(4).
107076 Feb 26 22:42:07 darryl kernel: Outgoing Packet Port=2107, Wind=-107186006
107077 Feb 26 22:42:07 darryl kernel: Evil return from c481c0a0(4).
107078 Feb 26 22:42:08 darryl kernel: Outgoing Packet Port=49320, Wind=-10718600
107079 Feb 26 22:42:08 darryl kernel: Evil return from c481c0a0(4).
107080 Feb 26 22:42:09 darryl kernel: Outgoing Packet Port=15412, Wind=-10718600
107081 Feb 26 22:42:09 darryl kernel: Evil return from c481c0a0(4).
107082 Feb 26 22:42:09 darryl kernel: Outgoing Packet Port=49320, Wind=

Subject: assn 2

(This message is being sent to all CPSC 826 students)

When you create a device with mknod use the name

   /dev/tcpm##

where ## is your team number.

mw

Subject: assn 2

(This message is being sent to all CPSC 826 students)

In attempting to process the TCP header I HIGHLY recommend that you
locate it directly by adding the IP header length to the pointer to
the IP header that appears in the firewall...

Apparently some folks have attempted to use various union/struct
definitions lying in the skbuff structure and have found that they are
lying about where the tcp header is.

mw

Subject: Rest of assignment 2

(This message is being sent to all CPSC 826 students)

The rest of assignment 2 is now on assn2.s01

Please read it before class tomorrow. I will be glad to answer
questions at that time. It will be due on the 16th.
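The earlier advice about locating the TCP header directly (IP header pointer plus IP header length) can be sketched as plain offset arithmetic. The assignment itself is kernel code in C; the Python below only illustrates the calculation on a raw byte buffer, and every name in it (tcp_header_offset, tcp_ports, build_demo_packet) is hypothetical and not taken from the assignment handout.

```python
import struct

IPPROTO_TCP = 6  # protocol number carried in byte 9 of the IPv4 header


def tcp_header_offset(ip_packet: bytes) -> int:
    """Return the byte offset of the TCP header inside an IPv4 packet.

    The offset is the IP header length: the low 4 bits of byte 0 (IHL),
    which count 32-bit words, multiplied by 4.
    """
    if len(ip_packet) < 20:
        raise ValueError("buffer is shorter than a minimal IPv4 header")
    version = ip_packet[0] >> 4
    ihl_words = ip_packet[0] & 0x0F
    if version != 4 or ihl_words < 5:
        raise ValueError("not a well-formed IPv4 header")
    if ip_packet[9] != IPPROTO_TCP:
        raise ValueError("IP payload is not TCP")
    offset = ihl_words * 4
    if len(ip_packet) < offset + 20:
        raise ValueError("buffer is too short to hold a TCP header")
    return offset


def tcp_ports(ip_packet: bytes) -> tuple:
    """Read the source and destination ports at the computed offset."""
    offset = tcp_header_offset(ip_packet)
    return struct.unpack_from("!HH", ip_packet, offset)


def build_demo_packet(ihl_words: int = 6) -> bytes:
    """Hypothetical test packet: an IPv4 header (padded with zeroed
    option bytes when ihl_words > 5) followed by a 20-byte TCP header
    carrying ports 33229 -> 80."""
    options = bytes(4 * (ihl_words - 5))
    total_len = 4 * ihl_words + 20
    ip = struct.pack("!BBHHHBBH4s4s", (4 << 4) | ihl_words, 0, total_len,
                     0, 0, 64, IPPROTO_TCP, 0,
                     bytes([192, 168, 8, 60]), bytes([192, 168, 8, 61]))
    tcp = struct.pack("!HHIIBBHHH", 33229, 80, 0, 0, 5 << 4, 0x10, 0, 0, 0)
    return ip + options + tcp


if __name__ == "__main__":
    packet = build_demo_packet(ihl_words=6)
    print(tcp_header_offset(packet), tcp_ports(packet))
```

The demo packet carries IP options on purpose: a fixed 20-byte assumption would read the ports from the wrong place, which is the kind of misplacement the message warns about.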
The snmp data for a couple of production routers is in the class
directory with name

   r*.snmp

As an exercise see if you can construct a table

   intf # | ip-address | who-is-t

How does the handling of subnet .60 differ from .48 and .56?

Subject: Assignment 3

(This message is being sent to all CPSC 826 students)

See assn3.s01

Subject: SACKS

TCP SACK is ON by default and should be left on by default.

To turn it off temporarily:

   echo "0" > /proc/sys/net/ipv4/tcp_sack

To turn it back on:

   echo "1" > /proc/sys/net/ipv4/tcp_sack

Subject: Drop rates

(This message is being sent to all CPSC 826 students)

Several folks have reported low drop rates. There are two possible
explanations:

1 - someone has reconfigured the gateway pc so that the expected drop
    rate is not being realized

2 - drop rate detectors are not working correctly

It's hard for me to diagnose that problem from home. Here are some
ideas....

If someone reboots a gateway pc (such as jeff for shemp) the proper
device driver to install is atm.o (not atm-2.4.whatever).

If a system is rebooted (as it appears jeff may have been) the program
apetqdrp MUST be used to set the target drop rate and it must be used
AFTER the atm network is restarted.

If robert is rebooted (which blessedly it has not been) reverse routes
to moe, shemp, and curly must be manually reestablished.

I will be in the office tomorrow and will verify that all systems have
correct drop rates.

Subject: Care and feeding of routing tables:

There is an old American school-kid expression: "Close only counts in
horse shoes and hand grenades"

Close doesn't count in routing table construction. A student just
reported (correctly) to me that moe was not working. Here is moe's
routing table:

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
moe.cs.clemson. *               255.255.255.255 UH    0      0        0 eth0
130.127.60.0    *               255.255.255.0   U     0      0        0 eth0
192.168.0.0     david.cs.clemso 255.255.255.0   UG    0      0        0 eth0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         130.127.60.1    0.0.0.0         UG    0      0        0 eth0

Here is curly's table:

[westall@curly westall]$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
curly.cs.clemso *               255.255.255.255 UH    0      0        0 eth0
130.127.60.0    *               255.255.255.0   U     0      0        0 eth0
192.168.8.0     darryl.cs.clems 255.255.255.0   UG    0      0        0 eth0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         130.127.60.1    0.0.0.0         UG    0      0        0 eth0

Please take care in constructing routes when rebooting moe, curly, and
shemp. Broken packet filters MAY make it necessary to reboot.

Subject: more troubles...

Please do not install ANY student tcp monitors on the router
machines!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Someone did so on darryl yesterday:

Apr 15 17:08:41 darryl last message repeated 532 times
Apr 15 17:11:18 darryl last message repeated 567 times
Apr 15 17:11:18 darryl last message repeated 5 times
Apr 15 17:23:24 darryl kernel: atm_xmit: Random drop on vci 51
Apr 15 17:26:39 darryl last message repeated 2 times
Apr 15 17:27:50 darryl last message repeated 128 times
Apr 15 17:28:05 darryl last message repeated 141 times
Apr 15 17:42:00 darryl kernel: Register tcp monitor returned 0
Apr 15 17:42:00 darryl kernel: not to be loggedEvil return from c481d0cc(0).
Apr 15 17:42:00 darryl kernel: not to be loggedEvil return from c481d168(4).
Apr 15 17:42:00 darryl kernel: not to be loggedEvil return from c481d0cc(0).
Apr 15 17:42:31 darryl last message repeated 2 times
Apr 15 17:43:33 darryl last message repeated 4 times

and all these "Evil return" messages commenced immediately thereafter.
After logging about 140,000 of these suckers in 24 hours poor old
darryl finally expired at 17:00 this afternoon.

I can't say what effect, if any, the presence of this defective tcpmon
may have had on results taken during this period.
If you need to monitor a router machine for any reason feel free to
use /usr/sbin/tcpdump -- but DON'T leave it running indefinitely.

If your tcp monitor is producing these messages in /var/log/messages
then it is definitely broken!

Thanks!

Subject: assn 2 directories

The directories have been created. Teammates, please pick ONE and ONLY
ONE directory and turn in code and paper there.

thanks,
mw

Subject: Ugh ^ 2

A grad student who is not in the class mistakenly rebooted both robert
and jeff and who knows what else this afternoon. I am presently trying
to get everything reconfigured.

mw

Subject: back in business

The systems should be back to normal now.

Subject: sigh...

In a perfect world every team would be assigned its own private
network consisting of an ATM switch and several hosts. Until some of
you guys make a bezillion $'s and donate a perfect world to us we must
live in an imperfect world. Thus it is important that we be careful
not to foul up the communal test environment.

I try to monitor this environment and unfortunately I just recorded on
Robert:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
moe.cs.clemson. darryl-lane     255.255.255.255 UGH   0      0        0 lec0
moe.cs.clemson. jeff-lane       255.255.255.255 UGH   0      0        0 lec0
moe.cs.clemson. david-lane      255.255.255.255 UGH   0      0        0 lec0
shemp.cs.clemso jeff-lane       255.255.255.255 UGH   0      0        0 lec0
curly.cs.clemso darryl-lane     255.255.255.255 UGH   0      0        0 lec0

Someone has totally trashed routing to moe, and this creates other
nasty side effects. I will try to fix this one more time tonight but
then I am packing it in.

mw

Subject: The plot thickens...
Here is what we now find in the routing table of curly:

[westall@curly westall]$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
curly.cs.clemso *               255.255.255.255 UH    0      0        0 eth0
130.127.60.0    *               255.255.255.0   U     0      0        0 eth0
192.168.8.0     jeff.cs.clemson 255.255.255.0   UG    0      0        0 eth0
192.168.8.0     david.cs.clemso 255.255.255.0   UG    0      0        0 eth0
192.168.8.0     darryl.cs.clems 255.255.255.0   UG    0      0        0 eth0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         130.127.60.1    0.0.0.0         UG    0      0        0 eth0
[westall@curly westall]$

Subject: more on toxic dump clean up...

I THINK but can't guarantee that the network has been restored to
health. In the process of doing so I noted on one machine a bezillion
messages of the form:

Apr 18 22:26:45 curly kernel: here is the ack 1292175014
Apr 18 22:26:45 curly kernel: here is the first flag value 1
Apr 18 22:26:45 curly kernel: hi there base.c !!this is the portno 33229here is the second flag value 0here is the second flag value 0here is the first flag value 1here is the second flag value 0here is the first flag value 1here is the dest 67108864
Apr 18 22:26:45 curly kernel: here is the ack -2071820928
Apr 18 22:26:45 curly kernel: here is the first flag value 1
Apr 18 22:26:45 curly kernel: hi there base.c !!this is the portno 33229here

Diagnostic stuff like this is an EXCELLENT way to test to see if your
monitor is correctly capturing a small exchange of 10 packets or so.

LEAVING diagnostic stuff like this active during attempts to capture
REAL performance measures renders measures LESS THAN USELESS because
logging overhead can badly skew results.

Subject: late penalties

It is my normal policy to never apply a late penalty to assignment n+1
before I have returned assignment n. Therefore, late penalties for
assn 2 will start when I return assn 1 (hopefully monday). Sorry about
the big delay on that.

mw

Subject: assn3 clarification

You are to build everything from scratch here... NO use of snmp
library routines.
Since it is late in the semester and easy to overlook the obvious,
don't overlook this possibility:

Start a session on one of the lab machines and run a properly
parameterized version of

   tcpdump host glint2 > dump.log

Start another session on the same lab machine and do:

   snmpbulkwalk -v2c glint2 public at

And you will have a snapshot of what it is you are trying to
duplicate!

Subject: snmpd vulnerability

snmpd on glint2 apparently expired today. It would appear that it may
be possible for a badly mangled request to cause this. I have
restarted it. Please let me know if (1) it fails again or (2) you can
identify a broken packet that produces a "sure kill". In case (2) we
can send your packet to the fine folks at UCD and maybe they will fix
the problem!

mw

Subject: How to tell

If you don't see the /usr/sbin/snmpd line the daemon is dead. (And it
can only be restarted by root).

==> ps -aux | grep snm
root     19958  0.1  3.1  3160 1956 pts/2    S    21:51   0:01 /usr/sbin/snmpd
westall  19974  0.0  0.8  1360  520 pts/2    S    22:07   0:00 grep snm

Subject: snmpd

I have now converted snmpd to setuid status. Thus anyone may restart
it by simply entering the command

   /usr/sbin/snmpd

Please make SURE it is really dead before attempting to restart it.
The fact that you don't seem to be getting replies is NOT compelling
evidence. You need to run ps.

mw

Subject: big ints...

It appears possible that you will see counters that are larger than 4
bytes. I don't think you will see more than 32 significant bits
though. If you do, you can discard the most significant bits. Counters
should be printed as UNSIGNED ints.

   41 - Counter   Unsigned
   42 - Gauge     Unsigned
   43 - Ticks     Unsigned

Subject: grades

I will be sending out individual grade rpts shortly. Included are the
makeups and any assn 3's that were turned in by around noon today.
Anyone with a final avg of 88 or higher is exempt from the exam and
has an A. Assn three IS counted in your average.
If you haven't turned it in yet, you can add about 11 pts to your
average by making 100 on it.

In averaging quizzes each quiz is equally weighted. That is, your
percentage mark for each quiz is what is averaged. If you sum your
total marks and sum the possible marks and then divide you will get a
slightly wrong answer.

I will do another round of assn 3 grading sometime sunday.

mw

Subject: final grade...

As I stated at the start of the semester, the final exam cannot hurt
your grade. Therefore, if you are content with whatever your average
is you do not have to appear for the final.

Subject: updated grades

I have now updated the grades of those who had submitted assn 3 by
2:00 pm today. I am resending all reports, but only the 6 people who
submitted assn 3 since the previous distribution will see any change.

As before, those having a grade of 88 or higher are exempt from the
exam and will receive a grade of A. For others the exam cannot hurt
your final grade, but the A cutoff will not fall below 88.
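For anyone checking their own quiz average against the rule stated in the grades message (each quiz equally weighted, so percentages are averaged), here is a minimal sketch of the two calculations side by side. The function names and the marks in the example are made up for illustration and are not actual course data.

```python
def average_of_percentages(quizzes):
    """Equal weighting: convert each quiz to a percentage, then average."""
    percentages = [100.0 * earned / possible for earned, possible in quizzes]
    return sum(percentages) / len(percentages)


def pooled_percentage(quizzes):
    """Sum of marks over sum of possible marks: this weights long quizzes
    more heavily and is NOT the rule described in the grades message."""
    total_earned = sum(earned for earned, _ in quizzes)
    total_possible = sum(possible for _, possible in quizzes)
    return 100.0 * total_earned / total_possible


if __name__ == "__main__":
    # Hypothetical marks as (earned, possible): 8/10 and 30/50.
    marks = [(8, 10), (30, 50)]
    print(average_of_percentages(marks))  # (80 + 60) / 2
    print(pooled_percentage(marks))       # 38 / 60 as a percentage
```

The two results differ whenever the quizzes have different point totals, which is why summing marks and dividing gives a slightly different answer.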