DragonMage Posted March 5, 2004

OK folks, if you were in IRC #musb, you already know that I have a problem with an all-Linux network at work. I will describe the setup and the problems that I have. Then you can give me all the suggestions you can think of, short of reinstalling everything. It's going to be long and complicated, so be prepared. Oh yeah, only network gurus may apply.

The setup:

1. A dual Pentium II 350 with 384 MB of RAM running K12LTSP version 4.0 (Fedora Core based), IP = 192.168.0.254 (as suggested by LTSP), with these services running:
   - BIND (name server, required for sendmail)
   - Sendmail and POP3 for the internal mail server
   - Samba
   - dhcpd
   - iptables and NAT (IP masquerading for internet access later on)
2. Five computers ranging from a P2-333 to a P4 2 GHz, all running PCLinuxOS installed on the hard disk (so think of them as modified Mandrake). All of them run Samba and have static IPs.
3. One computer running Slackware, also with a static IP and running Samba.

Now, the problem is that the network is slow: not low bandwidth, but slow response. Pinging each computer works, but behaves oddly:

a. If I ping the numerical IP address directly, I get a response immediately. No packet loss and a good ping time (less than 1 ms).
b. If I ping by name, it takes about 10 seconds to half a minute to get a response (e.g. ping test.test.org (192.168.0.1)) and another 10 seconds to half a minute to get the first ping result. However, the ping time is still less than 1 ms and there is no packet loss.

Because of this, all of the services are slow to get going. If I try fetching and sending internal e-mail, it takes a minute or more before the first message is received or sent. Ditto with Samba: it takes about a minute in smb4k and LinNeighborhood before I get the listing of which shares are open on the remote computers. However, once I mount the shares, the bandwidth is good.
Now, I have some ideas about where the problem could be:

1. PCLinuxOS is just slow at networking. But if that were the case, the networking between Slackware and the server should not be affected, and it is.
2. The problem is on the server. It could be BIND, since I am using K12LTSP, which has its own built-in BIND settings. The naming scheme for LTSP and my own internal domain name may have clashed with each other (since they both use the 192.168.0.x range), but when I disabled the LTSP scheme (by commenting out all the LTSP lines in /etc/named.conf), it had no effect whatsoever.
3. Some service is hogging the response. The problem is that I don't know which one or how.

All suggestions and how-tos are welcome. Thanks in advance.
himuraken Posted March 6, 2004

OK, I will take a stab at this, and I think it is a reasonable one. I see this all the time on Windows servers, and I think it applies here as well. The only thing that differs between the fast and slow cases is the use of a NetBIOS/Samba name. I think this has to be a DNS issue: DNS resolution is taking way too long. That would explain your fast transfers but slow lookups. I am not an expert in BIND, but I would suggest that you start there. Hope this helps.
aru Posted March 6, 2004

I also agree that the problem is with the BIND server. Take a look at its configuration. Also run dig tests from the clients to see whether it is working properly.
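
One rough way to put numbers on the symptom described in the first post is to time the name lookup on its own, separately from the ping. The following is only a sketch: `time_lookup` is a hypothetical helper name, and the targets are the example name and address quoted earlier in the thread. It goes through the system resolver (the same path ping uses, including /etc/hosts), so it complements dig rather than replaces it.

```python
import socket
import time


def time_lookup(name, resolver=socket.gethostbyname):
    """Run one lookup and return (result, elapsed_seconds).

    result is None if the resolver raises an OSError
    (socket.gaierror and socket.herror are both subclasses of it).
    """
    start = time.perf_counter()
    try:
        result = resolver(name)
    except OSError:
        result = None
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Targets taken from the first post; adjust for the real network.
    for target in ("test.test.org", "192.168.0.1"):
        result, elapsed = time_lookup(target)
        print(f"forward {target}: {result} in {elapsed:.2f} s")

    # Reverse lookup (IP to name) is timed separately because it can be
    # slow even when the forward lookup is fast.
    result, elapsed = time_lookup("192.168.0.1", resolver=socket.gethostbyaddr)
    print(f"reverse 192.168.0.1: {result} in {elapsed:.2f} s")
```

If the forward or reverse lookup alone accounts for the 10 to 30 second pause, that points at name resolution rather than at the network itself.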
jimdunn Posted March 6, 2004

It's almost certainly a DNS problem, from what you describe.

A quick and dirty solution would be to create /etc/hosts files on all the machines, containing all 6 IP addresses and names. Since you only have 5 or 6 machines, this could be a "quick fix" while you troubleshoot the problem. I think the names are getting resolved by broadcasts, which takes an age.

As already suggested, using dig will let you see whether BIND knows about these local machines and is resolving them for you. I think it may not be.

You could make your server machine a DHCP server and set the others to dynamic addresses to see if this solves it. This may help BIND to get updated. If they always need to get the same address, you can make dhcpd always give them a specific address by setting reservations based on their adapter MAC addresses (see the dhcpd docs).

I think you will need to set up a reverse lookup zone for your domain in BIND if you want to carry on using static IPs. The trouble is that you'll need to update it whenever things change, whereas if you use a DHCP server this can be done dynamically. This link has some good info on setting up BIND for reverse lookups: http://www.crazysquirrel.com/linux/dns.php

That's about all I can stab at based on what you said. Hope some of it is of use to you.
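
The /etc/hosts quick fix suggested above can be sketched as a small generator, so the same file content can be copied to every machine. This is an illustration only: `hosts_lines`, the machine names and the domain `example.lan` are all made up, and only the server address 192.168.0.254 comes from the thread.

```python
import ipaddress


def hosts_lines(machines, domain):
    """Build /etc/hosts entries of the form 'IP<TAB>fqdn<TAB>shortname'.

    machines maps a short host name to its IPv4 address as a string.
    ipaddress.ip_address() rejects malformed addresses and gives a
    numeric sort order (so .2 sorts before .10).
    """
    entries = sorted(
        machines.items(), key=lambda item: ipaddress.ip_address(item[1])
    )
    return [f"{ip}\t{name}.{domain}\t{name}" for name, ip in entries]


if __name__ == "__main__":
    # Hypothetical inventory; replace with the real names and addresses.
    machines = {
        "server": "192.168.0.254",
        "client1": "192.168.0.1",
        "client2": "192.168.0.2",
    }
    print("127.0.0.1\tlocalhost")
    for line in hosts_lines(machines, "example.lan"):
        print(line)
```

The output would be appended to /etc/hosts by hand on each machine; the sketch deliberately does not write to the file itself.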
DragonMage Posted March 6, 2004

OK, some quick updates. I brought my laptop home (with PCLinuxOS installed) and connected it to my desktop (also PCLinuxOS) via a crossover cable. I found out that the problem may be partly caused by PCLinuxOS: ssh takes one minute before the "enter your password" prompt appears. Luckily, I still had my Mandrake 9.1 Acronis backup handy, so I restored my desktop to Mandrake 9.1. Now, if I ssh from my desktop to the laptop, it takes about 3 seconds for the password prompt to appear, while from laptop to desktop it takes about a minute. So the slow network seems to come from PCLinuxOS.

However, I kind of agree that BIND may be part of the problem. Here is how I configured BIND: I installed webmin (using apt-get webmin) and configured it from there. As far as I know, the configuration files should be saved in the /var/named directory. In this case, I notice that there is a /var/named/chroot directory and my internal domain seems to have been put in there (/var/named/chroot/etc/ to be exact). Is this normal? It does work for resolving names to addresses, though.

Since it is the weekend and the server is at work, I can only work on it on Monday. Anyway, thanks for the link, jimdunn. I will be reading it soon enough.
DragonMage Posted March 7, 2004

Another update. It's definitely PCLinuxOS (at least my version of it). I backed up my current PCLinuxOS with Acronis and reinstalled it. Now ssh takes around 3 seconds for the password prompt to appear. Restoring the Acronis backup brings the problem back: ssh takes around a minute for the password prompt to appear.

Well, that's one end down. Now all I need to do is reinstall PCLinuxOS on all the affected clients at work. Oh boy.
Michel Posted March 12, 2004

Maybe this program can help determine problems with DNS if they don't have a clue: http://www.zonecheck.fr/
DragonMage Posted March 13, 2004

Last update. OK, after reinstalling all of the clients and changing the domain name (I should've known an underscore is a big no-no), everything is hunky-dory again. Now, if only LTSP worked, but that's another problem entirely. Thanks for all the suggestions :)
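
The underscore problem mentioned in the last post can be caught before a name goes into a zone file. The sketch below checks a name against the usual hostname rules (RFC 952 as relaxed by RFC 1123): labels of 1 to 63 letters, digits or hyphens, not starting or ending with a hyphen. `is_valid_hostname` is a hypothetical helper, and `my_domain.org` is a made-up example, since the thread does not give the original domain name.

```python
import re

# One DNS label: letters, digits and hyphens only, 1-63 characters,
# with no hyphen at the start or the end.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name):
    """Return True if name follows the RFC 952/1123 hostname syntax."""
    if name.endswith("."):
        name = name[:-1]  # a single trailing dot (fully qualified form) is fine
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("test.test.org", "my_domain.org"):
        verdict = "ok" if is_valid_hostname(candidate) else "invalid"
        print(f"{candidate}: {verdict}")
```

This only checks syntax; it says nothing about whether the name actually resolves.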