
Slow Network


DragonMage

Ok folks, if you were in irc #musb, you already know that I have a problem with an all-Linux network at work. I will describe the setup and the problems that I have. Then you can give me all the suggestions you can think of, short of reinstalling everything. It's going to be long and complicated, so be prepared. Oh yeah, only network gurus may apply :jester:

 

The setup:

 

1. A dual Pentium 2-350 with 384 MB of RAM running K12LTSP version 4.0 (Fedora Core based), with these services running:

BIND (for name server, required for sendmail)

Sendmail and pop3 for internal mail server

samba

dhcpd

iptables and nat (for IP masq for internet access later on).

IP = 192.168.0.254 (as suggested by LTSP)

 

2. Five computers ranging from a P2-333 to a P4 2 GHz, all running PCLinuxOS installed to the hard disk (so think of them as modified Mandrake). All of them run Samba and have static IPs.

 

3. One computer running Slackware, also with a static IP and running Samba.

 

 

Now, the problem is that the network is slow: not low bandwidth, but slow to respond. Pinging each computer works, but weirdly. What I mean by weird is this:

 

a. If I ping the numerical IP address directly, I get a response immediately. No packet loss and a good round-trip time (less than 1 ms).

b. If I ping by name, it takes about 10 seconds to half a minute before the header line appears (i.e. PING test.test.org (192.168.0.1)) and another 10 seconds to half a minute before the first reply is printed. However, the round-trip time is still less than 1 ms and there is no packet loss.
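
To separate "slow to resolve the name" from "slow to reach the machine", the lookups can be timed on their own. This is only a rough sketch using Python's standard socket module; the function names time_call and check_host are made up for illustration, and it goes through whatever resolver order the client is configured with (hosts file, DNS, and so on), not BIND directly.

```python
import socket
import time


def time_call(func, *args):
    """Run func(*args) and return (elapsed_seconds, result_or_exception)."""
    start = time.perf_counter()
    try:
        result = func(*args)
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError,
        # so a failed lookup is reported instead of crashing the script.
        result = exc
    return time.perf_counter() - start, result


def check_host(name, ip):
    """Time the forward (name -> IP) and reverse (IP -> name) lookups."""
    return {
        "forward": time_call(socket.gethostbyname, name),
        "reverse": time_call(socket.gethostbyaddr, ip),
    }
```

Calling check_host("test.test.org", "192.168.0.1") on one of the clients and printing the two elapsed times should show whether the 10 to 30 second delays sit in the forward lookup, the reverse lookup, or both.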

 

Because of this, all of the services are slow to get going. If I try to fetch or send internal e-mail, it takes a minute or more before the first message is received or sent. Ditto with Samba: it takes a minute in smb4k and LinNeighborhood before I get the listing of the shares on remote computers. However, once I mount the shares, the bandwidth is good.

 

Now, I have some ideas of where the problem can be:

 

1. PCLinuxOS is just slow at networking. But if that were the case, the networking between the Slackware box and the server should not be affected, and it is.

 

2. The problem is on the server. It could be BIND, since I am using K12LTSP, which comes with its own built-in BIND settings. It seems that the naming scheme for LTSP and my own internal domain name may clash with each other (since they both use the 192.168.0.x range), but when I disable the LTSP scheme (by commenting out all the LTSP lines in /etc/named.conf), it has no effect whatsoever.

 

3. Some service is holding up the responses. The problem is I don't know which one, and I don't know how to find out.

 

All suggestions, how-tos, are welcome.

 

Thanks in advance.


Ok, I will take a stab at this, and I think that it is a reasonable one. I see this all the time on Windows servers, and I think it applies here as well. The only thing different between the fast and the slow cases is the use of a NetBIOS / Samba name, so I think this has to be a DNS issue: DNS resolution is taking way too long. That would explain your fast transfers but slow lookups. I am not an expert in BIND, but I would suggest that you start there.

 

Hope to help.


It's almost certainly a DNS problem, from what you describe.

 

A quick and dirty solution could be to create /etc/hosts files on all the machines, containing all 6 IP addresses and names. Since you only have 5 or 6 machines, this could be a "quick fix" while you troubleshoot the problem.
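
As a rough illustration of that quick fix, here is a small Python sketch that prints hosts-file lines in the usual "IP, full name, short name" layout. The helper name hosts_lines is made up; test.test.org at 192.168.0.1 and the server address 192.168.0.254 come from the first post, but the short name "server" is only a placeholder, so substitute the real names.

```python
import ipaddress


def hosts_lines(domain, machines):
    """Return /etc/hosts lines for a {short_name: ip} mapping, sorted by IP."""
    lines = ["127.0.0.1\tlocalhost"]
    by_ip = sorted(machines.items(), key=lambda item: ipaddress.ip_address(item[1]))
    for name, ip in by_ip:
        # Format: address, canonical (fully qualified) name, then the short alias.
        lines.append(f"{ip}\t{name}.{domain}\t{name}")
    return lines


# "server" is a hypothetical name; "test" and both addresses are from the first post.
example = hosts_lines("test.org", {"server": "192.168.0.254", "test": "192.168.0.1"})
print("\n".join(example))
```

The same output would have to be copied to every machine, which is why this only makes sense as a stopgap on a network this small.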

 

I think the names are getting resolved by broadcasts, which is taking an age.

 

As already suggested, using dig will let you see if BIND knows about these local machines, and is resolving them for you. I think it may not be.

 

You could make your server machine a dhcp server, and set the others to dynamic addresses to see if this solves it. This may help BIND to get updated. If they always need to get the same address you can make dhcp always give them a specific address, by setting reservations based on their adapter MAC addresses. (see the dhcpd docs).
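
For the reservations, ISC dhcpd uses one host block per machine with a "hardware ethernet" line and a "fixed-address" line. A minimal Python sketch that writes such blocks is below; the function name, the host name "client1" and the MAC address are all made-up placeholders, and the output would still need to be pasted into the existing dhcpd.conf by hand.

```python
def dhcpd_host_block(name, mac, ip):
    """Return one ISC dhcpd host declaration that pins ip to the adapter mac."""
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac};\n"
        f"    fixed-address {ip};\n"
        f"}}\n"
    )


# Placeholder values: replace with each client's real name, MAC and address.
print(dhcpd_host_block("client1", "00:11:22:33:44:55", "192.168.0.1"))
```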

 

I think you will need to set up a reverse lookup zone for your domain in BIND if you want to carry on using static IPs. The trouble is you'll need to update it whenever things change, whereas if you use a dhcp server this can be done dynamically. This link has some good info on setting up BIND for reverse lookups:

 

 

http://www.crazysquirrel.com/linux/dns.php
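
To give an idea of what goes into a reverse zone with static IPs, here is a small Python sketch that builds the PTR line for one address. It only produces the record text; the zone declaration in named.conf and the SOA/NS records are not covered. The function name ptr_record is made up, and test.test.org / 192.168.0.1 are the example from the first post.

```python
import ipaddress


def ptr_record(ip, fqdn):
    """Return a zone-file PTR line mapping ip back to fqdn."""
    # reverse_pointer turns 192.168.0.1 into 1.0.168.192.in-addr.arpa
    owner = ipaddress.ip_address(ip).reverse_pointer
    # Trailing dots mark both names as fully qualified in a zone file.
    return f"{owner}.\tIN\tPTR\t{fqdn.rstrip('.')}."


print(ptr_record("192.168.0.1", "test.test.org"))
```

Every statically addressed machine would need one such line, which is the maintenance burden mentioned above.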

 

That's about all I can stab at based on what you said - hope some of it is of some use to you...


Ok, some quick updates.

I brought my laptop home (with PCLinuxOS installed) and connected it to my desktop (also PCLinuxOS) via a crossover cable. I found out that the problem may be partly caused by PCLinuxOS: ssh-ing takes one minute before the password prompt appears. Luckily, I still had my Mandrake 9.1 Acronis backup handy, so I restored my desktop to Mandrake 9.1. Now, if I ssh from my desktop to the laptop, it takes about 3 seconds for the password prompt to appear, while from laptop to desktop it takes about a minute. So the slow network seems to come from PCLinuxOS.

 

However, I kind of agree that BIND may be part of the problem. Here is how I configured BIND: I installed webmin (via apt-get) and configured it from there. As far as I know, the configuration files should be saved in the /var/named directory. In this case, I notice that there is a /var/named/chroot directory, and my internal domain seems to have been put there (/var/named/chroot/etc/ to be exact). Is this normal? It does work for resolving names, though. Since it is the weekend and the server is at work, I can only look at it on Monday.

 

Anyway, thanks for the link jimdunn. I will be reading it soon enough.


Another update.

 

It's definitely PCLinuxOS (at least my version of it). I backed up my current PCLinuxOS install with Acronis and reinstalled it. Now ssh takes around 3 seconds for the password prompt to appear. Restoring the Acronis backup brings the problem back: ssh takes around a minute for the password prompt to appear.

 

Well.. that's one end down. Now all I need to do is reinstall PCLinuxOS on all the affected clients at work.. oh boy..


Last update

 

Ok.. after reinstalling all of the clients and changing the domain name (I should've known an underscore is a big no-no), everything is hunky-dory again. Now, if only LTSP worked, but that's another problem entirely.
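
For anyone who hits the same thing: hostname labels are only supposed to contain letters, digits and hyphens (not at the start or end), up to 63 characters each, so an underscore breaks the rule. A quick Python sketch to check a name before using it is below; the function name is made up and "my_office.lan" is only an invented example, not the domain I actually had.

```python
import re

# One hostname label: letters, digits and hyphens, 1 to 63 characters,
# and it may not begin or end with a hyphen.
LABEL_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def invalid_labels(fqdn):
    """Return the labels of fqdn that break the letters-digits-hyphen rule."""
    labels = fqdn.rstrip(".").split(".")
    return [label for label in labels if not LABEL_RE.fullmatch(label)]


print(invalid_labels("pc1.my_office.lan"))  # hypothetical name with an underscore
```

An empty list means every label passes this check.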

 

Thanks for all the suggestions :)
