BOINC linux issues somewhat resolved



Bok
09-12-2005, 11:34 AM
Ok, I was getting somewhat tired of some of my machines not appearing to run, or showing up in the 'machines' list on the boinc websites as 'localhost' etc., so I started doing some investigation.

I figured boinc would be getting the name of the machine from /etc/hosts so I checked these first..

I'm fairly lazy so when I install a cruncher more often than not /etc/hosts just contains a single line

127.0.0.1 localhost localhost.localdomain

i.e. I do nothing with it :)

boinc then names this as localhost with an IP of 127.0.0.1 (though sometimes it uses my DNS server for some reason!)

I went through and added for example

192.168.1.20 Blade20

This fixed all of those issues once I ran boinc with -update_prefs
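A quick way to spot boxes that still only have the loopback entry is sketched below. The helper name has_real_entry is made up for illustration and is not part of boinc; it only reads a hosts file and reports whether the given name is mapped to a non-loopback address.

```shell
# has_real_entry FILE NAME
# Succeeds (exit 0) if FILE maps NAME to an address other than 127.x.x.x or ::1.
# Hypothetical helper for illustration only.
has_real_entry() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        $1 !~ /^127\./ && $1 != "::1" {          # ignore loopback entries
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Something like `has_real_entry /etc/hosts "$(hostname)" || echo "only a loopback entry here"` could then be run on each cruncher before doing the -update_prefs.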

A worse issue showed up when I was checking what the project id of each box was. For Einstein I would do this

cat /BOINC/sched_request_einstein.phys.uwm.edu.xml | grep hostid

in a number of cases this would return a hostid of 0
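To run that check across several boxes without eyeballing the grep output, something like the following might do. It is only a sketch: the function names are invented, and it assumes the request file holds the id as a single-line <hostid>N</hostid> element, which is what the grep above suggests.

```shell
# hostid_of FILE
# Prints the number inside the first <hostid>...</hostid> element of FILE.
# Assumes the element sits on one line.
hostid_of() {
    sed -n 's/.*<hostid>\([0-9][0-9]*\)<\/hostid>.*/\1/p' "$1" | head -n 1
}

# needs_update FILE
# Succeeds if the host id is missing or 0, the broken case described above.
needs_update() {
    _hostid=$(hostid_of "$1")
    [ -z "$_hostid" ] || [ "$_hostid" -eq 0 ]
}
```

For example, `needs_update /BOINC/sched_request_einstein.phys.uwm.edu.xml && echo "hostid is 0, fix hosts and update"`.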

Once I had fixed the hosts and also made sure my /etc/resolv.conf was pointing to a valid DNS server (some were very old..:) ) doing the update_prefs fixed the hostid and in some cases returned up to 6 results for einstein.

The result of all this is that my computers are now, for the most part, showing correctly on the einstein site and are all communicating correctly, though I still have to find a couple of missing ones :p And my pending credit just jumped by 1500 or so..

Bok

PCZ
09-14-2005, 03:50 AM
Bok

I also ran into the problem of all my PXE nodes being called localhost.
Basically I went down the same road you did.

I made sure that resolv.conf had an entry for a valid DNS server and that the DNS server had records for all the hosts.
That didn't cure it though. It seems boinc looks in the hosts file.
It did fix the delay I had been experiencing when telnetting in, though.


The PXE nodes all share the same image and consequently the same hosts file, so I had to find a way to change the hosts file at bootup.

The PXE nodes set up a small ramdrive at bootup mounted as /tmp.
The hosts file is written there and symlinked from /etc/hosts.

A HOSTNAME variable is set up in rc.local
HOSTNAME=`hostname`

An IP variable is also set up in rc.local
IP=`ifconfig eth0 | grep 'inet addr'| awk '{print $2}'|sed -e "s/addr\://"`

An echo command writes localhost info to /tmp/hosts
echo "127.0.0.1 localhost" >/tmp/hosts

Another echo command writes host IP and hostname to the second line of the hosts file.
echo "${IP} ${HOSTNAME}" >>/tmp/hosts

The hostname is also written to a network file
echo "HOSTNAME = ${HOSTNAME}" >/tmp/network

with a symlink in /etc.
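The two echo steps for the hosts file could be wrapped up along these lines. This is a rough sketch, not what is actually in the rc.local described above: write_hosts is an invented name, and the empty-value check is an addition so that a failed lookup at boot does not leave a broken second line.

```shell
# write_hosts TARGET IP NAME
# Writes a two-line hosts file to TARGET: loopback first, then the real address.
# Returns 1 without touching TARGET if IP or NAME is empty.
write_hosts() {
    [ -n "$2" ] && [ -n "$3" ] || return 1
    {
        echo "127.0.0.1 localhost"
        echo "$2 $3"
    } > "$1"
}
```

In rc.local it would be called as `write_hosts /tmp/hosts "$IP" "$HOSTNAME"` after the two variables have been set.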

BTW
The hostname info comes from the dhcp server.

Hope the above info can help other folks using diskless nodes with boinc.

Edited to add IP variable.
Thanks Bok

Bok
09-14-2005, 08:01 AM
If you add another line in /etc/hosts with the IP address and the name, boinc uses that one and also the IP address on the machines page on the project is correct too.

Not necessary for DHCP obviously.

Bok

PCZ
09-14-2005, 03:57 PM
So I need to find some way to get the IP echoed to /tmp/hosts.

Bok
09-14-2005, 04:06 PM
Originally posted by PCZ
So I need to find some way to get the IP echoed to /tmp/hosts.

I take it at that point you have got your IP address?

try this

echo `ifconfig | grep inet|grep Bcast | cut -c21-33` $HOSTNAME >> /tmp/hosts

might need tweaking ever so slightly if the output is different.
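Since the cut columns depend on exactly how the output is laid out, a variant that matches on the text instead of the character positions may need less tweaking. The sketch below assumes the older net-tools layout ("inet addr:A.B.C.D  Bcast:...") that the one-liner above relies on; the function name ip_from_ifconfig is invented.

```shell
# ip_from_ifconfig
# Reads ifconfig output on stdin and prints the first non-loopback IPv4 address.
# Assumes lines of the form "inet addr:A.B.C.D ..." (older net-tools output).
ip_from_ifconfig() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p' | grep -v '^127\.' | head -n 1
}
```

Used as `echo "$(ifconfig | ip_from_ifconfig) $HOSTNAME" >> /tmp/hosts`.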

Bok

PCZ
09-14-2005, 04:49 PM
Think I have it.

HOSTNAME=`hostname`
IP=`ifconfig eth0 | grep 'inet addr'| awk '{print $2}'|sed -e "s/addr\://"`
echo "127.0.0.1 localhost" >/tmp/hosts
echo "${IP} ${HOSTNAME}" >>/tmp/hosts


hosts file contains these lines:

127.0.0.1 localhost
172.31.158.103 ws003

PS
That IP= line was hard work for a windows admin ;)

Edited to remove hostname from first line of hosts file.

Bok
09-14-2005, 04:57 PM
That'll work.

Now check one after you've done an update_prefs. You may need to not have the hostname in the 127.0.0.1 line, I'm not entirely sure of that.

Bok

p.s. nice IP naming :) You couldn't just go for 192.168.x..x ? :rotfl:

PCZ
09-14-2005, 05:09 PM
It's part of a 172.31.128.0/19 netblock that I manage.

Still getting 127.0.0.1 for the IPs.

I will change the hosts file a bit to see if the real IPs for my PXE nodes can be displayed.

Edit

Adding a second line to the hosts file works ;)
IPs are now correct on my linux boxes.

The thing that was throwing a spanner in the works is that boinc only adds the IP to client_state.xml when the file is first created.

To change the IP you either have to edit the xml file or delete it and let boinc set itself up again.
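Before editing or deleting the file it may be worth checking what is actually recorded in it. The sketch below is a guess at how to do that: it assumes the address is kept in a single-line <ip_addr> element, so check your own client_state.xml for the real element name first. Both function names are invented.

```shell
# recorded_ip FILE
# Prints the contents of the first <ip_addr>...</ip_addr> element of FILE.
# The element name is an assumption; adjust it to match your file.
recorded_ip() {
    sed -n 's/.*<ip_addr>\([^<]*\)<\/ip_addr>.*/\1/p' "$1" | head -n 1
}

# ip_is_stale FILE CURRENT_IP
# Succeeds if FILE records an address different from CURRENT_IP.
ip_is_stale() {
    [ "$(recorded_ip "$1")" != "$2" ]
}
```

For example, `ip_is_stale client_state.xml "$IP" && echo "recorded IP is out of date"`.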

So the first line of the hosts file is
127.0.0.1 localhost

Don't add the hostname in that first line.

Second line is
host ip hostname

example
127.0.0.1 localhost
192.168.0.1 host1

MerePeer
11-07-2005, 08:58 PM
Originally posted by Bok
I'm fairly lazy so when I install a cruncher more often than not /etc/hosts just contains a single line

127.0.0.1 localhost localhost.localdomain

i.e. I do nothing with it :)

boinc then names this as localhost with an IP of 127.0.0.1 (though sometimes it uses my DNS server for some reason!)

I went through and added for example

192.168.1.20 Blade20

This fixed all of those issues once I ran boinc with -update_prefs


I hit this too. My hosts were:

127.0.0.1 localhost ComputerName

I reversed the names (don't need IPs; they are dhcp-served):

127.0.0.1 ComputerName localhost

Which appears to have worked -- for boincview and wcg at least -- although I'm hoping I didn't mess up any other local apps.
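A likely reason the reversal matters: in a hosts file the first name after the address is the canonical name and any further names on the line are aliases, so a lookup of 127.0.0.1 now comes back as ComputerName. Whether boinc does exactly this kind of lookup is a guess. The sketch below just shows which name is first for a given address; canonical_name is an invented helper.

```shell
# canonical_name FILE IP
# Prints the first name listed for IP in FILE (the canonical name).
# Any further names on that line are aliases and are ignored here.
canonical_name() {
    awk -v ip="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        $1 == ip && NF >= 2 { print $2; exit }   # first name on first match
    ' "$1"
}
```

Running `canonical_name /etc/hosts 127.0.0.1` before and after the change should show the difference.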