Well, I've now fixed it, but I'm still not too sure what is happening.
omarfarid: I've tried the input redirection from /dev/null, but I'm still getting the same results. The process doesn't require any input other than the arguments it is called with. Either way, even with the redirection, it was still stopping randomly.
woolmilk: It doesn't look like it has anything to do with ssh being unable to log in. When that happens (and it does), the expect script errors out and times out, and script 1 moves on to the next item in the for loop. Also, when it freezes, I do a ps aux to see exactly which command is running and with which arguments. When I run that command manually, it works perfectly.
The way I found the issue was by putting echo statements all over the script to see exactly where it stops. I then looked at the echoed strings to identify the exact point.
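A minimal sketch of that kind of echo tracing, for anyone wanting to try the same approach. The script2 function, the trace helper, and the host names are hypothetical stand-ins, not the real scripts:

```shell
#!/bin/sh
# Hypothetical stand-in for script 2; the real one drives ssh through expect.
script2() {
    echo "script2 called with: $*"
}

# Trace helper: writes a timestamped marker to stderr so it stays
# separate from the normal output of the scripts.
trace() {
    echo "[trace $(date +%H:%M:%S)] $*" >&2
}

for host in hostA hostB hostC; do   # placeholder items
    trace "before script2 for $host"
    script2 "$host" < /dev/null
    trace "after script2 for $host (exit status $?)"
done
trace "loop finished"
```

The last marker printed before a stall shows which command never returned.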
It turned out to be in the for loop: the first command in there calls script 2 with the arguments, and this runs and immediately zombie-fies.
As there is nothing wrong with script 2, or with the way that it's called, and it is apparently stalling at random intervals, I thought it might be a timing issue, so I put a "sleep 1" as the first command in the for loop. This now appears to work perfectly - it never stalls.
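The shape of the workaround, as a rough sketch. The script2 function, the DELAY variable, and the item names are illustrative assumptions; only the "sleep 1" at the top of the loop comes from the actual script:

```shell
#!/bin/sh
# Hypothetical stand-in for script 2.
script2() {
    echo "processing $1"
}

DELAY=1    # seconds to pause before each call, matching the "sleep 1" above

runs=0
for item in itemA itemB; do     # placeholder items
    sleep "$DELAY"              # the workaround: pause before launching script 2
    script2 "$item" < /dev/null
    runs=$((runs + 1))
done
echo "completed $runs runs"
```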
So my question now is, why is it doing this? Why do I need a "sleep 1" in there? In theory I could do without the for loop and have all these processes running at the same time. They would not interfere with each other, as they run independently.
If it makes any difference, I am running this on Red Hat AS 4, 32-bit. Do you think I am running out of file descriptors or some other resource, so that the process has to stop until more become available?
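One way to test the resource theory would be to run something like the following while the loop is stalled. This is only a sketch, assuming a Linux /proc filesystem and a procps-style ps; the function names are made up for illustration:

```shell
#!/bin/sh
# Soft limit on open file descriptors for this shell and its children.
fd_limit() {
    ulimit -n
}

# Number of descriptors currently open in the given PID (Linux /proc only).
# Prints 0 if the directory cannot be read.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the ps header plus any process whose state starts with Z (zombie).
list_zombies() {
    ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
}

echo "fd limit:   $(fd_limit)"
echo "fds in use: $(fd_count $$)"
list_zombies
```

If the count in use stays far below the limit during a stall, file descriptors are probably not the bottleneck. The PPID column shows which parent has not yet reaped each zombie.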