The answers are as follows:
• Add an HTTP response header on the app server that tells the client which server processed the request. (IMO this really should only be enabled for debugging.)
• Enable sticky sessions in your load balancer. I don't recommend this, however: it pins each client to one server, so your traffic can't scale naturally across the cluster.
• This sounds like you have poor capacity management. Your cluster should be beefy enough to keep running after losing at least one server. If the current number of servers minus one can't handle the load, then your cluster is too small. For example, if peak load needs three servers, run at least four. A "just barely big enough" cluster may give you some horizontal scaling, but it won't give you fail-over and high availability.
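
For the first point, here is a minimal sketch of one way to do it, assuming the app server runs a Python WSGI application. The header name `X-Served-By`, the helper `add_server_header`, and the demo app `hello_app` are all hypothetical names picked for illustration; your stack will have its own way of setting response headers.

```python
import socket


def add_server_header(app, header_name="X-Served-By", server_id=None):
    """Wrap a WSGI app so every response names the server that handled it.

    header_name and server_id are illustrative assumptions; by default the
    machine's hostname is used as the identifier.
    """
    server_id = server_id or socket.gethostname()

    def middleware(environ, start_response):
        def tagged_start_response(status, headers, exc_info=None):
            # Append the identifying header to whatever the app already set.
            tagged = list(headers) + [(header_name, server_id)]
            return start_response(status, tagged, exc_info)

        return app(environ, tagged_start_response)

    return middleware


def hello_app(environ, start_response):
    # Stand-in for the real application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


# Wrap the app once at startup; ideally only when a debug flag is enabled.
application = add_server_header(hello_app)
```

With this in place, inspecting the response headers (for example with `curl -i`) shows which server answered each request, without needing sticky sessions.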