Scientific paper
2003-06-12
Computer Science
Distributed, Parallel, and Cluster Computing
5 pages, Proceedings for the CHEP 2003 conference, La Jolla, California, March 24 - 28, 2003
Having built up Linux clusters to more than 1000 nodes over the past five years, we already have practical experience confronting some of the LHC-scale computing challenges: scalability, automation, hardware diversity, security, and rolling OS upgrades. This paper describes the tools and processes we have implemented, working in close collaboration with the EDG project [1], especially with the WP4 subtask, to improve the manageability of our clusters, in particular in the areas of system installation, configuration, and monitoring. In addition to the purely technical issues, providing shared interactive and batch services that can adapt to the diverse and changing requirements of our users is a significant challenge. We describe the developments and tuning that we have introduced on our LSF-based systems to maximise both responsiveness to users and overall system utilisation. Finally, we describe the problems we face in enlarging our heterogeneous Linux clusters, the progress we have made in dealing with the current issues, and the steps we are taking to gridify the clusters.
Bahyl Vladimir
Chardi Benjamin
Eldik Jan van
Fuchs Ulrich
Kleinwort Thorsten
Installing, Running and Maintaining Large Linux Clusters at CERN