
Networked communities

3.8 Local network orientation and analysis

3.8.9 Distributed filesystems and mirroring

• To instantly be able to identify the correct locations of files on backup tapes, without any special labelling of the tapes (see section 12.3.3).

System administrators are well known for strong opinions, and many practicing system administrators will strongly disagree with this practice. However, one should have an excellent reason to ignore a systematic approach.

• Users’ home directories.

• Software or binary data (architecture specific).

• Other common data (architecture unspecific).

Since users normally have network accounts which permit them to log onto any host in the network, user data clearly have to be made available to all hosts.

The same is not true of software, however. Software only needs to be shared between hosts running comparable operating systems. A Windows program will not run under GNU/Linux (even though they share a common processor and machine code), nor will an SCO Unix program run under FreeBSD. It does not make sense to share binary filesystems between hosts, unless they share a common architecture. Finally, sharable data, such as manual information or architecture independent databases, can be shared between any hosts which specifically require access to them.
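As an illustration of this rule, the following Python sketch groups hosts by operating system and architecture, so that only hosts in the same group are candidates for sharing a binary filesystem. The host names and the (operating system, architecture) labels are invented for the example; a real site would take them from its own inventory.

```python
from collections import defaultdict

# Hypothetical inventory: host name -> (operating system, architecture).
EXAMPLE_HOSTS = {
    "alpha": ("GNU/Linux", "x86_64"),
    "beta": ("Windows", "x86_64"),
    "gamma": ("GNU/Linux", "x86_64"),
    "delta": ("FreeBSD", "x86_64"),
}


def group_for_binary_sharing(hosts):
    """Group hosts that could share one binary filesystem.

    Two hosts land in the same group only if both the operating
    system and the architecture labels match exactly.
    """
    groups = defaultdict(list)
    for name, (os_name, arch) in hosts.items():
        groups[(os_name, arch)].append(name)
    return {key: sorted(names) for key, names in groups.items()}


if __name__ == "__main__":
    for key, names in group_for_binary_sharing(EXAMPLE_HOSTS).items():
        print(key, names)
```

In this invented inventory, alpha and gamma fall into one group, while beta and delta each stand alone even though all four hosts use the same processor type. Users' home directories and architecture independent data need no such grouping, since they can be offered to any host.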

How are network data shared? There are two strategies:

• Use of a shared filesystem (e.g. NFS, AFS or Novell Netware).

• Remote disk mirroring.

Using a network filesystem is always possible, and it is a relatively cheap solution, since concentrating the data on just a few servers minimizes the amount of disk space required to store them. The main disadvantage of a network filesystem is that network access is usually much slower than local disk access: the network itself is slower than a disk, and a server has to talk to many clients concurrently, which introduces contention for resources. Even with the aggressive caching schemes used by some network filesystems, there is usually a noticeable difference between loading files from the network and loading files locally.

Bearing in mind the principles of the previous section, we would like to minimize load on the network if possible. A certain amount of network traffic can be avoided by mirroring software rather than sharing with a network filesystem. Mirroring means copying every file from a source filesystem to a remote filesystem. This can be done during the night when traffic is low and, since software does not change often, it does not generate much traffic for upgrades after the initial copy.

Mirroring is cheap on network traffic, even during the night. During the daytime, when users are accessing the files, they collect them from the mirrors. This is faster and requires no network bandwidth at all.

Mirroring cannot be applied to users’ files, since they change too often while users are logged onto the system, but it applies very well to software. If we have disk space to spare, then mirroring software partitions can relieve the load of sharing.

There are various options for disk mirroring. On Unix hosts we have rdist, rsync and cfengine; variations on these have also been discussed [264, 309, 117, 98].

For security reasons, the use of rdist can no longer be recommended (see section 6.5.6). Cfengine can also be used on Windows. Network filesystems can be used for mirroring, employing only standard local copy commands; filesystems are first mounted and then regular copy commands are used to transfer the data as if they were local files.
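The mount-then-copy approach can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for rsync or cfengine: the function name mirror_tree is made up for the example, the source is assumed to be an already mounted directory, only regular files are handled, and files that have disappeared from the source are not removed from the mirror.

```python
import filecmp
import os
import shutil


def mirror_tree(source_root, mirror_root):
    """Copy new or changed files from source_root into mirror_root.

    source_root is assumed to be a mounted (possibly remote) filesystem
    and mirror_root a local directory. Returns the sorted list of
    relative paths that were copied on this run.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(mirror_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            # With shallow=True, files with the same size and modification
            # time are treated as identical; otherwise contents are compared.
            if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=True):
                continue
            # copy2 preserves the modification time, so an unchanged file
            # is recognized as up to date on the next run.
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

Run nightly, a routine of this kind transfers every file on the first pass and afterwards only those that have changed, which is why mirroring software generates little traffic after the initial copy.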

The benefits of mirroring can be considerable, but it is seldom practical to give every workstation a mirror of software. A reasonable compromise is to have a group of file-servers, synchronized by mirroring from a central source. One would expect to have at least one file-server per subnet, to avoid router traffic, money permitting.
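To make the one-file-server-per-subnet idea concrete, here is a small Python sketch that picks the mirror on a client's own subnet and falls back to the central source otherwise. The server names, the subnets and the function choose_file_server are all hypothetical, chosen only for illustration.

```python
import ipaddress

# Hypothetical layout: one mirrored file-server per subnet.
MIRRORS = {
    "192.0.2.0/25": "fs-a.example.org",
    "192.0.2.128/25": "fs-b.example.org",
}

# Hypothetical central source from which the mirrors are synchronized.
CENTRAL_SOURCE = "master.example.org"


def choose_file_server(client_ip, mirrors=MIRRORS, fallback=CENTRAL_SOURCE):
    """Return the file-server on the client's subnet, if there is one.

    A client that is not on any listed subnet is sent to the central
    source, which means its traffic has to cross a router.
    """
    addr = ipaddress.ip_address(client_ip)
    for subnet, server in mirrors.items():
        if addr in ipaddress.ip_network(subnet):
            return server
    return fallback
```

With this table, a client at 192.0.2.4 would be directed to fs-a.example.org, while a client on an unlisted subnet would fall back to master.example.org.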

Exercises

Self-test objectives

1. What is the main principle at work in any cooperative enterprise, such as a network or community with limited resources?

2. Explain the role of policy in a community.

3. Are rules meant for humans comparable to rules meant for machines? Explain.

4. Describe the social community structures in a human–computer system.

5. What consequences result from placing a computer in an environment that is controlled by external parties?

6. What are the pros and cons of making a network completely uniform in the choice of hardware and software?

7. Explain how patterns of user behavior have a direct and measurable effect on a computer system.

8. Explain the pros and cons of centralization versus delegation in a system.

9. List the different identifiers that label a computer.

10. How does a computer know its IP address?

11. How does a computer know its Ethernet address?

12. What is a MAC address?

13. What is the service that relates Internet Domain Names to IP addresses?

14. What is the service that relates IP addresses to MAC addresses?

15. Describe alternative models for organizing network resources.

16. What is meant by a ‘server host’ and how is it different from a ‘server’?

17. How are user preferences stored on Unix and Windows?

18. How would you go about mapping out an existing Local Area Network to find out how it worked?

19. Name the most common network services that most Local Area Networks implement.

20. Why is it important to know what software and hardware is running across a network that you are responsible for?

21. What is usually meant by a ‘resolver’?

22. What tools can you use to find out the IP address of a host?

23. What tools can you use to find out the IPv6 address of a host?

24. How would you find out the domain that a given IP address belongs to?

25. How would you find out the domain that a given IPv6 address belongs to?

26. How would you get in touch with the Network or System Administrator who was responsible for a particular IP address?

27. Explain what the ping program does.

28. Explain what the Unix program traceroute and Windows program tracert do.

29. How would you go about trying to locate the World Wide Web server of a network that you were not familiar with? (Would the same method work for other services like E-mail or FTP?)

30. Why is computer clock synchronization important? How can this be achieved?

31. What is meant by a Uniform Resource Locator (URL) and how can this be used to create a systematic naming scheme for network resources?

32. What is meant by dependency amongst computers and services? What are the pros and cons of dependency?

Problems

1. Use the ping and ping6 commands to ping different IP addresses on your network (note that these differ somewhat on different platforms – the examples here are from GNU/Linux). Try pinging the addresses repeatedly with a large packet size (9064 bytes):

ping -s 9064 192.0.2.4

2. What are the advantages and disadvantages of making access to network disks transparent to users? Discuss this in relation to the reliability of hosts.

3. What is meant by a name service? Name two widely used name services that contain IP addresses and one that contains Ethernet addresses.

4. What is the Domain Name Service? How do hosts depend on this service? Suppose that the data in the DNS could be corrupted. Explain how this could be a security risk.

5. In what way is using a name service better than using static host tables? In what way is it worse?

6. Draw a diagram of the physical topology of your local network, showing routers, switches, cables and other hardware.

7. Determine all of the subnets that comprise your local network. (If there are many, consider just the closest ones to your department.) What is the netmask on these subnets? (You only need to determine the subnet mask on a representative host from each subnet, since all hosts must agree on this choice. Hint: try ifconfig -a.)

8. If the network xxx.yyy.74.mmm has subnet mask 255.255.254.0, what can you say about the subnet mask for the addresses on xxx.yyy.75.mmm? (Hint: how many hosts are allowed on the subnet?) Which IP addresses does the subnet consist of?

9. If the network xxx.yyy.74.mmm has subnet mask 255.255.255.0, what can you say about the subnet mask for the addresses on xxx.yyy.75.mmm?

10. Using dig or nslookup, determine the answers to the following questions:

(a) What is the IP address of the host www.gnu.org?

(b) What are the names of the nameservers for the domain gnu.org?

(d) What is the name of the mail exchanger for the domain iu.hio.no?

11. The purpose of this problem is to make you think about the consequences of cloning all hosts in a network, so that they are all alike. The principles apply equally well to other societies. Try not to get embroiled in politics, concentrate on practicalities rather than ideologies.

(a) Discuss the pros and cons of uniformity. In a society, when is it advantageous for everyone in a group to have equal access to resources? In what sense are they equal? What special characteristics will always be different, i.e. why are two persons never completely equal? (e.g. their names are different)

(b) When is it advantageous for some members of a community to have more resources and more power than others? You might like to consider what real power is. For instance, would you say that garbage disposal workers and water engineers have power in a society? What does this tell you about the organization of privilege within a human–computer system?

(d) What is meant by dependency? How does delegation lead to dependency? Can you foresee any problems with this, for network efficiency?

(e) What is meant by a network service? What issues can you identify that should be considered when deploying a new network service?

(f) Discuss each of the above points in connection with computers in a network.

12. Design a universal naming scheme for directories, for your site. Think about what types of operating system you have and how the resources will be shared; this will affect your choices. How will you decide drive names on Windows hosts?

13. What are ARP and RARP? Why can’t we use Ethernet addresses instead of IP addresses to send data from one side of the planet to the other? Could IP addresses eliminate Ethernet addresses? Why do we need both these addresses?

14. At some sites, it was common practice to use remote mirroring to synchronize the system disks or filesystems of hosts, where compiled software had been mixed in with the operating system’s own files. This solves the problem of making manual changes to one host, and keeping other hosts the same as the source machine. Discuss whether this practice is advisable, with respect to upgrades of the operating system.

15. Discuss the pros and cons of the following advice. Place all file-servers which serve the same data on a common host, e.g. WWW, FTP and network file systems serving user files. Place them on the host which physically has the disks attached. This will save an unnecessary doubling of network traffic and will speed up services. A fast host with a lot of memory and perhaps several CPUs should be used for this. Explain how the optimal answer depends on the hardware one has available.

16. Prepare a sample of what you consider to be the main elements of a system policy. Swap your answers with classmates and review each other’s answers.
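For the subnet questions in Problems 7 to 9, it can help to check one's reasoning mechanically. The sketch below uses Python's standard ipaddress module to work out which subnet an address belongs to under a given netmask. The helper names and the 10.20.x.x addresses are arbitrary choices for illustration and are not taken from the problems themselves.

```python
import ipaddress


def subnet_of(address, netmask):
    """Return the network that `address` belongs to under `netmask`."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


def same_subnet(addr_a, addr_b, netmask):
    """True if both addresses fall in the same subnet under `netmask`."""
    return subnet_of(addr_a, netmask) == subnet_of(addr_b, netmask)


if __name__ == "__main__":
    # Arbitrary private addresses, used only as an example.
    net = subnet_of("10.20.31.7", "255.255.254.0")
    print(net, net.num_addresses)
    print(same_subnet("10.20.30.1", "10.20.31.1", "255.255.254.0"))
    print(same_subnet("10.20.30.1", "10.20.31.1", "255.255.255.0"))
```

Comparing the two calls to same_subnet shows how a single bit in the netmask decides whether two neighbouring address ranges form one subnet or two.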

Chapter 4
