Monday, October 17, 2005

Designing Reusable Classes

Excerpts from the article Designing Reusable Classes.

Attributes of object-oriented languages promote reusable software. Data abstraction encourages modular systems that are easy to understand. Inheritance allows subclasses to share methods defined in superclasses, and permits programming-by-difference. Polymorphism makes it easier for a given component to work correctly in a wide range of new contexts. The combination of these features makes the design of object-oriented systems quite different from that of conventional systems.

As with any design task, designing reusable classes requires judgement, experience, and taste. However, this paper has organized many of the design techniques that are widely used within the object-oriented programming community so that new designers can acquire those skills more quickly.

An abstract class never has instances, only its subclasses have instances. The roots of class hierarchies are usually abstract classes, while the leaf classes are never abstract. Abstract classes usually do not define any instance variables. However, they define methods in terms of a few undefined methods that must be implemented by the subclasses.

A class that is not abstract is concrete. In general, it is better to inherit from an abstract class than from a concrete class. A concrete class must provide a definition for its data representation, and some subclasses will need a different representation. Since an abstract class does not have to provide a data representation, future subclasses can use any representation without fear of conflicting with the one that they inherited.

The top of a large class hierarchy should almost always be an abstract class, so the experienced programmer will then try to reorganize the class hierarchy and find the abstract class hidden in the concrete class. The result will be a new abstract class that can be reused many times in the future.

Object-oriented design starts with objects. Booch suggests that the designer start with a natural language description of the desired system and use the nouns as a starting point for the classes of objects to be designed [Booch 1986]. Each verb is an operation, either one implemented by a class (when the class is the direct object) or one used by the class (when the class is the subject). The resulting list of classes and operations can be used as the start of the design process.

It can be difficult to decide whether an operation should be implemented as a method in a class or as a separate class. Halbert and O'Brien discuss this problem at length [Halbert & O'Brien 1987]. In general, there is no absolute way to decide, but positive answers to the following questions all indicate that a new class should be created.

  1. Is the operation a meaningful abstraction?
  2. Is the operation likely to be shared by several classes?
  3. Is the operation complex?
  4. Does the operation make little use of the representation of its operands?
  5. Will relatively few users of the class want to use the operation?

It can also be difficult to decide which class should implement an operation. Operations with several arguments can frequently be implemented as methods in the classes of any of its arguments. The rules listed above can also be used to make this decision. For example, if an operation does not send messages to an object or access its instance variables then it should not be in the object's class.

Rules for Finding Abstract Classes

Rule 5: Class hierarchies should be deep and narrow.

A well developed class hierarchy should be several layers deep. A shallow class hierarchy is evidence that change is needed, but does not give any idea how to make that change.

An obvious way to make a new superclass is to find some sibling classes that implement the same message and try to migrate the method to a common superclass. Of course, the classes are likely to provide different methods for the message, but it is often possible to break a method into pieces and place some of the pieces in the superclass and some in the subclasses.

Rule 6: The top of the class hierarchy should be abstract.

Rule 7: Minimize accesses to variables.

Since one of the main differences between abstract and concrete classes is the presence of data representation, classes can be made more abstract by eliminating their dependence on their data representation. One way this can be done is to access all variables by sending messages. The data representation can be changed by redefining the accessing messages.

Rule 8: Subclasses should be specializations.

There are several different ways that inheritance can be used [Halbert & O'Brien 1987]. Specialization is the ideal that is usually described, where the elements of the subclass can all be thought of as elements of the superclass. Usually the subclass will not redefine any of the inherited methods, but will add new methods. An important special case of specialization is making concrete classes. Since an abstract class is not executable, making a subclass of an abstract class is different from making a subclass of a concrete class. The abstract class requires its subclasses to define certain operations, so making a concrete class is similar to filling in the blanks in a program template. An abstract class may define some operations in an overly general fashion, and the subclass may have to redefine them.

In the middle of a project, it may be useful for a designer to make subclasses that are not specializations of their superclasses. If the superclasses are poorly designed, it might take a great deal of work to determine the proper design of the class hierarchy. It is better to forge ahead and then to later reorganize the classes. Thus, a subclass might be more general than its superclass or might have little relationship to its superclass other than borrowing code from it.

No comments: