This was an interesting paper, because I never knew there were metrics for software. I had always thought of software as something that is hard to judge, almost like an art piece: you can't really score an art piece, only make observations about it. This paper proposes a new suite of metrics for evaluating object-oriented (OOP) designs.
The author attempts to improve on previous metrics, which he believes were not mathematically rigorous. He then uses a set of criteria defined by Weyuker to show that his metrics are better. However, I think there is still a way to go before metrics become something people can agree upon. Weyuker's criteria use the notions of monotonicity, interaction, noncoarseness, nonuniqueness, and permutation. To show how these apply, he lists several examples. Nonuniqueness, for example, captures the idea that two different classes can have the same metric value, meaning they are equally complex; the metric cannot be such that every class gets a different value. Noncoarseness is the other side of this: the metric cannot give every class the same value either. I find this useful, because a class can differ in complexity just by being named differently. For example, two classes with the same functionality but different names (one descriptive, the other cryptic) could have different metric values, and nonuniqueness and noncoarseness together leave room for that.
One criterion I personally don't agree with is monotonicity. It says that given a class P and a metric u, u(P) <= u(P+Q), where Q is another class and P+Q is the combination of the two. The idea is that adding another class can only increase complexity. But I can think of cases where adding a class clarifies what the other classes are doing. For example, if class P were draw and class Q were deckofcards, it would be obvious what the whole system is doing, whereas class P by itself is ambiguous: draw could refer to drawing a picture or drawing a card.
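To make the criteria a little more concrete for myself, here is a minimal sketch. It is not from the paper: modeling a class as a set of method names, treating P+Q as the union of those sets, and using "number of methods" as the metric u are all my own assumptions for illustration.

```python
# Toy model (assumption, not the paper's formalism): a class is a frozenset
# of method names, and the combination P + Q is the union of the two sets.

def u(cls: frozenset) -> int:
    """Toy metric: the number of methods in the class."""
    return len(cls)

def combine(p: frozenset, q: frozenset) -> frozenset:
    """Hypothetical stand-in for the P + Q combination."""
    return p | q

P = frozenset({"draw"})
Q = frozenset({"shuffle", "deal"})
R = frozenset({"paint"})

# Monotonicity: combining classes never lowers the metric.
monotonic = u(P) <= u(combine(P, Q)) and u(Q) <= u(combine(P, Q))

# Nonuniqueness: two different classes may share the same value.
nonunique = P != R and u(P) == u(R)

# Noncoarseness: not every class gets the same value.
noncoarse = u(P) != u(Q)

print(monotonic, nonunique, noncoarse)
```

A method count satisfies monotonicity trivially, which is part of my complaint: it has no way to express that adding deckofcards makes draw easier to understand.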
To derive the metrics, he uses Booch's outline of the essential steps of object-oriented design. Booch lists them as: identification of classes, the semantics of classes, the relationships between classes, and the implementation of classes. Hence, all the metrics center on how classes are defined and how they relate to one another. He argues that the central theme of OOP is the class, and that others agree on this, though there are criticisms of the view. He does admit, however, that this suite does not capture the dynamic behavior of a system, since it collects no information about run-time behavior.
For example, one of his metrics is weighted methods per class. This is the sum of the complexities of all the methods in a class. He believes that a higher number means more time and effort to maintain the class, that a larger number means more effect on children, and that more methods mean a very specific class. I don't think the way he determines the complexity of each method is very rigorous.
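As a rough illustration of weighted methods per class, here is a sketch that sums a per-method complexity over a class. The complexity measure (1 plus the number of branching constructs) is my own stand-in, and the names `method_complexity`, `wmc`, and the `Deck` example are hypothetical.

```python
import ast

# Assumption: branching constructs are a reasonable proxy for complexity.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)

def method_complexity(func: ast.FunctionDef) -> int:
    """1 for the method itself plus 1 per branching construct inside it."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))

def wmc(source: str, class_name: str) -> int:
    """Sum the complexities of the methods defined directly in the class."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return sum(method_complexity(m) for m in node.body
                       if isinstance(m, ast.FunctionDef))
    raise ValueError(f"class {class_name!r} not found")

EXAMPLE = '''
class Deck:
    def __init__(self):
        self.cards = []

    def draw(self):
        if not self.cards:
            raise IndexError("empty deck")
        return self.cards.pop()

    def count(self, suit):
        total = 0
        for card in self.cards:
            if card.suit == suit:
                total += 1
        return total
'''

print(wmc(EXAMPLE, "Deck"))  # 1 + 2 + 3 = 6
```

If every method is given a complexity of 1, the metric reduces to a plain method count, which shows how much the result depends on the weighting you pick.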
Second metric: depth of inheritance tree. The basis for this metric is the scope of a class, that is, how much the parent classes may affect the child classes. If the class hierarchy is very deep, a child may inherit a lot of methods from its ancestors, and it becomes hard to predict how the child will behave. I think this is one of the best metrics in the suite. Programmers often think in terms of two or three levels of inheritance, and beyond that I think people forget about the ancestors, so in a deep tree I would forget which behaviors the child has inherited.
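A minimal sketch of how depth of inheritance could be computed for Python classes. Treating a class that derives only from `object` as depth 0, and taking the longest path under multiple inheritance, are my assumptions; the shape classes are made up.

```python
def dit(cls: type) -> int:
    """Length of the longest inheritance path from cls up to a root class.

    Assumption: `object` is ignored, so a class with no explicit parent
    is a root with depth 0.
    """
    parents = [base for base in cls.__bases__ if base is not object]
    if not parents:
        return 0
    return 1 + max(dit(base) for base in parents)

class Shape: pass
class Polygon(Shape): pass
class Rectangle(Polygon): pass
class Square(Rectangle): pass

print(dit(Shape), dit(Square))  # 0 3
```

By the time I am working in `Square`, I have three ancestors' worth of behavior to keep in my head, which is about where I start losing track.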
Third metric: number of children. This can suggest several things. A higher number of children is good in that it means a lot of reuse, but it also means the parent has to be extremely robust, and it suggests a single point of failure. A high count may indicate the need to break the inheritance tree up into different categories to reduce the number of children.
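For Python classes, a sketch of this count can lean on the built-in `__subclasses__()` method, which returns a class's immediate subclasses. The card classes below are invented for the example.

```python
def noc(cls: type) -> int:
    """Number of immediate subclasses (grandchildren are not counted)."""
    return len(cls.__subclasses__())

class Card: pass
class NumberCard(Card): pass
class FaceCard(Card): pass
class Joker(Card): pass
class King(FaceCard): pass  # a grandchild of Card, not a child

print(noc(Card), noc(FaceCard), noc(King))  # 3 1 0
```

A bug in `Card` here would reach all three children and the grandchild, which is the single point of failure I mean.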
Fourth metric: coupling between object classes. This is the number of other classes that a given class is attached to or depends on. High coupling can result in very bad programs and a highly unmodular design: if one class changes, all the classes coupled to it may also have to change to match. This metric is a good indicator of how maintainable the code will be in the long run. If other code has to change every time a small snippet changes, the code will be very hard to maintain, and once it grows large enough it will at some point be beyond maintaining.
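Here is a small sketch of counting coupling from a dependency map. The map itself is made up, and counting a relationship in both directions (a class is coupled to the classes it uses and to the classes that use it) is my reading of the metric rather than something I verified against the paper's exact wording.

```python
# Hypothetical dependency map: class name -> classes whose methods or
# variables it uses.
USES = {
    "Game": {"Deck", "Player", "Card"},
    "Player": {"Card"},
    "Deck": {"Card"},
    "Card": set(),
}

def cbo(cls: str, uses: dict) -> int:
    """Count the other classes that cls uses or is used by."""
    outgoing = uses.get(cls, set())
    incoming = {other for other, targets in uses.items() if cls in targets}
    return len((outgoing | incoming) - {cls})

for name in USES:
    print(name, cbo(name, USES))
```

`Card` uses nothing itself, yet it scores as high as `Game`, because a change to `Card` can ripple into every class that depends on it.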
Fifth metric: response for a class. This refers to the set of responses a class can make to a message. It relates to method overloading and dynamic binding across class hierarchies. If many methods can potentially be executed for a particular message, debugging becomes really hard, and once the code gets large it is very difficult to figure out exactly which method is being called. Some of the worst bugs I've had to deal with revolve around this. I would be coding something and it would behave nothing like I expected, then I would start changing the code to do something ridiculous and get confused about why the behavior wouldn't change. Then I would have to trace the code even further, only to find that it was calling a different method than I expected. Not only does this increase complexity, it also increases the run time, since the program has to determine which method to execute.
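A sketch of one way to count the response set: the class's own methods plus every method they call directly. The call map is written by hand here and is entirely hypothetical; a real tool would have to extract it from the code.

```python
def rfc(calls: dict) -> int:
    """Size of the response set.

    `calls` maps each method of the class to the set of methods it calls
    directly. The response set is the class's methods plus those callees.
    """
    response_set = set(calls)
    for callees in calls.values():
        response_set |= callees
    return len(response_set)

# Hypothetical call map for a Deck class.
DECK_CALLS = {
    "Deck.shuffle": {"random.shuffle"},
    "Deck.draw": {"list.pop", "Deck.is_empty"},
    "Deck.is_empty": {"len"},
}

print(rfc(DECK_CALLS))  # 3 own methods + 3 distinct outside callees = 6
```

The larger this number, the more places I might have to look when a message ends up somewhere I did not expect.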
Last metric: lack of cohesion in methods. He determines the cohesiveness of the methods by looking at which instance variables (the variables local to the class) each method uses. The more of these variables the methods share, the more cohesive they are. He suggests that if cohesion is lacking, the class should be split into more subclasses, because there isn't enough encapsulation. However, I don't particularly agree with the reasoning, because in my experience there are instances where a class is well defined but its methods don't share many variables. I can't quite articulate why I feel this way. Still, I can definitely see how a class becomes bloated when each method works on a different set of variables and the entire program essentially lives under one class.
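A minimal sketch of a cohesion count, under the assumption that we already know which of the class's variables each method uses. It compares pairs of methods: pairs sharing no variable count against cohesion, pairs sharing at least one count for it, and the result is floored at zero. The usage maps are invented.

```python
from itertools import combinations

def lcom(usage: dict) -> int:
    """Disjoint method pairs minus sharing method pairs, never below 0.

    `usage` maps each method name to the set of class variables it uses.
    """
    disjoint = shared = 0
    for vars_a, vars_b in combinations(usage.values(), 2):
        if vars_a & vars_b:
            shared += 1
        else:
            disjoint += 1
    return max(disjoint - shared, 0)

# Hypothetical classes: a tight stack, and a catch-all class.
COHESIVE = {"push": {"items"}, "pop": {"items"}, "size": {"items"}}
BLOATED = {
    "draw_card": {"deck"},
    "draw_picture": {"canvas"},
    "save": {"path"},
    "load": {"path"},
}

print(lcom(COHESIVE), lcom(BLOATED))  # 0 4
```

The bloated class scores 4 because only `save` and `load` share anything, which matches my intuition about one class doing several unrelated jobs.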
These metrics all have to do with how classes are defined. I think class definitions are a very important part of OOP design, but there may be many subtle things outside of class definitions that could also be useful for metrics. A lot of these metrics are common sense to programmers who have been programming for a decent amount of time. Many people implicitly understand these concepts and have probably applied them while programming, but it is still useful to state them explicitly and read about why the decisions we make during class implementation matter. Overall the paper was pretty interesting. It opened me up to a topic I didn't know existed, and it was interesting to read about how he collected data to test his metrics.
https://wwwbroy.in.tum.de/lehre/vorlesungen/vse/WS2004/1994_chambers_metric_suite_oo.pdf