Williams
v. 1.2
2012-08-05
Keywords: VHDL, Verilog, function points, function point analysis, object orientation, object oriented design, project management, software requirements specification, software requirements, software design, programming language, language standard, Backus-Naur format, BNF, program size
Copyright (c) 1998, 2012 by John Michael Williams. All rights reserved.
2012 Preface

This paper was originally prepared in about 1998 for the IEEE OO VHDL Study Group, which ceased its activities a year or so thereafter. The purpose of the group was to investigate the possibility of creating an alternative VHDL language standard which would include object-oriented syntax and semantics. After that time, the author more or less ceased all development in VHDL and worked exclusively in Verilog. In about 2003, the author ceased all design work in favor of the advocacy of Tcl (Tool Command Language) for the benefit of HDL users as a control language to simplify logic synthesis or simulation. Nevertheless, VHDL is an important hardware description language, and the proposal herein is still relevant to the evaluation of the requirements for any VHDL or Verilog design, whether object-oriented or otherwise.

The text of this paper is full of acronyms; to render it more readable while preserving the structure, a list of interpretations is included here:

OO = Object Oriented
VHDL = Very high speed integrated circuit Hardware Description Language
VHDL LRM = VHDL Language Reference Manual
EDA = Electronic Design Automation

Function Point Abbreviations:
FP = function point
LS = language standard
LP = language point
EI = External (transaction-type) Input
EO = External Output
EQ = External Query
GA = generalized application

BNF and Counting Abbreviations:
BNF = Backus-Naur Format
bnR = right-hand side appearance in BNF
Lbn = left-hand side appearance in BNF
ILF = Lbn FP-like Internal Logical File
IC = Lbn Internal Control (as opposed to ILF)
EIF = bnR External Interface File
EC = bnR External Control (as opposed to EIF)
LCS = Language Change Specification
Introduction
Function Point (FP) analysis was devised by Allan Albrecht in the 1970s as a way to improve project management for applications being developed for IBM mainframes. A full explanation and tutorial on the approach may be found in Dreger (1989). Meaningful understanding of the present document requires some FP familiarity, a minimal fraction of which will be provided in this introduction.

Conventional FP analysis is concerned with application development and generally assumes text terminal (tty) user operation; it can be extended easily to graphical user interfaces. Function points measure the size of a program, the measurement being based solely upon the software requirements specification of that program. Skilled FP counters can achieve 95% agreement on independent counts from the same requirements document. If a careful and reasonably complete requirements specification has not been created, FP analysis is not available; and this will make little difference, because the project very likely will not be completed successfully anyway, no matter how long it takes.

In counting the function points of a requirements document, the counter looks at the tasks required of a user of the system and estimates the amount of business functionality the program would provide when completed. Serious errors of estimation result from taking a technical or team-programmer's point of view, as any reader who has worked on a software project would expect. For estimation purposes, the size of the project is just the FP count. Knowing the size accurately, program management can accurately estimate the duration of development and the size of the programming team required to complete it. In this regard, as a rule of thumb, with good requirements and a completed detailed design at hand, a single skilled programmer can implement perhaps a few FPs a week, including all stages of testing and verification. Function points do not measure time or effort, only size.
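The rule of thumb above amounts to simple arithmetic. The following sketch is illustrative only; the productivity figure of three FPs per developer-week is an assumption standing in for "a few FPs a week", not a calibrated value:

```python
def estimate_weeks(fp_count, developers, fp_per_dev_week=3.0):
    """Rough schedule from an FP count: size divided by team throughput.

    fp_per_dev_week is an assumed productivity figure, not a calibrated
    one; real project data should replace it.
    """
    return fp_count / (developers * fp_per_dev_week)

# A hypothetical 270-FP requirements document, three skilled developers:
weeks = estimate_weeks(270, 3)  # 30.0 weeks
```

Note that the estimate scales only with size and team throughput, consistent with the point that function points measure size, not time or effort.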
There is a provision in FP analysis for altering the count because of reuse of preexisting code. Use of this provision might be treated as optional: having counted the FP value of a set of requirements assuming no reuse, it then would seem possible to identify the parts of the required functionality to be implemented by reused code. Subtracting a count of the reuse functionality from the total no-reuse FP count would seem to leave a reasonable measure of the size remaining to be coded. Very importantly, though, the time and effort spent on integration of reused code then must be estimated separately, which adds to the calculations and the likelihood of error. At a minimum, one would have to recount the requirements with reused code more or less removed, but with a reuse factor added, because the integration of reused code generally would increase the size of the final result.

FP analysis is independent of programming language. In this respect, it is much superior to an analysis based solely upon estimated lines of code: lines of code depend on the choice of language and on other arbitrary decisions, such as whether to include comments, whether expressions and statements should be weighted differently, and whether cut-and-paste lines, such as error handling routines, should count. Even if lines of code were reduced to assembly-language lines of code, the choice of CPU instruction set would introduce a language bias totally absent from an FP count. Because FP analysis is possible very early in a software development project, before any software design or implementation has been done, before even a design team has been assembled, it is far more valuable than lines of code in providing the planning basis for the workload and duration of the project.

The textual context of FP analysis, which is just tty application programming requirements, suggests its possible extension to other lexical interface contexts, such as requirements for scanners, parsers, or code generators. In the present work, we shall propose such an extension.
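The optional reuse adjustment just described reduces to the arithmetic sketched below. The integration-overhead fraction is purely an assumed value for illustration, since, as noted above, integration effort must really be estimated separately:

```python
def adjusted_fp(total_fp, reused_fp, integration_fraction=0.25):
    """Subtract the FPs covered by reused code from the no-reuse total,
    then add back an assumed integration overhead proportional to the
    reused portion (the 'reuse factor' mentioned in the text).

    integration_fraction is an illustrative assumption, not a figure
    from the FP literature.
    """
    if reused_fp > total_fp:
        raise ValueError("reused FPs cannot exceed the no-reuse total")
    return (total_fp - reused_fp) + integration_fraction * reused_fp

# 500 FPs counted assuming no reuse, 200 of them covered by reused code:
remaining = adjusted_fp(500, 200)  # 350.0 FPs still to be developed
```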
synthesizer and a simulator should be expressible, somehow, in the LP count. All EDA tools which implemented the same language constructs should have the same LP count. For LP analysis to be as useful and valuable as FP analysis, it must be true in general that two different standard languages should have the same LP count if and only if they each would take the same amount of code to implement, other things being equal.
Figure 1. Relationship of application-level function point analysis (FPA) to EDA language standard usage. I = transaction-type input (EI), O = output (EO), Q = query (EQ). EI, EO, and EQ are commonplace FP analysis abbreviations in which the "E" means "external".

For example, in the case of VHDL, the language constructs supported for synthesis are a subset of those supported for simulation; so, we would expect the size of a VHDL synthesis language standard, as measured by LP count, to be less than that of a VHDL simulation language standard, regardless of the particular synthesizer or simulator involved. In particular, we would expect a text-based VHDL simulator to be bigger, in LP count, than a VHDL synthesizer, even if the latter included multitudinous debugging facilities and a fancy graphical user interface. However, we would expect the FP count of such a simulator to be smaller than that of the synthesizer.
Figure 2. A user-oriented attempt to define the domain of LP analysis: partitioned FPs, language vs. nonlanguage. The application might be, for example, a VHDL simulator. To simplify, only one LS query system is shown for developers and users.

The problem here is that the user in general cannot choose, through the application, whether any particular specified function should be of the language or of the nonlanguage kind. For example, an application which included an input text editor displaying certain keywords in color would be showing language (keyword) and nonlanguage (color) functionality for the same user-visible feature. We might pursue the idea of user partitioning by describing the color-word interaction in terms of the complexity of the associated language function of such an editor. However, it would seem that the software requirements for a typical application would not often specify that the user should be able to identify every LS function in isolation; therefore, in general, the application would not function in a way meeting requirements if it made each of its LS functionalities, as such, visible to the user. It appears that the perspective for a valid LP count will have to be one which ignores the application requirements and therefore the application user.
A Syntax-Oriented Approach
Abandonment of Semantics
The meaning, if any, of a statement in an application programming language must be realizable in some application. But, for any given meaning, we always may specify a different application with that statement bearing the opposite -- or some arbitrary but different -- meaning. Because the user always may be viewed as requiring every visible feature of an application, we assert here that to ignore the user is the same as to ignore every specific application; so, it also is the same as to generalize over all applications in a particular way. At this point, therefore, we assert that by ignoring the application user, and by generalizing over all possible applications, we are factoring out the semantics of the language.
Figure 3 below illustrates a first cut at implementing the Scanner approach. The role of the developer is to develop the generalized application (GA), in part by reference to the LS document, and in part by LP-counting requirements as will be suggested below. The LP count in all other ways will be independent of the developer and of any user requirement.
Figure 3. A syntax-oriented attempt to define the domain of LP analysis. File-like constructs are counted as internal logical files (ILFs) as in FP analysis. A reference to a nonlanguage file would count as an external interface file (EIF) in FP analysis.

In Figure 3, the language functionality partition, for an appropriate construct, may make transactional references to the nonlanguage partition. The function resolver shown provides the transactional functionality for constructs referred outside the generalized application. Also in Figure 3, we assume that a generalized application has been defined and may be partitioned into language functionality vs. nonlanguage functionality. Operationally, this might be done by noting where the developer used the LS query
system, but we ignore the question of how to define this kind of partitioning for the present. We then treat the language and the nonlanguage partitions as two separate "applications" in the sense of FP analysis, with the "application" to be counted being the language partition. Instead of a user and user requirements, we operate the Scanner on the LS document (not the GA). Each construct in the LS then also must be identified in the generalized application. We hope the LS context plus the GA generalization of the construct's application will yield a valid LP value for each language construct in the LS document.

The purpose of the function resolver in Figure 3 is to count application-external file or control functions. During the LP counting, the Scanner might encounter LS constructs implying such functionality; we resolve the functionality over the external interface to the function resolver shown. If necessary, the requirements defining the GA might be used to define substructure of the function resolver; it is unclear at present whether such substructure ever would be necessary. We also may count file or control functions over the application-application interface to the nonlanguage partition, as certain language constructs might demand. These last accesses would be internal transactions, if viewed in terms of the generalized application as a whole (see Fig. 3).
We note that in this scheme, the semantics are attached to the language by association of attributes. We have only trivial need for semantics in the current approach; we will define our GA along lines analogous to a dependency graph. We assume that the main development effort (for use of the LS in writing an EDA application) in relation to syntax would be at the parse stage, including input scanning, and that a parsed language element will require coding which will depend in size (a) upon the element's allowed position(s) in the set of all trees; and, (b) upon the element's intrinsic functionality. We use the (a) factor in defining the GA. In regard to (a), we assume that the LP size of a parsed element will, for such an element encountered during any arbitrary language use, be equal to its size as encountered in the LS in the context of greatest visibility (scope) of that element allowed in the language. Otherwise, the
dependencies of the GA generally could not be made to match those specified in the LS. In regard to (b), we assume that the LP count for any language element in isolation may be determined in a way analogous to FP counting of similar elements in function point analysis. For the present, this will involve use of the fully-defined GA and only enough semantics to distinguish "file-like" vs. "control-like" usage of such elements.
Because the count is done on the LS, not the GA, multiple instances or contexts of a construct in the application are counted as many times as named by the LS -- each time at the scope in the GA in which they have been introduced.

For example, consider the VHDL LRM. Suppose we wish to count the LP value of an entity: we add it to an initially empty GA; however, to do this, we first must add a compilation unit, say, as a disc file. We now have a GA consisting of a file containing a VHDL entity. The LP count for the entity will be determined by the LP count value of an entity in a file. If now we wanted to count LP for an architecture, we would have the option of adding it to the GA either in the same or a different compilation unit. We choose to add the architecture to the same compilation unit, because the scope of a compilation unit including an entity and an architecture is greater than that of a compilation unit containing either one. This kind of approach is easy to see because, for example, the scope of a port (signal) name declared in the entity would extend to the architecture, too.

In regard to the immediately preceding example, recall that during elaboration, the scope of an entity bound to an architecture would be the same regardless of compilation unit. However, we still choose a compilation unit containing both declarations for the GA, because such a choice makes the scope contained in the compilation unit greater than otherwise.

The idea of greatest specified scope solves many problems: for example, a particular Scanning (= application of the Scanner -- syntax orderer -- suggested above) of the LRM might encounter a constant declaration construct in several places: entity, architecture, process, subprogram, etc. The principle of greatest specified scope immediately solves the problem by having us put the GA constant declaration at the entity level.
Later, in counting the LP value, we use the LP value at the entity level, multiplying it by however many times the LS specifies a constant declaration for entity or for anything else. Regardless of the resolution of multiplicity, to proceed meaningfully, we shall at this point accept the assertion that the size of a language in LP may be determined from a GA based solely on a complete Backus-Naur description of that language. Of course, such an assertion is subject to future rejection or improvement.
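The greatest-specified-scope rule and the multiplicity count can be sketched together as follows. The scope levels and construct names are illustrative assumptions (0 = entity, outermost; larger numbers = more deeply nested), not part of the LRM:

```python
def assign_representatives(occurrences):
    """Given (construct, scope_level) pairs from a Scan of the LS,
    keep one representative per construct at its greatest scope
    (lowest level number = outermost), and count total multiplicity
    for the later LP multiplication."""
    reps = {}    # construct -> outermost scope level seen
    counts = {}  # construct -> number of LS occurrences
    for construct, level in occurrences:
        counts[construct] = counts.get(construct, 0) + 1
        if construct not in reps or level < reps[construct]:
            reps[construct] = level
    return reps, counts

# constant_declaration met in architecture (1), entity (0), process (2):
occ = [("constant_declaration", 1),
       ("constant_declaration", 0),
       ("constant_declaration", 2)]
reps, counts = assign_representatives(occ)
# reps["constant_declaration"] == 0   (placed at entity level in the GA)
# counts["constant_declaration"] == 3 (LP value multiplied by 3)
```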
The final LP count for the construct then is obtained by multiplying the Representative and Intrinsic sizes for the construct.
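That multiplication, summed over constructs, is all the final tally requires. The weights below are hypothetical placeholders, since the actual Representative and Intrinsic size tables are not reproduced here:

```python
def lp_count(construct_sizes):
    """Total LP: sum over constructs of Representative size times
    Intrinsic size, as described in the text."""
    return sum(rep * intrinsic
               for rep, intrinsic in construct_sizes.values())

# Hypothetical (Representative, Intrinsic) weights for two constructs:
sizes = {"protected_type_declaration": (4, 3),
         "subprogram_declaration":     (2, 5)}
total = lp_count(sizes)  # 4*3 + 2*5 = 22
```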
Protected type declarations may occur anywhere a subprogram declaration may occur, including:

    entity declarations,    <--- GREATEST SCOPE
    architecture bodies,
    subprogram bodies,
    package declarations,
    package bodies,
    block statements,
    process statements, and
    generate statements.
Protected type definitions may occur anywhere a subprogram body may occur, including:

    entity declarations,    <--- GREATEST SCOPE
    architecture bodies,
    subprogram bodies,
    package bodies,
    block statements,
    process statements, and
    generate statements.
protected_type_declaration [05] ::=
    protected
        protected_type_declarative_part [06]
    end protected [ protected_type_simple_name [07] ]
protected_type_definition [11] ::=
    protected body
        protected_type_definition_declarative_part [12]
    end protected body [ protected_type_simple_name [13] ]
protected_type_definition_declarative_item [16] ::=
      subprogram_declaration [17]
    | subprogram_body [18]
    | type_declaration [19]
    | subtype_declaration [20]
    | constant_declaration [21]
    | variable_declaration [22]
    | file_declaration [23]
    | alias_declaration [24]
    | use_clause [25]
    | attribute_declaration [26]
    | attribute_specification [27]
    | group_template_declaration [28]
    | group_declaration [29]
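Counts of left-hand (Lbn) and right-hand (bnR) appearances, as defined in the abbreviation list of the Preface, can be extracted mechanically from productions like those above. This sketch assumes a simplified one-string-per-production input and does not filter out literal keywords such as "end", which a real counter would have to do:

```python
import re

def count_bnf_sides(productions):
    """Count left-hand (Lbn) and right-hand (bnR) appearances of each
    symbol across a list of 'lhs ::= rhs' production strings.
    Keywords on the right-hand side are counted too; filtering them
    out is left to a real implementation."""
    lbn, bnr = {}, {}
    for prod in productions:
        lhs, rhs = prod.split("::=")
        name = lhs.strip()
        lbn[name] = lbn.get(name, 0) + 1
        for sym in re.findall(r"[a-z_]+", rhs):
            bnr[sym] = bnr.get(sym, 0) + 1
    return lbn, bnr

prods = [
    "protected_type_declaration ::= protected"
    " protected_type_declarative_part end protected",
    "protected_type_declarative_part ::= subprogram_declaration"
    " | subprogram_declaration protected_type_declarative_part",
]
lbn, bnr = count_bnf_sides(prods)
# lbn["protected_type_declaration"] == 1  (one Lbn appearance)
# bnr["protected_type_declarative_part"] == 2  (two bnR appearances)
```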
Second, as in Step (1) in the Generalized Procedure above, we construct a GA for the changes. We do this by using the minimum of constructs from the LRM adequate to include all the changes. For this, we need only locate all sections in the LRM changed by the LCS. Each step below incrementally extends the GA. The step numbers in the construction are arbitrary, for this example only: Step 1 in GA construction. Compilation_Unit:
entity entity_name is
end;
architecture architecture_name of entity_name is
begin
end;
entity entity_name is
    protected
        protected_type_declarative_part
    end protected;
    protected body protected_body_name
        protected_type_definition_declarative_part
    end protected body;
end;
architecture architecture_name of entity_name is
begin
end;
Step 3 in GA construction. (This merely increases the scope--but it makes a more comfortable entity declaration, too): Compilation_Unit:
package package_name is
    protected
        protected_type_declarative_part
    end protected;
end;
--
package body package_name is
    protected body protected_body_name
        protected_type_definition_declarative_part
    end protected body;
end;
--
use package_name.ALL;
entity entity_name is
end;
architecture architecture_name of entity_name is
begin
end;
Step 4 in GA construction. Implement the package declarations, using the LCS. This defines the Generalized Application for this example, and completes the steps for the example. Compilation_Unit:
package package_name2 is    --inner package, just to be USEed
    protected
        function function_name(InInt: IN integer) return integer;
    end protected;
    protected
        procedure procedure_name(InInt: IN integer; OutInt: OUT integer);
    end protected;
    protected
        type type_name is INTEGER range 0 to 31;
    end protected;
    protected
        subtype subtype_name of type_name is INTEGER range 0 to 15;
    end protected;
    protected
        constant constant_name: integer := 0;
    end protected;
    protected
        shared variable shared_variable_name: integer;
    end protected;
    protected
        file file_name: FILE of integer;
    end protected;
    protected
        alias regetni: bit_vector(2 downto 0) is type_name(24 to 31);
    end protected;
    protected
        USE Std.Standard.ALL;
    end protected;
    protected
        attribute COLOR : integer;
    end protected;
    protected
        attribute IMPLEMENTATION of ALL: function is "generalized app.";
    end protected;
    protected
        group group_template_name is (integer, integer);
    end protected;
    protected
        group group_instance_name: group_template_name (constant_name, constant_name);
    end protected;
end package package_name2;
--
package body package_name2 is
    protected body
        function function_name(InInt: IN integer) return integer is
        begin
        end;
    end protected body function_name;
    --
    protected body
        procedure procedure_name(InInt: IN integer; OutInt: OUT integer) is
        begin
        end;
    end protected body;
end package body package_body_name2;
--
package package_name1 is
    protected
        use package_name2.ALL;
    end protected;
end package package_name1;
--
use package_name2.ALL;
entity entity_name is    -- opens the scope for the entity, just for the use clause
end;
architecture architecture_name of entity_name is
begin
    process process_name;
        protected
            variable variable_name: integer;
        end protected;
    begin
    end process;
end;
Third, and not explicitly described in the Generalized Procedure above, in this example we would count LP for the existing LRM. For brevity, this will not be done in this example. Our LP count here will be incremental only.

Fourth, as in Step (2) in the Generalized Procedure above, we scan the LS to map each instance of a construct to its representative in the GA. Each construct was numbered in brackets above, as it occurred in the LCS. We copy the GA below and attach the bracketed construct numbers to it. Numbers clearly never further expanded are attached immediately only to one GA occurrence, at greatest scope. This turns out to be a very repetitive operation in this example; however, it is quite routine and well-defined. After that, in the second copy of the GA below, we remove all duplicates of each number, leaving only the one associated with the greatest scope. Redundant_Compilation_Unit:
package package_name2 is    --inner package, just to be USEed. Max scope.
    protected
        function function_name(InInt: IN integer) return integer;
            [01][02][03][05][06][08][09][10][11][12][14][15][16][17][30]
    end protected function_name; [07]
    protected
        procedure procedure_name(InInt: IN integer; OutInt: OUT integer);
            [01][02][03][05][06][08][09][10][11][12][14][15][16][17][30]
    end protected;
    protected
        type type_name is INTEGER range 0 to 31;
            [01][02][04][11][12][14][15][16][19]
    end protected type_name; [13]
    protected
        subtype subtype_name of type_name is INTEGER range 0 to 15;
            [01][02][04][11][12][14][15][16][20]
    end protected;
    protected
        constant constant_name: integer := 0;
            [01][02][04][11][12][14][15][16][21]
    end protected;
    protected
        shared variable shared_variable_name: integer;
            [01][02][04][11][12][14][15][16][22]
    end protected;
    protected
        file file_name: FILE of integer;
            [01][02][04][11][12][14][15][16][23]
    end protected;
    protected
        alias regetni: bit_vector(2 downto 0) is type_name(24 to 31);
            [01][02][04][11][12][14][15][16][24]
    end protected;
    protected
        USE Std.Standard.ALL;
            [01][02][03][05][06][08][09][10][31][25]
    end protected;
    protected
        attribute COLOR : integer;
            [01][02][04][11][12][14][15][16][26]
    end protected;
    protected
        attribute IMPLEMENTATION of ALL: function is "generalized app.";
            [01][02][04][08][09][10][32][27]
    end protected;
    protected
        group group_template_name is (integer, integer);
            [01][02][04][11][12][14][15][16][28]
    end protected;
    protected
        group group_instance_name: group_template_name (constant_name, constant_name);
            [01][02][04][11][12][14][15][16][29]
    end protected;
end package package_name2;
--
package body package_name2 is
    protected body
        function function_name(InInt: IN integer) return integer is
            [01][02][04][11][12][14][15][16][18][30]
        begin
        end;
    end protected body function_name;
    --
    protected body
        procedure procedure_name(InInt: IN integer; OutInt: OUT integer) is
            [01][02][04][11][12][14][15][16][18][30]
        begin
        end;
    end protected body;
end package body package_body_name2;
--
package package_name1 is
    protected
        use package_name2.ALL;
            [01][02][03][05][06][08][09][10][31][25]
    end protected;
end package package_name1;
--
use package_name2.ALL;
entity entity_name is    -- opens the scope for the entity, just for the use clause
end;
architecture architecture_name of entity_name is
begin
    process process_name;
        protected
            variable variable_name: integer;
                [01][02][04][11][12][14][15][16][22]
        end protected;
    begin
    end process;
end;
    protected
        group group_template_name is (integer, integer);
            [28]
    end protected;
    protected
        group group_instance_name: group_template_name (constant_name, constant_name);
            [29]
    end protected;
end package package_name2;
--
package body package_name2 is
    protected body
        function function_name(InInt: IN integer) return integer is
            [18]
        begin
        end;
    end protected body function_name;
    --
    protected body
        procedure procedure_name(InInt: IN integer; OutInt: OUT integer) is
        begin
        end;
    end protected body;
end package body package_body_name2;
--
package package_name1 is
    protected
        use package_name2.ALL;
            [31][25]
    end protected;
end package package_name1;
--
use package_name2.ALL;
entity entity_name is    -- opens the scope for the entity, just for the use clause
end;
architecture architecture_name of entity_name is
begin
    process process_name;
        protected
            variable variable_name: integer;
        end protected;
    begin
    end process;
end;
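The duplicate-removal step that produced the second GA copy from the redundant one can be sketched as a single pass in greatest-scope-first order, keeping each bracketed construct number only where it is first seen. The location names here are illustrative, not taken from the GA:

```python
def dedupe_annotations(annotated):
    """annotated: list of (ga_location, [construct_numbers]) pairs in
    greatest-scope-first order. Returns the same locations with each
    construct number kept only at its first (greatest-scope)
    occurrence, mirroring the deduplicated GA copy in the text."""
    seen = set()
    result = []
    for location, numbers in annotated:
        kept = [n for n in numbers if n not in seen]
        seen.update(kept)
        result.append((location, kept))
    return result

# Illustrative annotations, outermost scope first:
ann = [("package_name2.function", [1, 2, 3, 5, 17]),
       ("package_name2.procedure", [1, 2, 3, 5, 17]),
       ("package_body.function", [18, 30])]
deduped = dedupe_annotations(ann)
# deduped[1] == ("package_name2.procedure", [])  -- all already seen
```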
Fifth, and finally, we count LP for the proposed change. Recalling that this example is meant to illustrate an unexplored and untested method, the result is in Table 1 below:
Total = 232
Conclusion of Example
In this example, we have found that the LP count for the LCS is 232. There are no calibration data available yet, so it is not certain what this value would mean in the context of language development. If the LP count were a typical FP count, the size of the LCS, as a raw estimate, would be a substantial project of some 25,000 lines of C code, or over 15,000 lines of C++. With good requirements and design documentation, this count, in FP units, might take three experienced developers a year or so to code and test. Our first, coarse guess at calibration is that perhaps the LP count should be divided by 10 for size equivalence with an FP count. Assuming this calibration to be correct, a three-developer team would be expected to spend perhaps two months at coding and testing.
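That coarse calibration guess amounts to the following arithmetic. The lines-per-FP ratio is the figure implied by the raw estimate above (about 25,000 lines of C for 232 FPs); both constants are uncalibrated guesses:

```python
LP_TO_FP_DIVISOR = 10  # the coarse calibration guess above
LOC_PER_FP_C = 107     # ~25,000 lines of C / 232 FPs, per the raw estimate

def calibrated_size(lp):
    """Convert an LP count to an FP equivalent and a rough C line count,
    using the assumed constants above."""
    fp_equiv = lp / LP_TO_FP_DIVISOR
    return fp_equiv, fp_equiv * LOC_PER_FP_C

fp, loc = calibrated_size(232)  # about 23.2 FP-equivalents, ~2,480 lines of C
```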
Conclusion
There would seem to be little reason why the procedures above could not be coded into a new kind of metrics program and completely automated. Calibration of the weights could be performed without too much effort; in this way, the actual size represented by the numerical LP count might be given meaning. The question to the group was not answered as of early 2003. The question remains: Does the approach seem valid?
References
Aho, A. V., Sethi, R., and Ullman, J. D. Compilers: Principles, Techniques, and Tools. Menlo Park, CA: Addison-Wesley, 1988.

Albrecht, A. J. and Gaffney, J. E., Jr. "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation," IEEE Transactions on Software Engineering, November 1983, pp. 648-652.

Dreger, J. B. Function Point Analysis. Englewood Cliffs, New Jersey: Prentice Hall, 1989.

IEEE Standard VHDL Language Reference Manual (IEEE/ANSI Std 1076-1993). New York: IEEE, 1994.

Willis, J. IEEE Shared Variable Language Change Specification. PAR 1076A, v. 5.7. IEEE Internal Document, September 1996.