A Fuzzy Model for Function Point Analysis to Development and Enhancement Project Assessments

Function Point Analysis (FPA) is among the most commonly used techniques to estimate the size of software system projects or software systems. During the point counting process that represents the dimension of a project or an application, each function is classified according to its relative functional complexity. Various studies already propose to extend FPA, mainly aimed at achieving greater precision in the point assessment of systems of greater algorithmic complexity. This work proposes the use of concepts and properties from fuzzy set theory to extend FPA to FFPA (Fuzzy Function Point Analysis). Fuzzy theory seeks to build a formal quantitative structure capable of emulating the imprecision of human knowledge. With the function points generated by FFPA, derived values such as costs and terms of development can be more precisely determined.


Introduction
Project management is of crucial importance within software engineering. It encompasses the entire process of software development, making it comprehensible and organized, and coordinates the activity of personnel so that processes are executed in a disciplined and consistent manner.
Upon determining the scope of a project, the manager must stipulate, for instance, the development period and the cost of the product. Such activities are critical, since unrealistic estimates may lead to the failure of the project. The development process may suffer shifts in focus that sacrifice quality or reduce the functionality of the final product. All too frequently, software projects fall into worst-case scenarios characterized by bloated schedules and costs.
In order to calculate such estimates, various methods have been developed, such as the Metric System of Halstead (Halstead, 1977), COCOMO (Boehm, 1981), PUTNAM-SLIM (Putnam, 1978) and FPA (Albrecht, 1979). FPA has been widely applied in software project management, above all due to its technological independence, simplicity and conciseness, and it was listed among the best practices in software metrics in 2001 (Kulik, 2002).
FPA begins with the decomposition of a project or application into its data and transactional functions. The data functions represent the functionality provided to the user by attending to their internal and external requirements in relation to the data, whereas the transactional functions describe the functionality provided to the user in relation to the processing of this data by the application. After the functions have been identified, each one needs to be classified as simple, average, or complex in accordance with its relative functional complexity, which is expressed by a certain value in points, depending on the function. After points have been assigned to all functions, the total is adjusted in accordance with the general characteristics of the system, which evaluate the general functionality of the application.
As originally conceived, FPA classifies its functions in an abrupt and disjoint manner. For example, an external input that references two files and five data items is classified as "average", receiving four points. Another external input that references two files and fifteen data items is also

Fuzzy Set Theory Approaches
A fuzzy set is characterized by a membership function, which maps the elements of a domain, space or universe of discourse X to a real number in [0,1]. Formally, Ã : X → [0,1]. Thus, a fuzzy set is presented as a set of ordered pairs in which the first element is x ∈ X and the second, µ_Ã(x), is the degree of membership (the membership function) of x in Ã, which maps x to the interval [0,1]; that is, Ã = {(x, µ_Ã(x)) | x ∈ X} (Zadeh, 1965).
The membership of an element in a given set becomes a question of degree, replacing the dichotomy imposed by classical set theory (Pedrycz and Gomide, 1998) when that treatment is not suitable. In the extreme cases, the degree of membership is 0, in which case the element is not a member of the set, or 1, in which case the element is fully a member of the set (Turksen, 1991; Zimmermann, 1991).
Therefore, a fuzzy set emerges from the "enlargement" of a crisp set that begins to incorporate aspects of uncertainty. This process is called fuzzification. Defuzzification is the inverse process, that is, the conversion of a fuzzy set into a crisp value (or a vector of values) (Zimmermann, 1991).
Theoretically, any function of the form Ã : X → [0,1] can be associated with a fuzzy set, depending on the concepts and properties that need to be represented and on the context in which the set is inserted. Nevertheless, the literature already contains families of parameterized membership functions, such as triangular, exponential, and Gaussian functions. Each of these functions characterizes a fuzzy number, which is a convex and normalized fuzzy set defined on the set of real numbers R, such that its membership function has the form µ_Ã : R → [0, 1] (Klir and Yuan, 1995; Zimmermann, 1991).
Fuzzy logic is another extension of Boolean logic and may be considered a generalization of multi-valued logic. By modeling the uncertainties of natural language through concepts of partial truth, that is, truth-values falling somewhere between completely true and completely false (Kantrowitz, 1997), fuzzy logic deals with such values through fuzzy sets in the interval [0,1]. These characteristics allow fuzzy logic to manipulate real-world objects that possess imprecise limits. Utilizing fuzzy predicates (old, new, high etc.), fuzzy quantifiers (many, few, almost all etc.), fuzzy truth-values (completely true, more or less true) (Dubois and Prade, 1991) and generalizing the meaning of connectors and logical operators, fuzzy logic is seen as a means of approximate reasoning (Grauel, 1999).
The next section presents the main concepts of FPA and the manner in which these concepts have been extended by fuzzy set theory.

Function Point Analysis
At the beginning of the 1970s, researchers at IBM initiated studies aimed at determining which critical variables were involved in programming productivity. Instead of considering code volume or complexity, they discovered that a system would be better evaluated by analyzing the functions executed by programs and mapping pertinent questions to estimating and evaluating software development productivity in heterogeneous environments (Albrecht, 1979).
According to Smith (Smith, 1997), the first definitions of FPA were refined and extended in IBM CIS Guideline 313, AD/M Productivity Measurement and Estimate Validation, in 1984. In 1986, a group of FPA users formed the International Function Point User Group (IFPUG), which is responsible for keeping its members informed of any updates to the technique.
FPA can be applied to calculating the size of applications, development projects, or enhancement projects. The procedures to be executed are: (i) determine the counting type; (ii) identify the counting scope and boundary of the application to be measured; and (iii) execute the actual calculation process itself.
The counting types may be:
• Calculating a development project: measures the size of a new application development project, considering the functions requested by and delivered to the user, including those relating to the data conversion process.
• Calculating an enhancement project: measures a maintenance project of an existing application, considering all inclusions, changes, and exclusions of user functions and the data conversion process.
• Calculating an application: measures the size of an existing application, that is, the functionality of the application available to the user from the user's point of view, excluding data conversion functions.
The counting scope determines what functionality will be considered as part of the point calculation. Normally, the calculation is applied to all application or project functions. The boundary of the application is a key question in order to correctly recover the data properties that belong to the application, as well as to correlate the relationship of the application under study to external systems.
The actual calculation process itself is accomplished in three stages: (i) determine the unadjusted function points (UFP); (ii) calculate the value adjustment factor (VAF); (iii) calculate the final adjusted function points.
The first stage, determining the unadjusted function points (UFP), reflects the functionality of modules delivered to the user that they have requested and defined. The unadjusted function points include data and transactional functions. The data functions are:
• Internal Logical File (ILF): a group of logically related data or control information maintained by the application itself. For instance, the file that stores the customer data of a company is an ILF for the customer database system, since that system is responsible for customer maintenance.
• External Interface File (EIF): a group of logically related data or control information whose maintenance is the responsibility of another application. For instance, the file that stores employee data of a company is an EIF for the customer database system, assuming that it accesses employee data whose actual maintenance is accomplished by the employee database system.
The transactional functions are classified in the following manner:
• External Input (EI): an elementary process that involves data or control information input at the boundary of the application, with the main objective of maintaining an ILF. For instance, updating the personal data of a customer of an organization.
• External Output (EO): an elementary process that sends control information or calculated data to the end user or to other applications. For instance, a report that summarizes sales volume per quarter for a particular customer of a company.
• External Inquiry (EQ): an elementary process that sends control information or uncalculated data to the end user or to other applications. For instance, a personal data query operation for an employee of a company.
Once identified, each function must be classified according to its relative functional complexity as low, average, or high. The relative functional complexity of the data functions is based on the number of data element types (DETs) and the number of record element types (RETs). A RET may be defined as a subset of logically related data within an ILF or an EIF. A subset can be classified as optional or mandatory in terms of its use during the process that creates the instances of data. A DET is a singular, non-repeated field recognized by the user as having its own meaning. Thus, it represents a subdivision of an ILF or an EIF. The transactional functions are classified according to the number of file types referenced (FTRs) and the number of DETs. The number of FTRs is the sum of the number of ILFs and the number of EIFs updated or queried during an elementary process.
The second stage, calculating the value adjustment factor (VAF), gives an indication of the general functionality provided to the user. The VAF is derived from the sum of the degrees of influence (DI) of the 14 general system characteristics (GSCs). The DI of each of these characteristics ranges from 0 to 5. The general characteristics of a system are: (i) data communications; (ii) distributed data processing; (iii) performance; (iv) heavily used configuration; (v) transaction rate; (vi) online data entry; (vii) end-user efficiency; (viii) on-line update; (ix) complex processing; (x) reusability; (xi) installation ease; (xii) operational ease; (xiii) multiple sites; and (xiv) facilitate change.
The third and last stage is the final calculation of the function points. The calculation formula varies with the type of counting. According to Eq. (1), the total points of an application, for instance, are obtained as follows:
AFP = UFP * VAF (1)
where AFP = adjusted function points; UFP = unadjusted function points; and VAF = value adjustment factor.
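The second and third stages for an application count can be sketched as follows. This is only an illustration: the rule VAF = 0.65 + 0.01 × (sum of the 14 DIs) is the usual IFPUG formula and is an assumption here, since the text only states that the VAF is derived from that sum. The function names are hypothetical.

```python
def value_adjustment_factor(degrees_of_influence):
    """Return the VAF from the 14 GSC degrees of influence (each 0..5)."""
    if len(degrees_of_influence) != 14:
        raise ValueError("expected 14 general system characteristics")
    if any(not 0 <= di <= 5 for di in degrees_of_influence):
        raise ValueError("each degree of influence must be in 0..5")
    total_di = sum(degrees_of_influence)
    # Assumption: standard IFPUG rule, not spelled out in the text.
    return 0.65 + 0.01 * total_di


def adjusted_function_points(ufp, vaf):
    """Eq. (1): AFP = UFP * VAF."""
    return ufp * vaf
```

Under this assumption the VAF can vary between 0.65 (all DIs equal to 0) and 1.35 (all DIs equal to 5).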
The formula to determine the function points of an enhancement project is described in Eq. (2). After executing the enhancement, the application function points must reflect the results of the enhancement project. To do this, Eq. (3) is used. All definitions, rules for counting and classifying, exception handling, and practical examples that illustrate this process can be found in (FPCPM, 1999).

The Measurement of Function Points
During the measurement process, a function (data or transactional) goes through various implicit transformations (Abran and Robillard, 1996), with the final objective of obtaining a representation of its relative functional complexity expressed in function points.
However, before being expressed in points, the complexity of a function is characterized by one of the following linguistic terms: low, average, or high. The attribution of these terms to the functions follows the functional complexity matrix of each function type. For example, Table 1 represents the complexity matrix of an ILF, whose terms low, average, and high complexity correspond to 7, 10, and 15 function points, respectively.
There are at least two clear situations in which FPA does not accurately translate the function points measurement process, as can be observed in the data of Table 1:
• Situation 1 (S_1): an ILF with 1 RET and 19 DETs (function f_1) is classified as low complexity, which translates to 7 function points. By the same criteria, an ILF with 1 RET and 50 DETs (f_2) is also classified as low complexity (7 function points). However, with an increment of only one more DET to the latter case, thereby increasing it to 51 DETs, the ILF (f_3) would be considered of average complexity, contributing 10 function points to the counting process. Thus, FPA considers f_1 and f_2 as identical functionalities and f_2 and f_3 as different functionalities. If they are part of the same project, the final resulting measurement will not correspond to a sufficiently accurate function points value.
• Situation 2 (S_2): an ILF with 6 RETs and 20 DETs has the same number of function points as an ILF with 6 RETs and 70 DETs; that is, they are treated as having the same functionality. In such a case, the number of items referenced, which determines the lower limits of the high complexity range, can lead to the same measurement precision difficulties observed in the above situation, especially in systems that reference a large number of DETs.
Although function points represent the functionality of a system, many empirical studies point to a relationship between these points and the amount of work-effort necessary to develop them (Albrecht and Gaffney, 1983; Desharnais, 1990; Kemerer, 1997). Measurements derived from function points, such as cost and term, may produce unrealistic values as a result of the current methods of specifying the functionality of the functions that constitute the system.
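The two situations can be reproduced with a small sketch of the crisp ILF classification. Table 1 is not reproduced in this text, so the matrix below is an assumption: it is the standard IFPUG ILF complexity matrix, which agrees with every case quoted above. The names `classify_ilf` and `ILF_POINTS` are hypothetical.

```python
ILF_POINTS = {"low": 7, "average": 10, "high": 15}

# Rows: 1 RET, 2 to 5 RETs, 6 or more RETs.
# Columns: 1 to 19 DETs, 20 to 50 DETs, 51 or more DETs.
# Assumption: standard IFPUG ILF matrix (Table 1 is not shown in the text).
ILF_MATRIX = [
    ["low", "low", "average"],
    ["low", "average", "high"],
    ["average", "high", "high"],
]


def classify_ilf(rets, dets):
    """Return (linguistic term, function points) for an ILF."""
    if rets < 1 or dets < 1:
        raise ValueError("rets and dets must be at least 1")
    row = 0 if rets == 1 else (1 if rets <= 5 else 2)
    col = 0 if dets <= 19 else (1 if dets <= 50 else 2)
    term = ILF_MATRIX[row][col]
    return term, ILF_POINTS[term]


# Situation 1: f_1 and f_2 look identical, f_3 jumps by 3 points.
print(classify_ilf(1, 19), classify_ilf(1, 50), classify_ilf(1, 51))
# Situation 2: 20 and 70 DETs are indistinguishable with 6 RETs.
print(classify_ilf(6, 20), classify_ilf(6, 70))
```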
Additionally, the abrupt and disjoint manner of classifying the functions prevents the application from reflecting, in function points, the functionality that was added to it by an enhancement. For example, if an enhancement project consists of a single modification to an existing function, and that modification does not change the function's level of complexity, then CHGB has the same value as CHGA. According to Eq. (3), the value of AFP will remain unchanged in relation to what is already installed.
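A minimal sketch of Eq. (2) and Eq. (3) makes this effect visible. The function names and the sample figures (an application of 200 unadjusted points, a VAF of 1.0, and one modified ILF worth 7 points before and after the change) are hypothetical and chosen only for illustration.

```python
def enhancement_fp(add, chga, cfp, vafa, deleted, vafb):
    """Eq. (2): EFP = [(ADD + CHGA + CFP) * VAFA] + (DEL * VAFB)."""
    return (add + chga + cfp) * vafa + deleted * vafb


def application_fp_after(ufpb, add, chga, chgb, deleted, vafa):
    """Eq. (3): AFP = [(UFPB + ADD + CHGA) - (CHGB + DEL)] * VAFA."""
    return ((ufpb + add + chga) - (chgb + deleted)) * vafa


# Hypothetical enhancement: one ILF is modified but stays "low" (7 points),
# so CHGA == CHGB and the installed size does not move.
before = 200 * 1.0
after = application_fp_after(ufpb=200, add=0, chga=7, chgb=7, deleted=0, vafa=1.0)
effort = enhancement_fp(add=0, chga=7, cfp=0, vafa=1.0, deleted=0, vafb=1.0)
print(before, after, effort)
```

The enhancement project itself is sized at 7 points, yet the application count is the same before and after it.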
The fuzzy model, as proposed below, aims to provide a more precise approach to the function points counting process by extending FPA to FFPA while at the same time guaranteeing the validity of the final calculation of traditional function points.

Fuzzy Function Point Analysis (FFPA)
The central idea of extending FPA to FFPA through fuzzy set theory is to expand traditional FPA semantics by making use of the concepts and mathematical formalism of this already well-established theory.
The types of data functions (ILF and EIF) and transactional functions (EI, EO and EQ), within their respective functional complexity matrices, can be mapped to a universe of discourse X, which corresponds to the referenced DETs. These matrices all use the same linguistic terms, low, average, and high, to express complexity. For each line of these matrices, fuzzy numbers were generated for each of the linguistic terms.
Through experimentation, it has been shown that trapezoid-shaped fuzzy numbers best preserve FPA complexity matrix values, in addition to circumventing the difficulties presented in S_1 (subsection 3.1).
A trapezoid-shaped fuzzy number can be represented by Ñ (a, m, n, b), whose membership function is given in Eq. (4):
µ_Ñ(x) = 0, if x < a
µ_Ñ(x) = (x - a) / (m - a), if a ≤ x < m
µ_Ñ(x) = 1, if m ≤ x ≤ n
µ_Ñ(x) = (b - x) / (b - n), if n < x ≤ b
µ_Ñ(x) = 0, if x > b (4)
The values a and b identify the lower and upper limits, respectively, of the larger base of the trapezoid, where µ_Ñ(x) = 0. The values m and n are the lower and upper limits, respectively, of the smaller base of the trapezoid, where µ_Ñ(x) = 1, as shown in Figure 1.
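The trapezoid-shaped membership function described above can be sketched as below. The function name is hypothetical, and the handling of a degenerate left side (a = m) or an open right side (n = b = infinity) is an assumption made so that the first and last linguistic terms can be represented.

```python
import math


def trapezoid_membership(x, a, m, n, b):
    """Degree of membership of x in the trapezoidal fuzzy number (a, m, n, b)."""
    if m <= x <= n:
        return 1.0  # smaller base: full membership
    if a < x < m:
        return (x - a) / (m - a)  # rising edge
    if n < x < b:
        return (b - x) / (b - n)  # falling edge
    return 0.0  # outside the larger base


# Term "low" of an ILF with 1 RET, using the limits 26 and 51 from the
# worked example in the text; a = m = 1 is an assumed left shoulder.
print(trapezoid_membership(35, 1, 1, 26, 51))
# An open-ended last term can be written with math.inf.
print(trapezoid_membership(500, 26, 51, math.inf, math.inf))
```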

First Stage
In this stage, trapezoid-shaped fuzzy numbers Ñ (a, m, n, b) are generated for each linguistic term T_i (low, average, high) belonging to the complexity matrix of the data and transactional functions. The value m_i assumes the lower limit of the linguistic term i of the complexity matrix being considered. The value n_i is calculated as the arithmetic mean of m_i and m_{i+1}, rounded to a whole number. The values of n_{i-1} and m_{i+1} are attributed to a_i and b_i, respectively. Necessary adjustments are made when dealing with the first or last linguistic term, in accordance with Eq. (4).
As an example, observe the complexity matrix of an ILF (Table 1) with 2 to 5 RETs and the linguistic term average, as depicted in Fig. 2.
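The generation rule of this stage can be sketched as follows. Several details are assumptions, because the text does not fix them: halves are rounded up, the first term gets a = m, and the last term is open-ended (n = b = infinity). The function name is hypothetical.

```python
import math


def fuzzy_numbers(lower_limits):
    """Build (a, m, n, b) for each term from its ascending lower limit m_i."""
    ms = list(lower_limits)
    # n_i: mean of m_i and m_{i+1}, rounded half up (assumed rounding rule).
    ns = [math.floor((ms[i] + ms[i + 1]) / 2 + 0.5) for i in range(len(ms) - 1)]
    ns.append(math.inf)  # assumed adjustment for the last term
    numbers = []
    for i, m in enumerate(ms):
        a = ns[i - 1] if i > 0 else m  # assumed adjustment for the first term
        b = ms[i + 1] if i + 1 < len(ms) else math.inf
        numbers.append((a, m, ns[i], b))
    return numbers


# ILF with 1 RET: "low" starts at 1 DET and "average" at 51 DETs.
print(fuzzy_numbers([1, 51]))
```

For the row with 1 RET this yields n = 26 and b = 51 for the term low, the limits used in the defuzzification example of the text.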

Second Stage
This stage aims to minimize the problems mentioned in S_2 (subsection 3.1), which become all the more critical as the number of RETs and the number of FTRs increase in the data functions and in the transactional functions, respectively.
In this context, a new interval of high complexity was added to the lines of the data and transactional function matrices that reached at most an interval of average complexity. For the same reason, a new interval of very high complexity was added for the remaining lines, applying the modifier very to the linguistic term high.
Due to the semantics of the new fuzzy number generated in FFPA (very high), the transformation operations indicated in the literature for the modifier very were not applied. The critical question concerning this new fuzzy number Ñ_i is the calculation of the most appropriate values for m_i and n_{i-1}. The value of n_{i-1}, from which the value of m_i is calculated, is the point at which the membership function begins to lose characteristics of high complexity and consequently begins to acquire characteristics of very high complexity.
The last line of the complexity matrix of each function was the starting point for the creation of the new fuzzy number. In accordance with what has been established in (FPCPM, 1999), the functions that fall into the last two cells of the matrix are of high complexity. In order to maintain the values used by (FPCPM, 1999), it was decided that the number that indicates the lower limit of the third column of the matrix represents the value n of the fuzzy set of high complexity functions.
In an ILF, for example, the value of n_{i-1} would be 51 DETs, according to Table 1. Since the value of n_{i-1} of any given fuzzy number corresponds to the value of a_i of the following one, the value of a_i for the fuzzy number of a function of very high complexity would also be 51. Since the value of a_i is calculated as the arithmetic mean of m_i and m_{i-1}, for an ILF the value of m_i is 82, as follows:
(m + 20) / 2 = 51 → m = 82
Table 3 presents the extension made to the values of the linguistic term very high for each FPA function. From this point forward, the value of m_i for a fuzzy number of very high complexity will be referred to as k, generalizing this new linguistic term in relation to the others that already exist. The value corresponding to k must be calculated for each of the five function types belonging to FPA. Upon obtaining the value of k, for example, Table 1 (subsection 3.1) was extended (shaded area) to Table 4 below. Taking a value of k = 82 as calculated above and based on the data from Table 4, the membership functions of the trapezoid-shaped fuzzy numbers are presented in the following three figures (Fig. 3, Fig. 4, and Fig. 5) for each of the three lines of the complexity matrix of an ILF, in accordance with the model depicted.
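The initial value of k follows directly from the averaging rule, as this short sketch shows. The function name is hypothetical; the inputs 51 and 20 are the ILF values used in the text.

```python
def initial_k(n_high, m_high):
    """Solve (k + m_high) / 2 = n_high for k, the lower limit of 'very high'.

    n_high: upper limit of the plateau of the term high
            (lower limit of the third column of the matrix).
    m_high: lower limit of the term high in the last line of the matrix.
    """
    return 2 * n_high - m_high


# ILF: (k + 20) / 2 = 51, so k = 82.
print(initial_k(51, 20))
```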

Figure 5: Membership functions to the fuzzy numbers defined for ILFs with 6 or more RETs (horizontal axis: DETs; terms shown: average, high, very high)

From a historical base of systems estimated in function points, a more appropriate value for k can be arrived at, one that conforms to the software development organization in focus. This will become clearer in the case study described in section 5.

Third Stage
In FPA, p_i function points are attributed to each linguistic term T_i (low, average, and high) in accordance with the complexity matrix under consideration. In FFPA, these points are directly associated with the region of the fuzzy number of the linguistic term where µ_Ñ(x) = 1.
The value of the function points for the new linguistic term of very high complexity is calculated by extrapolating the values already defined for the terms low, average, and high of each function.
Since interpolation routines and functions may also be used for extrapolation (NUMREC, 1992), and considering that the groups which represent the linguistic terms are equally spaced, it became viable to apply interpolation with finite differences, a special case of binomial interpolation in which the abscissas of the points are equidistant. Specifically, the approximant function of Newton's Formula is defined as (Santos, 1972):
f(x) ≈ f_0 + u ∆f_0 + [u(u - 1) / 2!] ∆^2 f_0 + ... + [u(u - 1)...(u - n + 1) / n!] ∆^n f_0
where u is the value corresponding to the abscissa x in the interpolation and ∆ is the operator of the progressive (forward) differences. In this case, the abscissas 1, 2, 3 and 4 were attributed to the terms low, average, high and very high, respectively. The value of u is obtained by the equation below:
u = (x - x_0) / h
where h is the step, in other words, the difference between two subsequent values of x. The value of u is 3 for all the data and transactional functions, since h = 1 (step), x = 4 (very high) and x_0 = 1 (low).
The terms ∆f_i and ∆^n f_i of the approximant function are calculated in the following way:
∆f_i = f_{i+1} - f_i
∆^n f_i = ∆^{n-1} f_{i+1} - ∆^{n-1} f_i
where n = 2, 3, ... . Applying the definitions above, the values for the fuzzy numbers of very high complexity functions were estimated (Table 5). The calculations for the ILF data function, with f_0 = 7, f_1 = 10 and f_2 = 15, are:
∆f_0 = 10 - 7 = 3; ∆f_1 = 15 - 10 = 5; ∆^2 f_0 = 5 - 3 = 2
The value of the function points for the term very high of an ILF was obtained by substituting the above terms in the approximant function:
f(4) = 7 + 3 (3) + [3 (3 - 1) / 2] (2) = 7 + 9 + 6 = 22
In this way, an ILF of very high complexity equates to 22 function points. Repeating the above procedure, the values of 14, 9, 10 and 9 function points were obtained for the fuzzy numbers of the very high complexity functions for function types EIF, EI, EO and EQ, respectively.
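The extrapolation of this stage can be checked with a short sketch of Newton's forward-difference formula. Only the ILF points (7, 10, 15) appear in the text; the low, average and high points used for EIF, EI, EO and EQ are assumed to be the standard IFPUG weights, and they do reproduce the values 14, 9, 10 and 9 quoted above. The function names are hypothetical.

```python
def forward_differences(values):
    """Return [f_0, delta f_0, delta^2 f_0, ...] for equally spaced samples."""
    leading, row = [values[0]], list(values)
    while len(row) > 1:
        row = [later - earlier for earlier, later in zip(row, row[1:])]
        leading.append(row[0])
    return leading


def newton_forward(values, u):
    """Evaluate Newton's forward-difference polynomial at u = (x - x_0) / h."""
    total, coefficient = 0.0, 1.0
    for order, difference in enumerate(forward_differences(values)):
        total += coefficient * difference
        # Next binomial-style coefficient: u(u-1)...(u-order) / (order+1)!
        coefficient *= (u - order) / (order + 1)
    return total


# Points for low, average, high (ILF from the text; others assumed IFPUG).
POINTS = {
    "ILF": [7, 10, 15],
    "EIF": [5, 7, 10],
    "EI": [3, 4, 6],
    "EO": [4, 5, 7],
    "EQ": [3, 4, 6],
}

for function_type, points in POINTS.items():
    # u = (4 - 1) / 1 = 3 for the term very high.
    print(function_type, newton_forward(points, u=3))
```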

Fourth Stage
Following (LAEKWIJCK, 1999), individual criteria for defuzzification, which are not directly related to theoretical concepts and fundamentals, can be formulated when practical results of greater importance exist. In FFPA, to obtain the number of function points p_d from trapezoidal fuzzy numbers where µ_Ñ(x) < 1, execute the following defuzzification process:

Case Study
The FFPA model is being validated on the basis of real data on governmental systems, including development and enhancement projects. This database consists mostly of legacy systems developed mainly in Natural 2. Table 6 presents FPA and FFPA estimates for some of these systems. The term estimates (in days) to program these systems were calculated in accordance with data supplied by Jones (Jones, 1996), considering both the level of the language used and the experience of the team using it. D1 and D2 are development projects, while M1, M2, M3 and M4 are enhancement projects. To obtain Error 1, column 3 is subtracted from column 4 and the result is divided by column 3. In the same way, to obtain Error 2, column 7 is subtracted from column 4 and the result is divided by column 7.
With the results obtained above, it can be noted that the difference between the predicted and real time taken to develop or enhance a system was reduced when function point counting was done through FFPA. This corroborates the hypothesis that the fuzzy numbers generated better represent the functionality of an application when it possesses a large number of data or transactional functions with a large number of DETs.
Using a prototype built in Java, the values attributed to k for each function type were successively refined as far as the model structure permits (k = 53 for ILF and EIF, k = 18 for EI and k = 22 for EO and EQ), to prevent the fuzzy sets of high and very high complexity from merging. This refinement concerns the adjustment of FFPA to the characteristics of the organization, especially for the complexity calculation of large systems, although initial values for k are suggested in Table 3. The objective of the refinement process is to reduce the margin of error in the estimates, that is, to find a time estimate in FFPA as close as possible to the actual programming time.
The goal was to discover the combination of values for k for which the Mean Absolute Error (MAE) of the estimates was as small as possible. In this case, the MAE corresponds to the average of the absolute values of the percentage errors, since negative values, which signify that the project was concluded before the predicted period, also indicate a forecasting error. Table 7 presents the MAE for different values of k.
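The error measures used in this section can be sketched as below. The reading that column 4 holds the actual time and columns 3 and 7 hold the FPA and FFPA estimates is an inference from the sign convention described above, and the sample figures are hypothetical, not taken from Table 6 or Table 7.

```python
def relative_error(actual_days, estimated_days):
    """(actual - estimate) / estimate; negative if the project finished early."""
    return (actual_days - estimated_days) / estimated_days


def mean_absolute_error(actuals, estimates):
    """Average of the absolute relative errors over a set of projects."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty lists of equal length")
    errors = [abs(relative_error(a, e)) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


# Hypothetical projects: one finished late, one finished early.
actual = [110, 45]
estimate = [100, 50]
print([relative_error(a, e) for a, e in zip(actual, estimate)])
print(mean_absolute_error(actual, estimate))
```

Taking absolute values keeps an early finish from cancelling out a late one, which is why the MAE is used to compare combinations of k.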
According to the data in Table 7, the combination of values for k equal to (53, 18 and 22) presented the smallest Mean Absolute Error (MAE). Therefore, these results indicate these values of k for use in maintenance project size estimates for this organization. However, it is worth pointing out that Project M6 strongly influenced these results, since its small size gives it greater weight in percentage terms. The inclusion of new projects in the historical database may modify this scenario, thereby identifying the best combination for use in estimates for the organization.

Conclusion
This work extended standard FPA (Function Point Analysis) to FFPA (Fuzzy Function Point Analysis), utilizing concepts and properties adopted from fuzzy set theory. By analyzing some of the results obtained through the use of FFPA, we note:
• Through the use of trapezoid-shaped fuzzy numbers for the linguistic terms low, average, and high, functions falling along the border areas of the intervals used receive values with a continuous gradation, without an abrupt change in those values;
• The creation of the linguistic term of very high complexity, pertaining to an interval parameterized by the value of k, which can be adjusted according to the characteristics of the organization to better deal with larger systems;
• This model offers a more precise programming term estimate than standard FPA, especially when evaluating systems that cross the threshold of high complexity by referencing a large number of DETs and FTRs within the same elementary process;
• The model has become a more sensitive technique for modifying existing functionality, enabling the function points of an application to reflect the results of maintenance. This allows for better administration of the evolution of a system;
• Development of a prototype that automates the calculation of function points using FFPA.
The proposed model uses a historical base of governmental systems data calculated in FPA to validate its results. This study will also be carried out using other historical bases within other application domains. It is important to point out that FFPA generates the same function points as traditional FPA for projects whose functions all have their complexity classified within non-extended intervals.
Equation (2):
EFP = [(ADD + CHGA + CFP) * VAFA] + (DEL * VAFB) (2)
where EFP = enhancement project function points; ADD = unadjusted function points of those functions added by the project; CHGA = new unadjusted function points of those functions modified by the project; CFP = function points of the conversion functions; VAFA = value adjustment factor after the project is finished; DEL = unadjusted function points of those functions deleted by the project; and VAFB = value adjustment factor of the application before the project.
Equation (3):
AFP = [(UFPB + ADD + CHGA) - (CHGB + DEL)] * VAFA (3)
where AFP = adjusted function points of the application; UFPB = unadjusted function points of the application before the enhancement project; and CHGB = unadjusted function points of those functions modified by the enhancement project before the changes.

Figure 2: Trapezoid-shaped fuzzy number to the linguistic term average of an ILF

Figure 3: Membership functions to the fuzzy numbers defined for ILFs with 1 RET

Applying the process above to an ILF with 1 RET and 35 DETs, we have µ_Ñ(35) = (51 - 35)/(51 - 26) = 0.64, and the complement 1 - 0.64 = 0.36. Therefore, p_d = 0.64 (7) + 0.36 (10) = 8.08 function points.
In the following section, we present some case studies based on work done at a governmental organization as part of the validation process of the proposed FFPA model.
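The worked example can be reproduced with a small sketch. The general rule is inferred from that example only: on the falling edge between two adjacent terms, the points of the lower term are weighted by its degree of membership and the points of the next term by the complement. The function name and parameter names are hypothetical.

```python
def defuzzify_points(x, n, b, points_lower, points_upper):
    """Function points for n < x < b, between two adjacent linguistic terms.

    n, b: end of the plateau and upper limit of the lower term's trapezoid.
    points_lower, points_upper: points of the lower and the next term.
    """
    if not n < x < b:
        raise ValueError("x must lie strictly between n and b")
    membership = (b - x) / (b - n)  # degree of membership in the lower term
    return membership * points_lower + (1 - membership) * points_upper


# ILF with 1 RET and 35 DETs: between low (7 points) and average (10 points).
print(round(defuzzify_points(35, n=26, b=51, points_lower=7, points_upper=10), 2))
```

Unlike the crisp matrix, the result moves gradually from 7 to 10 points as the number of DETs goes from 26 to 51.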

Figure 6: Mean Absolute Error of Estimates for different values for k

Table 1: Complexity Matrix of an ILF

Table 2: Translation table for the terms low, average and high

Table 3: Values of m_i to the linguistic term very high

Table 4: An extended ILF complexity matrix

Table 5: Table of the Progressive Differences - ILF

Table 7: Estimates for different values of k

Figure 6 graphically presents the values of MAE obtained from the variation in the values for k.