By Ming Lu

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Construction Engineering and Management

The Department of Civil and Environmental Engineering
University of Alberta
Edmonton, Alberta, Canada
Spring 2001

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats. The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Estimating labor productivity is one of the most difficult aspects of preparing an estimate, or a control budget based on the estimate, for labor-intensive activities in construction. The primary objective of the research is to develop artificial neural network (ANN) based estimating tools that offer estimators valuable information about labor productivity in bidding new jobs.
In conjunction with a major Canadian industrial contractor, the thesis research presents case studies on the theoretical basis and practical considerations for measuring and analyzing labor productivity in industrial construction. Two important activities of process piping were investigated: pipe installation in the field and spool fabrication in the fabrication shop. Emerging computer modeling techniques such as data warehouses and ANN were researched from an academic perspective and implemented in industry to meet the challenges in productivity studies.

The thesis research has addressed: (1) how to quantify labor productivity in industrial construction from a contractor's point of view; (2) how to measure actual labor productivity in industrial construction based upon on-site control practices; and (3) how to utilize ANN to analyze the variability of actual labor production rates and the sensitivity of identified influencing factors. Using actual data, the proposed ANN models were proven to be effective in both risk analysis and sensitivity analysis of construction labor productivity. The developed data warehouses and ANN-based decision-support tools have been implemented or are in the process of implementation at the involved company.

The final results of the research not only help estimators improve the accuracy of estimating labor production rates for the studied activities in bidding new jobs, but also offer the management a precise and integrated view of corporate productivity information spanning many business divisions. The experience and lessons learned from the successful, productive and mutually beneficial collaboration between academia and industry in the thesis research will potentially benefit other university-industry joint research projects in the future.

This thesis is organized in a paper format, consisting of five main chapters and five appendices. Every chapter is an independent paper and can be read separately.
However, all the chapters are logically coherent and pertinent to the theme of the thesis. Each appendix is a user manual for one computer program that was developed in house in the thesis research.

Chapter 1 overviews the whole thesis by introducing background information, problem statements, research objectives, methodologies used, and contributions achieved. Chapter 2 discusses a case study of industrial construction labor productivity, which depicts the settings of the research. Chapter 3 presents a probabilistic neural network classification model along with its application in estimating the production rates of field pipe installation. Chapter 4 presents a sensitivity analysis method for back propagation neural networks along with its application in estimating the production rates of shop spool fabrication. Chapter 5 summarizes what has been done thus far and recommends what to do in future research.

Appendix A is for the PINN trainer program based on the model described in Chapter 3. Appendix B is for the FabMaster program, which is the data warehouse for the fabrication facilities. Appendix C is for Fab-OLAP, which is an on-line analytical processing program in companion with FabMaster. Appendix D is for the PipingMaster program, which is the data warehouse for the field construction systems. Appendix E is for the SensitiveNN program based on the model described in Chapter 4.

First and sincerely, I would like to thank my university advisor, Dr. S. M. AbouRizk, without whose vision, guidance and encouragement this academic achievement would not have become a reality. Especially, I would like to thank my industry advisor, Mr. U. H. Hermann of PCL, whose professionalism and enthusiasm have set the pace and the standard for the whole work. I am extremely grateful to PCL Industrial Constructors, Inc. for sponsoring the research financially and allowing me to use its actual data for developing problems and validating solutions throughout the thesis.
Finally, I would like to thank my wife, Duojia, for her understanding, love and assistance in taking this thesis from thoughts to finish. I dedicate this work to her and our arriving Wua n Hum.

Table of Contents

CHAPTER 1 INTRODUCTION
BACKGROUNDS
    Industrial Construction
    Productivity Studies
    Productivity Models
    Artificial Neural Networks
PROBLEM STATEMENTS
    Productivity Studies
    ANN Models
RESEARCH OBJECTIVES
    Productivity Studies
    Probabilistic Neural Network Modeling
    Sensitivity Analysis of Neural Networks
METHODOLOGIES
    Reviewing Literature to Recognize Issues
    Identifying Factors from Brainstorming by Domain Experts
    Using Data Warehouse to Quantitative Data
    Questionnaire Survey
    Computer Programming
ACADEMIC CONTRIBUTIONS
INDUSTRIAL CONTRIBUTIONS
CONCLUSIONS
REFERENCES

CHAPTER 2 A CASE STUDY OF INDUSTRIAL CONSTRUCTION LABOR PRODUCTIVITY
INTRODUCTION
INDUSTRIAL CONSTRUCTION
FIELD PIPE INSTALLATION
    Productivity Quantification
    Productivity Measurement
    Input Factors
    Probabilistic neural network modeling
SHOP SPOOL FABRICATION
    Productivity Quantification
    Productivity Measurement
    Input Factors
    Sensitivity Analysis of Influencing Factors
CONCLUSIONS
REFERENCES

CHAPTER 3 ESTIMATING LABOR PRODUCTIVITY USING PROBABILITY INFERENCE NEURAL NETWORK
INTRODUCTION
    Problem Domain
    Review of NN Applications
PROBABILITY INFERENCE NEURAL NETWORK (PINN) MODEL
    Introduction of the PINN Model
    Overview of the PINN Topology and Process
    Data Pre-Processing
    Output Zone Setup
    Processing Elements (PE) at Kohonen Layer
    NN Learning Process

List of Tables

Table 2-1: Sample of pipe installation unit labor rates
Table 2-2: Input factors to pipe installation productivity
Table 2-3: Sample of degree-of-difficulty factors for converting welds into units
Table 2-4: Explanatory factors to spool fabrication productivity
Table 3-1: Input Factors and Data Type of PINN Model
Table 3-2: Input Data Sample of PINN Model
Table 3-3: Scaled Input Vector and Initial Weight Vectors
Table 3-4: Updating Weight Vectors in First Learning Stage
Table 3-5: Updating Weight Vectors in Second Learning Stage
Table 3-6: Trained PINN Ready to Recall for a Given Input Vector
Table 3-7: Recall Calculations at Bayesian Layer
Table 4-1: Data Set for Testing BPNN and Regression Analysis
Table 4-2: Partial Derivative (Slope) at Four Input Points
Table 4-3: Statistics of Partial Derivative (Slope) Values
Table 4-4: Input Factors of Spool Fabrication Labor Productivity
Table 5-1: PINN vs. BPNN
Table B-1: Size Range Codes
Table B-2: Material Type Codes
Table B-3: Item Codes for Spool Level Data Compilation
Table B-4: Sample of FabMaster Outputs
Table D-1: Sample of Quantity Calculation Summary Table in PipingMaster

List of Figures

Figure 1-1: Sample Questionnaire for Finding Facts about Spool Fabrication
Figure 2-1: Output interface of PINN recall program
Figure 2-2: Sensitivity Analysis of Spool Fabrication BPNN Model
Figure 2-3: Testing Sensitivity of BPNN to Material Type
Figure 3-1: Topology of PINN Model
Figure 3-2: Operations at Bayesian Layer in Recall
Figure 3-3: Comparison of PINN and Back Propagation NN
Figure 3-4: PINN Output for the Base Case Scenario
Figure 3-5: PINN Output for Scenario 1
Figure 3-6: PINN Output for Scenario 3
Figure 4-1: Structure of Back-Propagation NN Model
Figure 4-2: Illustration for Node and Layer Representations
Figure 4-3: Distributions for Input Sensitivity
Figure 4-4: Sensitivity Analysis of Spool Fabrication BPNN Model
Figure 4-5: Testing Sensitivity of BPNN to Material Type
Figure A-1: Select an identifier key of one previous trial
Figure A-2: User selects data table
Figure A-3: Flag status of records
Figure A-4: Setup structure and learning parameters for PINN
Figure A-5: Specify training iterations and train-test PINN
Figure A-6: Check training results
Figure A-7: Detected noise in training data
Figure A-8: Global report for a train-test trial
Figure A-9: PINN Trainer on-line help
Figure B-1: Program Flow Chart of FabMaster
Figure B-2: Main User Interface of FabMaster
Figure C-1: Select one ratio
Figure C-2: Drill on "number of pipe pieces per foot"
Figure C-3: View details of data
Figure D-1: Structures of Raw Data Tables for A Project
Figure D-2: Main user interface of FabMaster
Figure D-3: Handling: X-Reference Information Integrity Check
Figure D-4: Welding: X-Reference Information Integrity Check
Figure D-5: Productivity Analysis Page for Pipe Handling
Figure D-6: Sample of Pipe Handling Questionnaire
Figure D-7: Program Flow Chart of PipingMaster
Figure E-1: Splash Screen of SensitiveNN program
Figure E-2: Program Switchboard
Figure E-3: Open FFBPNN.mdb First
Figure E-4: Select data source table
Figure E-5: Examine details of data and edit record status
Figure E-6: Program main interface of SensitiveNN
Figure E-7: Check learning results when NN training terminates
Figure E-8: Check Input Sensitivity for each input-output pair

Chapter 1: Introduction

Industrial Construction

Barrie et al. (1992) described industrial construction as: "Industrial construction covers a wide range of construction projects that are essential to our utilities and basic industries, such as petroleum refineries and petrochemical plants, synthetic fuel plants, fossil fuel and nuclear power plants, offshore oil/gas production facilities, cryogenic plants etc.
Industrial construction generally features large amounts of highly complex process piping, mechanical, electrical, and instrumentation work; both design and construction require the highest level of engineering expertise from multiple disciplines."

In particular, the installation of process piping systems in industrial construction is selected for productivity studies because it accounts for the bulk of the direct labor hours of an industrial contractor. Process piping is used to transport fluids between storage tanks and processing units. Installation of piping systems generally consists of two processes: (1) spool fabrication in a commercial pipe shop; (2) pipe installation in the field. Although the two processes are inseparable and can be integrated to optimize the economics of a particular situation, they are treated independently of each other in the thesis because of the current estimating and control practices of the involved company. The productivity studies described in the thesis are conducted to support the management's decision-making in the context of the company's current management systems, as opposed to radically changing these systems.

Productivity Studies

In a construction task that is performed by hand labor, productivity is commonly expressed as the labor production rate (man-hours per installed unit), which measures a key dimension of performance and is a critical factor in estimating, scheduling and control of the project (Alfeld, 1988). Little information could be found in the literature on the theoretical basis and practical considerations for measuring and analyzing labor productivity of industrial construction. In conjunction with a major industrial contractor (referred to as "the company" hereafter), productivity studies were conducted for two important activities in industrial construction: pipe installation in the field and spool fabrication in the fabrication shop.
In general, productivity studies encompass three tasks: (1) developing special methods and techniques to quantify labor productivity for estimating, and to measure actual labor productivity for on-site control, (2) identifying input factors that cause the variability in productivity, and (3) analyzing the relationships between input factors and productivity to enhance the accuracy of productivity estimating or improve the on-site performance directly. The focus of investigation is the average labor production rates (man-hours per unit) of these activities at the end of a project, rather than the daily labor production rates, because the primary objective of the research is developing ANN-based estimating tools to offer estimators valuable information about labor productivity in bidding new jobs, rather than assessing and improving the crew performance in the field.

Productivity Models

Several established models for studying productivity can be found in the literature, including work-study techniques, expectancy model, action-response model, regression model, expert systems, and artificial neural networks (NN). Work-study techniques were adopted in a number of productivity models, in which only a few factors related to work method were included (Thomas and Daily, 1983). Such work-study models cannot be used to model external and management factors. Thomas et al. (1990) and Thomas et al. (1991) discussed additional drawbacks of work-study techniques for construction productivity modeling. The expectancy model and action-response model are two alternative techniques proposed to explain variations in construction productivity. In the expectancy model, the effort that an individual is willing to exert accounts for the differences in job performance or productivity (Maloney and McFillen 1985). The action-response model graphically depicts the interaction of a number of factors that lead to the loss of productivity (Halligan et al. 1994).
Both models contribute to understanding the variations in productivity; however, neither can be used to quantify the influences of multiple factors on construction productivity (Sonmez and Rowings, 1998). Sanders and Thomas (1993) developed an additive linear regression model to study the effect of six project-related variables on masonry productivity based on data obtained from 11 projects. Eight binary variables were used in the model to represent the variations in productivity due to temperature and humidity. The effect of crew size was also taken into account in the model. The results of this regression model suggested higher productivity rates for crews with fewer members. Thomas and Sakarcan (1994) continued the research of Sanders and Thomas (1993) by developing the additive linear regression model for the purpose of forecasting labor productivity. They only included job condition variables that describe the work content and the physical components of the work. The focus of both studies was to determine the coefficients of condition variables, or the effect of a present condition on the activity productivity rate based on the results of historical study; such coefficients were derived independently of other inputs without accounting for combined effects. In addition, the determined coefficients are constants based upon the average values of historical data, and do not reflect the real situations in which the values of such coefficients may vary with the specific job conditions.

Expert systems are another technique applied to model labor productivity in two studies found in the literature. Hendrickson et al. (1987) developed a two-stage expert system named "MASON" to estimate activity durations for masonry construction. First, the maximum expected productivity was estimated. Next, this rate was adjusted for various characteristics of the job or site.
The maximum productivity estimates and the following adjustments were based on the knowledge obtained from interviews with a professional mason and a supporting laborer. Christian and Hachey (1995) developed an expert system to estimate the production rates for concrete pouring. The expert system relied on the knowledge extracted from experts and data collected from seven construction sites. The user simply queried the expert system for an estimate through a question-and-answer routine. In both expert systems, productivity was estimated through previously defined decision rules obtained from domain experts. Because the nature of formulating rules is subjective, the resultant rules may be inconsistent. Another disadvantage of analyzing productivity based on expert systems is that expert systems do not perform functional input-output mapping, i.e. quantitative evaluation of the impact of job conditions on productivity. In the following subsection, background information about ANN models will be introduced and the technique of modeling productivity using ANN will be discussed.

Artificial Neural Networks

Artificial Neural Networks (ANN) research involves multiple disciplines including biology, artificial intelligence, computer science, and mathematics, and evolves with the developments in each related discipline. Kohonen (1995) defined ANN as "massively parallel interconnected network of simple (usually adaptive) elements and their hierarchical organizations, intended to interact with the objects of the real world in the same way as the biological nervous systems do." Simply put, an ANN model is an analytical model that simulates the cognitive learning process of the human brain, and is automatically constructed from learning examples or data by trial and error without heuristic design or other human intervention.
ANN deals effectively with ill-structured problems, in which the algorithms required to solve them cannot be given in a precise and explicit fashion, or the data for a particular problem are either not complete or cannot be specified precisely (Widman et al., 1989). ANN has been found to be capable of performing parallel computations on different tasks, such as pattern recognition, linear optimization, speech recognition, and prediction (Mukherjee and Deshpande 1995). In short, the special learning algorithms of ANN are capable of performing high dimensional, non-linear input-output mapping and extracting hidden patterns and predictive information from observing the learning examples.

In recent years, ANN has been researched and applied as a convenient decision-support tool in a variety of application areas in civil engineering, including modular construction decision making (Murtaza and Fisher, 1993), structural analysis (Flood and Kartam, 1994), estimating construction productivity (Portas and AbouRizk, 1997), mode choice analysis of freight transport market (Sayed and Razavi, 1999), construction markup estimating (Li et al. 1999), measuring organizational effectiveness (Sinha and McKim, 2000), and predicting settlement during tunneling (Shi, 2000).

In our research, ANN was selected as the main methodology and utilized to analyze the variability of actual labor production rates and the sensitivity of identified influencing factors for two reasons. First, construction labor productivity is influenced by a variety of factors. Model fitting based on construction labor productivity data requires quantification of the effects of factors on labor productivity and quantification of the interactions among the factors. The task of finding a mapping function from the independent variables to the dependent variable is analogous to that performed by some of the neural network models such as back propagation (Sonmez and Rowings, 1998).
In statistics, regression analysis is the most common method to explore this relationship; in particular, the objectives and operations of nonlinear regression analysis are comparable to those of back propagation neural networks. However, regression models require the user to define a priori the parametric expression for the model (linear, quadratic, etc.). In the case of modeling productivity, the user is mainly concerned with what the productivity will be for any given set of work conditions, and may not necessarily be interested in the parametric expression of the model, for instance, a highly complex nonlinear functional equation. On the other hand, ANN is capable of nonlinear mapping for most complicated problems such as modeling productivity; the modeler does not need to exert much effort to decide on the class of relationships in a precise and explicit fashion.

Secondly, one of the attractive properties of ANN is its capacity for tolerating moderate amounts of noise in the data. In many real applications, the quantity and quality of the available data for modeling labor productivity may not support the fitting of a regression model. In such cases, ANN may be applied to generalize the knowledge from incomplete or noisy data and provide good solutions to the problem.

Moselhi et al. (1991) pointed out the possible use of ANN for construction labor productivity modeling. Portas and AbouRizk (1997) developed an ANN model to estimate construction productivity for concrete formwork tasks. The majority of data used in the study was collected by questionnaires on a project basis. The prediction of the ANN model was compared with that of senior estimators for a single project. Sonmez and Rowings (1998) developed ANN models for quantitative evaluation of the impact of multiple factors on productivity in concrete pouring, formwork, and concrete finishing tasks, using data compiled from eight building projects.
Their study also compared regression models, including the pure linear regression model and the regression models with interaction and nonlinear terms, with ANN models, and concluded, "the use of neural networks helped the overall modeling process. Neural networks have shown potential for quantitative evaluation of the effects of multiple factors on productivity, especially when interactions and nonlinear relations were present."

The problems to be solved in the thesis research were identified through investigating the current estimating and control practices of the involved company and reviewing the established ANN models and applications as found in the literature. Placed into two different perspectives, i.e. productivity studies and ANN models, the defined problems can be stated as follows:

Productivity Studies

Estimating labor production rates for field pipe installation commences with establishing base production rates for various work items. Base production rates reflect the contractor's present labor productivity level under normal work conditions that are most often encountered in the field. The installation location is one of the major considerations for an estimator to define a classification of work conditions. For example, the base production rates of pipe installation are valid for the corresponding base classification only, in which the installation location is above ground up to 12 ft high. An estimator determines a degree-of-difficulty factor (often referred to as "multiplier" in the company) for each non-base classification to adjust the base rates up or down in order to reflect the unfavorable or favorable work conditions for the job being estimated. This is a subjective decision process, requiring substantial experience and skill on the part of the estimator to determine realistic production rates for the work conditions to be encountered.
Empirical degree-of-difficulty factors for each classification of work conditions based on the installation location serve as a guide or tool to assist in deciding on such factors and can be found in the company's business manual. For example, the degree-of-difficulty factor for underground pipe installation (3 to 10 ft deep) is about two times the factor for aboveground pipe installation (up to 12 ft high), while the factor for pipe installation inside building at over 10 ft of height is about two times the factor for underground pipe installation (4 to 10 ft deep).

Historical piping productivity data of 66 projects was collected from the company and compiled into numeric format for analysis. The following two observations with regard to the actual degree-of-difficulty factors can be made from the historical data of the company: (1) the degree-of-difficulty factor for one classification of installation location may reveal a widespread distribution instead of a constant value as in the company's business manual; and (2) different classifications of installation location may end up with very close values of the degree-of-difficulty factor, not as distinguished as in the company's business manual. The above observations were not initially expected, and the explanation can be attributed to the fact that more factors exist, other than the location of installation, which contribute to the variability in labor productivity. In practice, an estimator may adjust the value of the degree-of-difficulty factor in the business manual on a job based on experience and specific job conditions, subject to the approval of senior management. Barrie et al. (1992) found that construction labor productivity may fluctuate widely due to numerous factors that affect it, and many are highly qualitative in nature, including the effect of location and regional variations, the learning curve, work schedule and work rules, environmental effects, crew experience and management factors.
Identification of input factors in the study of field pipe installation productivity was mainly based on Knowles (1997). A total of 36 input factors are considered relevant and used to redefine the classification of pipe installation. Those factors include both global project-level information and specific activity-level information.

To estimate a fabrication project, a special "unitization" scheme is applied to quantify the various work items uniformly into an abstract unit of measure called "fabrication unit" or "unit" by weighting them for their degree of difficulty. A degree-of-difficulty factor is empirically determined for each weld, taking into account pipe diameter, wall thickness of pipe, weld type (butt weld, socket weld, saddle and lateral welds) and the time required to lay out and perform the weld. Quantities of non-welding work items such as cutting, beveling, handling pipe and fittings, and installing supports are also converted into "units" by applying corresponding degree-of-difficulty factors in the scheme. Once the total "units" for a project are determined, the focus of productivity study in spool fabrication is on the production rate directly (man-hour/unit). Similar to deciding on the degree-of-difficulty factor for a classification of work conditions in field pipe installation, deciding on the unit labor rate for spool fabrication requires the experience and judgment of the estimator. The environmental effects and management factors are not considered as significant factors, as in the field productivity studies, because of the controlled shop environment, consistent policy and management personnel during the period of investigation. A total of 29 input factors are identified as affecting labor productivity of shop spool fabrication based on consultation with experienced estimators and shop superintendents in the company.
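The "unitization" scheme described above can be illustrated with a minimal sketch. The work items, quantities, and degree-of-difficulty factors below are hypothetical placeholders chosen for the arithmetic, not values from the company's scheme; the names `WorkItem`, `total_fabrication_units`, and `production_rate` are likewise assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    description: str
    quantity: float           # number of welds, cuts, supports, etc.
    difficulty_factor: float  # hypothetical "units" credited per item


def total_fabrication_units(items):
    """Convert mixed work items into a single total of fabrication units."""
    return sum(item.quantity * item.difficulty_factor for item in items)


def production_rate(man_hours, units):
    """Labor production rate in man-hours per fabrication unit."""
    if units <= 0:
        raise ValueError("total units must be positive")
    return man_hours / units


# Hypothetical project take-off (illustrative numbers only).
items = [
    WorkItem("butt weld, larger diameter", 10, 2.0),
    WorkItem("socket weld, small bore", 20, 0.5),
    WorkItem("pipe cut", 40, 0.25),
]
units = total_fabrication_units(items)   # 20 + 10 + 10 units
rate = production_rate(60.0, units)      # man-hours per unit
print(units, rate)
```

Weighting every work item into one unit of measure is what allows a single production rate (man-hour/unit) to be compared across fabrication projects with different mixes of welds and non-welding work.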
It is not straightforward to create a conventional analytical model so as to accommodate the impacts of numerous factors on the target risky variable, namely the degree-of-difficulty factor or production rate. It takes years of site experience and estimating practice for an estimator to develop his/her own mental model. The decision process relies heavily on individuals' experiences and the results are often inconsistent, reflecting the experience and disposition of the estimator. ANN has been proposed by many as an alternative to streamline the estimating process and reduce the subjective nature of the work.

ANN Models

The classic Back Propagation NN predicts a single value without giving any backup information on the risks of taking this value as correct. Observing the actual values for the degree-of-difficulty factors of field pipe installation indicates that the target risky variable lies over a relatively wide range. The result from an informal end-user survey showed that estimators are more comfortable accepting a decision support model with the capability of analyzing the uncertainty of its output. Thus, a probabilistic NN modeling approach that can predict a distribution or probability density function over the output range is preferred and has been researched.

Portas and AbouRizk (1997) proposed a feed forward back propagation neural network model for estimating construction production rates of formwork. The network outputs a single point prediction along with a number of output zones, with equal likelihood of the production rate being in any one zone. The output zones are symmetric and divided evenly across the range of likely production rate values. During training, the output zone with the output that coincides with the actual production rate is rewarded with a primary score of 1.0, representing strong certainty. A certain degree of fuzziness is considered by rewarding the 2 adjacent output zones with secondary scores of 0.5, representing weak certainty.
All the other output zones are assigned a score of 0. Once the NN is trained and inputs are entered, the NN will predict a point value as well as the likelihood of production rates being within the output zones. This model achieved limited success due to the fact that the adopted back-propagation NN model is long on non-linear regression, but short on classification.

Specht (1991) revisited Probabilistic Neural Network (PNN) and General Regression Neural Network (GRNN) algorithms with the objective of integrating statistics and neural training. GRNN/PNN is a memory-based feed forward neural network model, where the training is performed in one pass, thus requiring less training time. GRNN/PNN is able to identify a posterior distribution over the NN weight vectors and a point-value prediction is generated based on the predicted distribution. However, based on experimentations and observations, GRNN/PNN is not quite tolerant of noisy data (inaccurate or incomplete records) and imposes a demanding standard of data quality that is hard to achieve in reality. The memory demand and computing time for GRNN/PNN increase very rapidly when the dimension of input vector and the quantity of training samples increase.

Kohonen proposed two special NN models, namely Self-Organizing Map (SOM) in the late 1980s and Learning Vector Quantization (LVQ) in the middle 1990s. SOM performs unsupervised classification and clustering to represent high-dimensional, nonlinearly related data items in an illustrative, often two-dimensional display. LVQ combines unsupervised and supervised learning and is recommended for statistical pattern recognition problems. In LVQ, "decision surfaces, relating to those of the Bayesian classifier, are defined by nearest-neighbor classification with respect to sets of codebook vectors assigned to each class and describing it" (Kohonen, 1995).
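The nearest-neighbor classification with respect to codebook vectors quoted above can be sketched as follows. This is a minimal illustration of the basic LVQ1 rule only, not the networks developed in this thesis; the two-dimensional inputs, the class labels, the learning rate, and the function names are assumptions made for the example.

```python
import math


def nearest(x, codebook):
    """Index of the codebook vector closest to x (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i][0]))


def classify(x, codebook):
    """Assign x the class label of its nearest codebook vector."""
    return codebook[nearest(x, codebook)][1]


def lvq1_step(x, label, codebook, lr=0.1):
    """One LVQ1 update: pull the winning codebook vector toward x when its
    class matches the training label, push it away otherwise."""
    i = nearest(x, codebook)
    vector, cls = codebook[i]
    sign = 1.0 if cls == label else -1.0
    codebook[i] = ([v + sign * lr * (xi - v) for v, xi in zip(vector, x)], cls)


# Hypothetical codebook: one vector per class, inputs scaled to [0, 1].
codebook = [([0.0, 0.0], "typical"), ([1.0, 1.0], "non-typical")]
print(classify([0.2, 0.1], codebook))
lvq1_step([0.5, 0.0], "typical", codebook, lr=0.5)
print(codebook[0])
```

Because `classify` always returns exactly one label, the sketch also makes visible why the output of such a classifier is a single class rather than a distribution over classes.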
It is noted that the predicted result of LVQ and SOM is deterministic, being classified into one of n predefined clusters or classes. Knowles and AbouRizk (1997) presented a two-stage NN model in predicting pipe-installation labor productivity. The input factors are used to invoke an LVQ classification process, followed by a predictive one. With the classification, the model predicts whether the output is likely in a typical or non-typical range. The proper feed-forward back-propagation network is then executed. The drawback of this method is that a build-up of errors occurs when the classification fails. For instance, if the classification accuracy is 90% at the first stage of NN, and the prediction accuracy at the second stage of NN is 85%, the prediction accuracy of the whole NN is only 76.5% (90% times 85%).

In contrast with a rather wide distribution of the actual production rate in field pipe installation, the actual labor production rates for shop spool fabrication are bounded within a relatively narrow range. Thus, the NN modeling of labor productivity in the shop puts more emphasis on the sensitivity analysis of influencing factors based upon the classic back propagation NN model, as opposed to the uncertainty analysis of expected production rate. However, learning algorithms such as BPNN do not attempt to infer causality; hence, classification or prediction is based on blind correlation of new examples with previously analyzed examples, without giving information on the effect of each input parameter or influencing variable upon the predicted output variable. In the reported NN applications, model validation has thus far relied upon measuring accuracy of the calibrated network on an independent testing data set that is hidden from the neural network in learning. The model's sensitivity to changes in its parameters is generally probed by testing the response of a mature network on various input scenarios.
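The scenario-based probing mentioned above can be illustrated by changing one input at a time and recording the change in the model output. The sketch below is a generic one-at-a-time probe, not the sensitivity method developed later in this thesis; `toy_model` is a hypothetical stand-in for a trained network, and the input names and step size are assumptions.

```python
def probe_sensitivity(model, baseline, delta):
    """Perturb each input by `delta` in turn, holding the others at their
    baseline values, and return the resulting change in model output."""
    base_output = model(baseline)
    changes = {}
    for name, value in baseline.items():
        scenario = dict(baseline)
        scenario[name] = value + delta
        changes[name] = model(scenario) - base_output
    return changes


def toy_model(inputs):
    """Hypothetical stand-in for a trained network (NOT a fitted model):
    a production rate with an interaction between the two inputs."""
    workload = inputs["shop_workload"]
    rework = inputs["rework_pct"]
    return 1.0 + 0.5 * workload + 0.25 * workload * rework


baseline = {"shop_workload": 2.0, "rework_pct": 4.0}
print(probe_sensitivity(toy_model, baseline, delta=1.0))
```

Because of the interaction term, the change attributed to one input depends on the baseline value of the other, which is one reason a handful of hand-picked scenarios gives only a partial picture of how inputs affect the output.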
In short, a NN model functions like a "black box" package, giving no clue on how the answers or model outputs are obtained, or how the input parameters affect the output. Widman et al. (1989) pointed out that the credibility of an AI program frequently depends on its ability to explain its conclusions. Lack of interpretability is a pitfall of the neural network models recognized by many and has inhibited NN from achieving its full potential in real-world applications. Dhar and Stein (1997) argued that because NN algorithms such as the back-propagation NN are non-linear, high dimensional functional equations featuring parallel distributed data processing, it is hard to explicitly interpret which parameters cause what behavior in the NN model. While mathematical and operational methods do exist for the analysis of neural networks, the methods are fairly involved, and are less than satisfying because of their theoretical assumptions. They stated that "unlike most statistical methods, it can be difficult to say, even in general, which variables are significant in what respect" (Dhar and Stein, 1997).

The ultimate goal of the thesis research is to find better neural network modeling approaches to predicting production rates and productivity indices. When applied in industrial construction estimating as decision-support tools, the developed ANN-based models for analyzing productivity should be acceptable and effective to offer estimators valuable information about labor productivity in bidding new jobs. To attain this ultimate goal, the following objectives are defined in regard to three aspects:

Productivity Studies

Investigate the current estimating and on-site control practices for industrial construction as applied to the involved company, in order to advance the theoretical basis and practical considerations for measuring and analyzing labor productivity in industrial construction.
Probabilistic Neural Network Modeling

Building upon the previous developments achieved by others, establish a more effective NN approach that suits the needs of estimating industrial construction projects, which requires the creation of a new training and recall algorithm that combines the functionality of probabilistic classification and prediction in one integrated neural network.

Sensitivity Analysis of Neural Networks

Define the input sensitivity of a NN model in mathematical terms, and establish a method of interpreting the relevance and impact of NN input parameters on the predicted output variable so as to gain insight into the rationale by which NN reason and make decisions.

The main methodologies utilized to fulfill the above research objectives include the following.

Reviewing Literature to Recognize the Issues

A comprehensive literature review was conducted in regard to the established ANN models, productivity studies, ANN applications in the problem domain, optimization, statistics, and industrial construction. The literature covers a wide range of journals, books, and reports, which document the latest academic developments and industrial applications in the related areas. The literature review helps recognize the issues to be addressed in the thesis, namely, how to get data from industry in modeling labor productivity, how to analyze the uncertainty of the output from an ANN-based productivity model, and how to analyze the sensitivity of the input for an ANN-based productivity model.

Identifying Factors from Brainstorming by Domain Experts

The senior management and domain experts at the involved company, including superintendents, production engineers, construction engineers, drafting superintendents, quality control superintendents, and welding foremen, were convened for a brainstorming exercise to identify the factors that influence productivity of the studied activities.
It should be noted that the factors identified as influencing labor productivity hold only within a specific setting and over a specific period. The input factors may need adjustment by adding relevant ones and deleting irrelevant ones when the setting of application changes to a different contractor, or a different period, even if the construction process being studied remains the same.

Using Data Warehouses to Gather Quantitative Data

Identifying relevant factors and gathering high quality data for those factors are crucial to the success of modeling labor productivity using ANN. Following identification of factors, data needs to be collected. The collaborative company (PCL Industrial) provided us with access to its business data for validation of ANN models and development of ANN-based decision-support tools. Although the company has invested resources in management information systems at various business divisions, those systems were developed and implemented separately. Productivity studies using ANN require vast amounts of data from different information management systems. A corporate data warehouse is "a process by which related data from many operational systems is merged to a single, integrated business information view that spans many business divisions" (Wang, 1997). With the support of the company's management, two data warehouses, namely PipingMaster and FabMaster, were custom-developed for field pipe installation and shop spool fabrication respectively to integrate the corporate management systems of estimating, production resources planning, quality control, and labor cost control. Validating and processing of quantitative data were automated through computer programming within the data warehouses. The developed data warehouses provide a solid platform of integrated historical data from which to validate the ANN models and develop ANN-based tools for productivity analysis.
If data for some factors were not recorded in the existing management systems, questionnaire surveys were carefully designed and personnel at the company were interviewed to collect the needed data.

Questionnaire Survey

With the help of domain experts, questions and descriptive information of choices for rating on a five-point scale were formulated into a questionnaire format with the objective of reducing ambiguities and confusions. It is worth mentioning that such a questionnaire survey for modeling productivity using ANN is intended to find facts of the past projects only. Basically, in conducting the questionnaire survey, no personal judgment or opinion about the relationships between the facts and the results is involved. Questions of "What" type were asked about the factors affecting productivity only, and no questions of "Why" and "How" types were asked about the relationships between the factors and the productivity. In fact, ANN would sort out the relationships between the facts and the results on its own through an iterative learning process based on exploring sample data. Intelligence emerges when ANN finds the input-output patterns or relationships hidden in the data. This feature draws a distinct line between the ANN approach and other intelligent modeling approaches such as expert systems: ANN relies on facts and data, but requires less direct input from domain experts (Dhar and Stein, 1997). In short, modeling productivity using ANN is a relatively objective approach compared to expert systems. Figure 1-1 shows the questionnaire designed for finding additional facts about spool fabrication.

PCL INDUSTRIAL CONSTRUCTORS INC: Fabrication Facility Productivity Questionnaire
General: Reported By; Report Date; FabMaster Processed Flag; Project #; Project Name
Schedule:
  How busy was the shop? (Based on shop workload in terms of units and concurrent jobs processed)
    Very Slow / Relatively Slow / Normal / Relatively Busy / Very Busy
  Were there many rushed spools?
    None / Relatively Few (5% less) / Normal (10%) / Relatively Many (20%) / Many (30% plus)
Engineering:
  What was the rework percentage due to drawing changes?
    None / Relatively Few (5% less) / Normal (10%) / Relatively Many (20%) / Many (30% plus)
  Were there any late drawing issues?
    None / Relatively Few (5% less) / Normal (10%) / Relatively Many (20%) / Many (30% plus)
  What was the drawing revision rate?
Materials:
  Were there many material shortage problems that impacted production?
    None / Relatively Few / Normal / Relatively Many / Many

Figure 1-1: Sample Questionnaire for Finding Facts about Spool Fabrication

Following the formulation of a questionnaire, superintendents, project managers and estimators who were involved in the past projects were interviewed to compile facts and gather the needed information. The interview process was straightforward; the domain experts had no difficulty finding the records or recalling the facts on the projects that they managed.

Computer Programming

Microsoft Visual Basic, Visual Basic for Applications, and Access were found to be flexible and powerful in handling large amounts of data and complex programming logic, and hence were selected as computer programming tools to develop both the data warehouses and ANN models in the thesis research. All the programs in this thesis research, including data warehouses, ANN trainers, and ANN recall programs, were developed in house without third-party software and have been utilized in the involved company.
The solutions to the identified problems, which are provided through the thesis research, will contribute to the general knowledge of productivity studies and ANN modeling in regard to:

Advancing the theoretical basis and practical considerations for measuring and analyzing labor productivity in industrial construction, which has been documented in a paper entitled "A case study of industrial construction labor productivity" and has been submitted for publication in the Journal of Construction Engineering and Management, ASCE;

Devising a new neural network scheme to meet the requirements in modeling labor productivity of industrial construction, termed the Probability Inference Neural Network (PINN). PINN is a classification-prediction combined neural network model based on Kohonen's LVQ concept (Kohonen, 1995), but integrated with a probabilistic approach, which has been documented in a paper entitled "Estimating labor productivity using probability inference neural network" and is published in the October 2000 issue of the Journal of Computing in Civil Engineering, Vol 14(4), pp 241-248, ASCE;

Establishing a simulation-based method of interpreting the relevance and impact of back propagation NN input parameters on the predicted output variable so as to gain insight into the rationale by which back propagation NN reason and make decisions, which has been documented in a paper entitled "Sensitivity analysis of neural networks in spool fabrication productivity studies" and has been submitted for publication in the Journal of Computing in Civil Engineering, ASCE.

The developed data warehouses and ANN-based decision-support tools have been implemented or are in the process of implementation at the involved company.
The final results of the research not only assist estimators in improving the accuracy of estimating labor production rates for studied activities in bidding new jobs, but also offer the management a precise and integrated view of corporate productivity information spanning across many business divisions. The experience and lessons learned from the successful, productive and mutually beneficial collaboration between academia and industry in the thesis research will potentially benefit other university-industry joint research projects in the future.

The problems addressed in the thesis research were identified through investigating the current estimating practices in industry and understanding the real concerns of industry professionals. Emerging computer-modeling techniques such as data warehouses and ANN were researched from an academic perspective in order to meet the challenges in industry. The proposed novel ANN models and developed decision support tools were proven to be effective in both uncertainty analysis and sensitivity analysis of construction labor productivity; they were validated using real data from industry and successfully applied to assist estimators in deciding on labor production rates for new jobs.

Alfeld, L. E. (1988). Construction productivity: on-site measurement and management, McGraw-Hill, New York, NY.

Barrie, D. S., and Paulson, Jr., B. C. (1992). Professional construction management, including CM, design-construct, and general contracting, 3rd ed., McGraw-Hill, New York, NY.

Halligan, D. W., Demsetz, L. A., Brown, J. D., and Pace, C. B. (1994). "Action-response models and loss of productivity in construction." Journal of Construction Engineering and Management, ASCE, 120(1), 47-64.

Li, Y., Shen, L. Y., and Love, P. E. D. (1999). "ANN-based mark-up estimation system with self-explanatory capabilities." Journal of Construction Engineering and Management, ASCE, 125(3), 185-189.

Maloney, W. F., and McFillen, J. (1985). "Valence of and satisfaction with job outcomes." Journal of Construction Engineering and Management, ASCE, 111(1), 53-73.

Mukherjee, A., and Deshpande, J. M. (1995). "Modeling initial design process using artificial neural networks." Journal of Computing in Civil Engineering, ASCE, 9(3), 194-200.

Murtaza, M. B., and Fisher, D. J. (1993). "Neuromodex: Neural network system for modular construction decision making." Journal of Computing in Civil Engineering, ASCE, 8(2), 221-233.

Sanders, S. R., and Thomas, H. R. (1993). "Masonry productivity forecasting model." Journal of Construction Engineering and Management, ASCE, 119(1), 163-179.

Sayed, T., and Razavi, A. (1999). "Comparison of neural and conventional approaches to mode choice analysis." Journal of Computing in Civil Engineering, ASCE, 14(1), 23-30.

Shi, J. (2000). "Reducing prediction error by transforming input data for neural networks." Journal of Computing in Civil Engineering, ASCE, 14(2), 109-115.

Sinha, S. K., and McKim, R. A. (2000). "Artificial neural network for measuring organizational effectiveness." Journal of Computing in Civil Engineering, ASCE, 14(1), 9-14.

Sonmez, R., and Rowings, J. E. (1998). "Construction labor productivity modeling with neural networks." Journal of Construction Engineering and Management, ASCE, 123(6), 498-504.

Thomas, H. R., and Sakarcan, A. S. (1994). "Forecasting labor productivity using factor model." Journal of Construction Engineering and Management, ASCE, 120(1), 228-239.

Thomas, H. R. (1991). "Labor productivity and work sampling: the bottom line." Journal of Construction Engineering and Management, ASCE, 117(3), 423-444.

Thomas, H. R., Maloney, W. F., Horner, R. M., Smith, G. R., Handa, V. K., and Sanders, S. R. (1990). "Modeling construction labor productivity." Journal of Construction Engineering and Management, ASCE, 116(4), 703-725.

Thomas, H. R., and Daily, J. (1983). "Crew performance measurement via activity sampling." Journal of Construction Engineering and Management, ASCE, 109(3), 309-320.

Wang, C. B. (1997). Techno Vision II, McGraw-Hill, New York, NY.

Widman, L. E., and Loparo, K. A. (1989). "Artificial intelligence, simulation, and modeling: a critical survey." Artificial intelligence, simulation, and modeling, L. E. Widman, K. A. Loparo, and N. R. Nielsen, eds., John Wiley & Sons Ltd, New York, NY, 1-45.

Chapter 2: A Case Study of Industrial Construction Labor Productivity (1)

(1) A version of this chapter has been submitted for publication. ASCE, Journal of Construction Engineering and Management.

In a construction task that is performed by hand labor, the labor production rate (man-hours per installed unit) measures a key dimension of performance and is a critical factor to estimating, scheduling and control of the project (Alfeld, 1985). Thomas et al. (1999) identified the complexity of the design and the project management as two factors that affect labor productivity and investigated the measurements of daily labor productivity in building construction including masonry construction, concrete formwork construction, and structural steel erection. They found that good project management and consistency in design complexity result in relatively constant daily labor production rates. A good correlation was also found between the final cumulative production rate (an index of the average labor performance over the entire project period) and the variance of daily production rates. For instance, their study of masonry construction observed "high variability in daily production rates on the poor performing projects due to disruptions in the work resulting from congestion, sequencing, lack of materials, etc" (Thomas et al., 1999). Little information could be found in literature on the theoretical basis and practical considerations for measuring and analyzing labor productivity of industrial construction.
In conjunction with a major industrial contractor (referred to as "the company" hereafter), we conducted a productivity case study for two important activities in industrial construction: pipe installation in the field and spool fabrication in the fabrication shop. The focus of investigation is the average labor production rates (man-hours per unit) of these activities at the end of a project, rather than the daily labor production rates as in Thomas et al. (1999), because the primary objective of research is developing ANN-based estimating tools to offer estimators valuable information about labor productivity in bidding new jobs rather than assessing and improving the crew performance in the field. This paper intends to address: (1) how to quantify labor productivity in industrial construction from a contractor's point of view; (2) how to measure actual labor productivity in industrial construction based upon on-site control practices; and (3) how to utilize Artificial Neural Networks (ANN) to analyze the variability of actual labor production rates and the sensitivity of identified influencing factors.

The paper is organized as follows: important characteristics of industrial construction pertinent to productivity studies are first discussed in the "Industrial Construction" section. Next, the "Field Pipe Installation" section reviews the current estimating method and the present reporting and accounting systems for field pipe installation in the company, and summarizes the techniques for quantification and measurement of field pipe installation productivity. Further, the input factors that cause the variability in the productivity of field pipe installation are discussed, and a probabilistic neural network approach to modeling pipe installation productivity is overviewed.
The subsequent section "Shop Spool Fabrication" shifts the focus of productivity studies to the fabrication facilities of the company, and summarizes the techniques for quantification and measurement of spool fabrication productivity. The input factors that affect the production rate of spool fabrication are identified, and an NN-based sensitivity analysis approach to modeling spool fabrication productivity is presented.

Barrie et al. (1992) described industrial construction as: "Industrial construction covers a wide range of construction projects that are essential to our utilities and basic industries, such as petroleum refineries and petrochemical plants, synthetic fuel plants, fossil fuel and nuclear power plants, off shore oil/gas production facilities, cryogenic plants etc. Industrial construction generally features large amounts of highly complex process piping, mechanical, electrical, and instrumentation work, both design and construction require the highest level of engineering expertise from multiple disciplines." In particular, the installation of process piping systems in industrial construction is selected for productivity studies because it accounts for the bulk of direct labor hours of an industrial contractor. Process piping is used to transport fluids between storage tanks and processing units. Installation of piping systems generally consists of two processes: (1) spool fabrication in a commercial pipe shop; (2) pipe installation in the field (Gerwin, 1996). Although the two processes are inseparable and can be integrated to optimize the economics of a particular situation, they are treated independent of each other in the paper because of the current estimating and control practices of the involved company. The productivity studies described in this paper are conducted to support the management's decision-making in the context of the company's current management systems, as opposed to radically changing these systems.
Parker et al (1984) distinguished industrial construction from heavy construction in that industrial construction does not require fleets of construction equipment and plant (such as scrapers, loaders, cranes and trucks etc.) to handle basic materials (such as earth, rock, concrete and asphalt etc.). They further pointed out that industrial construction "tends to be much more labor-intensive, though some of the largest hoisting and materials-handling equipment is also required" (Parker et al, 1984). An industrial contractor usually owns the equipment or rents it from a long-term supplier; thus, the technology and machinery adopted in construction can be considered invariable for a relatively long period of time. This feature lends the productivity studies of industrial construction to the unit-cost estimating method, which is commonly applicable to labor-intensive work where "labor production rates must be independent of equipment use and vary among projects only because of differences in labor productivity" (Parker et al, 1984). For instance, considering the bid item "Pour Concrete Floor" in building construction, to estimate the total cost in terms of labor hours, work quantities are taken off in square meters of floor, then multiplied by a labor production rate, i.e., the labor hours required on one square meter of floor. Analogously, for field pipe installation in industrial construction, the amount of work-in-place is usually counted in pipe footage; field productivity for pipe installation is measured in the form of a unit rate, i.e. manhours per foot of installed pipe.
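The unit-cost calculation described above amounts to multiplying a quantity take-off by a labor production rate. The following minimal Python sketch illustrates this; the function name and all quantities and rates are hypothetical values chosen for illustration and are not taken from the company's data.

```python
def unit_cost_labor_hours(quantity: float, production_rate: float) -> float:
    """Labor hours = work quantity x labor production rate (hours per unit)."""
    if quantity < 0 or production_rate < 0:
        raise ValueError("quantity and production_rate must be non-negative")
    return quantity * production_rate


# Hypothetical bid item "Pour Concrete Floor": 500 m2 at 0.25 MH per m2.
floor_hours = unit_cost_labor_hours(quantity=500.0, production_rate=0.25)

# Hypothetical field pipe installation: 1200 ft at 0.5 MH per ft.
pipe_hours = unit_cost_labor_hours(quantity=1200.0, production_rate=0.5)

print(floor_hours, pipe_hours)
```

The same function serves both bid items because the unit-cost method only requires that the quantity and the rate share a unit of measure.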
Pipe installation in the field involves "the physical placement of pipe / pipe subassemblies, valves, and other specialty items in their required final location relative to pumps, heat exchangers, turbines, boilers, and other processing units" (Gerwin, 1996).

Productivity Quantification

In practice, pipe is customarily identified by diameter of pipe (defined by nominal pipe size) along with wall thickness of pipe (defined by schedule number). Hence, the production rate of pipe installation can be determined by the diameter and wall thickness of pipe; the larger the diameter and the thicker the pipe wall, the more labor hours are required to install one foot of pipe. Table 2-1 shows samples of the labor rates for handling and erecting straight run pipe (man-hours/ft) as found in the public source (Page and Nation, 1982).

Table 2-1: Sample of pipe installation unit labor rates (Source: Page and Nation, 1982)
Nominal Pipe Size (Diameter) | Schedule Number (Wall Thickness) | Base Labor Rate (MH/Ft)

Estimating labor production rates for field pipe installation starts with establishing base production rates for various work items. Base production rates reflect the contractor's present labor productivity level under normal work conditions that are most often encountered in the field. The installation location is one of the major considerations for an estimator to define a classification of work conditions. For example, the base production rates of pipe installation are valid for the corresponding base classification only, in which the installation location is above ground up to 12 ft high. An estimator determines a degree-of-difficulty factor (often referred to as "multiplier" in the company) for each non-base classification to adjust the base rates up or down in order to reflect the unfavorable or favorable work conditions for the job being estimated.
This is a subjective decision process, requiring substantial experience and skill on the part of the estimator to determine realistic production rates for the work conditions to be encountered. Empirical degree-of-difficulty factors for each classification of work conditions based on the installation location serve as a guide or tool to assist in deciding on such factors and can be found in the company's business manual. For example, the degree-of-difficulty factor for underground pipe installation (4 to 10 ft deep) is about two times the factor for aboveground pipe installation (up to 12 ft high), while the factor for pipe installation inside a building at over 10 ft of height is about two times the factor for underground pipe installation (4 to 10 ft deep).

Productivity Measurement

In the context of pipe installation, keeping track of piping labor by individual fittings and pipe sections is economically impractical, if not impossible, to implement in the current field reporting system of the company. Alfeld (1988) argued that measuring labor productivity requires grouping similar accomplishments and separating dissimilar accomplishments on the job site. The cost control practice of the company for field pipe installation is described next. At the end of a day, the foremen turn in time cards for their crews, charging the number of labor hours to a series of cost codes. The cost codes of field pipe installation for a particular project separate pipe fitters' hours by classifications of installation location. Thus, the total labor hours of pipe installation at various locations for one project can be readily retrieved from the field labor cost control system of the company. However, this is not the case for the amount of work accomplished.
Large amounts of various work items, along with variations in size and wall thickness of pipe, make it impractical to include details of work accomplished in the foreman's time cards, such as the amount of work-in-place counted in footage by diameter and wall thickness of pipe, the screw joint or bolt-up connections, and the valve and support installations associated with the installed pipe. Fortunately, the detailed records about the amount of work accomplished can be obtained indirectly from the company's quality control system and estimating system. Thus, we can match the actual manhours with the work accomplished for one classification of installation location, in order to compute the actual degree-of-difficulty factor ($\Phi$) as given in Equation (1):

$$\Phi = \frac{H}{\sum_{i=1}^{N} P_i \, Q_i} \qquad (1)$$

Where
$H$ is the actual labor hours charged to pipe installation in one classification of installation location,
$N$ stands for the total number of work items contained in one classification of installation location,
$P_i$ is the base labor rate for the i-th work item, and
$Q_i$ is the actual quantity accomplished for the i-th work item.

Note that the estimating process described in the preceding subsection actually transforms Equation (1) to compute the labor hours ($H$), simply by plugging the quantity take-off as read from construction drawings into the quantity term ($Q_i$) in (1). Hence, the task of estimating labor productivity boils down to determining the degree-of-difficulty factor ($\Phi$) accurately for a future project scenario. It is expected that a constant value of the degree-of-difficulty factor (or at least a narrow range) could be found for each classification from the company's historical records and should be close to the empirical value in the business manual.

Input Factors

Historical piping productivity data of 66 projects was collected from the company and compiled into numeric format for analysis.
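Equation (1) and its rearranged use for estimating can be illustrated with a short Python sketch. It assumes the factor is the ratio of actual hours to the base-rate hours for the same work, as the surrounding definitions of H, N, P_i and Q_i indicate; the function names, base rates, quantities and hours below are hypothetical.

```python
from typing import Sequence


def base_hours(base_rates: Sequence[float], quantities: Sequence[float]) -> float:
    """Sum of P_i * Q_i: hours the work would take at base production rates."""
    if len(base_rates) != len(quantities):
        raise ValueError("base_rates and quantities must have the same length")
    return sum(p * q for p, q in zip(base_rates, quantities))


def actual_difficulty_factor(actual_hours: float,
                             base_rates: Sequence[float],
                             quantities: Sequence[float]) -> float:
    """Measurement: actual hours H divided by the base-rate hours."""
    denominator = base_hours(base_rates, quantities)
    if denominator <= 0:
        raise ValueError("base-rate hours must be positive")
    return actual_hours / denominator


def estimated_hours(factor: float,
                    base_rates: Sequence[float],
                    quantities: Sequence[float]) -> float:
    """Estimating: the same relation solved for H, using take-off quantities."""
    return factor * base_hours(base_rates, quantities)


# Hypothetical classification with three work items (rates in MH/ft, quantities in ft).
rates = [0.5, 1.0, 2.0]
installed = [200.0, 100.0, 50.0]      # base-rate hours: 100 + 100 + 100 = 300
phi = actual_difficulty_factor(450.0, rates, installed)

# Hypothetical take-off for a new job in the same classification.
take_off = [100.0, 100.0, 100.0]      # base-rate hours: 50 + 100 + 200 = 350
new_job_hours = estimated_hours(phi, rates, take_off)

print(phi, new_job_hours)
```

Keeping the base-rate sum in one helper makes it explicit that measurement and estimating use the same relation, only solved for a different unknown.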
Because the data is not well formatted or readily accessible, a data warehouse was developed first to integrate the contractor's estimating system, quality control system and labor-cost control system in order to ease the burden of data collection and ensure the high quality of collected data. The following two observations with regard to the actual degree-of-difficulty factors can be made from the historical data of the company: (1) the degree-of-difficulty factor for one classification of installation location may reveal a widespread distribution instead of a constant value as in the company's business manual; (2) different classifications of installation location may end up with very close values of the degree-of-difficulty factor, not as distinguished as in the company's business manual. The above observations were not initially expected, and the explanation can be attributed to the fact that more factors exist, other than the location of installation, which contribute to the variability in labor productivity. In practice, an estimator may adjust the value of the degree-of-difficulty factor in the business manual on a job based on experience and job conditions, subject to the approval of senior management.

Barrie et al (1992) found that construction labor productivity may fluctuate wildly due to numerous factors that affect it, and many are highly qualitative in nature, including the effect of location and regional variations, the learning curve, work schedule and work rules, environmental effects, crew experience and management factors. Portas and AbouRizk (1997) determined seven categories of activity factors and five categories of project performance factors to be relevant to the labor production rate of concrete formwork construction. Thomas et al (1999) identified the complexity of the design and the project management as two major categories that affect labor productivity of masonry construction.
In regard to industrial construction, Knowles (1997) investigated a spectrum of explanatory factors to identify those that affect the productivity of pipe installation and pipe welding in the field. Identification of input factors in this study was based on Knowles 1997, with the addition of three more factors, i.e. the contract type (lump sum or reimbursable), installation of miscellaneous fittings (flanges, specials, elbows etc.), and the on-site labor charging errors between the cost code of pipe installation and that of pipe welding (since pipe fitters and welders mostly work side by side). A total of 36 input factors are considered relevant and used to redefine the classification of pipe installation. Those factors include both global project-level information and specific activity-level information, as shown in Table 2-2. Aside from location of installation, more activity-specific factors are considered, such as material type of pipe, the installation of non-pipe components (valves, supports, and miscellaneous items), non-weld joints in installation (screw joints, bolt-ups), the quantities of installed pipe at different size ranges (small bore, medium bore, and large bore), the learning curve factor (total quantity of installed pipe in footage), the crew experience etc. Factors pertinent to the project are also included, such as the effect of location and regional variations (project location, province/state), project type variations (project definition, contract type, and prefabrication percentage), work schedule and work rules (overtime and unionized), environmental effects (seasonal), management factors (superintendent and project manager) etc.
Table 2-2: Input factors to pipe installation productivity

| Factor | Description |
|---|---|
| Project Location | Urban, Rural, Camp Job |
| Administration | General Expense |
| Year of Construction | 89-93, 93-93, 95-96, 97-99 |
| Province/State | AB, SK |
| Contract Type | Reimbursable, Lump Sum |
| Client | An index derived from historical data |
| Engineering Firm | An index derived from historical data |
| Project Manager | An index derived from historical data |
| Superintendent | An index derived from historical data |
| Project Definition | Chemical, Cryogenic, Gas, Refining |
| Work Scope | Confined / Scattered |
| Project Type | Upgrade, Shutdown, Grass Root etc. |
| Prefab/Field Work | Percentages for Prefabrication |
| Average Crew Size | <25, 25-50, 50-100, >100 |
| Peak Crew Size | <25, 25-50, 50-100, 100-150, >150 |
| Unionized | Yes, No |
| Equipment & Material | Equip. & Matl Cost / Direct MH |
| Extra Work | Original Project Cost / Final Project Cost |
| Change Order | No. of Change Orders / Total Direct MH |
| Drawing & Specs Quality | 1 Poor, 3 Average, 5 Excellent |
| Location Classification | U/G on Site, Fab Shop, A/G on Site etc. |
| Total Quantity (Learning) | Total Quantity in DiaInFt |
| Installation Quantities | Qty for Size Ranges: <2", 2"-16", >16" |
| Material Type | Alloy, Carbon Steel, FRP/PVC, etc. |
| Method of Installation | Percentages of Hand Rigging |
| Pipe Supports | No. of Pipe Supports / Foot of Pipe |
| Boltups | No. of Boltups / Foot of Pipe |
| Valves | No. of Valves / Foot of Pipe |
| Screwed Joints | No. of Screwed Joints / Foot of Pipe |
| Misc. Components | Install Misc. Components MH / Foot of Pipe |
| Welding Impact | Welding Multiplier (Miscoding on Site) |
| Season | Percentages of Winter & Summer Work |
| Crew Ability | 1 Very Low, 3 Average, 5 Very High |
| Site Working Conditions | 1 Extreme Problems - 5 No Problem |
| Inspection, Safety & Quality | 1 Extremely Detailed - 5 Highly Tolerant |
| Overall Degree of Difficulty | 1 Very Low, 3 Average, 5 Very High |

It should be mentioned that a questionnaire survey was carefully designed and conducted to collect some qualitative information that is not obtainable from the company's reporting and accounting systems.
Such information was converted into numeric formats for the following NN analysis (see Lu et al, 2000 for details).

Probabilistic Neural Network Modeling

ANN has been proposed by many as an alternative to streamline the estimating process and reduce the subjective nature of the work. The classic Back Propagation NN predicts a single value without giving any backup information on the risks of taking this value as correct. Observing the actual values for the degree-of-difficulty factors of field pipe installation indicates that the target risky variable lies over a relatively wide range. The result from an informal end-user survey showed that estimators are more comfortable accepting a decision support model with the capability of analyzing the uncertainty of its output. Thus, a probabilistic NN modeling approach that can predict a distribution or probability density function over the output range is preferred and has been researched.

A new neural network scheme was devised to meet the requirements in modeling labor productivity of industrial construction, and is termed the Probability Inference Neural Network (PINN). PINN is a classification-prediction combined neural network model based on Kohonen's LVQ concept (Kohonen, 1995), but integrated with a probabilistic approach. Because the response of PINN is in the form of a probability density function (distribution) over the output range, an estimator will be able to decide on the degree-of-difficulty factor for a future scenario by combining the PINN's recommendation with personal judgment. In the PINN model, the actual output range of the target risky variable is divided into a number of output zones or sub-ranges with an equal width. Output zones are actually discrete clusters with continuous boundaries. For field pipe installation, the higher the value of the degree-of-difficulty factor, the higher the value of the labor production rate, and hence the more difficult and more demanding the job is.
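The equal-width zone scheme can be illustrated with a short Python sketch. It does not implement the PINN itself: the zone probabilities are hypothetical inputs standing in for a model's output, the zone width of 0.7 is only an example, and using each zone's midpoint as its representative value is an assumption made here for illustration.

```python
from typing import List, Sequence, Tuple


def make_zones(lower: float, width: float, n_zones: int) -> List[Tuple[float, float]]:
    """Split the output range into n_zones adjacent sub-ranges of equal width."""
    return [(lower + k * width, lower + (k + 1) * width) for k in range(n_zones)]


def summarize(zones: Sequence[Tuple[float, float]],
              probabilities: Sequence[float]) -> Tuple[int, float, float]:
    """Return (most likely zone index, its midpoint, probability-weighted average)."""
    if len(zones) != len(probabilities):
        raise ValueError("one probability is required per zone")
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    probs = [p / total for p in probabilities]           # normalize defensively
    midpoints = [(lo + hi) / 2.0 for lo, hi in zones]    # representative values
    mode_index = max(range(len(probs)), key=lambda k: probs[k])
    weighted_average = sum(p * m for p, m in zip(probs, midpoints))
    return mode_index, midpoints[mode_index], weighted_average


# Four hypothetical zones of width 0.7 starting at 0.
zones = make_zones(lower=0.0, width=0.7, n_zones=4)
# Hypothetical likelihoods of the factor falling into each zone.
mode_index, mode_value, weighted_average = summarize(zones, [0.1, 0.6, 0.2, 0.1])

print(zones[mode_index], mode_value, weighted_average)
```

An estimator would read the full distribution rather than a single number; the two summary values are only convenient point estimates derived from it.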
Thus, each output zone gives an indication of the relative work difficulty and productivity level; for instance, output zone (0-0.7] stands for easier work and a higher productivity level compared with output zone (0.7-1.4]. The median of each sub-range can be used to represent the typical value for each output zone and to derive a predicted value in addition to the predicted distribution, such as the mode and the weighted average value.

Portable computer software was developed to implement the training and testing of the PINN model on real historical productivity data of field pipe installation at the company. The model was validated based on an independent data set reserved for testing. Sensitivity analysis of the model was performed by observing the PINN's output in response to controlled changes in inputs and comparing the PINN's output against that of an experienced estimator. Following satisfactory testing and sensitivity analysis, a recall program based on the trained PINN model was implemented as a decision support tool for estimating the degree-of-difficulty factors of field pipe installation at the company. Figure 2-1 shows the output interface of the recall program, indicating the predicted probability density function over the output range, and the likelihood of the degree-of-difficulty factor falling into each sub-range. Those who are interested in the topology and algorithm of the PINN model, and the effectiveness of applying PINN to estimate labor productivity in the context of field pipe installation, may refer to Lu et al 2000.

[Figure 2-1: NN Recall Probability Density Graph]

Spool fabrication in a commercial pipe shop involves "the cutting, bending, tacking, and welding of individual pipe components to each other and their subsequent heat treatment and nondestructive examination to form a pipe subassembly or spool for installation" (Gerwin, 1996).
A pipe spool is a portion of a piping system consisting of various piping components, such as flanges, elbows, reducers, tees, supports, and pipe. These components are prefabricated into distinct assemblies that are later assembled as part of an industrial plant or production skid/module. Such prefabrication is usually performed under a controlled shop environment located away from the job site, which allows for better productivity and quality control, and hence cuts the field labor costs. Major spool fabrication processes, such as cutting, beveling, fitting, welding, and handling sections of pipe and fittings, also tend to be labor-intensive. Productivity data is collected from the fabrication shop of the company for 63 projects completed from 1995 to 1999, during which period the technologies and machines for welding and cutting in the shop remained relatively stable. Like field pipe installation discussed previously, the productivity study of spool fabrication is suited to the unit-cost estimating method.

Productivity Quantification

Alfeld (1988) pointed out that the labor production rate in the shop could not be quantified with the same units as in the field, i.e. man-hours per foot of installed pipe, because the shop does not install the pipe but cuts, fits and welds spools; other units of measure such as spool counts and pipe sections do not satisfy the needs of quantifying the work accomplished in the shop either, because (1) each spool varies so much in components, size and configuration that a simple count of spools would be misleading; and (2) large-size pipe requires far more manhours to cut and weld than small-size pipe does. The weld-inch was utilized as a unit of measure to quantify the accomplishment in a fabrication shop, and Table 2-3 shows samples of the degree-of-difficulty factors for converting various butt welds into weld inches as found in Alfeld, 1988.
Table 2-3: Sample of degree-of-difficulty factors for converting welds into units (Source: Alfeld, 1988)
Nominal Pipe Size (Diameter) | Circumference | Weld Type | Weighting Factors | Fab. Units

Similar to the concept in Alfeld 1988, in the fabrication shop of the company, a special "unitization" scheme is applied to quantify the various work items uniformly into an abstract unit of measure called "Fabrication Unit" or "Unit" by weighting them for their degree of difficulty. The "unitization" is a conversion based on a standard diameter inch along the circumference of a weld. A degree-of-difficulty factor is empirically determined for each weld, taking into account pipe diameter, wall thickness of pipe, weld type (butt weld, socket weld, saddle and lateral welds) and the time required to lay out and perform the weld. Quantities of non-welding work items such as cutting, beveling, handling pipe and fittings, and installing supports are also converted into "Units" by applying corresponding degree-of-difficulty factors in the scheme.

A commercial fabrication shop usually handles several jobs simultaneously so that it is efficient for the crew to set up and do all the same size pipe from different jobs at the same time. In the fabrication shop of the company, it is difficult enough keeping track of the manhours charged to each individual job in the shop floor control systems. Charging labor hours to each individual pipe section or fitting is considered impractical and inefficient in light of the current control technologies and management systems in the fabrication shop.
The basic formula for spool fabrication estimating is shown in Equation (2):

$$H = P \times \sum_{i=1}^{N} \left( \Phi_i \times Q_i \right) \qquad (2)$$

Where
$H$ is the total manhours charged to one job,
$P$ is the production rate (MH/Unit) for the job,
$N$ stands for the total number of work items (weld or non-weld) contained in the job,
subscript $i$ stands for the i-th work item in the job,
$\Phi_i$ is the degree-of-difficulty factor for the i-th work item in the job, and
$Q_i$ is the quantity for the i-th work item in the job in its original unit of measure, such as the weld count for a weld work item, e.g. the weld count for the "6 Nominal Pipe Size (Diameter), 40 Schedule Number (Wall Thickness), Butt-Weld Type" weld is 20.

Productivity Measurement

The term $\sum_{i=1}^{N} (\Phi_i \times Q_i)$ in Equation (2) is actually the total quantity of fabrication work in Units for the job. The first step in estimating a spool fabrication job is a process called "unitization" for computing the total units of one job. The estimator reads the quantity takeoff from spool drawings and looks up the degree-of-difficulty factor for each work item. This task is straightforward but tedious because the number of work items in a job is usually large; for example, several jobs the company completed contain over 1,000 spools, over 10,000 welds and over 10,000 pipe sections and fittings.

The difference in the degree-of-difficulty factor between field and shop should be noted. First, the degree-of-difficulty factor in the case of shop spool fabrication corresponds to each work item rather than a classification of grouped work items as in field pipe installation. Second, the degree-of-difficulty factors in the case of shop spool fabrication are held constant in the "unitization" scheme rather than treated as variables as in field pipe installation. Hence, the focus of the productivity study in spool fabrication is on the production rate directly, i.e. the $P$ term in (2) or man-hours/unit. Deciding on $P$ requires the experience and judgment of the estimator.
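The "unitization" step and Equation (2) can be sketched in a few lines of Python. This is a minimal illustration assuming each work item is described by a (degree-of-difficulty factor, quantity) pair; the function names, factors, quantities, rates and hours are all hypothetical and do not come from the company's scheme.

```python
from typing import Sequence, Tuple

# Each work item: (degree-of-difficulty factor, quantity in its original unit).
WorkItem = Tuple[float, float]


def total_units(items: Sequence[WorkItem]) -> float:
    """Unitization: sum of factor * quantity over all work items in a job."""
    return sum(factor * quantity for factor, quantity in items)


def estimated_manhours(production_rate: float, items: Sequence[WorkItem]) -> float:
    """Equation (2): H = P * total units, with P in MH per unit."""
    return production_rate * total_units(items)


def actual_production_rate(actual_hours: float, items: Sequence[WorkItem]) -> float:
    """Measurement: the same relation solved for P from recorded hours."""
    units = total_units(items)
    if units <= 0:
        raise ValueError("a job must contain a positive number of units")
    return actual_hours / units


# Hypothetical job: 20 welds at factor 2.0, 40 cuts at 0.5, 10 supports at 1.5.
job = [(2.0, 20.0), (0.5, 40.0), (1.5, 10.0)]
job_units = total_units(job)                          # 40 + 20 + 15 units
bid_hours = estimated_manhours(0.5, job)              # assumed rate of 0.5 MH/unit
observed_rate = actual_production_rate(45.0, job)     # assumed 45 actual manhours

print(job_units, bid_hours, observed_rate)
```

Because the factors are held constant in the scheme, the only quantity left for the estimator to judge is the production rate passed to `estimated_manhours`.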
Similar to the productivity study of field pipe installation, a data warehouse was built to integrate the reporting and accounting systems in the fabrication shop in order to obtain labor hours and the quantity of fabrication work on each job. The data warehouse also contains a built-in computer program, developed to automate the tedious task of quantifying about 63 fabrication jobs into "units" in a precise and consistent way. Actual production rates over the period of investigation were observed for further analysis.

Input Factors

After consulting with experienced estimators and shop superintendents in the company, a number of quantitative and qualitative factors are considered relevant to the shop labor productivity, such as:
- The material components in fabrication, i.e. the percentage of non-carbon steel (stainless, aluminum, alloy steel etc.) units over the total units, because non-carbon steel spools require extra care and more time in storage, handling and welding compared with carbon steel spools;
- The average length of pipe sections in a spool, indicated by in-line fittings (pieces) per foot of pipe in a spool. In-line fittings, such as unions, couplings, swages, reducers etc., are used to connect pipe sections in a straight line without turns or branches;
- The complexity of spool configuration, indicated by non in-line fittings (pieces) per foot of pipe in a spool and valves/supports/flanges (pieces) per foot of pipe in a spool;
- The stringency of quality control, indicated by the non-destructive test requirement, which is a percentage with respect to weld counts according to the client's specs;
- The quality of spool drawings, indicated by the drawing revision rate;
- The shop workload, indicating the shop's state of being busy or slow, and the number of concurrent jobs handled at one time;
- The effect of double handling spools between weld stations, indicated by the percentage of multi-station roll weld inches over total roll weld inches.
  A weld may be done on more than one station by different welders in the shop, depending on the welding process and the welder's qualification. It requires extra time to move spools between welding stations and lay out a weld at different stations;
- The effects of rushed spools due to client's priority, late drawing issues from the client, and material supply problems;
- The amounts of night shift and overtime, and extra work in terms of labor hours;
- The experience and proficiency of the crew, indicated by apprentice ratio, repair rate and reworked spools.

The environmental effects are not considered significant factors, as they are in the field productivity studies, because of the controlled shop environment. A couple of management factors that were initially included were dropped from the analysis after examining the collected data, in which only slight variations were observed due to the consistent management policy and management personnel during the 5-year period of investigation.

It should also be mentioned that another factor describing the complexity of spool configuration was identified by domain experts, i.e. the number of pipe pieces per foot of pipe in a spool. The sensitivity analysis results based on collected data reveal that the effect of the number of pipe pieces is very similar to that of the number of in-line fittings. Such strong correlation, generalized by the ANN model from the actual data, was presented to domain experts, who offered an explanation: pipe sections in a spool are mostly connected by in-line fittings such as unions, couplings, swages, reducers etc.; both ratios, namely in-line fittings (pieces) per foot and pipe pieces per foot, indicate the average length of pipe sections in a spool. To simplify the inputs of the model, the ratio of pipe pieces per foot was dropped from the analysis, as agreed by domain experts. Eventually, nineteen input factors that affect labor productivity of shop spool fabrication are identified, as listed in Table 2-4.
Table 2-4: Explanatory factors to spool fabrication productivity

| No. (1) | NN Input Factor (2) | Remarks (3) |
|---|---|---|
| 1 | In Line Fitting (pcs) per Foot of Pipe in Spool | A ratio indicating the average length of pipe sections in spool |
| 2 | Non In Line Fitting (pcs) per Foot of Pipe in Spool | A ratio indicating complexity of spool configuration |
| 3 | Valve (pcs) per Foot of Pipe in Spool | A ratio indicating complexity of spool configuration |
| 4 | Support (pcs) per Foot of Pipe in Spool | A ratio indicating complexity of spool configuration |
| 5 | Flange (pcs) per Foot of Pipe in Spool | A ratio indicating complexity of spool configuration |
| 6 | Multi-Station Roll Weld Inches / Total Roll Weld Inches | Multi-station roll weld requires extra handling between weld stations |
| 7 | Repair Rate | An index of crew's proficiency |
| 8 | Radiography Test Requirement | An index of quality control stringency by specs |
| 9 | Non CS Units / Total Units | Non CS component in fabrication requires extra care in storage, handling and welding |
| 10 | Shop Work Load | A 5-point rating based on shop workload in units and no. of concurrent jobs, indicating how busy the shop was |
| 11 | Drawing Revision Rate | A 5-point rating based on percent of revised spool drawings, indicating drawing quality |
| 12 | Priority Rushed Spools | A 5-point rating based on percent of rushed spools due to client priority, indicating shop work schedules |
| 13 | Rework Spools | A 5-point rating based on percent of reworked spools due to drawing errors and quality defects |
| 14 | Material Shortage Problems | A 5-point rating on efficiency of material supply |
| 15 | Late Drawing Issues | A 5-point rating based on percent of late spool drawing issuance by client that impacts fabrication |
| 16 | Night Shift MHs / Total MHs | Night shift affects labor productivity |
| 17 | Over Time MHs / Total MHs | Overtime affects labor productivity |
| 18 | Extra Work MHs / Total MHs | Extra work affects labor productivity |
| 19 | Apprenticeship MHs / Total MHs | Welder qualification system affects labor productivity: Apprentice vs. Journeyman |

Data for the identified factors is collected from the company's various management systems, including the labor cost tracking system, weld tracking system, payroll system, and material tracking system. Because data is unavailable in the current systems of the company for such factors as the material shortage problems, quantity of reworked spools, quantity of rushed spools due to priority, shop workload etc., a questionnaire survey was carefully designed and conducted with the support of the company management. The key personnel involved in the projects, including shop superintendents, project managers and coordinators, QC staff, and welding foremen, were interviewed to help recall some facts and gather the needed information.

Sensitivity Analysis of Influencing Factors

In contrast with the rather wide distribution of the actual production rate in field pipe installation, the actual labor production rates for shop spool fabrication are bounded within a relatively narrow range. Thus, the NN modeling of labor productivity in the shop puts more emphasis on the sensitivity analysis of influencing factors based upon the classic back propagation NN model, as opposed to the uncertainty analysis of the expected production rate based on the PINN model. Learning algorithms such as back-propagation NN do not give information on the effect of each input parameter or influencing variable upon the predicted output variable. The NN model's sensitivity to changes in its input factors is generally probed by testing the response of a mature network on various input scenarios. The relationships between an output variable and an input parameter were sorted out based on the NN algorithm so as to define the input sensitivity of a back-propagation NN model in exact mathematical terms in light of both normalized data and raw data (Lu et al, 2000).
For a three-layer BPNN using sigmoid transfer functions and linear normalization procedures, the input sensitivity with respect to a change of 10% of the input's relevant range is given by Equation (3):

$$\left.\frac{\partial N_n^{R}}{\partial N_p^{R}}\right|_{10\%} = \frac{MAX_n - MIN_n}{10} \; N_n \,(1 - N_n) \sum_{c=1}^{C} W_{pc} \, W_{cn} \, N_c \,(1 - N_c) \qquad (3)$$

Where subscript $p$ stands for a node in the input layer of the network; subscript $c$ stands for a node in the middle layer of the network; $C$ stands for the total number of nodes in the middle layer; subscript $n$ stands for a node in the output layer of the network; $W_{ij}$ stands for the weight of the connection between node $i$ and node $j$; $S$ stands for the input signal to a node; $N$ stands for the output signal from a node, with superscript $R$ marking a raw (un-normalized) value; $MAX_n$ is the maximum value in the data set corresponding to output node $n$; and $MIN_n$ is the minimum value in the data set corresponding to output node $n$.

From Equation (3), for a mature network, the sensitivity of an input parameter over an output variable is dependent on the current input values. A Monte Carlo simulation can be performed at the NN input space in order to observe the statistics of input sensitivity. In our research, statistical analysis of simulation results involves calculating 5 percentiles of the slope variable for each input parameter, i.e. the 10th, 25th, 50th, 75th, and 90th. The input sensitivity of all input parameters is summarized and presented in a tornado-like graph as illustrated in Figure 2-2 for the piping fabrication labor productivity NN model. The horizontal axis represents the relative input sensitivity as determined by (3), i.e. the output response (negative or positive) to a change of 10% of the relevant range in an input parameter. The vertical axis is the baseline corresponding to no output response or zero change in output. Five short vertical bars correspond to each input parameter, representing, respectively,
the five percentiles from left to right in ascending order and reflecting the central trend, the spread, and the shape of the observed slope data distribution from simulation. In short, statistical analysis of input sensitivity based on Monte Carlo simulation enables the modeler to understand the rationale of the NN's reasoning and to have pre-knowledge about the effectiveness of model implementation in a probabilistic fashion, as illustrated by the spool fabrication productivity model next.

A total of 70 records were compiled and used to train a NN model with 19 input nodes at the input layer corresponding to 19 input parameters, 19 hidden nodes at the middle layer, and 1 output node at the output layer, which is the unit labor hours. The number of hidden nodes can be determined based on trials; NN learning is found to be susceptible when the number of hidden nodes is close to the number of input nodes. The learning rate is 0.4, the momentum is 0.1, and sigmoid transfer functions are used in hidden and output nodes. After satisfactory training (standard error of the output is 0.00143), the Monte Carlo based sensitivity analysis is performed on the trained network for 10000 simulation runs.

[Figure 2-2: Sensitivity Analysis of Spool Fabrication BPNN Model. Rows list the 19 input factors: 1. In Line Fitting per Ft; 2. Non In Line Fitting per Ft; 3. Valve per Ft; 4. Support per Ft; 5. Flange per Ft; 6. Mlt Stn RW %; 7. Repair Rate; 8. RT Rate; 9. Non CS % (Un); 10. How Busy; 11. Drawing Revision; 12. Priority Rushed Spools; 13. Reworked Spools; 14. Material Problems; 15. Drawings Late; 16. % night shift; 17. % overtime; 18. % extra; 19. % apprentices]

Several independent trials, from NN training to the sensitivity analysis, were conducted on the same data set. The best trial, in which the input sensitivity of most factors followed the same trends as determined by experienced domain experts, is shown in Figure 2-2.
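The Monte Carlo procedure can be sketched in pure Python as follows. This is an illustrative reading of the sensitivity formula, not the thesis software: it assumes a three-layer network with sigmoid hidden and output nodes and no bias terms, outputs normalized linearly to [0, 1], and inputs sampled uniformly over their normalized range. The weights, output range and function names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(x: List[float], w_in: List[List[float]],
            w_out: List[float]) -> Tuple[List[float], float]:
    """Return hidden-node outputs N_c and the normalized output N_n."""
    hidden = [sigmoid(sum(x[p] * w_in[p][c] for p in range(len(x))))
              for c in range(len(w_out))]
    output = sigmoid(sum(hidden[c] * w_out[c] for c in range(len(w_out))))
    return hidden, output


def sensitivity_10pct(x: List[float], w_in: List[List[float]], w_out: List[float],
                      out_min: float, out_max: float) -> List[float]:
    """Raw-output change per 10% of each input's range, evaluated at x."""
    hidden, output = forward(x, w_in, w_out)
    scale = (out_max - out_min) / 10.0 * output * (1.0 - output)
    return [scale * sum(w_in[p][c] * w_out[c] * hidden[c] * (1.0 - hidden[c])
                        for c in range(len(w_out)))
            for p in range(len(x))]


def percentile(sorted_values: List[float], q: float) -> float:
    """Percentile by linear interpolation on an ascending list."""
    position = (len(sorted_values) - 1) * q / 100.0
    low = int(math.floor(position))
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] * (1.0 - fraction) + sorted_values[high] * fraction


def monte_carlo_percentiles(w_in: List[List[float]], w_out: List[float],
                            out_min: float, out_max: float,
                            runs: int = 10000, seed: int = 0) -> List[List[float]]:
    """10th/25th/50th/75th/90th percentiles of the sensitivity of each input."""
    rng = random.Random(seed)
    samples: List[List[float]] = [[] for _ in w_in]
    for _ in range(runs):
        x = [rng.random() for _ in w_in]   # assumed uniform on normalized range
        for p, value in enumerate(sensitivity_10pct(x, w_in, w_out, out_min, out_max)):
            samples[p].append(value)
    return [[percentile(sorted(values), q) for q in (10, 25, 50, 75, 90)]
            for values in samples]


# Hypothetical network: 3 inputs, 2 hidden nodes, 1 output; raw output range 1.0 to 3.0.
demo_w_in = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
demo_w_out = [1.2, -0.6]
demo_stats = monte_carlo_percentiles(demo_w_in, demo_w_out, 1.0, 3.0, runs=1000, seed=42)
for index, row in enumerate(demo_stats, start=1):
    print(index, [round(value, 4) for value in row])
```

Each row of `demo_stats` corresponds to one input parameter and holds the five values that would be drawn as short vertical bars in a tornado-like graph.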
An examination of Figure 2-2 reveals the relationships between the influencing factors and the fabrication productivity, which are generalized by NN through observing historical project data in the past 5 years. For example, factor 1 is about in line fitting pieces per foot of pipe in spool, which indicates the average length of pipe sections in spool. According to our domain experts, in line fittings, such as unions, couplings, swages, reducers etc., are used to connect pipe sections in a straight line without turns or branches. Thus, the more in line fitting pieces in spools, the more small sections of pipe in spools, and the easier to handle the work. From Figure 2-2, BPNN determines that the chances to decrease labor hours per unit with the increase of this ratio are about 78%, which agrees with the trend identified by domain experts. Factors 2 to 5 are four ratios indicating the complexity of spool configuration. According to our domain experts, the higher such ratios, the more complex the spools' configuration, and the tougher to fabricate the spools. From Figure 2-2, the dominant trends of the four ratios are all on the plus side, which matches the judgment of our domain experts. It is also observed from Figure 2-2 that factor 18 (extra work percentage) is relatively tightly enveloped around the baseline, which indicates that extra work is not as dominant as other factors in contributing to the variance in unit labor rates. The explanation can be partly attributed to the fact that the amount of extra work impacts the efficiency of administration or management more directly than the productivity of crew on the shop floor. Other input factors can be interpreted and validated in a similar manner, and are not elaborated further due to space limit.

In particular, the effect of material type of spool fabrication on the labor productivity was tested based on the BPNN model, because material type (carbon steel, stainless steel, aluminum etc.)
is a major consideration of an industrial estimator in adjusting unit labor hours of spool fabrication. The labor rate of non-carbon steel fabrication is empirically 1.5 times the rate of carbon steel in the company's business guideline. Twenty-four records in the data set with a 0% non-carbon steel component (100% carbon steel fabrication) were selected as testing records. Next, for each testing record, the input parameter of non-carbon steel component was changed from 0% to 100%, with other parameters intact. The testing records were then fed to the network to let the NN recall the output, i.e. the unit labor rates for non-carbon steel fabrication. The output from NN was compared against the original output of each record, i.e. the unit labor rate for carbon steel fabrication. Based on the test results in Figure 2-3, NN increases the unit labor hours on 75% of the records; the amount of decrease for 5 records, i.e. No. 1, 2, 5, 6, 9, is relatively small compared with the amount of increase for others. If the sample size is large enough, the percentage should come close to about 90%, as observed from Figure 2-2 for factor 9. On average, the ratio of non-carbon steel labor rate over carbon steel labor rate is 1.4, which is close to 1.5 as in the guideline.

[Figure 2-3: Testing Sensitivity of BPNN to Material Type. Chart title: Test NN Sensitivity by Changing Material Component from 100% CS to 100% Non-CS: 75% Records Increase, Avg. Ratio 1.38. Series: Actual (100% CS, 0% Non-CS); NN Output (0% CS, 100% Non-CS). Horizontal axis: Rec. No.]

Note that the guideline gives only an average number (1.5), while NN is able to figure different numbers for different scenarios, taking into account 19 relevant factors. In short, such a NN-based decision support tool can be more sophisticated and intelligent than the traditional business guideline to assist estimators in deciding on the labor production rate of spool fabrication.

CONCLUSIONS
Special methods are utilized in practice for the quantification and measurement of labor productivity in industrial construction. Estimating labor productivity is one of the most difficult aspects of preparing an estimate, or a control budget based on the estimate, for labor-intensive activities in industrial construction. Artificial neural networks are capable of sorting out hidden patterns and extracting predictive information from complex data sets, and are proven to be effective in both uncertainty analysis and sensitivity analysis of construction labor productivity. The NN-based decision support tools are developed to assist estimators in deciding on labor production rates for new jobs; such tools can be more sophisticated and intelligent than traditional business manuals or guidelines.

Alfeld, L. E. (1988). Construction productivity: on-site measurement and management, McGraw-Hill, New York, NY.
Gerwin, E. (1992). "Fabrication and installation of piping systems." Piping Handbook, 6th ed., Nayyar, M. L., ed., McGraw-Hill, New York, NY, 297-361.
Knowles, P. (1997). Predicting Labor Productivity Using Neural Networks, Master of Science Thesis, University of Alberta, Edmonton, AB.
Lu, M., AbouRizk, S.M. and Hermann, U.H. (2000). "Estimating labor productivity using probability inference neural networks." J. of Computing in Civil Engineering, ASCE, 14(4), 241-248.
Page, J.S. and Nation, J.G. (1982). Estimator's piping man hour manual, 3rd ed., Gulf Publishing Company Book Division, Houston.
Parker, A. D., Barrie, D. S., and Snyder, R. M. (1984). Planning and Estimating Heavy Construction, McGraw-Hill, Inc., New York, NY.
Portas, J., and AbouRizk, S.M. (1997). "Neural network model for estimating construction productivity." J. of Constr. Engrg. and Mgmt., ASCE, 123(4), 399-410.
Thomas, H. R., and Zavrski, I. (1999). "Construction baseline productivity: theory and practice." J. Constr. Engrg. and Mgmt., ASCE, 125(5), 295-303.
Chapter 3: Estimating Labor Productivity Using Probability Inference Neural Network¹

Estimating labor production rates (m-hr/unit) is both an art and a science. In general, the estimator develops the rate for a given project by starting with a "base rate" and modifying it to reflect the specific conditions he/she expects to encounter in the project being estimated. The base rate is often determined statistically from past historical data, or from industry standards. In the context of industrial productivity estimating, the estimator accordingly adjusts the rate up or down by applying a difficulty multiplier to reflect overall favorable or unfavorable conditions. In determining the difficulty multiplier, consideration is only given to a couple of major factors that are thought to affect job productivity, such as installation location of pipe (inside a fabrication shop or on the job-site), and material type of welding (e.g. carbon steel or stainless steel).

¹ A version of this chapter has been published: ASCE, Journal of Computing in Civil Engineering, October 2000, Vol. 14(4), pp. 241-248.

The challenges of this approach include the fact that it is not straightforward to create a conventional mathematical model so as to accommodate the impacts of numerous factors on the target risky variable. The decision process relies heavily on the individual's experience, and the results are often inconsistent, reflecting the experience and disposition of the estimator. Artificial neural networks have been proposed by many as an alternative for streamlining the process and reducing the subjective nature of the work. Most models, however, were based on point predictions of production rates with which estimators were uncomfortable. The point prediction by NN can be defined as a single value predicted by neural network models without any backup information on the risks of taking this value as correct.
The new NN model presented in this paper arises out of the need for accurate prediction in the form of a distribution over the output range. The estimators will be able to make a decision for a future scenario based on the results recalled by the NN model and on personal preferences and experiences. In the following section, previous NN applications in the problem domain are first reviewed.

Review of NN Applications

Moselhi, Hegazy, and Fazio (1990) cite the prediction of a realistic productivity level for a certain trade as an aspect of construction that can be modeled with neural networks. Factors such as job size, building type, overtime work and management conditions are typically considered by an estimator and can easily be manipulated for use as neural network inputs. Karshenas and Feng (1992) analyzed earth-moving equipment productivity with a neural network application. A modular neural network structure was used to make it possible to add specifications of new equipment with only a brief training session. Each module represents a distinct type of equipment that was trained with two inputs, four hidden nodes, and one output within a back propagation training algorithm. Wales and AbouRizk (1993) used neural networks as a means of applying the effects of environmental site conditions to the labor production rate on an activity. Daily average temperature, precipitation, and cumulative precipitation over the previous seven days were identified as three key environmental site conditions and used as inputs into a feed forward back propagation neural network training algorithm. The output was a productivity factor such that a value larger than 1.0 indicates that environmental site conditions produce a greater than average productivity. On the other hand, a productivity factor of less than 1.0 indicates that the environmental site conditions result in below average productivity.
Chao and Skibniewski (1994) performed a case study in which a neural network was used to predict the productivity of an excavator. They identified two main factors that affect an excavator's productivity: job conditions and operation elements. Job conditions include the characteristics of the environment such as soil conditions, and specific characteristics of the excavator and excavation such as the vertical position of the cutting edge. Operational elements, in contrast, include characteristics not directly related to the excavating operation; for example, the effect of wait time for trucks and extra tasks other than excavating. Two neural networks were used for the purpose of this case study. The first was used to estimate the excavator cycle time. Four key factors were identified as having an influence: cycle time (including swing angle), horizontal reach, vertical position, and soil type (job conditions). The output of the first network was then incorporated into the second network, which examined the effect of the operational elements on the productivity.

Portas & AbouRizk (1997) proposed a feed forward back propagation neural network model for estimating construction production rates of formwork. The network outputs a single point prediction along with a number of output zones, with equal likelihood of the production rate being in any one zone. The output zones are symmetric and divided evenly across the range of likely production rate values. During training, the output zone whose range coincides with the actual production rate is rewarded with a primary score of 1.0, representing strong certainty. A certain degree of fuzziness is considered by rewarding the 2 adjacent output zones with secondary scores of 0.5, representing weak certainty. All the other output zones are assigned a score of 0. Once the NN is trained and inputs are entered, the NN will predict a point value as well as the likelihood of production rates being within the output zones.
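The output-zone scoring described above can be illustrated with a short sketch. It is a reading of the text rather than the cited authors' code: the function names, the clamping of values at the range ends, and the example range are assumptions made for illustration.

```python
def zone_index(value, lower, upper, n_zones):
    """Index of the equal-width zone containing value (clamped to the range)."""
    width = (upper - lower) / n_zones
    idx = int((value - lower) / width)
    return max(0, min(n_zones - 1, idx))


def zone_targets(actual_rate, lower, upper, n_zones):
    """Training targets: 1.0 for the hit zone, 0.5 for its neighbours, 0 elsewhere."""
    hit = zone_index(actual_rate, lower, upper, n_zones)
    targets = [0.0] * n_zones
    targets[hit] = 1.0
    for neighbour in (hit - 1, hit + 1):
        if 0 <= neighbour < n_zones:
            targets[neighbour] = 0.5
    return targets


# Hypothetical example: production rates between 0 and 5 split into 5 zones.
example_targets = zone_targets(2.5, 0.0, 5.0, 5)
```

A zone at either end of the range has only one neighbour, so only one secondary score is assigned there.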
This model achieved limited success and its limitation was overcome by the work discussed next.

Knowles (1997) presented a two-stage NN model for predicting pipe-installation labor productivity. The input factors are used to invoke an LVQ classification process and then a predictive one. With the classification, the model predicts whether the output is likely in a typical or non-typical range. The proper feed-forward back-propagation network is then executed. The drawback of this method is that a build-up of errors occurs when the classification fails. For instance, if the classification accuracy is 90% at the first stage of NN, and the prediction accuracy at the second stage of NN is 85%, the prediction accuracy of the whole NN is only 76.5% (90% times 85%). This problem motivated the development of the model described in this paper by taking a different probabilistic approach, which is more direct and more meaningful in terms of giving a point prediction and quantifying its associated probability.

Introduction of the PINN Model

Specht (1991) revisited the Probabilistic Neural Network (PNN) and General Regression Neural Network (GRNN) algorithms with the objective of integrating statistics and neural training. GRNN/PNN is a memory-based feed forward neural network model, where the training is performed in one pass, thus requiring less training time. GRNN/PNN is able to identify a posterior distribution over the NN weight vectors, and a point-value prediction is generated based on the predicted distribution. However, based on experimentation and observation, GRNN/PNN is not quite tolerant of noisy data (inaccurate or incomplete records) and imposes a demanding standard of data quality that is hard to achieve in reality. The memory demand and computing time for GRNN/PNN increase very rapidly when the dimension of the input vector and the quantity of training samples increase.
The PINN model uses a similar topology to the GRNN/PNN model but is a refinement of it. This is because PINN generalizes the underlying statistical patterns within training data and codes those patterns into a limited number of weight vectors through iterative learning. As a result, the number of weight vectors will not proportionally increase with the increase of dimensionality and quantity of training data.

[Figure 3-1: Topology of PINN Model. Output labels: Weighted Average; Probability Density Graph]

The PINN model embeds the output zone concepts described in Portas & AbouRizk (1997). In the application domain of industrial labor productivity estimating, the profile of actual historical productivity data reveals a spread range. The range of NN output value, i.e. the production rate multiplier, is evenly divided into a number of sub-ranges, or output zones, which are actually discrete clusters with continuous boundaries. The higher the multiplier value, the more difficult and more demanding the job is, and the lower the productivity for the job. Thus, each output zone gives an indication of the relative work difficulty and productivity level; for instance, output zone [0-0.7] stands for easier work and a higher productivity level compared with output zone [0.7-1.4]. The median of each sub-range can be used to represent the typical value for each output zone and to derive a NN predicted value.

The PINN uses the same strategy as the model described in AbouRizk et al. (1999) by incorporating Kohonen's LVQ in NN learning. The main difference is that the classification and prediction networks are combined in an integrated network, which required the development of a different training and recall algorithm. Murtaza and Fisher (1993) utilized Kohonen's unsupervised learning algorithm, called self-organizing map or SOM, for modular construction decision making.
Kohonen's Learning Vector Quantization (LVQ) combines unsupervised and supervised learning and is recommended for statistical pattern recognition problems (Kohonen, 1995). Three options for the LVQ algorithms (LVQ1, LVQ2, and LVQ3) were proposed. Kohonen's research shows that each of the three LVQ variations yields similar accuracy in most statistical pattern recognition tasks, although a different philosophy underlies each algorithm. LVQ1 was utilized in the learning process of the PINN model.

Overview of the PINN Topology and Process

The topology of the PINN model is given in Figure 3-1. It is composed of four layers. The middle layers are a Kohonen classifier and a Bayesian layer. The outcome of the PINN model at the output layer is a probability density function or a distribution reflecting the likelihood of the target variable occurring in a given zone. The mode of the distribution and its mean can serve as point predictions. The PINN process consists of four stages as follows:

(1) Preparation, which deals with:
    (a) scaling data at the input layer, which will be discussed in the subsection titled "Data Pre-Processing";
    (b) setting up output zones at the Kohonen layer, which will be addressed in the subsection titled "Output Zone Setup"; and
    (c) what processing elements are and how they function at the Kohonen layer, which will be discussed in the subsection titled "Processing Element at the Kohonen Layer".

(2) Learning, which takes place between the input layer and the Kohonen layer using the LVQ algorithm. This does not involve the Bayesian layer or the output layer. This will be discussed in the subsection "NN Learning Process". Once learning is achieved, the input-output patterns are coded into the weight vectors of the processing elements at the Kohonen layer.

(3) Investigating whether the neural network has been successfully trained. This is accomplished through the following steps:
    (a) Feed the input vectors of the training and testing records into the input layer of the PINN model.
    (b) Project the input vector of one record onto the Kohonen layer by using the results of stage (2). The Euclidean distances between each processing element's weight vector and the scaled input vector are calculated. An in-zone competition occurs within every output zone at the Kohonen layer, which is detailed in the subsection titled "In-Zone Competition Strategy at Kohonen Layer". The processing element with the minimum Euclidean distance value wins.
    (c) Project the winner PE at the Kohonen layer onto the Bayesian layer. The Bayesian layer holds a probability density function (PDF) approximator. The Euclidean distance values of the winner PEs are the inputs to the PDF approximator. The subsection "PDF Approximator at Bayesian Layer" discusses the components and operations in more detail.
    (d) The output is mapped from the Bayesian layer and presented in the form of a probability density function at the output layer. Two point predictions are calculated in addition to the probability density function, namely the mode and the weighted average. The subsection "Outputs at the Output Layer" includes details about the NN outputs.
    (e) Check the NN outputs against the actual outputs of the training and testing records. If the results are satisfactory, then the neural network is declared to have been trained; otherwise, repeat stage (1) using different parameters at each layer.

(4) Recall. Once the model calibration is done, the neural network can be used to recall the output value for any given input vector, which is similar to stage (3) using the final results determined in stages (2) and (3). A sample calculation is given in the subsection "Sample of Recall Process".

Data Pre-Processing

At the input layer of the PINN model (shown in Figure 3-1), the number of input nodes corresponds to the dimension of the input vector. The dimension of the input vector depends on the number of input factors and the input data types.
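How the input dimension follows from the data types can be sketched as below, using the three data types named in this subsection ("Raw", "Rank", "Binary"). The sketch assumes that a Raw or Rank factor occupies one node scaled linearly to [0, 1], and that a Binary factor occupies one node per group with a 1 marking the matching group. The one-per-group coding, the bounds, and the three-factor schema are illustrative assumptions, not the thesis's exact encoding.

```python
def scale(value, lo, hi):
    """Linear min-max scaling to [0, 1]."""
    return (value - lo) / (hi - lo)


def encode_factor(kind, value, spec):
    """Return the list of input-node values for one factor.

    kind: "raw", "rank" or "binary".
    spec: (lo, hi) bounds for raw/rank, or the list of groups for binary.
    """
    if kind in ("raw", "rank"):
        lo, hi = spec
        return [scale(float(value), lo, hi)]
    if kind == "binary":
        return [1.0 if value == group else 0.0 for group in spec]
    raise ValueError("unknown data type: " + kind)


def encode_record(record, schema):
    """Concatenate the node values of all factors into one input vector."""
    nodes = []
    for name, kind, spec in schema:
        nodes.extend(encode_factor(kind, record[name], spec))
    return nodes


# Hypothetical three-factor schema; the bounds are made up for illustration.
schema = [
    ("Project Location", "binary", ["Urban", "Rural", "Camp Job"]),
    ("Administration", "raw", (0.0, 1.0)),
    ("Drawing & Specs Quality", "rank", (1.0, 5.0)),
]
record = {"Project Location": "Rural", "Administration": 0.235,
          "Drawing & Specs Quality": 5}
encoded = encode_record(record, schema)
```

Here three factors produce five input nodes, which is why the number of input nodes can be much larger than the number of input factors.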
Three input data types are used to define NN input factors, i.e. "Raw", "Rank", and "Binary". "Raw" is used simply for quantitative input factors, like general expense ratios, winter construction percentages, or quantities of work. "Rank" is used to convert subjective factors, like crew ability ratings, into numeric format. And "Binary" is used to group textual factors into numeric formats, like material type and project definition. It should be noted that an input factor of the "Raw" or "Rank" type corresponds to one input node at the input layer, while an input factor of the "Binary" type corresponds to a number of input nodes depending on the number of groups for the factor. For illustration, input factors and data types for the "Pipe Installation" neural network model are listed in Table 3-1. A sample record for the "Pipe Installation" NN training is also listed in Table 3-2, showing both the raw data and converted NN input data. The NN input data is normalized and scaled between 0 and 1 at the input layer. These scaled inputs will be passed forward for NN training. At the Kohonen layer all weight vectors are randomly initialized between 0 and 1.

Table 3-1: Input Factors and Data Types of PINN Model

NN Input Factor (1)              Data Type (2)  Options & Remarks (3)
Project Location                 Binary         Urban, Rural, Camp Job
Administration                   Raw            General Expense
Year of Construction             Binary         89-92, 93-94, 95-96, 97-99
Province/State                   Binary         AB, SK
Contract Type                    Binary         Reimbursable, Lump Sum
Client                           Raw            an index derived from historical data
Engineering Firm                 Raw            an index derived from historical data
Project Manager                  Raw            an index derived from historical data
Superintendent                   Raw            an index derived from historical data
Project Definition               Binary         Chemical, Cryogenic, Gas, Refining
Work Scope                       Binary         Confined / Scattered
Project Type                     Binary         Upgrade Shutdown, Grass Root etc.
Prefab/Field Work                Raw            Percentages for Prefabrication
Average Crew Size                Binary         <25, 25-50, 50-100, >100
Peak Crew Size                   Binary         <25, 25-50, 50-100, 100-150, >250
Unionized                        Binary         Yes, No
Equipment & Material             Raw            Equip. & Matl Cost / Direct MH
Extra Work                       Raw            Original Project Cost / Final Project Cost
Change Order                     Raw            No. of Change Orders / Total Direct MH
Drawing & Specs Quality          Rank           1 Poor, 3 Average, 5 Excellent
Location Classification          Binary         U/G on Site, Fab Shop, A/G on Site etc.
Total Quantity (Learning)        Raw            Total Quantity in DiaInFt
Installation Quantities          Raw            Qty for Size Ranges, <2", 2"-16", >16"
Material Type                    Binary         Alloy, Carbon Steel, FRP/PVC, etc.
Method of Installation           Raw            Percentages of Hand Rigging
Pipe Supports                    Raw            No. of Pipe Supports / Foot of Pipe
Boltups                          Raw            No. of Boltups / Foot of Pipe
Valves                           Raw            No. of Valves / Foot of Pipe
Screwed Joints                   Raw            No. of Screwed Joints / Foot of Pipe
Misc. Components                 Raw            Install Misc. Components MH / Foot of Pipe
Welding Impact                   Raw            Welding Multiplier (Miscoding on Site)
Season                           Raw            Percentages of Winter & Summer Work
Crew Ability                     Rank           1 Very Low, 3 Average, 5 Very High
Site Working Conditions          Rank           1 Extreme Problems - 5 No Problem
Inspection, Safety & Quality     Rank           1 Extremely Detailed - 5 Highly Tolerant
Overall Degree of Difficulty     Rank           1 Very Low, 3 Average, 5 Very High

Table 3-2: Input Data Sample of PINN Model

NN Input Factor (1)              Raw Data (2)                        NN Input Data (3)
Project Location                 Rural
Administration Requirement       0.235
Year of Construction             91-95
Province/State                   Alberta
Contract Type                    Reimbursable
Client                           Shell
Engineering Firm                 Colt
Project Manager                  John Doer
Superintendent                   Bob Smith
Project Definition               Chemical
Work Scope                       Confined to Specific Area
Project Type                     Plant Upgrade No Shutdown
Prefab/Field Work                10%, 90%, 0%
Average Crew Size                25-50
Peak Crew Size                   50-100
Unionized                        Yes
Equipment & Material             9.4
Extra Work                       0.661
Change Order                     0.029
Drawing & Specs Quality          Excellent
Activity Location Classification Inside <10ft High
Total Quantity (Learning)        6055
Installation Quantities          210, 905, 4940
Material Type                    Carbon Steel
Method of Installation           Hand Rigging %, Machine Rigging %
Pipe Supports                    0.45
Boltups                          4.77
Valves                           1.59
Screwed Joints                   0
Misc. Components                 3.18
Welding Impact                   1.25
Season                           Winter %, Summer %
Crew Ability                     Average
Site Working Conditions          Many Problems
Inspection, Safety & Quality     Detailed
Overall Degree of Difficulty     Low

Output Zone Setup

As discussed in the section "Introduction of the PINN Model", the likely range of output values is evenly divided into a number of output zones. The output zone boundary setup is important for PINN learning and recall. Wide zones are generally not helpful to the decision-maker and hence should be avoided. Zones that are unacceptably tight may prevent PINN from learning. It requires some trials to obtain reasonable output zone boundaries, and the following two aspects should be considered:

1. Precision requirement of the user, i.e. the zone width or sub-range that suffices for the user to make decisions.
2. Distribution of actual output data over the output zones. A uniform distribution of actual output data over all zones generally yields better results.

Processing Elements (PE) at Kohonen Layer

At the Kohonen layer, each output zone contains an equal number of processing elements. Each processing element is associated with a weight vector (also referred to as a codebook vector (Kohonen, 1995)). Visually, a weight vector is a set of links that emanate from one processing element and end at each input node, as illustrated in Figure 3-1. Thus the dimension of a weight vector is equal to that of the input vector (the number of input nodes). An output zone at the Kohonen layer can be visualized as a chip containing a number of pins (PEs). During NN training the orientation of those pins is gradually fine-tuned to capture the underlying statistical patterns within the training data (Kohonen, 1995). Our experience indicates that the number of processing elements assigned to each class should be close to the average frequency in the histogram of training data output values, i.e. the average number of training samples in one output zone.

NN Learning Process

Data of all the training records is scaled at the input layer. The scaled input data is fed into the model to calibrate the weight vectors of the Processing Elements (PE) at the Kohonen layer, using the LVQ algorithm suggested by Kohonen (1995). The learning process involves a number of iterations, each of which is comprised of the following:

1. The Euclidean distances between the input vector of a training record and each PE's weight vector are calculated. The PE that has the smallest Euclidean distance value is declared to be a global winner.
If the global winner PE does not belong to the same output zone that the actual output value of this training record falls into, the weight vector of the global winner PE is penalized according to the following equation (1):

$$W_{ij}' = W_{ij} - RR\,(X_i - W_{ij}) \qquad (1)$$

where RR stands for "Repulsion Rate", which is a learning rate to penalize the global winner $PE_j$; $X_i$ is the input vector of the training record; $W_{ij}$ is the current weight vector of the global winning $PE_j$; and $W_{ij}'$ is the updated weight vector of the global winning $PE_j$. The repulsion rate is initially set between 0 and 1, and is reduced gradually until it approaches 0 at the end of learning.

2. Following the global competition, an in-zone competition among processing elements occurs only at the output zone into which the actual output value of the training record falls. Prior to the in-zone competition, a "conscience" value is added to each PE's Euclidean distance and adjusted over the learning iterations, so as to effectively prevent one PE within a specific output zone from winning all the time and to activate as many PEs as possible in the learning process. The formulas to calculate the conscience Euclidean distance can be found throughout the pertinent literature. Interested readers can refer to Appendix II for details. The method we adopted is as follows: the "conscience" Euclidean distance between each PE's weight vector and the input vector is calculated. The PE with the shortest "conscience" Euclidean distance value is declared to be an in-zone winner. Only the in-zone winner PE is rewarded using equation (2):

$$W_{ij}' = W_{ij} + AR\,(X_i - W_{ij}) \qquad (2)$$

where AR stands for "Attraction Rate", which is a learning rate to reward the in-zone winner $PE_j$; $X_i$ is the input vector of a training record; $W_{ij}$ is the current weight vector of the in-zone winner $PE_j$; and $W_{ij}'$ is the updated weight vector of the in-zone winning $PE_j$. Like the repulsion rate, the attraction rate is initially set between 0 and 1, and is reduced gradually until it approaches 0 at the end of the learning iterations.
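The two-phase learning step described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the update rules are the standard LVQ1 forms (push a wrong-zone winner away from the input, pull the in-zone winner toward it), the conscience term is left at zero unless supplied because its formula is given only in the appendix, and all names and toy numbers are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def learn_one_record(x, actual_zone, weights, rr, ar, conscience=None):
    """Apply one PINN-style learning step in place.

    weights: weights[zone][pe] is a weight vector (list of floats).
    conscience: optional conscience[zone][pe] offsets; zeros if omitted.
    Returns (global_winner, in_zone_winner) as (zone, pe) tuples.
    """
    # Phase 1: global competition over every PE in every zone.
    g_zone, g_pe = min(
        ((z, k) for z in range(len(weights)) for k in range(len(weights[z]))),
        key=lambda zk: euclidean(x, weights[zk[0]][zk[1]]),
    )
    if g_zone != actual_zone:
        w = weights[g_zone][g_pe]
        # Repulsion: push the wrong-zone winner away from the input.
        weights[g_zone][g_pe] = [wi - rr * (xi - wi) for xi, wi in zip(x, w)]

    # Phase 2: in-zone competition inside the zone of the actual output.
    def conscience_distance(k):
        bias = 0.0 if conscience is None else conscience[actual_zone][k]
        return euclidean(x, weights[actual_zone][k]) + bias

    z_pe = min(range(len(weights[actual_zone])), key=conscience_distance)
    w = weights[actual_zone][z_pe]
    # Attraction: pull the in-zone winner toward the input.
    weights[actual_zone][z_pe] = [wi + ar * (xi - wi) for xi, wi in zip(x, w)]
    return (g_zone, g_pe), (actual_zone, z_pe)


# Hypothetical toy model: 2 zones, 2 PEs per zone, 2-dimensional inputs.
toy_weights = [[[0.0, 0.0], [1.0, 0.9]], [[0.4, 0.4], [0.9, 0.1]]]
winners = learn_one_record([0.5, 0.5], 0, toy_weights, rr=0.1, ar=0.2)
```

In a full training loop, `rr`, `ar` and the conscience offsets would be decayed or adjusted between iterations, as the text describes.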
A sample calculation of one learning iteration is presented next to illustrate the learning process. As shown in Figure 3-2, the dimension of the input vector is 12, and the output range ([0-4]) is divided into 4 output zones, i.e. [0-1], [1-2], [2-3], [3-4]. Each output zone contains 3 processing elements. Note that this sample is a simple model and serves for illustration. Problems encountered in a real situation, which are suitable for the PINN model to solve, are mostly high dimensional; the number of input nodes may exceed 100. The input vector of a training record is scaled between 0 and 1, and the weight vector of a processing element at the Kohonen layer is randomly initialized between 0 and 1. Table 3-3 shows the input vector $X_1$ and the weight vectors of the 3 processing elements in zone 1.

[Table 3-3: Scaled Input Vector and Initial Weight Vectors]

The Euclidean distance (ED) is calculated between the input vector $X_1$ and each weight vector $W_{i1}$, as $ED_{11}$ = 1.4163, $ED_{21}$ = 1.5746, and $ED_{31}$ = 1.2963. Suppose that $PE_3$ in zone 1 is the global winner PE among all the processing elements, which gives a minimum ED value of 1.2963. If the actual output of this training record does not fall into zone 1, i.e. it lies outside the sub-range [0-1], then the weight vector of $PE_3$ ($W_{31}$) is updated by equation (1) as shown in Table 3-4. Notice that the Repulsion Rate is set to 0.8 at the start of learning in the sample calculation. Kohonen (1995) recommends a smaller initial value, such as 0.06, for the Repulsion Rate and Attraction Rate for obtaining better results.

[Table 3-4: Updating Weight Vectors in First Learning Stage]

If the actual output of this training record does fall into zone 1, i.e. within the sub-range [0-1], then no weight vector is updated in the global competition. The learning process steps into the second phase. In the first training iteration, the conscience value for every processing element in zone 1 is determined to be equal to 0 (see Appendix I).
So the "conscience" Euclidean distance value is equal to the original Euclidean distance value. $PE_3$ is the in-zone competition winner, which gives the minimum "conscience" ED value of 1.2963, and its weight vector $W_{31}$ is updated by equation (2) as shown in Table 3-5.

[Figure 3-2: Operations at Bayesian Layer in Recall]

[Table 3-5: Updating Weight Vectors in Second Learning Stage]

The above learning process will iterate through all the training records for a sufficient number of runs. Notice that during the learning process the Repulsion Rate, the Attraction Rate, and the conscience value are dynamically updated to calibrate the weight vectors.

In-Zone Competition Strategy at Kohonen Layer

The in-zone competition at the Kohonen layer occurring in the recall stage differs from that occurring in the learning stage. Once adequate training is complete, the PINN is capable of mapping the input onto the output. At the Kohonen layer, for one output zone, the PE that has the shortest global Euclidean distance (no conscience value) between its weight vector and the input vector is declared to be an in-zone winner PE. Only the in-zone winner PE advances to the Bayesian layer. Unlike the GRNN/PNN, which takes the average of the PEs' Euclidean distances within one output zone as the parameter to pass forward (Specht 1988), PINN takes the minimum of the PEs' Euclidean distances within one output zone as the parameter to pass onward. The reason for the difference is that in GRNN/PNN, each PE corresponds to one training record, and different numbers of PEs lie in different output zones. In the proposed PINN, the PE does not match the training record, and an equal number of PEs dwell in each output zone and work together in the Kohonen layer of PINN to generalize the underlying patterns within the training records by implementing LVQ.
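The difference between the two pass-forward rules can be shown with a small sketch. It assumes `weights[zone][pe]` holds trained weight vectors; the function names and the one-dimensional toy data are hypothetical, and the averaging rule is included only to mirror the contrast drawn in the text.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def zone_winner_distances(x, weights):
    """Minimum Euclidean distance per zone (PINN-style recall, no conscience)."""
    return [min(euclidean(x, w) for w in zone) for zone in weights]


def zone_average_distances(x, weights):
    """Average distance per zone, the contrasting rule the text attributes to GRNN/PNN."""
    return [sum(euclidean(x, w) for w in zone) / len(zone) for zone in weights]


# Hypothetical one-dimensional example with two zones of two PEs each.
toy = [[[0.0], [1.0]], [[3.0], [5.0]]]
min_eds = zone_winner_distances([2.0], toy)
avg_eds = zone_average_distances([2.0], toy)
```

In this toy case both zones tie on the minimum distance while the averages differ, which shows that the two rules can rank zones differently for the same input.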
PDF Approximator at Bayesian Layer

As illustrated in Figure 3-2, at the Bayesian layer each output zone only contains the winner PE from the in-zone competition at the Kohonen layer in the recall stage. The main components at the Bayesian layer are a kernel function and a Softmax activation function that are used to approximate the probability density of one input vector being within each output zone, in the steps as follows:

1. The square of the Euclidean distance value of the winner PE from each zone is passed into the kernel function, which is the Gaussian function of Bayesian methods in statistics as described in Specht (1988) and shown in equation (3). If the number of output zones is N, then for each input vector $X_j$, the kernel function is evaluated N times and outputs one "q" value for each zone.

$$q_i = \exp\left[-\frac{(W_{ij} - X_j)^T (W_{ij} - X_j)}{2\sigma^2}\right] \qquad (3)$$

where i = 1, 2, 3, ..., N; N is the number of output zones; $X_j$ is one input vector fed into PINN at the input layer; $(W_{ij} - X_j)^T (W_{ij} - X_j)$ is the square of the Euclidean distance value between the input vector $X_j$ and the winner PE's weight vector $W_{ij}$ in output zone i; and $\sigma$ is a smoothing factor. $\sigma$ is the only adjustable parameter of the Gaussian function (3) and controls the shape of the probability density function. The greater $\sigma$, the more dispersed the probability density graph. $\sigma$ is critical to PINN's predicting capability and can be determined through iterative adjustments. In regard to the industrial productivity application, $\sigma$ should fall in the range between 0.8 and 1.2.

2. The Softmax activation function (Sarle, 1997), as shown in equation (4), makes the sum of the calculation results (q values) from (3) equal to one, so that the final output from the Bayesian layer can be interpreted as posterior probabilities ("p" values).

$$p_i = \frac{q_i}{\sum_{k=1}^{N} q_k} \qquad (4)$$

where $q_i$ is the output value from Gaussian function (3) for output zone i, i = 1, 2, 3, ..., N; and N is the number of output zones.
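A small numeric sketch of the two Bayesian-layer operations follows. It assumes the kernel has the form q = exp(-ED^2 / (2σ^2)) and that the normalization step divides each q by the sum of all q values; both readings appear consistent with the q and p values quoted for the sample recall calculation, where σ = 1 reproduces q = 0.636 from ED = 0.951. The function names are hypothetical.

```python
import math


def gaussian_kernel(ed, sigma):
    """q = exp(-ED^2 / (2 * sigma^2)) for one zone's winner distance."""
    return math.exp(-(ed ** 2) / (2.0 * sigma ** 2))


def normalize(q_values):
    """Scale the q values so that they sum to one (the p values)."""
    total = sum(q_values)
    return [q / total for q in q_values]


# Winner distance of zone 1 in the sample recall calculation, sigma assumed 1.
q_zone1 = gaussian_kernel(0.951, 1.0)
# q values as quoted for the four zones in the sample recall calculation.
p_values = normalize([0.636, 0.037, 0.183, 0.014])
```

A larger `sigma` flattens the q values toward each other, which is why the text describes it as controlling how dispersed the resulting probability density graph is.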
Outputs at Output Layer

At the output layer, the probability distribution predicted by the PINN is presented in the form of a probability density function graph, which portrays the uncertainty of the output value. In addition to the predicted distribution, PINN calculates two point prediction values:

1) Mode value: the median of the output zone or sub-range that has the greatest probability.
2) Weighted average value: the sum product of the median and the probability of each output zone.

The user should treat these point-value predictions carefully by checking the probability density function graph.

Sample of Recall Processing

A sample PINN recall calculation is given in the following sections for illustration. Suppose that the NN is trained and ready to recall the output for an input vector. As shown in Figure 3-1, the dimension of the input vector is 12, and the output range ([0-4]) is divided into 4 zones, i.e. [0-1], [1-2], [2-3], [3-4]. Each zone contains 3 processing elements. Table 3-6 lists the scaled input vector X and the weight vectors of the 3 processing elements in zone 1.

Table 3-6: Trained PINN Ready to Recall for a Given Input Vector

The Euclidean distance (ED) values between the weight vectors and the input vector are calculated as $ED_{11}$ = 0.9513, $ED_{21}$ = 1.1670, $ED_{31}$ = 1.3249. At output zone 1, processing element $PE_1$, with the minimum ED value (0.9513), is the in-zone competition winner and proceeds to the Bayesian layer. Suppose the winner PEs from the other 3 zones are also determined in a similar manner and proceed to the Bayesian layer. Table 3-7 lists their Euclidean distance values and the outputs from the Gaussian function and the Softmax function.

Table 3-7: Recall Calculations at Bayesian Layer

From Table 3-7, the probability of the output being within zone 1 is 0.731, hence the mode output value is found to be the median of zone 1, i.e. 0.5.
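The two point predictions defined above can be sketched as follows. The zone medians are taken as the midpoints of the four zones [0-1], [1-2], [2-3] and [3-4], and the probabilities are the p values as read from Table 3-7; the function and variable names are hypothetical.

```python
def point_predictions(medians, probabilities):
    """Return (mode value, weighted average value) for a recalled distribution.

    Mode value: median of the zone with the greatest probability.
    Weighted average value: sum product of zone medians and probabilities.
    """
    mode_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    mode_value = medians[mode_index]
    weighted_average = sum(m * p for m, p in zip(medians, probabilities))
    return mode_value, weighted_average

# Zone midpoints (assumed medians) and the p values as read from Table 3-7.
medians = [0.5, 1.5, 2.5, 3.5]
probabilities = [0.731, 0.043, 0.210, 0.016]

mode_value, weighted_average = point_predictions(medians, probabilities)
print("mode value:", mode_value)
print("weighted average value:", round(weighted_average, 3))
```

The two values differ whenever noticeable probability lies outside the most likely zone, which is why the point predictions are best read together with the probability density graph.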
The weighted average output value is obtained by calculating the sum-product of the Softmax output (p values) and the median of each output zone, i.e. 0.5 × 0.731 + 1.5 × 0.043 + 2.5 × 0.210 + 3.5 × 0.016 ≈ 1.01.

Values for Each Zone (1)   Zone 1 (2)   Zone 2 (3)   Zone 3 (4)   Zone 4 (5)
Median                     0.500        1.500        2.500        3.500
Winner PE's ED             0.951        2.567        1.843        3.922
Gaussian Output q          0.636        0.037        0.183        0.014
Softmax Output p           0.731        0.043        0.210        0.016

IMPLEMENTATION OF THE PINN MODEL

Computer software based on the PINN model is developed for learning and testing in the environment of MS Access 97 and Visual Basic for Applications. Historical piping productivity data of 66 projects, resulting in 119 records, of a construction company was collected and compiled into NN input data for three labor-intensive activities, i.e. pipe installation, pipe welding and pipe hydro-testing. In the following sections, pipe installation is used to illustrate the testing and validation of the PINN model.

The PINN model for "pipe installation" has a total number of 81 input nodes. (The input factors and sample data are shown in Tables 3-1 and 3-2.) The output range is divided into 20 output zones with an equal width of 0.72. Ten processing elements are assigned to each output zone. The attraction rate and repulsion rate are both equal to 0.06. The smoothing factor of the kernel function is equal to 0.8. One hundred one records are used for PINN learning, while 18 records are reserved to test the calibrated model. The learning process takes 300 iterations.

Validation of the PINN Model on Testing Data

The testing of the calibrated network on the 18 unseen records is summarized in Figure 3-3. Measured against the actual output values of the test data, for the mode value the average absolute error is 0.57 and the maximum absolute error is 2.02; for the weighted average value, the average absolute error is 0.75 and the maximum absolute error is 2.23. Considering a wide output range of about 15, the error is reasonable and acceptable.
To compare the PINN model with a back propagation neural network, the same training records were used to train a three-layer feed-forward back propagation neural network, which has 81 input nodes at the input layer, 40 hidden nodes at the middle layer and 1 output node at the output layer. The training parameters are a learning rate equal to 0.8 and a momentum rate equal to O.+; the transfer function is the symmetric logistic function. After training was completed, the testing set of 18 records previously used to test PINN was fed into the model. The testing results of the back propagation NN model compared with those of the PINN are shown in Figure 3-3. From Figure 3-3, it is observed that the PINN model outperforms the back propagation NN model in terms of point prediction accuracy, coming closer to the actual output values.

Figure 3-3: Comparison of PINN and Back Propagation NN (series: PINN (Mode) and FFBP NN; horizontal axis: Test Record #, 1 to 18)

Figure 3-4: PINN Output for the Base Case Scenario

Sensitivity Analysis of the PINN Model

A recall program based on the trained PINN model for pipe installation was developed so as to validate its effectiveness and accuracy in the context of the application domain. "What if" scenarios are tested by changing some of the NN model input factors in order to understand the impact of such changes on the output values. The response of the NN model is compared against that of an experienced estimator at the participating construction company for the purpose of model validation. The base case scenario is taken from one testing record. The actual difficulty multiplier for this scenario is 1.24.
The mode value predicted by the PINN model is 1.181, giving an absolute error of |1.181 - 1.24| = 0.059; and the weighted average value is 1.313, giving an absolute error of |1.313 - 1.24| = 0.073. Figure 3-4 shows the predicted probability function or distribution; the chance of the output falling into zone 2 ([0.8-1.5], median = 1.181) is 69%. In the following validation tests, the actual values remain unknown, so the responses of the estimator based on personal experience and common sense serve as a benchmark to measure the performance of the PINN model. The estimator responds with a trend or direction instead of a precise number because there are so many other input factors to take into account.

The location of pipe installation is a major consideration when the experienced estimator determines the pipe installation productivity. The installation location for the base case scenario is "Piping within a fabrication shop"; what if the location is changed to "Operating plant installation on the site"? The experienced estimator responds by increasing the difficulty multiplier to a certain extent to reflect the unfavorable job conditions. The response of the PINN model is shown in Figure 3-5. It is observed that the mode value increases to 1.903 and the weighted average value increases to 1.983; the chance of the output value falling into zone 3 ([1.5-2.3], median = 1.903) increases from 13% in the base case scenario to 78%. The PINN has taken the same direction as the estimator in the decision process for this scenario.

Figure 3-5: PINN Output for Scenario 1

In the base case scenario, the job is done 100% in the winter in Alberta. What if the job is done 100% in the summer? The estimator anticipates a reduction of the difficulty multiplier, which means an increase in productivity level. The response of the PINN model is shown in Figure 3-6.
It is observed that the mode value remains 1.181; however, the weighted average value decreases to 1.151. The chance of the output value falling into zone 2 ([0.8-1.5], median = 1.181) decreases significantly from 69% to 44%, while the chance of the output value falling into zone 1 ([0.2-0.8], median = 0.5) increases significantly from 15% in the base case scenario to 40%. Again the PINN chooses a similar course of action as the estimator in the decision process for this scenario.

Figure 3-6: PINN Output for Scenario 2

The PINN model creates a meaningful representation of a complex, real-life situation in the problem domain and is effective in dealing with high dimensional input-output mapping with multiple influential factors in a probabilistic approach. The application of the PINN model in industrial labor production rate estimating helps the estimator choose a course of action by giving a better understanding of the project information available and the possible outcomes that could occur. Because the probability density of each output zone is provided, the predicted distribution and point-prediction values give the estimator much more confidence in the predicted result. In combination with personal experience and preferences, the labor production rate for a new project can be determined.

Chao, L.C., and Skibniewski, M.J. (1993). "Estimating Construction Productivity: Neural-Network-Based Approach." Journal of Computing in Civil Engineering, ASCE, 8(2), 234-251.

Hermann, R. and Lu, M. (1997). "Application of Neural Networks in Industrial Estimating." Proceedings of the 27th Canadian Society of Civil Engineers Annual Conference, Edmonton, AB, 15-35.

Knowles, P. (1997). "Predicting Labor Productivity Using Neural Networks." Master of Science Thesis, University of Alberta, Edmonton, AB.

Karshenas, S. and Feng, X. (1992). "Application of Neural Networks in Earthmoving Equipment Production Estimating."
Computing in Civil Engineering, Proceedings of the Eighth Conference, Dallas, Texas, 841-847.

Kohonen, T. (1995). "Self-Organizing Maps." Springer Series in Information Sciences, Springer, London, U.K.

Moselhi, O., Hegazy, T., and Fazio, P. (1990). "Neural Networks as Tools in Construction." Journal of Construction Engineering and Management, ASCE, 117(4), 606-623.

Murtaza, M.B., and Fisher, D.J. (1993). "Neuromodex: Neural Network System for Modular Construction Decision Making." Journal of Computing in Civil Engineering, ASCE, 8(2), 221-233.

Portas, J., and AbouRizk, S.M. (1997). "Neural Network Model for Estimating Construction Productivity." J. of Constr. Engrg. & Mgmt., ASCE, 123(4), 399-410.

Sarle, W.S., ed. (1997). Neural Network FAQ, part 1 of 7: Introduction, periodic posting to the Usenet newsgroup comp.ai.neural-nets, URL: ftp://ftp.sas.com/pub/neural/FAQ.html

Specht, D.F. (1988). "Probabilistic Neural Networks for Classification, Mapping, or Associative Memory." IEEE International Conference on Neural Networks, 1988, 1, 525-532.

Specht, D.F. (1991). "General Regression Neural Networks." IEEE Transactions on Neural Networks, Nov. 1991, 2, 568-576.

"Conscience" Euclidean distance is defined as (5):

$$D_i' = D_i + C_i \tag{5}$$

where $D_i$ is the Euclidean distance and $C_i$ is the conscience value (6):

$$C_i = cf \times D \times (n \times wf - 1) \tag{6}$$

where D is the maximum Euclidean distance out of the global competition in the previous supervised learning stage; cf is a Conscience Factor, which is initially set between 0 and 1 by the user; n is the number of PEs per output zone; and wf is defined as "Win Frequency". The initial estimate of the Win Frequency value is set to the reciprocal of the PE number per output zone for all the PEs, i.e. 1/n. With NN learning ongoing, both the Conscience Factor (cf) and the Frequency Estimate (fe) are reduced gradually until they approach 0 at the end of learning.
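The effect of the conscience value on the in-zone competition can be sketched as below. The sketch reads (5) as D' = D + C and (6) as C = cf × D_max × (n × wf - 1), where D_max is the maximum Euclidean distance from the global competition; the function names and all numeric settings (cf = 0.5, D_max = 2.0, n = 3) are hypothetical choices made only for illustration.

```python
def conscience_value(cf, d_max, n, wf):
    """Assumed reading of (6): C = cf * D_max * (n * wf - 1)."""
    return cf * d_max * (n * wf - 1.0)

def conscience_distance(d, cf, d_max, n, wf):
    """Assumed reading of (5): D' = D + C."""
    return d + conscience_value(cf, d_max, n, wf)

# Hypothetical settings: 3 PEs per zone, conscience factor 0.5, D_max = 2.0.
n, cf, d_max = 3, 0.5, 2.0

# At the initial win frequency 1/n the conscience value vanishes,
# so the "conscience" distance equals the plain Euclidean distance.
initial = conscience_distance(1.2963, cf, d_max, n, wf=1.0 / n)

# A PE that has been winning more often than 1/n of the time is penalized,
# while a PE that rarely wins receives a negative conscience value.
frequent_winner = conscience_distance(1.2963, cf, d_max, n, wf=0.5)
rare_winner = conscience_distance(1.2963, cf, d_max, n, wf=0.2)
print(initial, frequent_winner, rare_winner)
```

Because the penalty grows with the win frequency, a PE that wins repeatedly sees its effective distance increase, which gives the other PEs in the same zone a better chance in later iterations.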
During the unsupervised learning stage, the wf value of the in-zone winner PE is updated as in (7), and the wf values of the in-zone loser PEs are updated as in (8). Basically, these formulas are intended to increase the wf value for the winner PE and hence increase its conscience value, so that the winner PE will have less chance to win again than the other loser PEs in the following learning iterations.

Artificial neural networks (NN) mimic the cognitive learning process in the human brain, and deal effectively with ill-structured problems, in which the algorithms required to solve them cannot be given in a precise and explicit fashion, or the data for a particular problem are either not complete or cannot be specified precisely (Widman et al., 1989). NN has been found to be capable of performing parallel computations on different tasks, such as pattern recognition, linear optimization, speech recognition, and prediction (Mukherjee and Deshpande 1995).

A version of this chapter has been submitted for publication. ASCE, Journal of Computing in Civil Engineering.

In recent years, Back Propagation NN (BPNN) has been researched and applied as a convenient decision-support tool in a variety of application areas in civil engineering, including modular construction decision making (Murtaza and Fisher, 1993), structural analysis (Flood and Kartam, 1994), estimating construction productivity (Portas and AbouRizk, 1997), mode choice analysis of the freight transport market (Sayed and Razavi, 1999), construction markup estimating (Li et al., 1999), measuring organizational effectiveness (Sinha and McKim, 2000), and predicting settlement during tunneling (Shi, 2000). The special learning algorithms of BPNN are capable of performing high dimensional, non-linear input-output mapping and extracting hidden patterns and predictive information from observing the learning examples.
However, learning algorithms such as BPNN do not attempt to infer causality; hence, classification or prediction is based on blind correlation of new examples with previously analyzed examples, without giving information on the effect of each input parameter or influencing variable upon the predicted output variable. In the reported NN applications, model validation has thus far relied upon measuring the accuracy of the calibrated network on an independent testing data set that is hidden from the neural network in learning. The model's sensitivity to changes in its parameters is generally probed by testing the response of a mature network on various input scenarios. In short, a NN model functions like a "black box" package, giving no clue on (1) how the answers or model outputs are obtained; (2) how the input parameters affect the output.

Widman et al. (1989) pointed out that the credibility of an AI program frequently depends on its ability to explain its conclusions. Lack of interpretability is a pitfall of neural network models recognized by many and has inhibited NN from achieving its full potential in real-world applications. Dhar and Stein (1997) argued that because NN algorithms such as the back-propagation NN are non-linear, high dimensional functional equations featuring parallel distributed data processing, it is hard to explicitly interpret which parameters cause what behavior in the NN model. While mathematical and operational methods do exist for the analysis of neural networks, the methods are fairly involved, and are less than satisfying because of their theoretical assumptions. They stated that "unlike most statistical methods, it can be difficult to say, even in general, which variables are significant in what respect" (Dhar et al., 1997).

Our research intends to address the identified issue by concentrating on sensitivity analysis of BPNN.
Similar to regression analysis, the sensitivity of an NN input parameter could be expressed as the first-order partial derivative between an NN output variable and the input parameter. In the "Literature Review" section, we briefly introduce several related methods for knowledge explanation and factor analysis of NN found in literature. In the section entitled "BPNN Algorithm and Input Sensitivity", the back-propagation NN algorithm is described first, followed by the derivation of mathematical relationships between an NN output variable and the NN input space in light of both normalized data and original data. The following section, "BPNN vs. Regression Analysis", discusses the difference between BPNN and regression analysis of statistics and demonstrates the sophistication and superiority of BPNN over regression analysis in a case study based on a small data set. Next, statistical analysis of input sensitivity based on Monte Carlo simulation is described in the section entitled "Statistical Analysis of Input Sensitivity" to understand the rationale of BPNN's reasoning and the effectiveness of model implementation in a probabilistic fashion. In the "Industrial Application" section, the new approach is applied to estimate the labor productivity of spool fabrication in an industrial setting, and important aspects of the application, including problem definition, factor identification, data collection, and the testing results based on a real data set, are discussed and presented.

Li et al. (1999) realized that the inability of BPNN to provide explanations of its output negatively affects the user acceptance of BPNN. They investigated the use of the KT-1 method for automatically extracting rules from a mature BPNN in an attempt to explain why and how BPNN makes a particular recommendation in developing a decision support tool to estimate the construction markup.
The KT-1 method is a heuristic approach to generating confirming/disconfirming rules from each hidden or output node based on the weighted connections and the threshold value of each node, and is constrained by the complexity of the network structure. As pointed out by the authors, such automatic rule-extraction systems as KT-1 cannot warrant a fully informative explanation facility because KT-1 lacks the associative knowledge (i.e. common sense, professional knowledge etc.) in its rule-extracting process (Li et al., 1999).

Sinha and McKim (2000) utilized BPNN to measure the organizational effectiveness of construction firms. They applied statistical analysis methods, such as Principal Component Analysis (PCA), stepwise regression and correlation analysis, on a BPNN model in an attempt to identify the dominant factors that influence the target output variable, and further to reduce the dimension of the input space. However, the theoretical underpinning of such statistical techniques requires careful study of their applicability in solving real problems. For instance, lacking an awareness of the assumptions of least squares regression (normality, homoscedasticity, independence of errors, and linearity) may cause the misuse of regression and correlation analysis (Levine et al., 1998); use of PCA, which assumes linear relationships between variables, might bias the selection of determinant factors by excluding those that have non-linear relationships with the target output variable (Refenes et al., 1995).

Alternative approaches to factor analysis include using auto-associative back-propagation neural networks to perform non-linear dimension reduction and sensitivity analysis (Carroll and Ruppert, 1988). The neural network has one hidden layer with k hidden processing elements, where k is less than the dimension of the input space n. The output space is a replica of the input space.
Analogous to the rationale of PCA, this method compresses data by representing many variables by a few components: if we can reproduce the input space using k (k < n) hidden processing elements without loss of information, then the activation values of the k processing elements in the hidden layer will compute the first k principal components of the input space, under appropriate conditions. One pitfall of this method observed by Refenes et al. (1995) is that the stochastic nature of the data-generating process at the neural network input space may cause high variance in the analysis results. It is also noted that the output variable is excluded from the auto-associative neural network analysis; hence, such analysis of input parameters or influencing factors does not take into account the relationships between the input parameters and the output variable.

Explanations of the importance of input parameters can also be obtained by examining the weights of a mature network so as to characterize the strengths of the relationship between inputs and outputs. Knowles and AbouRizk (1997) added up the absolute values of the weights from one input node to every hidden processing element in a trained BPNN model with only one hidden layer for estimating pipe installation productivity. The total weight value of an input node may indicate the intensity of the connection from the input node to the hidden layer of the network; the higher the sum value, the more significant the input parameter is. Although this heuristic approach is straightforward to understand and easy to use, the results may be unstable or inaccurate due to the fact that it fails to take into account the connections between the hidden layer and the output layer.

In modeling the behavioral mode choice of the U.S.
freight transport market, Sayed and Razavi (1999) combined the learning ability of BPNN and the transparent nature of fuzzy logic in order to explain the knowledge contained in a BPNN model, which is stored in the form of a weight matrix that is hard to interpret. The neurofuzzy model facilitates the selection of significant variables that affect the output and displays the stored knowledge in terms of fuzzy linguistic rules (Sayed and Razavi, 1999).

Based on the above methods found in literature, the effect of each input parameter on the output variable in terms of magnitude and direction, i.e. the input sensitivity, still remains unknown. In the following section the algorithmic aspects of the BPNN model are studied in order to define the input sensitivity in an exact mathematical term.

Figure 4-1: Structure of Back-Propagation NN Model

BPNN ALGORITHM AND INPUT SENSITIVITY

From the biological perspective, BPNN was originally proposed as an AI model to simulate the cognitive learning process in the human brain, in which millions of neurons are interconnected and interact with one another through complex electrochemical reactions and signal processing. An artificial NN model such as BPNN is merely an over-simplified representation of the real NN in terms of mechanism and structure. A typical BPNN has a multi-layer structure. Each layer contains a number of processing elements (PE) or nodes, which are fully interconnected between layers (Figure 4-1). The intensity of connection between two processing elements is represented using a weight. Put into the perspective of mathematics, BPNN is essentially a gradient-descent optimization algorithm to search for the optima in a high-dimensional weight space with the objective of minimizing the global error between NN output values and actual output values. An iterative weight-adjusting scheme is used to modify the weights of all the connections in the NN structure in a stepwise fashion.
BPNN Algorithm

The basic formulae to describe signal processing of a PE in BPNN are simply:

$$S_c = \sum_{i} W_{ic} N_i \tag{1}$$

$$N_c = \frac{1}{1 + e^{-S_c}} \tag{2}$$

where: subscript c stands for a processing element in the hidden layer or output layer of BPNN; subscript i is the node index at the previous layer of BPNN; $W_{ic}$ stands for a weight value between node i and node c; S stands for the input signal to a node; N stands for the output signal from a node.

Equation (1) shows that a processing element receives a weighted linear combination of input signals from the previous layer. Equation (2) is the Sigmoid (logistic) function and is the most commonly used transfer (squashing) function in BPNN, through which a processing element transforms the input signal into an output signal. Note that a bias node with constant input value -1 from the previous layer is also connected to a processing element and involved in the calculation, representing the activation threshold of a processing element (Figure 4-1). The digital signals flow through the BPNN following (1) and (2) from layer to layer until the output layer is reached.

The global error (E) of the BPNN optimization search is expressed in (3), where: N stands for the output signal from BPNN; D stands for the target value; subscript i is the index of records in the training data set and T stands for the size of the training data set.

The weight of the BPNN is adjusted using the delta rule to move in the opposite direction of $\partial E / \partial W_{pc}$ as (4):

$$\Delta W_{pc} = -\lambda \frac{\partial E}{\partial W_{pc}} = -\lambda \cdot N_p \cdot \frac{\partial E}{\partial S_c} \tag{4}$$

where $\lambda$ is a gain ratio in (0,1), also called the learning rate, which sets the pace of BPNN learning; subscript p stands for a processing element in the previous layer of the network; subscript c stands for a processing element in the current layer of the network.

For a processing element at the output layer of BPNN, $\partial E / \partial S_c$ is given by (5). For a processing element at the hidden layer of BPNN,

$$\frac{\partial E}{\partial S_c} = N_c (1 - N_c) \sum_{n=1}^{J} W_{cn} \frac{\partial E}{\partial S_n} \tag{6}$$

In (6), subscript n is the index of processing elements in the next layer, and J is the total number of processing elements in the next layer.
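The signal processing of one PE and a single delta-rule step can be sketched as follows. The sketch assumes that (1) is the weighted sum of incoming signals, (2) is the logistic function, and (4) has the usual form in which each weight changes by -λ × N_p × ∂E/∂S_c. The single-record error E = 0.5 × (N - D)^2 used to obtain ∂E/∂S_c at the output PE is a common textbook choice and is an assumption here, as are the function names and all numeric values.

```python
import math

def sigmoid(s):
    """Logistic transfer function, the assumed form of (2)."""
    return 1.0 / (1.0 + math.exp(-s))

def pe_forward(signals, weights):
    """Assumed forms of (1) and (2): weighted sum of signals, then sigmoid.

    The last entry of `signals` is the bias node's constant input -1.
    """
    s = sum(w * n for w, n in zip(weights, signals))
    return sigmoid(s)

def delta_rule_step(weights, signals, dE_dS, learning_rate):
    """Assumed form of (4): each weight moves against the error gradient."""
    return [w - learning_rate * n * dE_dS for w, n in zip(weights, signals)]

# Hypothetical output PE with two incoming signals plus the bias input -1.
signals = [0.2, 0.7, -1.0]
weights = [0.5, -0.3, 0.1]
target = 0.9
learning_rate = 0.5

before = pe_forward(signals, weights)
# Assumed single-record error E = 0.5 * (N - D)^2, which gives
# dE/dS = (N - D) * N * (1 - N) at an output PE.
dE_dS = (before - target) * before * (1.0 - before)
weights = delta_rule_step(weights, signals, dE_dS, learning_rate)
after = pe_forward(signals, weights)

error_before = 0.5 * (before - target) ** 2
error_after = 0.5 * (after - target) ** 2
print("error before:", error_before, "error after:", error_after)
```

One step moves the output toward the target and lowers the error for this record; repeated passes over all training records are what drive the global error down.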
Usually, a momentum term is added to the weight adjusting scheme to take into account the weight change in the previous step, as shown in (7):

$$\Delta W_{pc} = -\lambda \cdot N_p \cdot \frac{\partial E}{\partial S_c} + \mu \cdot \Delta W'_{pc} \tag{7}$$

where $\Delta W'_{pc}$ is the weight change in the previous step, and $\mu$ is the momentum ratio, which is usually less than $\lambda$. BPNN adjusts weights following (7) by observing the training data set repeatedly, until the global error E is reduced to an accepted level to declare the BPNN to be trained.

The BPNN should have at least three layers: the input layer, one hidden layer, and the output layer. The three-layer-structured BPNN with Sigmoid transfer functions has been found by many to be adequate in solving non-linear optimization problems. In the following sections, we use the three-layer Sigmoidal BPNN to illustrate the mathematical inferences of input sensitivity for simplicity of representation. However, interested readers can readily extend the derived input sensitivity to BPNN with more complex structures and other transfer functions.

Input Sensitivity Based on Normalized Data

Based on the BPNN algorithm presented in previous sections, we can sort out the relationships between an output variable and an input parameter to define the input sensitivity of BPNN in an exact mathematical term. Notations used in the following mathematical formulae are listed as below:

Subscript p stands for a node in the Previous layer of the network.
Subscript c stands for a node in the Current layer of the network; C stands for the total number of nodes in the current layer.
Subscript n stands for a node in the Next layer of the network.
$W_{ij}$ stands for the weight of connection between node i and node j.
S stands for the input signal to a node.
N stands for the output signal from a node.

If the current node is an input node (in the first layer of the network), S is the normalized input data in range (0,1), and

$$N_c = S_c \tag{8}$$
Figure 4-2: Illustration for Node and Layer Representations (Previous Layer, Current Layer, Next Layer)

If the current node is not an input node (in a hidden layer or an output layer),

$$S_c = W_{pc} N_p + \sum_{i \neq p} W_{ic} N_i \tag{9}$$

Note that equation (9) is the same as (1) except that (9) distinguishes one node or processing element p from the others at the previous layer. The relationship between the output signal $N_c$ and the input signal $S_c$ is defined in (2), from which we have:

$$\frac{\partial N_c}{\partial S_c} = N_c (1 - N_c) \tag{10}$$

and, from the weighted sum in (9),

$$\frac{\partial S_c}{\partial N_p} = W_{pc} \tag{11}$$

As shown in Figure 4-2, in the three-layer BPNN, node p is an input node in the input layer, node c is a hidden node in the middle layer, and node n is an output node at the output layer. The focus of BPNN sensitivity analysis is on investigating the first-order partial derivative of the output signal from node n ($N_n$) over the input signal to node p ($S_p$):

$$\frac{\partial N_n}{\partial S_p} = \frac{\partial N_n}{\partial S_n} \cdot \frac{\partial S_n}{\partial N_c} \cdot \frac{\partial N_c}{\partial S_c} \cdot \frac{\partial S_c}{\partial N_p} \cdot \frac{\partial N_p}{\partial S_p} \tag{12}$$

By (8), we have $S_p = N_p$, so the last factor equals one. From (10), we know $\partial N_c / \partial S_c = N_c (1 - N_c)$ and, likewise, $\partial N_n / \partial S_n = N_n (1 - N_n)$. From (11), we know $\partial S_c / \partial N_p = W_{pc}$ and, likewise, $\partial S_n / \partial N_c = W_{cn}$. So, (12) can be expressed as:

$$\frac{\partial N_n}{\partial S_p} = W_{pc} \cdot W_{cn} \cdot N_c (1 - N_c) \cdot N_n (1 - N_n) \tag{13}$$

Because more than one processing element exists in the current layer (hidden layer), assume the number of processing elements at layer c is C. A general form of input sensitivity for BPNN is then expressed as (14):

$$\frac{\partial N_n}{\partial S_p} = \sum_{i=1}^{C} W_{p c_i} \cdot W_{c_i n} \cdot N_{c_i} (1 - N_{c_i}) \cdot N_n (1 - N_n) \tag{14}$$

Input Sensitivity Based on Original Data

In deriving (13), we assume all data including inputs and outputs has already been normalized in the range (0,1). From the perspective of real applications, usually it is convenient and straightforward to probe the sensitivity of BPNN based on the original or raw data instead of scaled data. Various linear or non-linear normalization methods can be used to transform raw data. Shi (2000) reviewed the established data transformation methods for BPNN and proposed a new one called "distribution transformation", which fits statistical distributions to raw input data and utilizes the resultant Cumulative Density Functions (CDF) to scale inputs to [0,1]. The theoretical underpinning of such transformation is relatively weak, as pointed out in Shi (2000), and such transformation does complicate the application of NN.
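The normalized-data sensitivity of a three-layer sigmoidal network can be checked numerically. The sketch below assumes the general form in which the derivative of the output signal with respect to one normalized input equals the sum, over hidden nodes, of W_pc × W_cn × N_c(1 - N_c), multiplied by N_n(1 - N_n), and compares it with a central finite difference. The network size, the weight values (bias weights are stored last and fed with the constant input -1) and the function names are hypothetical.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hidden, w_out):
    """Three-layer network with one output node.

    w_hidden[c] holds the weights into hidden node c (last entry: bias weight);
    w_out holds the weights into the output node (last entry: bias weight).
    Bias nodes carry the constant input -1.
    """
    signals = list(x) + [-1.0]
    hidden = [sigmoid(sum(w * s for w, s in zip(row, signals))) for row in w_hidden]
    out = sigmoid(sum(w * s for w, s in zip(w_out, hidden + [-1.0])))
    return hidden, out

def input_sensitivity(x, p, w_hidden, w_out):
    """Analytic derivative of the output signal with respect to input p."""
    hidden, out = forward(x, w_hidden, w_out)
    total = sum(w_hidden[c][p] * w_out[c] * hidden[c] * (1.0 - hidden[c])
                for c in range(len(hidden)))
    return total * out * (1.0 - out)

def finite_difference(x, p, w_hidden, w_out, h=1e-5):
    """Central-difference estimate of the same derivative."""
    up = list(x)
    up[p] += h
    down = list(x)
    down[p] -= h
    return (forward(up, w_hidden, w_out)[1] - forward(down, w_hidden, w_out)[1]) / (2.0 * h)

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output node.
w_hidden = [[1.5, -2.0, 0.3], [-1.0, 0.8, -0.2]]
w_out = [2.0, -1.5, 0.1]
x = [0.3, 0.6]

analytic = [input_sensitivity(x, p, w_hidden, w_out) for p in range(len(x))]
numeric = [finite_difference(x, p, w_hidden, w_out) for p in range(len(x))]
print("analytic:", analytic)
print("numeric: ", numeric)
```

Agreement between the two lists gives some assurance that the closed-form expression captures the full path from an input node through every hidden node to the output node.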
The conclusion about the superiority of the "distribution transformation" scenario over the traditional linear transformation scenario is arrived at empirically based on independent experiments on each method. Due to a number of variable factors (such as learning rates and momentum) and stochastic phenomena (such as the initialization of network weights and the existence of multiple local optima in the searching space), the improvement of network performance may not be attributable, or may be only partly attributable, to the input transformation methods. The non-linear mapping capability of BPNN is mainly owing to the non-linear transfer functions in hidden and output PEs. According to our experiments on BPNN, a good selection of hidden layer structures and transfer functions based on trials generally results in improvement of BPNN's performance. Thus, we recommend using such robust, undistorted and simple data normalization methods as linear transformation to normalize both inputs and outputs in BPNN and satisfy the neural computation requirements. The simplicity of BPNN will be maintained without sacrificing its functionality, which can be further demonstrated from sensitivity analysis of BPNN in the following sections.

If we take into consideration the data normalization procedure, the simplest and most commonly used one is a linear process as follows:

$$N_p = \frac{(UB - LB)(S_p - MIN_p)}{MAX_p - MIN_p} + LB \tag{15}$$

where UB is the upper bound of the normalized interval (LB, UB), and LB is the lower bound; for the sigmoid transfer function, usually LB = 0 and UB = 1; $MAX_p$ is the maximum value in the data set corresponding to input node p or parameter p; $MIN_p$ is the minimum value in the data set corresponding to input node p or parameter p. A formula similar to (15) is applied to normalize the output data, in order to match the output range of the transfer functions in BPNN, i.e. (0,1) for Sigmoid transfer functions.
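The linear normalization and the corresponding scale-back step can be sketched as below. The sketch assumes the linear form N = (UB - LB)(S - MIN) / (MAX - MIN) + LB with LB = 0 and UB = 1 as defaults; the function names and the sample range (0.1 to 0.6) are hypothetical.

```python
def normalize(value, vmin, vmax, lb=0.0, ub=1.0):
    """Linear scaling of a raw value into the interval (lb, ub)."""
    return (ub - lb) * (value - vmin) / (vmax - vmin) + lb

def denormalize(scaled, vmin, vmax, lb=0.0, ub=1.0):
    """Scale-back from the interval (lb, ub) to the original units."""
    return (scaled - lb) * (vmax - vmin) / (ub - lb) + vmin

# Hypothetical input parameter whose training data span 0.1 to 0.6.
raw = 0.35
scaled = normalize(raw, 0.1, 0.6)
restored = denormalize(scaled, 0.1, 0.6)
print(scaled, restored)
```

Because both directions are linear, the scaling contributes only a constant factor to any derivative taken through it, which is what allows a sensitivity computed on normalized data to be restated in original units.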
If we take N_n as the raw output data, there is a scale-back process involved at the output layer, which will cancel out the UB (1) and LB (0) in combination with the scaling process at the input layer. So we can arrive at a more general form of (14) based on linear normalization procedures as:

$$\frac{\partial N_n}{\partial S_p} = \frac{MAX_n - MIN_n}{MAX_p - MIN_p} \sum_{i} W_{pi} W_{in} \, N_i (1 - N_i) \, N_n (1 - N_n) \qquad (16)$$

where the sum is taken over the hidden nodes i. This slope or partial derivative is defined as the absolute input sensitivity and represents the expected change in output variable N_n per unit (1) change in input parameter S_p, holding the other input parameters constant.

In a real-world problem, each input parameter may have a different unit of measure, and hence a different relevant range, which encompasses all values from the smallest to the largest used in training the model. Similarly to regression analysis, it is important for BPNN to interpolate within the range rather than extrapolate beyond the range in order to make sensible predictions. For one input parameter ranging from 1 to 20000, a one-unit change is too small to be considered, while for another input parameter ranging from 0.1 to 0.6, a one-unit change is too big to occur. Thus, it is more appropriate to use a relative one-unit (such as 10% of the relevant range) as the basic unit change in input parameters instead of an absolute one-unit (1). Through such transformation, the input sensitivity is undistorted and more meaningful in terms of comparing the effect of different input parameters upon the output variable. The relative input sensitivity is defined as (17):

$$\text{Relative input sensitivity} = \frac{\partial N_n}{\partial S_p} \times 10\% \times (MAX_p - MIN_p) \qquad (17)$$

i.e. the absolute input sensitivity multiplied by 10% of the relevant range, MAX_p - MIN_p. Note that the input sensitivity is then independent of the relevant ranges of input parameters and represents the amount that output variable N_n changes (either positive or negative) for a particular unit change in the input parameter S_p, i.e. 10% of its relevant range.

BPNN vs. REGRESSION ANALYSIS

The above sensitivity analysis of BPNN is analogous to the classic multiple regression analysis in statistics, which predicts the values of a response or dependent variable based on the values of multiple explanatory or independent variables and can be defined as (18):

$$N_n = \beta_0 + \sum_{i=1}^{M} \beta_i S_i \qquad (18)$$

where β_0 is an intercept representing the average value of N_n when all the explanatory variables S_i are equal to zero, i = 1 to M, and M is the total number of explanatory variables. ∂N_n/∂S_i is a slope for the i-th explanatory variable, and its definition as β_i is identical to the input sensitivity of BPNN as above. However, only by examining the difference between BPNN and regression analysis can the sophistication and superiority of BPNN over regression analysis be demonstrated, as discussed next.

An examination of Equation (16) indicates that the value of ∂N_n/∂S_p in BPNN is dependent on several factors:

1. The internal structure of BPNN, i.e. the number of hidden nodes and number of hidden layers.
2. The BPNN data set, i.e. the relevant range of each input parameter and output variable.
3. The weight values of BPNN, i.e. the intensity of connection among processing elements from the input layer to the output layer. This is actually the result of BPNN training, and hence dependent on the training data set.
4. The current input values loaded at the input nodes. From (1), (2), and (1), it is evident that N_i and N_n are functions of the current input values at the input layer and the weight values of BPNN.

We can also observe that once a BPNN is trained on a data set, the first three factors (BPNN structure, weights and training data set) are fixed, so the sensitivity of an input parameter over an output variable is totally determined by the fourth factor, i.e. the current input values.
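To make the dependence on the current input values concrete, the following sketch evaluates the chain-rule slope of a small one-hidden-layer sigmoid network at two input points. It is a minimal illustration under stated assumptions: the weights are invented, bias terms are omitted, inputs are taken as already normalized to (0, 1), and the names `forward`, `slopes` and `relative_sensitivity` are hypothetical rather than taken from the thesis software.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_in, w_out):
    """Return (hidden activations, output) for normalized inputs x.

    w_in[p][i] is the weight from input p to hidden node i;
    w_out[i] is the weight from hidden node i to the single output node.
    """
    n_hidden = len(w_out)
    hidden = [sigmoid(sum(x[p] * w_in[p][i] for p in range(len(x))))
              for i in range(n_hidden)]
    out = sigmoid(sum(hidden[i] * w_out[i] for i in range(n_hidden)))
    return hidden, out


def slopes(x, w_in, w_out):
    """Partial derivative of the normalized output w.r.t. each normalized input."""
    hidden, out = forward(x, w_in, w_out)
    return [out * (1.0 - out)
            * sum(w_in[p][i] * w_out[i] * hidden[i] * (1.0 - hidden[i])
                  for i in range(len(w_out)))
            for p in range(len(x))]


def relative_sensitivity(x, w_in, w_out, out_min, out_max, fraction=0.10):
    """Raw-output change for a change of `fraction` of each input's relevant range.

    Assumes LB = 0 and UB = 1, so the input range cancels out of the product.
    """
    return [fraction * (out_max - out_min) * s for s in slopes(x, w_in, w_out)]


# Hypothetical weights for a 2-input, 2-hidden-node network (not from the thesis).
W_IN = [[0.8, -1.2], [-0.5, 1.5]]
W_OUT = [1.1, -0.7]

print(slopes([0.5, 0.5], W_IN, W_OUT))
print(slopes([0.9, 0.1], W_IN, W_OUT))
```

The two printed slope vectors differ even though the weights are fixed, which is the point made above: for a trained network the sensitivity is a function of the input point only.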
If we treat the current input values as the coordinate values of an input point in the BPNN input space, the dimension of which is equal to the number of input parameters, we can conclude that, for a trained BPNN,

$$\frac{\partial N_n}{\partial S_p} = F(\text{Input Point})$$

Here, F stands for a function. Indeed, BPNN performs a multiple linear regression analysis at each individual data point to fit a non-linear, high-dimensional hyperplane to the training data set. The slope value along each dimension, along with the intercept value β_0, varies from data point to data point, in contrast with being constant in regression analysis. Simply put, in two-dimensional space BPNN is capable of fitting a flexible curve such that all the observed data points fall on the line, while regression analysis can only approximate a straight line that strings up the data points with the minimum amount of deviation based on the least squares method.

Aside from the above discussion, three other advantages of BPNN over regression analysis are worth mentioning:

(1) BPNN poses no theoretical constraints on data, in contrast with the assumptions of least squares regression and the required residual analysis in regression analysis (Levine et al, 1997).
(2) BPNN supports more than one output in input-output mapping, in contrast with only one output in regression analysis.
(3) BPNN relaxes the requirements of data in terms of both quantity and quality, in contrast with regression analysis. That means BPNN is capable of non-linear mapping with only a very limited quantity of observed data points and is tolerant of noisy data (inaccurate or incomplete data).

Table 4-1: Data Set for Testing BPNN and Regression Analysis

In order to illustrate the comparison of BPNN and regression analysis, we studied the input sensitivity of BPNN trained on an artificial data set with 4 inputs, 1 output and only 10 records, as shown in Table 4-1.
The BPNN model has four input parameters, one output variable, and one hidden layer with three hidden nodes, which is determined based on trials. The learning rate is 0.8 and the momentum is 0.3. The standard error of the estimate in regression analysis is a measure of variation around the fitted line of regression, and it is calculated here as a measurement of accuracy to compare the performance of the two techniques. The standard error is actually a slight variant of the global error term E in BPNN as in (3). After achieving satisfactory training (the standard error of the NN output is reduced to 0.00158), we calculated the partial derivative values of the output variable over each input parameter using (16) at various input points. The results, as shown in Table 4-2, indicate that for a specific input parameter, the slope value over the output variable varies with the input points.

In order to analyze such variation, a Monte Carlo simulation is performed in the BPNN input space to observe the statistics of the ∂N_n/∂S_p value for each input parameter. In each simulation run, an input point is randomly generated in the BPNN input space and triggers a BPNN recall process. A slope value of each input parameter over the output variable is calculated. If the number of simulation runs is large enough, we can assume we will traverse the entire BPNN input space by interpolating. A program in MS VB and Access is developed to perform the Monte Carlo simulation experiments for 1000 iterations. The resultant Probability Density Functions (PDF) of slope values for the four input parameters are shown in Figure 4-3, and the statistics are summarized in Table 4-3.
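The Monte Carlo procedure described above can be sketched as follows. This is not the MS VB and Access program used in the study; it is a hedged illustration in which the weights are invented, bias terms are omitted, input points are assumed to be drawn uniformly from the normalized input space, and the function names are hypothetical.

```python
import math
import random
import statistics


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def slopes(x, w_in, w_out):
    """Slope of the output w.r.t. each input for a one-hidden-layer sigmoid net."""
    hidden = [sigmoid(sum(xp * row[i] for xp, row in zip(x, w_in)))
              for i in range(len(w_out))]
    out = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    return [out * (1.0 - out)
            * sum(row[i] * w_out[i] * hidden[i] * (1.0 - hidden[i])
                  for i in range(len(w_out)))
            for row in w_in]


def monte_carlo_slopes(w_in, w_out, runs=1000, seed=0):
    """Sample random points in the normalized input space; collect slopes per input."""
    rng = random.Random(seed)
    samples = [[] for _ in w_in]
    for _ in range(runs):
        point = [rng.random() for _ in w_in]  # assumption: uniform on (0, 1)
        for p, s in enumerate(slopes(point, w_in, w_out)):
            samples[p].append(s)
    return samples


# Hypothetical weights: 4 inputs, 3 hidden nodes, 1 output (not the trained values).
W_IN = [[0.9, -1.4, 0.3], [2.1, 0.6, -0.8], [-1.7, 0.2, 1.0], [1.5, -0.4, 2.2]]
W_OUT = [1.3, -0.9, 0.7]

for p, values in enumerate(monte_carlo_slopes(W_IN, W_OUT), start=1):
    print(p, max(values), min(values),
          statistics.mean(values), statistics.stdev(values))
```

Each printed row gives the maximum, minimum, average and standard deviation of the sampled slopes for one input, which is the same kind of summary reported per input factor in the study.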
Table 4-2: Partial Derivative (Slope) (∂N_n/∂S_p) at Four Input Points in BPNN Input Space

| Input Factor Index (p) (1) | Point (0.5,0.5,0.5,0.5) (2) | Point (0.1,0.1,0.1,0.1) (3) | Point (0.9,0.9,0.9,0.9) (4) | Point (0.2,0.4,0.6,0.8) (5) |
|---|---|---|---|---|
| 1 | -0.1594 | -0.3057 | 0.2269 | 0.1115 |
| 2 | 0.8915 | 0.2613 | 0.3267 | 0.3917 |
| 3 | -0.1685 | -0.0398 | -0.3169 | 0.3302 |
| 4 | 0.9974 | 0.4799 | 0.2878 | 0.1611 |

Table 4-3: Statistics of Partial Derivative (Slope) Values (∂N_n/∂S_p)

| Input Factor Index (p) (1) | Maximum (2) | Minimum (3) | Average (4) | Std. Dev. (5) | 95% Confidence Interval (6) |
|---|---|---|---|---|---|
| 1 | 0.9797 | -1.0366 | -0.0151 | 0.3863 | -0.0390 to 0.0089 |
| 2 | 1.9502 | 0.0042 | 0.5769 | 0.4038 | 0.5518 to 0.6019 |
| 3 | 0.4833 | -2.1053 | -0.2678 | 0.5489 | -0.3018 to -0.2338 |
| 4 | 2.2255 | 0.0031 | 0.6365 | 0.5064 | 0.6051 to 0.6679 |

Figure 4-3: Distributions for Input Sensitivity

A regression analysis is conducted on the same data set of 10 observations in MS Excel. The results are that the slope of the first input parameter is 0.1217, the slope of the second input parameter is 1.0141, the slope of the third input parameter is minus 0.5925, and the slope of the fourth input parameter is 0.5509; the intercept is minus 0.03054. Note that those slope values are constants, in contrast with the distributions obtained from BPNN. The standard error based on the outputs of regression analysis is as high as 0.1285, compared with merely 0.00158 for BPNN. In short, BPNN outperforms regression analysis by a significant margin in our experiments, which agrees with the previous analysis and comparisons.

The simulation results reveal distributions of slope data for BPNN, which take various shapes (Figure 4-3). If the actual distribution of input sensitivity to be encountered in operations is available, comparison of the actual distribution with the corresponding Monte Carlo distribution obtained from BPNN can serve as an effective means for model validation.
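The 95% confidence intervals in Table 4-3 can be checked from the reported averages and standard deviations. The sketch below assumes the usual normal-approximation interval for a mean over the 1000 simulation runs; the text does not state which formula was used, so this is an assumption, although the computed bounds agree with the tabulated ones to within rounding.

```python
import math

RUNS = 1000  # Monte Carlo iterations reported for Table 4-3

# (average, standard deviation) per input factor, copied from Table 4-3
STATS = {1: (-0.0151, 0.3863), 2: (0.5769, 0.4038),
         3: (-0.2678, 0.5489), 4: (0.6365, 0.5064)}


def confidence_interval(mean, std, n, z=1.96):
    """Normal-approximation interval for the mean: mean +/- z * std / sqrt(n)."""
    half_width = z * std / math.sqrt(n)
    return mean - half_width, mean + half_width


for factor, (mean, std) in STATS.items():
    low, high = confidence_interval(mean, std, RUNS)
    print(f"factor {factor}: {low:.4f} to {high:.4f}")
```

Note that these are intervals for the average slope, which is why they are much narrower than the spread between the minimum and maximum slope values.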
However, in most real BPNN applications quantitative information is unavailable to fit such actual distributions of input sensitivity, due to the complexity of the engineering or management problems being solved. This is also the reason for choosing BPNN instead of other conventional mathematical models in the first place. An experienced domain expert may also have difficulty figuring out such distributions of input sensitivity on a subjective basis, because the decision process generally relies on assessment of the entire input scenario and there are so many interacting factors. The domain experts may share some common hunches about the probability of increasing or decreasing the output variable with a certain adjustment of an influencing factor. But the amount of adjustment is generally very subjective, depending on the input scenario and on personal experience and temperament.

Therefore, instead of fitting distributions, statistical analysis of simulation results involves calculating five percentiles of the slope variable for each input parameter, i.e. the 10th, 25th, 50th, 75th, and 90th. The input sensitivity of all input parameters is summarized and presented in a tornado-like graph, as illustrated in Figure 4-4 for the piping fabrication labor productivity BPNN model. The horizontal axis represents the relative input sensitivity as determined by (17), i.e. the output response (negative or positive) to a change of 10% of the relevant range in an input parameter. The vertical axis is the baseline corresponding to no output response or zero change in output. Five short vertical bars correspond to each input parameter, representing respectively the five percentiles from left to right and reflecting the central trend, the spread, and the shape of the observed slope data distribution from simulation.
The guidelines for interpreting the graph and simulation results are listed below:

The leftmost bar (10th percentile) being to the right of the baseline represents that the chance of the slope value for the corresponding input parameter being positive is above 90%, or that with an increase of the input value the probability for the output value to increase is 90%.

The rightmost bar (90th percentile) being to the left of the baseline represents that the chance of the slope value for the corresponding input parameter being negative is above 90%, or that with an increase of the input value the probability for the output value to decrease is 90%.

The 25th and 75th percentiles can be explained in a similar manner as the 10th and 90th percentiles, according to the relative positions of the corresponding bars to the baseline in the graph.

The middle bar (50th percentile) riding on the baseline represents that the chance for the output variable to increase or decrease is 50%.

Figure 4-4: Sensitivity Analysis of Spool Fabrication BPNN Model
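Reading the five percentile bars against the zero baseline can be expressed as a small routine. This is a sketch only: the percentile uses simple linear interpolation, which is an assumption since the text does not say how percentiles were computed, and the function names are hypothetical. The logic rests on the definition of a percentile: if the 10th percentile of the slopes is itself positive, then roughly 90% of the sampled slopes are positive, and symmetrically for a negative 90th percentile.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of an ascending list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = (len(sorted_values) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (rank - lower) * (
        sorted_values[upper] - sorted_values[lower])


def tornado_bars(slope_samples):
    """Return the 10th, 25th, 50th, 75th and 90th percentiles of the slope samples."""
    ordered = sorted(slope_samples)
    return {q: percentile(ordered, q) for q in (10, 25, 50, 75, 90)}


def dominant_trend(bars):
    """Read the bars against the zero baseline."""
    if bars[10] > 0:
        return "output increases with the input in about 90% of runs or more"
    if bars[90] < 0:
        return "output decreases with the input in about 90% of runs or more"
    if bars[25] > 0:
        return "output increases with the input in about 75% of runs or more"
    if bars[75] < 0:
        return "output decreases with the input in about 75% of runs or more"
    return "no dominant direction"


# Illustrative samples only (not simulation output from the study).
bars = tornado_bars([-0.2, 0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2])
print(bars, dominant_trend(bars))
```

The absolute size of the percentile values also matters: bars far from the baseline indicate a stronger effect on the output than bars clustered around it.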
An input parameter with a slope distribution clustering around the baseline has less effect on the output variable than one with a slope distribution distant from the baseline. Thus, the magnitude of input sensitivity can be inferred from observing the absolute values of the percentiles as well. Note that the statistical descriptors (percentiles) are based on simulation samples rather than the entire population. However, the sample size is assumed to be large enough (10000 runs to draw Figure 4-4) to traverse the input space of BPNN, and the confidence interval estimates are rather tight; hence the statistical descriptors based on the samples can represent those for the population.

The proposed sensitivity analysis method is of a stochastic nature because of independent trials for BPNN training (such as initialization of network parameters, hidden layer structure, and local optima) and the Monte Carlo process. If BPNN training is achieved and the number of simulation iterations is large enough, the results for most input parameters are stable in terms of direction and magnitude of input sensitivity, except for a couple of input parameters swapping sides with respect to the baseline from trial to trial. A semi-optimal BPNN model can be determined by selecting the trial in which the input sensitivity of major input parameters makes sense or is agreed upon by the domain expert. In case the sensitivity of one input parameter always takes the opposite direction in the tornado graph compared against the domain expert's experience or common sense, the definition and data collection procedures for the input parameter, along with the data itself, should be carefully reexamined for shortfalls before the input parameter is dropped out of the BPNN analysis.
The sensitivity analysis of BPNN as described in the previous sections is applied to analyze a BPNN model for estimating the labor production rate of pipe spool fabrication in the fabrication facility of PCL Industrial Constructors Inc, which is one of the largest and most modern pipe fabrication and module facilities in Western Canada.

Spool Fabrication Basics

A pipe spool is a portion of a piping system consisting of various piping components, such as flanges, elbows, reducers, tees, supports, and pipe. These items are prefabricated into distinct assemblies that are later assembled together as part of an industrial plant or production skid/module. Such prefabrication is usually performed under a controlled shop environment located away from the actual project site, which allows for better productivity and quality control, and hence cuts the field labor costs. Major spool fabrication processes, such as cutting, beveling, fitting, welding, and handling sections of pipe and fittings, tend to be labor-intensive.

Productivity data is collected for 63 projects completed from 1995 to 1999, during which period the technologies and machines for welding and cutting in the shop remained relatively stable. The productivity study of spool fabrication is suited to the unit-cost estimating method, in which labor production rates must be independent of equipment use and vary among projects only because of differences in labor productivity (Parker et al., 1984). Due to the variation in size, wall thickness and configuration of each individual spool, a special unitization scheme is utilized in the Company to quantify the various work items uniformly into an abstract unit of measure called "Fabrication Unit" or "Unit", on the basis of weld inches of standard wall thickness pipe. Quantities of non-welding work items, such as cutting, beveling, handling pipe and fittings, and installing supports, are also converted into "Units" by applying corresponding empirical factors in the scheme.
Factor Identification and Data Collection

The labor hours per fabrication unit become the focus of investigation, which range from 0.1 MH/Unit to 0.5 MH/Unit in the collected historical data. The unit labor hours fluctuate from job to job due to a number of quantitative and qualitative factors, including the complexity of spool configuration, the material components in fabrication, the stringency of quality control, spool drawing quality, the amounts of night shift and overtime, extra work, crew experience, etc. The environmental effects and management factors are not considered as significant factors because of the controlled shop environment and the consistent policy and personnel of management during the period of investigation. 19 input parameters are identified, as listed in Table 4-4.

Table 4-4: Input Factors of Spool Fabrication Labor Productivity NN

| No. (1) | Input Factor (2) | Data Source (3) | Remarks (4) |
|---|---|---|---|
| 1 | In Line Fitting (pcs) per Foot of Pipe in Spool | Material Track. Sys. | A ratio indicating the average length of pipe sections in spool |
| 2 | Non In Line Fitting (pcs) per Foot of Pipe in Spool | Material Track. Sys. | A ratio indicating complexity of spool configuration |
| 3 | Valve (pcs) per Foot of Pipe in Spool | Material Track. Sys. | A ratio indicating complexity of spool configuration |
| 4 | Support (pcs) per Foot of Pipe in Spool | Material Track. Sys. | A ratio indicating complexity of spool configuration |
| 5 | Flange (pcs) per Foot of Pipe in Spool | Material Track. Sys. | A ratio indicating complexity of spool configuration |
| 6 | Multi-Station Roll Weld Inches / Total Roll Weld Inches | Weld Track. Sys. | Multi-Station Roll Weld requires extra handling between weld stations |
| 7 | Repair Rate | Weld Track. Sys. | An index of crew's proficiency |
| 8 | Radiography Test Requirement | Weld Track. Sys. | An index of quality control stringency by specs. |
| 9 | Non CS Units / Total Units | Weld Track. & Material Track. Sys. | Non CS component in fabrication requires extra care in storage, handling and welding |
| 10 | Shop Work Load | Questionnaire | A 5-point rating based on shop workload in units and no. of concurrent jobs, indicating how busy the shop was |
| 11 | Drawing Revision Rate | Questionnaire | A 5-point rating based on percent of revised drawings |
| 12 | Priority Rushed Spools | Questionnaire | A 5-point rating based on percent of rushed spools due to client priority, indicating shop work schedules |
| 13 | Rework Spools | Questionnaire | A 5-point rating based on percent of reworked spools due to drawing errors and quality defects |
| 14 | Material Shortage Problems | Questionnaire | A 5-point rating on efficiency of material supply |
| 15 | Late Drawing Issues | Questionnaire | A 5-point rating based on percent of late spool drawing issuance by client that impacts fabrication |
| 16 | Night Shift MHs / Total MHs | Payroll Sys. | Night shift affects labor productivity |
| 17 | Over Time MHs / Total MHs | Payroll Sys. | Overtime affects labor productivity |
| 18 | Extra Work MHs / Total MHs | Payroll Sys. | Extra work affects labor productivity |
| 19 | Apprenticeship MHs / Total MHs | Payroll Sys. | Welder qualification system affects labor productivity: Apprentice vs. Journeyman |

Data is collected from the company's various transaction systems, including the labor cost tracking system, weld tracking system, payroll system, and material tracking system. In order to ease the burden of data gathering and ensure high quality of data, a historical project data warehouse is custom-developed using Microsoft Access and VBA to integrate raw data from different transaction systems and automate the validation of raw data and the calculation of productivity information.
Because data is unavailable in the current transaction systems of the Company for such factors as the drawing revision rate, late drawing issuance, material shortage problems, quantity of reworked spools, quantity of rushed spools due to priority, and shop work load, a questionnaire survey is carefully designed and conducted with the support of the Company management. The key personnel involved in the projects, including shop superintendents, project managers and coordinators, QC staff, and welding foremen, are interviewed to help recall some facts and gather the needed information.

BPNN Training and Sensitivity Analysis

A total of 70 records are compiled and used to train a BPNN model with 19 input nodes at the input layer corresponding to the 19 input parameters, 19 hidden nodes at the middle layer, and 1 output node at the output layer, which is the unit labor hours. The number of hidden nodes can be determined based on trials; BPNN learning is found to be unsusceptible to this choice when the number of hidden nodes is close to the number of input nodes. The learning rate is 0.4, the momentum is 0.1, and sigmoid transfer functions are used in hidden and output nodes. After satisfactory training (the standard error of the output is 0.00143), the Monte Carlo based sensitivity analysis is performed on the matured network for 10000 simulation runs. Note that Equation (17) is used to determine the input sensitivity, which is based on a change of 10% of the relevant input range. Several independent trials, from BPNN training to the sensitivity analysis, are conducted on the same data set. The best trial, in which the input sensitivity of most factors follows the same trends as determined by experienced domain experts, is shown in Figure 4-4.

An examination of Figure 4-4 reveals the relationships between the influencing factors and the fabrication productivity, which are generalized by BPNN through observing historical project data from the past 5 years.
For example, factor 1 is about in line fitting pieces per foot of pipe in spool, which indicates the average length of pipe sections in a spool. According to our domain experts, in line fittings, such as unions, couplings, swages, reducers, etc., are used to connect pipe sections in a straight line without turns or branches. Thus, the more in line fitting pieces in spools, the more small sections of pipe in spools, and the easier the work is to handle. From Figure 4-4, BPNN determines that the chances to decrease labor hours per unit with the increase of this ratio are about 78%, which agrees with the trend identified by the domain experts.

Factors 2 to 5 are four ratios indicating the complexity of spool configuration. According to our domain experts, the higher such ratios, the more complex the spools' configuration, and the tougher it is to fabricate the spools. From Figure 4-4, the dominant trends of the four ratios are all on the plus side, which matches the experience of our domain experts.

It is also observed from Figure 4-4 that factor 18 (extra work percentage) is relatively tightly enveloped around the baseline, which indicates that extra work is not as dominant as other factors in contributing to the variance in unit labor rates. The explanation can be partly attributed to the fact that the amount of extra work more directly impacts the efficiency of administration or management than the productivity of the crew on the shop floor. Other input factors can be interpreted and validated in a similar manner, and are not elaborated further due to space limits.

Model Testing and Validation

In particular, the effect of the material type of spool fabrication on the labor productivity is tested based on the BPNN model, because material type (carbon steel, stainless steel, aluminum etc.) is a major consideration of an industrial estimator in adjusting unit labor hours of spool fabrication.
The labor production rate of non-carbon steel fabrication is empirically 1.5 times the rate of carbon steel in the company's business guideline. 24 records in the data set with a 0% non-carbon steel component (100% carbon steel fabrication) are selected as testing records. In the next step, for each testing record, only the input parameter of non-carbon steel component is changed from 0% to 100%, with the other parameters intact. Those testing records are fed to the network to let BPNN recall the output, i.e. the unit labor rates for non-carbon steel fabrication. The output from BPNN is compared against the original output of each record, i.e. the unit labor rate for carbon steel fabrication. Based on the test results in Figure 4-5, BPNN increases the unit labor hours on 75% of the records; the amount of decrease for 5 records, i.e. No. 1, 2, 5, 6, 9, is relatively small compared with the amount of increase for the others. If the sample size is large enough, the percentage should come close to about 90%, as observed from Figure 4-4 for Factor 9. On average, the ratio of non-carbon steel labor rate over carbon steel labor rate is 1.3, which is close to 1.5 in the guideline.

Figure 4-5: Testing Sensitivity of BPNN to Material Type (chart title: "Test NN Sensitivity By Changing Material Component from 100% CS to 100% Non-CS: 75% Records Increase, Avg. Ratio 1.38"; series: Actual (100% CS, 0% Non-CS) and NN Output (0% CS, 100% Non-CS); horizontal axis: Rec. No.)

Note that the guideline gives an average number (1.5) in consideration of material type only, while BPNN is able to figure different numbers for different scenarios, taking into account 19 relevant factors. In short, a BPNN-based decision support tool will be more sophisticated and intelligent than the traditional business guideline.
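The material-type test can be outlined as a short routine: for each all-carbon-steel record, only the non-carbon-steel input is switched and the recalled output is compared with the record's original rate. The sketch below is illustrative only. `toy_predict` is a made-up stand-in for the trained network, the two records are invented, the factor is addressed by a zero-based index (8 for Factor 9), inputs are assumed to be normalized so that 1.0 means 100%, and the average is taken as the mean of per-record ratios, which the text does not specify.

```python
def material_type_test(records, predict, factor_index=8):
    """Switch the non-carbon-steel share from 0 to 1 and compare outputs.

    records: list of (inputs, actual_rate) pairs with inputs[factor_index] == 0.0
    predict: callable mapping an input list to a unit labor rate (the trained model)
    Returns (share of records whose rate increased, mean ratio new / actual).
    """
    ratios = []
    for inputs, actual_rate in records:
        changed = list(inputs)
        changed[factor_index] = 1.0  # 100% non-carbon steel, other inputs intact
        ratios.append(predict(changed) / actual_rate)
    increased = sum(1 for r in ratios if r > 1.0) / len(ratios)
    return increased, sum(ratios) / len(ratios)


# Hypothetical stand-in for the trained network, for illustration only.
def toy_predict(inputs):
    return 0.2 + 0.1 * inputs[8] - 0.05 * inputs[0]


# Two invented records with 19 inputs each: (inputs, actual unit labor rate).
toy_records = [([0.0] * 19, 0.2), ([0.4] + [0.0] * 18, 0.3)]
print(material_type_test(toy_records, toy_predict))
```

Because the other 18 inputs stay at each record's own values, the ratio differs from record to record, which is the contrast with a single guideline multiplier drawn in the text.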
The model validation approach of BPNN based on the proposed sensitivity analysis is superior to the conventional validation approach of testing the mature network with an independent data set, in that such sensitivity analysis enables the modeler to understand the rationale of BPNN's reasoning and have pre-knowledge about the effectiveness of model implementation in a probabilistic fashion. The insight into the BPNN model gained from the proposed sensitivity analysis method gives the user more confidence in the BPNN's prediction, and hence facilitates the implementation of BPNN-based decision support tools. The success of our industrial application in estimating labor productivity of spool fabrication exceeded our initial expectations. Not only does this new method prove to be effective in addressing problem domains in which BPNN has been applied, but it also potentially makes BPNN appealing to new engineering or business applications.

Carroll, R.J. and Ruppert, D. (1988). Transformation and Weighting in Regression, Chapman & Hall, New York, NY.

Dhar, V. and Stein, R. (1997). Intelligent Decision Support Systems: The Science of Knowledge Work. Prentice-Hall, Inc., Upper Saddle River, New Jersey.

Flood, I., and Kartam, N. (1994). "Neural networks in civil engineering: systems and applications." J. Constr. Engrg. and Mgmt., ASCE, 124(1), 18-33.

Knowles, P. (1997). Predicting Labor Productivity Using Neural Networks. Master of Science Thesis, University of Alberta, Edmonton, AB.

Levine, D. M., Berenson, M. L., and Stephan, D. (1998). Statistics for Managers Using Microsoft Excel, Prentice-Hall, Inc., Upper Saddle River, New Jersey.

Li, Y., Shen, L. Y., and Love, P.E.D. (1999). "ANN-based mark-up estimation system with self-explanatory capabilities." Journal of Construction Engineering and Management, ASCE, 125(3), 155-189.

Mukherjee, A., and Deshpande, J.M. (1995). "Modeling initial design process using artificial neural networks", Journal of Computing in Civil Engineering, ASCE, 9(3), 194-200.

Murtaza, M.B., and Fisher, D.J. (1993). "Neuromodex: Neural Network System for Modular Construction Decision Making", Journal of Computing in Civil Engineering, ASCE, 8(2), 221-233.

Shi, J. (2000). "Reducing prediction error by transforming input data for neural networks", Journal of Computing in Civil Engineering, ASCE, 14(2), 109-115.

Sinha, S. K. and McKim, R.A. (2000). "Artificial neural network for measuring organizational effectiveness.", Journal of Computing in Civil Engineering, ASCE, 11(1), 9-14.

Parker, A.D., Barrie, D. S., and Snyder, R. M. (1984). Planning and Estimating Heavy Construction, McGraw-Hill, Inc., New York, NY.

Portas, J., and AbouRizk, S.M. (1997). "Neural Network Model for Estimating Construction Productivity." J. of Constr. Engrg. and Mgmt., ASCE, 123(4), 399-410.

Refenes, A.N., Zapranis, A.D., Connor, J.T. and Bunn, D.W. (1995). "Neural Networks in Investment Management". Intelligent Systems for Finance and Business, S. Goonatilake and P. Treleaven, eds., John Wiley & Sons Ltd, Chichester, England, 177-209.

Sayed, T., and Razavi, A. (1999). "Comparison of Neural and Conventional Approaches to Mode Choice Analysis", Journal of Computing in Civil Engineering, ASCE, 14(1), 23-30.

Widman, L.E., and Loparo, K.A. (1989). "Artificial intelligence, simulation, and modeling: a critical survey", Artificial Intelligence, Simulation and Modeling, L.E. Widman, K.A. Loparo, and N.R. Nielsen, eds., John Wiley & Sons Ltd, New York, NY, 1-45.

Chapter 5: Conclusion and Recommendation

In conjunction with a major industrial contractor of Canada, the thesis research conducted case studies on the theoretical basis and practical considerations for measuring and analyzing labor productivity in industrial construction.
Two important activities of process piping were investigated: pipe installation in the field and spool fabrication in the fabrication shop. The primary objective of the research is developing ANN-based estimating tools to offer estimators valuable information about labor productivity in bidding new jobs, because estimating labor productivity is one of the most difficult aspects of preparing an estimate, or a control budget based on the estimate, for labor-intensive activities in industrial construction. Artificial neural networks are capable of sorting out hidden patterns and extracting predictive information from complex data sets, and were proven to be effective in both uncertainty analysis and sensitivity analysis of construction labor productivity in the research. The thesis research has addressed: (1) how to quantify labor productivity in industrial construction from a contractor's point of view; (2) how to measure actual labor productivity in industrial construction based upon on-site control practices; and (3) how to utilize Artificial Neural Networks to analyze the labor production rates and the sensitivity of identified influencing factors.

Productivity Studies and Data Collection

The thesis research reviewed current estimating practices as applied at the involved Company and generalized special methods utilized in practice for the quantification and measurement of labor productivity in industrial construction. The input factors that cause the variability in the productivity of the studied activities were identified through literature review and consultation with experienced domain experts at the involved Company. With the support of the company's management, two data warehouses were custom-developed, for field pipe installation and shop spool fabrication respectively, to integrate the corporate management systems of estimating, production resources planning, quality control, and labor cost control.
It should be mentioned that questionnaire surveys were carefully designed and conducted to collect some qualitative and descriptive information that is not obtainable from the company's reporting and accounting systems. Experienced superintendents, project managers and estimators of the involved Company were interviewed to help recall some facts and gather the needed information. The data warehouses provide a solid platform of integrated historical data from which to validate novel ANN models and develop ANN-based tools for productivity analysis.

Probabilistic Neural Network Modeling

The thesis research derived a probabilistic neural network classification model called the Probability Inference Neural Network (PINN), which is based on the same concepts as those of the Learning Vector Quantization (LVQ) method combined with a probabilistic approach. The PINN model was intended to overcome limitations of other neural network models and was developed for predicting labor production rates for industrial construction. The thesis presented and explained the topology and algorithm of the PINN model in detail. Portable computer software was developed to implement the training, testing and recall for PINN. PINN was tested on real historical productivity data at the involved Company to analyze the degree-of-difficulty factor of field pipe installation productivity and compared to the classic feed forward back propagation neural network model; this showed marked improvement in performance and accuracy.

The PINN model creates a meaningful representation of a complex, real-life situation in the problem domain and in general is effective in dealing with high dimensional input-output mapping with multiple influential factors in a probabilistic approach. The application of the PINN model in industrial labor production rate estimating gives an estimator a better understanding of the project information available and the possible outcomes that could occur.
Because the response of PINN is in the form of a probability density function (distribution) over the output range, an estimator will be able to decide on the degree-of-difficulty factor for a future scenario by combining the PINN's recommendation with personal judgment.

Sensitivity Analysis of Back Propagation Neural Networks

Validation of a NN model has thus far relied upon measuring the accuracy of the calibrated network on an independent testing data set that is hidden from the neural network in learning. A NN model's sensitivity to changes in its parameters is generally probed by testing the response of a mature network on various input scenarios. The thesis research also investigated the classic back propagation NN algorithm to study the effect of each input parameter or influencing variable upon the predicted output variable. The input sensitivity of back propagation NN is defined in exact mathematical terms in light of both normalized data and raw data. The difference between back propagation NN and regression analysis of statistics is discussed, and the sophistication and superiority of back propagation NN over regression analysis is further demonstrated in a case study based on a small data set. In addition, statistical analysis of input sensitivity based on Monte Carlo simulation enables the modeler to understand the rationale of back propagation NN reasoning and have pre-knowledge about the effectiveness of model implementation in a probabilistic fashion.

The sensitivity analysis of back propagation NN was successfully applied to analyze the labor production rate of pipe spool fabrication at the involved Company. Important aspects of the application, including problem definition, factor identification, data collection, and model testing based on real data, were discussed and presented in the thesis.
The model validation approach of back propagation NN based on the proposed sensitivity analysis is superior to the conventional validation approach, in which the mature network is tested with an independent data set and the model's sensitivity is probed through observing the output with respect to changes in input based on a limited number of scenarios. The insight into the back propagation NN model gained from the proposed sensitivity analysis method gives the user more confidence in the back propagation NN's prediction, and hence facilitates the implementation of back propagation NN-based decision support tools. Not only does this new method prove to be effective in addressing problem domains in which back propagation NN has been applied, but it also potentially makes back propagation NN appealing to new engineering or business applications.

Conclusion

The problems addressed in the thesis research were identified through investigating the current estimating practices in industry and understanding the real concerns of industry professionals. Emerging computer modeling techniques such as data warehouses and ANN were researched from an academic perspective and implemented in industry to meet the challenges. The proposed novel ANN models and developed decision support tools were validated using real data from industry and successfully applied to assist estimators in deciding on labor production rates for new jobs. The experiences and lessons learned from the successful, productive and mutually beneficial collaboration between academia and industry throughout the thesis research will potentially serve as a model to guide other university-industry joint research projects in the future. There are a number of issues that need to be addressed in greater detail in the future.

Quantification of Textual / Descriptive Data

Three input data types are used to define NN input factors in the thesis, i.e. "Raw", "Rank", and "Binary". "Raw" is used simply for quantitative input factors, like general expense ratios, winter construction percentages, or quantities of work. "Rank" is used to convert subjective factors, like crew ability ratings, into numeric format. "Binary" is used to group textual or descriptive factors, like material type and project definition, into numeric formats. It should be noted that an input factor of the "Raw" or "Rank" type corresponds to one input node at the input layer, while an input factor of the "Binary" type corresponds to a number of input nodes depending on the number of groups for the factor.

The "Binary" data type satisfies the computing requirements of neural networks for converting textual or descriptive data; however, some disadvantages associated with "Binary" data may affect the performance and sensitivity analysis of neural networks. First, the increased dimension of the NN input space caused by the "Binary" data type increases the complexity of the network structure and the quantity of network parameters. Based on experimentation and observation, the PINN model is not very susceptible to the increase of the NN input space dimension; however, back propagation NN does suffer in terms of learning time and generalization ability with the increase of input dimensionality. The generalization ability is not guaranteed to improve, but chances are very high that the learning time will increase considerably. Secondly, the input sensitivity of back propagation neural networks is defined for each NN input node. A change in an input factor of the "Binary" data type entails changes in more than one NN input node. Thus the input sensitivity for an input factor of the "Binary" data type must take into account the combined effect of the involved input nodes. Which input nodes are involved depends on how the change is made.
For example, suppose four different material types are considered, corresponding to four NN input nodes. A change from type 1 (1000) to type 2 (0100) triggers changes in the first and second NN input nodes, while a change from type 1 (1000) to type 3 (0010) triggers changes in the first and third NN input nodes. Note that the input sensitivity of back propagation NN for one input node is not a constant value but a distribution; such a combination effect makes it difficult to explain the sensitivity of a "Binary" type factor. Fortunately, there is no such "Binary" type factor in the spool fabrication productivity analysis where the sensitivity analysis was tested. For the field pipe installation productivity analysis, an experiment was conducted to treat "Binary" factors such as material type and project type as "Rank" factors. Various groups in each factor were ranked on a 5-point scale by their relative difficulty based on the judgment of domain experts, such that a unit increase in the corresponding NN input node could represent an increase of the degree-of-difficulty factor. The results of the experiment are satisfactory and the input sensitivity follows the correct direction for most factors. However, the drawback of such a heuristic method is that sometimes even the domain experts found it hard or impossible to weigh the relative difficulty and rank each group in a factor in a sensible way. In short, more sophisticated methods such as fuzzy set theory may be researched and introduced into NN to convert textual or descriptive factors into numeric formats.

Optimization of NN Structure

NN structure mainly concerns the middle layers, for instance, the number of hidden layers and the number of hidden nodes in each for a BP NN; the number of processing elements assigned to each output zone at the Kohonen layer and the setup of output zones for a PINN model.
The determination of NN structure relies heavily on a trial-and-error based process, in which comparing the NN's outputs with actual outputs on an independent testing data set serves as a yardstick for justifying the structure. Such optimization of NN structure tends to be hampered by factors such as the stochastic processes involved in NN learning, the existence of multiple local optima in the search for the NN internal parameters, noise within the learning data set, values of learning rates and types of data transfer functions. One appealing solution is to obtain the actual distribution of input sensitivity for key input factors (if not all) to be encountered in operations. Matching the corresponding Monte Carlo distributions obtained from BP NN to the actual distributions can serve as an effective means for optimization of the NN structure, in addition to validation of the NN model as discussed in Chapter 4. Hence, gathering quantitative information to fit actual distributions of input sensitivity could be included as part of data collection for NN applications in the future if both data and resources are available.

Sensitivity Analysis of PINN Model

In the thesis, the PINN model's sensitivity to changes in its parameters is still probed by testing the response of a mature network on various input scenarios. One approach that has been tried is to take advantage of the sensitivity-analysis method for BP NN as proposed in the thesis to infer the input sensitivity for a PINN model under the following conditions:

1. The BP NN and PINN are trained with the same learning data set and tested with the same testing data set, and both models are satisfactorily trained;
2. The point-value predictions of the BP NN and those of the PINN model for the testing data set are very close.

Thus, it can be assumed that the two models would "think alike" and have common input sensitivity for a particular input factor.
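The second condition can be checked numerically. The Python sketch below is a minimal example, assuming that closeness is judged by the largest absolute difference between paired point predictions; the tolerance value and the function name are hypothetical, and the sample values are the five pairs of predictions reported in Table 5-1.

```python
def predictions_close(bp_predictions, pinn_modes, tolerance):
    """Return (ok, max_abs_diff) for paired point predictions on the testing set."""
    if len(bp_predictions) != len(pinn_modes):
        raise ValueError("prediction lists must have the same length")
    diffs = [abs(a - b) for a, b in zip(bp_predictions, pinn_modes)]
    max_diff = max(diffs)
    return max_diff <= tolerance, max_diff


# Point predictions of the two models for five testing records (Table 5-1).
bp_nn = [0.239, 0.156, 0.221, 0.157, 0.286]
pinn = [0.205, 0.145, 0.235, 0.145, 0.265]

# The tolerance of 0.05 is an assumed value, not one given in the thesis.
ok, max_diff = predictions_close(bp_nn, pinn, tolerance=0.05)
print(ok, round(max_diff, 3))  # True 0.034
```

If the check fails, both models would be retrained with different structures or learning parameters before any sensitivity is inferred.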
Table 5-1 shows the results of five testing records based on the training data for spool fabrication productivity after two models have satisfied the above conditions.

Table 5-1: PINN vs. BP NN

BP NN | PINN (Mode)
0.239 | 0.205
0.156 | 0.145
0.221 | 0.235
0.157 | 0.145
0.286 | 0.265

It is noted that the first condition is not hard to satisfy, but the second condition may not be readily met. The BP NN and PINN may need to be trained repetitively using different structures and learning parameters in order to satisfy both conditions. In the future, it would be ideal if an independent approach could be found to explain the input sensitivity of the PINN model analytically.

The applications of ANN in estimating labor productivity of industrial construction prove that ANN is effective in addressing the complexity and requirements in the problem domain. It is hoped that the contributions made in the thesis research will make ANN appealing to more engineering or business applications in the future.

APPENDIX A: USER'S MANUAL FOR PINN TRAINER

The Probabilistic Inference Neural Network (PINN) trainer is a generic neural network training and testing program developed based on a new NN scheme as proposed in Chapter 3 of the thesis.

Step 1. Prepare data and import data into trainer

The last column in a data table must be named "Status", which flags the training/testing status for each record: Status 1 stands for a training record, Status 2 for a testing record, and Status 0 for an ignored record. The next-to-last column in a data table must be named "Output", storing the actual output values of the target study variables such as actual production rates. All the remaining columns in a data table will be the input factors, and no requirements are imposed on the names and relative order of columns. The trainer will automatically count the number of total inputs and read in data. The prepared data table for PINN must be imported to the database file "PINN.mdb".

Step 2. Give a unique identifier key for a new training-testing trial

A unique identifier key is used to distinguish each training-testing scenario or trial, which is defined by the data source table to use, the training / testing records within the data, the setup of the output zones, the training parameters, and the number of training iterations. The naming convention requires no space in the key; numbers and short text are allowed, such as "081899aWeld". For a previous trial, the user can pick the identifier key from the drop-down list. Next, the user may click the buttons on the switchboard to check training results ("Check-Train" button), check testing results ("Check-Test" button), and check the global report about training and testing ("Global Report" button), as shown in Figure A-1. For a new trial, the user needs to enter a new identifier key in the Identifier Key box first, and then click the "Train-Test" button on the switchboard to activate the program.

Figure A-1: Select an identifier key of one previous trial

Step 3. Select data table, edit training / testing status and setup output zones

The user first selects one data table from the drop-down list of "Data Table Name" (Figure A-2), and clicks the "Edit Status: Train or Test" button to flag training and testing records (Figure A-3). The trainer will read in data and display the maximum and minimum of the output values for the user to set up the output zones.

Figure A-2: User selects data table

Figure A-3: Flag status of records

Setting up the output zones properly and adequately is crucial to PINN's performance. Zone widths that are too wide will not be adequate to help the user make decisions, while zone widths that are too narrow will probably sacrifice the accuracy of PINN's prediction. The following two issues should be taken into account:

- Precision requirement of user. Here is a heuristic formula to approximate zone width: Zone Width = 0.4 * OutputRange * AccuracyThreshold
- Distribution of actual output values over the output zones.
A uniform distribution generally yields better results.

Two approaches are available to set up the output zones:

1. User specifies the number of output zones only. The trainer will evenly divide the actual output range into the number of output zones as the user has specified, and automatically determine the lower bound, upper bound, and mid value for each output zone.
2. User specifies the lower and upper bounds of the output range and the width of each zone as well. The trainer will start from the lower bound of the output range and determine the boundaries of each output zone, until the upper bound of the output value range is exceeded.

Step 4. Specify structure and learning parameters for PINN

Following the setup of output zones, the user shifts focus to the next page to specify a number of structure and learning parameters for PINN, including the scale max and min, the number of processing elements per output zone, the attraction rate, repulsion rate and conscience factor for learning, the smoothing factor for the kernel function, and the accuracy threshold for performance measurement. Figure A-4 shows the program screen; the user may take the default values or set new values for those parameters. Refer to the technical paper and online help for detailed explanations of those parameters.

Figure A-4: Setup structure and learning parameters for PINN

Step 5. Specify training iterations and train-test PINN

After setting the PINN parameters, the user shifts to the next page to specify the training iterations by entering the "No. of Train Epochs", as shown in Figure A-5.

Figure A-5: Specify training iterations and train-test PINN

Step 6. Investigate whether the PINN model has been successfully trained

Following training and testing, the user clicks the "Check-Train" button and the "Check-Test" button on the switchboard to view the results for training data and testing data respectively and investigate whether the neural network has been successfully trained. The majority of training records should indicate a centralized trend, and the distribution generated by PINN should fall in the correct output zones for their actual output values, as shown in Figure A-6. Otherwise, the user should repeat from Step 2 and perform another trial using a different network structure and learning parameters, or increase the learning iterations. Note that on the right side of Figure A-6, the accuracy statistics of point predictions are also included for the user to judge the model's performance or maturity.

Figure A-6: Check training results

The user observes the testing data in a very similar manner. Testing data says more than training data in judging the network's performance, because the trainer has not seen the testing records in the learning process. If, after a number of different train-test trials, the PINN indicates for a particular training record a very far-off point prediction value (mode) compared with the actual output value, or the PINN demonstrates a very dispersed distribution as shown in Figure A-7, such a record is likely to be noise in the data and the data of the record should be examined for errors.

Figure A-7: Detected noise in training data

Step 7. Recall based on a trained PINN model

Once satisfactory results are obtained for both training and testing data, the PINN model is declared to be trained and ready for developing a recall program. The user clicks the "Global Report" button to review the information about this trial as shown in Figure A-8. If the user no longer wants to keep a trial, the user selects the identifier key for the trial and clicks the "Delete Key" button on the switchboard to delete all the data related to the trial.

Figure A-8: Global report for a train-test trial
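The output zone setup described in Step 3 can be sketched in Python as follows. This is a minimal sketch and not the trainer's actual code: the function names are hypothetical, the accuracy threshold is assumed to be expressed as a fraction, and each zone is represented as a small dictionary holding its lower bound, upper bound and mid value.

```python
def make_zone(lower, width):
    """One output zone described by its lower bound, upper bound and mid value."""
    return {"lower": lower, "upper": lower + width, "mid": lower + width / 2.0}


def heuristic_zone_width(output_min, output_max, accuracy_threshold):
    """Zone Width = 0.4 * OutputRange * AccuracyThreshold (threshold as a fraction)."""
    return 0.4 * (output_max - output_min) * accuracy_threshold


def zones_by_count(output_min, output_max, n_zones):
    """Approach 1: evenly divide the actual output range into n_zones zones."""
    width = (output_max - output_min) / n_zones
    return [make_zone(output_min + k * width, width) for k in range(n_zones)]


def zones_by_width(lower_bound, upper_bound, width):
    """Approach 2: start at the lower bound and add zones of a fixed width
    until the upper bound of the output range is reached or exceeded."""
    zones = []
    k = 0
    while lower_bound + k * width < upper_bound:
        zones.append(make_zone(lower_bound + k * width, width))
        k += 1
    return zones


# Example with an assumed output range of 0.0 to 1.0.
for zone in zones_by_count(0.0, 1.0, 4):
    print(zone)
print(len(zones_by_width(0.0, 1.0, 0.3)))  # 4 (the last zone extends past 1.0)
print(heuristic_zone_width(0.0, 1.0, 0.25))
```

With the second approach the last zone may extend beyond the stated upper bound, which matches the description that zones are added until the upper bound is exceeded.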
An on-line help is also developed giving details about how to use the trainer program along with some technical descriptions about the PINN model as shown in Figure A-9. By selecting a category, related topics are filtered out; user chooses one topic of interest, the description will be automatically displayed.

Figure A-9: PINN Trainer on-line help

APPENDIX B: USERS' MANUAL FOR FABMASTER

FabMaster is a historical project data warehousing system customized for the Fabrication Facilities of PCL Industrial Constructors, Inc. It is an automated data processing tool to extract raw data from Fabrication Resources Planning System, Weld Tracking System, Labor Cost Control System, and convert raw data into aggregate quantity data at spool level and various ratios of productivity, quality control and configuration complexity at cost-center level. A cost center, defined by project number, material type and size range of spools, is the level of detail that actual labor hours were tracked in the corporate labor cost control system. Two levels of compilation are involved in FabMaster to convert raw data into item-coded productivity information, i.e. the spool level and the cost-center level. The coding systems for material type and size range of spools used in the company and FabMaster are shown in Table B-1 and B-2. The item codes for spool level compilation are shown in Table B-3.

Table B-1: Size Range Codes

ID | Description
1 | < 2"
2 | 2-4"
6 | 6-14"
16 | 16-34"
30 | 30-48"
0 | Total

Table B-2: Material Type Codes

Table B-3: Item Codes for Spool Level Data Compilation

Description
No. of pipe pieces
Footage of pipe pieces
Diameter Inch Ft of pipe pieces
Tons of pipe pieces
No. of pipe pieces longer than 6 ft
Footage of pipe pieces longer than 6 ft
Diameter Inch Ft of pipe pieces longer than 6 ft
Tons of pipe pieces longer than 6 ft
No. of Stub Ends
No. of Branches (Olets)
No. of Caps/Plugs
No. of Elbows
No. of Swages
No. of Blind Flanges
No. of Dummy Legs
No. of SW/TH Couplings
No. of Lap Joint Flanges
No. of Anchors/Shoes/Slider Supports
No. of Nipples
No. of Orifice Flanges
No. of Reducers
No. of Slip-on/SW/TH Flanges
No. of Tees
No. of Unions
No. of Valves
No. of Weld Neck Flanges
No. of Laterals
No. of Misc. Items
No. of Flanges
No. of In-Line Fittings
No. of Out-Line Fittings
No. of Supports
No. of Design Welds
Diameter Inch of Design Welds
Equivalent Diameter Inch of Design Welds
Volume of Design Welds
No. of BW Design Welds
Diameter Inch of BW Design Welds
Equivalent Diameter Inch of BW Design Welds
Volume of BW Design Welds
No. of SW Design Welds
Diameter Inch of SW Design Welds
Equivalent Diameter Inch of SW Design Welds
Volume of SW Design Welds
No. of OL Design Welds
Diameter Inch of OL Design Welds
Equivalent Diameter Inch of OL Design Welds
Volume of OL Design Welds
No. of Pressure Attachments in Design Welds
Diameter Inch of Pressure Attachments in Design Welds
Equivalent Diameter Inch of Pressure Attachments in Design Welds
Volume of Pressure Attachments in Design Welds
No. of Non Pressure Attachments in Design Welds
Diameter Inch of Non Pressure Attachments in Design Welds
Equivalent Diameter Inch of Non Pressure Attachments in Design Welds
Volume of Non Pressure Attachments in Design Welds
No. of Position Welds in Design Welds
Diameter Inch of Position Welds in Design Welds
Equivalent Diameter Inch of Position Welds in Design Welds
Volume of Position Welds in Design Welds
No. of Multi-station Roll Welds in Design Welds
Diameter Inch of Multi-station Roll Welds in Design Welds
Equivalent Diameter Inch of Multi-station Roll Welds in Design Welds
Volume of Multi-station Roll Welds in Design Welds
Volume of Tig Process in Design Welds
Volume of Mig Process in Design Welds
Volume of FCAW Process in Design Welds
Volume of Stick Process in Design Welds
Volume of SubArc Process in Design Welds
Volume of Rotoweld Process in Design Welds
No. of Reworked Welds
Diameter Inch of Reworked Welds
Equivalent Diameter Inch of Reworked Welds
Volume of Reworked Welds
No. of Cut Sheet Revisions
Spool Weight in Tons
RT percent per Spec
MT percent per Spec
PT percent per Spec
PMI percent per Spec
PWHT percent per Spec
FT percent per Spec
BHN percent per Spec
VT percent per Spec
UT percent per Spec
No. of Accepted Welds
Weld Units in Spool
Weight of Non-pipe in Spool
No. of Spools in a material-size group

Step 1. Download Raw Data from Corporate Management Systems into FabMaster

Six raw data tables of a project required for FabMaster to process are directly downloaded from the corporate databases of various management systems in electronic formats, namely RD_BranchPlant_SP (Spool) and RD_BranchPlant_MT (Pieces) from J.D. Edwards material resources planning system, plus RD_BranchPlant_DG (Drawing), RD_BranchPlant_SC (Spec), RD_BranchPlant_WD (Weld Details) and RD_BranchPlant_WW (Weld Welders) from WeldTrack quality control system, as shown in the program flow chart of FabMaster (Figure B-1). User enters the project number and clicks the "Link RD Tables" to import raw data tables. If all needed tables are in place, user kicks off the program flow by hitting the "Process it" button.
Figure B-1: Program Flow Chart of FabMaster (raw data tables from J.D. Edwards and WeldTrack; checks based on 23 rules; 1st level compilation and derived quantity calculation; 2nd level compilation based on Spool Item Code Structure; 3rd level compilation based on Job-Material-Cost Code Structure)

Step 2. Data Validation Based on Pre-defined Rules

The following rules were programmed in FabMaster to detect abnormalities and prompt user to scrub raw data of errors prior to processing.

1. No blank is allowed in [Item Number] in RD_BranchPlant_MT.
2. No blank is allowed in [Weld Process] in RD_BranchPlant_WD.
3. [Weld Process] in RD_BranchPlant_WD must be a known process-combination in table LP_WeldProcessCombo.
4. No blank is allowed in [Joint Type] in RD_BranchPlant_WD.
5. [Joint Type] in RD_BranchPlant_WD must be a known type in table LP_JointType_PANP.
6. No blank or 0 is allowed in [Weld Size] in RD_BranchPlant_WD.
7. [Weld Size] in RD_BranchPlant_WD must be a known size in table LP_PipeOD.
8. No blank or 0 is allowed in [Weld Thickness] in RD_BranchPlant_WD.
9. No redundant spool is allowed to exist in RD_BranchPlant_SP.
10. No blank or 0 is allowed in [Spool Weight] in RD_BranchPlant_SP.
11. No blank or 0 is allowed in [Spool Units] in RD_BranchPlant_SP.
12. No blank or 0 is allowed in [Size Group] in RD_BranchPlant_SP.
13. [Size Group] in RD_BranchPlant_SP must be a known one in table LP_SizeGroup.
14. No blank or 0 is allowed in [Material Group] in RD_BranchPlant_SP.
15. [Material Group] in RD_BranchPlant_SP must be a known one in table LP_MaterialGroup.
16. [Spool Number] in RD_BranchPlant_SP must have a corresponding record in RD_BranchPlant_DG.
17. No blank is allowed in [Spec] in RD_BranchPlant_SC.
18. [Spec] in RD_BranchPlant_DG must have a corresponding reference in RD_BranchPlant_SC.
19. [Spool Number] in RD_BranchPlant_SP must have a corresponding record in RD_BranchPlant_MT.
20. [Weld Thickness] in RD_BranchPlant_WD like 3000/6000 must be able to be converted to inches by finding a reference in LP_PipeThickness.
21. [Weld Thickness] in RD_BranchPlant_WD must not be abnormally large (greater than 5").
22. [Weld Size] in RD_BranchPlant_WD must be able to be converted to Equivalent Diameter Inches by finding a reference in LP_PipeThickness based on size and thickness.
23. For an Olet type weld, a reference in LP_OletDim based on weld size must be found.

FabMaster will hint user about the detected problem records, violated rules and solutions (either update cross-reference tables or correct raw data tables) in its progress monitor windows as shown in Figure B-2. To resume the program flow after fixing problems, user needs to hit the "Process it" button again to continue the process from where it paused last time.

Figure B-2: Main User Interface of FabMaster

Step 3. Unitize a project and perform spool level compilation

Once all the checks on raw data are passed, FabMaster runs its built-in programs to automatically unitize a project into "Fabrication units", compute the quantities of various work items in pre-specified units of measure, and store the results into four temporary tables, namely RD_Spool, RD_Piece, RD_Connection, and RD_Weld. The first level compilation is conducted based on those temporary tables to generate the item-coded aggregate data for each spool and appended to spool summary table "SM_ItemQty_SP".

Step 4. Compile data at cost center level and compute ratios

The cost-center level data compilation and ratio computation follows the spool level data processing and the results will be appended to two summary tables "SM_ItemQty_CC" for aggregate quantities, and "SM_ItemQty_RT" for final ratios.
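The kinds of checks listed in Step 2 can be sketched in Python as follows. This is only a minimal sketch of rule-based validation over in-memory records, not FabMaster's actual implementation: the function names, the sample records and the set of known size groups are hypothetical, and each problem is reported as a (table, record index, field, message) tuple.

```python
def is_blank(value):
    return value is None or str(value).strip() == ""


def check_not_blank(table, rows, field, allow_zero=True):
    """Rules such as 1, 2 or 10: no blank (and optionally no 0) in a field."""
    problems = []
    for i, row in enumerate(rows):
        value = row.get(field)
        if is_blank(value) or (not allow_zero and value == 0):
            problems.append((table, i, field, "blank or zero value"))
    return problems


def check_known_value(table, rows, field, known_values):
    """Rules such as 3, 5 or 13: the value must exist in a lookup table."""
    return [(table, i, field, "unknown value")
            for i, row in enumerate(rows)
            if not is_blank(row.get(field)) and row.get(field) not in known_values]


def check_cross_reference(table, rows, field, other_rows, other_field):
    """Rules such as 16 or 19: each key must have a matching record elsewhere."""
    other_keys = {r.get(other_field) for r in other_rows}
    return [(table, i, field, "no corresponding record")
            for i, row in enumerate(rows) if row.get(field) not in other_keys]


def check_no_duplicates(table, rows, field):
    """Rule 9: no redundant (duplicate) spool."""
    seen, problems = set(), []
    for i, row in enumerate(rows):
        key = row.get(field)
        if key in seen:
            problems.append((table, i, field, "duplicate record"))
        seen.add(key)
    return problems


# Hypothetical sample records, for illustration only.
spools = [
    {"Spool Number": "S1", "Spool Weight": 0.8, "Size Group": 6},
    {"Spool Number": "S2", "Spool Weight": 0, "Size Group": 7},
    {"Spool Number": "S1", "Spool Weight": 1.1, "Size Group": 6},
]
drawings = [{"Spool Number": "S1"}]
known_size_groups = {1, 2, 6, 16, 30}

problems = (
    check_not_blank("RD_BranchPlant_SP", spools, "Spool Weight", allow_zero=False)
    + check_known_value("RD_BranchPlant_SP", spools, "Size Group", known_size_groups)
    + check_cross_reference("RD_BranchPlant_SP", spools, "Spool Number",
                            drawings, "Spool Number")
    + check_no_duplicates("RD_BranchPlant_SP", spools, "Spool Number")
)
for problem in problems:
    print(problem)
```

Collecting all problems before stopping mirrors the described workflow, in which the user fixes the flagged records or cross-reference tables and then resumes processing.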
Table B-4 shows samples of "SM_ItemQty_RT" based on one small project with one material type and one size range only. A big project often has more than one material types and size ranges of spools. FabMaster will generate valid ratios only for a specific project and automatically handle the roll-up of ratios to various total levels.

Table B-4: Sample of FabMaster Outputs

Material | Size | Ratio Description | Ratio
Total | Total | Total ManHours / DiaIn*Ft |
Total | Total | Total ManHours / Equiv.DiaIn*Ft |
Total | Total | Total ManHours / Equiv.DiaIn |
Total | Total | Total ManHours / Volume |
Total | Total | Total ManHours / Unit |
Total | Total | No. of Pipe Pieces / Footage |
Total | Total | No. of Pipe Pieces / DiaInFt |
Total | Total | No. of Pipe Pieces / Ton |
Total | Total | No. of Pipe Pieces / Unit |
Total | Total | No. of Pipe Pieces Over 3 ft / Footage |
Total | Total | No. of Pipe Pieces Over 3 ft / DiaInFt |
Total | Total | No. of Pipe Pieces Over 3 ft / Ton |
Total | Total | No. of Pipe Pieces Over 3 ft / Unit |
Total | Total | No. of Flanges / Footage |
Total | Total | No. of Flanges / DiaInFt |
Total | Total | No. of Flanges / Ton |
Total | Total | No. of Flanges / Unit |
Total | Total | No. of In-Line Fittings / Footage |
Total | Total | No. of In-Line Fittings / DiaInFt |
Total | Total | No. of In-Line Fittings / Ton |
Total | Total | No. of In-Line Fittings / Unit |
Total | Total | No. of Non-In-Line Fittings / Footage |
Total | Total | No. of Non-In-Line Fittings / DiaInFt |
Total | Total | No. of Non-In-Line Fittings / Ton |
Total | Total | No. of Non-In-Line Fittings / Unit |
Total | Total | No. of Valves / Footage |
Total | Total | No. of Valves / DiaInFt |
Total | Total | No. of Valves / Ton |
Total | Total | No. of Valves / Unit |
Total | Total | No. of Supports / Footage |
Total | Total | No. of Supports / DiaInFt |
Total | Total | No. of Supports / Ton |
Total | Total | No. of Supports / Unit |
Total | Total | No. of Misc. / Footage |
Total | Total | No. of Misc. / DiaInFt | 9.232899E-05
Total | Total | No. of Misc. / Ton | 0.0583273
Total | Total | No. of Misc. / Unit | 5.403945E-04
Total | Total | Pipe Weight / Spool Weight | 0.9404383
Total | Total | Non-Pipe Weight / Spool Weight | 5.956174E-02
Total | Total | No. of Connections (Design Welds) / Footage | 5.786802E-01
Total | Total | No. of Connections (Design Welds) / DiaInFt | 9.647334E-03
Total | Total | No. of Connections (Design Welds) / Ton | 6.095192
Total | Total | No. of Connections (Design Welds) / Unit | 5.647122E-02
Total | Total | No. of Multi-Station Roll Welds / Footage | 2.353484E-01
Total | Total | No. of Multi-Station Roll Welds / DiaInFt | 3.923557E-03
Total | Total | No. of Multi-Station Roll Welds / Ton | 2.478906
Total | Total | No. of Multi-Station Roll Welds / Unit | 2.296676E-02
Total | Total | No. of Repaired Welds / Footage | 3.322566E-03
Total | Total | No. of Repaired Welds / DiaInFt | 5.539139E-04
Total | Total | No. of Repaired Welds / Ton | 0.3499632
Total | Total | No. of Repair Welds / Unit | 3.342367E-03
Total | Total | BW DiaIn / Design Weld DiaIn | 0.9782972
Total | Total | BW Equiv.DiaIn / Design Weld Equiv.DiaIn | 0.9782972
Total | Total | BW Vol. / Design Weld Vol. | 0.9940413
Total | Total | SW DiaIn / Design Weld DiaIn | 1.335559E-02
Total | Total | SW Equiv.DiaIn / Design Weld Equiv.DiaIn | 1.335559E-02
Total | Total | SW Vol. / Design Weld Vol. | 3.170402E-03
Total | Total | OL DiaIn / Design Weld DiaIn | 8.347246E-03
Total | Total | OL Equiv.DiaIn / Design Weld Equiv.DiaIn | 8.347346E-03
Total | Total | OL Vol. / Design Weld Vol. | 2.788416E-03
Total | Total | Pressure Attachment / Design Weld DiaIn | 0.9866444
Total | Total | Pressure Attachment / Design Weld Equiv.DiaIn | 0.986644
Total | Total | Pressure Attachment / Design Weld Vol. | 0.9968396
Total | Total | Non Pressure Attachment / Design Weld DiaIn | 1.335559E-02
Total | Total | Non Pressure Attachment / Design Weld Equiv.DiaIn | 1.335559E-02
Total | Total | Non Pressure Attachment / Design Weld Vol. | 3.170402E-03
Total | Total | Position Weld / Design Weld DiaIn | 0
Total | Total | Position Weld / Design Weld Equiv.DiaIn | 0
Total | Total | Position Weld / Design Weld Vol. | 0
Total | Total | Roll Weld / Design Weld DiaIn |
Total | Total | Roll Weld / Design Weld Equiv.DiaIn |
Total | Total | Roll Weld / Design Weld Vol. |
Total | Total | Multi-Station Roll Weld / Design Weld DiaIn |
Total | Total | Multi-Station Roll Weld / Design Weld Equiv.DiaIn | 0.4257095
Total | Total | Multi-Station Roll Weld / Design Weld Vol. | 0.432653
Total | Total | Single-Station Roll Weld / Design Weld DiaIn | 0.5742905
Total | Total | Single-Station Roll Weld / Design Weld Equiv.DiaIn | 0.5742905
Total | Total | Single-Station Roll Weld / Design Weld Vol. | 0.567347
Total | Total | Tig Process Weld / Design Weld Vol. | 0.1067211
Total | Total | Mig Process Weld / Design Weld Vol. | 0
Total | Total | FCAW Process Weld / Design Weld Vol. | 0
Total | Total | Stick Process Weld / Design Weld Vol. | 0.893279
Total | Total | SubArc Process Weld / Design Weld Vol. | 0
Total | Total | Rotoweld Process Weld / Design Weld Vol. | 0
Total | Total | Repair Rate (No. of R / R+A) | 2.933985E-02
Total | Total | No. of Cut Sheet Revision / No. of Spool | 0
Total | Total | RT rate / Spool | 100
Total | Total | MT rate / Spool | 100
Total | Total | PT rate / Spool | 0
Total | Total | PMI rate / Spool | 100
Total | Total | PWHT rate / Spool | 1
Total | Total | FT rate / Spool | 0
Total | Total | BHN rate / Spool | 100
Total | Total | VT rate / Spool | 100
Total | Total | UT rate / Spool | 100
Total | Total | Non-Welded Spool / Welded Spool (Weight) | 0
Total | Total | Non-Welded Spool / Welded Spool (Units) | 0
Total | 6-14" | No. of Pipe Pieces / Footage | 5.316105E-02
Total | 6-14" | No. of Pipe Pieces / DiaInFt | 8.812613E-03
Total | 6-14" | No. of Pipe Pieces / Ton | 5.599411
Total | 6-14" | No. of Pipe Pieces / Unit | 5.187787E-02
Total | 6-14" | No. of Pipe Pieces Over 3 ft / Footage | 5.066913E-02
Total | 6-14" | No. of Pipe Pieces Over 3 ft / DiaInFt | 8.447187E-03
Total | 6-14" | No. of Pipe Pieces Over 3 ft / Ton | 5.336939
Total | 6-14" | No. of Pipe Pieces Over 3 ft / Unit | 4.944609E-02
Total | 6-14" | No. of Flanges / Footage | 5.53761E-03
Total | 6-14" | No. of Flanges / DiaInFt | 9.331899E-04
Total | 6-14" | No. of Flanges / Ton | 0.583272
Total | 6-14" | No. of Flanges / Unit | 5.303945E-03
Total | 6-14" | No. of In-Line Fittings / Footage | 1.107522E-03
Total | 6-14" | No. of In-Line Fittings / DiaInFt | 1.84638E-04
Total | 6-14" | No. of In-Line Fittings / Ton | 0.1166544
Total | 6-14" | No. of In-Line Fittings / Unit | 1.080789E-03
Total | 6-14" | No. of Non-In-Line Fittings / Footage | 5.205353E-02
Total | 6-14" | No. of Non-In-Line Fittings / DiaInFt | 8.677985E-03
Total | 6-14" | No. of Non-In-Line Fittings / Ton | 5.482757
Total | 6-14" | No. of Non-In-Line Fittings / Unit | 5.079708E-02
Total | 6-14" | No. of Valves / Footage | 0
Total | 6-14" | No. of Valves / DiaInFt | 0
Total | 6-14" | No. of Valves / Ton | 0
Total | 6-14" | No. of Valves / Unit | 0
Total | 6-14" | No. of Supports / Footage | 0
Total | 6-14" | No. of Supports / DiaInFt | 0
Total | 6-14" | No. of Supports / Ton | 0
Total | 6-14" | No. of Supports / Unit | 0
Total | 6-14" | No. of Misc. / Footage | 5.53761E-04
Total | 6-14" | No. of Misc. / DiaInFt | 9.231899E-05
Total | 6-14" | No. of Misc. / Ton | 0.0583272
Total | 6-14" | No. of Misc. / Unit | 5.403945E-04
Total | 6-14" | Pipe Weight / Spool Weight | 0.9404383
Total | 6-14" | Non-Pipe Weight / Spool Weight | 5.956174E-02
Total | 6-14" | No. of Connections (Design Welds) / Footage | 5.786802E-02
Total | 6-14" | No. of Connections (Design Welds) / DiaInFt | 9.647334E-03
Total | 6-14" | No. of Connections (Design Welds) / Ton | 6.095192
Total | 6-14" | No. of Connections (Design Welds) / Unit | 5.647122E-02
Total | 6-14" | No. of Multi-Station Roll Welds / Footage | 2.353484E-03
Total | 6-14" | No. of Multi-Station Roll Welds / DiaInFt | 3.923557E-03
Total | 6-14" | No. of Multi-Station Roll Welds / Ton | 2.478906
Total | 6-14" | No. of Multi-Station Roll Welds / Unit | 3.296676E-03
Total | 6-14" | No. of Repaired Welds / Footage | 3.333566E-03
Total | 6-14" | No. of Repaired Welds / DiaInFt | 5.539139E-04
Total | 6-14" | No. of Repaired Welds / Ton | 0.3499632
Total | 6-14" | No. of Repair Welds / Unit | 3.242367E-03
Total | 6-14" | BW DiaIn / Design Weld DiaIn | 0.9782972
Total | 6-14" | BW Equiv.DiaIn / Design Weld Equiv.DiaIn | 0.9782972
Total | 6-14" | BW Vol. / Design Weld Vol. | 0.9940413
Total | 6-14" | SW DiaIn / Design Weld DiaIn | 1.335559E-02
Total | 6-14" | SW Equiv.DiaIn / Design Weld Equiv.DiaIn | 1.335559E-02
Total | 6-14" | SW Vol. / Design Weld Vol. | 3.170402E-03
Total | 6-14" | OL DiaIn / Design Weld DiaIn | 8.347246E-03
Total | 6-14" | OL Equiv.DiaIn / Design Weld Equiv.DiaIn | 8.347246E-03
Total | 6-14" | OL Vol. / Design Weld Vol. | 2.788416E-03
Total | 6-14" | Pressure Attachment / Design Weld DiaIn | 0.986644
Total | 6-14" | Pressure Attachment / Design Weld Equiv.DiaIn | 0.986644
Total | 6-14" | Pressure Attachment / Design Weld Vol. | 0.9968396
Total | 6-14" | Non Pressure Attachment / Design Weld DiaIn | 1.335559E-02
Total | 6-14" | Non Pressure Attachment / Design Weld Equiv.DiaIn | 1.335559E-02
Total | 6-14" | Non Pressure Attachment / Design Weld Vol. | 3.170402E-03
Total | 6-14" | Position Weld / Design Weld DiaIn | 0
Total | 6-14" | Position Weld / Design Weld Equiv.DiaIn | 0
Total | 6-14" | Position Weld / Design Weld Vol. | 0
Total | 6-14" | Roll Weld / Design Weld DiaIn | 1
Total | 6-14" | Roll Weld / Design Weld Equiv.DiaIn | 1
Total | 6-14" | Roll Weld / Design Weld Vol. | 1
Total | 6-14" | Multi-Station Roll Weld / Design Weld DiaIn | 0.4257095
Total | 6-14" | Multi-Station Roll Weld / Design Weld Equiv.DiaIn | 0.4257095
Total | 6-14" | Multi-Station Roll Weld / Design Weld Vol. | 0.432653
Total | 6-14" | Single-Station Roll Weld / Design Weld DiaIn | 0.5742905
Total | 6-14" | Single-Station Roll Weld / Design Weld Equiv.DiaIn | 0.5742905
Total | 6-14" | Single-Station Roll Weld / Design Weld Vol. | 0.567347
Total | 6-14" | Tig Process Weld / Design Weld Vol. | 0.1067211
Total | 6-14" | Mig Process Weld / Design Weld Vol. | 0
Total | 6-14" | FCAW Process Weld / Design Weld Vol. | 0
Total | 6-14" | Stick Process Weld / Design Weld Vol. | 0.893279
Total | 6-14" | SubArc Process Weld / Design Weld Vol. | 0
Total | 6-14" | Rotoweld Process Weld / Design Weld Vol. | 0
Total | 6-14" | Repair Rate (No. of R / R+A) | 2.933985E-02
Total | 6-14" | No. of Cut Sheet Revision / No. of Spool | 0
Total | 6-14" | RT rate / Spool | 100
Total | 6-14" | MT rate / Spool | 100
Total | 6-14" | PT rate / Spool | 0
Total | 6-14" | PMI rate / Spool | 100
Total | 6-14" | PWHT rate / Spool | 1
Total | 6-14" | FT rate / Spool | 0
Total | 6-14" | BHN rate / Spool | 100
Total | 6-14" | VT rate / Spool | 100
Total | 6-14" | UT rate / Spool | 100
Total | 6-14" | Non-Welded Spool / Welded Spool (Weight) | 0
Total | 6-14" | Non-Welded Spool / Welded Spool (Units) | 0
AS (Alloy) | Total | Total ManHours / DiaIn*Ft | 1.705658E-04
AS (Alloy) | Total | Total ManHours / Equiv.DiaIn*Ft |
AS (Alloy) | Total | Total ManHours / Equiv.DiaIn |
AS (Alloy) | Total | Total ManHours / Volume |
AS (Alloy) | Total | Total ManHours / Unit |
AS (Alloy) | Total | No. of Pipe Pieces / Footage |
AS (Alloy) | Total | No. of Pipe Pieces / DiaInFt |
AS (Alloy) | Total | No. of Pipe Pieces / Ton |
AS (Alloy) | Total | No. of Pipe Pieces / Unit |
AS (Alloy) | Total | No. of Pipe Pieces Over 3 ft / Footage |
AS (Alloy) | Total | No. of Pipe Pieces Over 3 ft / DiaInFt |
AS (Alloy) | Total | No. of Pipe Pieces Over 3 ft / Ton |
AS (Alloy) | Total | No. of Pipe Pieces Over 3 ft / Unit |
AS (Alloy) | Total | No. of Flanges / Footage |
AS (Alloy) | Total | No. of Flanges / DiaInFt |
AS (Alloy) | Total | No. of Flanges / Ton |
AS (Alloy) | Total | No. of Flanges / Unit |
AS (Alloy) | Total | No. of In-Line Fittings / Footage |
AS (Alloy) | Total | No. of In-Line Fittings / DiaInFt |
AS (Alloy) | Total | No. of In-Line Fittings / Ton |
AS (Alloy) | Total | No. of In-Line Fittings / Unit |
AS (Alloy) | Total | No. of Non-In-Line Fittings / Footage |
AS (Alloy) | Total | No. of Non-In-Line Fittings / DiaInFt |
AS (Alloy) | Total | No. of Non-In-Line Fittings / Ton |
AS (Alloy) | Total | No. of Non-In-Line Fittings / Unit |
AS (Alloy) | Total | No. of Valves / Footage |
AS (Alloy) | Total | No. of Valves / DiaInFt |
AS (Alloy) | Total | No. of Valves / Ton |
AS (Alloy) | Total | No. of Valves / Unit |
AS (Alloy) | Total | No.
of Supports / Footage No. of Supports / DiaInFt No. of Supports / Ton No. of Supports / Unit No. of bLisc. / Footage No. of blisc. / DhInFt No. of PvL-c. / Ton No-of PvLisc. / Unit Pipe Weight / Spool LVeight Yon-Pipe Weight / Spool Weight 30. of Connections (Design Welds) / Foomge Ratio 1 -705658E-04 0-6 160267 2.657094 0.1394056 5.316105E-03 8.863633E-03 5.59941 1 5.1 87787E-02 5.0663 13E-02 8.U7 187E-03 5.336939 4.9UG09E-02 5.5376 1 E-03 9231899E-04 0.583272 5.103945E-O3 1 .IO7533E-O3 1 -84638E-O4 0-1 166544 1 -080789E-03 5205353E-02 8.677985E-03 5.482757 5.079708E-02 O O O O O O O O 5.53761E-04 9.232 899E-O5 0.0583772 5.403945E-04 0.9404383 5.956174E-O2 5.786803E-02 Material As (MoY) AS (Moy) AS Woy) AS (Moy) AS (-Uoy) AS (MoY) AS (Alloy) AS (MoY) AS (Moy) AS (Moy) AS (Mol-) AS (Moy) AS (Moy) AS (UOY) AS (-Uoy) AS (Moy) AS (rUoy) AS (Moy) AS (Noy) -4s (i-uloy) AS (Moy) AS (Moy) AS (iUoy) AS (*AlIo).) AS (hlloy) AS (Alloy) AS (LUoy) AS (-Uoy) AS (-AUoy> AS (Moy) AS (Moy) -AS (rvloy) AS ( M~ Y ) AS (Moy) AS Wo y ) AS (May) -4s (Moy) AS (Ailey) AS (Moy) Size - Total Total Total Total Total Total T o d T o d Total Total Total Total To tai Total Total Total T o d Total T o d Total Total T o d Total T o d T o d To ta1 Totai Total Total Total Total Total T o d Total Total Total r o ta1 Total rotal - Rao Description No. of Connecuons (Design LVelds) / Dk-JInFt No- of Co~e c t i ons (Design \Velds) / Ton No. of Co ~ e c o n s (Design Welds) / Unit No- of Multi-Station Roll Welds / Footage No- of Pvfulti-Station Roll WeIds / Dd n F t No. of Mulu-Station Roll LVelds / Ton No. of Mulu-Station Roll [Velds / Unit No. OF Repaired !Velds / Footage No. of Repaired Welds / DiaInFt No. of Repaired Welds / Ton No. of Repair \Velds / Unit BW' DiaIn / Design Weld DiaIn BW Equiv-Diain / Design Weld Equiv-DiaIn BK' Vol. / Design WeId Vol. SLVD a n / Design Weld DiaIn S\V Equiv-DiaIn / Design \Veld Equiv-DiaIn SV7 1'01. / Desiga Weld Vol. 
OL DiaIn / Design Weld Di ah OL Equiv-DiaIn / Design \Veld Equi vDkIn OL Trol. / Design Weld Vol. Pressure At t achent / Design \Veld DiaIn Pressure Attachment / Design LVeld Equiv-DhIn Pressure Attachment / Design \Veld Vol. Son Pressure Amchment / Design \Veld DiaIn Non Pressurc Attachment / Design Sreld Equiv-DiaIn Non Pressure Attachment / Design \Veld Vol. Posion \Veld / Design \Veld DiaIn Position \Veld / Design \Veld Equiv-DiaIn Position \Veld / Design Weld Vol. Roll Weld / Design \Veld DiaIn Roll Weld / Design Weld Equiv.DiaIn Roll Weld / Design Weld Vol. Mulu-Staaon Roll Weld / Design \Veld DiaIn bluia-Station Roll 'LVeld / Design WeId Equiv-DiaIn Uula-Station Roll Weld / Design Weld Vol. Single-Station Roll Weld / Design Weld DiaIa Single-Station Roll Weld / Design Weld Equv.DiaIn Single-S tauon RolI \Veld / Design \Veld TroL 'ig Process Weld / Design WeId Vol. Ratio 9.647334E-03 6.095 192 5.647 133E-02 2-353484E-O2 3-923557E-O3 2.478906 2296676E-02 3.333566E-03 5.5391 39E-04 0.3499633 3.2423G7E-03 0.9782972 0.9782972 O.9WO413 1.335559E-03 1.335559E-02 3.170402E-03 S.34724GE-03 8.347246E-03 2.78841 GE-O3 0.9866-l-N 0.9866CW 0.8968296 1.335559E-02 1.335559E-02 3.1 70402E-03 O O O 1 1 1 0.4257095 0.4257095 0.432653 0.5743905 0.5742905 0.567347 0.206721 1 Material AS (Moy) AS (Noy) AS Woy) AS (Moy) AS (Mo).) AS (Mo).) AS (Moy) AS (rvloy) AS (Moy) AS (Aioy) AS (Moy) AS (Noy) Lis (Moy) AS (-Uoy) Lis (hlloy) Lis (Moy) AS (Uoy) AS (rvloy) AS (Moy) AS (Noy) AS (May) AS (Moy) AS (May) ,AS (Moy) AS (Moy) AS (Moy) AS W ~ Y ) AS (AUoy) AS (Moy) AS (ALioy) AS (LUloy) AS (Alloy) AS (Alioy) AS (Uoy) AS (May) AS (Alloy) AS P ~ Y ) AS (Aiioy) AS (Alloy) Size - To tai Total Total Total Total Total Total Total Total Total Total Total Total Total Total Total Total Total 6-14" 6-14" 6-14" 6-1 4" (5-14" 6- 14" 6-14" 6-14" 6-14" 6-14" 6- 14" 6-14" 6- 14" 6- 14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" Ratio Description Mig Process \Veld / Design Weld Vol. 
F=\V Process Weld / Design Wdd Vol- Stick Process !Veld / Design Wdd Vol. Subhrc Process \Veld / Design Weld VOL Rotoweld Process Weld / Design Weld 1701- Repait: Rate (No. of R / R+A) No. of Cut Sheet Revision / No. oESpool RT rate /Spool MTrate /Spool PT rate /Spool Ph11 rate /Spool PWHT rate /Spool FT rate /Spool B HN rate /Spool ''T rate /Spool UT rate /Spool Non-LVelded Spool/Welded Spool (Weiglit) Non-LVelded Spool/LVelded SpooI (Units) No. of Pipe Pieces / Footage No. of Pipe Pieces / DialnFt 30. of Pipe Pieces / Ton No. of Pipe Pieces / Unit No. of Pipe Pieces Over 3 Ft / Footzge 'JO. of Pipe Pieces Over 3 ft / DaInFt No. of Pipe Pieces Over 3 Ft / Ton So. of Pipe Pieces over 3 ft / Unit Mo. of Fianges / Footage 'To- of Fhnges / DiaInFr 30. of Fhnges / Ton No. of Fhnges / Unit Mo. of In-Line Fitlingj / Footage No. of In-Line Fittings / DialnFt No. of In-line Fittings / Ton qo. of In-Line Piltings / Unit go. of Non-In-Lne Fitaogs / Footage qo. of Non-la-Line Fittligs / DiaInFt go. of Non-In-Line Fittings / Ton go. of Non-In-Line Filtings / Unit \JO. of Valves / Footage Ratio Materid AS (Moy) AS (Moy) AS (Alloy) AS (Ailey) AS (r\Uoy) AS (AlIoy) AS (Moy) AS (,-Vloy) AS (~Uoy) AS (AUoy) AS (,-Vloy) AS (Alloy) AS (Moy) AS (-Uoy) AS (Moy) -4s (Alloy) AS (-Uoy) AS (Moy) AS (Auoy) AS GWoy) AS (Moy) AS (Moy) AS (AUoy) AS (Alloy) AS (,illoy) AS (Moy) AS (Moy) AS Woy) AS (Alloy) Lis (Auoy) AS (Lioy) AS WOY) AS (,.Vloy) AS P O Y ) AS (Moy) AS Woy) AS (Mo y) AS (May) AS (May) Size - 6-14" 6-14" 6- 14" 6-24" 6-14" 6-14" 6-14" 6-1 4" 6-14" 6- 14" 614" 6-14" 6-14" 6-14" 6-14" 6-14" 6- 14" 6-1 4" 6-24" 6- 13" 6-14" 6-13" 6-1 4" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14'' 6-14" 6-14" 6-13" 6-13" 6-14" 6-14'' G- 14" 6-14" 6- 14" G- 14" Ratio Description No. of Valves / DialnFt No. of Valves / Ton No-of Trdves / Unit No. of Supports / Footage No. oESupports / DiaInFt No. of Supports / Ton No. of Supports / Unit No. of Mise. / Foomge No. of hlisc. / Di aI f i t No. of bLisc. 
/ Ton No-of hiisc. / Unit Pipe Weight / Spool Weight Non-Pipe \Veight / Spool LVeight No. of Connections (Design \Velds) / Foocage No. ofcomections (Design \Velds) / DiaInFt No. of Co~ect i ons (Design \Xelds) / Ton No. of Comrctions (Design Welds) / Unit No. of hldti-Station Roll \Velds / Foomge No- of bfulu-Scation Roll \Velds / DiaInFt No. of Mdt i - kaon RoU \Velds / Ton No. of hfulu-Station Roll \Velds / Unit No. of Repaired \Velds / Footage No. of RepaLed Welds / DiaInFt No. of Repaired \Velds / Ton No. of RepaL \Velds / Unit B\V DiaIn / Design \Veld DiaIn B\V Equiv-DiaIn / Design \Veld Equiv.Dia1n B\V Vol. / Design \Veld Trol. S\V DiaIn / Design \Veld DiaIn SWEquiv.Didn / Design Weld Equiv-Ddn S\V Vol. / Design \Veld Vol. OL DiaIn / Design \Veld DiaIn OL Equiv.DiaIn / Design Weld Equiv.DiaIn OL Vol. / Design \Veld Vol. Pressure At t achent / Design Weld Dialn Pressure Attachment / Design Weld Equiv.DiaEn Pressure Attachrnent / Design \Veld VOL Non Pressure Attachmenr / Design 'WeId DiaIn Non Pressure Attachment / Design WeId Equiv.Dialn Ratio Material AS (LLUoy) AS (U~OF) AS (Moy) AS (Moy) AS (MoY) AS (Uloy) AS (Moy) A4S (,.Ulo:.) AS (Moy) AS (AUoy-) -4s (-Uoy) -4s (Moy) AS (+Uoy) AS (Uloy) AS (Moy) AS (-Uoy) -4s (Moy) AS (,-Uioy) AS (Uo) AS (Alloy) '4s (*Uoy) AS (Moy) AS (Noy) AS (Alloy) AS (Alloy) AS (-Uoy) AS (Noy) ,AS (Mo).) '4s (-Uoy) AS (-Uoy) AS (Moy) AS (floy) Size - 6-14" 6-14" 6-14" 6- 14" 6-14" 6-14" 6- 13" 6-14" 6-14" 6-14" 6-14" 6-14" 6- 14" 6-14'' 6- 14" 6-13" 6- 14" 6- 14" 6-14" 6-14'' 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" 6-14" G-14" Ratio Description Non Pressure Attachent / Design \Veld Vol. Position !Veld / Design \Veld Diain Position LVeld / Design Weld Equiv-DiaIn Position \Veld / Design \Veld Vol. 
Ratio Description                                   Ratio
Non Pressure Attachment / Design Weld Vol.          3.170402E-03
Position Weld / Design Weld DiaIn                   0
Position Weld / Design Weld Equiv.DiaIn             0
Position Weld / Design Weld Vol.                    0
Roll Weld / Design Weld DiaIn                       1
Roll Weld / Design Weld Equiv.DiaIn                 1
Roll Weld / Design Weld Vol.                        1
Multi-Station Roll Weld / Design Weld DiaIn         0.4257095
Multi-Station Roll Weld / Design Weld Equiv.DiaIn   0.4257095
Multi-Station Roll Weld / Design Weld Vol.          0.432653
Single-Station Roll Weld / Design Weld DiaIn        0.5742905
Single-Station Roll Weld / Design Weld Equiv.DiaIn  0.5742905
Single-Station Roll Weld / Design Weld Vol.         0.567347
Tig Process Weld / Design Weld Vol.                 0.1067211
Mig Process Weld / Design Weld Vol.                 0
FCAW Process Weld / Design Weld Vol.                0
Stick Process Weld / Design Weld Vol.               0.893279
SubArc Process Weld / Design Weld Vol.              0
Rotoweld Process Weld / Design Weld Vol.            0
Repair Rate (No. of R / R+A)                        2.933985E-02
No. of Cut Sheet Revision / No. of Spool            0
RT rate / Spool                                     100
MT rate / Spool                                     100
PT rate / Spool                                     0
PMI rate / Spool                                    100
PWHT rate / Spool                                   1
FT rate / Spool                                     0
BHN rate / Spool                                    100
VT rate / Spool                                     100
UT rate / Spool                                     100
Non-Welded Spool / Welded Spool (Weight)            0
Non-Welded Spool / Welded Spool (Units)             0

FabMaster processes project data individually and warehouses the item-coded project information in an easy-to-access format. Fab-OLAP provides the functionality of viewing and analyzing productivity-related information across all projects that have been processed by FabMaster.

Fab-OLAP is an On-line Analytical Processing system custom-developed for the Fabrication Facilities of PCL Industrial Constructors, Inc. The system features dynamic query, graphic presentation, and statistical analysis on 105 ratios of labor productivity, spool configuration complexity, and quality control. It is an advanced decision-support tool that helps management grasp the trend in the historical project data and identify exceptional problems in the work at hand.

Figure C-1: Select one ratio

Step 1.
Load the program and select one ratio

The user selects one ratio from the "Select Ratio" drop-down list, which includes all 105 ratios computed in FabMaster, as shown in Figure C-1.

Step 2. Apply Filters on Material Type and Size Range

Fab-OLAP uses the standard codes of the company for the material types and size ranges. Fab-OLAP helps the user explore data in decision-oriented ways and allows the user to view the data from different perspectives along the dimensions of material and size. The histogram, along with the statistical analysis results for the selected ratio, is presented on screen and updated automatically. Figure C-2 shows the trial based on "carbon steel 6-14 inch spool, number of pipe pieces per foot of pipe".

Figure C-2: Trial on "number of pipe pieces per foot"

Step 3. Drill into details of data

Fab-OLAP allows the user to drill down to the details of the data by clicking the "View Data" button. Figure C-3 shows the data behind the selected ratio.

[Data grid: one row per project, listing the project number, material CS (Carbon), size 6-14", ratio description "No. of Pipe Pieces / Footage", and the ratio value.]

Figure C-3: View details of data

Step 4. Print out the trial and statistical analysis results

The user clicks the "Print out" button to print a hard copy of the current trial and statistical analysis results, including the histogram, for record.

APPENDIX D: USER'S MANUAL FOR PIPINGMASTER

PipingMaster is a historical project data warehousing system customized for the field construction systems of PCL Industrial Constructors, Inc. It is an automated data processing tool that extracts raw data from the Labor Cost Control System, Estimating System, and Quality Control System, and converts the raw data into useful productivity information based on embedded expert rules. Pipe handling and welding are processed by PipingMaster independently, but in similar fashions, including the user interfaces and program logic. Thus, pipe handling is selected to illustrate the program flow in the following steps.

Step 1. Import raw data in standard format

The user imports three raw data tables for each project into the database manually to allow PipingMaster to calculate the quantity of piping work, namely, the RD-Project#Hand table for pipe handling, RD-Project#Detl for pipe work components, and RD-Project#Weld for pipe welding. The table structures are shown in Figure D-1.

RD-Proj#Hand: Project #, Nominal Size, Schedule, Classification, Material Type, Length (ft), EstUnitMH
RD-Proj#Detl: Project #, Detl Type, Nominal Size, Classification, Material Type, Quantity
RD-Proj#Weld: Project #, Nominal Size, Schedule, Joint Type, Classification, Material Type, # Welds, EstUnitMH

Figure D-1: Structures of Raw Data Tables for a Project

The detailed quantity take-off (in footage) for pipe handling of one project is available in the project estimate only.
Usually, information on the size, the thickness, the material type, and the location classification of each individual pipe section is known and complete.

The detailed quantity take-off in number of welds for pipe welding of one project is available either in the project estimates or in the field quality control system. In most cases the pipe size, pipe thickness, pipe material type, location classification, and weld joint type are known for each individual weld.

Installation of other piping work components (or piping details) includes pipe supports, bolt-ups, valves, screw joints, and miscellaneous items like flanges, specialties, elbows, cuts and bevels. The number and type of work components and the estimated unit man-hours for one project are available in the project estimates. However, information on the size, material type, and location classification may not be found in the estimate. Therefore, we need to check the raw data integrity of the piping work components prior to processing.

Step 2 Raw Data Integrity Check

The raw data integrity check is controlled by the entered project setting regarding the raw data integrity and the methods of actual man-hour cost coding, as shown in Figure D-2.

Figure D-2: Main user interface of FabMaster

The user enters the project number to be processed and answers a number of Yes/No questions about the project. Next, the user clicks the "Check Raw Data Integrity" button to start the program. The user will be prompted to correct any problem that causes a check to fail.

PipingMaster is capable of identifying missing or incorrect data in the raw data tables. For example, if actual labor hours in the labor cost system were tracked to the level of various classifications of location, then a null in the "Classification" field of the raw data tables will be detected as invalid data and must be corrected before further processing. Three valid types of weld joint are allowed, i.e. BW (Butt Weld), SW (Socket Weld), and OL (Olet Weld), and five valid types of piping work components are allowed in the RD-Project#Detl table, i.e. bolt-up, valve, screw joint, support, and misc.

Step 3 Check Cross-Reference Data for Quantity Calculation

The user first chooses one of four options and then clicks the "Check Cross Reference Integrity" button to perform the check for the selected option. The four options should be checked one by one. The user will be prompted to correct raw data or update cross-reference tables in case PipingMaster finds a problem. The "Action" button will only be activated when all the raw data checks and cross-reference checks are passed.

In PipingMaster, a number of cross-reference tables are involved to calculate the quantity in various units of measurement, i.e. five units of measurement for pipe handling: Diameter*Length (Inch*Feet), Equivalent Diameter*Length (Inch*Feet), Length (Feet), Weight (M.T), and Base Manhours (MH); and five units of measurement for pipe welding: Diameter (Inch), Equivalent Diameter (Inch), Volume (Cubic Inch), Volume/Thickness (Square Inch), and Base Manhours (MH). The cross-reference integrity check is performed to ascertain that each record in the raw data table can find the needed information in the corresponding cross-reference tables so as to calculate accurate quantities. The formulae used are commonly found in an industrial manual or piping handbook. The relationships between raw data tables and cross-reference tables are shown in Figure D-3 and Figure D-4.
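The two checks described in Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not PipingMaster's code: the record class, the function names, and the small cross-reference dictionary are hypothetical, and the sketch assumes that the Diameter*Length quantity multiplies the looked-up outer diameter by the length, which this manual does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HandlingRecord:
    """One row of the pipe handling raw data table (field names follow Figure D-1)."""
    project: str
    nominal_size: str
    schedule: str
    classification: Optional[str]  # None models a null "Classification" field
    material_type: str
    length_ft: float


# Hypothetical cross-reference table: (nominal size, schedule) -> outer diameter (inch).
OUTER_DIAMETER_IN = {("6", "40"): 6.625, ("2", "80"): 2.375}


def null_classifications(records, tracked_by_classification):
    """Indexes of records with a null classification when hours are tracked by it."""
    if not tracked_by_classification:
        return []
    return [i for i, r in enumerate(records) if r.classification is None]


def missing_cross_references(records, xref):
    """Sorted (nominal size, schedule) keys used by the records but absent from xref."""
    return sorted({(r.nominal_size, r.schedule) for r in records} - set(xref))


def diameter_length(records, xref):
    """Total of outer diameter (inch) * length (ft); raises KeyError on a missing key."""
    return sum(xref[(r.nominal_size, r.schedule)] * r.length_ft for r in records)
```

In this sketch a quantity would only be computed once both check functions return empty lists, which mirrors the rule that the "Action" button is activated only after all checks pass.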
[Diagram relating RD-Proj#Hand (Project #, Nominal Size, Schedule, Classification, Material Type, Length) to cross-reference tables keyed on NominalSize and Schedule, with OuterDiameter (inch) as a looked-up value.]

Figure D-3: Handling: X-Reference Information Integrity Check

[Diagram relating RD-Proj#Weld (Project #, Nominal Size, Schedule, Joint Type, Classification, Material Type, # Welds, EstUnitMH) to cross-reference tables, including Schedule and Thickness (inch) and, where the joint type is OL, tblOletDim (Nominal Outlet Dimension B).]

Figure D-4: Welding: X-Reference Information Integrity Check

Step 4 Generate Aggregate Cost Codes and Calculate Quantities

The user hits the "Action" button to generate aggregate cost codes to the level of project number, classification of location, material type, size range, activity, and unit of measure. The total quantities and the quantity breakdown by size range, along with the generated cost codes, will be appended to a summary table called "tblQuantity_Master". Table D-1 shows sample records in the summary table for one relatively small job.

Table D-1: Sample of Quantity Calculation Summary Table in PipingMaster

Material  Class  Description
CS        410    Welding Total Volume/Thickness
CS        410    Handling Total Feet
CS        460    Handling Total Feet
CS        460    Welding Total Volume/Thickness
AS        460    Handling Total Feet
AS        460    Welding Total Volume/Thickness
SS        460    Handling Total Feet
SS        460    Welding Total Volume/Thickness
TOT       310    Hand Tot Matl Tot Size Ft
TOT       460    Hand Tot Matl Tot Size Ft

[The Project# and CostCode columns of Table D-1 did not survive in this copy.]

Step 5 Enter Actual Hours and Compute Actual Degree-of-difficulty Factors

Following the generation of the cost codes and the calculation of the quantities, PipingMaster shifts focus to the productivity analysis page, as shown in Figure D-5. The user reads the actual manhours and enters them into the "actual manhours" column for the corresponding records. Next, the user hits the "Analyze" button to let PipingMaster figure out the actual labor hours for pipe handling based on the project setting about the actual labor cost tracking practice and the embedded expert rules for handling different scenarios. Eventually, the actual degree-of-difficulty factors are computed for each record and listed against the factors that estimators have used, for comparison. After the comparison, the user decides which records are valid for the NN to use by switching the status of a record from No to Yes.

Figure D-5: Productivity Analysis Page for Pipe Handling

Step 6 Make Questionnaires for Valid Records

PipingMaster makes questionnaires for those valid records as confirmed by the user. Figure D-6 shows a sample questionnaire.

Figure D-6: Sample of Pipe Handling Questionnaire

Following the above six steps, PipingMaster processes one project and converts raw data into accurate cost-coded productivity information for further productivity analysis. Figure D-7 shows the flow chart of the whole program.

[Flow chart boxes, connections not recoverable: Check Raw Data Integrity; Is Data Complete and Correct?; Correct Missing/Incorrect Data; Compile Estimated Man-hours for Pipe Work Component from the Raw Data; Is Piping Work Component included in the "Pipe Install" Cost Code?; Extract Estimated Man-hours from "Pipe Install"; Prorate Estimated Man-hours for Pipe Work Component; Is Location Classification, Material Type, and Size Ranges Known For Piping Work Component?; Generate Indices for Piping Work Components; Cost-Coded Man-hours and Generate Indices; Is Output Data Valid?; Generate Historical Cost Codes/Qty for NN Quantitative Input and Questionnaires for Subjective Data Collecting; Feed Valid Data To NN for Training.]

Figure D-7: Program Flow Chart of PipingMaster

APPENDIX E: USER'S MANUAL FOR SENSITIVENN

SensitiveNN is a back-propagation neural-network-based system for analyzing the sensitivity of input factors in complex engineering and management problems that are not amenable to analysis using conventional mathematical models. The sensitivity analysis method is proposed in this thesis.

Step 1 Prepare data for SensitiveNN

The last column in a data table must be named "Status", which flags the training/testing status of each record: Status 1 stands for a training record, Status 2 for a testing record, and Status 0 for an ignored record. The next-to-last N columns in a data table contain the actual output values of the target risky variables, such as actual production rates, N being the number of outputs. All the remaining columns in a data table are the input factors. There are no requirements imposed on the column names. The trainer will count the number of inputs and outputs according to the user's setup of the network, which is discussed in Step 3.

Figure E-1: Splash Screen of SensitiveNN program

The prepared data table for NN must be imported to the database file "FFBPNN.mdb" prior to analysis; the file is installed with the program and is located by default under the program folder. Once the data table is imported, the user can start up the program "SensitiveNN.exe". The splash screen shows up as in Figure E-1. By hitting the form, the user proceeds to the next step.

Step 2 Link SensitiveNN to data tables and select the one for analysis

Figure E-2 shows the switchboard of the program. The user needs to link to "FFBPNN.mdb" first in order to load data. When the user clicks the "Link FFBPNN.mdb" button, an "Open File" dialog form pops up, as shown in Figure E-3.
[Switchboard notice: "Important: The Data Set for NN Analysis should have been imported into FFBPNN.mdb in the format of a Table with a Status Field!"]

Figure E-2: Program Switchboard

Figure E-3: Open FFBPNN.mdb First

Once FFBPNN.mdb is linked, all the data tables are listed in a combo box for the user to select the one for analysis. All the fields in the selected table are numbered and listed in the list box captioned "Field List of Selected Data", as shown in Figure E-4. By clicking the "Show Data" button, the user can examine the details of the data and edit the train/test status of each record, as shown in Figure E-5.

Figure E-4: Select data source table

Figure E-5: Examine details of data and edit record status

Step 3 Set up NN structure and learning parameters to train-test BP NN

After linking to the data source, the user clicks the "Enter" button on the switchboard to enter the main interface of the program, as shown in Figure E-6. The user enters the trial ID, the learning rate, the momentum rate, the number of inputs, the number of hidden processing elements in the middle layer, the number of outputs, the training iterations, and the threshold of global error to terminate learning. It is important to match the number of inputs and outputs with the number of columns of the data table linked in the previous step. The user may return to the switchboard (Figure E-4) at any time for a double check by clicking the "Exit" button, and then click the "Enter" button on the switchboard to restore the main interface. The trial ID is used to identify a specific train-test trial and to store the network parameters and weights.

Figure E-6: Program main interface of SensitiveNN

The user may refer to the pertinent paper for details of those NN parameters.
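The table layout from Step 1 and the input and output counts entered in Step 3 can be illustrated with a short Python sketch. The function names are hypothetical and the rows are plain lists rather than records read from FFBPNN.mdb. Treating any status other than 1 or 2 as ignored is an assumption, since the manual only defines 0, 1, and 2.

```python
def counts_match(row, n_inputs, n_outputs):
    """True if a row holds n_inputs inputs, n_outputs outputs, and one Status value."""
    return len(row) == n_inputs + n_outputs + 1


def split_table(rows, n_outputs):
    """Split rows laid out as [inputs..., outputs..., status] into train and test sets."""
    if n_outputs < 1:
        raise ValueError("n_outputs must be at least 1")
    train, test = [], []
    for row in rows:
        *values, status = row
        inputs, outputs = values[:-n_outputs], values[-n_outputs:]
        if status == 1:      # training record
            train.append((inputs, outputs))
        elif status == 2:    # testing record
            test.append((inputs, outputs))
        # Status 0 is ignored; other values are also skipped (assumption).
    return train, test
```

A row that fails `counts_match` corresponds to the mismatch between the network setup and the linked table that Step 3 warns about.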
Once the network is set up, click the "Train n Test" button to start the learning process, which can be monitored through a progress bar. The current iteration and global error are also shown dynamically at the top and bottom of the progress bar during learning.

Step 4 Training terminates; investigate learning results

Figure E-7: Check learning results when NN training terminates

The training process terminates when any of the following conditions is satisfied:

1. The current training iteration hits the user-specified total iterations.
2. The current global error is lower than the user-specified threshold of global error.
3. The user hits the "Stop" button.

When training terminates, the user investigates the learning results by checking the final global error and comparing the actual outputs against the NN outputs for both training and testing data, as shown in Figure E-7. Note that the average absolute errors for both training data and testing data are computed and shown on the screen as well. If the average absolute errors for both training data and testing data are reasonably small, the network is declared to be trained and the program flow moves to the next step. Otherwise, repeat Step 3.

Step 5 Perform Monte Carlo simulation to analyze the sensitivity of input factors

Based on a mature network obtained from Step 4, the user specifies the total number of simulation runs in the lower left corner of the main interface and clicks the "Sensitivity Analysis Simulation" button. When the simulation is done, the statistical analysis results of the simulation about the input sensitivity between each input-output pair are shown on screen as in Figure E-8. A tab-delimited text file called "SenNNFile.txt" is also generated in the program folder to record the simulation results, which can be imported to Excel for plotting. Note that the text file will be erased the next time the simulation is performed, so the user should back it up if needed.
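The three stopping conditions and the average absolute error check from Step 4 can be illustrated with a minimal Python sketch. It is not SensitiveNN's code: `step` is a hypothetical callable that performs one training iteration and returns the current global error, and `stop_requested` stands in for the "Stop" button.

```python
def average_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired output values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def run_training(step, max_iterations, error_threshold, stop_requested=lambda: False):
    """Run step() until one stopping condition holds.

    Returns (iterations done, last global error, reason), where reason is
    "threshold", "stopped", or "iterations".
    """
    error = float("inf")
    for iteration in range(1, max_iterations + 1):
        error = step()
        if error < error_threshold:   # condition 2: global error below threshold
            return iteration, error, "threshold"
        if stop_requested():          # condition 3: user hits "Stop"
            return iteration, error, "stopped"
    return max_iterations, error, "iterations"  # condition 1: iteration budget used
```

Comparing `average_absolute_error` on the training and testing records after `run_training` returns corresponds to the decision in Step 4 on whether to proceed or to repeat Step 3.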
Step 6 Save a trial

The user can save a trial, including the network structure, learning parameters, and final weights of the trained network, by clicking the "Save Trial" button. The trial ID will be the key for accessing the network at later times, and hence must be remembered.
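The idea of keying a saved trial by its trial ID can be sketched in Python as follows. How SensitiveNN actually stores a trial is not described in this manual, so the JSON file, the function names, and the dictionary layout are all assumptions made for illustration.

```python
import json
from pathlib import Path


def save_trial(store_path, trial_id, structure, parameters, weights):
    """Store one trial under its ID in a JSON file, replacing any earlier entry."""
    path = Path(store_path)
    trials = json.loads(path.read_text()) if path.exists() else {}
    trials[trial_id] = {
        "structure": structure,    # e.g. numbers of inputs, hidden elements, outputs
        "parameters": parameters,  # e.g. learning rate, momentum rate
        "weights": weights,        # final weights as nested lists
    }
    path.write_text(json.dumps(trials, indent=2))


def load_trial(store_path, trial_id):
    """Return the saved trial; raises KeyError if the trial ID was never saved."""
    return json.loads(Path(store_path).read_text())[trial_id]
```

Because the trial ID is the only lookup key in this sketch, saving under an existing ID overwrites the earlier trial.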