The International Journal of Digital Curation
Issue 1, Volume 6 | 2011

Towards Support for Long-Term Digital Preservation in Product Lifecycle Management

Wolfgang Wilkes, Jörg Brunsmann, Dominic Heutelbeck, Andreas Hundsdörfer, Matthias Hemmje and Hans-Ulrich Heidbrink
University of Hagen, InConTec GmbH¹

Abstract

Important legal and economic motivations exist for the design and engineering industry to address and integrate digital long-term preservation into product lifecycle management (PLM). Investigation revealed that it is not sufficient to archive only the product design data that are created in early PLM phases, but that preservation is needed for data that are produced during the entire product lifecycle, including early and late phases. Data that are relevant for preservation consist of requirements analysis documents, design rationale, data that reflect experiences during product operation, and also metadata like social collaboration context. In addition, the engineering environment itself, which contains specific versions of all tools and services, is a candidate for preservation. This paper takes a closer look at engineering preservation use case scenarios, as well as PLM characteristics and workflows that are relevant for long-term preservation. Resulting requirements for a long-term preservation system lead to an Open Archival Information System (OAIS)-based system architecture and a proposed preservation service interface that respects the needs of the engineering industry.²

1 InConTec GmbH: http://www.incontec.de/ICT_Webpage_en.htm.
2 This paper is based on the paper given by the authors at iPRES 2009; received January 2010, published March 2011.

The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. ISSN: 1746-8256
The IJDC is published by UKOLN at the University of Bath and is a publication of the Digital Curation Centre.

Introduction

The Sustaining Heritage Access through Multivalent ArchiviNg (SHAMAN) digital preservation project³ investigates the needs of different application domains, which are known as Integration & Demonstration Subprojects (ISPs). The ISPs include:

- Memory institutions: scientific publishing in libraries and documents in governmental archives;
- Data resources used in e-Science applications;
- The industrial design and engineering industry.

The industrial design and engineering industry is characterised by a large number of heterogeneous design and development tools that are organised by product lifecycle management (PLM) and product data management (PDM) systems (SHAMAN, 2008). Those PLM and PDM systems provide services such as company-adapted workflows, data integration, and version and configuration management. A large volume of heterogeneous data is created during execution of all PLM phases. Companies in the design and engineering industry consider long-term preservation of such product data as an additional asset to their PLM/PDM systems.

As discussed by Heutelbeck, Brunsmann, Wilkes and Hundsdörfer (2009), there are several important motivations for design and engineering companies to engage in digital long-term preservation. These include legal requirements defined by law and contracts, as well as economic reasons such as efficient reuse. For long-term preservation, the Open Archival Information System (OAIS; Consultative Committee for Space Data Systems, 2002) is widely accepted as a reference architecture. By analysing the design and engineering industry, this paper derives a number of requirements which have to be satisfied by digital preservation systems in order to be adoptable within state-of-the-art design and engineering environments.

This paper is organised as follows: we first give a short introduction to the OAIS
reference architecture. We then lay out several use case scenarios for preservation system access in the engineering industry. We proceed by describing characteristics of PLM-based workflows and existing shortcomings of these implementations. From the described use case scenarios and shortcomings, we derive requirements for long-term preservation systems in the engineering industry. Finally, we propose a high-level architecture and service interface for a long-term preservation system capable of supporting the needs of PLM systems.

The Open Archival Information System Model

An OAIS is an archive consisting of an organisation of people and systems that have accepted the responsibility to preserve information and make it available for a Designated Community. The OAIS provides a conceptual reference architecture and does not specify a concrete implementation.

3 SHAMAN homepage: http://shaman-ip.eu/shaman/.

The OAIS environment model consists of the OAIS archive and three external actors. A producer is a person, organisation or system that provides the digital information to be preserved. A consumer is a person, organisation or system that accesses the OAIS to find and acquire preserved information. The Designated Community, which may be composed of multiple user communities, is an identified group of potential consumers who should be able to understand a particular set of information. The management are those persons or organisations who set the overall OAIS policies in the sense of management's responsibilities and not the daily operational archive administration. While investigating use cases and requirements, it is necessary to focus on the needs of the Designated Community, which will be done in the next section.

The OAIS functional model consists of six
major functional entities. The Ingest functional entity accepts one or more Submission Information Packages (SIPs) and creates Archival Information Packages (AIPs), which it provides to Archival Storage for preservation. Ingest also sends Descriptive Information (DI) to the Data Management functional entity. Consumers interact with the Access function, which uses the DI to find the content information of interest. Access retrieves AIPs from Archival Storage and sends Dissemination Information Packages (DIPs) to the consumer. Administration oversees the day-to-day operation of the archive, and it receives advice from Preservation Planning on evolving strategies and mechanisms for preservation. All six functional entities are further broken down in the OAIS reference model.

OAIS-based preservation systems might use some of the following preservation methods:

- Migration is the continuous translation of data between data formats and systems;
- Transformation is the conversion of data from one format to another format that is assumed to be more preservable;
- Emulation (virtualisation) is the duplication of a system's functionality on another system.

Designated Community Use Case Scenarios in the Engineering Industry

Since the requirements for a preservation system are driven by the needs of the preservation system consumers, we study these use cases in more detail. The structures of use case scenarios for preserving engineering data follow the basic motivations of corporate preservation as introduced by Heutelbeck et al. (2009). Here, we extend the motivations with concrete examples.

Legal Use Cases

In legal scenarios, preservation is mandatory by either contract or law. Legal defense can be prepared using preserved engineering data. Some examples are:

- A malfunction in a 40-year-old aircraft results in a crash, which leads to serious injuries or even death of pilots and
passengers. Investigating agencies demand access to preserved data from the manufacturing company.
- Malfunction in medical equipment can cause serious health problems for patients, or can even lead to deaths.
- A malfunction in automotive electronics destroys other parts of a car and can result in an accident.

Economic Use Cases

Economic use cases seek to increase the return on investment by lowering the cost for activities such as reuse, maintenance, refitting and training. Examples are:

- As a consequence of a malfunction, for example in a 40-year-old aircraft, unexpected downtimes of aircraft, spacecraft or satellites occur that result in financial damage for the company. An engineer of the operating company will use the preserved product data to analyse a potential error.
- Products with a long lifetime are occasionally enhanced during their lifetime, either to introduce new functionality or to meet new requirements coming from customers or the market. Depending on the impact of these modifications on different areas of a product, it might be necessary to check either portions of the data or the total product data.
- A future engineer takes some parts (e.g. the enclosure) of an existing product and reuses them for a new product. Sophisticated search functionality supports him in finding the product data that fit his needs. The advantage of this approach is a shorter time to market for a new product and the reuse of parts which have already passed quality assurance tests, etc.
- A third-party provider for electronic equipment wants to develop and market add-on equipment for a company's product. He can selectively access the company product design data to investigate specification details and design recommendations for interfaces.
- A product recall happens as a result of malfunction. The engineering department has to check the product design data, fix
the problems and create a new revision of the product to pass it again to manufacturing.

Archive Consumers

Based on these use case scenarios, we can identify the actors (consumers, Designated Community) for accessing engineering data that are preserved on a long-term basis:

- Future engineers
- Investigators
- Other companies
- Governmental authorities
- Regulatory agencies

We can identify two major kinds of requirements from an access point of view:

- Access to the (native) engineering design data for reuse. The engineer needs full access to those parts of the design data which he wants to reuse. In electronic engineering, these are the logical design, simulation models, placement data for the physical design, specifications, etc.;
- Access to representations of the design data, for example PDF documents and images of 2D drawings, which have been created from the native design data. These may be used to support service personnel, regulatory needs or, for example, ISO9000⁴ requirements.

Product Lifecycle Management Characteristics

This section lays out the fundamental properties that exist in PLM-based environments. We describe characteristics that are inherent to the engineering industry and that affect requirements and, in consequence, the system architecture.

Overview

Products in the aerospace, automotive, chemical and petroleum, electronics, energy and utilities, as well as shipbuilding industries pass a product lifecycle that spans from phases such as idea generation, requirements collection, product planning, development, process planning, production and operation to disposal and recycling. During the early phases of the product lifecycle (e.g. development phase), PDM systems are involved. PDM systems are used in the engineering and design communities in order to manage the product concept with the objective to generate reproducible product configurations. Therefore, PDM
systems maintain the product design as versioned data in a repository. PDM functions as an interface between technical and commercial data processing, that is, between computer-aided design (CAD) systems on the one side and procurement and production on the other side. PLM systems extend PDM functionalities by managing the entire lifecycle of a product from conception to service and disposal. PLM integrates people, data, processes and business systems, and provides a product information backbone for companies and their extended enterprise.

In the following, we focus on two areas of engineering systems which are relevant for preservation:

- Data formats and models;
- Workflow, processes and the product lifecycle.

Data Formats and Models

Data models. A product is described from many different viewpoints; for example, an electronic product is described by geometry, netlist, routing and placement, simulation models, 3D mechanical construction data, etc. Each different viewpoint is composed of a number of objects and sub-objects. Thus, we can formulate the following characteristics:

4 International Organisation for Standardisation ISO9000: http://www.iso.org/iso/iso_9000_essentials.

Product data size. The size of product data is large: the complete design information for a complex product may need gigabytes of data.

Product data collection. A product is described by a collection of digital objects. An electronic product is made up of many different components (enclosure, printed circuit boards, cables, software, etc.) which are developed in different systems that use different file formats. In addition, references to objects outside the boundaries of PLM systems may be involved, for example to information captured in an enterprise resource planning (ERP) system (e.g. SAP R/3⁵).

Multiple native formats.
Digital engineering objects are created in multiple native formats of CAD systems. Even one assembly in mechatronics might contain multiple produced design elements coming from different sources, represented in several different file formats. A unified model covering all objects in all design and engineering disciplines does not exist.

Data formats. Design data is usually produced by the use of tools. Therefore, we can formulate two further characteristics which are important for digital preservation:

Change of native formats. Typically, CAD vendors release one major release per year, which contains additional functionality. In most cases, the internal data model changes or is extended. Additionally, minor releases (service packs) are deployed several times a year, which combine bug fixes with minor enhancements.

Copyright on formats. Most of the native formats are proprietary and thus not generally readable. The interpretation of a native format usually requires the use of a specific tool of a specific vendor.

References to other data. Even in the context of a PLM system, the data may not be self-contained, that is, there may exist external references, for example to globally-maintained ontologies and classifications, or to objects in databases of collaboration partners. This leads to the two following characteristics:

Proprietary taxonomy description. Drawings, components, connectors, standard parts and other regularly-used design elements are managed in specific libraries that are created and maintained external to a company. Global element libraries are mapped into local company libraries. This happens in many companies with proprietary taxonomy descriptions, including the geometrical layout of the design objects.

Links to externally-developed designs. Due to the complex
nature of product designs, products are often created in collaborative projects by several companies. These companies are geographically separated, and each maintains its own local design repositories that contain references to other parts of the product data model.

Workflow, Processes, and the Product Lifecycle

Organisation of processes. A product is developed in complex processes which partly follow strict workflows, but which also contain creative phases of collaboration. These processes are organised in many different ways:

5 SAP AG software corporation: http://www.sap.com/about/company/history/1982-1991/index.epx.

Company-specific design methodologies. The design methodology can vary significantly from company to company. In addition, a distinction must be made between companies which have a full-blown design process with electronic (chip, printed circuit board (PCB) and cabling) and mechanical design, and other companies which have only a fraction of these.

Project heterogeneity. Even engineering projects within the same company can differ significantly regarding the policies and best practices they have to follow, depending on the kind of product, the target market and region, and so forth.

Data and collaboration. Design processes are collaborative processes where a number of people work together to create products:

Collaboration. Product data is shared among people and tools. Collaboration takes place within the same company or across different companies. Inter-company collaboration requires sharing of information, but only as far as necessary for the collaboration. Companies are very sensitive to the protection of their data (intellectual property) from unwanted exchange with collaboration partners.

Geographically-distributed product lifecycle.
Products are designed, manufactured and serviced in a globally-distributed environment.

Many designs are variations of existing designs. Thus, existing designs have to be kept in a retrievable form:

Design reuse. Engineers often search, browse and retrieve already-developed designs in order to get ideas, examples or reusable items.

The different design phases are performed in different contexts (tools, locations, people, etc.). Thus, a lot of metadata and context data are produced during the course of product development.

Metadata and context data creation during the whole lifecycle. During the design and development processes, not only product data but also metadata describing the performed processes are created. For example, during collaborations between different domains (e.g. electronic CAD and mechanical CAD [eCAD/mCAD]) and during interactions between various PLM phases (e.g. a change request from requirement analysis to product development), social context data are created. These data need to be collected during the whole product lifecycle.

Version and configuration management system usage.
Existing PLM implementations contain version and configuration management that partly archives product data.

Shortcomings and Problem Statement

Though several academic and industrial projects already tackle the problem area of long-term preservation of engineering data, some important aspects have not been considered so far (Brunsmann & Wilkes, 2009). The discussed characteristics represent a basis for finding missing project scopes and shortcomings of existing PLM implementations. The identified shortcomings are grouped by OAIS architecture functionality.

General

The state-of-the-art preservation feature in PLM-based systems is the creation and storage of versions in vast PLM repositories when certain milestones are reached. Archiving of product designs is done by creating backups of PLM repositories. Of course, these PLM repositories do not offer preservation functionalities. Thus, the change to a different tool or PLM system is not possible. If data archiving is considered by existing projects at all, then only geometric data are archived. However, as described above, product designs do not consist only of geometric data, which means that the other data are lost forever.

Data Ingest

The practice of archiving and creating metadata only once makes the archival step quite complex. This lowers the acceptance of the archiving process. Unfortunately, methods for automatic extraction of metadata are currently missing during ingestion or during creation of data. Therefore, capturing of metadata is perceived as intrusive and its value for later access to the archived data is not understood. In addition, valuable data and metadata are created during the whole lifecycle. Metadata include data about processes, people, products, provenance and collaboration. If these data are not fully collected during the process, they are lost forever. Company collaborations lead to distributed data
repositories. The collection of distributed data for archiving is not provided by current PLM solutions. In addition, it is not guaranteed that all companies use tools of the same vendor in the same version.

The data that are created and ingested in one PLM phase are interconnected with objects from other phases; for example, a product design depends on the requirements document, a physical layout depends on the logical design, or a document created in the service phase depends on the product design. These relationships and connections to other data and metadata that span several PLM phases have not been considered in long-term preservation projects in the engineering industry.

Data Access

For product design reuse and all other use cases, searching and finding of product designs is essential, but can only be done in a reasonable way if semantics are attached to the archived product data. Unfortunately, semantics are neither attached nor archived for a product design by existing PLM implementations.

Since the archived product designs are only available in native formats, the data cannot be used with future tool revisions unless the data are transformed into vendor-neutral formats.

Preservation Planning

Existing projects concentrate on migration as their preservation method, which creates problems due to the high frequency of format or system changes. Other existing projects use the method of transformation to vendor-neutral formats (Ball, Patel & Ding, 2008; Lubell, Rachuri, Mani & Subrahmanian, 2008). But transformation does not solve all problems, since the neutral format usually does not cover 100% of the native format and vendor-neutral formats evolve with the native formats. Other methods like emulation or system preservation have not been considered in more detail.

If the system environment and the proprietary product data are archived, it is not
guaranteed that valid software licenses for applications are available if the archived native data are accessed several years later.

Since global taxonomy standards evolve independently, archived data that reference these global standards may become unusable without notice.

Requirements

The examination of characteristics and shortcomings and the analysis of the use case scenarios of the previous sections lead to a number of high-level requirements for digital preservation systems in PLM:

General

Process integration. Since relevant data are created in all PLM phases, it should be possible to integrate archiving system functionality into all PLM-based workflows.

Modularity and adaptability. Since the design methodology is very different in different companies, a digital preservation system should be modular, customisable and adaptable.

Service interface availability. It must be possible to access the preservation functionality via application programming interfaces. This ensures that tools written in different programming languages and running on different operating systems can access preservation functionality.

System autonomy. The preservation system has to be independent from the specific PLM system that currently uses the preservation system. It is desirable to enable a switch to another PLM system without loss of archived product data. It should also be possible to preserve the data without having access to an existing PLM system.

Data Ingest

Data and metadata, and file format standards.
The system should be able to transform multiple native formats into standard vendor-neutral formats. The system should be aware of standardised data and metadata and file formats so that future tools are able to interpret the data. The reference model should be able to accommodate existing standard metadata and file formats that are used by many companies and during collaborations. In addition, the model should allow new standards to be included.

Parallel archiving of native and standard format. The system must be able to archive both native and standard formats in parallel.

Metadata generation and preservation. To enable engineers to find existing designs which can be reused, the system should create and store metadata that describe the archived designs for searching on a reasonably detailed level. Metadata include graphics and significant properties. The extraction of metadata should be possible both automatically and manually.

Process history tracking. For audit and error-checking processes, it is necessary to archive not only the product data but also information about the design process of a product. Therefore, at all stages of the design and preservation process, automated as well as interactive user actions should be monitored and added to the change history of the stored design objects. This practice enables users to search, retrieve and understand a product design.

Data validation. The system should validate the data before they are ingested into the archive and after the access from the archive. Validation should include checks for completeness and for correctness.

Packaging of design data and metadata.
In order to allow the retrieval of parts of a product design (e.g. the motherboard of a computer) or data reflecting a certain stage of the design cycle (e.g. logical design or physical design), the system should be able to package both types of design data according to their type.

Data Access

Intellectual property rights protection. The system should restrict the access to all CAD objects and properties in the design files. This ensures the protection of intellectual property that is required in collaboration processes where different companies are involved.

Data validation. The system should also validate the accessed data for completeness and correctness.

Access to parts of a product design. A product design consists of a collection of objects. The system should allow access to the complete design as well as to parts of the design (e.g. logic design or mechanical construction data). If product data are reused with old design tools, it must be ensured that the design or directory structure is maintained in the archive.

Search and retrieval of product data by metadata. Engineers will require access to data from different projects in different data formats created by different design tools. The system should provide discovery mechanisms for archived product designs.

Data Preservation Planning

Migration support. Due to the frequent changes of file formats, the system should support the migration of archived product designs in order to be in sync with the appropriate CAD tool and PLM system versions. Archived data must be migrated either on access or following a scheduled preservation plan.

PLM process triggers for preservation planning activities. Process milestones influence the necessity to keep specific formats in the archive; for example, after the milestone "end of life" the native format may become obsolete.

External taxonomy dependencies.
Elements of component libraries are described by taxonomies that are external to the companies. The system should maintain references and mappings to specific versions of these taxonomies.

Versioning of parts of a complete product design. A product design consists of a hierarchy of dependent objects. However, it is possible that parts of an archived product design are deleted or updated, or new objects are added to it (e.g. during the maintenance phase of a product). Therefore, the system should allow the creation of versions of a product design and put them under configuration control.

External object maintenance. The system should maintain relationships to objects outside of the PLM system, for example to an ERP system.

High-level System Architecture

In this section we map the given requirements to the OAIS reference architecture and derive extensions or adaptations to OAIS functionality. We identify steps in the workflows managed by PLM systems where a link to such a preservation system is possible.

As already described in the introduction, the SHAMAN project investigates the needs of three different application domains. These different domains share a similar lifecycle, which is shown in Figure 1 and described below.

Figure 1. SHAMAN Information Lifecycle.

SHAMAN Information Lifecycle

The proposed SHAMAN lifecycle (Brocks, Kranstedt, Jäschke & Hemmje, 2009) contains the phases pre-ingest with creation and assembly, archival, and post-access with adoption and reuse.

During pre-ingest, all activities are executed that must be taken prior to the ingestion of data into the preservation system. The creation phase gives birth to new information that could be the result of complex
processes involving many producers. In the engineering industry, this is the creation of archives of relevant metadata in real-time during all phases of the PLM process. During the assembly phase, additional information that is relevant for archiving is collected. Assembly assures that enough information is archived so that a digital object remains reusable for a future consumer. In the engineering industry, all relevant data are collected that are needed to correctly interpret a product design (i.e. geometry, coordinate list, metadata, product component libraries, simulation and verification data).

During the archival phase, the digital object is stored and maintained in a preservation system. Policies describe the lifetime and migration of digital objects.

The post-access phase comprises all activities that are needed for preparing the final access of the preserved data. During the adoption phase, the archived digital objects are adapted for domain-specific reuse that a preservation system cannot accommodate. In the engineering industry, the archived data formats are translated to formats that are interpretable with current tools. During the reuse phase, the consumer exploits the archived information. Reuse may lead to the creation of new digital objects that later are also candidates for archiving. In the engineering industry, an archived product design can be imported into a proprietary tool for creating product variations.

After having described the information lifecycle in general, we are able to specify a SHAMAN-based high-level system architecture for the engineering industry (see Figure 2). The systems are interconnected with solid (use-relationship) and dashed (reference-relationship) arrows. The relevant key aspects of the figure are described starting from the top.

Figure 2. System Architecture. (Figure labels: Producer, Consumer, PLM Tools; Core PLM system with Workflow, Collaboration, Version/Configuration management, Document management; lifecycle phases Idea, Requirements, Realize, Operation, Recycling; Long-term preservation functionality with Pre ingest and Post access; PLM repository; SHAMAN preservation system with Service interface, Ingest, Access, Preservation policies, Preservation planning, License management, Rights management. Legend: uses, references.)

Actors and Tools

During the whole product lifecycle (from idea generation to recycling), the system is used by producers and consumers who are assisted by various tools that can be plugged into the overall PLM system infrastructure. The tools are able to display and modify the product data.

Core PLM System

The core PLM system consists of the standard PLM components such as workflow, collaboration, and the management of data versions and configurations. The core PLM system will be used by the upper tool layer and makes use of the long-term preservation functionality and the PLM repository that are both described below.

Preservation-enabled PLM System

The preservation-enabled PLM system consists of the core PLM system enhanced with preservation functionality. This preservation functionality is responsible for executing relevant actions during the pre-ingest and post-access phase. For example, during the pre-ingest phase the system collects all relevant product data that are needed for archiving. Also during pre-ingest, native proprietary formats are converted into vendor-neutral formats for archiving. In addition, it may also be decided to archive the system environment (software tools, etc.). Pre-ingest rules determine what and when to archive. For example, if a specific phase of the PLM process is reached, a rule might determine to archive the product design.

During post-access, the preservation-enabled PLM system transforms the vendor-neutral archival format into the current native format. Also, the system might
reconstruct the previously archived system environment. In addition, domain-specific post-access rules are specified and executed. A post-access rule might embed the retrieved product design into the current tool landscape and create relevant relationships between the product design and current data models and ontologies. Pre-ingest and post-access rules are domain-specific and do not include activities that are part of the preservation itself.

SHAMAN Preservation System

The preservation-enabled PLM system makes use of the SHAMAN preservation system. This preservation system offers a service interface that includes ingest and access functionality. Such a service interface decouples the preservation system from other systems and allows flexible use, even from systems that reside in other companies or institutions. It is therefore necessary that the preservation system also includes the management of rights to prevent unauthorised access to the archived data. The preservation system also includes standard preservation planning functionality (obsolescence detection) and preservation policy management. The management of software licenses is also needed to ensure legal access to file formats and tools.

Preservation policies define principles that guide the preservation of digital objects according to company, project, document type or even object-based guidelines. The PLM system uses such policies during ingest in order to define the desired handling of ingested product data. Policies are needed because engineering projects differ significantly regarding the kind of product, market and country. For example, if the collection of ingested product data consists of geometric, logical and physical designs, a preservation policy is able to specify that legal demands allow the deletion of archived physical designs after an elapsed amount of time. Preservation policies should guarantee that a
preservation system is self-sustaining and that the archived content remains in a usable state.

PLM Repository

The PLM repository stores the native product data and is the operational database for the core PLM system. It is possible that the PLM repository references the content of the preservation system. For example, if parts of a design move into the preservation system, then the PLM repository can store a unique reference to the archived portion of the design and remove the archived product design from the PLM repository. If this design is accessed later, the PLM system will retrieve the design from the preservation system. If the archived data are migrated, then references must be kept in a valid state. References from the preservation system to the PLM repository must not exist, since the preservation system must be independent from systems that use the preservation functionality.

Summary

Since the architecture was derived from requirements, it is worthwhile to verify whether the key aspects of the architecture are a good match to the given requirements.

General. The preservation system is modular and has a service interface that allows the integration of the system in all phases of the PLM process.

Data ingest. During pre-ingest, relevant metadata (e.g. process history) are collected and might be transformed into vendor-neutral formats before ingest. Also during pre-ingest, the data can be validated for completeness. Due to the parallel usage of a preservation system and a PLM repository, both native and transformed data are kept.

Data access. During post-access, the data are validated for completeness. The service interface of the preservation system allows users to search via metadata. Intellectual property rights are protected by the preservation system, which controls the access to parts of the product design.

Data preservation planning.
The preservation system migrates the archived data whenever this is needed, and during pre-ingest the preservation-enabled PLM systems collect all relevant product design data.

Conclusion and Outlook

Based on PLM system characteristics, we have described extensions that are needed for integration of preservation functionality into PLM workflows. We have also looked at requirements for a preservation system based on the needs of the Designated Community in the engineering industry. One of the major requirements is easy integration of digital preservation processes and PLM workflows by using configurable modular solutions. This requirement is met by a PLM system architecture that collaborates with the functionality of a SHAMAN-based preservation system while conserving the possibility of executing the regular PLM processes.

Further investigations are needed to specify the service interface in more detail. The consequences of distributed archives resulting from global cross-company collaborations in the engineering industry must be reflected upon. It remains to be investigated which relevant metadata exist in PLM processes and how these metadata can be captured, archived and preserved. Future research will also tackle the problem of maintaining dependencies on external taxonomies that are used in the engineering industry. The problem of maintaining version-management metadata has to be addressed, since such version metadata are currently held in the PLM system. If version information is to be accessible by other PLM systems, version metadata also have to be archived and maintained in a preservation system.

Acknowledgements

This paper is supported by the European Union within the 7th Framework Programme Integrated Project: SHAMAN.

References

Ball, A., Patel, M., & Ding, L. (2008). Towards a curation and preservation architecture for CAD engineering
models. In Proceedings of the 5th International Conference on Preservation of Digital Objects. London, UK. Retrieved February 20, 2011 from http://www.bl.uk/ipres2008/presentations_day1/17_Ball.pdf.

Brocks, H., Kranstedt, A., Jäschke, G., & Hemmje, M. (2010). Modeling context for digital preservation. In Szczerbicki, E., & Nguyen, N. (Eds.), Studies in Computational Intelligence: Vol. 260. Smart Information and Knowledge Management (p. 197-226). Berlin: Springer.

Brunsmann, J., & Wilkes, W. (2009). State-of-the-art of long-term archiving in product lifecycle management. Manuscript submitted for publication.

Consultative Committee for Space Data Systems (2002). Reference model for an open archival information system. ISO 14721. Retrieved February 20, 2011 from http://public.ccsds.org/publications/archive/650x0b1.pdf.

Heutelbeck, D., Brunsmann, J., Wilkes, W., & Hundsdörfer, A. (2009). Motivations and challenges for digital preservation in design and engineering. In Proceedings of 1st International Workshop on Innovation in Digital Preservation, June 19, 2009. Austin, TX. Retrieved February 20, 2011 from http://cs.harding.edu/indp/papers/heutelbeck5.pdf.

Lubell, J., Rachuri, S., Mani, M., & Subrahmanian, E. (2008). Sustaining engineering informatics: Towards methods and metrics for digital curation. The International Journal of Digital Curation 3, (2).

SHAMAN (2008). Survey of users and providers in Europe and SHAMAN usage scenarios. SHAMAN Internal Project Deliverable 1.1. http://www.shaman-ip.eu:8000/SHAMAN/wiki/WP1 (access restricted to SHAMAN Consortium Members).