
Archive Logical Organization

Archive Organization

2A Archive Logical Organization

2A.1 Bundles

A bundle is the default logical construct for archiving digital data in the PDS. (Recall that terms such as bundle, collection, and basic product are defined in the glossary of the PDS4 Concepts document.) Bundles have a simple hierarchical structure. A bundle has one or more member collections, each of which has one or more member basic products (Figure 2A-1). PDS does not impose requirements on how bundles are defined except that (1) bundles must be distinct (their LIDs/LIDVIDs must be distinct) within the overall holdings of PDS, and (2) each bundle must be approved by a PDS peer review.

Figure 2A-1: Archive structure.

Members of a bundle are listed in a Product_Bundle, an XML file which serves as both a label and the bundle inventory. Product_Bundle is described and uniquely identified using the Product_Bundle class definition (see Section 9D for information on constructing Product_Bundle).

An optional “readme” file may be included as part of Product_Bundle; it is described by the bundle label so is not a separate product. The “readme” file provides a general overview of the bundle contents and organization in human readable format. It may also contain general instructions for use of the bundle and contact information for data provider or discipline node personnel. The “readme” file must be formatted either as 7-bit ASCII text or as UTF-8 text.

2A.2 Collections

Basic products are organized into collections based on the type and function of the data. PDS imposes only broad requirements on how these type and function boundaries are drawn (see Section 2A.5). Collections must be distinct within a bundle, products must be distinct within each collection, and each collection must be approved by a PDS peer review.
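
The containment and distinctness rules described so far can be illustrated with a small Python sketch. It is not part of the PDS standards or of any PDS tool: the class names `Bundle` and `Collection`, the function `check_bundle`, and the identifier strings are all hypothetical, and real bundles are described by XML labels rather than Python objects. The sketch checks only the structural rules stated above within one bundle; it does not cover distinctness across the overall PDS holdings or peer review.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical stand-in for a collection: an identifier plus member products.
    lid: str
    product_lids: list = field(default_factory=list)


@dataclass
class Bundle:
    # Hypothetical stand-in for a bundle: an identifier plus member collections.
    lid: str
    collections: list = field(default_factory=list)


def duplicates(items):
    """Return the sorted values that occur more than once in items."""
    seen, dupes = set(), set()
    for item in items:
        if item in seen:
            dupes.add(item)
        seen.add(item)
    return sorted(dupes)


def check_bundle(bundle):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # A bundle has one or more member collections.
    if not bundle.collections:
        problems.append(f"{bundle.lid}: bundle has no member collections")
    # Collections must be distinct within a bundle.
    for lid in duplicates(c.lid for c in bundle.collections):
        problems.append(f"{bundle.lid}: duplicate collection {lid}")
    for coll in bundle.collections:
        # Each collection has one or more member basic products.
        if not coll.product_lids:
            problems.append(f"{coll.lid}: collection has no member products")
        # Products must be distinct within each collection.
        for lid in duplicates(coll.product_lids):
            problems.append(f"{coll.lid}: duplicate product {lid}")
    return problems


if __name__ == "__main__":
    # Placeholder identifiers, not real LIDs.
    demo = Bundle(
        "example_bundle",
        [
            Collection("example_bundle:data", ["image_001", "image_001"]),
            Collection("example_bundle:document", []),
        ],
    )
    for problem in check_bundle(demo):
        print(problem)
```

Returning a list of problems rather than raising on the first one is a design choice for this sketch, so that every violation in a bundle can be reported in a single pass.
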
Any single version of a collection is defined and uniquely identified using the Product_Collection class definition: an XML label file paired with an inventory table, which lists collection members for that version (see Section 9C for information on constructing Product_Collection).

PDS Standards Reference 1.11.0, 2018-10-01

2A.3 Products

A basic product is the simplest product in PDS4: one or more data objects and their description objects, which constitute a single observation, document, etc. Typically, a data object is a file containing a single image, table, or time series; description objects are typically text that describes both the format and content of the associated data object. A label is an XML file, which is the concatenation of one or more closely related description objects (such as for the red, green, and blue components of a color image) with some XML overhead; the corresponding basic product is that label plus the RGB data objects. A document basic product is constructed in the same way: the basic product is the set of files containing text, figures, and tables together with a label comprising the several description objects. Certain XML files qualify as basic products by themselves; they are ‘XML documents’.

Digital objects which comprise observational data may be used in one and only one product.

Product_Collection and Product_Bundle are aggregate products; they define an aggregation of basic products and an aggregation of collections, respectively. They are not basic products. All products, whether basic or aggregate, must have globally unique logical identifiers (see Section 6D).

2A.4 Primary and Secondary Members

Basic products may be either primary or secondary members of their respective collections. A primary member is one that is being registered with PDS for the first time. A secondary member is one which has already been registered with PDS, but which is now being associated with an additional collection [2].

A product’s member status (primary or secondary) is based on its first association with a collection. Although the product may be omitted from a later version of the collection, it retains its primary or secondary member status through all subsequent versions of the collection based on its initial association. In a similar way, collections are categorized as having either primary or secondary ‘member status’ in their bundles.
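
The member status rule can be sketched as a toy model in Python. This is an illustration only, not how PDS registration is implemented: the class `ToyRegistry`, its method `associate`, the status strings, and the identifiers are hypothetical. It assumes that the first association of a product that has never been registered makes it a primary member of that collection, that any later association with another collection makes it a secondary member there, and that the status recorded for a given collection never changes afterwards.

```python
class ToyRegistry:
    """Hypothetical model of primary/secondary member status."""

    def __init__(self):
        # Product identifiers that have already been registered.
        self.registered = set()
        # (collection_id, product_id) -> status fixed at first association.
        self.status = {}

    def associate(self, collection_id, product_id):
        """Associate a product with a collection and return its member status."""
        key = (collection_id, product_id)
        if key in self.status:
            # Status was decided at the initial association and is retained
            # through all subsequent versions of the collection.
            return self.status[key]
        if product_id in self.registered:
            # Already registered, now being added to an additional collection.
            self.status[key] = "Secondary"
        else:
            # Being registered for the first time.
            self.status[key] = "Primary"
            self.registered.add(product_id)
        return self.status[key]


if __name__ == "__main__":
    registry = ToyRegistry()
    print(registry.associate("collection_a", "product_1"))
    print(registry.associate("collection_b", "product_1"))
```

The same pattern would apply one level up, with collections taking the place of products and bundles taking the place of collections.
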