PBCore 2.0: What I'd like to see

This is a short writeup of things I would like to see present in PBCore 2.0, which is currently in progress. It reflects my own personal opinions, etc.

One of the biggest challenges that PBCore 2.0 will face is determining how all-encompassing a standard it should be. Media organizations create a large variety of assets through diverse mechanisms for a wide range of purposes with any and all possible skill sets and technologies. Billed as the metadata standard for public broadcasting, PBCore probably needs to respond to everyone’s needs while neither requiring the impossible nor limiting the foreseeable. For this reason, I believe the most important thing PBCore 2.0 can do is provide a structure and framework for metadata without prescribing “the one true way”. To do this, PBCore 2.0 must be flexible and, more importantly, extendable if it is going to succeed.

These ideas probably fall outside “core” PBCore-compliance, but would enhance the descriptive power of the schema. All it would take are two considerations during the development of PBCore 2.0: a permissive data model and (more importantly) a system and place to document and describe standard extensions, best practices, and implementations.

One of the biggest strengths of PBCore 1.x, as I’ve written earlier, is the vast data dictionary that combines the vocabularies of a number of siloed applications full of current data. In PBCore 2.0, I truly hope due consideration is given to linked data and semantic ontologies to provide an easy way for an organization or individual to supplement a core vocabulary with a purpose-driven vocabulary for describing assets (the EBU’s P-META classification schemes have taken the first tentative step into this realm and are well worth a look). This could be done as simply as providing URL-based references to data dictionary values, e.g.:

...

RDF Schema
wikipedia.org

...

This system could be easily extended (in a standardized way) to provide data dictionary descriptions, relational information (sameAs, parentOf, etc) and more, while allowing some level of basic compliance that can ignore the extension.
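A rough sketch of how URL-based references might behave in practice, written in Python rather than as schema markup. The `ref` attribute name and the Wikipedia URL are assumptions made up for illustration; PBCore does not define them. The point is only that a basic reader can keep using the plain text values while an extension-aware reader picks up the link.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute name for the URL reference; not part of PBCore.
REF_ATTR = "ref"


def make_subject(label, authority, ref_url=None):
    """Build a pbcoreSubject element, optionally carrying a URL reference."""
    wrapper = ET.Element("pbcoreSubject")
    subject = ET.SubElement(wrapper, "subject")
    subject.text = label
    if ref_url is not None:
        subject.set(REF_ATTR, ref_url)
    authority_el = ET.SubElement(wrapper, "subjectAuthorityUsed")
    authority_el.text = authority
    return wrapper


def read_subject_basic(wrapper):
    """A 'basic compliance' reader: text values only, extension ignored."""
    return wrapper.findtext("subject"), wrapper.findtext("subjectAuthorityUsed")


def read_subject_linked(wrapper):
    """An extension-aware reader: also returns the URL reference, if present."""
    subject = wrapper.find("subject")
    return subject.text, subject.get(REF_ATTR)


element = make_subject(
    "RDF Schema",
    "wikipedia.org",
    ref_url="http://en.wikipedia.org/wiki/RDF_Schema",  # illustrative URL
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
print(read_subject_basic(element))
print(read_subject_linked(element))
```

Both readers work on the same record, which is the property that matters: the extension adds information without breaking anything that does not know about it.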

Other extensions to the schema are probably more complex and would require the PBCore 2.0 schema to be permissive, rather than restrictive. One important (and I’d argue, essential) example of this is temporal + spatial media fragments, which could allow a system to describe, in some level of detail, fragments of an asset. This could be represented like:

...

RDF Schema


...

...

...

(obviously the semantics, describing multiple instantiations, and other issues would need to be worked out..)
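To make the idea a bit more concrete, here is a minimal Python sketch of a subject that points at fragments of one or more instantiations. Everything here is an assumption for illustration: the class names, the `instantiation_ref` field, and the choice to serialize fragments in the style of the W3C Media Fragments URI syntax (`t=start,end` for time in seconds, `xywh=x,y,w,h` for a pixel region). It is not a proposal for the actual PBCore semantics.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MediaFragment:
    """A temporal (and optionally spatial) slice of one instantiation."""
    instantiation_ref: str  # hypothetical ID of a pbcoreInstantiation
    start: float            # seconds from the start of the instantiation
    end: float              # seconds from the start of the instantiation
    region: Optional[Tuple[int, int, int, int]] = None  # x, y, width, height

    def to_fragment_string(self) -> str:
        """Serialize in the style of a W3C media fragment, e.g. '#t=10,20'."""
        parts = [f"t={self.start:g},{self.end:g}"]
        if self.region is not None:
            parts.append("xywh=" + ",".join(str(v) for v in self.region))
        return "#" + "&".join(parts)


@dataclass
class SubjectAnnotation:
    """A subject plus the fragments of each instantiation where it appears."""
    subject: str
    fragments: List[MediaFragment] = field(default_factory=list)


annotation = SubjectAnnotation(
    subject="RDF Schema",
    fragments=[
        MediaFragment("inst-1", 12.5, 40),
        MediaFragment("inst-2", 0, 15, region=(160, 120, 320, 240)),
    ],
)
fragment_strings = [f.to_fragment_string() for f in annotation.fragments]
print(fragment_strings)
```

One subject can carry any number of fragments across any number of instantiations, which avoids repeating the whole subject block once per instantiation.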

I’d like to take this a step further and develop a systematic way of embedding other schemas (presumably designed for describing objects and ideas outside of the core focus of PBCore, such as people and entities, rights metadata, and provenance). By developing some best practices, this could be done in a discoverable and standard way, maybe something like:

...

    Chris Beer

   Chris Beer
   Male
   Mr
   
 
    Rabble-rouser

...

Tools that don’t understand FOAF should be encouraged to ignore these additions, but they provide a rich method of extending the schema in a decentralized and flexible manner.
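The "ignore what you don't understand" behaviour is cheap to implement when the embedded schema lives in its own XML namespace. The Python sketch below assumes a hypothetical record in which a FOAF `Person` sits inside `pbcoreContributor`; the placement of the FOAF block and the PBCore namespace URI used here are assumptions for illustration, while the FOAF namespace is the published one.

```python
import xml.etree.ElementTree as ET

PBCORE_NS = "http://www.pbcore.org/PBCore/PBCoreNamespace.html"  # assumed
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Hypothetical record: where the FOAF block goes is a guess, not a standard.
SAMPLE = f"""
<pbcoreContributor xmlns="{PBCORE_NS}" xmlns:foaf="{FOAF_NS}">
  <contributor>Chris Beer</contributor>
  <foaf:Person>
    <foaf:name>Chris Beer</foaf:name>
    <foaf:gender>Male</foaf:gender>
    <foaf:title>Mr</foaf:title>
  </foaf:Person>
  <contributorRole>Rabble-rouser</contributorRole>
</pbcoreContributor>
""".strip()


def children_in_namespace(element, namespace):
    """Return {local_name: text} for direct children in the given namespace."""
    prefix = "{" + namespace + "}"
    return {
        child.tag[len(prefix):]: (child.text or "").strip()
        for child in element
        if child.tag.startswith(prefix)
    }


def core_fields(contributor):
    """A PBCore-only tool: foreign-namespace children are silently skipped."""
    return children_in_namespace(contributor, PBCORE_NS)


def foaf_fields(contributor):
    """A FOAF-aware tool: reads the embedded Person, if there is one."""
    person = contributor.find("{" + FOAF_NS + "}Person")
    if person is None:
        return {}
    return children_in_namespace(person, FOAF_NS)


root = ET.fromstring(SAMPLE)
print(core_fields(root))
print(foaf_fields(root))
```

The PBCore-only reader never has to know FOAF exists; it filters on its own namespace and moves on.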

Again, I’m not calling for the inclusion of advanced (and likely, complicated) features into core PBCore compliance, just hoping that in developing a standard for the future, it remains flexible and extendable to meet the needs of all users while being accessible to all.


6 Responses to PBCore 2.0: What I'd like to see

  1. Dan Brickley says:

    Interesting :)

    Have you looked at SKOS too? An increasing number of controlled vocabularies are available in it, especially those that aren’t quite formal enough to work well as ontologies.

    I hadn’t seen pbcore before finding this post. Does it have any overlap with the Programmes Ontology from the BBC, or similarity with their RDF programme descriptions eg http://www.bbc.co.uk//programmes/b006qgrd.rdf ?

  2. chris says:

    Hi Dan,

    I guess I’ve conflated RDFS, SKOS and OWL. You’re probably right that SKOS is a better fit for vocabularies (and probably easier to implement), although the EBU’s classifications (linked to in the post) seem like they’ll map most easily to plain RDFS.

    PBCore started years ago (2004? earlier?) as a data dictionary project to create at least some semantic-level agreement between stations in public broadcasting in the US, and it grew into an XML schema for media assets. After some initial activity, funding dried up, tools were never fully developed (lack of vendor support..), outreach probably wasn’t adequate, and other standards took over. It’s an OK schema and addresses some of the trouble moving image metadata runs into trying to fit into the world (being time-based, having extraordinarily complicated provenance and rights issues, etc).

    There’s been some renewed funding and the process of PBCore 2.0 is just getting started, and we’re trying to figure out where it fits in (not entirely libraries + archives, although there is a serious need there, not in the broadcast chain because public broadcasting is a fairly small fish, etc), although (I hope) it finds a place in cataloguing and discovery systems.

    There is no direct overlap with the Programmes Ontology, although they are operating in similar fields (and it wouldn’t surprise me if there were some informal discussions..). The Programmes Ontology is obviously heavily invested in final broadcast moving image programs, while PBCore is probably closer to Dublin Core (with a pretty large data dictionary behind it..). Another similar standard to throw in the mix is EBUCore, which came about a couple of years ago after PBCore had stagnated.

  3. Dave Rice says:

    Hi Chris,

    This is a great post. My favorite point here is the inclusion of attributes to express temporal metadata, here with SMIL. This allows documentation of segments or scenes to be much more streamlined than the current strategy of creating assets for the whole work as well as the segments while sharing the same instantiations with differing timestart and duration values. Potentially this allows PBCore to be more interoperable with other temporal standards like Apple’s XML Interchange Format in Final Cut.

    I’m a little confused about the nesting of pbcoreInstantiation within pbcoreSubject. If you’re using ID and REFID, is it necessary to adjust the structure rather than just the attributes?

    On flexibility vs. restriction, I think this is a challenge. On one hand, PBCore is intended as a ‘core’ set of data, and on the other hand, any organization making an investment in implementing PBCore will likely discover the need to manage extended structured data. Potentially, the definition of alternate extended profiles of PBCore that still meet PBCore’s core requirements could aid here. Perhaps this could work similarly to MXF, where the overall structure is flexible but published application specifications provide specific profiles of implementation.

    Dave Rice

  4. chris says:

    Hi Dave, it looks like the google syntax highlighter ruined that example. If you read carefully, the instantiation + subject are siblings, and pbcoreSubject is the parent to subject and the fragment extension.

    The fragment extension is necessary (I think..) to say the subject appears in {n} instantiations at {m} media fragments. You could keep repeating the pbcoreSubject for each instantiation and just use attributes, but that seems slightly worse, I think, than defining some basic media fragment semantics.

    PBCore compliance could require parsers to ignore unknown attributes (like most XML tools do by default anyway..), so unknown extensions shouldn’t be a problem as long as the core pbcore requirements are in place.

  5. If you are interested in SKOS, I have translated some of the key hierarchical EBU classification schemes into SKOS, which can be found here: http://www.ebu.ch/metadata/ontologies/skos/

    It is true that all of them could be translated into SKOS, but ‘flat’ lists are not the best example to demonstrate the added value of SKOS.

    EBU is working on RDF/OWL (look around on the server). We are developing a class model, whose data properties might well be EBUCore (since 1999) attributes. RDF is THE clever answer to the use of classification schemes.

    I myself have some reservations about FOAF, which looks to me like a geek profile, but it is so fashionable ;-)

  6. chris says:

    Great stuff, I don’t think I’d run across the EBU SKOS work before — so much nicer than inventing yet another schema.

    I hope the decision makers for PBCore 2.0 look seriously at this work. In my mind, it solves a lot of the problems with data dictionary management. I think there is a lot of work to do to convince less techy people that it is worth the additional complexity.
