Derivatives API: midterm status update

This year's Summer of Code has reached its midterm evaluation, which is a good opportunity to write a few words about my project. As I've written before, I am working on a Derivatives API for the Media ecosystem in Drupal 7.

I've been committing quite a lot of code to my sandbox, and the API has lately been taking its rough form, so this is the right time to share the current state of the project and my thoughts about the roadmap with the community. I'd love to get some feedback, since this is my first serious API project. If you have any thoughts about the project that you would like to share, feel free to comment on this blog post or on the thread at groups.drupal.org, or to contact me directly.

The API is absolutely not ready to be used in any kind of production environment, but it definitely needs testing. It would be great if some engine developers were prepared to implement initial support for the Derivatives API. This would help the API become stable much sooner. API changes are still possible, but most of the basic API should stay at least similar to how it looks now (except Schedulers; see the next section).

Basic structure


The basic API structure is shown in the image. The most important parts of the API are:

  • Engines: responsible for the physical creation of derivatives. Engines are implemented as separate modules. The API itself does not implement any engine functionality; it only provides hooks for engines.
  • Rules: decide whether a given file should be derived with a given preset at a given time. A few basic rules are implemented in the API itself. Hooks are provided so that custom modules can implement their own rules.
  • Triggers: executed on events on which a file's derivative should be created. The API implements two basic triggers, which are executed when a file is uploaded and when a file is added to a file/media/image field. Hooks allow other modules to provide custom triggers.
  • Presets: hold the configuration that is used to create a derivative of a given file. Presets are based on the CTools export API. There is currently no UI for preset management (a preset can be created in code, though), but one is planned for the near future (see Roadmap).
  • Schedulers: responsible for running the derivation of a file at the right time. The API currently implements two basic triggers, which are executed when a file is inserted and when a file is added to a field. No hooks that would allow customization of this behavior are provided yet, but they are planned for the next few days (see Roadmap). This is the only part of the basic API where I plan to make significant changes.

A derivative is saved in the file_managed table if it is created successfully.

You are more than welcome to take a look at the code and experiment with it. It can be found in the project's sandbox module at d.o. I will describe each part of the API in more detail in the following sections.

Engines

Engines are really simple. A module that wants to be an engine should first implement a simple info hook that returns an array describing the engine's capabilities:

function hook_media_derivatives_engine_info() {
  return array(
    'type' => 'video',
    'stream_wrappers' => array('public://', 'private://'),
    'mime_types' => array('video/mp4', 'audio/*'),
  );
}
  • type: type of files that can be derived using this engine,
  • stream_wrappers: array of stream wrapper schemes that this engine knows how to deal with (optional),
  • mime_types: array of MIME types that this engine knows how to deal with (optional); the wildcard '*' can be used.
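
To illustrate how such an info array could be read, here is a minimal sketch of a matching check. The helper example_engine_supports() is hypothetical and not part of the API, and treating a missing optional key as "no restriction" is an assumption; the API's real matching logic may differ.

```php
<?php
/**
 * Hypothetical helper: does an engine's info array accept this file?
 *
 * Assumption: a missing optional key means "no restriction".
 */
function example_engine_supports(array $info, $file) {
  if ($info['type'] != $file->type) {
    return FALSE;
  }
  if (isset($info['stream_wrappers'])) {
    // Extract the scheme, e.g. 'public://' from 'public://clip.mp4'.
    $scheme = strstr($file->uri, '://', TRUE) . '://';
    if (!in_array($scheme, $info['stream_wrappers'])) {
      return FALSE;
    }
  }
  if (isset($info['mime_types'])) {
    foreach ($info['mime_types'] as $pattern) {
      // Turn 'audio/*' into a regular expression; '*' matches anything.
      $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';
      if (preg_match($regex, $file->filemime)) {
        return TRUE;
      }
    }
    return FALSE;
  }
  return TRUE;
}

// The info array from the hook example above.
$info = array(
  'type' => 'video',
  'stream_wrappers' => array('public://', 'private://'),
  'mime_types' => array('video/mp4', 'audio/*'),
);

$file = new stdClass();
$file->type = 'video';
$file->uri = 'public://clip.mp4';
$file->filemime = 'video/mp4';

var_dump(example_engine_supports($info, $file));
```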

There is also hook_media_derivatives_engine_info_alter() available. Once Drupal knows about the engine, a derivation callback should be implemented:

function hook_media_derivatives_create_derivative($file, $derivative) {
  // Create derivative of a $file. Configuration preset is saved in $derivative.

  return $new_file;
}

This function is expected to return the derivative's URI or a full file object. The API will build the file object if only a URI is returned.
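
To make the two engine hooks more concrete, here is a minimal sketch of an engine that simply copies the source file. The module name example_copy is hypothetical, plain PHP copy() on a local path stands in for the work a real engine would do, and returning FALSE on failure is an assumption, since the expected failure value is not defined above.

```php
<?php
// Hypothetical engine module "example_copy"; not part of the Derivatives API.

/**
 * Implements hook_media_derivatives_engine_info().
 */
function example_copy_media_derivatives_engine_info() {
  return array(
    'type' => 'video',
    'stream_wrappers' => array('public://'),
    'mime_types' => array('video/*'),
  );
}

/**
 * Implements hook_media_derivatives_create_derivative().
 *
 * Assumption: $file->uri is a path that copy() can read. A real engine
 * would transcode the file instead of copying it.
 */
function example_copy_media_derivatives_create_derivative($file, $derivative) {
  $info = pathinfo($file->uri);
  $extension = isset($info['extension']) ? '.' . $info['extension'] : '';
  $target = $info['dirname'] . '/' . $info['filename'] . '_derivative' . $extension;

  // Returning only the URI is allowed; the API builds the file object.
  // FALSE on failure is an assumption of this sketch.
  return copy($file->uri, $target) ? $target : FALSE;
}
```

Returning just the URI keeps the engine small, because building and saving the file object is left to the API.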

Rules

A module that wants to implement a derivatives rule should implement the info hook:

function hook_media_derivatives_rules_info() {
  $rules = array();
  $rules['example_rule'] = array(
    'name' => t('Example derivation rule'),
    'description' => t('In-depth description of example rule'),
    'callback' => 'mymodule_myrule_callback',
  );
  return $rules;
}

There is also hook_media_derivatives_rules_info_alter() available. The hook should return an array of the rules implemented by the module; each key is a rule's machine name, and each value is an array with the human-readable name, the description and the name of the callback function. The API calls the callback function when a rule needs to be tested, passing it the source file and the configuration preset. The callback is expected to return TRUE or FALSE. A file will only be processed if all active rules return TRUE. Here is a simple example of a rule callback function:

function media_derivatives_type_support($file, $preset) {
  return $file->type == $preset->rules_settings['type'];
}
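
The "all active rules must return TRUE" behaviour can be sketched in plain PHP. Everything prefixed with example_ is hypothetical, including the max_size rule and its setting, and the hard-coded $rules_info array only mimics what the info hook would return; the API's own implementation may look different.

```php
<?php
// Rule callbacks in the style shown above (names are hypothetical).
function example_rule_type($file, $preset) {
  return $file->type == $preset->rules_settings['type'];
}

function example_rule_max_size($file, $preset) {
  return $file->filesize <= $preset->rules_settings['max_size'];
}

/**
 * Returns TRUE only if every rule listed in the preset accepts the file.
 *
 * $rules_info mimics the array returned by hook_media_derivatives_rules_info().
 */
function example_rules_pass($file, $preset, array $rules_info) {
  foreach ($preset->rules as $rule_name) {
    // Assumption: an unknown rule rejects the file.
    if (!isset($rules_info[$rule_name])) {
      return FALSE;
    }
    $callback = $rules_info[$rule_name]['callback'];
    if (!call_user_func($callback, $file, $preset)) {
      return FALSE;
    }
  }
  return TRUE;
}

$rules_info = array(
  'type' => array('callback' => 'example_rule_type'),
  'max_size' => array('callback' => 'example_rule_max_size'),
);

$preset = new stdClass();
$preset->rules = array('type', 'max_size');
$preset->rules_settings = array('type' => 'video', 'max_size' => 1000);

$file = new stdClass();
$file->type = 'video';
$file->filesize = 500;

var_dump(example_rules_pass($file, $preset, $rules_info)); // bool(true)
```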

Triggers

An info hook should be implemented to inform the system about a new trigger:

function hook_media_derivatives_triggers_info() {
  $triggers = array();
  $triggers['file_insert'] = array(
    'name' => t('File insert derivative trigger'),
    'description' => t('Derivative trigger, that will be executed when a new file is saved.'),
    'validation_callbacks' => array(
      '_media_derivatives_file_insert_validation'
    ),
  );
  return $triggers;
}

There is, once again, hook_media_derivatives_triggers_info_alter() available. The info hook is mostly self-explanatory. The only part that probably needs more attention is the optional validation callbacks, whose function names are passed in the 'validation_callbacks' array. These callbacks are called before a derivative is created. A validation callback should return TRUE if the file should really be derived, or FALSE if the derivation process started by this trigger should be canceled for the given file. It receives the file, the configuration preset and context information as arguments. Example of a validation callback:

function _media_derivatives_field_presave_validation($file, $preset, $context) {
  return in_array(
    $context['field'], 
    $preset->triggers_settings['field_presave_allowed_fields']
  );
}

When a trigger should be executed, the module should call media_derivatives_create_all_derivatives() or media_derivatives_create_derivative(). The first function takes a file and tries to create derivatives for all active presets in the system, while the second takes a preset as an argument and creates a derivative just for that preset.

// Some code ...
media_derivatives_create_all_derivatives($file, $trigger, $context);
// Or just for a single preset
media_derivatives_create_derivative($file, $preset, $trigger, $context);
  • $file: source file object,
  • $trigger: trigger's machine name, as used in the info hook,
  • $context: context information that will be passed to the validation callbacks,
  • $preset: configuration preset that is to be used for this derivation process.
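
Putting the pieces together, a custom trigger could look roughly like the sketch below. The module name example_trigger, the 'file_update' trigger and the filesize check are hypothetical. The two stand-in functions at the top exist only so that the snippet runs outside Drupal; inside Drupal the real t() and media_derivatives_create_all_derivatives() would be used instead.

```php
<?php
// Stand-ins so the sketch runs outside Drupal (assumption, see above).
if (!function_exists('t')) {
  function t($string) {
    return $string;
  }
}
if (!function_exists('media_derivatives_create_all_derivatives')) {
  function media_derivatives_create_all_derivatives($file, $trigger, $context) {
    // Only record the call; the real function creates the derivatives.
    global $example_trigger_calls;
    $example_trigger_calls[] = array($file->uri, $trigger, $context);
  }
}

/**
 * Implements hook_media_derivatives_triggers_info().
 */
function example_trigger_media_derivatives_triggers_info() {
  $triggers = array();
  $triggers['file_update'] = array(
    'name' => t('File update derivative trigger'),
    'description' => t('Executed when an existing file is saved again.'),
    'validation_callbacks' => array('_example_trigger_file_update_validation'),
  );
  return $triggers;
}

/**
 * Validation callback: only files larger than zero bytes are processed.
 */
function _example_trigger_file_update_validation($file, $preset, $context) {
  return $context['filesize'] > 0;
}

/**
 * Implements hook_file_update(); fires the trigger for all active presets.
 */
function example_trigger_file_update($file) {
  $context = array('filesize' => $file->filesize);
  media_derivatives_create_all_derivatives($file, 'file_update', $context);
}
```

Whatever the module puts into $context is what its own validation callbacks receive later, so the context array is the natural place for event-specific details.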

Presets

Presets hold the configuration for derivation. Several presets can be active in a system at the same time. To create a preset in code, first implement CTools' hook_ctools_plugin_api(). This example should work for 95% of use cases:

function hook_ctools_plugin_api($owner, $api) {
  if ($owner == 'media_derivatives' && $api == 'media_derivatives_presets') {
    return array('version' => 1);
  }
}

hook_media_derivatives_presets() should be implemented to actually create one or more configuration presets. The function should return an array with preset objects as its items. Here is an example:

function hook_media_derivatives_presets() {
  $export = array();
  
  $preset = new stdClass;
  $preset->api_version = 1;
  $preset->machine_name = 'example_preset';
  $preset->rules = array('file_type', 'mime_type', 'derivatives_of_derivatives');
  $preset->rules_settings = array(
    'type' => 'video',
    'stream_wrappers' => array('public://', 'temporary://'),
  );
  $preset->triggers = array('field_presave');
  $preset->triggers_settings = array(
    'field_presave_allowed_fields' => array('field_asdfasdf'),
  );
  $preset->settings = array(
    'engine' => 'example_engine',
    'scheduled_type' => MEDIA_DERIVATIVE_RUN_IMMEDIATELY,
    'recursive_delete' => TRUE,
  );
  
  // Key the preset by its machine name.
  $export['example_preset'] = $preset;
  return $export;
}

Definition of preset object:

  • api_version: presets api version (currently 1),
  • machine_name: preset's unique machine name,
  • rules: rules that need to be tested before a file is derived using this preset,
  • rules_settings: settings that the rules expect,
  • triggers: array of triggers that this preset should react to,
  • triggers_settings: settings that the trigger validation callbacks expect,
  • settings: global settings expected by the API itself:
    • engine: machine name of the engine that should be used with this preset,
    • scheduled_type: scheduling policy. Can be MEDIA_DERIVATIVE_RUN_IMMEDIATELY (default) if encoding should be run immediately, or MEDIA_DERIVATIVE_SCHEDULE if it should be run sometime in the future,
    • scheduled_time: timestamp that defines when to run encoding if it is scheduled for the future (required if scheduled_type is set to MEDIA_DERIVATIVE_SCHEDULE),
    • recursive_delete: delete the derivative if the source file is deleted (defaults to FALSE),
    • user: derivative owner selection policy. Possible values: MEDIA_DERIVATIVE_OWNER_FILE - the derivative will have the same owner as the original file (default), MEDIA_DERIVATIVE_OWNER_DERIVATIVE - the user that triggered creation of the derivative will also be its owner, MEDIA_DERIVATIVE_OWNER_STATIC - the owner can be statically chosen,
    • user_uid: owner uid if 'user' is set to MEDIA_DERIVATIVE_OWNER_STATIC.
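
As an illustration of the scheduling and ownership settings, here is a sketch of a preset that runs one hour after it is built and assigns the derivative to a fixed user. The module name example_scheduled and the preset name are hypothetical. The constant values defined at the top are placeholders so that the snippet runs outside Drupal; the real constants come from the API and their values may differ.

```php
<?php
// Placeholder values so the sketch runs outside Drupal (assumption).
if (!defined('MEDIA_DERIVATIVE_SCHEDULE')) {
  define('MEDIA_DERIVATIVE_SCHEDULE', 'schedule');
}
if (!defined('MEDIA_DERIVATIVE_OWNER_STATIC')) {
  define('MEDIA_DERIVATIVE_OWNER_STATIC', 'static');
}

/**
 * Implements hook_media_derivatives_presets() for a hypothetical module.
 */
function example_scheduled_media_derivatives_presets() {
  $export = array();

  $preset = new stdClass();
  $preset->api_version = 1;
  $preset->machine_name = 'example_scheduled_preset';
  $preset->rules = array('file_type');
  $preset->rules_settings = array('type' => 'video');
  $preset->triggers = array('file_insert');
  $preset->triggers_settings = array();
  $preset->settings = array(
    'engine' => 'example_engine',
    // Run later instead of immediately; scheduled_time is then required.
    'scheduled_type' => MEDIA_DERIVATIVE_SCHEDULE,
    'scheduled_time' => time() + 3600,
    'recursive_delete' => TRUE,
    // Always assign the derivative to the user with uid 1.
    'user' => MEDIA_DERIVATIVE_OWNER_STATIC,
    'user_uid' => 1,
  );

  $export[$preset->machine_name] = $preset;
  return $export;
}
```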

Roadmap

Features that are planned for the near future:

  • schedulers,
  • live encode status information retrieval, 
  • non-blocking batch encoding, 
  • ability to delete source, when a derivative was created, 
  • presets management UI, 
  • display derivatives list for a given file, 
  • unmanaged derivatives,
  • Features support,
  • Views support.