How We Optimized Drupal 8 PDF Exports: Handling 600+ Node Reports

Last modified: 24 May 2018

The Challenge: Large-Scale PDF Generation in Drupal 8

When a client needed to generate comprehensive PDF reports from hundreds of Drupal nodes, we knew the Entity Print module on its own wouldn’t cut it. The requirements were clear:

  • Export individual nodes as separate PDFs
  • Combine up to 600 nodes into a single, unified PDF report
  • Operate within Pantheon’s execution time and memory limits
  • Maintain reasonable performance for content editors

Why Standard Solutions Fell Short

The contributed Entity Print module for Drupal 8 works well for smaller exports, but it runs into significant limits with larger datasets:

  • Memory constraints: Processing 600 nodes at once required nearly 1 GB of memory
  • Execution timeouts: The process could take up to 2 hours, far exceeding typical web server timeouts
  • Server limitations: Pantheon’s platform enforces strict resource limits to ensure stability

[Figure: Drupal PDF export process]

Our Solution: Batch Processing with Drush

Phase 1: Parallel PDF Generation

Instead of processing all nodes in a single request, we implemented a batch processing system that:

  1. Processes nodes in manageable chunks
  2. Generates individual PDFs for each node
  3. Tracks progress between batch operations
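
// Example batch operation callback.
// Assumes the Entity Print 8.x-2.x services; adjust the names for other versions.
function mymodule_generate_pdf_batch_operation(array $nids, &$context) {
  $node_storage = \Drupal::entityTypeManager()->getStorage('node');
  $print_builder = \Drupal::service('entity_print.print_builder');
  $engine_manager = \Drupal::service('plugin.manager.entity_print.print_engine');

  // Make sure the target directory exists before writing to it.
  $directory = 'temporary://pdf-export';
  file_prepare_directory($directory, FILE_CREATE_DIRECTORY);

  $nodes = $node_storage->loadMultiple($nids);
  foreach ($nodes as $node) {
    // Create a fresh engine per node so each PDF contains only that node.
    $engine = $engine_manager->createSelectedInstance('pdf');
    // Render the node and save the PDF to the temporary directory.
    $print_builder->savePrintable([$node], $engine, 'temporary', 'pdf-export/node-' . $node->id() . '.pdf');
  }

  // Release the loaded nodes so memory does not grow from chunk to chunk.
  $node_storage->resetCache($nids);

  // Accumulate the running total instead of overwriting it on every chunk.
  if (!isset($context['results']['processed'])) {
    $context['results']['processed'] = 0;
  }
  $context['results']['processed'] += count($nodes);
  $context['message'] = t('Processed @count nodes', ['@count' => $context['results']['processed']]);
}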
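
The chunking step can be sketched in plain PHP. This is a minimal illustration rather than the code used on the project: the function name mymodule_build_pdf_batch, the chunk size of 25 and the finished callback name are assumptions, while mymodule_generate_pdf_batch_operation is the callback shown above.

```php
<?php

/**
 * Builds a batch definition that splits node IDs into fixed-size chunks.
 *
 * Hypothetical helper: the function name, the default chunk size and the
 * 'mymodule_generate_pdf_batch_finished' callback are assumptions.
 */
function mymodule_build_pdf_batch(array $nids, $chunk_size = 25) {
  $operations = [];
  // Each chunk becomes one batch operation, so only a few nodes are
  // loaded and rendered at a time.
  foreach (array_chunk($nids, $chunk_size) as $chunk) {
    $operations[] = ['mymodule_generate_pdf_batch_operation', [$chunk]];
  }

  return [
    // In a Drupal module this string would be wrapped in t().
    'title' => 'Generating PDFs',
    'operations' => $operations,
    'finished' => 'mymodule_generate_pdf_batch_finished',
  ];
}

// Example: 596 node IDs in chunks of 25.
$batch = mymodule_build_pdf_batch(range(1, 596), 25);
echo count($batch['operations']) . " operations\n"; // prints "24 operations"
```

In a Drupal module the returned array would typically be handed to batch_set(); when run from Drush, that call is followed by drush_backend_batch_process() to execute the batch on the command line.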

Phase 2: PDF Merging with Ghostscript

After generating individual PDFs, we used Ghostscript to merge them efficiently:

# Example Ghostscript merge command
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=combined.pdf node-*.pdf
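
One detail to watch with the glob above: the shell expands node-*.pdf in lexical order, so node-10.pdf is merged before node-2.pdf. If page order matters, the command can be assembled from PHP with an explicitly sorted file list. The sketch below is an assumption about how that might look, not the project's code; the function name is hypothetical and the Ghostscript flags are the ones from the command above.

```php
<?php

/**
 * Builds a Ghostscript merge command with the inputs in node ID order.
 *
 * Hypothetical helper: the function name is an assumption.
 */
function mymodule_build_merge_command(array $pdf_files, $output_file) {
  // Natural sorting puts node-2.pdf before node-10.pdf, which a shell
  // glob such as node-*.pdf would not.
  sort($pdf_files, SORT_NATURAL);

  $parts = [
    'gs', '-dBATCH', '-dNOPAUSE', '-q', '-sDEVICE=pdfwrite',
    '-sOutputFile=' . escapeshellarg($output_file),
  ];
  foreach ($pdf_files as $file) {
    // Escape every path so unusual characters cannot break the command.
    $parts[] = escapeshellarg($file);
  }

  return implode(' ', $parts);
}

// Example: the inputs are listed as node-1.pdf, node-2.pdf, node-10.pdf.
echo mymodule_build_merge_command(
  ['node-10.pdf', 'node-2.pdf', 'node-1.pdf'],
  'combined.pdf'
) . "\n";
```

The resulting string can then be passed to exec() or a similar process runner.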

Performance Results

The optimized process delivered the following results:

  • 596 nodes processed in 14 minutes (vs. up to 2 hours previously)
  • Memory usage stayed well within Pantheon’s limits
  • Final merge operation completed in 9 minutes
  • Output: A single 596-page PDF (25 MB)

Key Takeaways

  1. Batch processing is essential for large Drupal operations
  2. Drush provides a more reliable environment than a web request for long-running tasks, because command-line PHP is not subject to web server timeouts
  3. Ghostscript offers efficient PDF merging capabilities
  4. Modular architecture allows for flexible deployment (UI or CLI)

Implementation Notes

  • The solution works through both Drupal’s UI and Drush
  • Progress tracking helps monitor long-running operations
  • Temporary file cleanup is handled automatically
  • The system gracefully handles failures and can resume interrupted processes
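
The post does not show how resuming and cleanup were implemented. One possible approach, sketched here under assumptions, is to treat the per-node files as the progress record: skip every node that already has a PDF, and delete the per-node files once the merged report exists. Both function names are hypothetical, the file naming follows the batch callback above, and $directory is assumed to be a plain filesystem path, since glob() does not work with stream wrappers such as temporary://.

```php
<?php

/**
 * Returns the node IDs that do not yet have a PDF in $directory.
 *
 * Hypothetical helper: assumes files are named node-<nid>.pdf.
 */
function mymodule_pending_nids(array $nids, $directory) {
  return array_values(array_filter($nids, function ($nid) use ($directory) {
    return !file_exists($directory . '/node-' . $nid . '.pdf');
  }));
}

/**
 * Deletes the per-node PDFs and returns how many files were removed.
 *
 * Hypothetical helper, meant to run after the merged report is written.
 */
function mymodule_cleanup_pdfs($directory) {
  $deleted = 0;
  // glob() can return FALSE on error, so fall back to an empty list.
  foreach (glob($directory . '/node-*.pdf') ?: [] as $file) {
    if (unlink($file)) {
      $deleted++;
    }
  }
  return $deleted;
}

// Example with a scratch directory standing in for the export directory.
$dir = sys_get_temp_dir() . '/pdf-export-' . uniqid();
mkdir($dir);
touch($dir . '/node-1.pdf');
touch($dir . '/node-3.pdf');
print_r(mymodule_pending_nids([1, 2, 3], $dir)); // only node 2 is still pending
echo mymodule_cleanup_pdfs($dir) . " files removed\n";
rmdir($dir);
```

Feeding only the pending IDs back into the batch means an interrupted run picks up where it stopped instead of regenerating every PDF.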

Looking Ahead

This approach can be extended to:

  • Scheduled report generation
  • Email delivery of completed reports
  • Integration with Drupal’s queue system for even larger datasets

Need help with your Drupal performance challenges? Contact our team to discuss how we can optimize your Drupal implementation.
