Asynchronous PHP Gotchas

As part of a PixCede rewrite (read: correcting damage from a rabbit I chased too far down a hole -- more on that in another post), I decided to modularize the scripts for handling new messages. Previously, procmail was sending the email to a single PHP script that handled extracting attachments, storing the image, and sending the confirmation message. As the script started to get unwieldy, I decided to break it into separate scripts:

  • newMailDaemon.php - main script that handles writing the incoming message to disk and asynchronously calling additional tasks. This is purposely kept minimal, so that there is less chance of something going wrong (e.g. compilation error). Worst case scenario, the email is written to disk; if the other tasks fail, the unit of work can be replayed from the message dump

  • processMail.php - handles extracting the image, creating the shortname, inserting it into the database, and sending the email confirmation. This will be split up eventually
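The "write it to disk first" step is small enough to sketch. This is only an illustration, not the actual newMailDaemon.php: the function name `dumpMessage`, the `.eml` extension, and the spool directory are all made up for the example, and it assumes procmail pipes the raw message to the script on STDIN.

```php
<?php
// Hypothetical helper: copy the raw message from a stream into a uniquely
// named file and return that file's path. $spoolDir is an assumed location.
function dumpMessage($in, string $spoolDir): string
{
    // A unique name keeps two messages that arrive together from colliding.
    $path = $spoolDir . '/' . uniqid('msg_', true) . '.eml';

    // Mode 'x' refuses to overwrite an existing file.
    $out = fopen($path, 'xb');
    if ($out === false) {
        throw new RuntimeException("cannot create $path");
    }

    stream_copy_to_stream($in, $out);
    fclose($out);

    return $path;
}

// Usage when invoked by procmail (the directory is a placeholder):
// $path = dumpMessage(STDIN, '/path/to/spool');
```

Keeping this step free of parsing or database work is the point: if everything after it fails, the dump is still there to replay.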

After newMailDaemon.php writes the mail file to disk, it calls exec(..) to asynchronously kick off processMail.php. I was about to pull my hair out until I figured out a few things:

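  • exec(..) does hand the command to a shell, but that shell inherits the working directory and PATH of whatever launched the script (procmail here), not the ones from my interactive login. As a result of that, you need to use absolute paths. That means that `php ./processMail.php` becomes `/usr/bin/php /var/pixcede/xxxx/processMail.php`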

  • As a corollary to the previous point, absolute paths need to be specified in exec'ed scripts as well; e.g. the database file specified to sqlite3

  • Permissions matter! I was using a logger to follow the actions once an email was resolved, and I couldn't figure out why nothing from processMail.php was getting logged. I was logging the command that got exec'ed, and running it manually -- and it always worked, because the manual run used my own account. procmail calling newMailDaemon.php calling processMail.php runs with the permissions of the user account that procmail is acting on behalf of, so that user needs permission to run the script. Whatever is running the main script needs permission to exec additional scripts
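Here is a rough sketch of an exec call that takes those three points into account. The helper name `buildBackgroundCommand` and the example paths are hypothetical. The one behaviour it relies on is documented for PHP's exec(): the call only returns right away if the child's output is redirected and the command is put in the background with `&`.

```php
<?php
// Hypothetical helper: build a detached command line from absolute paths only.
function buildBackgroundCommand(string $phpBinary, string $script, string $messageFile): string
{
    // Reject anything relative; the caller's working directory is not ours.
    foreach ([$phpBinary, $script, $messageFile] as $p) {
        if ($p === '' || $p[0] !== '/') {
            throw new InvalidArgumentException("not an absolute path: $p");
        }
    }

    // Fail loudly here instead of silently in the child. These checks run as
    // the current user, i.e. the account procmail is acting for.
    if (!is_executable($phpBinary)) {
        throw new RuntimeException("cannot execute $phpBinary");
    }
    if (!is_readable($script)) {
        throw new RuntimeException("cannot read $script");
    }

    // Redirect output and append & so exec() does not wait for the child.
    return escapeshellarg($phpBinary) . ' ' . escapeshellarg($script) . ' '
         . escapeshellarg($messageFile) . ' > /dev/null 2>&1 &';
}

// Usage (paths are placeholders):
// exec(buildBackgroundCommand('/usr/bin/php', '/path/to/processMail.php', $path));
```

Logging the exceptions from this helper would have pointed at the permissions problem directly, since they are raised under the same account that procmail uses.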

This design is far from perfect. The different actions need to be separated more; ideally, having newMailDaemon load and parse an external workflow file would be nice -- then I could lock down newMailDaemon and not have to change it to update the workflow (less chance of it failing). To handle and track the different steps, I would like to keep a table of actions to process, have entries put into that, and have another script act on that table until all tasks are set to "complete"; that sounds more scalable and manageable than exec'ing scripts for each action that needs to be done. Having newMailDaemon schedule tasks and processTasks execute those tasks sounds cleaner.
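To make the task-table idea concrete, here is one way it might look. None of this exists in PixCede yet: the table layout, the function names, the action names, and the choice of PDO with SQLite are all assumptions for the sake of the sketch.

```php
<?php
// Assumed schema: one row per unit of work, 'pending' until a handler succeeds.
function initTasks(PDO $db): void
{
    $db->exec(
        "CREATE TABLE IF NOT EXISTS tasks (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            action TEXT NOT NULL,
            message_file TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending'
        )"
    );
}

// What newMailDaemon would do: record the work instead of exec'ing it.
function scheduleTask(PDO $db, string $action, string $messageFile): void
{
    $stmt = $db->prepare('INSERT INTO tasks (action, message_file) VALUES (?, ?)');
    $stmt->execute([$action, $messageFile]);
}

// What processTasks would do: run every pending task that has a handler and
// mark it complete on success. $handlers maps an action name to a callable
// that takes the message file path. Returns the number of tasks completed.
function processTasks(PDO $db, array $handlers): int
{
    $rows = $db->query(
        "SELECT id, action, message_file FROM tasks WHERE status = 'pending' ORDER BY id"
    )->fetchAll(PDO::FETCH_ASSOC);
    $mark = $db->prepare("UPDATE tasks SET status = 'complete' WHERE id = ?");

    $done = 0;
    foreach ($rows as $row) {
        if (!isset($handlers[$row['action']])) {
            continue; // unknown action stays pending
        }
        try {
            $handlers[$row['action']]($row['message_file']);
        } catch (Throwable $e) {
            continue; // failed task stays pending so it can be retried
        }
        $mark->execute([$row['id']]);
        $done++;
    }
    return $done;
}

// Usage (absolute database path, as learned above; the path is a placeholder):
// $db = new PDO('sqlite:/path/to/tasks.db');
// $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
```

Leaving failed rows as "pending" gives the replay behaviour for free: running processTasks again picks up exactly what did not finish.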

At any rate, PixCede works again now!
