Email Receipts to CSV

Save your digital receipts to CSV

I have created my own system for retrieving data from my digital receipts by scouring through incoming emails that meet the specific criteria set in cPanel.

After a little trial and error, I settled on some simple PHP and a few steps to get the desired result. As soon as a new receipt email comes in, the PHP script will read through its contents and store a copy of the relevant data – just the date, business name and total – in a CSV which you can download to use later for your bookkeeping.

Supported receipts

The script will recognise receipts in £ using the following services:

The PHP script

First, you will need to create a PHP script that reads your email contents and stores information relevant to your bookkeeping.

Here is a copy of the PHP file to save to a public or private directory on your website:

#!/usr/local/bin/php -q
<?php

chdir(__DIR__);
date_default_timezone_set('Europe/London');

$log_file = './receipts-log.csv';
$access = (isset($_GET['access']) && $_GET['access'] == 'Enter-A.code-Here-2_download_CSV');
$download = ($access && isset($_GET['download']) && $_GET['download']);
$log = (!isset($_GET['log']) || isset($_GET['log']) && $_GET['log']);
$m = array();
$data = $email = '';
$total = $date_formatted = $date = $subject = $business = NULL;

if ($download)
{
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename="Receipts ' . date("Y-m-d") . '.csv"');
    echo 'Date,Business,Total' . PHP_EOL;
    echo file_get_contents($log_file);
    exit;
}

$fp = fopen('php://stdin', 'r');

while (!feof($fp))
{
    $email .= fread($fp, 1024);
}

fclose($fp);

if ($email == NULL)
{
    exit;
}

if (preg_match('/^Subject:\s+(.+)/m', $email, $m))
{
    $subject = trim(mb_decode_mimeheader($m[1]));
}

if (preg_match('/^Content\s+preview:.+(?:Your\s+)?[Rr]eceipt\s+(?:from|for\s+[Oo]rder[\s#]+\d+\s+at):?\s+(.+)\s+for\s+.*$/m', $email, $m)) 
{
    $business = trim($m[1]);
}
elseif (preg_match('#Content\s+preview:\s+Thank.+[\d.-]+\s+[A-Z]{3}[\r\n\s]+[\d/-]+\s+[\d:]+\s+([^\r\n]+)#', $email, $m)) 
{
    $business = trim($m[1]);
}
elseif ($subject != NULL && preg_match('/(?:Your\s+)?[Rr]eceipt\s+(?:from|for\s+[Oo]rder[\s#]+\d+\s+at):?\s+(.+)/m', $subject, $m))
{
    $business = trim($m[1]);
}

if (preg_match('/^[A-Z ]+$/', $business))
{
    $business = ucwords(mb_strtolower($business));
}

if (preg_match('/^Date:\s+(.+)/m', $email, $m))
{
    $date = strtotime($m[1]);
}

if ($date != NULL)
{
    $date_formatted = date("Y/m/d H:i", $date);
}

if (preg_match('/^\s*Content\s+preview:.+(?:[Rr]eceipt\s+for|Total)\s+£(\d{1,3}(?:[.]\d{2})?).*$/m', $email, $m))
{
    $total = $m[1];
}
elseif (preg_match('/^\s*Content\s+preview:\s+[^\d:]+(\d+(?:\.\d+)?)[^\d:]*$/m', $email, $m)) 
{
    $total = $m[1];
}
elseif (preg_match('/(?:for\s+|>)(?:=C2=A3)(\d{1,3}(?:[.]\d{2})?)(?:[\b\s]+)/', $email, $m)) 
{
    $total = $m[1];
}
elseif (preg_match('/<th\s+class=(?:3D)?"value">\s*(?:=C2=A3)(\d{1,3}(?:[.]\d{2})?)\s*<\/th>/m', $email, $m)) 
{
    $total = $m[1];
}
elseif (preg_match_all('#(?:=C2=A3|£|£)([\d.]+)#m', $email, $m) && is_array($m) && is_array($m[1]))
{
    foreach ($m[1] as $v)
    {
        if (!is_numeric($v) || is_numeric($v) && (is_numeric($total) && $total > $v))
        {
            continue;
        }

        $total = $v;
    }
}

$data = array(
    'date_formatted' => $date_formatted,
    'business' => $business,
    'total' => $total,
);

if ($log)
{
    $fp = fopen($log_file, 'a');
    fputcsv($fp, $data);
    fclose($fp);
}

You will need to edit and test this to match with your own requirements.

The regular expressions will pick up data from Dojo, Square, SumUp and Toast receipts. The currency symbol here is £ (or =C2=A3 within the content).

Ensure you’ve uploaded the PHP file using just LF (\n) line breaks (to avoid a Zend Extension error) and set the file’s CHMOD to “700”. Remember to include the first line, just before you open the PHP tag as this is used to identify the location of the PHP executable.

The steps

Here are the steps you’ll need to take:

  1. Create a PHP script using the example above as your starting point; place it publicly if you want to download the CSV in your browser. CHMOD this file to 700.
  2. Optionally, add an empty log file to match the file in the code.
  3. Set up your unique email alias in cPanel | Email Forwarders (e.g. receipt@domain.com) that redirects to two destinations:
    • To an email account so you can receive the original receipt (e.g. your-main-account@domain.com) and
    • To a PHP script to store the data as a CSV file – go to Advanced | Pipe to Program and enter the path starting relative to your account’s root: (e.g. public_html/sub_directory/receipt.php
  4. Test it.

Debugging

I recommend adding a long string for the $email variable to simulate an incoming email. You can grab the source data from Roundcube’s view source (or similar). This will help identify the patterns for the business name and total.

If you spot the Zend Extension error, you’ll need to ensure the files only contain LF line breaks (unix format).

Do remember to specify the two destinations for your email forwarders, otherwise, you will lose the original copy of the receipt.

Your comments

This will require some moderate knowledge of PHP and cPanel. This is my initial draft, so there may be some points that I’ve missed. Please add your comment or get in touch if you want questions.

Last updated on

Leave a Reply

Your email address will not be published. Required fields are marked *