Custom Table for WordPress Custom Fields

I just read Tom McFarlin’s thoughts on building web apps with WP and my comment started to run long, so here I am, writing a blog post.

I agree with Tom, WP can make a great foundation for an app. I chose to use WP when I started building WP App Store and it gave me a huge head start when compared to starting with an app framework. Having the dashboard UI built and ready to customize is probably the piece that saved the most time. As Tom mentions, there’s user management, a clean API, and lots more. All of which are battle-tested by thousands of developers and millions of web sites. As I went deeper into building WP App Store though, I did learn some lessons about what works and what doesn’t when using WP for an app. In this post, I’ll describe one that I’ve been grappling with lately.

As WP App Store has become more complex, so has our data and our queries. Some of our queries are now looking pretty ugly, having a lot of joins. Here’s one that gets a report of a publisher’s sales commissions by querying the order post type and its custom field data:

SELECT pm2.meta_value as product_title, COUNT(*) as unit_count, SUM( pm1.meta_value ) as commission_total
FROM $wpdb->posts
	INNER JOIN $wpdb->postmeta AS pm1 ON pm1.post_id = $wpdb->posts.ID AND pm1.meta_key = '_commission_amount'
	INNER JOIN $wpdb->postmeta AS pm2 ON pm2.post_id = $wpdb->posts.ID AND pm2.meta_key = '_product_title'
	LEFT OUTER JOIN $wpdb->postmeta AS pm3 ON pm3.post_id = $wpdb->posts.ID AND pm3.meta_key = '_is_test'
	INNER JOIN $wpdb->postmeta AS pm4 ON pm4.post_id = $wpdb->posts.ID AND pm4.meta_key = '_publisher_id'
    INNER JOIN $wpdb->postmeta AS pm5 ON pm5.post_id = $wpdb->posts.ID AND pm5.meta_key = '_product_type'
    LEFT OUTER JOIN $wpdb->postmeta AS pm6 ON pm6.post_id = $wpdb->posts.ID AND pm6.meta_key = '_is_refunded'
WHERE pm5.meta_value = '%s'
    AND pm4.meta_value = '%s'
    AND (pm3.meta_value IS NULL OR pm3.meta_value = '' OR pm3.meta_value = '0')
    AND (pm6.meta_value IS NULL OR pm6.meta_value = '' OR pm6.meta_value = '0')
    AND $wpdb->posts.post_type = 'order'
    AND ($wpdb->posts.post_status = 'publish')
GROUP BY pm2.meta_value

Why a custom query? Why not use the WP_Query object for this? I believe WP_Query is intended for querying posts, but in this case we’re generating a report, leveraging MySQL’s awesome ability to count the number of sales, add up the commissions, and return one neat row with these numbers for each product title. This is the kind of stuff MySQL is great at doing.

Unfortunately though, custom fields were not designed for this. They were originally designed to be returned from the database and displayed, not queried. Performance is terrible when queries have to join the postmeta table many times for each custom field key. Plus, as Alex King pointed out in a recent episode of Shop Talk Show we’re querying a MySQL column of type longtext which is slow. And to top it off, the meta_value column isn’t indexed.

If we look at the custom fields in the query above, you could convert some of the custom fields into taxonomies because they would be repeated across several orders, but there is some data that is unique to a single instance of a post as well. Also, if we converted all the custom fields to taxonomies, we’d end up querying for many taxonomies in this one query, which would again result in many joins. Instead, I’ve been considering creating a custom database table for each post type with columns for each of the custom fields.

So, for the data in the example above, a custom table to hold the custom field data might look like the following:

  `order_id` bigint(20) NOT NULL,
  `publisher_id` bigint(20) NOT NULL,
  `product_title` varchar(255) NOT NULL,
  `product_type` varchar(255) NOT NULL,
  `commission_amount` float NOT NULL,
  `is_test` tinyint(4) NOT NULL,
  `is_refunded` tinyint(4) NOT NULL
  PRIMARY KEY (`order_id`)

Then the query above would look like this:

SELECT pm.product_title, COUNT(*) as unit_count, SUM( pm.commission_amount ) as commission_total
FROM $wpdb->posts
    INNER JOIN $wpdb->pm_order AS pm ON pm.post_id = $wpdb->posts.ID
WHERE pm.product_type = '%s'
    AND pm.publisher_id = '%s'
    AND (pm.is_test IS NULL OR pm.is_test = '' OR pm.is_test = '0')
    AND (pm.is_refunded IS NULL OR pm.is_refunded = '' OR pm.is_refunded = '0')
    AND $wpdb->posts.post_type = 'order'
    AND ($wpdb->posts.post_status = 'publish')
GROUP BY pm.product_title

We’ve gone from 6 joins down to 1 and the columns we’re now querying are of types representative of the data they contain. A major improvement to the query performance, but also much easier to read and maintain in the future.

What do you think of using custom database tables to store custom field data for an app built with WP? Maybe there’s a better way?

Thanks to Tom for reviewing and providing feedback on this post before I published it.