Clean Up WordPress Revisions Using a MySQL Multi-Table Delete
by J. Michael Ambrosio
Thursday, May 6th, 2010
I’ve been doing some research on the revisions WordPress creates automatically whenever you edit or update a post. The revision feature is sparsely documented in the WordPress Codex, so I was pleased to find an article from SitePoint on the subject: How to Control Post Revisions in WordPress. It’s an excellent write-up overall, and I recommend reading it. However, I did notice a flaw in the SQL statement given for deleting old revisions. Since this flaw could potentially cause unintended data loss, and since comments on the article are closed, I wanted to explain the problem here.
In order to thoroughly remove all traces of old revisions from the WordPress database, it is necessary to clean up the wp_term_relationships table. This is because WordPress prior to version 3.0 stores the category relationships for revisions in that table. The SQL statement given in the SitePoint article to purge revision data is as follows:
FROM wp_posts a
LEFT JOIN wp_term_relationships b ON (a.ID = b.object_id)
LEFT JOIN wp_postmeta c ON (a.ID = c.post_id)
WHERE a.post_type = ‘revision’;
Notice that the SQL statement is joining the wp_posts and wp_term_relationships tables, and then performing a multiple table delete to remove all rows from both tables where post_type equals ‘revision’ (and from the wp_postmeta table as well). It’s an elegant statement designed to clean up the entire database in one step.
Here’s the flaw: there are two types of category relationships stored in wp_term_relationshps — post categories, and link categories. The object_id field can contain either an ID from the wp_posts table, or a link_id from the wp_links table. It is entirely possible that a revision and a link could have the same object_id, as was in fact the case in my own WordPress database:
| object_id | term_taxonomy_id | term_order |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 9 | 1 | 0 | <-- same object_id
| 9 | 2 | 0 | <--
| 10 | 1 | 0 | <-- same object_id
| 10 | 2 | 0 | <--
When I ran the above SQL statement on my database (after having made a backup, of course), it deleted the category relationships for both revisions and links, in all cases where they happened to have the same object_id. Oops.
How can this be? It is important to understand that wp_term_relationships uses a composite primary key. If we run ‘describe’ on that table, we can see that both the object_id and term_taxonomy_id fields are marked ‘PRI’:
| Field | Type | Null | Key | Default |
| object_id | bigint(20) unsigned | NO | PRI | 0 |
| term_taxonomy_id | bigint(20) unsigned | NO | PRI | 0 |
| term_order | int(11) | NO | | 0 |
The wp_term_relationships table sits between wp_posts and wp_term_taxonomy, and defines the many-to-many relationships between them. This means that, in order to select a unique row from that table, we must join it with both wp_posts and wp_term_taxonomy, which stores the distinction between post categories and link categories in the ‘taxonomy’ field:
| term_taxonomy_id | term_id | taxonomy | . . .
| 1 | 1 | category | . . .
| 2 | 2 | link_category | . . .
Armed with this knowledge, we can write a revised SQL statement that removes all revisions, without the potential for accidentally deleting link relationships. When I ran the revised query on my database, it worked exactly as intended:
FROM wp_posts a
LEFT JOIN wp_term_relationships b ON ( a.ID = b.object_id)
LEFT JOIN wp_postmeta c ON ( a.ID = c.post_id )
LEFT JOIN wp_term_taxonomy d ON ( b.term_taxonomy_id = d.term_taxonomy_id)
WHERE a.post_type = 'revision'
AND d.taxonomy != 'link_category';
Granted, this is not a catastrophic flaw; many people who use WordPress as a CMS don’t even use the wp_links table, and for those who do, any damage done would be relatively easy to fix. However, since it did affect me personally, and since it is a interesting case study in database design and the use of composite keys, I felt it was worth pointing out.
My thanks to Craig Buckler, author of the SitePoint article, for graciously responding to my message concerning the flaw, and posting a correction.
More From the Blog
- Simplify WordPress Editing Screens Using remove_post_type_support
- Seeking Beta Testers for Next/Previous Post Link Plus 2.0
- Truncate Long Titles to the Nearest Whole Word Using PHP strrpos
- It’s Official: WordPress is Not Just a Blog Any More
- Shorten Category Names in WordPress Dropdown Menus
- Hide Phone Numbers From Skype Using the HTML Soft Hyphen