Cleanup changes table

From Zarafa wiki

Jump to: navigation, search

When doing an upgrade on a really large dataset, it can take a really long time to change the primary keys on the changes table. It seems that cleaning up the changes table also can improve performance in some situations.

Lots of information in the changes table is not necessary because all data is most of the time synced to the clients. So if you are sure that all clients have synced all data before e.g. 1-1-2011, it is possible to delete all entries before that date. This will make the upgrade to 6.40 go much faster.

First run the query:

select min((select min(id) from changes where changes.parentsourcekey=syncs.sourcekey and> syncs.change_id and 
changes.change_type & (syncs.sync_type * 0x1000) != 0 and sourcesync != as next from syncs where change_id > 1 and sync_time > '2011-1-1';

The date format in the above query is: 'Year-Month-Day'.

The query will output a number e.g.:

| next     |
| 94429444 |
1 row in set (12 min 47.77 sec)

You can use this number to delete anything in the change table that has an id smaller than this number, e.g.:

delete from changes where id < 94429444

Note: If there are clients that have not synced AFTER date x, and you start deleting all data BEFORE date x from the changes table, those clients will NOT sync any item anymore from BEFORE date x. A work around for this is to do a complete resync of those clients

However it is better to do this in phases on large databases, to prevent deadlocks. The following bash script is an example how to do this (change the CHANGES_ID number to the number from the query above, in the example case this was 94429444):


# This option will limit the deletion procedure to 1000 items at a time

# This option contains the id number from the query above in this wiki page


while [ $RA -gt 1 ]; do
  RA=`mysql zarafa -e "delete from changes where id < $CHANGES_ID LIMIT $LIMIT; select row_count();" | tail -n 1`
  RT=`expr $RT + $RA`
  echo "Deleted $RT records"
Personal tools