hg diff on MySQL Workbench files
Asked Answered
S

1

2

I'm posting this as a Q&A to document a workaround for a problem that seems to come up frequently—how to put MySQL Workbench files under version control—but for which I have been unable to find any solutions. Feedback is welcome!

How can I tell Mercurial to diff the contents of a zipped archive and ignore some of the changes to those contents? Specifically, how can I use hg to diff the contents of a MySQL Workbench (.mwb) file, ignoring the many unimportant changes that MySQL Workbench makes every time the file is opened? Can I use a custom script that ignores certain irrelevant changes?

Background

I am trying to diff a file in an hg repository. The file, document.mwb.xml, is an XML document extracted from a .mwb file (a MySQL Workbench model file). Basically, I am looking to keep the model's contents—the table structure, visual model, etc.—under version control, but not commit the .mwb file itself, which is a zip archive and thus a binary file.

Anytime I save the .mwb file, I unzip it. I keep the unzipped contents in my repository, and just zip them up again when I need to work with the .mwb in MySQL.

The XML in question looks like this:

<?xml version="1.0"?>
<data grt_format="2.0" document_type="MySQL Workbench Model" version="1.4.4">
  <value type="object" struct-name="workbench.Document" id="8551CCFA-3AD0-4207-BC76-15ED589CF22C" struct-checksum="0x7131bf99">
    <value type="object" struct-name="workbench.logical.Model" id="B48E1CD2-3386-40B7-8E59-AA191598F667" struct-checksum="0xf4220370" key="logicalModel">
      <value _ptr_="0x7fbcd1cc3270" type="list" content-type="object" content-struct-name="workbench.logical.Diagram" key="diagrams"/>
      <value _ptr_="0x7fbcd1cc3210" type="dict" key="customData"/>
      <value _ptr_="0x7fbcd1cc32d0" type="list" content-type="object" content-struct-name="model.Marker" key="markers"/>
      <value _ptr_="0x7fbcd1cc3330" type="dict" key="options"/>
      <value type="string" key="name"></value>
      <link type="object" struct-name="GrtObject" key="owner">8551CCFA-3AD0-4207-BC76-15ED589CF22C</link>
    </value>
    <value _ptr_="0x7fbcd1cc2b70" type="list" content-type="object" content-struct-name="workbench.OverviewPanel" key="overviewPanels"/>
    <value _ptr_="0x7fbcd1cc2c00" type="list" content-type="object" content-struct-name="workbench.physical.Model" key="physicalModels">
      <value type="object" struct-name="workbench.physical.Model" id="34B9E967-5C9B-4D1B-8759-C417F6C33AA3" struct-checksum="0x5f896d18">
...

The problem is all of those _ptr_ attributes: there are literally thousands of them in this file, and every one of them changes every single time the file is saved, even if nothing is modified. As a result, the repository can rapidly get cluttered with completely meaningless "changes" to this file.

Is there a way to use a custom diff routine to ignore these irrelevant changes?

Schism answered 9/11, 2015 at 17:32 Comment(0)
S
3

I have not found a true solution, but I have developed a satisfactory workaround, inspired by this mwb-diff gist. This allows me to unzip and diff the .mwb file's contents, commit those contents and their changes to the repository, and use the .mwb normally when necessary.

Project Structure

My project is set up like this:

project_root
    /dist
    /schema
    /src
    /test

I save the .mwb file - call it MyModel.mwb - in project_root/schema. Obviously, you can use a different structure, but you will need to modify the instructions below accordingly.

The Scripts

I created the following scripts and saved them in project_root/schema:

unpack.sh

#!/bin/bash

# Unzip the model (MyModel.mwb) into a particular directory (project_root/schema/MyModel)
unzip -o MyModel.mwb -d MyModel/

# Replace all _ptr_="...." attributes with _ptr_="xxx"
sed -i presed -E 's/_ptr_="0x[0-9a-f]+"/_ptr_="xxx"/g' MyModel/document.mwb.xml

pack.sh

#!/bin/bash

# This file goes into the directory containing the model contents, zips them up, and saves them as a .mwb model

cd MyModel/
zip -r ../MyModel.mwb ./* -x lock
cd ..

Getting Mercurial Ready to Rock

We need to tell hg to ignore the model (and all other .mwb files). Also, when MySQL Workbench is open, it adds a lock file to the .mwb archive, which we need to ignore. So, add these lines to your .hgignore file:

*.mwb
*.mwb.bak
schema/MyModel/lock

An Aside About the data.db File

Optionally, also ignore the data.db file (a SQLite database) in the .mwb file. It is a binary file that contains any INSERTs or other non-create SQL statements that are part of your model. As a rule, I don't use MySQL Workbench for this stuff; I use it only to create and edit tables, views, etc. So, I added this line to .hgignore:

schema/MyModel/data.db

If you want to track changes to the data.db file, you may need to modify this workaround.

How to Use the Scripts

When you want to modify the .mwb file, rebuild it from its components by running pack.sh above. This could be added as a hook to happen automatically when you hg pull, update, etc., but I haven't explored this, yet.

When you are done editing the .mwb file and want to commit your changes, run the unpack.sh script. If you want, you can set up a file-watching utility on your system to do this automatically when the file changes, but that's beyond the scope of this answer.

The Results

Mercurial is now perfectly happy to track changes to the contents of your .mwb file without tracking thousands of apparently-useless _ptr_ attributes. Also, while I am using this with Mercurial, the basic logic (and the shell scripts) will work with git, SVN, etc.

IMPORTANT CAVEAT: As far as I can tell, the _ptr_ attributes are irrelevant. The scripts I have posted above actually replace the contents of those attributes. _ptr_="0x98a7b3e4" (or whatever) becomes _ptr_"xxx". Based on my testing, this doesn't matter, and MySQL Workbench will happily work with the reconstituted file, apparently disregarding the _ptr_ values. I may be wrong, and these values may matter! You are strongly encouraged to test this for yourself before relying on my solution.

Schism answered 9/11, 2015 at 17:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.