Git conflicts with JSON files
T

6

23

Our website is localized using a bunch of JSON files with translations (one file per language). The content of the files looks like this:

{
    "Password": "Passwort",
    "Tables": "Tische"
}

Many team members edit these JSON files at the same time, adding new phrases and editing existing ones, and we get lots of conflicts even though people are changing different lines.

Is there a way to set up git in such a way that would help avoid merge conflicts?

P.S. I've found this script to help merge locally: https://gist.github.com/jphaas/ad7823b3469aac112a52. However, I'm interested in a solution that would fix the problem for everyone on the team (even for people who edit the JSON files through GitHub's web interface).

Table answered 14/10, 2015 at 9:47 Comment(12)
Dummy answer: could you instead help Git by structuring your files differently? If your JSON is split over two or more files (ideally one file per section), you'll reduce merge conflicts.Joannejoannes
@Joannejoannes thanks! That might work for some other case, but with our files I don't think there's a good rule per which to split the files.Table
Here is an article which might help.Ilka
@TimBiegeleisen thank you. I actually put a link to the code from the article into the question.Table
Um, if all of your developers are all modifying these files at different times, at different line locations, with different contents, I don't see any way that you could avoid conflicts. Dealing with merge conflicts is just part of the nature of working with a distributed, asynchronous workflow. I think the only way you can avoid it 100% is to have each developer wait for other developers to finish modifying a file before they modify it themselves...but then that defeats the whole purpose of using a distributed VCS like Git.Oops
Ok, so after I thought about it, the single most effective way to avoid merge conflicts is to have your developers sync with upstream changes very frequently before they make and commit their own local changes. But it sounds like that's not what your team is doing, especially if they're making changes through GitHub directly. You could have them make pull requests more frequently between each other, but then that starts getting really taxing if all you're doing is committing one line changes at a time, not to mention polluting your commit history with a ton of complicated merge commits.Oops
If you want to maintain a cleaner history (without the merge commits), your team would need to get into the habit of rebasing their local changes with upstream changes on their own local machines. As far as I know, the GitHub interface doesn't let you do rebasing, only classic merging.Oops
@Cupcake exactly! Everything you said is true.Table
@Table I didn't understand that part about only JSON files conflicting more often than your other files...that's actually interesting. I've never noticed that before. So never mind everything else I said, lol :POops
This might tangentially help: https://mcmap.net/q/341870/-git-merging-within-a-line It seems to me (as NikoNyrh also suspected) git doesn't recognize chunk boundaries as granularly as you would want. git-diff has --word-diff option, which can be helpful before/after merge. One option could be using external tool for merging, e.g. wiggle as suggested on unix.stackexchange.com/questions/20021/…Thomsen
@koiyu it does help understand! Nice link, thanks!Table
One might also use the .gitattributes file to specify per-word diffing. It's applicable to GitHub. See git-scm.com/book/en/v2/Customizing-Git-Git-AttributesTable
I
16

we get lots of conflicts even though people are changing different lines

This shouldn't be the case: in principle you only get conflicts if the same line is modified by different people, committed, and then later merged.

Oh, I actually tried this out and encountered some odd problems.

Commit 1 (master):

{
    "a": "1",
    "b": "2",
    "c": "3",
    "d": "4",
    "e": "5",
    "f": "6",
    "g": "7"
}

Commit 2 (tmp)

{
    "A": "1",
    "B": "2",
    "C": "3",
    "d": "4",
    "e": "5",
    "f": "6",
    "g": "7"
}

Commit 3 (master):

{
    "a": "1",
    "b": "2",
    "c": "3",
    "d": "4",
    "E": "5",
    "F": "6",
    "G": "7"
}

git merge tmp: correct result

{
    "A": "1",
    "B": "2",
    "C": "3",
    "d": "4",
    "E": "5",
    "F": "6",
    "G": "7"
}

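However, I do get conflicts if row "d" is modified as well. Maybe Git isn't able to establish the diff boundaries: the two sets of changes then sit on adjacent lines with no unchanged line between them, and as far as I can tell Git treats that as a conflict. My stupid suggestion to avoid this stupid Git behavior is to add "padding" to the JSON file (ugly, isn't it? But no more conflicts):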

{
    "a": "1",

    "b": "2",

    "c": "3",

    "d": "4",

    "e": "5",

    "f": "6",

    "g": "7"
}
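
If you want to apply the padding automatically rather than by hand, here is a minimal sketch in Python. It assumes the translation file is a flat object of string keys and string values, as in the question; the function name `pad_json` is made up for this example and is not part of any tool.

```python
import json


def pad_json(text: str, indent: int = 4) -> str:
    """Re-serialize a flat JSON object with one blank line between entries.

    Assumption: every value is a scalar (string, number, ...), so that
    json.dumps puts exactly one entry on each line.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    if any(isinstance(value, (dict, list)) for value in data.values()):
        raise ValueError("nested values are not handled by this sketch")
    if not data:
        return "{}\n"

    lines = json.dumps(data, indent=indent, ensure_ascii=False).split("\n")
    opening, entries, closing = lines[0], lines[1:-1], lines[-1]
    # Joining with "\n\n" leaves an empty line between consecutive entries.
    return "\n".join([opening, "\n\n".join(entries), closing]) + "\n"


if __name__ == "__main__":
    print(pad_json('{"Password": "Passwort", "Tables": "Tische"}'), end="")
```

The padded file is still valid JSON, so the application that reads it does not need to change. Everyone (or a pre-commit step) has to write the file in the same padded layout, otherwise the re-formatting itself shows up as a diff.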
Importune answered 21/10, 2015 at 14:33 Comment(2)
Interesting! We should try that! I'll let you know if it works for us. Thanks!Table
I was able to reproduce a conflict with your test case and padding with newlines resolves it indeed! Thank you for actually trying and finding an easy-to-implement solution!Table
E
8

One thing I would do in such a scenario is maintain the configuration in a database table instead of a JSON file, if it really changes that frequently. As others have already pointed out, there is not much you can do to avoid conflicts when that many changes are happening to the config all the time. Your example looks more like a mapping between a word in English and its counterpart in some other language anyway, so a three-column table should suffice.

The JSON file, if needed, could be generated either on the fly every time or once during deployment for each server from the database table.
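
As a rough illustration of the generation step, here is a sketch using Python's built-in `sqlite3` module. The table name `translations` and the columns `language`, `source_text`, and `translated_text` are assumptions made for this example; any database with an equivalent three-column table would work the same way.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical three-column translation table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS translations (
            language        TEXT NOT NULL,
            source_text     TEXT NOT NULL,
            translated_text TEXT NOT NULL,
            PRIMARY KEY (language, source_text)
        )
        """
    )


def export_translations(conn: sqlite3.Connection, language: str) -> str:
    """Return the JSON document for one language, sorted by source phrase."""
    rows = conn.execute(
        "SELECT source_text, translated_text FROM translations "
        "WHERE language = ? ORDER BY source_text",
        (language,),
    )
    mapping = {source: translated for source, translated in rows}
    return json.dumps(mapping, indent=4, ensure_ascii=False)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.executemany(
        "INSERT INTO translations VALUES (?, ?, ?)",
        [("de", "Password", "Passwort"), ("de", "Tables", "Tische")],
    )
    print(export_translations(conn, "de"))
```

The primary key on (language, source_text) means two people adding the same phrase get an immediate database error instead of a merge conflict later.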

Error answered 16/10, 2015 at 15:20 Comment(2)
It would be a nice solution, didn't think about it! Thank you!Table
@Table Happy to help :)Error
S
3

Another reason why you see so many conflicts could be that your developers are using different line-ending configurations. See How to change line-ending settings in Git. To find out, you can open a JSON file in a hex editor and check whether the line endings are consistent across the whole file.
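
If a hex editor is not at hand, the same check can be sketched in a few lines of Python. The function names below are made up for this example; the script only inspects the bytes of the file as it exists in your working tree.

```python
import sys
from pathlib import Path


def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), Unix (LF) and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LF not preceded by CR
    cr = data.count(b"\r") - crlf  # CR not followed by LF
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_line_endings(data: bytes) -> bool:
    """True if more than one kind of line ending occurs in the data."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1


if __name__ == "__main__":
    # Usage: python check_eol.py de.json fr.json ...
    for name in sys.argv[1:]:
        raw = Path(name).read_bytes()  # binary mode, so nothing is translated
        flag = "MIXED" if has_mixed_line_endings(raw) else "ok"
        print(name, count_line_endings(raw), flag)
```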

Shaggy answered 23/10, 2015 at 6:50 Comment(0)
I
1

You could run git pull --rebase. That way when someone else has edited your JSON file, git will first pull their changes, then try to apply your changes on top of theirs. There is an option to do this every time: just put your branch name in place of BRANCH and run this: git config branch.BRANCH.rebase true

Idiocy answered 22/10, 2015 at 22:12 Comment(0)
Y
0

Taken from @Tim Biegeleisen's comment, which I think deserves to be an answer:

Yeti answered 19/5, 2022 at 15:27 Comment(0)
A
-2

For example, say two developers (A and B) work on a project. I would create two translation files: A.json for developer A and B.json for developer B. I would also create the combined translation file, en_US.json, which is ignored via the .gitignore file.

So the structure of the translation directory is as follows:

$ tree . -a
.
├── A.json
├── B.json
├── en_US.json
└── .gitignore

Now I have to create a task that combines all the JSON files into en_US.json. This is easier if you are running a JavaScript project; I suggest using grunt or gulp to run the task. For example, you can refer to https://www.npmjs.com/package/grunt-merge-json or https://www.npmjs.com/package/grunt-concat-json
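
If you are not on a JavaScript toolchain, the combine step is small enough to sketch by hand. The following Python version is only an illustration under the directory layout above; the function names are hypothetical, and the choice to fail when two files define the same key differently is one possible policy, not the only one.

```python
import json
from pathlib import Path


def combine_translations(sources: dict) -> dict:
    """Merge parsed per-developer objects into one dictionary.

    `sources` maps a file name to its parsed JSON object. A key that
    appears in two files with different values raises ValueError.
    """
    combined = {}
    origin = {}  # key -> file name that first defined it
    for name, entries in sources.items():
        for key, value in entries.items():
            if key in combined and combined[key] != value:
                raise ValueError(
                    f"{key!r} is defined differently in {origin[key]} and {name}"
                )
            combined[key] = value
            origin.setdefault(key, name)
    return combined


def build(directory: str, output_name: str = "en_US.json") -> None:
    """Read every *.json in `directory` except the output and write the result."""
    folder = Path(directory)
    sources = {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(folder.glob("*.json"))
        if path.name != output_name
    }
    merged = combine_translations(sources)
    text = json.dumps(merged, indent=4, ensure_ascii=False, sort_keys=True)
    (folder / output_name).write_text(text + "\n", encoding="utf-8")


if __name__ == "__main__":
    build(".")
```

Failing loudly on a duplicate key with different values turns a silent overwrite into an error at build time, which is where you want to find out that A and B translated the same phrase differently.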

Antin answered 16/10, 2015 at 11:4 Comment(2)
What if A and B add the same keys?Eastman
That depends on the script you define in your tasksAntin
