Download all messages from a Google group

4

30

I'm developing a Google Apps migration/archive system, and I'm currently looking for a way to download all messages in all the groups that my domain users have created. I know that I can set up forwarding filters and have all messages archived to an email address, but this doesn't help with older messages.

Is there a way to download these messages from a Google group and, if so, is there a way in the admin API to get a list of all groups that users have created?

Koniology answered 7/5, 2014 at 15:59 Comment(1)
So as it stands, the best solution I've come up with is to create a web scraper that pulls all the raw posts from the various groups. This is obviously a subpar solution, since it's prone to errors and will need to be updated whenever the Google Groups layout changes. - Koniology
17

If you don't mind using Bash, you may try a tool I wrote:

https://github.com/icy/google-group-crawler

It can download all mbox files from a Google Group. If you have a cookie file, you can even download all files from a private Google Group and/or see all the original emails. It can also read RSS feeds and fetch the latest posts, which is useful for a daily mirror.

An example result is here: http://l.archlinuxvn.org/archlinuxvn/. MHonArc is used to convert the mbox files into HTML format.

Zareba answered 20/7, 2015 at 2:37 Comment(1)
Very old thread... but still relevant. Sadly this answer no longer works since Google deprecated AJAX crawling. - Foamflower
7

Ultimately I ended up using the gdata Python library to get a list of all groups along with their respective URLs. From there I used Selenium to scrape the groups for messages and all replies. Probably not the best solution, but it works for what I need.

Koniology answered 18/6, 2014 at 21:19 Comment(1)
Have you made any code available to achieve this? - Gist
2

I made a simple scraping utility using Selenium and HtmlUnit that you can use. It is not very optimized and can only help you download messages from small groups (up to 7000 messages).

https://github.com/himukr/google-grp-scraper

Dysphonia answered 29/4, 2015 at 4:49 Comment(1)
Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline. - Granule
0

A workaround could be something like the following:

  1. Forward Google Groups messages to a mailbox
  2. Use the IMAP protocol and Gmail's app passwords to download all emails with the tag
  3. Sort and filter through the resulting emails

Note: I'm not sure whether this will work for historical messages; it may only cover newer ones.
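
The three steps above could be sketched roughly as follows in Python, using only the standard library (`imaplib`, `email`, `mailbox`). This is a minimal sketch under assumptions, not a tested migration tool: the account `archive@example.com`, the label `group-archive`, the app password placeholder and the group address `mygroup@example.com` are all hypothetical, and filtering on the `List-ID` header assumes the forwarded messages still carry one.

```python
import email
import imaplib
import mailbox
from email.utils import mktime_tz, parsedate_tz


def fetch_label(conn, label):
    """Download every message under one mailbox/label as parsed Message objects."""
    status, _ = conn.select(label, readonly=True)  # read-only: flags stay untouched
    if status != "OK":
        raise RuntimeError(f"cannot select mailbox {label!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("IMAP search failed")
    messages = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue
        # parts[0] is a (response header, raw message bytes) tuple
        messages.append(email.message_from_bytes(parts[0][1]))
    return messages


def from_group(msg, list_id):
    """True if the List-ID header mentions the group (assumes the header is present)."""
    return list_id in str(msg.get("List-ID", ""))


def sent_timestamp(msg):
    """UTC timestamp from the Date header; 0 if it is missing or unparseable."""
    parsed = parsedate_tz(str(msg.get("Date", "")))
    return mktime_tz(parsed) if parsed else 0


def save_mbox(messages, path):
    """Append the messages to an mbox file at the given path."""
    box = mailbox.mbox(path)
    try:
        for msg in messages:
            box.add(msg)
        box.flush()
    finally:
        box.close()


def main():
    # Hypothetical account, app password and label; replace with your own.
    conn = imaplib.IMAP4_SSL("imap.gmail.com")
    conn.login("archive@example.com", "app-password-here")
    try:
        msgs = fetch_label(conn, "group-archive")
    finally:
        conn.logout()
    # Hypothetical List-ID fragment for the group mygroup@example.com.
    kept = [m for m in msgs if from_group(m, "mygroup.example.com")]
    kept.sort(key=sent_timestamp)
    save_mbox(kept, "group-archive.mbox")
```

Calling `main()` with real credentials would write the filtered, date-sorted messages to `group-archive.mbox`. As noted, this only captures whatever was forwarded to the mailbox, so older posts are probably not covered.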

Vance answered 3/4, 2023 at 16:43 Comment(0)
