How to automate download of weekly export service files
8

In Salesforce you can schedule up to weekly "backups"/dumps of your data here: Setup > Administration Setup > Data Management > Data Export

If you have a large Salesforce database, there can be a significant number of files to download by hand.

Does anyone have a best practice, tool, batch file, or trick to automate this process or make it a little less manual?

Fiore answered 16/4, 2012 at 16:49 Comment(0)
12

Last time I checked, there was no way to access the backup file status (or actual files) over the API. I suspect they have made this process difficult to automate by design.

I use the Salesforce scheduler to prepare the files on a weekly basis, then I have a scheduled task that runs on a local server which downloads the files. Assuming you have the ability to automate/script some web requests, here are some steps you can use to download the files (a rough sketch follows the list):

  1. Get an active salesforce session ID/token
    • enterprise API - login() SOAP method
  2. Get your organization ID ("org ID")
    • Setup > Company Profile > Company Information OR
    • use the enterprise API getUserInfo() SOAP call to retrieve your org ID
  3. Send an HTTP GET request to https://{your sf.com instance}.salesforce.com/ui/setup/export/DataExportPage/d?setupid=DataManagementExport
    • Set the request cookie as follows:
      • oid={your org ID}; sid={your session ID};
  4. Parse the resulting HTML for instances of <a href="/servlet/servlet.OrgExport?fileName=
    • (The filename begins after fileName=)
  5. Plug the file names into this URL to download (and save):
    • https://{your sf.com instance}.salesforce.com/servlet/servlet.OrgExport?fileName={filename}
    • Use the same cookie as in step 3 when downloading the files
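
For illustration, here's a rough, untested sketch of those steps using Python's requests library (the API version, credentials, and org ID below are placeholders you'd substitute with your own):

import re
import requests

# Step 1: get a session ID via the enterprise API login() SOAP call.
LOGIN_URL = "https://login.salesforce.com/services/Soap/c/55.0"
ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:urn="urn:enterprise.soap.sforce.com">
  <soapenv:Body><urn:login>
    <urn:username>you@example.com</urn:username>
    <urn:password>passwordPlusSecurityToken</urn:password>
  </urn:login></soapenv:Body>
</soapenv:Envelope>"""
resp = requests.post(LOGIN_URL, data=ENVELOPE,
                     headers={"Content-Type": "text/xml; charset=UTF-8",
                              "SOAPAction": "login"})
session_id = re.search(r"<sessionId>(.+?)</sessionId>", resp.text).group(1)
server_url = re.search(r"<serverUrl>(.+?)</serverUrl>", resp.text).group(1)
instance = server_url.split("/")[2]  # e.g. na1.salesforce.com

# Steps 2-3: request the export page with the oid/sid cookie set.
cookies = {"oid": "YOUR_ORG_ID", "sid": session_id}
page = requests.get(f"https://{instance}/ui/setup/export/DataExportPage/d"
                    "?setupid=DataManagementExport", cookies=cookies)

# Steps 4-5: scrape the file names out of the HTML and download each file.
for name in re.findall(r'servlet\.OrgExport\?fileName=([^"&]+)', page.text):
    data = requests.get(f"https://{instance}/servlet/servlet.OrgExport"
                        f"?fileName={name}", cookies=cookies)
    with open(name, "wb") as f:
        f.write(data.content)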

This is by no means a best practice, but it gets the job done. It should go without saying that if they change the layout of the page in question, this probably won't work any more. Hope this helps.

Tonneson answered 16/4, 2012 at 22:54 Comment(7)
Thanks! Any chance you could genericize your solution (or xxxx out your credentials/etc) and pastebin it? In the meantime I'm keeping an eye out on this new service from backupify: blog.backupify.com/2012/02/28/…Fiore
The backupify project looks interesting. I would guess that they're pulling the data using one of the salesforce API's, rather than using the files prepared by Salesforce backup.Tonneson
My solution is written in C#, would that be of use to you?Tonneson
I think you're correct about backupify. I'm not familiar at all with C#, so no I wouldn't, but perhaps others might find it useful (I think there's a need out there) - Thanks again for your detailed feedback on this question!Fiore
Using @Adam 's suggestion, I wrote a small C# application to handle this. You can access it here: github.com/dthagard/salesforce_backupAssamese
@Adam Butler, this is good stuff. I followed the instructions and was able to do everything up to step 4, but how do you accomplish step 5? I tried everything to download the file but am unable to do so. I tried using WebClient and also reusing the same WebRequest, but it's not working. Please help. I am using C# (.NET 4.5, Visual Studio 2012). Thanks!Henigman
@Henigman try adapting this method: gist.github.com/anonymous/932690a8064fdbcac45fTonneson
8

A script to download the Salesforce backup files is available at https://github.com/carojkov/salesforce-export-downloader/

It's written in Ruby and can run on any platform. The supplied configuration file provides fields for your username, password and download location.

With a little configuration you can get your downloads going. The script sends email notifications on completion or failure.

If the Ruby solution does not work for you, it's simple enough to figure out the sequence of steps needed to write your own program.

Eames answered 27/7, 2013 at 21:59 Comment(6)
Great idea! I managed to get the script running and to download the ZIPs. Unfortunately all of them are corrupted, so I cannot use them. Any idea? Thx!Ess
This could be related to the Ruby version, the OS, or the options used for file reading or writing. My script is running on Linux, Ruby 1.9.3p484.Eames
Actually, I think I might see what the issue is. Try replacing 'w' with 'wb' in the file open options on line 133: f = open("#{@data_directory}/#{fn}", "wb"). If that works, I will make the changes to the source. Unfortunately I cannot test this on Windows, so your feedback is critical. Thank you!Eames
Using 'wb' instead of 'w' is working, thanks so much!Ess
Thanks a lot! This script works for me (Win10, Ruby 2.2.5) after getting the CA certs from curl.haxx.se/ca/cacert.pem and setting the SSL_CERT_FILE environment variable as described at gist.github.com/fnichol/867550.Cleavable
Script is not working anymore since Nov 17. Opened an issue on the git repo, no reaction yet.Ess
5

I'm Naomi, CMO and co-founder of cloudHQ, so I feel like this is a question I should probably answer. :-)

cloudHQ is a SaaS service that syncs your cloud storage. In your case, you'd never need to manually download your data exports from Salesforce; you'd just always have them backed up in a folder labeled "Salesforce Reports" in whichever service you synchronize Salesforce with: Dropbox, Google Drive, Box, Egnyte, SharePoint, etc.

The service is not free, but there's a free 15-day trial. To date, there's no other service that syncs your Salesforce reports with other cloud storage providers in real time.

Here's where you can try it out: https://cloudhq.net/salesforce

I hope this helps you!

Cheers, Naomi

Mureil answered 3/12, 2014 at 18:15 Comment(0)
2

Be careful that you know what you're getting in the backup file. The backup is a zip of 65 different CSV files. It's raw data that cannot be used very easily outside of the Salesforce UI.
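
If you want to see exactly what you're getting, a quick way to list the contents of one of these zips in Python (the file name below is hypothetical; use one you actually downloaded):

import zipfile

# "WE_00D000000000001_1.ZIP" is a made-up weekly-export file name.
with zipfile.ZipFile("WE_00D000000000001_1.ZIP") as z:
    print("\n".join(z.namelist()))  # one CSV per exported object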

Vaclav answered 9/11, 2012 at 14:42 Comment(1)
Please don't promote your website while pretending it is not your ownIssue
2

Our company makes the free DataExportConsole command line tool to fully automate the process. You do the following:

  1. Automate the weekly Data Export with the Salesforce scheduler
  2. Use the Windows Task Scheduler to run the FuseIT.SFDC.DataExportConsole.exe file with the right parameters (see the example below).
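
For instance, a weekly task could be registered like this (illustrative only; the path is made up, and the console's own parameters come from the tool's documentation):

schtasks /Create /TN "SalesforceExportDownload" /SC WEEKLY /D SUN /ST 06:00 /TR "C:\Tools\FuseIT.SFDC.DataExportConsole.exe <parameters>"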
Heterologous answered 18/4, 2013 at 4:11 Comment(1)
Does the tool really use the scheduled export or does it do its own? I ask because it uses the API (not the email link) and I heard the scheduled export links are not part of the API.Godber
1

Adding a Python 3.6+ solution. It should work, though I haven't tested it. Make sure the packages (requests, beautifulsoup4 and simple_salesforce) are installed, and that the SALESFORCE_USERNAME, SALESFORCE_PASSWORD and SALESFORCE_SECURITY_TOKEN environment variables are set.

import os
import requests
from datetime import datetime
from bs4 import BeautifulSoup as BS
from simple_salesforce import Salesforce

def login_to_salesforce():
    # Credentials are read from environment variables.
    sf = Salesforce(
        username=os.environ.get('SALESFORCE_USERNAME'),
        password=os.environ.get('SALESFORCE_PASSWORD'),
        security_token=os.environ.get('SALESFORCE_SECURITY_TOKEN')
    )
    return sf

org_id = "SALESFORCE_ORG_ID"  # can be found in Salesforce -> Company Profile
instance = "https://XXXX.my.salesforce.com"  # your instance; also the base URL for downloads
export_page_url = instance + "/ui/setup/export/DataExportPage/d?setupid=DataManagementExport"

sf = login_to_salesforce()
cookie = {'oid': org_id, 'sid': sf.session_id}
export_page = requests.get(export_page_url, cookies=cookie)
export_page = export_page.content.decode()

# Collect every export-file link on the page.
links = []
parsed_page = BS(export_page, 'html.parser')
_path_to_exports = "/servlet/servlet.OrgExport?fileName="
for link in parsed_page.find_all('a'):
    href = link.get('href')
    if href and href.startswith(_path_to_exports):
        links.append(href)

print(links)
if len(links) == 0:
    print("No export files found")
    exit(0)

today = datetime.today().strftime("%Y_%m_%d")
download_location = os.path.join(".", "tmp", today)
os.makedirs(download_location, exist_ok=True)

for link in links:
    file_url = instance + link
    downloadfile = requests.get(file_url, cookies=cookie, stream=True)  # stream to keep RAM usage low
    # Take the file name from the Content-Disposition header, stripping any surrounding quotes.
    filename = downloadfile.headers['Content-Disposition'].split("filename=")[1].strip('"')
    with open(os.path.join(download_location, filename), 'wb') as f:
        for chunk in downloadfile.iter_content(chunk_size=100 * 1024 * 1024):  # 100 MiB chunks
            if chunk:
                f.write(chunk)
Pillar answered 5/4, 2018 at 14:29 Comment(2)
Does this solution require the API to be enabled? We have SF Professional Edition, which doesn't come with the API enabled. I've tried your login method but I'm not getting access, which is why I'm asking. Thanks!Wiggs
This worked for me on Windows once I removed the double quotes from the filename os.path.join(download_location,downloadfile.headers["Content-Disposition"].split("filename=")[1].replace('"', ""),)Pulchritudinous
0

I recently wrote a small PHP utility that uses the Bulk API to download a copy of the sObjects you define via a JSON config file.

It's pretty basic but can easily be expanded to suit your needs.

Force.com Replicator on GitHub.

Hamza answered 15/7, 2013 at 8:44 Comment(0)
-1

I have added a feature in my app to automatically back up the weekly/monthly CSV files to an S3 bucket: https://app.salesforce-compare.com/

Create a connection provider (currently only AWS S3 is supported) and link it to a Salesforce connection (which needs to be created as well).

On the main page you can monitor the progress of the scheduled job and access the files in the bucket.

More info: https://salesforce-compare.com/release-notes/

Kwang answered 12/10, 2022 at 12:22 Comment(0)
