AWS-SalesForce

Python 3 script for AWS lambda, cloud9 and parquet to run the Sales Forces Query.

Sales Force API

This README file explains how the script works.The API_SalesForce.py script contains the API with queries and functions.The aws_sf_lambda.py script contains the API for a AWS Lambda.The reqs.txt file is needed for the installation.

  • Information
  • Requirements
  • Configuration
  • Functions
  • Logic

Information

This script permits to connect and run queries to the SOQL(Salesforce Object Query Language). It uses the JWT (Java Web Token) andthe Simple SalesForce python’s libraries for the Authentication.It is flexible and modular, easy to add new functionality or modify thecurrent.This version runs queries and saves as a DataFrame the results to an AWSS3 Bucket in CSV format.

KnowlodgeURL
SOQLhttps://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_soql_sosl_intro.htm
SimpleSalesForcehttps://pypi.org/project/simple-salesforce/
JWT with Pythonhttps://dev.to/apcelent/json-web-token-tutorial-with-example-in-python-23kb
DATEFORMAT SOQLhttps://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_soql_select_dateformats.htm

Requirements

  • Python3
  • Libraries and modules: datetime, time, json, jwt, pandas, requests, config, io, s3fs, simple_salesforce.
  • The Private Key (PEM) and the Customer Key from your SF Admin.
  • Info how the SOQL and the JWT are working.

You need also:

  • A S3 bucket to save files from SalesForce.
  • A EMR or EC2 with enough RAM (for the big dump).
  • The rights permissions to get access in SF or save files to S3.
  • The SalesForce Objects Schema from your SF Admin.

Configuration

Before running the script you may install the dependencies and change config in the script.

$ pip3 install -r reqs.txt 

Once you opened the script, you have to change the below section.

### Define Accountsbucket = 'INSERT_BUCKET_NAME'folder = 'salesforce_output/'consumer_id = 'INSERT_THE_CONSUMER_KEY'username = 'INSERT_THE_USERNAME' 
### READ THE PEM FILE FROM SECRET MANAGERSdef get_secret():secret_name = ""region_name = ""

If you have a query do add or you need to modify one, you need to change this section.

### Define the Queries Dictdef query_full_list():query_list = dict({ 

If you want to skip the SF Query_More, you have to modify this section in runit()

`runit('WHEN', 'QTYPE', '')` (Query object in ALL/Query_All or if it is too big use MORE/Query_More) 

Functions

The section explains how the script’s functions are working.

  • The def get_secret() It calls the PEM key from SECRET MANAGER
  • The def jwt_login() It’s the real API Call to get the SESSIONID and Authenticate in SF.
  • The def query_full_list() There is the LIST (Dict) ofthe queries to run. You can modify or add and remove them as yourpreferences. Remember, they are in SOQL format.
  • The def query_custom_list(when,N) There are differentoptions in SOQL and SF permits to retrieve objects in a different periodlike YESTERDAY, THIS_MONTH, LAST_N_DAYS, or DELETE.
  • The def query_loop(query_list, q_type) It’s the query loop, it define the type of query to run.
  • The def queryMORE(sf,filename,start,query) It’s thequery_more, it gets 2000 rows for each call as sf.query + nextRecordsUrl+ sf.query_more method. The function also filters and creates adataframe with panda.
  • The def queryALL(sf,filename,start,query) It’s the query_all, it gets all rows in one call. The function also filters and creates a dataframe with panda.
  • The def now() Simple DATE/TIME function
  • The def folder_name() Makes the partition folders based of current data.(YEAR/MONTH/DATE)
  • The def saveS3(filename,df) Save to the S3 bucket.
  • The def end_msg(df, filename, start) It shows the Total query time.
  • The def runit(when,"ALL/MORE",'N'): It’s the lastfunction that calls the in seq the previous functions. Specify if youwant a Query_more or a Query_All. Specify the N value if your run theLAST_N_DAYS query.

Logic

If you want to run the script you have to call the runit() function:

  • Check the WHEN period var in runit() function and the Qtype var if you want run the QUERY_ALL (ALL) or the QUERY_more (MORE).
  • Based of what queries you need, uncomment the function runit(”) to run.
  • runit('FULL', 'ALL' ,'') For a full Dump and MORE = SF.query_more, ALL = SF.query_all
  • runit('YESTERDAY', 'ALL','') Starts 00:00:00 the day before and continues for 24 hours. ORE = SF.query_more, ALL = SF.query_all
  • runit('TODAY', 'ALL','') Starts 00:00:00 of the current day and continues for 24 hours. MORE = SF.query_more, ALL = SF.query_all
  • runit('THIS_MONTH', 'ALL','') Starts 00:00:00 on thefirst day of the month that the current day is in and continues for allthe days of that month. MORE = SF.query_more, ALL = SF.query_all
  • runit('LAST_MONTH', 'ALL','') Starts 00:00:00 on thefirst day of the month before the current day and continues for all thedays of that month. MORE = SF.query_more, ALL = SF.query_all
  • runit('DELETED', 'ALL','') Objects deleted (but not available for all queries) MORE = SF.query_more, ALL = SF.query_all
  • runit('LAST_N_DAYS', 'ALL','N') Objects from the last N days, You may change the N value. MORE = SF.query_more, ALL = SF.query_all
  • runit('LAST_WEEK', 'ALL','') Starts 00:00:00 on thefirst day of the week before the most recent first day of the week andcontinues for seven full days. Your locale determines the first day ofthe week. MORE = SF.query_more, ALL = SF.query_all
  • runit('THIS_WEEK', 'ALL','') Starts 00:00:00 on themost recent first day of the week before the current day and continuesfor seven full days. Your locale determines the first day of the week.MORE = SF.query_more, ALL = SF.query_all

{SF-LOGIN} —|-> {SECRET} —|-> {SESSIONID} —|-> {QUERY_more 2000 ROWS IN A CYCLE} / {QUERY_all a full dump without a cycle} —

  • The def get_secret() It calls the PEM key from SECRET MANAGER
  • The def jwt_login() It’s the real API Call to get the SESSIONID and Authenticate in SF.
  • The def query_full_list() There is the LIST (Dict) ofthe queries to run. You can modify or add and remove them as yourpreferences. Remember, they are in SOQL format.
  • The def query_custom_list(when,N) There are differentoptions in SOQL and SF permits to retrieve objects in a different periodlike YESTERDAY, THIS_MONTH, LAST_N_DAYS, or DELETE.
  • The def query_loop(query_list, q_type) It’s the query loop, it define the type of query to run.
  • The def queryMORE(sf,filename,start,query) It’s thequery_more, it gets 2000 rows for each call as sf.query + nextRecordsUrl+ sf.query_more method. The function also filters and creates adataframe with panda.
  • The def queryALL(sf,filename,start,query) It’s the query_all, it gets all rows in one call. The function also filters and creates a dataframe with panda.
  • The def now() Simple DATE/TIME function
  • The def saveS3(filename,df) Save to the S3 bucket.
  • The def end_msg(df, filename, start) It shows the Total query time.
  • The def runit(when,"ALL/MORE",'N'): It’s the lastfunction that calls the in seq the previous functions. Specify if youwant a Query_more or a Query_All. Specify the N value if your run theLAST_N_DAYS query.

Follow us

We will keep you updated