It might be useful to extract data from Salesforce directly into Python and manipulate the data as a Pandas dataframe.
First, you need to create a Connected App in Salesforce and copy the consumer key and consumer secret.

Then start your Jupyter Notebook and create a Python notebook.

Start coding your Python notebook as follows:
import requests
import json ## Authorization
consumer_key = “enter the consumer key from your connected app here”
consumer_secret = “enter the consumer secret from your connected app here”
username = “enter your Salesforce username here”
password = “enter your Salesforce password here”
headers = { ‘Content-Type’: ‘application/x-www-form-urlencoded’ }
body = “grant_type=password&client_id=” + consumer_key \
+ “&client_secret=” + consumer_secret + “&username=” \
+ username + “&password=” + password
url = “https://login.salesforce.com/services/oauth2/token”
resp = requests.post( url, headers= headers, data= body )
print( ‘Status: ‘ + str( resp.status_code ) )
print( json.dumps( resp.json(), indent= 4 ) )
print( resp.json()[ “access_token” ] )
When executing, you should get results like below:
Status: 200
{
“access_token”: “00D… really long string here …”,
“instance_url”: “https://na35.salesforce.com”,
“id”: “https://login.salesforce.com/id/00D …”,
“token_type”: “Bearer”,
“issued_at”: “1541562541782”,
“signature”: “… encoded string …”
}
00D … really long string here …
Continue adding the code below:
session_id = resp.json()[ “access_token” ]
headers = { ‘Authorization’: ‘Bearer ‘ + session_id }
instance_url = resp.json()[ “instance_url” ]
query = ‘SELECT ID, Name, BillingCity FROM Account’
# using Bulk API 1.0
url = instance_url + ‘/services/data/v36.0/query?q=’ + query
resp = requests.get( url, headers= headers )
jsonResponse = resp.json()
print( ‘Status: ‘ + str( resp.status_code ) )
print( json.dumps( jsonResponse, indent= 4 ) )
And the resulting execution should look like this:
Status: 200
{ “totalSize”: 12,
“done”: true,
“records”: [ { “attributes”: {
“type”: “Account”,
“url”: “/services/data/v36.0/sobjects/Account/001….”
},
“Id”: “001….”,
“Name”: “GenePoint”,
“BillingCity”: “Mountain View”
}, …………..
If you want to you can work with the data in JSON as is:
# first record
print( jsonResponse[ “records” ][ 0 ] )
# first record id
print( jsonResponse[ “records” ][ 0 ][ “Id” ] )
But you can convert the JSON data to a Panda dataframe, which is a lot more powerful:
import pandas as pd
from pandas.io.json import json_normalize
data = jsonResponse[ “records” ]
df = pd.DataFrame.from_dict( json_normalize( data ), orient= ‘columns’ )
df
And get a dataframe like this:

Alternatively, you can use a Python library like simple-salesforce: https://pypi.org/project/simple-salesforce/
Leave a Reply