Posting Data and Using Sessions with Requests
In the previous post, we covered downloading data using Requests, pulling and manipulating data from HTML web pages as well as APIs. This post covers posting data. We will submit data as if it came from an HTML form, and we will also post data as a JSON payload.
The examples below use httpbin.org, an HTTP request and response service. One of its nice features is that it returns the data you originally posted in the response, which makes debugging easier.
Posting Form Data
If you have not installed requests already, you can do so easily using pip.
pip install requests
Next, we can create a new Python script that imports requests and sets up a variable for our target URL. We will also use a dictionary to hold the form data we want to post.
import requests
url = 'https://httpbin.org/post'
data = {'user':'me@example.com'}
Now we’re ready to use requests to post the user data to our target URL:
import requests
url = 'https://httpbin.org/post'
data = {'user':'me@example.com'}
response = requests.post(url, data=data)
print(response) # <Response [200]>
Notice that our response variable is a Response object. To use the data it holds, we need to call one of its methods or read one of its properties. In the next example we read the text property, which gives us the entire response body as a single string:
import requests
url = 'https://httpbin.org/post'
data = {'user':'me@example.com'}
# as form data
response = requests.post(url, data=data)
print(response) # <Response [200]>
result = response.text
print(type(result)) # <class 'str'>
print(result)
Printed results below:
Since this is merely a string (httpbin.org kindly returns a pretty-printed version with all the spacing), it’s difficult to make use of this data. However, as covered in the previous post, Requests has a built-in JSON decoder that we can use:
import requests
from pprint import pprint
url = 'https://httpbin.org/post'
data = {'user':'me@example.com'}
# as form data
response = requests.post(url, data=data)
print(response) # <Response [200]>
result = response.json()
print(type(result)) # <class 'dict'>
pprint(result)
Our output to the terminal window will look similar (we used pprint to pretty-print the dictionary and keep it readable). The important difference is that we now have a dictionary that we can work with to access our data:
...
print(result['form']) # {'user': 'me@example.com'}
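To see what Requests would put on the wire for form data, we can prepare the request without sending it. This is only an inspection sketch: it assumes the same url and data as above and uses requests.Request together with its prepare method, which builds the request locally and makes no network call.

```python
import requests

url = 'https://httpbin.org/post'
data = {'user': 'me@example.com'}

# Build the request locally; nothing is sent over the network here.
prepared = requests.Request('POST', url, data=data).prepare()

# A dictionary passed as data= is URL-encoded as form fields.
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # user=me%40example.com
```

Note that the @ in the email address is percent-encoded as %40 in the request body. httpbin.org decodes it again before echoing it back under the form key.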
Posting a JSON Payload
Sending a POST request using a JSON payload is different from sending form data. Form data is sent as a series of URL-encoded key-value pairs. A JSON payload, by contrast, is sent as one single chunk of data in the request body. Here is a quick breakdown of the differences between sending form data and sending a JSON payload:
Form Data:
POST
Content-Type: application/x-www-form-urlencoded
user=me%40example.com
JSON Payload:
POST
Content-Type: application/json
{"user":"me@example.com"}
Notice that the Content-Type in the header changes as well. If we keep our Python script set up the same way but switch to a JSON payload, we need to convert our data into JSON. For this, we can use Python’s built-in json library:
import requests
import json
from pprint import pprint
url = 'https://httpbin.org/post'
data = {'user':'me@example.com'}
# as payload
response = requests.post(url, data=json.dumps(data))
result = response.json()
pprint(result)
The data we intend to post is a dictionary. By using the json.dumps function, we can convert the dictionary into a JSON-formatted string to post as a payload. Also, as we did previously, we can apply the Requests JSON decoder to convert the response body to a dictionary.
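As a quick check of what json.dumps produces and how Requests treats it, the sketch below prepares the request locally instead of sending it. It assumes the same url and data as above. The point to notice is that a plain string passed as data= is sent as-is, and Requests does not add a Content-Type header for it on its own.

```python
import json
import requests

url = 'https://httpbin.org/post'
data = {'user': 'me@example.com'}

# json.dumps turns the dictionary into a JSON-formatted string.
payload = json.dumps(data)
print(type(payload))  # <class 'str'>
print(payload)        # {"user": "me@example.com"}

# Prepare (but do not send) the request to inspect it.
prepared = requests.Request('POST', url, data=payload).prepare()
print(prepared.body == payload)            # True
print('Content-Type' in prepared.headers)  # False
```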
As noted above, a JSON payload is normally sent with the Content-Type header set to application/json. We can set up a dictionary with our custom headers and pass it along with the request:
...
url = 'https://httpbin.org/post'
headers = {
    'Content-Type': 'application/json',
    # additional headers here
}
data = {'key':'value'}
# as payload
response = requests.post(url, data=json.dumps(data), headers=headers)
# headers that were originally sent
print(response.request.headers)
...
With this added, you can verify that the header was set, because httpbin.org echoes it back in the result. Alternatively, we can read the headers that were actually sent from the response.request.headers attribute.
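As an alternative to calling json.dumps and setting the header by hand, Requests also accepts a json= keyword argument that serializes the dictionary and sets the Content-Type for us. The sketch below assumes the same url and data as the earlier examples and prepares the request locally, so nothing is sent.

```python
import json
import requests

url = 'https://httpbin.org/post'
data = {'user': 'me@example.com'}

# json= serializes the dictionary and sets the Content-Type header.
prepared = requests.Request('POST', url, json=data).prepare()

print(prepared.headers['Content-Type'])  # application/json
print(json.loads(prepared.body))         # {'user': 'me@example.com'}

# To actually send it, the equivalent call would be:
# response = requests.post(url, json=data)
```

Whether to use json= or the explicit json.dumps approach is largely a matter of taste. The explicit version makes it obvious what is being sent, while json= is shorter and harder to get wrong.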
Working with Request Sessions
Eventually, you will run into situations where you must persist a user session. Let’s say you first have to log in/authenticate, which sets a browser cookie that must be sent with each subsequent request.
Using httpbin once again, we will save a cookie and then try to retrieve it. Consider the following:
import requests
# loading this URL will set a cookie
res = requests.get('https://httpbin.org/cookies/set/abc/123')
print('res: {}'.format(res.text))
# No cookie data found!
res = requests.get('https://httpbin.org/cookies')
print('res: {}'.format(res.text))
# Outputs:
# res: {
#   "cookies": {}
# }
In the example above, each request starts from scratch, so the cookie set by the first request is not sent with the second. We can get around this by using a Session object. We will use a context manager that encloses all the HTTP requests made within our session:
import requests
with requests.Session() as s:
    res = s.get('https://httpbin.org/cookies/set/abc/123')
    print('res: {}'.format(res.text))
    res = s.get('https://httpbin.org/cookies')
    print('res: {}'.format(res.text))
# Outputs
# res: {
#   "cookies": {
#     "abc": "123"
#   }
# }
print(s.cookies) # <RequestsCookieJar[<Cookie abc=123 for httpbin.org/>]>
print('actual cookies: {}'.format(s.cookies.get_dict())) # actual cookies: {'abc': '123'}
Similar to the previous attempt, we’re setting the cookie “abc” with the value “123”. Because we use the same session to fetch our cookies, the data is returned to us. Note that httpbin simply echoes the cookie info in the response body. The actual cookies live in a RequestsCookieJar attached to the session, and we can call its get_dict method to get them as a dictionary.
A Requests session is needed when you make multiple requests and each subsequent request depends on persisted data, such as a session cookie.
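To make the difference concrete without touching the network, the sketch below compares the headers of a request prepared through a session with one prepared on its own. As an assumption for illustration only, the cookie is placed in the session's jar by hand rather than by visiting the /cookies/set/abc/123 URL.

```python
import requests

url = 'https://httpbin.org/cookies'

with requests.Session() as s:
    # Assumption for this sketch: add the cookie manually instead of
    # letting httpbin set it through /cookies/set/abc/123.
    s.cookies.set('abc', '123', domain='httpbin.org', path='/')

    # prepare_request merges the session's cookies into the request.
    with_session = s.prepare_request(requests.Request('GET', url))

# A standalone request knows nothing about the session's cookies.
without_session = requests.Request('GET', url).prepare()

print(with_session.headers.get('Cookie'))     # abc=123
print(without_session.headers.get('Cookie'))  # None
```

The session attaches the Cookie header for us on every request it prepares, which is exactly what the standalone requests.get calls were missing.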