Reusing free images and media files with Python

Wikimedia Commons is a collection of over 90,000,000 freely usable media files, many of which are used in Wikipedia articles. In this tutorial, you'll use the Wikimedia API to reuse an image, a video, and an audio file from Wikimedia Commons.

Run this sample code yourself by downloading it as a Jupyter Notebook. To get started with Jupyter Notebooks, visit PAWS.


To find media files on Commons, use the advanced search to filter by file type or extension. The Wikimedia API then lets you access a file in multiple formats, such as its preferred and original formats.

Images

Once you've selected an image, get the filename from the top of the page. For example, the filename for this image is File:The_Blue_Marble.jpg. You can use the filename to get the image via the get file endpoint.

# Python 3
# Choose a file, and call the file endpoint.
import requests

file = 'File:The_Blue_Marble.jpg'

headers = {
  # 'Authorization': 'Bearer YOUR_ACCESS_TOKEN',
  'User-Agent': 'YOUR_APP_NAME (YOUR_EMAIL_OR_CONTACT_PAGE)'
}

base_url = 'https://api.wikimedia.org/core/v1/commons/file/'
url = base_url + file
response = requests.get(url, headers=headers)
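
File names on Commons can contain spaces, apostrophes, and parentheses, so it can help to percent-encode the title before appending it to the endpoint URL. The following is a minimal sketch, assuming the file endpoint accepts a percent-encoded title; build_file_url and BASE_URL are hypothetical names introduced here for illustration, not part of the API.

```python
from urllib.parse import quote

# Same endpoint as base_url in the tutorial code.
BASE_URL = 'https://api.wikimedia.org/core/v1/commons/file/'


def build_file_url(title, base_url=BASE_URL):
    """Return the file endpoint URL for a Commons file title (hypothetical helper)."""
    # safe='' also encodes ':' and '/', so the title stays a single path segment.
    return base_url + quote(title, safe='')


print(build_file_url('File:The_Blue_Marble.jpg'))
print(build_file_url("File:Cheetahs on the Edge (Director's Cut).ogv"))
```

The resulting string can be passed to requests.get in place of url.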

Once you've made the request, you can extract the URL and dimensions of the image's preferred format from the JSON response. The file description page shows that File:The_Blue_Marble.jpg is in the public domain. To provide attribution for the image, name the license and link back to the file page. For example:

The Blue Marble.jpg
by NASA | Public domain

# Get the file's title, preferred size, and URL.
import json

response = json.loads(response.text)

display_title = response['title']
attribution_url = 'https:' + response['file_description_url']
preferred_file_url = response['preferred']['url']
preferred_width = response['preferred']['width']
preferred_height = response['preferred']['height']
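
The attribution line shown earlier can be assembled from these values. The sketch below is one possible way to do it, not a required format: attribution_html is a hypothetical helper, the author and license are typed in by hand after reading the file page, and the literal title and URL are sample values standing in for display_title and attribution_url.

```python
from html import escape


def attribution_html(title, page_url, author, license_name):
    """Build a one-line HTML credit that links back to the file page (hypothetical helper)."""
    link = '<a href="{}">{}</a>'.format(escape(page_url, quote=True), escape(title))
    return '{} by {} | {}'.format(link, escape(author), escape(license_name))


# Sample values; in the tutorial code these would be display_title and attribution_url.
credit = attribution_html(
    title='The Blue Marble.jpg',
    page_url='https://commons.wikimedia.org/wiki/File:The_Blue_Marble.jpg',
    author='NASA',                 # assumption: read from the file page
    license_name='Public domain',  # assumption: read from the file page
)
print(credit)
```

Escaping the values keeps titles that contain characters such as & or quotes from breaking the surrounding HTML.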

Video

You can access videos from Wikimedia Commons in the same way. For example, to access File:Cheetahs on the Edge (Director's Cut).ogv, assign the new filename to the file variable and call the file endpoint again. The file page shows that this video is licensed under CC BY-SA 3.0. To provide correct attribution under this license, credit the author and include a link to the license:

Cheetahs on the Edge (Director's Cut).ogv
by Gregory Wilson | CC BY-SA 3.0

# Choose a video, and call the file endpoint.
file = "File:Cheetahs on the Edge (Director's Cut).ogv"

url = base_url + file
response = requests.get(url, headers=headers)

# Get the file's title, file page URL, and preferred file URL.
response = json.loads(response.text)

display_title = response['title']
attribution_url = 'https:' + response['file_description_url']
file_url = response['preferred']['url']
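
To reuse the video on a web page, the file URL and the credit can be combined into a single HTML snippet. This is a rough sketch under stated assumptions: video_figure_html is a hypothetical helper, the example.org URLs are placeholders for the file_url and attribution_url values from the response, and the author and license are entered manually from the file page.

```python
from html import escape

# Canonical deed URL for the CC BY-SA 3.0 license.
CC_BY_SA_3_URL = 'https://creativecommons.org/licenses/by-sa/3.0/'


def video_figure_html(file_url, title, page_url, author, license_name, license_url):
    """Return an HTML <figure> with a video player and a linked credit (hypothetical helper)."""
    def attr(value):
        # Escape a value for use inside a double-quoted HTML attribute.
        return escape(value, quote=True)

    template = (
        '<figure>'
        '<video controls src="{src}"></video>'
        '<figcaption><a href="{page}">{title}</a> by {author} | '
        '<a href="{lic_url}">{lic}</a></figcaption>'
        '</figure>'
    )
    return template.format(
        src=attr(file_url), page=attr(page_url), title=escape(title),
        author=escape(author), lic_url=attr(license_url), lic=escape(license_name),
    )


# Placeholder URLs; in the tutorial code these would be file_url and attribution_url.
snippet = video_figure_html(
    file_url='https://example.org/cheetahs.ogv',
    title="Cheetahs on the Edge (Director's Cut).ogv",
    page_url='https://example.org/file-page',
    author='Gregory Wilson',
    license_name='CC BY-SA 3.0',
    license_url=CC_BY_SA_3_URL,
)
print(snippet)
```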

Audio

For audio files, use the original format to access the file URL. Here's an example using File:Wikipedia-Morse.ogg. To provide attribution for this file under CC BY-SA 3.0, credit the author and include a link to the license:

Wikipedia-Morse.ogg
by Horsten | CC BY-SA 3.0

# Choose an audio file, and call the file endpoint.
file = 'File:Wikipedia-Morse.ogg'

url = base_url + file
response = requests.get(url, headers=headers)

# Get the file's title, file page URL, and original file URL.
response = json.loads(response.text)

display_title = response['title']
attribution_url = 'https:' + response['file_description_url']
file_url = response['original']['url']
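
The three examples differ only in which format they read the URL from: preferred for images and video, original for audio. A small lookup can capture that rule. This is a sketch only: pick_file_url and FORMAT_BY_KIND are hypothetical names, and the dictionary passed in below is made-up sample data shaped like the fields this tutorial reads, not a real API response.

```python
# Which response key to read for each kind of media, following this tutorial.
FORMAT_BY_KIND = {
    'image': 'preferred',
    'video': 'preferred',
    'audio': 'original',
}


def pick_file_url(file_info, kind):
    """Return the file URL for the given media kind (hypothetical helper)."""
    return file_info[FORMAT_BY_KIND[kind]]['url']


# Made-up sample data for illustration.
sample = {
    'preferred': {'url': 'https://example.org/preferred'},
    'original': {'url': 'https://example.org/original'},
}
print(pick_file_url(sample, 'audio'))
```

Passing an unknown kind raises a KeyError, which makes unsupported media types fail loudly instead of silently picking a format.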

To learn more about free images and media files, read the documentation on Wikimedia Commons.