SNScrape

Language: Python

Web

SNScrape was created to provide a simple and reliable way to extract social media content without relying on official APIs, which often have rate limits or require authentication. It has become popular among data analysts, researchers, and developers for social media mining and analysis.

SNScrape is a Python library for scraping social networking services such as Twitter, Facebook, Reddit, and more. It allows fetching posts, comments, tweets, user profiles, and other social data without using official APIs.

Installation

pip: pip install snscrape
conda: conda install -c conda-forge snscrape

Usage

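SNScrape can scrape posts, tweets, users, and hashtags from multiple social platforms. Twitter searches can be narrowed by keyword, user, or date using search operators inside the query string. The command-line tool can emit results as JSON lines; other formats such as CSV are produced by post-processing the items in Python, for example with pandas.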
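As a rough illustration of line-delimited JSON output, the sketch below serializes one record per line using only the standard library. The Post dataclass and the to_json_line helper are hypothetical stand-ins introduced here so the example runs without network access; they are not part of SNScrape.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class Post:
    """Hypothetical stand-in for a scraped item (not an SNScrape class)."""
    date: datetime
    username: str
    content: str


def to_json_line(post: Post) -> str:
    """Serialize one post as a single JSON line; datetimes become ISO 8601 strings."""
    return json.dumps(asdict(post), default=lambda v: v.isoformat(), ensure_ascii=False)


post = Post(datetime(2025, 1, 15, tzinfo=timezone.utc), "example_user", "Hello, Python")
line = to_json_line(post)
print(line)
```

Writing one such line per item keeps the output appendable and easy to stream back in later.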

Scraping tweets from a user

import snscrape.modules.twitter as sntwitter

for i, tweet in enumerate(sntwitter.TwitterUserScraper('twitter').get_items()):
    if i >= 5:  # stop once five tweets have been printed
        break
    print(tweet.date, tweet.content)

Scrapes the latest 5 tweets from a specified Twitter user and prints their date and content.
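The counter-and-break pattern can also be written with itertools.islice, which stops pulling from a generator after a fixed number of items. This is a minimal sketch: take and fake_items are hypothetical names, and fake_items stands in for a scraper's get_items() generator so the code runs offline.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def take(items: Iterable[T], limit: int) -> Iterator[T]:
    """Yield at most `limit` items, then stop pulling from the source."""
    return islice(items, limit)


def fake_items() -> Iterator[int]:
    """Endless stand-in generator, playing the role of a long timeline."""
    n = 0
    while True:
        yield n
        n += 1


first_five = list(take(fake_items(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

With SNScrape installed, the same helper would presumably wrap the generator directly, e.g. take(sntwitter.TwitterUserScraper('twitter').get_items(), 5).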

Scraping tweets by hashtag

import snscrape.modules.twitter as sntwitter

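# Pass the hashtag without the leading '#'; the scraper adds it to the query.
for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('Python').get_items()):
    if i >= 5:  # stop once five tweets have been printed
        break
    print(tweet.date, tweet.content)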

Fetches the latest 5 tweets containing the hashtag #Python.

Saving tweets to CSV

import snscrape.modules.twitter as sntwitter
import pandas as pd

tweets = []
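# until: excludes the named day, so 2025-02-01 is needed to include 31 January.
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('Python since:2025-01-01 until:2025-02-01').get_items()):
    if i >= 50:  # collect at most 50 tweets
        break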
    tweets.append([tweet.date, tweet.user.username, tweet.content])
df = pd.DataFrame(tweets, columns=['Date','User','Content'])
df.to_csv('tweets.csv', index=False)

Scrapes up to 50 tweets containing 'Python' in January 2025 and saves them to a CSV file.
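Where pandas is not available, the same three columns can be written with the standard csv module. The sketch below is an assumption-laden alternative rather than anything SNScrape provides: tweets_to_csv is a hypothetical helper, and the sample object only mimics the date, user.username, and content attributes used in the example above.

```python
import csv
import io
from datetime import datetime, timezone
from types import SimpleNamespace


def tweets_to_csv(tweets, out) -> int:
    """Write Date, User, Content rows to the file-like `out`; return the row count."""
    writer = csv.writer(out)
    writer.writerow(["Date", "User", "Content"])
    count = 0
    for tweet in tweets:
        writer.writerow([tweet.date.isoformat(), tweet.user.username, tweet.content])
        count += 1
    return count


# Hypothetical stand-in that mimics the three attributes read above.
sample = [
    SimpleNamespace(
        date=datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc),
        user=SimpleNamespace(username="example_user"),
        content="Learning Python, one line at a time",
    )
]

buffer = io.StringIO()
written = tweets_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open('tweets.csv', 'w', newline='', encoding='utf-8'); the csv module quotes fields containing commas automatically.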

Scraping Reddit posts

import snscrape.modules.reddit as snreddit

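for post in snreddit.RedditSearchScraper('Python').get_items():
    # Search results mix submissions and comments; only submissions have a title.
    if isinstance(post, snreddit.Submission):
        print(post.title, post.url)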

Scrapes Reddit posts containing the keyword 'Python' and prints the post titles and URLs.

Filtering by date range

import snscrape.modules.twitter as sntwitter

for tweet in sntwitter.TwitterSearchScraper('Python since:2025-08-01 until:2025-08-21').get_items():
    print(tweet.date, tweet.content)

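Scrapes tweets containing 'Python' posted from 1 August 2025 up to, but not including, 21 August 2025. The until: operator excludes the day it names, so use until:2025-08-22 to include the 21st.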
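Date-range queries can be assembled from Python date objects rather than typed by hand. The following is a minimal sketch, assuming that until: excludes the day it names (as Twitter's search operator is generally understood to behave); build_query is a hypothetical helper, not an SNScrape function.

```python
from datetime import date, timedelta


def build_query(keyword: str, first_day: date, last_day: date) -> str:
    """Return a search string covering first_day through last_day inclusive."""
    if last_day < first_day:
        raise ValueError("last_day must not be earlier than first_day")
    # Assumption: until: is exclusive, so step one day past the last day wanted.
    until = last_day + timedelta(days=1)
    return f"{keyword} since:{first_day.isoformat()} until:{until.isoformat()}"


query = build_query("Python", date(2025, 8, 1), date(2025, 8, 21))
print(query)  # Python since:2025-08-01 until:2025-08-22
```

The resulting string can be passed to sntwitter.TwitterSearchScraper in place of a hand-written query.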

Error Handling

HTTPError / ConnectionError: Check network connectivity or retry after a short delay.
ValueError: invalid query: Ensure the search query syntax matches the scraper's supported format.
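The "retry after a short delay" advice can be sketched as a small wrapper with exponential backoff. The retry helper and the flaky function below are hypothetical, and the built-in ConnectionError is only a placeholder: which exception types a given SNScrape version actually raises is an assumption to check against the installed release.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Simulated scraper call that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = retry(flaky, attempts=4, base_delay=0.5, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Because get_items() returns a generator, a retried call starts the scrape from the beginning, so it is usually simplest to wrap the whole collection step in func.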

Best Practices

Limit the number of items to scrape to avoid excessive requests.

Use a pandas DataFrame or a CSV file to store and analyze scraped data efficiently.

Respect each platform's terms of use, and avoid scraping sensitive or restricted data.

Use the since: and until: search operators to restrict scraping to the relevant time period.

Use try/except to handle exceptions and network issues.