Language: Python
Web
SNScrape was created to provide a simple and reliable way to extract social media content without relying on official APIs, which often have rate limits or require authentication. It has become popular among data analysts, researchers, and developers for social media mining and analysis.
SNScrape is a Python library for scraping social networking services such as Twitter, Facebook, Reddit, and more. It allows fetching posts, comments, tweets, user profiles, and other social data without using official APIs.
pip install snscrape
conda install -c conda-forge snscrape
SNScrape can scrape posts, tweets, users, and hashtags from multiple social platforms. It supports filtering by date, keyword, or user, and the results can be saved as JSON, CSV, or other formats.
import snscrape.modules.twitter as sntwitter

for i, tweet in enumerate(sntwitter.TwitterUserScraper('twitter').get_items()):
    if i >= 5:  # stop after five tweets (i runs from 0 to 4)
        break
    print(tweet.date, tweet.content)

Scrapes the latest 5 tweets from a specified Twitter user and prints their date and content.
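The enumerate/break pattern above can also be expressed with itertools.islice, which stops pulling from the iterator once the limit is reached. The following is a minimal sketch: take and fake_items are hypothetical names, and fake_items only stands in for a scraper's get_items() so the example runs without network access.

```python
from itertools import islice


def take(items, limit):
    """Return at most `limit` items from any iterator, e.g. scraper.get_items()."""
    return list(islice(items, limit))


def fake_items():
    """Hypothetical stand-in for get_items(): an endless stream of items."""
    n = 0
    while True:
        yield f"item {n}"
        n += 1


latest = take(fake_items(), 5)
print(latest)
```

With a real scraper, the same helper would be called as take(sntwitter.TwitterUserScraper('twitter').get_items(), 5).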
import snscrape.modules.twitter as sntwitter

# The hashtag is passed without the leading '#'; the scraper adds it to the query.
for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('Python').get_items()):
    if i >= 5:  # stop after five tweets
        break
    print(tweet.date, tweet.content)

Fetches the latest 5 tweets containing the hashtag #Python.
import snscrape.modules.twitter as sntwitter
import pandas as pd

tweets = []
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('Python since:2025-01-01 until:2025-01-31').get_items()):
    if i >= 50:  # stop once 50 tweets have been collected
        break
    tweets.append([tweet.date, tweet.user.username, tweet.content])

# Build a DataFrame and write it to disk without the index column.
df = pd.DataFrame(tweets, columns=['Date','User','Content'])
df.to_csv('tweets.csv', index=False)

Scrapes up to 50 tweets containing 'Python' posted from 1 January 2025 up to, but not including, 31 January 2025 (Twitter's until: operator excludes the date it names), and saves them to a CSV file.
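Where pandas is not available, the same rows can be written with the standard csv module. The sketch below is illustrative only: tweet_to_row and write_rows are hypothetical helpers, the sample object is a stand-in for a scraped tweet, and the fallback from rawContent to content rests on the assumption that newer snscrape releases renamed the text attribute.

```python
import csv
import io
from types import SimpleNamespace


def tweet_to_row(tweet):
    """Turn a tweet-like object into a [date, username, text] row."""
    # Assumption: newer snscrape releases expose the text as `rawContent`,
    # older ones as `content`; try the newer name first.
    text = getattr(tweet, "rawContent", None)
    if text is None:
        text = getattr(tweet, "content", "")
    return [tweet.date, tweet.user.username, text]


def write_rows(rows, handle):
    """Write a header line followed by the rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["Date", "User", "Content"])
    writer.writerows(rows)


# Hypothetical stand-in for one scraped tweet, so the sketch runs offline.
sample = SimpleNamespace(
    date="2025-01-05",
    user=SimpleNamespace(username="example_user"),
    content="Hello, Python",
)

buffer = io.StringIO()
write_rows([tweet_to_row(sample)], buffer)
print(buffer.getvalue())
```

To write to disk instead of memory, pass a file opened with open('tweets.csv', 'w', newline='', encoding='utf-8') as the handle.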
import snscrape.modules.reddit as snreddit

for post in snreddit.RedditSearchScraper('Python').get_items():
    # Search results may include comments, which have no title; skip them.
    if hasattr(post, 'title'):
        print(post.title, post.url)

Scrapes Reddit posts containing the keyword 'Python' and prints the post titles and URLs.
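import snscrape.modules.twitter as sntwitter

for tweet in sntwitter.TwitterSearchScraper('Python since:2025-08-01 until:2025-08-21').get_items():
    print(tweet.date, tweet.content)

Scrapes tweets containing 'Python' posted from 1 August 2025 up to, but not including, 21 August 2025, because Twitter's until: operator excludes the date it names. The loop has no item limit, so it runs until the results are exhausted.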
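Date-range queries like the one above are easy to get wrong by a day. The sketch below builds the query string from Python date objects, under the assumption that until: excludes the date it names, so one day is added to include the last day. build_query is a hypothetical helper, not part of snscrape.

```python
from datetime import date, timedelta


def build_query(keyword, first_day, last_day):
    """Build a search query covering first_day through last_day inclusive."""
    # Assumption: `until:` is exclusive, so step one day past last_day.
    until = last_day + timedelta(days=1)
    return f"{keyword} since:{first_day.isoformat()} until:{until.isoformat()}"


query = build_query("Python", date(2025, 8, 1), date(2025, 8, 21))
print(query)  # Python since:2025-08-01 until:2025-08-22
```

The resulting string can be passed directly to sntwitter.TwitterSearchScraper.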
Limit the number of items to scrape to avoid excessive requests.
Use a pandas DataFrame or a CSV file to store and analyze scraped data efficiently.
Be mindful of platform usage terms to avoid scraping sensitive or restricted data.
Use the since: and until: search operators in the query to restrict scraping to the relevant time period.
Use try/except to handle exceptions and network issues.
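The tips on limiting items and handling errors can be combined in one small wrapper. This is a sketch under stated assumptions, not a definitive implementation: collect_with_retry and flaky_items are hypothetical names, the broad except Exception is a simplification (a real script would catch the specific errors it expects), and flaky_items simulates a network failure so the example runs offline.

```python
import time
from itertools import islice


def collect_with_retry(make_items, limit, attempts=3, delay=0.0):
    """Call make_items() and gather up to `limit` items, retrying on errors."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # A fresh iterator is created on every attempt.
            return list(islice(make_items(), limit))
        except Exception as error:  # broad on purpose for this sketch
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            time.sleep(delay)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Hypothetical flaky source: fails on the first call, succeeds afterwards.
calls = {"count": 0}


def flaky_items():
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated network failure")
    yield from ["a", "b", "c", "d"]


result = collect_with_retry(flaky_items, limit=3)
print(result)
```

With a real scraper, make_items would be a callable such as lambda: sntwitter.TwitterSearchScraper('Python').get_items(), and a non-zero delay gives the remote service time to recover between attempts.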