Social Media Scraper
A skill that pulls all data from social media posts, analyzes them, and performs transcription + visual analysis for video/audio content.
First-Time Setup
When this skill is invoked for the first time, or when the user says "install", "configure", or "setup", run setup mode first.
Setup Detection
Configuration file: ~/.social-media-scraper.env. If the file does not exist or the user requests reinstallation, ask the interactive questions below.
test -f ~/.social-media-scraper.env || echo "First-time setup required"
Setup Questions
Ask the questions one by one, letting the user answer each. Based on the answers, create the ~/.social-media-scraper.env file and install only the tools for the selected platforms.
Question 1 — Platforms
"Which platforms do you want to use? Instagram, TikTok, Twitter/X, YouTube — pick all of them or only the ones you want. (default: all)"
Normalize the answer as a comma-separated lowercase list: instagram,tiktok,twitter,youtube.
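The normalization step could look roughly like the sketch below. The function name normalize_platforms and the alias table are hypothetical and not part of the skill; the only behavior taken from the text above is the lowercase, comma-separated output and the "all" default.

```python
import re

# Canonical platform names, in the order used by the PLATFORMS setting.
KNOWN_PLATFORMS = ("instagram", "tiktok", "twitter", "youtube")

# Hypothetical aliases a user might type; extend as needed.
ALIASES = {"x": "twitter", "ig": "instagram", "yt": "youtube"}


def normalize_platforms(answer: str) -> str:
    """Turn a free-form answer into a comma-separated lowercase list."""
    # Split on anything that is not a letter, so "Twitter/X" yields two tokens.
    tokens = re.split(r"[^a-z]+", answer.lower())
    selected = {ALIASES.get(token, token) for token in tokens}
    chosen = [name for name in KNOWN_PLATFORMS if name in selected]
    # An empty or unrecognized answer (including "all") falls back to the default.
    return ",".join(chosen or KNOWN_PLATFORMS)
```

For example, normalize_platforms("Twitter/X and YouTube") gives twitter,youtube, and an empty answer gives all four platforms.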
Question 2 — Gemini visual analysis
"Should Gemini visual video analysis be enabled? It reads on-screen text, products, and scenes, providing visual context that Whisper cannot capture from the audio. Enabling it requires a free Gemini API key (https://aistudio.google.com/apikey). Enable it? (default: yes)"
If yes: ask "Paste your Gemini API key:". Write the key into ~/.social-media-scraper.env, but never print it to the terminal, commit it, or log it. Protect the file with chmod 600.
If no: mark GEMINI_ENABLED=false, do not install google-genai.
Question 3 — Transcription language
"What should the default transcription language be? Auto-detect (recommended — 99 languages), Turkish, English, or another ISO 639-1 language code (e.g. fr, de, es). (default: auto)"
Store the value as auto, tr, en or an ISO code.
Configuration File Format
~/.social-media-scraper.env:
# Active platforms (comma-separated)
PLATFORMS=instagram,tiktok,twitter,youtube
# Gemini video analysis
GEMINI_ENABLED=true
GEMINI_API_KEY=AIza...
# Transcription
TRANSCRIPTION_LANG=auto
WHISPER_MODEL=medium
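One possible way to write this file from the setup answers is sketched below. The helper write_config and its parameters are hypothetical; the keys and the 600 permission come from the text above. The file is created with restrictive permissions from the start, so the key is not briefly readable by other users.

```python
import os
from pathlib import Path


def write_config(path, platforms, gemini_enabled, gemini_api_key="",
                 lang="auto", whisper_model="medium"):
    """Write the .env-style config file and restrict it to the owner."""
    lines = [
        "# Active platforms (comma-separated)",
        f"PLATFORMS={platforms}",
        "# Gemini video analysis",
        f"GEMINI_ENABLED={'true' if gemini_enabled else 'false'}",
    ]
    if gemini_enabled:
        lines.append(f"GEMINI_API_KEY={gemini_api_key}")
    lines += [
        "# Transcription",
        f"TRANSCRIPTION_LANG={lang}",
        f"WHISPER_MODEL={whisper_model}",
    ]
    target = Path(path).expanduser()
    # Create the file with mode 0600 instead of tightening it afterwards.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    # Covers the case where the file already existed with looser permissions.
    os.chmod(target, 0o600)
    return target
```

Note that the function never prints the key; callers should avoid echoing it as well.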
Installation Steps (based on answers)
Install only the tools for the selected platforms:
# Always required
pip install faster-whisper --break-system-packages
# ffmpeg: macOS → brew install ffmpeg | Linux → apt install ffmpeg
# Only if selected
# instagram:
pip install instaloader --break-system-packages
# tiktok or youtube:
pip install yt-dlp --break-system-packages
# twitter:
npm install -g @steipete/bird
# if gemini enabled:
pip install google-genai --break-system-packages
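The selection logic above can be expressed as a small function that returns the commands to run. This is only a sketch: install_commands is a hypothetical helper, and it returns the command strings listed above rather than executing them (ffmpeg is left out because its install command depends on the operating system).

```python
def install_commands(platforms: str, gemini_enabled: bool) -> list[str]:
    """Return the install commands for a PLATFORMS value such as 'instagram,youtube'."""
    selected = {name.strip() for name in platforms.split(",") if name.strip()}
    # faster-whisper is always required.
    commands = ["pip install faster-whisper --break-system-packages"]
    if "instagram" in selected:
        commands.append("pip install instaloader --break-system-packages")
    if selected & {"tiktok", "youtube"}:
        commands.append("pip install yt-dlp --break-system-packages")
    if "twitter" in selected:
        commands.append("npm install -g @steipete/bird")
    if gemini_enabled:
        commands.append("pip install google-genai --break-system-packages")
    return commands
```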
After Setup
When setup is complete, give the user a short summary (which platforms are active, whether Gemini is enabled, what language) and ask for a sample link to test.
Reading the Current Configuration
When the skill is running, read the ~/.social-media-scraper.env file and behave according to the selections. E.g. if GEMINI_ENABLED=false, skip the visual analysis step. If TRANSCRIPTION_LANG=tr, pass language="tr" to Whisper.
set -a
source ~/.social-media-scraper.env 2>/dev/null
set +a
If a link from a platform not in the PLATFORMS list arrives, ask the user "This platform is not installed, would you like to add it?".
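When the skill runs Python rather than shell, the same file can be read without sourcing it. The sketch below is one way to do that; parse_env_text, load_config, and platform_enabled are hypothetical names, and the parser only handles the simple KEY=value lines shown in the configuration format.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        config[key.strip()] = value.strip().strip("\"'")
    return config


def load_config(path="~/.social-media-scraper.env") -> dict:
    """Return the parsed config, or an empty dict if setup has not run yet."""
    target = Path(path).expanduser()
    if not target.exists():
        return {}
    return parse_env_text(target.read_text(encoding="utf-8"))


def platform_enabled(config: dict, platform: str) -> bool:
    """Check whether a platform appears in the PLATFORMS list."""
    enabled = [name.strip() for name in config.get("PLATFORMS", "").split(",")]
    return platform in enabled
```

If platform_enabled returns False for a detected platform, that is the point at which to ask the user whether to add it.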
General Flow
- The user shares a social media link
- The platform is auto-detected (from the URL)
- All data is fetched with the appropriate tool for the platform
- If video/audio content exists: download → extract audio with ffmpeg → transcribe with faster-whisper → analyze visuals/screen with Gemini → delete temporary files
- Transcription + visual analysis are merged into a full understanding
- Present results to the user in a clean and readable way
Platform Detection
Detect the platform by looking at the URL:
- instagram.com or instagr.am → Instagram
- tiktok.com → TikTok
- x.com or twitter.com → Twitter/X
- youtube.com or youtu.be → YouTube
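The domain mapping above can be sketched as a lookup on the URL's hostname. The function name detect_platform is hypothetical. Matching on the parsed hostname, rather than on a substring of the URL, avoids false positives such as netflix.com being mistaken for x.com.

```python
from urllib.parse import urlparse

# Domain → platform name, taken from the list above.
PLATFORM_DOMAINS = {
    "instagram.com": "instagram",
    "instagr.am": "instagram",
    "tiktok.com": "tiktok",
    "x.com": "twitter",
    "twitter.com": "twitter",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
}


def detect_platform(url: str):
    """Return the platform name for a URL, or None if it is not recognized."""
    if "://" not in url:
        url = "https://" + url  # tolerate links pasted without a scheme
    host = (urlparse(url).hostname or "").lower()
    for domain, platform in PLATFORM_DOMAINS.items():
        # Accept the bare domain and any subdomain (www., m., vm., ...).
        if host == domain or host.endswith("." + domain):
            return platform
    return None
```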
Per-Platform Tools
Twitter/X
Priority order:
- bird CLI (npm package: @steipete/bird): most comprehensive; returns the tweet, reply thread, and media info
- Jina Reader (curl -s "https://r.jina.ai/TWEET_URL"): fallback method
- Reading via browser: last resort
bird CLI usage:
bird --urls "TWEET_URL"
If bird is not installed: npm install -g @steipete/bird
bird may need Chrome cookies to work; it detects them automatically.
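The priority order for Twitter/X amounts to trying each method in turn and keeping the first one that returns content. A minimal sketch follows; fetch_with_bird, fetch_with_jina, and first_successful are hypothetical names, the bird invocation is the one shown above, and the browser fallback is left out because it is not scriptable in the same way.

```python
import subprocess
import urllib.request


def fetch_with_bird(url: str) -> str:
    """Run the bird CLI; raises if bird is missing or exits with an error."""
    result = subprocess.run(["bird", "--urls", url], capture_output=True,
                            text=True, timeout=60, check=True)
    return result.stdout


def fetch_with_jina(url: str) -> str:
    """Fetch the page text through Jina Reader."""
    with urllib.request.urlopen("https://r.jina.ai/" + url, timeout=60) as response:
        return response.read().decode("utf-8", errors="replace")


def first_successful(url, fetchers):
    """Try (name, function) pairs in order; return (name, output) of the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            output = fetch(url)
        except Exception as exc:  # missing tool, network error, non-zero exit, ...
            errors.append(f"{name}: {exc}")
            continue
        if output.strip():
            return name, output
        errors.append(f"{name}: empty output")
    raise RuntimeError("all methods failed: " + "; ".join(errors))
```

A call such as first_successful(tweet_url, [("bird", fetch_with_bird), ("jina", fetch_with_jina)]) follows the order given above.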
Instagram
Priority order:
- instaloader (pip package): reel, post, and story download plus metadata
- instagrapi (pip package): more comprehensive API, may require login
- yt-dlp: fallback video download
instaloader usage:
pip install instaloader --break-system-packages
instaloader -- -SHORTCODE
The shortcode is extracted from the URL: instagram.com/reel/SHORTCODE/ or instagram.com/p/SHORTCODE/
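Extracting the shortcode could be done with a small regular expression over the URL path. This sketch covers only the two URL shapes named above (/reel/ and /p/); extract_shortcode is a hypothetical helper, and the character set allowed in a shortcode is an assumption.

```python
import re
from urllib.parse import urlparse

# Matches /reel/SHORTCODE or /p/SHORTCODE; the allowed characters are an assumption.
_SHORTCODE_PATH = re.compile(r"/(?:reel|p)/([A-Za-z0-9_-]+)")


def extract_shortcode(url: str):
    """Return the shortcode from an Instagram post or reel URL, or None."""
    # Searching only the path ignores query strings such as ?igsh=...
    match = _SHORTCODE_PATH.search(urlparse(url).path)
    return match.group(1) if match else None
```

The result can be passed to instaloader.Post.from_shortcode or to the instaloader command line.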
Fetching metadata (Python):
import instaloader
L = instaloader.Instaloader()
post = instaloader.Post.from_shortcode(L.context, "SHORTCODE")
print(f"Caption: {post.caption}")
print(f"Likes: {post.likes}")
print(f"Comment count: {post.comments}")
print(f"Date: {post.date}")
print(f"Hashtags: {post.caption_hashtags}")
TikTok
Priority order:
- Download video + metadata with yt-dlp
- Fetch page content with Jina Reader
yt-dlp usage:
yt-dlp --write-info-json --write-comments -o "downloads/%(id)s.%(ext)s" "TIKTOK_URL"
If yt-dlp is not installed: pip install yt-dlp --break-system-packages
Cookies may be required for TikTok: yt-dlp --cookies-from-browser chrome "URL"
YouTube
Metadata + comments with yt-dlp (--skip-download fetches them without downloading the video):
yt-dlp --write-info-json --write-comments --skip-download -o "downloads/%(id)s.%(ext)s" "YOUTUBE_URL"
If video transcription is needed (when no captions are available):
yt-dlp -f "bestaudio" -o "downloads/audio.%(ext)s" "YOUTUBE_URL"
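To decide whether the audio download is needed, the .info.json file written by the metadata command can be checked for caption tracks. The sketch below assumes the info JSON exposes the subtitles and automatic_captions keys and that a live_chat entry, if present, does not count as captions; has_captions and needs_transcription are hypothetical names.

```python
import json
from pathlib import Path


def has_captions(info: dict) -> bool:
    """True if the yt-dlp info dict lists manual or automatic caption tracks."""
    tracks = {}
    for key in ("subtitles", "automatic_captions"):
        tracks.update(info.get(key) or {})
    # Assumption: a live chat replay listed under subtitles is not a caption track.
    tracks.pop("live_chat", None)
    return bool(tracks)


def needs_transcription(info_json_path) -> bool:
    """Read downloads/<id>.info.json and report whether Whisper is required."""
    info = json.loads(Path(info_json_path).read_text(encoding="utf-8"))
    return not has_captions(info)
```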
Transcription (For Video/Audio Content)
Automatically transcribe every post containing video or audio. The user does not need to ask separately.
Steps:
- Download the video (with the appropriate tool per platform)
- Extract audio with ffmpeg:
ffmpeg -i video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
- Transcribe with faster-whisper:
from faster_whisper import WhisperModel
model = WhisperModel("medium", compute_type="int8")
segments, info = model.transcribe("audio.wav")
print(f"Detected language: {info.language} ({info.language_probability:.0%})")
for segment in segments:
    print(f"[{segment.start:.1f}s → {segment.end:.1f}s] {segment.text}")
- Delete temporary video and audio files (to save space)
If faster-whisper is not installed: pip install faster-whisper --break-system-packages
If ffmpeg is not installed: brew install ffmpeg (macOS) or apt install ffmpeg (Linux)
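The steps above can be tied together roughly as follows, honoring TRANSCRIPTION_LANG and WHISPER_MODEL from the configuration. This is a sketch under assumptions: whisper_language, ffmpeg_extract_command, and transcribe are hypothetical helpers, the -y flag (overwrite without asking) is an addition to the ffmpeg command shown above, and the temporary directory takes care of deleting the extracted audio.

```python
import os
import subprocess
import tempfile


def whisper_language(setting):
    """Map the config value to faster-whisper's language argument (None = auto-detect)."""
    value = (setting or "auto").strip().lower()
    return None if value == "auto" else value


def ffmpeg_extract_command(video_path, audio_path):
    """Build the ffmpeg call for 16 kHz mono PCM audio."""
    return ["ffmpeg", "-y", "-i", video_path, "-vn",
            "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", audio_path]


def transcribe(video_path, lang_setting="auto", model_name="medium"):
    """Return (detected_language, [(start, end, text), ...]) for a video file."""
    from faster_whisper import WhisperModel  # imported lazily; optional dependency

    with tempfile.TemporaryDirectory() as tmp:
        audio = os.path.join(tmp, "audio.wav")
        subprocess.run(ffmpeg_extract_command(video_path, audio),
                       check=True, capture_output=True)
        model = WhisperModel(model_name, compute_type="int8")
        segments, info = model.transcribe(audio, language=whisper_language(lang_setting))
        # segments is a lazy generator, so consume it before the audio is deleted.
        lines = [(seg.start, seg.end, seg.text.strip()) for seg in segments]
    return info.language, lines
```

The downloaded video itself still has to be removed by the caller once visual analysis is finished.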
Video Analysis (Gemini Vision)
For posts containing video, transcribing the audio alone is not enough. On-screen text, displayed products, interfaces, logos, gestures, and scene transitions are all part of the meaning. Whisper only transcribes speech; use Gemini to analyze what appears on screen.
Flow
- Download the video (yt-dlp / instaloader / etc.)
- Transcribe audio with Whisper (section above)
- Upload the video with Gemini File API, analyze it
- Merge the two sources and present to the user
- Delete temporary files
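A minimal sketch of the upload, analyze, and merge steps is shown below, assuming the google-genai SDK's Files API (client.files.upload, client.files.get, client.files.delete) and client.models.generate_content. The model name, the polling interval, and the helper names analyze_video_with_gemini and merge_report are assumptions for illustration, not something this skill prescribes.

```python
import os
import time


def analyze_video_with_gemini(video_path, prompt, model="gemini-2.5-flash"):
    """Upload a video, wait until it is processed, and return Gemini's text answer."""
    from google import genai  # imported lazily; only needed when Gemini is enabled

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    uploaded = client.files.upload(file=video_path)
    # Uploaded videos are processed asynchronously; poll until that finishes.
    while uploaded.state.name == "PROCESSING":
        time.sleep(2)
        uploaded = client.files.get(name=uploaded.name)
    if uploaded.state.name != "ACTIVE":
        raise RuntimeError(f"video processing ended in state {uploaded.state.name}")
    try:
        response = client.models.generate_content(model=model, contents=[uploaded, prompt])
        return response.text
    finally:
        # Remove the remote copy as part of temporary-file cleanup.
        client.files.delete(name=uploaded.name)


def merge_report(transcript_lines, visual_notes):
    """Combine (start, end, text) transcript segments with the visual analysis."""
    parts = ["Transcript:"]
    for start, end, text in transcript_lines:
        parts.append(f"[{start:.1f}s-{end:.1f}s] {text}")
    parts += ["", "Visual analysis:", visual_notes.strip() or "(skipped)"]
    return "\n".join(parts)
```

When GEMINI_ENABLED=false, merge_report can be called with an empty string so the report still has a consistent shape.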
API Key
The Gemini API key is read from the GEMINI_API_KEY environment variable. You can get an API key for free from Google AI Studio.
Setup:
export GEMINI_API_KEY="your_api_key_here"
To add it permanently to your shell config:
echo 'export GEMINI_API_KEY="your_api_key_here"' >> ~/.zshrc # macOS / zsh
echo 'export GEMINI_API_KEY="your_api_key_here"' >> ~/.bashrc # Linux / bash
If you use a .env file:
export GEMINI_API_KEY=$(grep "^GEMINI_API_KEY=" .env | cut -d= -f2- | tr -d '"' | tr -d "'" | tr -d ' ')