submitted 4 months ago by corbin@infosec.pub to c/opensource@lemmy.ml
[-] chemicalwonka@discuss.tchncs.de 4 points 4 months ago

I tried to download some videos from Reddit using yt-dlp and it didn't work. I think that may be because Reddit limited access.

[-] yogthos@lemmy.ml 7 points 4 months ago

I made a script for grabbing Reddit videos that's been working pretty well for me. It needs Babashka to run: https://babashka.org/

#!/usr/bin/env bb
(require '[clojure.java.shell :refer [sh]]
         '[clojure.string :as string]
         '[cheshire.core :as cheshire]
         '[org.httpkit.client :as http]
         '[clojure.walk :as walk])

(defn http-get [url]
  (-> @(http/get url {})
      :body))

(defn find-base-url [data]
  (let [results (atom [])]
    (walk/postwalk
     (fn [node]
       (when (and (string? node) (.contains node "DASH"))
         (swap! results conj node))
       node)
     data)
    (some-> @results first (string/replace #"DASH_[0-9]+\.mp4" ""))))

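;; Pick the entry with the highest number in its file name.
;; With audio? true only names containing "audio" are considered;
;; otherwise those names are excluded.
(defn find-best-quality [names audio?]
  (->> ((if audio? filter remove) #(.contains (.toLowerCase %) "audio") names)
       (sort-by
        (fn [n]
          ;; strip the extension, then letters and underscores, leaving only the number
          (-> n
              (string/replace #"\.mp4" "")
              (string/replace #"[a-zA-Z_]" "")
              (Integer/parseInt))))
       (last)))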

(defn find-parts [base-url data]
  (let [url (atom nil)
        _ (walk/prewalk
           (fn [node]
             (when (and (map? node)
                        (contains? node :dash_url))
               (reset! url (:dash_url node)))
             node)
           data)
        xml (http-get @url)
        parts (->> (re-seq #"<BaseURL>(.*?)</BaseURL>" xml) (map second))
        best-video (find-best-quality parts false)
        best-audio (find-best-quality parts true)]
    [(str base-url best-video) (str base-url best-audio)]))

(defn filename [url]
  (let [idx (inc (.lastIndexOf url "/"))]
    (subs url idx)))

(defn tsname []
  (str "video-" (System/currentTimeMillis) ".mp4"))

;; The post URL is the first command-line argument; wget and ffmpeg must be on the PATH.
(let [data (-> (first *command-line-args*) (str ".json") http-get (cheshire/decode true))
      base-url (find-base-url data)
      [video-url audio-url] (find-parts base-url data)
      video-file (filename video-url)
      audio-file (filename audio-url)]
  (sh "wget" video-url)
  (sh "wget" audio-url)
  ;; combine video and audio; the audio input is looped and output stops when the video ends
  (sh "ffmpeg" "-i" video-file "-stream_loop" "-1" "-i" audio-file "-shortest" "-map" "0:v:0" "-map" "1:a:0" "-y" (tsname))
  (sh "rm" audio-file video-file))
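
To show how the quality selection in the script behaves, here is a small standalone sketch. It repeats `find-best-quality` as posted and runs it on `sample-names`, which is a made-up list of file names (an assumption about what the manifest might contain, not real Reddit output).

```clojure
(require '[clojure.string :as string])

;; Same logic as find-best-quality in the script above.
(defn find-best-quality [names audio?]
  (->> ((if audio? filter remove) #(.contains (.toLowerCase %) "audio") names)
       (sort-by
        (fn [n]
          (-> n
              (string/replace #"\.mp4" "")
              (string/replace #"[a-zA-Z_]" "")
              (Integer/parseInt))))
       (last)))

;; Hypothetical file names, for illustration only.
(def sample-names
  ["DASH_480.mp4" "DASH_1080.mp4" "DASH_AUDIO_64.mp4" "DASH_AUDIO_128.mp4" "DASH_720.mp4"])

(println (find-best-quality sample-names false)) ; prints DASH_1080.mp4
(println (find-best-quality sample-names true))  ; prints DASH_AUDIO_128.mp4
```

Note that a name with no digits in it would make `Integer/parseInt` throw, so this only works when every name carries a number.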

[-] Peffse@lemmy.world 6 points 4 months ago

You might also look at gallery-dl

[-] thingsiplay@beehaw.org 3 points 4 months ago

https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md

Reddit is on the list of supported sites. I just tested it with a random video post on Reddit, and it downloaded the file perfectly fine (it played in a local player). My theory is that you either made a user error and gave it a link that is not a video post (I'm not sure whether posts that merely link to a video would work; I think the post itself must be a video post), or you tested it while Reddit was blocking yt-dlp. When that happens, the yt-dlp team needs to update it first, and then it works again. YouTube does the same.

[-] trunklz29@lemmy.world 3 points 4 months ago

Make sure yt-dlp is up to date. I've been able to download Reddit videos, YouTube Shorts, TikTok videos, etc. just fine.

[-] corbin@infosec.pub 3 points 4 months ago

I don't think I've had issues with Reddit, as long as you use the link to the Reddit comment thread, not one of the shortlinks, the direct video link, or something else.
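
As a rough way to check which kind of link you have, here is a minimal sketch in Clojure/Babashka. The pattern is my assumption about what a comment-thread link looks like (`https://<host>/r/<subreddit>/comments/<post-id>/...`), and `thread-url?` is a hypothetical helper name, not part of yt-dlp or any library.

```clojure
;; Assumed shape of a comment-thread link:
;; https://<host>/r/<subreddit>/comments/<post-id>/...
(def thread-url-pattern
  #"^https?://(?:[a-z]+\.)?reddit\.com/r/[^/]+/comments/[^/]+.*")

;; Returns true when the URL matches the assumed comment-thread shape.
(defn thread-url? [url]
  (boolean (re-matches thread-url-pattern url)))

;; Example URLs are invented for illustration.
(println (thread-url? "https://www.reddit.com/r/videos/comments/abc123/some_title/")) ; prints true
(println (thread-url? "https://v.redd.it/abc123"))                                    ; prints false
```

Anything that prints false here (shortlinks, direct video links) is the kind of link I would swap for the full thread URL before trying again.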

this post was submitted on 25 Aug 2024
517 points (98.3% liked)