updated documentation #85

Merge pull request #84 from imsamuka/master
add option to apply metadata in existing file. Apologies for the late merge, you sent this pull request right as school was beginning to pick up in earnest and I forgot about it in that rush. Thanks for the great work!
2025-08-16 23:51:02 +00:00 · 2022-02-23 10:59:15 -07:00 · 2022-01-27 11:18:58 -07:00 · 2022-01-07 22:15:27 -03:00 · 2022-01-04 08:46:37 -07:00 · 2022-01-03 01:01:58 -03:00
17 changed files with 570 additions and 194 deletions
--- a/.gitignore
+++ b/.gitignore
@ -11,4 +11,5 @@
 .ripper.log
 ffmpeg
 ffprobe
-youtube-dl
+youtube-dl
+*.temp
--- a/README.md
+++ b/README.md
@ -54,6 +54,8 @@ Arguments:
    -s, --song <song>           Specify song name to download
    -A, --album <album>         Specify the album name to download
    -p, --playlist <playlist>   Specify the playlist name to download
+    -u, --url <url>             Specify the youtube url to download from (for single songs only)
+    -g, --give-url              Specify the youtube url sources while downloading (for albums or playlists only)

 Examples:
    $ irs --song "Bohemian Rhapsody" --artist "Queen"
@ -94,6 +96,8 @@ If you're one of those cool people who compiles from source
    ```yaml
    binary_directory: ~/.irs/bin
    music_directory: ~/Music
+    filename_pattern: "{track_number} - {title}"
+    directory_pattern: "{artist}/{album}"
    client_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    client_secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    single_folder_playlist:
@ -120,6 +124,9 @@ Here's what they do:
 ```yaml
 binary_directory: ~/.irs/bin
 music_directory: ~/Music
+search_terms: "lyrics"
+filename_pattern: "{track_number} - {title}"
+directory_pattern: "{artist}/{album}"
 client_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 client_secret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 single_folder_playlist:
@ -130,8 +137,10 @@ single_folder_playlist:
 - `binary_directory`: a path specifying where the downloaded binaries should
    be placed
 - `music_directory`: a path specifying where downloaded mp3s should be placed.
-    Note that there will be more structure created inside that folder, usually
-    in the format of `music-dir>artist-name>album-name>track`
+ - `search_terms`: additional search terms to plug into youtube, which can be
+    potentially useful for not grabbing erroneous audio.
+ - `filename_pattern`: a pattern for the output filename of the mp3
+ - `directory_pattern`: a pattern for the folder structure your mp3s are saved in
 - `client_key`: a client key from your spotify API application
 - `client_secret`: a client secret key from your spotify API application
 - `single_folder_playlist/enabled`: if set to true, all mp3s from a downloaded
@ -143,6 +152,55 @@ single_folder_playlist:
    the album name and album image of the mp3 with the title of your playlist
    and the image for your playlist respectively

+
+In a pattern, the following keywords will be replaced:
+
+| Keyword | Replacement | Example |
+| :----: | :----: | :----: |
+| `{artist}` | Artist Name | Queen |
+| `{title}` | Track Title | Bohemian Rhapsody |
+| `{album}` | Album Name | Stone Cold Classics |
+| `{track_number}` | Track Number | 9 |
+| `{total_tracks}` | Total Tracks in Album | 14 |
+| `{disc_number}` | Disc Number | 1 |
+| `{day}` | Release Day | 01 |
+| `{month}` | Release Month | 01 |
+| `{year}` | Release Year | 2006 |
+| `{id}` | Spotify ID | 6l8GvAyoUZwWDgF1e4822w |
+
+Beware OS-restrictions when naming your mp3s.
+
+Pattern Examples:
+```yaml
+music_directory: ~/Music
+filename_pattern: "{track_number} - {title}"
+directory_pattern: "{artist}/{album}"
+```
+Outputs: `~/Music/Queen/Stone Cold Classics/9 - Bohemian Rhapsody.mp3`
+<br><br>
+```yaml
+music_directory: ~/Music
+filename_pattern: "{artist} - {title}"
+directory_pattern: ""
+```
+Outputs: `~/Music/Queen - Bohemian Rhapsody.mp3`
+<br><br>
+```yaml
+music_directory: ~/Music
+filename_pattern: "{track_number} of {total_tracks} - {title}"
+directory_pattern: "{year}/{artist}/{album}"
+```
+Outputs: `~/Music/2006/Queen/Stone Cold Classics/9 of 14 - Bohemian Rhapsody.mp3`
+<br><br>
+```yaml
+music_directory: ~/Music
+filename_pattern: "{track_number}. {title}"
+directory_pattern: "irs/{artist} - {album}"
+```
+Outputs: `~/Music/irs/Queen - Stone Cold Classics/9. Bohemian Rhapsody.mp3`
+<br>
+
+
 ## How it works

 **At its core** `irs` downloads individual songs. It does this by interfacing
--- a/shard.yml
+++ b/shard.yml
@ -1,5 +1,5 @@
 name: irs
-version: 1.0.1
+version: 1.4.0

 authors:
  - Cooper Hammond <kepoorh@gmail.com>
--- a/spec/irs_spec.cr
+++ b/spec/irs_spec.cr
@ -1,9 +1,35 @@
 require "./spec_helper"

-describe Irs do
+describe CLI do
  # TODO: Write tests

-  it "works" do
-    false.should eq(true)
+  it "can show help" do
+    run_CLI_with_args(["--help"])
+  end
+
+  it "can show version" do
+    run_CLI_with_args(["--version"])
+  end
+
+  # !!TODO: make a long and short version of the test suite
+  # TODO: make it so this doesn't need user input
+  it "can install ytdl and ffmpeg binaries" do
+    # run_CLI_with_args(["--install"])
+  end
+
+  it "can show config file loc" do
+    run_CLI_with_args(["--config"])
+  end
+
+  it "can download a single song" do
+    run_CLI_with_args(["--song", "Bohemian Rhapsody", "--artist", "Queen"])
+  end
+
+  it "can download an album" do
+    run_CLI_with_args(["--artist", "Arctic Monkeys", "--album", "Da Frame 2R / Matador"])
+  end
+
+  it "can download a playlist" do
+    run_CLI_with_args(["--artist", "prakkillian", "--playlist", "IRS Testing"])
  end
 end
--- a/spec/spec_helper.cr
+++ b/spec/spec_helper.cr
@ -1,2 +1,10 @@
 require "spec"
-require "../src/irs"
+
+# https://github.com/mosop/stdio
+
+require "../src/bottle/cli"
+
+def run_CLI_with_args(argv : Array(String))
+    cli = CLI.new(argv)
+    cli.act_on_args
+end
--- a/src/bottle/cli.cr
+++ b/src/bottle/cli.cr
@ -20,6 +20,10 @@ class CLI
    [["-s", "--song"], "song", "string"],
    [["-A", "--album"], "album", "string"],
    [["-p", "--playlist"], "playlist", "string"],
+    [["-u", "--url"], "url", "string"],
+    [["-S", "--select"], "select", "bool"],
+    [["--ask-skip"], "ask_skip", "bool"],
+    [["--apply"], "apply_file", "string"]
  ]

  @args : Hash(String, String)
@ -48,6 +52,12 @@ class CLI
        #{Style.blue "-s, --song <song>"}           Specify song name to download
        #{Style.blue "-A, --album <album>"}         Specify the album name to download
        #{Style.blue "-p, --playlist <playlist>"}   Specify the playlist name to download
+        #{Style.blue "-u, --url <url>"}             Specify the youtube url to download from
+        #{Style.blue "                 "}           (for albums and playlists, the command-line
+        #{Style.blue "                 "}           argument is ignored, and it should be '')
+        #{Style.blue "-S, --select"}                Use a menu to choose each song's video source
+        #{Style.blue "--ask-skip"}                  Before every playlist/album song, ask to skip
+        #{Style.blue "--apply <file>"}              Apply metadata to an existing file

    #{Style.bold "Examples:"}
        $ #{Style.green %(irs --song "Bohemian Rhapsody" --artist "Queen")}
@ -69,34 +79,35 @@ class CLI

    if @args["help"]? || @args.keys.size == 0
      help
-      exit
+
    elsif @args["version"]?
      version
-      exit
+
    elsif @args["install"]?
      YdlBinaries.get_both(Config.binary_location)
-      exit
+
    elsif @args["config"]?
      puts ENV["IRS_CONFIG_LOCATION"]?
-      exit
+
    elsif @args["song"]? && @args["artist"]?
      s = Song.new(@args["song"], @args["artist"])
      s.provide_client_keys(Config.client_key, Config.client_secret)
-      s.grab_it
-      s.organize_it(Config.music_directory)
-      exit
+      s.grab_it(flags: @args)
+      s.organize_it()
+
    elsif @args["album"]? && @args["artist"]?
      a = Album.new(@args["album"], @args["artist"])
      a.provide_client_keys(Config.client_key, Config.client_secret)
-      a.grab_it
+      a.grab_it(flags: @args)
+
    elsif @args["playlist"]? && @args["artist"]?
      p = Playlist.new(@args["playlist"], @args["artist"])
      p.provide_client_keys(Config.client_key, Config.client_secret)
-      p.grab_it
+      p.grab_it(flags: @args)
+
    else
      puts Style.red("Those arguments don't do anything when used that way.")
      puts "Type `irs -h` to see usage."
-      exit 1
    end
  end

--- a/src/bottle/config.cr
+++ b/src/bottle/config.cr
@ -7,8 +7,11 @@ require "../search/spotify"
 EXAMPLE_CONFIG = <<-EOP
 #{Style.dim "exampleconfig.yml"}
 #{Style.dim "===="}
+#{Style.blue "search_terms"}: #{Style.green "\"lyrics\""}
 #{Style.blue "binary_directory"}: #{Style.green "~/.irs/bin"}
 #{Style.blue "music_directory"}: #{Style.green "~/Music"}
+#{Style.blue "filename_pattern"}: #{Style.green "\"{track_number} - {title}\""}
+#{Style.blue "directory_pattern"}: #{Style.green "\"{artist}/{album}\""}
 #{Style.blue "client_key"}: #{Style.green "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"}
 #{Style.blue "client_secret"}: #{Style.green "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"}
 #{Style.blue "single_folder_playlist"}: 
@ -22,8 +25,11 @@ module Config
  extend self

  @@arguments = [
+    "search_terms",
    "binary_directory",
    "music_directory",
+    "filename_pattern",
+    "directory_pattern",
    "client_key",
    "client_secret",
    "single_folder_playlist: enabled",
@ -41,6 +47,10 @@ module Config
    exit 1
  end

+  def search_terms : String
+    return @@conf["search_terms"].to_s
+  end
+
  def binary_location : String
    path = @@conf["binary_directory"].to_s
    return Path[path].expand(home: true).to_s
@ -50,6 +60,14 @@ module Config
    path = @@conf["music_directory"].to_s
    return Path[path].expand(home: true).to_s
  end
+  
+  def filename_pattern : String
+    return @@conf["filename_pattern"].to_s
+  end
+  
+  def directory_pattern : String
+    return @@conf["directory_pattern"].to_s
+  end

  def client_key : String
    return @@conf["client_key"].to_s
--- a/src/bottle/pattern.cr
+++ b/src/bottle/pattern.cr
@ -0,0 +1,28 @@
+module Pattern
+  extend self
+
+  def parse(formatString : String, metadata : JSON::Any)
+    formatted : String = formatString
+
+    date : Array(String) = (metadata["album"]? || JSON.parse("{}"))["release_date"]?.to_s.split('-')
+
+    keys : Hash(String, String) = {
+      "artist" => ((metadata.dig?("artists") || JSON.parse("{}"))[0]? || JSON.parse("{}"))["name"]?.to_s,
+      "title" => metadata["name"]?.to_s,
+      "album" => (metadata["album"]? || JSON.parse("{}"))["name"]?.to_s,
+      "track_number" => metadata["track_number"]?.to_s,
+      "disc_number" => metadata["disc_number"]?.to_s,
+      "total_tracks" => (metadata["album"]? || JSON.parse("{}"))["total_tracks"]?.to_s,
+      "year" => date[0]?.to_s,
+      "month" => date[1]?.to_s,
+      "day" => date[2]?.to_s,
+      "id" => metadata["id"]?.to_s
+    }
+
+    keys.each do |pair|
+      formatted = formatted.gsub("{#{pair[0]}}", pair[1] || "")
+    end
+
+    return formatted
+  end
+end
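
The placeholder substitution in `Pattern.parse` can be tried in isolation. The following is a minimal sketch, not the code from this pull request: `expand_pattern` is a hypothetical name, it handles only four of the keywords, and the metadata shape is assumed to match the Spotify track objects used elsewhere in the diff.

```crystal
require "json"

# Hypothetical stand-in for Pattern.parse; covers only a subset of the keywords.
def expand_pattern(pattern : String, metadata : JSON::Any) : String
  album = metadata["album"]?
  first_artist = metadata["artists"]?.try { |artists| artists[0]? }

  # Missing keys become empty strings because nil.to_s is "".
  values = {
    "artist"       => first_artist.try { |artist| artist["name"]? }.to_s,
    "title"        => metadata["name"]?.to_s,
    "album"        => album.try { |a| a["name"]? }.to_s,
    "track_number" => metadata["track_number"]?.to_s,
  }

  # Replace each {keyword} with its value, one keyword at a time.
  result = pattern
  values.each do |key, value|
    result = result.gsub("{#{key}}", value)
  end
  result
end

# Assumed sample metadata, trimmed to the fields the sketch reads.
metadata = JSON.parse(%({"name": "Bohemian Rhapsody", "track_number": 9,
  "artists": [{"name": "Queen"}], "album": {"name": "Stone Cold Classics"}}))

puts expand_pattern("{artist}/{album}/{track_number} - {title}", metadata)
```

Keywords the sketch does not know about are left untouched in the output, which matches the behaviour of the loop in the diff for unrecognised placeholders.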
--- a/src/bottle/version.cr
+++ b/src/bottle/version.cr
@ -1,3 +1,3 @@
 module IRS
-  VERSION = "0.1.0"
+  VERSION = "1.4.0"
 end
--- a/src/glue/album.cr
+++ b/src/glue/album.cr
@ -42,6 +42,6 @@ class Album < SpotifyList
  end

  private def organize(song : Song)
-    song.organize_it(@home_music_directory)
+    song.organize_it()
  end
 end
--- a/src/glue/list.cr
+++ b/src/glue/list.cr
@ -17,6 +17,9 @@ abstract class SpotifyList
    "searching" => [
      Style.bold("Searching for %l by %a ... \r"),
      Style.green("+ ") + Style.bold("%l by %a                                 \n")
+    ],
+    "url" => [
+      Style.bold("When prompted for a URL, provide a youtube URL or press enter to scrape for one\n")
    ]
  }

@ -24,11 +27,19 @@ abstract class SpotifyList
  end

  # Finds the list, and downloads all of the songs using the `Song` class
-  def grab_it
+  def grab_it(flags = {} of String => String)
+    ask_url = flags["url"]?
+    ask_skip = flags["ask_skip"]?
+    is_playlist = flags["playlist"]?
+  
    if !@spotify_searcher.authorized?
      raise("Need to call provide_client_keys on Album or Playlist class.")
    end

+    if ask_url
+      outputter("url", 0)
+    end
+
    outputter("searching", 0)
    list = find_it()
    outputter("searching", 1)
@ -36,22 +47,28 @@ abstract class SpotifyList

    i = 0
    contents.each do |datum|
+      i += 1
      if datum["track"]?
        datum = datum["track"]
      end

      data = organize_song_metadata(list, datum)

-      song = Song.new(data["name"].to_s, data["artists"][0]["name"].to_s)
+      s_name = data["name"].to_s
+      s_artist = data["artists"][0]["name"].to_s
+
+      song = Song.new(s_name, s_artist)
      song.provide_spotify(@spotify_searcher)
      song.provide_metadata(data)

-      puts Style.bold("[#{data["track_number"]}/#{contents.size}]")
-      song.grab_it
+      puts Style.bold("[#{i}/#{contents.size}]")

-      organize(song)
-
-      i += 1
+      unless ask_skip && skip?(s_name, s_artist, is_playlist)
+        song.grab_it(flags: flags)
+        organize(song)
+      else
+        puts "Skipping..."
+      end
    end
  end

@ -60,6 +77,13 @@ abstract class SpotifyList
    @spotify_searcher.authorize(client_key, client_secret)
  end

+  private def skip?(name, artist, is_playlist)
+    print "Skip #{Style.blue name}" +
+      (is_playlist ? " (by #{Style.green artist})": "") + "? "
+    response = gets
+    return response && response.lstrip.downcase.starts_with? "y"
+  end
+
  private def outputter(key : String, index : Int32)
    text = @outputs[key][index]
      .gsub("%l", @list_name)
--- a/src/glue/mapper.cr
+++ b/src/glue/mapper.cr
@ -46,6 +46,7 @@ class TrackMapper
      type: Int32,
      setter: true
    },
+    duration_ms: Int32,
    type: String,
    uri: String
  )
--- a/src/glue/playlist.cr
+++ b/src/glue/playlist.cr
@ -67,9 +67,10 @@ class Playlist < SpotifyList
        FileUtils.mkdir_p(strpath)
      end
      safe_filename = song.filename.gsub(/[\/]/, "").gsub("  ", " ")
-      File.rename("./" + song.filename, (path / safe_filename).to_s)
+      FileUtils.cp("./" + song.filename, (path / safe_filename).to_s)
+      FileUtils.rm("./" + song.filename)
    else
-      song.organize_it(@home_music_directory)
+      song.organize_it()
    end
  end
 end
--- a/src/glue/song.cr
+++ b/src/glue/song.cr
@ -4,6 +4,8 @@ require "../search/youtube"
 require "../interact/ripper"
 require "../interact/tagger"

+require "../bottle/config"
+require "../bottle/pattern"
 require "../bottle/styles"

 class Song
@ -24,7 +26,10 @@ class Song
    ],
    "url" => [
      "  Searching for URL ...\r",
-      Style.green("  + ") + Style.dim("URL found                       \n")
+      Style.green("  + ") + Style.dim("URL found                       \n"),
+      "  Validating URL ...\r",
+      Style.green("  + ") + Style.dim("URL validated                   \n"),
+      "  URL?: "
    ],
    "download" => [
      "  Downloading video:\n",
@ -47,11 +52,16 @@ class Song
  end

  # Find, downloads, and tags the mp3 song that this class represents.
+  # Optionally takes a youtube URL to download from
  #
  # ```
  # Song.new("Bohemian Rhapsody", "Queen").grab_it
  # ```
-  def grab_it
+  def grab_it(url : (String | Nil) = nil, flags = {} of String => String)
+    passed_url : (String | Nil) = flags["url"]?
+    passed_file : (String | Nil) = flags["apply_file"]?
+    select_link = flags["select"]?
+
    outputter("intro", 0)

    if !@spotify_searcher.authorized? && !@metadata
@ -79,43 +89,78 @@ class Song
    end

    data = @metadata.as(JSON::Any)
-    @filename = data["track_number"].to_s + " - #{data["name"].to_s}.mp3"
+    @song_name = data["name"].as_s
+    @artist_name = data["artists"][0]["name"].as_s
+    @filename = "#{Pattern.parse(Config.filename_pattern, data)}.mp3"

-    outputter("url", 0)
-    url = Youtube.find_url(@song_name, @artist_name, search_terms: "lyrics")
-    if !url
-      raise("There was no url found on youtube for " +
-            %("#{@song_name}" by "#{@artist_name}. ) +
-            "Check your input and try again.")
+    if passed_file
+      puts Style.green("  +") + Style.dim(" Moving file: ") + passed_file
+      File.rename(passed_file, @filename)
+    else
+      if passed_url
+        if passed_url.strip != ""
+          url = passed_url
+        else
+          outputter("url", 4)
+          url = gets
+          if !url.nil? && url.strip == ""
+            url = nil
+          end
+        end
+      end
+
+      if !url
+        outputter("url", 0)
+        url = Youtube.find_url(data, flags: flags)
+        if !url
+          raise("There was no url found on youtube for " +
+                %("#{@song_name}" by "#{@artist_name}". ) +
+                "Check your input and try again.")
+        end
+        outputter("url", 1)
+      else
+        outputter("url", 2)
+        url = Youtube.validate_url(url)
+        if !url
+          raise("The url is an invalid youtube URL. " +
+                "Check the URL and try again")
+        end
+        outputter("url", 3)
+      end
+
+      outputter("download", 0)
+      Ripper.download_mp3(url.as(String), @filename)
+      outputter("download", 1)
    end
-    outputter("url", 1)
-
-    outputter("download", 0)
-    Ripper.download_mp3(url.as(String), @filename)
-    outputter("download", 1)

    outputter("albumart", 0)
    temp_albumart_filename = ".tempalbumart.jpg"
-    HTTP::Client.get(data["album"]["images"][0]["url"].to_s) do |response|
+    HTTP::Client.get(data["album"]["images"][0]["url"].as_s) do |response|
      File.write(temp_albumart_filename, response.body_io)
    end
    outputter("albumart", 0)

    # check if song's metadata has been modded in playlist, update artist accordingly
-    if data["artists"][-1]["owner"]? 
-      @artist = data["artists"][-1]["name"].to_s
+    if data["artists"][-1]["owner"]?
+      @artist = data["artists"][-1]["name"].as_s
    else
-      @artist = data["artists"][0]["name"].to_s
+      @artist = data["artists"][0]["name"].as_s
    end
-    @album = data["album"]["name"].to_s
+    @album = data["album"]["name"].as_s

    tagger = Tags.new(@filename)
    tagger.add_album_art(temp_albumart_filename)
-    tagger.add_text_tag("title", data["name"].to_s)
+    tagger.add_text_tag("title", data["name"].as_s)
    tagger.add_text_tag("artist", @artist)
-    tagger.add_text_tag("album", @album)
-    tagger.add_text_tag("genre", 
-      @spotify_searcher.find_genre(data["artists"][0]["id"].to_s))
+
+    if !@album.empty?
+      tagger.add_text_tag("album", @album)
+    end
+
+    if genre = @spotify_searcher.find_genre(data["artists"][0]["id"].as_s)
+      tagger.add_text_tag("genre", genre)
+    end
+
    tagger.add_text_tag("track", data["track_number"].to_s)
    tagger.add_text_tag("disc", data["disc_number"].to_s)

@ -127,20 +172,24 @@ class Song
    outputter("finished", 0)
  end

-  # Will organize the song into the user's provided music directory as
-  # music_directory > artist_name > album_name > song
+  # Will organize the song into the user's provided music directory
+  # in the user's provided structure
  # Must be called AFTER the song has been downloaded.
  #
  # ```
  # s = Song.new("Bohemian Rhapsody", "Queen").grab_it
-  # s.organize_it("/home/cooper/Music")
-  # # Will move the mp3 file to
+  # s.organize_it()
+  # # With
+  # # directory_pattern = "{artist}/{album}"
+  # # filename_pattern = "{track_number} - {title}"
+  # # Mp3 will be moved to
  # # /home/cooper/Music/Queen/A Night At The Opera/1 - Bohemian Rhapsody.mp3
  # ```
-  def organize_it(music_directory : String)
-    path = Path[music_directory].expand(home: true)
-    path = path / @artist_name.gsub(/[\/]/, "").gsub("  ", " ")
-    path = path / @album.gsub(/[\/]/, "").gsub("  ", " ")
+  def organize_it()
+    path = Path[Config.music_directory].expand(home: true)
+    Pattern.parse(Config.directory_pattern, @metadata.as(JSON::Any)).split('/').each do |dir|
+      path = path / dir.gsub(/[\/]/, "").gsub("  ", " ")
+    end
    strpath = path.to_s
    if !File.directory?(strpath)
      FileUtils.mkdir_p(strpath)
--- a/src/search/ranking.cr
+++ b/src/search/ranking.cr
@ -0,0 +1,144 @@
+alias VID_VALUE_CLASS = String
+alias VID_METADATA_CLASS = Hash(String, VID_VALUE_CLASS)
+alias YT_METADATA_CLASS = Array(VID_METADATA_CLASS)
+
+module Ranker
+  extend self
+
+  GARBAGE_PHRASES = [
+    "cover", "album", "live", "clean", "version", "full", "full album", "row",
+    "at", "@", "session", "how to", "npr music", "reimagined", "version",
+    "trailer"
+  ]
+
+  GOLDEN_PHRASES = [
+    "official video", "official music video",
+  ]
+
+  # Will rank videos according to their title and the user input, returns a sorted array of hashes
+  # of the points a song was assigned and its original index
+  # *spotify_metadata* is the metadata (from spotify) of the song that you want
+  # *yt_metadata* is an array of hashes with metadata scraped from the youtube search result page
+  # *query* is the query that you submitted to youtube for the results you now have
+  # ```
+  # Ranker.rank_videos(spotify_metadata, yt_metadata, query)
+  # => [
+  #      {"points" => x, "index" => x},
+  #      ...
+  #    ]
+  # ```
+  # "index" corresponds to the original index of the song in yt_metadata
+  def rank_videos(spotify_metadata : JSON::Any, yt_metadata : YT_METADATA_CLASS,
+                  query : String) : Array(Hash(String, Int32))
+    points = [] of Hash(String, Int32)
+    index = 0
+
+    actual_song_name = spotify_metadata["name"].as_s
+    actual_artist_name = spotify_metadata["artists"][0]["name"].as_s
+
+    yt_metadata.each do |vid|
+      pts = 0
+
+      pts += points_string_compare(actual_song_name, vid["title"])
+      pts += points_string_compare(actual_artist_name, vid["title"])
+      pts += count_buzzphrases(query, vid["title"])
+      pts += compare_timestamps(spotify_metadata, vid)
+
+      points.push({
+        "points" => pts,
+        "index"  => index,
+      })
+      index += 1
+    end
+
+    # Sort first by points and then by original index of the song
+    points.sort! { |a, b|
+      if b["points"] == a["points"]
+        a["index"] <=> b["index"]
+      else
+        b["points"] <=> a["points"]
+      end
+    }
+
+    return points
+  end
+
+  # SINGULAR COMPONENT OF RANKING ALGORITHM
+  private def compare_timestamps(spotify_metadata : JSON::Any, node : VID_METADATA_CLASS) : Int32
+    # puts spotify_metadata.to_pretty_json()
+    actual_time = spotify_metadata["duration_ms"].as_i
+    vid_time = node["duration_ms"].to_i
+
+    difference = (actual_time - vid_time).abs 
+
+    # puts "actual: #{actual_time}, vid: #{vid_time}"
+    # puts "\tdiff: #{difference}"
+    # puts "\ttitle: #{node["title"]}"
+
+    if difference <= 1000
+      return 3
+    elsif difference <= 2000
+      return 2
+    elsif difference <= 5000
+      return 1
+    else 
+      return 0
+    end
+  end
+
+  # SINGULAR COMPONENT OF RANKING ALGORITHM
+  # Returns an `Int` based off the number of points worth assigning to the
+  # matchiness of the string. First the strings are downcased and then all
+  # nonalphanumeric characters are stripped.
+  # If *item2* includes *item1*, return 3 pts.
+  # If after the items have been blanked, *item2* includes *item1*,
+  #   return 1 pt.
+  # Else, return 0 pts.
+  private def points_string_compare(item1 : String, item2 : String) : Int32
+    if item2.includes?(item1)
+      return 3
+    end
+
+    item1 = item1.downcase.gsub(/[^a-z0-9]/, "")
+    item2 = item2.downcase.gsub(/[^a-z0-9]/, "")
+
+    if item2.includes?(item1)
+      return 1
+    else
+      return 0
+    end
+  end
+
+  # SINGULAR COMPONENT OF RANKING ALGORITHM
+  # Checks if there are any phrases in the title of the video that would
+  # indicate audio having what we want.
+  # *video_name* is the title of the video, and *query* is what the user or the
+  # program searched for. *query* is needed in order to make sure we're not
+  # subtracting points from something that's naturally in the title
+  private def count_buzzphrases(query : String, video_name : String) : Int32
+    good_phrases = 0
+    bad_phrases = 0
+
+    GOLDEN_PHRASES.each do |gold_phrase|
+      gold_phrase = gold_phrase.downcase.gsub(/[^a-z0-9]/, "")
+
+      if query.downcase.gsub(/[^a-z0-9]/, "").includes?(gold_phrase)
+        next
+      elsif video_name.downcase.gsub(/[^a-z0-9]/, "").includes?(gold_phrase)
+        good_phrases += 1
+      end
+    end
+
+    GARBAGE_PHRASES.each do |garbage_phrase|
+      garbage_phrase = garbage_phrase.downcase.gsub(/[^a-z0-9]/, "")
+
+      if query.downcase.gsub(/[^a-z0-9]/, "").includes?(garbage_phrase)
+        next
+      elsif video_name.downcase.gsub(/[^a-z0-9]/, "").includes?(garbage_phrase)
+        bad_phrases += 1
+      end
+    end
+
+    return good_phrases - bad_phrases
+  end
+end
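
Two parts of the ranking above are easy to check on their own: the duration tiers and the tie-breaking sort. The sketch below is a simplified illustration rather than the `Ranker` module itself. `duration_points` and `rank` are hypothetical names, and `rank` takes a plain array of point totals instead of the hashes used in the diff.

```crystal
# Hypothetical, simplified versions of two pieces of the ranking logic.

# Awards more points the closer the video's length is to the track's length.
def duration_points(track_ms : Int32, video_ms : Int32) : Int32
  difference = (track_ms - video_ms).abs
  if difference <= 1000
    3
  elsif difference <= 2000
    2
  elsif difference <= 5000
    1
  else
    0
  end
end

# Returns the original indices ordered by points (highest first).
# Candidates with equal points keep their search-result order.
def rank(points : Array(Int32)) : Array(Int32)
  indexed = points.map_with_index { |pts, index| {pts, index} }
  indexed.sort_by { |pair| {-pair[0], pair[1]} }.map { |pair| pair[1] }
end

puts duration_points(354_000, 355_500)
puts rank([1, 4, 4, 0])
```

Sorting on the tuple `{-points, index}` gives the same ordering as the two-branch comparison block in `rank_videos`: points descending, then original index ascending.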
--- a/src/search/spotify.cr
+++ b/src/search/spotify.cr
@ -60,9 +60,10 @@ class SpotifySearcher
  # ```
  def find_item(item_type : String, item_parameters : Hash, offset = 0,
                limit = 20) : JSON::Any?
-    query = generate_query(item_type, item_parameters, offset, limit)
+    query = generate_query(item_type, item_parameters)

-    url = @root_url.join("search?q=#{query}").to_s
+    url = "search?q=#{query}&type=#{item_type}&limit=#{limit}&offset=#{offset}"
+    url = @root_url.join(url).to_s

    response = HTTP::Client.get(url, headers: @access_header)
    error_check(response)
@ -204,8 +205,14 @@ class SpotifySearcher
  # ```
  # SpotifySearcher.new.authorize(...).find_genre("1dfeR4HaWDbWqFHLkxsg1d")
  # ```
-  def find_genre(id : String) : String
-    genre = get_item("artist", id)["genres"][0].to_s
+  def find_genre(id : String) : String | Nil
+    genre = get_item("artist", id)["genres"]
+
+    if genre.as_a.empty?
+      return nil
+    end
+
+    genre = genre[0].to_s
    genre = genre.split(" ").map { |x| x.capitalize }.join(" ")

    return genre
@ -222,8 +229,7 @@ class SpotifySearcher

  # Generates url to run a GET request against to the Spotify open API
  # Returns a `String.`
-  private def generate_query(item_type : String, item_parameters : Hash,
-                             offset : Int32, limit : Int32) : String
+  private def generate_query(item_type : String, item_parameters : Hash) : String
    query = ""

    # parameter keys to exclude in the api request. These values will be put
@ -235,9 +241,9 @@ class SpotifySearcher
      if k == "name"
        # will remove the "name:<title>" param from the query
        if item_type == "playlist"
-          query += item_parameters[k].gsub(" ", "+") + "+"
+          query += item_parameters[k] + "+"
        else
-          query += param_encode(item_type, item_parameters[k])
+          query += as_field(item_type, item_parameters[k])
        end

        # check if the key is to be excluded
@ -248,14 +254,21 @@ class SpotifySearcher
        # NOTE: playlist names will be inserted into the query normally, without
        # a parameter.
      else
-        query += param_encode(k, item_parameters[k])
+        query += as_field(k, item_parameters[k])
      end
    end

-    # extra api info
-    query += "&type=#{item_type}&limit=#{limit}&offset=#{offset}"
+    return URI.encode(query.rchop("+"))
+  end

-    return query
+  # Returns a `String` encoded for the spotify api
+  #
+  # ```
+  # as_field("album", "A Night At The Opera")
+  # => "album:A Night At The Opera+"
+  # ```
+  private def as_field(key, value) : String
+    return "#{key}:#{value}+"
  end

  # Ranks the given items based off of the info from parameters.
@ -321,15 +334,6 @@ class SpotifySearcher
    end
  end

-  # Returns a `String` encoded for the spotify api
-  #
-  # ```
-  # query_encode("album", "A Night At The Opera")
-  # => "album:A+Night+At+The+Opera"
-  # ```
-  private def param_encode(key : String, value : String) : String
-    return key.gsub(" ", "+") + ":" + value.gsub(" ", "+") + "+"
-  end
 end

 # puts SpotifySearcher.new()
--- a/src/search/youtube.cr
+++ b/src/search/youtube.cr
@ -1,6 +1,12 @@
 require "http"
 require "xml"
 require "json"
+require "uri"
+
+require "./ranking"
+
+require "../bottle/config"
+require "../bottle/styles"


 module Youtube
@ -11,167 +17,123 @@ module Youtube
    "yt-uix-tile-link yt-ui-ellipsis yt-ui-ellipsis-2 yt-uix-sessionlink      spf-link ",
  ]

-  GARBAGE_PHRASES = [
-    "cover", "album", "live", "clean", "version", "full", "full album", "row",
-    "at", "@", "session", "how to", "npr music", "reimagined", "hr version",
-    "trailer",
-  ]
-
-  GOLDEN_PHRASES = [
-    "official video", "official music video",
-  ]
-
-  alias NODES_CLASS = Array(Hash(String, String))
+  # Note that VID_VALUE_CLASS, VID_METADATA_CLASS, and YT_METADATA_CLASS are found in ranking.cr

  # Finds a youtube url based off of the given information.
  # The query to youtube is constructed like this:
  #   "<song_name> <artist_name> <search terms>"
  # If *download_first* is provided, the first link found will be downloaded.
+  # If *select_link* is provided, a menu of options will be shown for the user to choose their poison
  #
  # ```
  # Youtube.find_url("Bohemian Rhapsody", "Queen")
  # => "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  # ```
-  def find_url(song_name : String, artist_name : String, search_terms = "",
-               download_first = false) : String?
-    query = (song_name + " " + artist_name + " " + search_terms).strip.gsub(" ", "+")
+  def find_url(spotify_metadata : JSON::Any,
+               flags = {} of String => String) : String?

-    url = "https://www.youtube.com/results?search_query=" + query
+    search_terms = Config.search_terms

-    response = HTTP::Client.get(url)
+    select_link = flags["select"]?

-    valid_nodes = get_video_link_nodes(response.body)
+    song_name = spotify_metadata["name"].as_s
+    artist_name = spotify_metadata["artists"][0]["name"].as_s

-    if valid_nodes.size == 0
-      puts "There were no results for that query."
+    human_query = "#{song_name} #{artist_name} #{search_terms.strip}"
+    params = HTTP::Params.encode({"search_query" => human_query})
+
+    response = HTTP::Client.get("https://www.youtube.com/results?#{params}")
+
+    yt_metadata = get_yt_search_metadata(response.body)
+
+    if yt_metadata.size == 0
+      puts "There were no results for this query on youtube: \"#{human_query}\""
      return nil
    end

    root = "https://youtube.com"
+    ranked = Ranker.rank_videos(spotify_metadata, yt_metadata, human_query)

-    return root + valid_nodes[0]["href"] if download_first
-
-    ranked = rank_videos(song_name, artist_name, query, valid_nodes)
+    if select_link
+      return root + select_link_menu(spotify_metadata, yt_metadata)
+    end

    begin
-      return root + valid_nodes[ranked[0]["index"]]["href"]
+      puts Style.dim("  Video: ") + yt_metadata[ranked[0]["index"]]["title"]
+      return root + yt_metadata[ranked[0]["index"]]["href"]
    rescue IndexError
      return nil
    end
+
+    exit 1
  end

-  # Will rank videos according to their title and the user input
-  # Return:
-  # [
-  #   {"points" => x, "index" => x},
-  #   ...
-  # ]
-  private def rank_videos(song_name : String, artist_name : String,
-                          query : String, nodes : Array(Hash(String, String))) : Array(Hash(String, Int32))
-    points = [] of Hash(String, Int32)
-    index = 0
-
-    nodes.each do |node|
-      pts = 0
-
-      pts += points_compare(song_name, node["title"])
-      pts += points_compare(artist_name, node["title"])
-      pts += count_buzzphrases(query, node["title"])
-
-      points.push({
-        "points" => pts,
-        "index"  => index,
-      })
+  # Presents a menu with song info for the user to choose which url they want to download
+  private def select_link_menu(spotify_metadata : JSON::Any,
+                               yt_metadata : YT_METADATA_CLASS) : String
+    puts Style.dim("  Spotify info: ") +
+         Style.bold("\"" + spotify_metadata["name"].to_s) + "\" by \"" +
+         Style.bold(spotify_metadata["artists"][0]["name"].to_s + "\"") +
+         " @ " + Style.blue((spotify_metadata["duration_ms"].as_i / 1000).to_i.to_s) + "s"
+    puts "  Choose video to download:"
+    index = 1
+    yt_metadata.each do |vid|
+      print "    " + Style.bold(index.to_s + " ")
+      puts "\"" + vid["title"] + "\" @ " + Style.blue((vid["duration_ms"].to_i / 1000).to_i.to_s) + "s"
      index += 1
-    end
-
-    # Sort first by points and then by original index of the song
-    points.sort! { |a, b|
-      if b["points"] == a["points"]
-        a["index"] <=> b["index"]
-      else
-        b["points"] <=> a["points"]
-      end
-    }
-
-    return points
-  end
-
-  # Returns an `Int` based off the number of points worth assigning to the
-  # matchiness of the string. First the strings are downcased and then all
-  # nonalphanumeric characters are stripped.
-  # If *item1* includes *item2*, return 3 pts.
-  # If after the items have been blanked, *item1* includes *item2*,
-  #   return 1 pts.
-  # Else, return 0 pts.
-  private def points_compare(item1 : String, item2 : String) : Int32
-    if item2.includes?(item1)
-      return 3
-    end
-
-    item1 = item1.downcase.gsub(/[^a-z0-9]/, "")
-    item2 = item2.downcase.gsub(/[^a-z0-9]/, "")
-
-    if item2.includes?(item1)
-      return 1
-    else
-      return 0
-    end
-  end
-
-  # Checks if there are any phrases in the title of the video that would
-  # indicate audio having what we want.
-  # *video_name* is the title of the video, and *query* is what the user the
-  # program searched for. *query* is needed in order to make sure we're not
-  # subtracting points from something that's naturally in the title
-  private def count_buzzphrases(query : String, video_name : String) : Int32
-    good_phrases = 0
-    bad_phrases = 0
-
-    GOLDEN_PHRASES.each do |gold_phrase|
-      gold_phrase = gold_phrase.downcase.gsub(/[^a-z0-9]/, "")
-
-      if query.downcase.gsub(/[^a-z0-9]/, "").includes?(gold_phrase)
-        next
-      elsif video_name.downcase.gsub(/[^a-z0-9]/, "").includes?(gold_phrase)
-        good_phrases += 1
+      if index > 5
+        break
      end
    end

-    GARBAGE_PHRASES.each do |garbage_phrase|
-      garbage_phrase = garbage_phrase.downcase.gsub(/[^a-z0-9]/, "")
-
-      if query.downcase.gsub(/[^a-z0-9]/, "").includes?(garbage_phrase)
-        next
-      elsif video_name.downcase.gsub(/[^a-z0-9]/, "").includes?(garbage_phrase)
-        bad_phrases += 1
+    input = 0
+    while true # loop until the input is a number from 1 to 5
+      begin
+        print Style.bold("  > ")
+        input = gets.not_nil!.chomp.to_i
+        if input < 6 && input > 0
+          break
+        end
+      rescue
+        puts Style.red("  Invalid input, try again.")
      end
    end

-    return good_phrases - bad_phrases
+    return yt_metadata[input - 1]["href"]
  end

  # Finds valid video links from a `HTTP::Client.get` request
-  # Returns an `Array` of `XML::Node`
-  private def get_video_link_nodes(response_body : String) : NODES_CLASS
+  # Returns a `YT_METADATA_CLASS` holding each video's title, href, timestamp, and duration from YouTube
+  private def get_yt_search_metadata(response_body : String) : YT_METADATA_CLASS
    yt_initial_data : JSON::Any = JSON.parse("{}")

    response_body.each_line do |line|
-      if line.includes?("window[\"ytInitialData\"]")
-        yt_initial_data = JSON.parse(line.split(" = ")[1][0..-2])
+      # As of 11/8/2020, YouTube's HTML page has a line just before this one that
+      # is literally a 'scraper_data_begin' comment
+      if line.includes?("var ytInitialData")
+        # Extract JSON data from line
+        data = line.split(" = ")[2].delete(';')
+        dataEnd = (data.index("</script>") || 0) - 1
+
+        begin
+          yt_initial_data = JSON.parse(data[0..dataEnd])
+        rescue
+          break
+        end
      end
    end

    if yt_initial_data == JSON.parse("{}")
      puts "Youtube has changed the way it organizes its webpage, submit a bug"
-      puts "on https://github.com/cooperhammond/irs"
+      puts "saying it has done so on https://github.com/cooperhammond/irs"
      exit(1)
    end

    # where the vid metadata lives
    yt_initial_data = yt_initial_data["contents"]["twoColumnSearchResultsRenderer"]["primaryContents"]["sectionListRenderer"]["contents"]

-    video_metadata = [] of Hash(String, String)
+    video_metadata = [] of VID_METADATA_CLASS

    i = 0
    while true
@ -179,11 +141,16 @@ module Youtube
        # video title
        raw_metadata = yt_initial_data[0]["itemSectionRenderer"]["contents"][i]["videoRenderer"]

-        metadata = {} of String => String
+        metadata = {} of String => VID_VALUE_CLASS

        metadata["title"] = raw_metadata["title"]["runs"][0]["text"].as_s
        metadata["href"] = raw_metadata["navigationEndpoint"]["commandMetadata"]["webCommandMetadata"]["url"].as_s
-    
+        timestamp = raw_metadata["lengthText"]["simpleText"].as_s
+        metadata["timestamp"] = timestamp
+        metadata["duration_ms"] = ((timestamp.split(":")[0].to_i * 60 +
+                               timestamp.split(":")[1].to_i) * 1000).to_s
+
+
        video_metadata.push(metadata)
      rescue IndexError
        break
@ -194,4 +161,40 @@ module Youtube

    return video_metadata
  end
+
+  # Returns a normalized watch URL if *url* points to an existing video, otherwise `nil`
+  #
+  # ```
+  # Youtube.validate_url("https://www.youtube.com/watch?v=NOTANACTUALVIDEOID")
+  # => nil
+  # ```
+  def validate_url(url : String) : String | Nil
+    uri = URI.parse url
+    return nil if !uri
+
+    query = uri.query
+    return nil if !query
+
+    # find the video ID
+    vID = nil
+    query.split('&').each do |q|
+      if q.starts_with?("v=")
+        vID = q[2..-1]
+      end
+    end
+    return nil if !vID
+
+    url = "https://www.youtube.com/watch?v=#{vID}"
+
+    # ask YouTube's oEmbed endpoint about the URL; a failed response is treated as an invalid video ID
+    params = HTTP::Params.encode({"format" => "json", "url" => url})
+    response = HTTP::Client.get "https://www.youtube.com/oembed?#{params}"
+    return nil unless response.success?
+
+    res_json = JSON.parse(response.body)
+    title = res_json["title"].as_s
+    puts Style.dim("  Video: ") + title
+
+    return url
+  end
 end
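
The `duration_ms` arithmetic in `get_yt_search_metadata` reads only the first two colon-separated fields of `lengthText`, so a string with an hours field would be interpreted as minutes and seconds. A minimal sketch of a more general conversion follows, assuming YouTube reports longer videos as `h:mm:ss`. The helper name `timestamp_to_ms` is hypothetical and is not part of this diff.

```crystal
# Hypothetical helper, not part of the diff above: converts a length string
# such as "3:45" or "1:02:03" into milliseconds.
def timestamp_to_ms(timestamp)
  # Fold the colon-separated fields from left to right, so each earlier field
  # is worth 60 times the one after it (hours -> minutes -> seconds).
  seconds = timestamp.split(":").reduce(0) { |acc, field| acc * 60 + field.to_i }
  seconds * 1000
end

puts timestamp_to_ms("3:45")    # 225000
puts timestamp_to_ms("1:02:03") # 3723000
```

Because the fold does not care how many fields there are, the same expression covers seconds-only, `mm:ss`, and `h:mm:ss` strings. The result could be stored with `.to_s`, as the diff does for `metadata["duration_ms"]`.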
Author	SHA1	Message	Date
cooperhammond	c99e8257e9	updated documentation #85	2022-02-23 10:59:15 -07:00
Cooper Hammond	3bbb0e767a	Merge pull request #84 from imsamuka/master add option to apply metadata in existing file. Apologies for the late merge, you sent this pull request right as school was beginning to pick up in earnest and I forgot about it in that rush. Thanks for the great work!	2022-01-27 11:18:58 -07:00
imsamuka	61120f21b0	add option to apply metadata in existing file	2022-01-07 22:15:27 -03:00
Cooper Hammond	390d59b9a0	Merge pull request #83 from imsamuka/fix-options Fix options and add --ask-skip	2022-01-04 08:46:37 -07:00
imsamuka	3263ff4e07	fix GET requests url encoding	2022-01-03 01:01:58 -03:00
imsamuka	3d4acdeaea	add option to skip tracks on albums/playlists	2022-01-02 20:25:47 -03:00
imsamuka	72938a9b6a	show video title from url	2022-01-02 19:24:54 -03:00
imsamuka	f962a0ab75	make youtube url validation safer	2022-01-02 18:04:05 -03:00
imsamuka	ac7bc02ec5	fix youtube urls validation	2022-01-02 17:20:37 -03:00
imsamuka	bdc63b4c35	fix --url ignoring argument on song.cr	2022-01-02 17:05:00 -03:00
imsamuka	289f1d8c63	fix video selection offset	2022-01-02 15:19:16 -03:00
Cooper Hammond	f3776613b4	update version for new binary	2021-07-12 09:10:43 -06:00
Cooper Hammond	ff3019e207	Merge pull request #78 from cooperhammond/select-vid-dl added search terms config option and cli menu	2021-04-15 11:23:38 -06:00
Cooper Hammond	fa5f3bb3b7	added search terms config option and cli menu -S or --select will allow you to choose your song, for playlists or for albums	2021-04-15 11:22:01 -06:00
Cooper Hammond	8d348031d3	update to 1.3.0	2021-04-15 09:46:55 -06:00
Cooper Hammond	92e8885ae9	Merge pull request #77 from cooperhammond/search-improvement Search improvement based on song duration	2021-04-15 09:45:27 -06:00
Cooper Hammond	5eaac33345	minor fix to include duration_ms in all song metadata	2021-04-15 09:41:13 -06:00
Cooper Hammond	8c15f7b5e2	song duration now included in ranking	2021-04-14 09:12:08 -06:00
Cooper Hammond	3f12a880e9	minor fix for cross device linking	2021-04-13 22:39:33 -06:00
Cooper Hammond	8f25eae1cb	update version	2021-01-09 15:15:19 -07:00
Cooper Hammond	124b425f55	Merge pull request #74 from luca-schlecker/GH-1 make the way mp3s are saved configurable Looks great dude, happy to see someone take an interest in the project	2021-01-03 13:16:16 -08:00
Luca Schlecker	2e8bc6c8c5	make the way mp3s are saved configurable Signed-off-by: Luca Schlecker <luca.schlecker@hotmail.com>	2020-12-30 00:52:38 +01:00
Cooper Hammond	b38bcd4ad8	Merge pull request #73 from luca-schlecker/master fix #72: Find the JSON data inside the line and trim the rest	2020-12-29 14:38:56 -08:00
Luca Schlecker	2c364c38c2	fix #72 : Find the JSON data inside the line and trim the rest Signed-off-by: Luca Schlecker <luca.schlecker@hotmail.com>	2020-12-29 21:10:45 +01:00
Cooper Hammond	c20f4309d8	Merge pull request #69 from Who23/bug-fix-68 Fix #68 Can't believe I didn't see this earlier. Thanks for the great work! I'll merge it now.	2020-11-08 01:21:30 -07:00
Cooper Hammond	047cc71b0d	added a very simple test suite	2020-11-08 01:17:45 -07:00
Cooper Hammond	a8a1c4d1c3	fix for updated yt render/metadata code	2020-11-08 00:47:23 -07:00
Cooper Hammond	bf29194042	fix for #70	2020-11-08 00:37:16 -07:00
Who23	843a5b9db1	Fix #68 Fix bug where it was assumed that every artist would be tagged with a genre	2020-09-11 13:48:12 -04:00
Cooper Hammond	58895e2e87	Merge pull request #66 from Who23/youtube-sources Add ability to specifiy youtube URLs manually Looks like wonderful stuff, I appreciate the contribution and time investment to my small homebrew project, I hope you've been getting good stuff out of it.	2020-09-08 13:54:58 -06:00
Who23	dd8c74520c	URL source for albums/playlists & Youtube module improvements - Add ability to source youtube URls for albums and playlists, through the -g flag, which prompts for user input on each song - Fix the Youtube.is_valid_url function, which now actually checks whether the given URL points to an actual video	2020-09-07 18:16:31 -04:00
Who23	e8a71b2530	Add ability to specifiy youtube URL source Added a new flag (-u) to specify a youtube URL source when downloading a single song.	2020-09-04 20:49:07 -04:00