Daily Programming Thread

bemis

33334
Low rep power
Joined
Mar 7, 2024
Posts
74
Rep Power
133
What are you working on? What are you building? What are you learning about?
It doesn't have to be something specifically related to programming. Doing something cool with AI, Crypto, Devices, Content?
This thread is also for shilling your favorite programming language, and calling everyone else a faggot for preferring their retarded language instead of yours.
Here is an important function you can put in your ~/.bashrc file:
Bash:
penis() {
    # prints a weiner with a length corresponding to the computer's uptime
    uptime | perl -naE 'say+8,"="x$F[2],D'
}
 

DonDon Patch

11122
Staff member
High rep power
Joined
Nov 30, 2023
Posts
2,593
Rep Power
2,151
The most powerful programming language is Lisp. If you don't know Lisp (or its variant, Scheme), you don't know what it means for a programming language to be powerful and elegant. Once you learn Lisp, you will see what is lacking in most other languages.
 

bemis

I'm using a particular Lisp dialect with one of my projects right now as it happens. Every time I transcribe a video, I parse the transcripts into a format and then load those into PostgreSQL. Here is a little portion of the .srt/.vtt parser. It's a rough first draft, I'm sure I'll look at it later and simplify it.
Code:
...
(def cli-options {:filename {:alias :f
                             :desc  "input file, must be a .srt"}})

(def options (cli/parse-opts *command-line-args* {:spec cli-options}))

(defn subtitle-split-entries [entries]
  (string/split entries #"\n\n"))

(defn ts->range [ts]
  (str "[" (first ts) ", " (second ts) ")"))

(defn subtitle-parse-ts [ts]
  (-> ts
      (string/replace #"," ".")
      (string/split #"\s*-->\s*")
      (ts->range)))

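(defn subtitle-parse-entry [entry]
  ;; line 1 is the index, line 2 the timestamps, the rest is (possibly multi-line) text
  (let [[_ t & txt] (string/split entry #"\n")
        ts          (subtitle-parse-ts t)]
    {:span ts :contents (string/join " " txt)}))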

(defn subtitle-read [file]
  {:pre [(fs/exists? file)]}
  (slurp file))

(defn subtitles-parse [subtitles]
  (mapv subtitle-parse-entry (subtitle-split-entries subtitles)))

(comment
  (def example (let [file (subtitle-read (:filename options))]
                 (subtitles-parse file))))
...
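For anyone who doesn't read Lisp, here is roughly the same parsing step as a minimal Python sketch. It is not the code I'm running; the function names are made up for illustration, and it assumes entries are separated by blank lines and that the timing line is the one containing `-->`.

```python
import re

# Entries are assumed to be separated by one or more blank lines (LF or CRLF).
ENTRY_SPLIT = re.compile(r"(?:\r?\n){2,}")
ARROW = re.compile(r"\s*-->\s*")


def parse_timestamp_range(line):
    """Turn '00:00:00,400 --> 00:00:05,160' into '[00:00:00.400, 00:00:05.160)'."""
    start, end = ARROW.split(line.strip().replace(",", "."), maxsplit=1)
    # .vtt files may put cue settings after the end time; keep only the time.
    end = end.split()[0]
    return f"[{start}, {end})"


def parse_entry(entry):
    """Parse one subtitle entry into a dict with a range string and its text."""
    lines = entry.strip().splitlines()
    # The timing line is the first line containing the arrow.
    idx = next(i for i, line in enumerate(lines) if "-->" in line)
    text = " ".join(line.strip() for line in lines[idx + 1:])
    return {"span": parse_timestamp_range(lines[idx]), "contents": text}


def parse_subtitles(text):
    """Parse a whole .srt/.vtt string, skipping blocks without a timing line."""
    return [parse_entry(e) for e in ENTRY_SPLIT.split(text.strip()) if "-->" in e]
```

The range string uses the same half-open `[start, end)` form as the Lisp version, so it can be handed to PostgreSQL as a range literal.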
 

bemis

@hackinzoomer, didn't want to spam the other thread with walls of text, but the search stuff ended up being really easy. the .srt files are simple
Code:
1
00:00:00,400 --> 00:00:05,160
Good evening, everybody. You're watching America first. My name is Nicholas J. Fuentes

2
...
so the schema ends up being small. at least for things like "between x and y look for occurrences of z"
SQL:
-- custom range type so we can exclude date
CREATE TYPE segrange AS RANGE (
       subtype = time (3) without time zone
);
CREATE TABLE episode (
       id serial PRIMARY KEY,
       release_date date NOT NULL
);
CREATE TABLE subtitle (
       id serial PRIMARY KEY,
       span segrange NOT NULL,
       contents text,
       ts tsvector GENERATED ALWAYS AS (to_tsvector('english', contents)) STORED,
       -- a foreign key should be a plain integer; serial would attach its own sequence
       episode_id integer NOT NULL REFERENCES episode(id)
);
CREATE INDEX ts_idx ON subtitle USING GIN (ts);
CREATE INDEX span_idx on subtitle USING GIST (span);
-- example text query; & requires every word to match (AND), it does not build a phrase
SELECT span, contents FROM subtitle WHERE ts @@ to_tsquery('english', 'christ & is & king');
-- example timespan query looking for spans overlapping this range
SELECT span, contents FROM subtitle WHERE span && '[01:29:27.00,01:29:32.00)';
this is what it looks like after it's loaded up
SQL:
af_fts=# select span,contents,episode_id from subtitle where ts @@ to_tsquery('english', 'christ & is & king');
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------
span       | [01:29:27.16,01:29:31.24)
contents   | Keep up the great work big guy america first christ is king
episode_id | 1
...
af_fts=# select contents from subtitle where span && '[01:29:27.00,01:29:32.00)';
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------
contents | Keep up the great work big guy america first christ is king
...
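The "between x and y look for occurrences of z" case is just both conditions in one WHERE clause. Below is a minimal Python sketch of building that query with placeholders instead of string concatenation. `build_search` is a hypothetical helper, not something from my code; the `%s` placeholders assume a psycopg-style driver, and I've swapped in `plainto_tsquery` (which ANDs plain words) so user input doesn't need tsquery operators.

```python
def build_search(words, start, end):
    """Return (sql, params) for a text search restricted to a time range.

    words: list of plain search words
    start, end: time strings such as '01:29:27.00'
    """
    sql = (
        "SELECT span, contents, episode_id FROM subtitle "
        "WHERE ts @@ plainto_tsquery('english', %s) "
        "AND span && %s::segrange"
    )
    # Half-open range literal, matching the '[start,end)' form used above.
    params = (" ".join(words), f"[{start},{end})")
    return sql, params


sql, params = build_search(["great", "work"], "01:29:27.00", "01:29:32.00")
print(sql)
print(params)
```

With a driver such as psycopg this would be run as `cursor.execute(sql, params)`; I haven't shown the connection setup since that depends on your environment.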
 

Attachments

  • test_srt.txt
    154.6 KB · Views: 20

bemis

from what I can tell, the NJF archive has 985 files with a video/mp4 mimetype. There were already some good examples out there of how to go about downloading videos from Telegram, so I'm not sure how much time I'll spend writing my own code for just getting old videos
Python:
from telethon import TelegramClient

api_id=123
api_hash="p3n15"
channel_name="NJFArchive"

client = TelegramClient('a-client-name', api_id, api_hash)

async def download_videos():
    chat = await client.get_entity(channel_name)

    # client.iter_messages(chat, offset_id=message.id)
    async for message in client.iter_messages(chat):
        if (
                hasattr(message, 'media') and
                hasattr(message.media, 'document') and
                hasattr(message.media.document, 'mime_type') and
                message.media.document.mime_type == 'video/mp4'
        ):
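            published_date = message.date.strftime("%Y-%m-%d")
            # include the message id so two videos posted on the same day don't overwrite each other
            await client.download_media(message=message, file=f"{published_date}_{message.id}.mp4")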
            print(message.id)

with client:
    client.loop.run_until_complete(download_videos())
I think in total there will be roughly 3 TB of video files, but I could be way off. Telegram also has a separate API for bots with an event-driven model, which I'll use for downloading new videos as they appear.
 

bemis

If anybody sees me sperging out and wonders what I'm up to, here is my basic plan for the subtitle search/clip renderer:
plan.png
 

nationalism_tv

34444
Low rep power
Joined
Mar 17, 2024
Posts
8
Rep Power
18
I built a bunch of stuff for cozy in the last few months - https://cozy.nationalism.tv/

1710857662850.png


it's like a social blade equivalent for cozy. It also logs every chat message ever sent and makes it searchable. AND it archives every video replay it can find and uploads it to archive.org for posterity. THOUSANDS of videos archived so far.

Had a similar idea to make all cozy videos "searchable" by running speech-to-text software over each of them, but it's unbelievably expensive whether you use something like AWS Transcribe or try to do it yourself and realize how computationally expensive the whole process is. No good solution there.
 

bemis

This site looks awesome; if you don't already work in financial tech you should start. Little things like indicating the percentage change, historic highs, and derived analytics add up to a nice interface. I'm going to have to use your VOD archive as a source here. The NJF Archive on Telegram is thorough, but Telegram itself has a lot of media features that have made it annoying to work with so far. You're right that there isn't any good solution to the transcribing problem. For now I'm limited to serially transcribing videos as I download them, with the understanding that as long as I'm persistent I'll one day catch up with Nick. There are services for it, but as you point out they're really costly. One thought was to try to solve it socially, since there isn't a technical solution: I was considering making a list of all videos and then begging groypers to download a video, transcribe it, and send me the .srt file (which is pretty small, it's just text).
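Whether serial transcription ever catches up comes down to simple arithmetic: the backlog only shrinks if more hours get transcribed per day than get published. A minimal Python sketch of that estimate follows; `days_to_catch_up` is a hypothetical helper and every number in the example is made up, not a measurement.

```python
import math


def days_to_catch_up(backlog_hours, transcribed_per_day, new_per_day):
    """Estimate days until the backlog is cleared.

    backlog_hours: hours of video not yet transcribed
    transcribed_per_day: hours of video transcribed each day
    new_per_day: hours of new video published each day
    Returns None when the backlog never shrinks.
    """
    net = transcribed_per_day - new_per_day
    if net <= 0:
        return None
    return math.ceil(backlog_hours / net)


# Made-up figures: 1000 hours of backlog, 12 hours transcribed and 2 hours published per day.
print(days_to_catch_up(1000, 12, 2))
```

Every extra person sending in .srt files raises `transcribed_per_day`, which is the whole appeal of the social approach.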
 

DonDon Patch

2,151
you now remember

 