Sub-to-SRT updated for YouTube transcripts

jameshale
VIP Livecode Opensource Backer
Posts: 474
Joined: Thu Sep 04, 2008 6:23 am
Location: Melbourne Australia

Post by jameshale » Sun Oct 09, 2022 1:49 am

I have just updated the sample stack "sub-to-srt" so that it now converts YouTube's transcripts (sbv extension) to srt files.

Perhaps it is age, but I am increasingly finding it difficult to catch all the words spoken in videos.
The LiveCode videos posted on YouTube are a case in point. Although I have subscribed to most of the "global" events and watch them when I can, given the timezone differences, I usually find myself waiting until they are posted on my account page.
Previously I would then download the YouTube version in order to have closed captions.
Unfortunately, the apps I use to download YouTube videos no longer capture the captions.
However, while on YouTube you can show the transcript and copy the text.
If you do so and save the text in a file with a ".sbv" extension, the "sub-to-srt" stack will now convert it into a correctly formatted ".srt" file, which most media players can read.

I use "sub-to-srt" as a standalone app on my Mac.

To see the transcript of a YouTube video:
Turn on captions, if not already turned on. It's the "CC" icon at the bottom of the video.
Click the gear icon to adjust any available settings.
At the end of the line of icons with "like", "dislike", etc., you will see an ellipsis "…"
Click on this and select "show transcript".
It will appear to the right of the video.
Simply select and copy the text.

richmond62
Livecode Opensource Backer
Posts: 9454
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Post by richmond62 » Sun Oct 09, 2022 8:01 am

One of the reasons you may find it difficult to make out what YouTubers are saying is because an awful lot of them speak through their fundament. LOL.

Post by jameshale » Sun Oct 09, 2022 9:43 am

It's not YouTubers, it's Kevin (that accent) and Mark (only speaks at high speed).
As for the LiveCoders themselves, sorry.

jacque
VIP Livecode Opensource Backer
Posts: 7258
Joined: Sat Apr 08, 2006 8:31 pm
Location: Minneapolis MN

Post by jacque » Sun Oct 09, 2022 5:40 pm

It took me a couple of years to fully understand Kevin, but now I don't notice the accent any more. Mark was harder and took longer than that.

At the first conference in Edinburgh I remarked to Ollie (he was on the team at the time) that I had trouble understanding Mark and he said, "I have trouble understanding him and I'm from the same town!"

You are not alone.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com

bobcole
VIP Livecode Opensource Backer
Posts: 136
Joined: Tue Feb 23, 2010 10:53 pm
Location: Saint Louis, Missouri USA

Post by bobcole » Mon Oct 10, 2022 2:44 am

jameshale wrote:
...
Click on this and select "show transcript".
It will appear to the right of the video.
Simply select and copy the text.

I didn't know about the transcript feature of YouTube videos, thanks for pointing it out. Thanks also for pointing out the Sub-to-srt Sample Stack.
Unfortunately, when I tried to copy the transcript from YouTube, I was only able to select one line of caption at a time, without the timecode. Awkward and time consuming to get even a little of the transcript.
Is there an easy way to get the whole transcript at once, in the "sub" or "sbv" format? Perhaps a YouTube account is required (I don't have one at the moment)?
Having the text of the transcripts would be a wonderful enhancement to my learning resources.
Thanks,
Bob

Post by jameshale » Tue Oct 11, 2022 12:24 am

Yeah, it is a strange thing.
Click on one line and that's it. You cannot extend the selection.
BUT simply click AND drag down from the first line and a selection extends.
Once it has started, the complete transcript has been selected. You do not need to drag all the way down.

Post by bobcole » Wed Oct 12, 2022 5:45 am

jameshale:
Thanks for the suggestion. I was able to select whole transcripts!
I reformatted them in BBEdit. The text is difficult to read without punctuation or capitalization and with a timecode at the start of each line. Also, I see a number of goofs in the transcriptions (e.g., "Web Camp" is sometimes transcribed as "webcam").
I am happy with the results however because, despite transcription errors, the text of videos can now be searched.
The plain text (UTF-8) files from BBEdit are fairly small: Web Camp Sessions 1 and 2 are 76 KB and 52 KB respectively.
Here is a sample from the first Web Camp video:
...
2:03 hello and uh very warm welcome to session one of webcamp thanks very much
2:08 for taking the time to join us today i'm kevin miller live code ceo
2:14 and i am truly delighted to be able to present our brand new and very exciting vision
2:20 for live code on the web in today's webinar i'll talk a bit about what that vision is
...
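For readers without the stack to hand, here is a rough Python sketch of how lines like the sample above could be turned into srt cues. It is not how the "sub-to-srt" stack works internally. It assumes each pasted line starts with an M:SS or H:MM:SS timecode followed by a space and the caption text, that each cue ends when the next one starts, and that the last cue lasts a made-up 4 seconds. The function names are hypothetical.

```python
import re

# Matches a pasted transcript line such as "2:03 hello and welcome".
# Assumed layout: M:SS or H:MM:SS, whitespace, then the caption text.
LINE_RE = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.*)$")


def parse_line(line):
    """Return (start_seconds, text), or None if the line has no timecode."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    hours, minutes, seconds, text = m.groups()
    start = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return start, text


def srt_time(total_seconds):
    """Format whole seconds as an SRT timestamp, HH:MM:SS,mmm."""
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d},000"


def transcript_to_srt(text, last_cue_seconds=4):
    """Build SRT text; each cue ends where the next one begins (assumption)."""
    cues = [c for c in map(parse_line, text.splitlines()) if c]
    blocks = []
    for i, (start, caption) in enumerate(cues):
        # The final cue has no successor, so give it an assumed duration.
        end = cues[i + 1][0] if i + 1 < len(cues) else start + last_cue_seconds
        blocks.append(
            f"{i + 1}\n{srt_time(start)} --> {srt_time(end)}\n{caption}\n"
        )
    return "\n".join(blocks)
```

Calling transcript_to_srt on the pasted text and saving the result with a ".srt" extension should give a file in the usual numbered-cue layout. Lines without a leading timecode are skipped in this sketch.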
I played with your sub-to-srt Sample Stack and worked through it to create srt files.
The srt files are recognized by the VLC app on my Mac. They don't play or do anything but I didn't expect they would.

I'm quite happy to have the plain text files.
Thank you for your guidance,
Bob

Post by jameshale » Wed Oct 12, 2022 6:01 am

When you say the srt files don't play or do anything, what do you mean?
If the srt file is in the same folder as the video file, you should be able to select the srt file when you play the video. If the video and srt file have the same name, VLC should see it and play it automatically with the video.
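As a small illustration of the "same name" rule, this Python sketch works out what the srt file would need to be called for a given video. The function name and the example file name are hypothetical.

```python
from pathlib import Path


def matching_srt_path(video_path):
    """Return the .srt path that shares the video's folder and base name."""
    return Path(video_path).with_suffix(".srt")
```

For a video saved as "Videos/webcamp1.mp4" this gives "Videos/webcamp1.srt", so the srt file sits next to the video and differs only in its extension.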

Post by bobcole » Wed Oct 12, 2022 6:11 pm

jameshale:
My initial goal was just to get the text so I could conduct searches on various terms.
Following your suggestion, I downloaded the video file and put the srt file in the same folder. The transcript displays perfectly.
Thank you for your good advice.
Bob

Post by jameshale » Wed Oct 12, 2022 6:46 pm

👍

Post by jacque » Wed Oct 12, 2022 7:11 pm

BTW if you only want the text without the time codes (not for use with VLC) then BBEdit can strip those out with a regular expression. Let us know if that's something you want.
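
As a rough idea of what such a pattern might look like, here is a Python sketch. It is not a BBEdit recipe. It assumes each line begins with an M:SS or H:MM:SS timecode followed by spaces or tabs, as in the sample Bob posted. The function name is hypothetical.

```python
import re

# A leading timecode such as "2:03 " or "1:02:03 " at the start of any line.
TIMECODE_RE = re.compile(r"^\d+(?::\d{2}){1,2}[ \t]+", re.MULTILINE)


def strip_timecodes(text):
    """Remove the leading timecode from every line that has one."""
    return TIMECODE_RE.sub("", text)
```

Lines that do not start with a timecode are left untouched.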

Post by bobcole » Wed Oct 12, 2022 8:01 pm

Jacque:
I think I’ll keep the time codes in my text files to identify the locations in the videos. In case I want to view a video I’ll know where to look.
Thanks for the idea though.
Bob
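
A minimal Python sketch of the kind of search this makes possible: given transcript text with a timecode at the start of each line, list where a term is mentioned. It assumes the timecode is the first space-separated item on each line, and the function name is hypothetical.

```python
def find_term(transcript, term):
    """Return (timecode, text) pairs for lines whose text contains term."""
    hits = []
    needle = term.lower()
    for line in transcript.splitlines():
        # Split off the leading timecode; the rest is the caption text.
        timecode, _, text = line.strip().partition(" ")
        if needle in text.lower():
            hits.append((timecode, text))
    return hits
```

The match is case-insensitive, which suits the all-lowercase auto-generated captions. Each timecode returned shows where to jump to in the video.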
