Closed Captions in YouTube API v3

I need to read closed caption text from third-party, publicly available YouTube videos in my Java webapp, i.e. I did NOT upload the content myself.

v2 of the YouTube Data API restricted access to caption data to the person who uploaded the video, which seems a very odd restriction: it gives access to everything except this one piece of data. I expected the restriction to be removed in v3 of the API, but now the only reference to closed captions is a field that confirms whether CC is attached to the video. Even the owner can't seem to download captions now. (Is Google going to add this back, at least?)

// In the Java client, getCaption() returns the string "true" or "false"
boolean hasCaptions = Boolean.parseBoolean(video.getContentDetails().getCaption());

Using YouTube Data API v3 (via the Google Java API client) I have been able to find, authenticate and retrieve YouTube resources (videos, playlists, channels, etc.). I can do pretty much everything the API makes available; I just can't read the actual caption text.

I've also tried the unpublished timed text link workaround but this is inconsistent, doesn't work for newer content and has many encoding errors in the content it does cover.

Does anyone know of a method for retrieving caption text from a YouTube video from Java (not a .js plugin)?

[ Worst case, does anyone know of a library that lets me drive YouTube programmatically like a browser, click the transcript button on the page, and pull the transcript from there? Prowser doesn't allow click interaction and JxBrowser costs $1,300+. ]

The code below works fine and gets me to all the video data so it's the last step I need help on. I've included it here in case it's helpful to anyone who needs to get this far.

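import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestInitializer;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.youtube.YouTube;
import com.google.api.services.youtube.model.Video;
import com.google.api.services.youtube.model.VideoListResponse;
import java.io.IOException;
import java.util.List;

// Build a YouTube resource. HttpRequestInitializer is an interface, so an
// (empty) implementation is supplied; an API key needs no per-request setup.
YouTube youtube = new YouTube.Builder(new NetHttpTransport(),
                            new JacksonFactory(),
                            new HttpRequestInitializer() {
                                @Override
                                public void initialize(HttpRequest request) throws IOException {
                                }
                            })
                    .setApplicationName("caption-retrieval")
                    .build();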

// Create the video list request, it should only return one
// result
YouTube.Videos.List listVideosRequest = youtube.videos().list("id,snippet,contentDetails");
listVideosRequest.setKey(API_KEY);
listVideosRequest.setId(VIDEO_ID);

// Request is executed and video list response is returned
VideoListResponse listVideosResponse = listVideosRequest.execute();

List<Video> videos = listVideosResponse.getItems();

// Since a unique video id is given, it will only return
// one video. Would check if video has been removed in 
// production code.
Video video = videos.get(0);

// Read the remaining meta information
title = video.getSnippet().getTitle().trim();
author = video.getSnippet().getChannelTitle();

captionText = ???????
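
The removed-video check mentioned in the comment above could look roughly like the sketch below. It uses a plain List so that it runs without the API client; FirstItem and firstItem are hypothetical names, not part of the YouTube library, and the assumption is that getItems() comes back empty (or null) when the video no longer exists.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class FirstItem {

    // Returns the first element, or an empty Optional when the list is
    // null or empty (e.g. the video was removed or the id was wrong).
    static <T> Optional<T> firstItem(List<T> items) {
        if (items == null || items.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(items.get(0));
    }

    public static void main(String[] args) {
        // Stand-ins for listVideosResponse.getItems()
        List<String> found = Arrays.asList("video-a");
        List<String> missing = Collections.emptyList();

        System.out.println(firstItem(found).isPresent());
        System.out.println(firstItem(missing).isPresent());
    }
}
```

In the code above, this would replace the bare videos.get(0) call, so that a missing video is handled explicitly instead of failing with an IndexOutOfBoundsException.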

Any help is gratefully received.

Thanks,

Greg.

Bilbo answered 12/8, 2013 at 16:37 Comment(0)

We are hoping to have captions support in Data API v3 soon. You won't need to scrape the website.

Update: This has been implemented now. The docs can be found here.
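
For readers who want to see what the captions resource looks like over plain HTTP, here is a minimal sketch that only builds the request URLs for captions.list and captions.download. The class and method names (CaptionUrls, listUrl, downloadUrl) are made up for illustration; the endpoint path and the part, videoId and tfmt parameters are taken from the captions reference. Both calls need an OAuth 2.0 access token sent as an "Authorization: Bearer" header, and, as the comments below note, downloading appears to be limited by permissions on the video.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class CaptionUrls {

    private static final String BASE = "https://www.googleapis.com/youtube/v3/captions";

    // URL for captions.list: returns the caption tracks attached to a video.
    static String listUrl(String videoId) {
        return BASE + "?part=snippet&videoId=" + encode(videoId);
    }

    // URL for captions.download: fetches one track in the given format
    // (for example "srt" or "vtt").
    static String downloadUrl(String captionTrackId, String format) {
        return BASE + "/" + encode(captionTrackId) + "?tfmt=" + encode(format);
    }

    // Assumption: ids contain no spaces, so form-style encoding is adequate here.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder ids, not real videos or tracks
        System.out.println(listUrl("VIDEO_ID"));
        System.out.println(downloadUrl("TRACK_ID", "srt"));
    }
}
```

The track id passed to downloadUrl would come from the id field of an item in the captions.list response.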

Artwork answered 12/8, 2013 at 18:40 Comment(10)
Thanks @IbrahimUlukaya. That's great news. Do you have any sense of how soon that might be? Weeks, months, etc.? - Bilbo
Soon? API v2 is going away in a year, and we would like to keep uploading captions. - Syncopated
Please, what is the status of captions support in Data API v3? When will it be available? - Kizer
We are hoping to have it in trusted tester in Q1 and available to all developers right after. - Artwork
@IbrahimUlukaya For sale: 1 soul asking for v3 caption support in return. - Bummer
Add +1 soul if the new API should provide (documented) support for binary caption files. - Devotion
The new API will support binary caption files, and we hope to release it within weeks. - Artwork
@ibrahim-ulukaya, it seems this odd restriction remains in v3.0, does it? "Whilst v2 of the YouTube Data API restricted access to the caption information to the person who uploaded the video it seems like a very odd restriction" - Regenaregency
@IbrahimUlukaya Is it still impossible to list/download captions without authenticating and getting permission from the video owner? I can't think of a reason for this. - Burhans
@ibrahim-ulukaya There is some mysterious behaviour: we can't understand why some captions are downloadable and others are not. Please have a look at my response here; any insight you can add would be great. It's not about who owns the video, by the way. - Extrusion

They finally introduced the feature:

https://developers.google.com/youtube/v3/docs/captions
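
Assuming a caption track has been downloaded in SRT format (tfmt=srt), the caption text can be separated from the cue numbers and timings with a few lines of plain Java. This is a rough sketch rather than anything provided by the API client; SrtText and extractLines are hypothetical names, and it assumes reasonably well-formed SRT with blank lines between cues.

```java
import java.util.ArrayList;
import java.util.List;

final class SrtText {

    // Returns one string per cue, with the cue number and timing line removed
    // and multi-line cue text joined by single spaces.
    static List<String> extractLines(String srt) {
        List<String> out = new ArrayList<>();
        if (srt.startsWith("\uFEFF")) {
            srt = srt.substring(1); // drop a leading byte-order mark, if any
        }
        // Normalise line endings, then split on the blank lines between cues
        String[] cues = srt.replace("\r\n", "\n").trim().split("\n\\s*\n");
        for (String cue : cues) {
            String[] lines = cue.split("\n");
            int i = 0;
            if (i < lines.length && lines[i].trim().matches("\\d+")) {
                i++; // cue number
            }
            if (i < lines.length && lines[i].contains("-->")) {
                i++; // timing line, e.g. 00:00:01,000 --> 00:00:03,000
            }
            StringBuilder text = new StringBuilder();
            for (; i < lines.length; i++) {
                String t = lines[i].trim();
                if (t.isEmpty()) {
                    continue;
                }
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(t);
            }
            if (text.length() > 0) {
                out.add(text.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "1\n00:00:00,000 --> 00:00:02,000\nHello there\n\n"
                + "2\n00:00:02,000 --> 00:00:04,000\nsecond cue\ncontinues here\n";
        for (String line : extractLines(sample)) {
            System.out.println(line);
        }
    }
}
```

Joining the returned strings with spaces or newlines gives a plain-text transcript, which is what the captionText placeholder in the question was after.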
Deflagrate answered 9/12, 2016 at 5:10 Comment(0)
