What is the best way to extract closed captions from movie files? [closed]
I need to extract closed caption information from movie files. I have tried ccextractor, but it does not seem to work.

I captured a video stream (with closed captions in it), saved it to a file, and then ran ccextractor on it, but it can't find anything!

My video samples are below:

http://dl.dropbox.com/u/10244901/gsd.mpg

http://dl.dropbox.com/u/10244901/gsd_b.mpg

First try:

cvlc -I dummy v4l2:///dev/video1:width=720:height=480:norm=ntsc:standard=ntsc:pixelformat=2:aspect-ratio=4\:3:channel=0 --sout "#transcode{vcodec=mp2v}:standard{access=file,mux=dummy,dst=gsd.mpg}"

lzzz@ideiatu:~/Downloads/ccextractor.0.64/linux$ ./ccextractor gsd.mpg 
CCExtractor 0.64, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: gsd.mpg
[Raw Mode: Broadcast] [Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
Teletext page: Autodetect]
Start credits text: [None]
Creating gsd.srt

-----------------------------------------------------------------
Opening file: gsd.mpg
File seems to be an elementary stream, enabling ES mode
Analyzing data in general mode


New video information found
[720 * 480] [AR: 02 - 4:3] [FR: 03 - 25] [progressive: yes]

133%  |  01:40
Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0

Total frames time:      00:01:41:200  (2530 frames at 25.00fps)

Min PTS:                00:00:00:000
Max PTS:                00:01:41:200
Length:                 00:01:41:200

Initial GOP time:       00:00:00:000
Final GOP time:         00:01:40:800+10F
Diff. GOP length:       00:01:40:800+10F    (00:01:41:133)
Done, processing time = 0 seconds
This is beta software. Report issues to cfsmp3 at gmail...

Second try:

cvlc -I dummy gsd.mpg --sout "#standard{access=file,mux=ts,dst=gsd_b.mpg}"



lzzz@ideiatu:~/Downloads/ccextractor.0.64/linux$ ./ccextractor gsd_b.mpg
CCExtractor 0.64, Carlos Fernandez Sanz, Volker Quetschke.
--------------------------------------------------------------------------
Input: gsd_b.mpg
[Raw Mode: Broadcast] [Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
Teletext page: Autodetect]
Start credits text: [None]
Creating gsd_b.srt

-----------------------------------------------------------------
Opening file: gsd_b.mpg
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Decode captions from MPEG-2 video stream [0x02]  -  PID: 68

New PID found: 68


New video information found
[720 * 480] [AR: 02 - 4:3] [FR: 03 - 25] [progressive: yes]

100%  |  00:00
Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0

Total frames time:      00:01:41:040  (2526 frames at 25.00fps)

Min PTS:                02:59:52:437
Max PTS:                02:59:52:677
Length:                 00:00:00:240

Initial GOP time:       00:00:00:000
Final GOP time:         00:01:40:800 +6F
Diff. GOP length:       00:01:40:800 +6F    (00:01:41:000)
Done, processing time = 0 seconds
This is beta software. Report issues to cfsmp3 at gmail...
Knowling answered 25/11, 2012 at 20:50 Comment(1)
This question, as formulated, has nothing to do with programming. There could be a question on this subject which is programming related, but this question is about finding tools to extract closed captioning, not about how to write a tool that extracts closed captions. (Nomenclator)
I
2

Some movies don't carry their captions as a separate track or file. Instead, the subtitles are hardcoded: they are part of the video image itself and cannot be separated from it as text.

You can try searching online for a standalone subtitle file for the movie.
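
To check whether a file carries a separate subtitle stream at all, one option is ffprobe (part of FFmpeg). The sketch below is only an illustration: the helper names `count_subtitle_streams` and `describe_subtitle_count` are made up for this example, and it assumes ffprobe is installed. Note that closed captions embedded inside the video stream itself may not show up as a separate subtitle stream, so a count of zero is not conclusive.

```shell
#!/bin/sh
# Hypothetical helper: count the subtitle streams ffprobe reports for a file.
# "-select_streams s" restricts the output to subtitle streams only.
count_subtitle_streams() {
    ffprobe -v error -select_streams s \
        -show_entries stream=index -of csv=p=0 "$1" | grep -c .
}

# Hypothetical helper: turn that count into a short verdict.
describe_subtitle_count() {
    if [ "$1" -gt 0 ]; then
        echo "separate subtitle stream(s) found: $1"
    else
        echo "no separate subtitle stream"
    fi
}

# Only runs when a file name is given and ffprobe is available.
if [ -n "${1:-}" ] && command -v ffprobe >/dev/null 2>&1; then
    describe_subtitle_count "$(count_subtitle_streams "$1")"
fi
```

Saved as, say, check_subs.sh (a placeholder name), it would be run as `sh check_subs.sh gsd.mpg`.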

Ithaca answered 26/2, 2015 at 12:54 Comment(0)
A
-1

A video capture card would probably keep only the picture data, so the closed captions may already be lost before the file is written.

Amundson answered 21/10, 2020 at 0:14 Comment(0)
B
-1

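Install CCExtractor. On Ubuntu:

sudo apt update
sudo apt -y install ccextractor

Alternatively, build it from source. Clone CCExtractor from GitHub (https://github.com/CCExtractor/ccextractor) into your Videos folder and run the build script:

cd ~/Videos
git clone https://github.com/CCExtractor/ccextractor
cd ccextractor/linux
./build

To get the path of the video, right-click it, open Properties, and copy the parent folder path.

Then run ccextractor on the video file itself, not on the folder (video.mpg below is a placeholder file name):

./ccextractor pathname
./ccextractor /home/user/Videos/video.mpg

By default ccextractor writes a .srt file with the same base name as the video. Now open the video in a player that picks up matching subtitle files, and you should see it with subtitles :)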
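
The extraction step can be wrapped in a small script that also reports whether any captions were actually found, which is the failure described in the question. This is a rough sketch, not part of CCExtractor: the helper name `srt_has_captions` is made up, and it assumes ccextractor is on the PATH and that its `-o` option names the output file.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the subtitle file exists and is non-empty.
srt_has_captions() {
    [ -s "$1" ]
}

# Only runs when an input file is given and ccextractor is installed.
if [ -n "${1:-}" ] && command -v ccextractor >/dev/null 2>&1; then
    out="${1%.*}.srt"              # same base name, .srt extension
    ccextractor "$1" -o "$out"     # assumption: -o sets the output file
    if srt_has_captions "$out"; then
        echo "captions written to $out"
    else
        echo "no captions found in $1"
    fi
fi
```

An empty or missing .srt after a run like this suggests the captions were never stored in the file, rather than that the extraction command was wrong.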

Benildas answered 20/5, 2023 at 2:42 Comment(0)
