Can someone explain .wav(WAVE) file headers?
A

3

15

OK, so I'm trying to make a program that will manipulate .wav files, and I've seen this question/answers, but I'm not entirely sure as to what each piece of data in the header refers to. For example, what does a "chunk" refer to? Is that a specific number of bits/bytes?

Could somebody tell me, at least for the format used in this question, what each datum being written to the .wav refers to, aside from the constant string literals and the 'data' array? In particular, I'd like to know what a "chunk" is, and how Sample Rate, Byte Rate, Bytes per Sample, and Bytes per Sample for all Channels relate. (I suspect Byte Rate is Sample Rate * Bytes per Sample, but what about the 'for all channels' one?)

Any help is appreciated.

Adagio answered 25/1, 2015 at 14:35 Comment(2)
You can probably find a few libraries that will handle all the header stuff for you; all you need to do is find them. If you use such a library, all you will have to worry about is how to actually manipulate the audio rather than how to encode or decode it. - Cytoplast
For a Java example, please refer to my answer here: https://mcmap.net/q/417097/-writing-pcm-recorded-data-into-a-wav-file-java-android - Yardmaster
A
29

It is against the board rules to just post a link, so here is the table I took from http://www.topherlee.com/software/pcm-tut-wavformat.html

Positions   Sample Value         Description
1 - 4       "RIFF"               Marks the file as a RIFF file. Characters are each 1 byte long.
5 - 8       File size (integer)  Size of the overall file minus 8 bytes, in bytes (32-bit integer). Typically, you'd fill this in after creation.
9 -12       "WAVE"               File type header. For our purposes, it always equals "WAVE".
13-16       "fmt "               Format chunk marker. Includes a trailing space (0x20), not a null byte.
17-20       16                   Length of the format data that follows (positions 21-36) - 32-bit integer
21-22       1                    Type of format (1 is PCM) - 2-byte integer
23-24       2                    Number of channels - 2-byte integer
25-28       44100                Sample rate - 32-bit integer. Common values are 44100 (CD), 48000 (DAT). Sample rate = number of samples per second, or Hertz.
29-32       176400               Byte rate: (Sample Rate * BitsPerSample * Channels) / 8 - 32-bit integer
33-34       4                    Bytes per sample for all channels: (BitsPerSample * Channels) / 8. 1 = 8-bit mono, 2 = 8-bit stereo or 16-bit mono, 4 = 16-bit stereo.
35-36       16                   Bits per sample
37-40       "data"               "data" chunk header. Marks the beginning of the data section.
41-44       File size (data)     Size of the data section in bytes, i.e. file size minus the 44-byte header.

Sample values are given above for a 16-bit stereo source.
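
To make the relationship between the rate fields concrete, here is a minimal Java sketch. The class and method names are hypothetical (they come from no library), and it assumes PCM data whose bits per sample is a multiple of 8. "Bytes per sample for all channels" is the size of one sample frame (often called block align), and the byte rate is the sample rate times that frame size.

```java
// Hypothetical helper, not part of any library: derives the two computed header fields.
// Assumes PCM with bitsPerSample being a multiple of 8.
final class WavRates {
    /** Bytes in one sample frame, i.e. one sample for every channel ("block align"). */
    static int blockAlign(int channels, int bitsPerSample) {
        return channels * bitsPerSample / 8;
    }

    /** Bytes of audio data consumed per second of playback. */
    static int byteRate(int sampleRate, int channels, int bitsPerSample) {
        return sampleRate * blockAlign(channels, bitsPerSample);
    }

    public static void main(String[] args) {
        // 16-bit stereo at 44.1 kHz, the sample values used in the table
        System.out.println(blockAlign(2, 16));       // 4
        System.out.println(byteRate(44100, 2, 16));  // 176400
    }
}
```

So the asker's guess is right for mono; with more channels, each sample instant carries one sample per channel, which is where the "for all channels" factor comes from.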

Update/Reminder

All header integers are stored least significant byte first (little-endian), so the two-byte channel field 0x01 0x00 is actually 0x0001, i.e. mono.
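
As an illustration of that byte order, the following Java sketch decodes a two-byte field by hand and a four-byte field with ByteBuffer. The class and method names are made up for this example; only the java.nio calls are real API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: decoding little-endian header fields.
final class LittleEndianDemo {
    /** Combines two bytes stored least significant first into an int in 0..65535. */
    static int u16le(byte lo, byte hi) {
        return (lo & 0xFF) | ((hi & 0xFF) << 8);
    }

    /** Decodes the 4 bytes starting at offset as a little-endian 32-bit integer. */
    static int i32le(byte[] b, int offset) {
        return ByteBuffer.wrap(b, offset, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        // The channel field from the note above: bytes 0x01 0x00
        System.out.println(u16le((byte) 0x01, (byte) 0x00)); // 1 (mono)

        // A sample rate of 44100 is 0x0000AC44, stored as 44 AC 00 00
        byte[] rate = {0x44, (byte) 0xAC, 0x00, 0x00};
        System.out.println(i32le(rate, 0)); // 44100
    }
}
```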

Alfieri answered 25/1, 2015 at 15:1 Comment(3)
So the file size values are 32-bit integers containing the number of 8-bit bytes in the entire file and in the data section, respectively? Is the sample rate 32-bit or 32-byte? If I'm not mistaken, the latter would be much larger than necessary at 256 bits. - Adagio
32-bit is the answer. It takes 4 bytes of header space, to be precise. - Alfieri
FYI, that "trailing null" in fmt isn't actually a null byte, but a space. ffprobe was very adamant about that. - Flounder
F
7

[Image: diagram of the WAV header fields, with the endianness of each field marked]

I know OP tagged the question as Java, but here's complete Kotlin code for reading the header that could pass for Java. Decoding little-endian values by hand could be tricky, but thankfully we don't have to do that: ByteBuffer handles it.

import java.io.InputStream
import java.net.URL
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Parses the canonical 44-byte header: "RIFF" chunk, 16-byte "fmt " chunk, "data" chunk header.
class WaveHeader(bytes: ByteArray) {
    init {
        require(bytes.size >= SIZE) { "Input size must be at least $SIZE bytes" }
    }

    // Read cursor; each field read below advances it.
    private var start = 0
    private val riff = RiffChunk(
        String(bytes.copyOfRange(start, start + 4))
            .also {
                require(it == "RIFF") { "$it must be 'RIFF'" }
                start += it.length
            },
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 4)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.int,
        String(bytes.copyOfRange(start, start + 4))
            .also {
                require(it == "WAVE") { "$it must be 'WAVE'" }
                start += it.length
            }
    )
    private val format = FormatChunk(
        // The ID is "fmt " with a trailing space (0x20): compare three bytes, skip four
        String(bytes.copyOfRange(start, start + 3))
            .also {
                require(it == "fmt") { "$it must be 'fmt'" }
                start += 4
            },
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 4)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.int,
        // Read the format code once; a second read from the 2-byte buffer would underflow
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 2)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.short
            .let { if (it == 1.toShort()) "PCM" else "OTHER ($it)" },
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 2)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.short,
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 4)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.int,
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 4)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.int,
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 2)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.short,
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 2)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.short
    )
    // Only meaningful for canonical headers, where these 4 bytes are exactly "data".
    // NUL bytes are stripped for display, so on any other layout the cursor
    // advances by fewer than 4 bytes and the size below is read from the wrong place.
    private val `data` = DataChunk(
        String(bytes.copyOfRange(start, start + 4))
             // remove all null chars
            .replace("\u0000", "")
            .also { start += it.length },
        ByteBuffer.wrap(bytes.copyOfRange(start, start + 4)).order(ByteOrder.LITTLE_ENDIAN)
            .also { start += it.capacity() }.int
    )

    init {
        // Only checked when the JVM runs with assertions enabled (-ea)
        assert(start == 44) { "Illegal state" }
    }

    data class RiffChunk(val id: String, val size: Int, val format: String)
    data class FormatChunk(
        val id: String, val size: Int, val format: String, val numChannels: Short,
        val sampleRate: Int, val byteRate: Int, val blockAlign: Short, val bitsPerSample: Short
    )

    data class DataChunk(val id: String, val size: Int)

    override fun toString(): String {
        val ls = System.lineSeparator()
        return "WaveHeader($ls\t$riff$ls\t$format$ls\t$`data`$ls)"
    }

    companion object {
        const val SIZE = 44

        fun fromPath(path: String): WaveHeader  = fromInputStream(WaveHeader::class.java.getResourceAsStream(path))

        fun fromUrl(url: String): WaveHeader  = fromInputStream(URL(url).openStream())

        private fun fromInputStream(input: InputStream): WaveHeader {
            // readNBytes(int) requires Java 11 or later
            val bytes = input.use {
                it.readNBytes(SIZE)
            }
            return WaveHeader(bytes)
        }
    }
}

fun main(args: Array<String>) {
    if (args.isEmpty()) {
        System.err.println("Argument is missing")
        return
    }
    println(WaveHeader.fromUrl(args[0]))
}

Running with this URL produces the output:

WaveHeader(
    RiffChunk(id=RIFF, size=168050, format=WAVE)
    FormatChunk(id=fmt, size=18, format=PCM, numChannels=1, sampleRate=16000, byteRate=32000, blockAlign=2, bitsPerSample=16)
    DataChunk(id=fa, size=1952670054)
)
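
Note that the DataChunk line in this output does not describe a real data chunk. The fmt chunk of that file is 18 bytes long rather than 16, so byte offset 36 lands on two zero bytes followed by what appears to be the start of a "fact" chunk: "fa" is the first half of that ID, and 1952670054 is 0x74636166, i.e. the ASCII bytes "fact" read as a little-endian integer. A more tolerant reader walks the chunk list instead of assuming fixed offsets. Below is a minimal Java sketch of that idea (Java because the question is tagged Java). `WavChunks` and its methods are hypothetical names, and the sketch assumes the whole file fits in memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: locates chunks by walking the RIFF chunk list.
final class WavChunks {
    /** Returns chunk ID -> byte offset of that chunk's payload, in file order. */
    static Map<String, Integer> offsets(byte[] file) {
        if (file.length < 12 || !fourCc(file, 0).equals("RIFF") || !fourCc(file, 8).equals("WAVE")) {
            throw new IllegalArgumentException("not a RIFF/WAVE file");
        }
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        Map<String, Integer> result = new LinkedHashMap<>();
        int pos = 12; // first chunk starts right after "RIFF", size, "WAVE"
        while (pos + 8 <= file.length) {
            String chunkId = fourCc(file, pos);
            long size = buf.getInt(pos + 4) & 0xFFFFFFFFL; // size field is unsigned
            result.putIfAbsent(chunkId, pos + 8);
            // The size excludes the 8-byte chunk header; odd-sized payloads are padded by one byte
            long next = pos + 8 + size + (size & 1);
            if (next > file.length) break; // truncated file or last chunk
            pos = (int) next;
        }
        return result;
    }

    /** Reads a 4-character chunk ID such as "fmt " or "data". */
    static String fourCc(byte[] file, int offset) {
        return new String(file, offset, 4, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws java.io.IOException {
        byte[] file = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(args[0]));
        offsets(file).forEach((id, off) -> System.out.println("'" + id + "' payload at byte " + off));
    }
}
```

With offsets like these, the format fields can be read relative to the "fmt " payload and the samples relative to the "data" payload, regardless of which optional chunks sit in between.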
Faddish answered 27/7, 2020 at 10:34 Comment(4)
Are you sure about the endiannesses in the picture? My WAV properties Java code works with the ByteBuffer order set to little-endian for all of those fields. In particular, the second field, ChunkSize, is read correctly with bb.order(ByteOrder.LITTLE_ENDIAN); - Medulla
@Kurskinen Yes, I'm sure that the information in the pic is correct. I don't understand your question. The picture shows ChunkSize stored as little-endian, and you say your code is able to read it that way, so where's the problem? Are you saying you're able to correctly read big-endian fields using little-endian order? - Faddish
Thanks for your kind reply. Yes! That is exactly what I mean. I played with ByteBuffer.order and it has no effect on the big-endian fields (for example the first one, "RIFF"), even though in the debugger the buffer shows bigEndian changed from true to false and nativeByteOrder changed to true. So my question is: why does setting ByteBuffer.order() to little-endian not matter for those big-endian fields, while the little-endian fields are destroyed without it? I can see that the fields that succeed with either order are character fields and the ones that fail are numbers. But what is the mechanism? - Medulla
OK, I think I got it (maybe...). Because a character takes only 1 byte, there is no order. Order is meaningful only when a value is combined from two or more bytes. So those character fields in the picture could just as well be marked as little-endian and no difference could be detected? My sequence is: fis.read(properties_bytes,0,4); byte[]chunkID_bytes = new byte[4]; bb_properties.get(chunkID_bytes); - Medulla
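
The conclusion in the last comment can be checked with a few lines of Java. This is only a sketch with made-up names (`ByteOrderDemo`, `readId`, `readSize`): reading individual bytes from a ByteBuffer returns them in storage order whatever the buffer's byte order is, while a multi-byte read such as getInt depends on it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: byte order only matters for multi-byte values.
final class ByteOrderDemo {
    /** Reads the first 4 bytes as text; byte-at-a-time reads ignore the buffer's order. */
    static String readId(byte[] header, ByteOrder order) {
        byte[] id = new byte[4];
        ByteBuffer.wrap(header).order(order).get(id);
        return new String(id, StandardCharsets.US_ASCII);
    }

    /** Reads the 4 bytes at offset 4 as one int; here the order decides the result. */
    static int readSize(byte[] header, ByteOrder order) {
        return ByteBuffer.wrap(header).order(order).getInt(4);
    }

    public static void main(String[] args) {
        // "RIFF" followed by the size 2084 (0x00000824) stored little-endian
        byte[] header = {'R', 'I', 'F', 'F', 0x24, 0x08, 0x00, 0x00};

        System.out.println(readId(header, ByteOrder.BIG_ENDIAN));      // RIFF
        System.out.println(readId(header, ByteOrder.LITTLE_ENDIAN));   // RIFF
        System.out.println(readSize(header, ByteOrder.LITTLE_ENDIAN)); // 2084
        System.out.println(readSize(header, ByteOrder.BIG_ENDIAN));    // a different, wrong value
    }
}
```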
L
1

Each size field counts only the bytes that follow it, i.e. it does not include the chunk's own 4-byte ID or the 4-byte size field itself.
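
A small Java sketch of how this rule plays out when writing the canonical 44-byte PCM header. `WavHeaderWriter` and `pcmHeader` are hypothetical names, and the sketch assumes a header with no extra chunks, bits per sample that is a multiple of 8, and an even number of data bytes (so no pad byte is needed).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the canonical 44-byte PCM header.
final class WavHeaderWriter {
    static byte[] pcmHeader(int sampleRate, int channels, int bitsPerSample, int dataBytes) {
        int blockAlign = channels * bitsPerSample / 8;
        ByteBuffer h = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        h.put(ascii("RIFF"))
         .putInt(36 + dataBytes)          // everything after this field: 44 - 8 header bytes + data
         .put(ascii("WAVE"))
         .put(ascii("fmt "))
         .putInt(16)                      // the 16 bytes of format fields that follow
         .putShort((short) 1)             // 1 = PCM
         .putShort((short) channels)
         .putInt(sampleRate)
         .putInt(sampleRate * blockAlign) // byte rate
         .putShort((short) blockAlign)
         .putShort((short) bitsPerSample)
         .put(ascii("data"))
         .putInt(dataBytes);              // only the sample bytes that follow
        return h.array();
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // One second of 16-bit stereo audio at 44.1 kHz is 176400 data bytes
        byte[] header = pcmHeader(44100, 2, 16, 176400);
        System.out.println(header.length); // 44
    }
}
```

The RIFF size of 36 + dataBytes is the whole file (44 + dataBytes) minus the 8 bytes taken by "RIFF" and the size field itself, which matches the "file size - 8" entry in the table in the accepted answer.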

Lexington answered 5/8, 2023 at 1:11 Comment(0)
