Trouble finding the code of the read() function declared in <unistd.h>

I am trying to understand how the read(2) function works by looking at its actual implementation. First, I looked at how it is declared in the <unistd.h> header file.

In that file, I found this:

ssize_t  read(int, void *, size_t) __DARWIN_ALIAS_C(read);

Then I googled to find the actual definition of read(), and found this:

https://github.com/lattera/glibc/blob/master/io/read.c

In this code,

/* Read NBYTES into BUF from FD.  Return the number read or -1.  */
ssize_t
__libc_read (int fd, void *buf, size_t nbytes)
{
  if (nbytes == 0)
    return 0;
  if (fd < 0)
    {
      __set_errno (EBADF);
      return -1;
    }
  if (buf == NULL)
    {
      __set_errno (EINVAL);
      return -1;
    }

  __set_errno (ENOSYS);
  return -1;
}

Here are my questions:

  1. What is the __libc_ prefix before read? Why is it needed? And when a user calls read(2), how does this function end up being called?

  2. As far as I can see, this code has nothing to do with reading from a file descriptor into the buffer; it only handles possible errors (fd < 0, buf is NULL, and so on). So where is the code that actually implements read(2)?

Am I looking in the wrong place, or at the wrong source?

Chanachance answered 30/8, 2019 at 13:56 Comment(4)
Actually, I am a beginner at system programming. I want to become a system programmer, so I studied operating systems and now want to look into how read(2) is actually implemented... - Chanachance
Oh, ok. That makes more sense, sorry! - Sheathe
On Unix-like systems, at least, read is a system call. So all you'll find in the library implementation is the necessary hooks to invoke it. The "meat" of the function is all in the OS kernel. (And the example you found, for some reason, doesn't even do that.) - Stability
@SteveSummit Yup. I am aware that the read(2) function calls SYS_read, which is a system call implemented in the OS kernel. First, I want to know how read(2) calls SYS_read, I mean the actual code for that. Second, I have no idea what the code I posted, that __libc_read function, is for. - Chanachance

read (and, traditionally, all of the functions defined in "section 2" of the Unix manual -- that's what that (2) means) is a system call. That means most of the work is done by the operating system kernel, not by code in your own process. The C library only contains a system-call wrapper that executes a special instruction that transfers control to the kernel.

The code you found is a placeholder, not a system-call wrapper. As you surmised, it doesn't actually implement read. It would only ever be used temporarily, in an incomplete port to an operating system that doesn't have a system call named read. None of the complete ports in the C library you are looking at actually use that code. They instead use a real system-call wrapper. This C library automatically generates system-call wrappers at build time, so I can't link to actual code, but I can show you an example of what the generated code for a system-call wrapper might look like. (Note: this is NOT the actual code used on any operating system I am familiar with. I deliberately removed some complications.)

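    .text
    .globl read
    .type read, @function
read:
    movl $SYS_read, %eax              # system call number goes in %eax
    syscall                           # transfer control to the kernel
    testq %rax, %rax                  # set the sign flag from the result
    js .error                         # negative result: an error occurred
    ret                               # success: %rax is the byte count
.error:
    negl %eax                         # turn the negative value into an error code
    movq errno@gottpoff(%rip), %rdx   # offset of the thread-local errno
    movl %eax, %fs:(%rdx)             # store the code in this thread's errno
    movq $-1, %rax                    # report failure to the caller
    ret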

I wrote this example in x86 assembly language on purpose, because there's no way to get the special syscall instruction from plain C. Some C libraries use an "assembly insert" extension for the syscall instruction and write the rest of the wrapper in C, but for what you're trying to understand, the assembly language is what you should think about.

Inside the kernel, there's a special "trap handler" that receives control from the syscall instruction. It looks at the value in %eax, sees that it is the system call number SYS_read (the actual numeric value may vary from OS to OS), and calls the code that actually implements the read operation.
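To make the dispatch step concrete, here is a toy model in C of what the trap handler does with the system call number. Everything in it is made up for illustration (the toy_ names, the numbers 0 and 1, the fake read and write bodies); no real kernel is this simple, and real kernels differ in how they lay out the table.

```c
#include <errno.h>

/* Hypothetical system call numbers; real values vary from OS to OS. */
enum { TOY_SYS_read = 0, TOY_SYS_write = 1, TOY_NR_SYSCALLS = 2 };

typedef long (*toy_syscall_fn)(long a1, long a2, long a3);

/* Fake implementations: they only validate fd and return a plausible value. */
static long toy_do_read(long fd, long buf, long nbytes)
{
    (void) buf;
    (void) nbytes;
    return fd < 0 ? -EBADF : 0;   /* pretend every valid descriptor is at EOF */
}

static long toy_do_write(long fd, long buf, long nbytes)
{
    (void) buf;
    return fd < 0 ? -EBADF : nbytes;
}

/* One entry per system call number. */
static const toy_syscall_fn toy_syscall_table[TOY_NR_SYSCALLS] = {
    [TOY_SYS_read]  = toy_do_read,
    [TOY_SYS_write] = toy_do_write,
};

/* Stands in for the trap handler: nr plays the role of the value in %eax. */
long toy_trap_handler(long nr, long a1, long a2, long a3)
{
    if (nr < 0 || nr >= TOY_NR_SYSCALLS)
        return -ENOSYS;           /* no such system call */
    return toy_syscall_table[nr](a1, a2, a3);
}
```

Note that errors come back as negative error codes rather than through errno; turning them into errno is the job of the user-space wrapper.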

After the system call returns, the wrapper tests whether it returned a negative number. If so, that indicates an error. (Note: this is one of the places where I removed some complications.) It flips the sign of that number, copies it into errno (which is more complicated than just mov %eax, errno because errno is a thread-local variable), and returns −1. Otherwise the value returned is the number of bytes read and it returns that directly.
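For comparison, the "assembly insert" style mentioned above might look roughly like this in C. This is only a sketch under several assumptions: x86-64 Linux, GCC or Clang extended asm, and the same simplified "negative means error" test as the assembly example. The name my_read is hypothetical, chosen so it does not collide with the real read.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_read */
#include <sys/types.h>     /* ssize_t */

/* Hypothetical name; a real C library would export this as read. */
ssize_t my_read(int fd, void *buf, size_t nbytes)
{
    long ret;

    /* Assumed x86-64 Linux convention: system call number in rax,
       arguments in rdi, rsi, rdx, result back in rax.
       The syscall instruction itself overwrites rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "a" ((long) SYS_read), "D" ((long) fd),
                        "S" (buf), "d" (nbytes)
                      : "rcx", "r11", "memory");

    if (ret < 0) {             /* simplified error test, as in the assembly */
        errno = (int) -ret;    /* the compiler handles the thread-local access */
        return -1;
    }
    return ret;                /* number of bytes read */
}
```

Calling my_read(-1, buf, 1) should return -1 with errno set to EBADF, because the kernel rejects the invalid descriptor and the wrapper converts the negative result.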

The other answer links to an implementation of read but unfortunately it's from an OS kernel that's popular but complicated and difficult to understand. And I regret to say I don't have a better teaching example to point you at.


The __libc_ prefix on the read placeholder implementation is there because there are actually three different names for read in this C library: read, __read, and __libc_read. As the other answer points out, there are some special macros below the code you quoted that arrange for them all to be names for the same function. The auto-generated real system-call wrapper for read will also have all of those names.

This is a hack to achieve "namespace cleanliness", which you only need to worry about if you ever set out to implement a full-fledged and fully standards compliant C library. The short version is that there are many functions in the C library that need to call read, but they cannot use the name read to call it, because a C program is technically allowed to define a function named read itself.
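As a rough illustration of how one function can carry several names, GCC and Clang on ELF targets (Linux, for example) offer an alias attribute. The sketch below uses it directly; it is not the library's actual macro expansion, and the demo_ names are hypothetical stand-ins for __libc_read, __read and read, which ordinary code should not define.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Stand-in for __libc_read: behaves like the placeholder in the question. */
ssize_t demo_libc_read(int fd, void *buf, size_t nbytes)
{
    (void) buf;
    if (nbytes == 0)
        return 0;
    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    errno = ENOSYS;      /* nothing is actually implemented here */
    return -1;
}

/* Two more symbols that refer to the very same function.
   "weak" lets a definition elsewhere take precedence at link time. */
ssize_t demo_internal_read(int, void *, size_t)
    __attribute__((weak, alias("demo_libc_read")));
ssize_t demo_read(int, void *, size_t)
    __attribute__((weak, alias("demo_libc_read")));
```

Library-internal code would call the reserved name (demo_internal_read here), so it keeps working even if a program defines its own function under the public name.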

Incidentally, you need to take care to look at headers and implementation code belonging to the same C library. You appear to have the unistd.h from macOS on your computer, but the read code you found belongs to the GNU C Library, which is a completely different implementation. The basic declaration of read,

ssize_t read(int, void *, size_t);

is specified by the POSIX standard, so it will be the same in both, but the __DARWIN thing after that is a quirk of the macOS C library. The GNU library has a declaration with different quirks:

extern ssize_t read (int __fd, void *__buf, size_t __nbytes) __wur;
Orpiment answered 30/8, 2019 at 14:37 Comment(9)
You're forgetting about the __syscall function. That's how the real read invoked the syscall. - Whodunit
@JL2210 __syscall has extra overhead because it's variadic. I've actually never seen a C library that implemented the other system call wrappers in terms of it, I think because of that. - Orpiment
I think, with a good compiler, that __syscall shouldn't cause any issues with efficiency (as the number of arguments is fixed). - Whodunit
@JL2210 In principle, yes; in practice, nobody has ever bothered to do the legwork required to make it possible for the compiler to optimize it. It's not inlined, and it has to assume it's always being called with the maximum possible number of system call arguments. See for instance <sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/…> or <sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/…> and note all of the register shuffling. - Orpiment
Yes, the register shuffling is horrible (had to implement it myself three times). The compiler can't just recognize that the number of va_arg arguments doesn't change, and just use the registers? That seems odd (considering that it can optimize things that I'd never think of). - Whodunit
@JL2210 Maybe the compiler could do that if syscall were an inline function, but it's not, because nobody thinks it's worth bothering to optimize, because it's supposed only to be used for weird system calls that haven't been given official wrapper functions yet. (GNU libc's backlog of Linux system calls that haven't been given wrappers is longer than it ought to be but that's a separate issue.) - Orpiment
You might move the answer after "The code you found is a placeholder" up to the top; that is probably more important. Also, what are the details on "namespace cleanliness"? - Whodunit
@JL2210 Good idea about the reorganizing, I did that. Regarding namespace cleanliness, I think you should ask a new question, it's really complicated. - Orpiment
#57733629 - Whodunit

You are missing the important part of the posted code.

weak_alias (__libc_read, __read)
weak_alias (__libc_read, read)

It does not matter what prefix is used. The function __libc_read is a stub for the system call read. If the linker does not find the system call read, then the stub is used, and it fails with the error code ENOSYS.

Since read is a system call, you should look for its implementation in the OS source files. The implementation depends on the file descriptor used. For example, when read is called on Linux for a file on a filesystem, the code of read is here: http://lxr.linux.no/linux+v4.15.14/fs/read_write.c#L566

Warmhearted answered 30/8, 2019 at 14:14 Comment(0)
