How to code an epoll based sockets client in C

All the examples I can find online are servers. I want to build a basic web crawler using epoll. So I need a basic client example to get me started.

When I say basic I really mean a complete example that demonstrates multiple connections with sending and receiving of data to live web hosts. A simple HEAD request and its response for example.

Pullulate answered 10/8, 2018 at 0:5 Comment(1)
What do you think makes it any different in a client than a server? epoll() doesn't know the difference. You use it the same way whenever you have multiple sockets that you're listening for input on. - Casebound

Here is sample C code for a client socket using epoll.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <arpa/inet.h>
#include <unistd.h>

#define PORT 22
#define SERVER "127.0.0.1"
#define MAXBUF 1024
#define MAX_EPOLL_EVENTS 64

int main() {
    int sockfd;
    struct sockaddr_in dest;
    char buffer[MAXBUF];
    struct epoll_event events[MAX_EPOLL_EVENTS];
    int i, num_ready;

    /*---Open socket for streaming---*/
    if ( (sockfd = socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0)) < 0 ) {
        perror("Socket");
        exit(errno);
    }

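    /*---Add socket to epoll---*/
    int epfd = epoll_create(1);
    if (epfd < 0) {
        perror("epoll_create");
        exit(errno);
    }
    struct epoll_event event;
    /* A non-blocking connect() signals completion by making the socket
       writable, so wait for EPOLLOUT first, not EPOLLIN */
    event.events = EPOLLOUT;
    event.data.fd = sockfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sockfd, &event) != 0) {
        perror("epoll_ctl");
        exit(errno);
    }

    /*---Initialize server address/port struct---*/
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(PORT);
    /* inet_pton returns 1 on success; 0 means a malformed address (errno not set) */
    if ( inet_pton(AF_INET, SERVER, &dest.sin_addr) != 1 ) {
        fprintf(stderr, "Invalid address: %s\n", SERVER);
        exit(EXIT_FAILURE);
    }

    /*---Connect to server---*/
    if ( connect(sockfd, (struct sockaddr*)&dest, sizeof(dest)) != 0 ) {
        if(errno != EINPROGRESS) {
            perror("Connect ");
            exit(errno);
        }
    }

    /*---Wait for socket connect to complete---*/
    num_ready = epoll_wait(epfd, events, MAX_EPOLL_EVENTS, 1000/*timeout*/);
    if (num_ready <= 0) {
        fprintf(stderr, "Connect timed out or epoll_wait failed\n");
        exit(EXIT_FAILURE);
    }
    for(i = 0; i < num_ready; i++) {
        if(events[i].events & (EPOLLOUT | EPOLLERR | EPOLLHUP)) {
            /* SO_ERROR holds the outcome of the connect attempt: 0 means success */
            int err = 0;
            socklen_t len = sizeof(err);
            if (getsockopt(events[i].data.fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0 || err != 0) {
                fprintf(stderr, "Connect failed: %s\n", strerror(err ? err : errno));
                exit(EXIT_FAILURE);
            }
            printf("Socket %d connected\n", events[i].data.fd);
        }
    }

    /*---Connected: switch the socket to read events---*/
    event.events = EPOLLIN;
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, sockfd, &event) != 0) {
        perror("epoll_ctl");
        exit(errno);
    }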

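    /*---Wait for data---*/
    num_ready = epoll_wait(epfd, events, MAX_EPOLL_EVENTS, 1000/*timeout*/);
    for(i = 0; i < num_ready; i++) {
        if(events[i].events & EPOLLIN) {
            printf("Socket %d got some data\n", events[i].data.fd);
            /* Leave room for the terminating NUL so printf("%s") stays in bounds */
            ssize_t n = recv(events[i].data.fd, buffer, sizeof(buffer) - 1, 0);
            if (n > 0) {
                buffer[n] = '\0';
                printf("Received: %s", buffer);
            } else if (n == 0) {
                printf("Peer closed the connection\n");
            } else {
                perror("recv");
            }
        }
    }

    close(epfd);
    close(sockfd);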
    return 0;
}
Riplex answered 10/8, 2018 at 2:13 Comment(8)
Is this the edge-triggered or the level-triggered version? - Pullulate
By default, epoll is level-triggered. To make it edge-triggered, OR the EPOLLET bit into the events field (event.events |= EPOLLET). - Riplex
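To illustrate the point about EPOLLET: in edge-triggered mode epoll_wait reports a descriptor only when new data arrives, so the reader should keep calling recv until it returns EAGAIN. What follows is a minimal sketch, assuming a non-blocking socket. The helper names watch_edge_triggered and drain_socket are made up for illustration and are not part of the epoll API.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: register fd for edge-triggered read notifications.
 * Returns 0 on success, -1 on error (errno set by epoll_ctl). */
int watch_edge_triggered(int epfd, int fd) {
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLET;   /* EPOLLET selects edge-triggered mode */
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: read from a non-blocking fd until it would block,
 * the peer closes, or buf is full. Returns the number of bytes stored,
 * or -1 on a real error. *eof is set to 1 if the peer closed.
 * If the return value equals cap, more data may still be pending and
 * the caller should call again before returning to epoll_wait. */
ssize_t drain_socket(int fd, char *buf, size_t cap, int *eof) {
    size_t total = 0;
    *eof = 0;
    while (total < cap) {
        ssize_t n = recv(fd, buf + total, cap - total, 0);
        if (n > 0) {
            total += (size_t)n;
        } else if (n == 0) {
            *eof = 1;                 /* orderly shutdown by the peer */
            break;
        } else if (errno == EINTR) {
            continue;                 /* interrupted by a signal, retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                    /* nothing left to read for now */
        } else {
            return -1;
        }
    }
    return (ssize_t)total;
}
```

With level-triggered epoll (the default) a single recv per wakeup is enough, because epoll_wait keeps reporting the socket while data remains. With EPOLLET, leftover data is not reported again until more arrives, which is why the drain loop matters.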
How would one extend this code to handle 10,000 simultaneous connections given a file with 100,000 IP addresses? - Pullulate
When you are dealing with 100K peers there are a lot more things to deal with than just the epoll interface. For example, connection failures, retries, partial reads, slow peers, HTTP parsing errors, etc. There are many different ways it could be designed. Typically I would start with one "connection" thread for connecting to peers, and a pool of "read" threads for reading data from sockets. Use an IPC mechanism such as a pipe to pass each connected socket fd from the "connection" thread to a "read" thread. I am sure that once you start on it there will be many more scenarios like this to handle. All the best!! - Riplex
I mean do I have to call epoll_ctl(epfd, EPOLL_CTL_ADD, sockfd, &event) for every sockfd or can I pass an array of sockfd's and just call it once? Your example is unclear how it would work with multiple sockets. - Pullulate
epoll_ctl with EPOLL_CTL_ADD should be called once for each socket. And when epoll_wait is called, it will monitor all the sockets that are added and return only those FDs that have some events to consume. - Riplex
Why do you need two epoll_wait calls here? One for checking the completion of a connection and the other for checking for data? - Either
@Either The first epoll_wait waits for the socket connection to be established and the second epoll_wait waits for data. - Riplex
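The two phases described above can also be written as two small steps that a single event loop drives for many sockets. The sketch below assumes IPv4 literals and uses hypothetical helper names (start_connect, finish_connect). It relies on the documented behaviour that a non-blocking connect reports completion as writability (EPOLLOUT) and that its outcome is read with getsockopt(SO_ERROR).

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, phase 1: start a non-blocking connect and register
 * the socket for EPOLLOUT. Returns the socket fd, or -1 on failure. */
int start_connect(int epfd, const char *ip, unsigned short port) {
    struct sockaddr_in dest;
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1)
        return -1;                    /* not a valid IPv4 literal */

    int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd < 0)
        return -1;

    if (connect(fd, (struct sockaddr *)&dest, sizeof(dest)) != 0 &&
        errno != EINPROGRESS) {
        close(fd);
        return -1;
    }

    struct epoll_event ev;
    ev.events = EPOLLOUT;             /* writable means the connect finished */
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper, phase 2: call when EPOLLOUT (or EPOLLERR) fires.
 * Checks how the connect ended and, on success, switches the socket to
 * EPOLLIN. Returns 0 if connected, otherwise an errno-style code. */
int finish_connect(int epfd, int fd) {
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0)
        return errno;
    if (err != 0)
        return err;                   /* e.g. ECONNREFUSED */

    struct epoll_event ev;
    ev.events = EPOLLIN;              /* from now on, wake up for responses */
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) != 0)
        return errno;
    return 0;
}
```

In a crawler loop, an EPOLLOUT event on a socket that is still connecting would lead to finish_connect followed by sending the request (for example a HEAD request), and later EPOLLIN events would lead to recv. One epoll_wait call in one loop then serves both phases for every connection.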
