How do I run an LDAP query using R?

I want to query an LDAP directory to see how employees are distributed across departments and groups.

Something like: "Give me the department name of all the members of a group", and then use R to run a frequency analysis. However, I cannot find any examples of how to connect to a directory and run an LDAP query using R.

RCurl seems to have some kind of support ( http://cran.r-project.org/web/packages/RCurl/index.html ):

Additionally, the underlying implementation is robust and extensive, supporting FTP/FTPS/TFTP (uploads and downloads), SSL/HTTPS, telnet, dict, ldap, and also supports cookies, redirects, authentication, etc.

But I am no expert in R and have not been able to find a single example that uses RCurl (or any other R library) to do this.

Right now I am using curl like this to obtain the members of a group:

curl "ldap://ldap.replaceme.com/o=replaceme.com?memberuid?sub?(cn=group-name)"

Does anyone here know how to do the same in R with RCurl?

Braswell answered 1/4, 2014 at 18:23 Comment(5)
We'd need to know a bit more about the LDAP server config. An example LDAP query via curl: curl -u USERNAME 'ldap://192.168.0.66/CN=Users,DC=training,DC=local\?sAMAccountName?sub?(ObjectClass=*)' (that's from an IBM example). It won't work for you as-is, since you need to know the proper search parameters. It's pretty straightforward to run that via RCurl and then process the results, but you should get the query working from curl on the command line first. - Archivist
Right now I am retrieving the list of members of a group like this: ldapsearch -t -h ldap.replaceme.com -x -b "o=replaceme.com" "(cn=group-name)" memberuid - Braswell
@Archivist if you can translate my ldapsearch to curl and then to R with RCurl, that would be the exact answer I am looking for... - Braswell
Hi @Archivist, I have translated my ldapsearch query to curl... Can you tell me how to run it with RCurl? - Braswell
Already did it myself... but thanks a lot for your guidance @Archivist :-) - Braswell

Found the answer myself:

First run these commands to make sure RCurl is installed (as described in http://www.programmingr.com/content/webscraping-using-readlines-and-rcurl/ ):

install.packages("RCurl", dependencies = TRUE)
library("RCurl")

Then use getURL with an LDAP URL (as described in http://www.ietf.org/rfc/rfc2255.txt, although I couldn't understand it until I read http://docs.oracle.com/cd/E19396-01/817-7616/ldurl.html and saw the format ldap[s]://hostname:port/base_dn?attributes?scope?filter):

getURL("ldap://ldap.replaceme.com/o=replaceme.com?memberuid?sub?(cn=group-name)")
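
To make the URL format above more concrete, here is a minimal sketch that assembles such a URL from its parts. build_ldap_url is a hypothetical helper written for illustration (it is not part of RCurl), and it does no percent-encoding, so it assumes the base DN and filter contain no characters that need escaping.

```r
# Hypothetical helper (not part of RCurl): assemble an LDAP URL of the form
# ldap[s]://hostname:port/base_dn?attributes?scope?filter
build_ldap_url <- function(host, base_dn, attributes, scope = "sub",
                           filter = "(objectClass=*)", port = NULL,
                           secure = FALSE) {
  scheme <- if (secure) "ldaps" else "ldap"
  hostport <- if (is.null(port)) host else paste0(host, ":", port)
  # Several attributes are separated by commas in the URL
  attrs <- paste(attributes, collapse = ",")
  # No percent-encoding is done here; values with spaces or '?' would need it
  paste0(scheme, "://", hostport, "/", base_dn,
         "?", attrs, "?", scope, "?", filter)
}

url <- build_ldap_url("ldap.replaceme.com", "o=replaceme.com",
                      attributes = "memberuid", filter = "(cn=group-name)")
print(url)
```

The resulting string should match the URL passed to getURL above, so it can be handed to getURL in the same way.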
Braswell answered 1/4, 2014 at 22:9 Comment(1)
On a related note, this is an excellent guide to using RCurl: omegahat.org/RCurl/RCurlJSS.pdf - Braswell

I've written a function here to parse ldap output into a dataframe, and I used the examples provided as a reference for getting everything going.

I hope it helps someone!

library(RCurl)
library(gtools)

parseldap<-function(url, userpwd=NULL)
{
  ldapraw<-getURL(url, userpwd=userpwd)
  # put a blank line after each DN line, then split entries on blank lines
  ldapraw<-gsub("(DN: .*?)\n", "\\1\n\n", ldapraw)
  ldapsplit<-strsplit(ldapraw, "\n\n")
  ldapsplit<-unlist(ldapsplit)
  # init list and count
  mylist<-list()
  count<-0
  for (ldapline in ldapsplit) {
    # if this is the beginning of the entry
    if(grepl("^DN:", ldapline)) {
      count<-count+1
      # once the first entry is complete, turn it into the initial data frame
      if(count == 2 ) {
        df<-data.frame(mylist)
        mylist<-list()
      }
      if(count > 2) {
        df<-smartbind(df, mylist)
        mylist<-list()
      }
      mylist["DN"] <-gsub("^DN: ", "", ldapline)
    } else {
      linesplit<-unlist(strsplit(ldapline, "\n"))
      if(length(linesplit) > 1) {
        for(line in linesplit) {
          linesplit2<-unlist(strsplit(line, "\t"))
          linesplit2<-unlist(strsplit(linesplit2[2], ": "))
          if(!is.null(unlist(mylist[linesplit2[1]]))) {
            x<-strsplit(unlist(mylist[linesplit2[1]]), "|", fixed=TRUE)

            x<-append(unlist(x), linesplit2[2])
            x<-paste(x, sep="", collapse="|")
            mylist[linesplit2[1]] <- x
          } else {
            mylist[linesplit2[1]] <- linesplit2[2]  
          }
        }
      } else {
        ldaplinesplit<-unlist(strsplit(ldapline, "\t"))
        ldaplinesplit<-unlist(strsplit(ldaplinesplit[2], ": "))
        mylist[ldaplinesplit[1]] <- ldaplinesplit[2]
      }

    }

  }
  if(count == 1 ) {
    df<-data.frame(mylist)
  } else {
    df<-smartbind(df, mylist)
  }
  return(df)
}
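
Because parseldap joins repeated attribute values with "|", a frequency analysis on its result might look roughly like the sketch below. The data frame is made up to stand in for a parseldap() result; the DN values, the memberuid column and the uids are assumptions, not real output.

```r
# Made-up stand-in for a parseldap() result; column names are assumptions
ldap_df <- data.frame(
  DN = c("cn=group-a,o=replaceme.com", "cn=group-b,o=replaceme.com"),
  memberuid = c("alice|bob|carol", "bob|dave"),
  stringsAsFactors = FALSE
)

# Split the "|"-joined values back into one character vector per entry
members <- strsplit(ldap_df$memberuid, "|", fixed = TRUE)

# Number of members per group
group_sizes <- setNames(lengths(members), ldap_df$DN)

# How often each uid occurs across all groups
uid_freq <- table(unlist(members))

print(group_sizes)
print(sort(uid_freq, decreasing = TRUE))
```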
Tsarevna answered 1/7, 2014 at 4:49 Comment(0)
loginLDAP <- function(username, password) {

  ldap_url <- "ldap://SERVER-NAME-01.companyname.com"

  handle <- curl::new_handle(timeout = 10)

  curl::handle_setopt(handle = handle, userpwd = paste0("companyname\\", username, ":", password))

  tryCatch(
    {
      response <- curl::curl_fetch_memory(url = ldap_url, handle = handle)

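      # A status code of 0 is treated as a successful bind
      if (response$status_code == 0) {
        return(list(success = TRUE, username = username))
      } else {
        # Return a failure result instead of the value of print()
        message("Invalid login credentials.")
        return(list(success = FALSE, username = username))
      }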
    },
    error = function(e) {
      return(list(success = FALSE, username = username))
    }
  )
}
Xantho answered 24/7, 2023 at 20:5 Comment(0)

I followed this strategy:

  1. Run a Perl script with an LDAP query and write the data to disk as JSON.
  2. Read the JSON structure into R and create a data frame.

For step (1), I used this script:

#use Modern::Perl;
use strict;
use warnings;
use feature 'say';
use Net::LDAP;
use JSON;
# Perl does not expand "~", so build the path from $HOME
chdir("$ENV{HOME}/git/_my/R_one-offs/R_grabbag") or die "Cannot chdir: $!";
my $ldap = Net::LDAP->new( 'ldap.mydomain.de' ) or die "$@";
my $outfile = "ldapentries_mydomain_ldap.json";
my $mesg = $ldap->bind ;    # an anonymous bind
# get all cn's (= all names)
$mesg = $ldap->search(
                base   => "ou=People,dc=mydomain,dc=de",
                filter => "(cn=*)"
              );

my $json_text = "";
my @entries;

foreach my $entry ($mesg->entries){
 my %entry;
 foreach my $attr ($entry->attributes) {
    foreach my $value ($entry->get_value($attr)) {
      $entry{$attr} = $value;
    }
  }
  push @entries, \%entry;
}

$json_text = to_json(\@entries);
say "Length json_text: " . length($json_text);


open(my $FH, ">", $outfile) or die "Cannot open $outfile: $!";
print $FH $json_text;
close($FH);
$mesg = $ldap->unbind;

You might need to check the maximum number of entries the LDAP server returns per query. See https://serverfault.com/questions/328671/paging-using-ldapsearch

For step (2), I used this R code:

setwd("~/git/_my/R_one-offs/R_grabbag")
library(rjson)
# read into R list, from file, created from perl script
json <- rjson::fromJSON(file="ldapentries_mydomain_ldap.json",method = "C")
head(json)

# create a data frame from list
library(reshape2)
library(dplyr)
library(tidyr)

# not really efficient, maybe there's a better way to do it
df.ldap <- json %>% melt %>% spread( L2,value)

# optional:
# turn factors into characters
i <- sapply(df.ldap, is.factor)
df.ldap[i] <- lapply(df.ldap[i], as.character)
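
As one possible alternative to the melt/spread step, the list of records could also be turned into a data frame with base R only. This is a sketch under assumptions: the records below are made up to mimic the parsed JSON (one named list per entry, one scalar value per attribute, as the Perl script writes them), and the attribute names are illustrative.

```r
# Made-up records shaped like the parsed JSON (a list of named lists)
records <- list(
  list(cn = "Alice Example", uid = "alice", mail = "alice@mydomain.de"),
  list(cn = "Bob Example", uid = "bob")
)

# Union of all attribute names seen in any record
all_attrs <- unique(unlist(lapply(records, names)))

# One single-row data frame per record; missing attributes become NA
rows <- lapply(records, function(rec) {
  vals <- lapply(all_attrs, function(a) {
    if (is.null(rec[[a]])) NA_character_ else as.character(rec[[a]])
  })
  as.data.frame(setNames(vals, all_attrs), stringsAsFactors = FALSE)
})

# Stack the rows; all columns are character, so no factor conversion is needed
df_ldap <- do.call(rbind, rows)
print(df_ldap)
```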
Cavein answered 4/3, 2016 at 8:42 Comment(0)

I wrote an R library for accessing LDAP servers using the OpenLDAP library. Specifically, the function searchldap is a wrapper for the openldap method searchldap. https://github.com/LukasK13/ldapr

Kike answered 25/9, 2018 at 15:11 Comment(0)
