How to expand a POSIXct field in R str()?

I am trying to increase the number of values that str() shows for a custom POSIXct field. The usual approach, str(DF, list.len=ncol(DF), vec.len=20), does not work: I request 20 values, but str() always shows only two ("2017-01-01 08:40:00" "2017-01-01 08:50:00" ...), regardless of the length of the vector (here 3).

Data (data.csv)

"AAA", "BBB"
1, 01012017-0940+0100
2, 01012017-0950+0100
3, 01012017-0838+0100

Code

library('methods') # setClass

# https://unix.stackexchange.com/a/363290/16920
setClass('iso8601')

# https://mcmap.net/q/143826/-only-read-selected-columns
setAs("character","iso8601",function(from) strptime(from,format="%d%m%Y-%H%M%z"))

DF <- read.csv(file='data.csv',
        sep=',',
        header=TRUE,
        colClasses=c('numeric','iso8601'),
        strip.white=TRUE)

DF

str(DF, list.len=ncol(DF), vec.len=20)

Output in R 3.3.3

 AAA                 BBB
1  1 2017-01-01 08:40:00
2  2 2017-01-01 08:50:00
3  3 2017-01-01 07:38:00
'data.frame':  3 obs. of  2 variables:
 $ AAA : num  1 2 3
 $ BBB : POSIXlt, format: "2017-01-01 08:40:00" "2017-01-01 08:50:00" ...

Output in R 3.4.0

Same as above, reproducing the same problem.

  AAA                 BBB
1   1 2017-01-01 08:40:00
2   2 2017-01-01 08:50:00
3   3 2017-01-01 07:38:00
'data.frame':   3 obs. of  2 variables:
 $ AAA: num  1 2 3
 $ BBB: POSIXlt, format: "2017-01-01 08:40:00" "2017-01-01 08:50:00" ...

  1. How can str(DF, list.len=ncol(DF), vec.len=20) be made to show more values per variable?

  2. How can str(DF) show the number of items per variable, i.e. without expanding the values themselves?

Ruling out terminal width and column count as the cause

I did the following:

  1. increased the terminal defaults: width from 80 to 150, and columns from 24 to 38
  2. restarted the terminal prompt
  3. ran Rscript myScript.r
  4. got the same output again, so the terminal width and column count do not seem to play a role here

Roland's proposal

The code below does not work in all cases, only in a limited number of them, so it should be possible to apply it dynamically.

# Roland's comment
str(DF, list.len=ncol(DF), vec.len=20, width = 100)

R: 3.3.3, 3.4.0 (2017-04-21, backports)
OS: Debian 8.7
Window manager: Gnome 3.14.1

Aspect answered 17/5, 2017 at 13:41 Comment(14)
You write POSIXct in your question title but then create a POSIXlt variable. If you created a POSIXct variable you probably wouldn't have this problem. - Caterer
1. I can't reproduce with R 3.4.0. I see the expected result. 2. Use as.POSIXct instead of strptime. There is rarely a reason to store time stamps as POSIXlt. - Caterer
@Caterer Can you please propose a differential solution here? I really do not understand the reason for the output in R 3.3.3. - - I really would like to get something more stable. - - I replaced strptime(...) with as.POSIXct(from,format=..."). Studying stp(...) shows the same data type there. What are the benefits of as.POSIXct(...) here? - Denman
I don't know what you expect beyond "an issue with str has been fixed and you should update R to the current version". - Caterer
Read the official instructions for Debian on cran.r-project.org. - Caterer
I tried again on my Mac to make sure it's not a Windows vs *nix thing. It works as expected if the console's width is sufficient. Try str(DF, list.len=ncol(DF), vec.len=20, width = 100). - Caterer
Are you asking about the output for the times being different between the two versions? Because other than that they appear to be the same. I would like to know for sure before I answer. - Wafd
It also works correctly on mine when I run it. I created 17 dates and asked for 10, and got all 10. I cut and pasted your code directly. That is not completely true, though: it showed as many as would fit in the width of my window, and by adjusting the window I get more dates. You are constrained by the width of the display space. - Wafd
I ran your code exactly and it works. But if the window on your system is too small to show all the fields in the str() output, it does not wrap. It simply truncates the returned data. If you drag the console window wider, you get more dates. I spanned two 20" cinemas with it and got all the values on screen for the 10 I requested. You need to make the console window wider to see it all. - Wafd
Here is a link on changing the terminal size in Ubuntu; it should work for Debian too: askubuntu.com/questions/64652/set-terminal-size-permanently. Just set the terminal size permanently to be larger and the console should execute at that size from Rscript on the command line too. - Wafd
@Wafd I increased those values significantly. The output is independent of them. Please see the body. What would you try next? - - What is your window manager? - Denman
When I am working on the command line I am doing one of two things: running quick analyses, or implementing a tested script to run quickly or from a crontab. I do all my exploratory work in RStudio. For me str() falls into exploratory work because it is there to check the structure of a data file. So I do not have the expertise to suggest how to accomplish this task as a command-line function if expanding the window is not sufficient. - Wafd
Setting the width = parameter in str to a large enough number works for me. @Caterer already suggested this, but I did not see a response comment from you. Can you confirm that this does not work for you? - Miyokomizar
@Miyokomizar Sorry, I can confirm that the width alone is not sufficient now. It seems to work in very simple cases but not in most cases. - Denman

Proposal width

To get "wider" output, you can change the default width in R's options.

According to options {base} help:

width:

controls the maximum number of columns on a line used in printing vectors, matrices and arrays, and when filling by cat.

Here is an example:
# initial try
str(DF, list.len=ncol(DF), vec.len=20)

it gives:

'data.frame':   3 obs. of  2 variables:
 $ AAA: num  1 2 3
 $ BBB: POSIXlt, format: "2017-01-01 11:40:00" "2017-01-01 11:50:00" ...

Proposal options(width)

And now, with different width:

# retain default options
op <- options()

# set an appropriate width
n_cols <- 22 * 20 # n columns for 20 POSIXlt strings
n_cols <- n_cols + 50 # 50 columns for column description
# actually you can use any sufficiently big number
# for example n_cols = 1000
options(width = n_cols)
str(DF, list.len=ncol(DF), vec.len=20)
options(op)

The result is:

'data.frame':   3 obs. of  2 variables:
 $ AAA: num  1 2 3
 $ BBB: POSIXlt, format: "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00"

Roland's width parameter

It seems you can achieve the same with the width parameter of str, just as Roland suggested. But again, you have to provide a big enough value for the output. One POSIXlt string takes 21 characters (19 for the timestamp plus the two quotes) plus one whitespace, i.e. 22 columns. So for 20 strings you need roughly 440 columns.
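
The character count above can be checked directly. This is only a minimal sketch: the sample timestamp and the variable names are assumptions chosen for illustration and do not come from the original code.

```r
# Sample timestamp in the format str() prints (assumed value, for illustration)
stamp <- "2017-01-01 11:40:00"

# Timestamp text + two quote characters + one separating space
chars_per_item <- nchar(stamp) + 2 + 1

# Number of timestamps we would like str() to show
n_items <- 20

# Width to pass to str(..., width = n_cols) or options(width = n_cols)
n_cols <- chars_per_item * n_items
n_cols  # 440
```

A different timestamp format changes nchar(stamp), so the width should be recomputed rather than hard-coded.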

Three parameter approach

I have tried it with your example:

DF <- rbind(DF, DF, DF, DF, DF, DF, DF, DF) # 8 copies of the 3-row DF: nrows = 24

# Calculate string width
string_size <- nchar(as.character(DF[1, 2])) + 3 # timestamp width + 2 quote characters + 1 space
N <- 20 # number of items
n_cols <- string_size * N

str(DF, list.len=ncol(DF), vec.len=20, width = n_cols)

Output:

'data.frame':   24 obs. of  2 variables:
 $ AAA: num  1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
 $ BBB: POSIXlt, format: "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" "2017-01-01 10:38:00" "2017-01-01 11:40:00" "2017-01-01 11:50:00" ...

There are exactly 20 POSIXlt strings.

Explanation

The problem with the output arises from the utils:::str.POSIXt method, which is called for POSIXlt objects. The interesting part is the following line:

larg[["vec.len"]] <- min(larg[["vec.len"]], (larg[["width"]] - 
                nchar(larg[["indent.str"]]) - 31)%/%19)

This line computes the number of POSIXlt strings in the output. Roughly speaking, the output will consist of no more than vec.len POSIXlt strings, and the length of the output in characters will be no more than width.

Here, larg is a list of arguments passed to str. By default they are: vec.len = 4; width = 80; indent.str = " ".

So, the recomputed vec.len by default will be 2.
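
The arithmetic behind that value can be reproduced with a small stand-alone function. This is a sketch that only mirrors the line quoted above; effective_vec_len is a hypothetical helper name and is not part of utils.

```r
# Hypothetical helper mirroring the capping rule quoted from utils:::str.POSIXt.
# It is NOT a function of the utils package.
effective_vec_len <- function(vec_len, width, indent_str = " ") {
  min(vec_len, (width - nchar(indent_str) - 31) %/% 19)
}

# Defaults: (80 - 1 - 31) %/% 19 = 48 %/% 19 = 2, and min(4, 2) = 2
effective_vec_len(vec_len = 4, width = 80)

# The call from the question asks for vec.len = 20 but leaves width at 80,
# so the cap is still 2
effective_vec_len(vec_len = 20, width = 80)
```

This would explain why raising vec.len alone keeps showing two timestamps: the width term is the smaller of the two arguments to min().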

As for the last example, we set vec.len = 20 and width = 440, and our data frame has 24 rows. The recomputed vec.len is 20. So the output of str(DF) contains 20 POSIXlt strings followed by '...', which means that there are more than 20 elements in the POSIXlt vector.
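
The same rule can be inverted to estimate the smallest width that allows a given number of timestamps. The sketch below assumes the capping line quoted earlier is the only limit that matters; width_cap and min_width_for are hypothetical helper names, and the str call that runs afterwards may apply limits of its own.

```r
# Hypothetical helpers based on the capping line quoted from utils:::str.POSIXt

# Largest number of timestamps the rule allows for a given width
width_cap <- function(width, indent_str = " ") {
  (width - nchar(indent_str) - 31) %/% 19
}

# Smallest width for which the rule allows n_items timestamps
min_width_for <- function(n_items, indent_str = " ") {
  19 * n_items + 31 + nchar(indent_str)
}

width_cap(440)     # 21, so min(20, 21) keeps all 20 requested items
min_width_for(20)  # 412
width_cap(412)     # 20
width_cap(411)     # 19, one column too few
```

Under these assumptions, width = 440 leaves a small margin over the 412 columns the rule strictly requires for 20 items.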

Exclamation answered 27/5, 2017 at 7:48 Comment(5)
You could have N <- 20; stringSize <- 22; n_cols <- stringSize * N; making the algorithm depend on the number of characters per string times the number of strings. So the problem seems to depend on at least two variables. Can you think of anything else? - Denman
I have updated the answer with proper variable declarations. Now string_size is calculated with nchar, so it is easier to use with a different format. Also, I cannot think of anything else for now. - Exclamation
Can you say why you have ... at the end of your last output? I think it should be a complete output but it is not. Can you offer any reasons for that? - Denman
I offer you the bounty because you tried to answer a difficult question. I do not yet know how reliable your proposal is in the end. There are still some things that need clarification. - Denman
I have added some explanation of how the length of the output string is computed. Sorry for my English, hope it makes sense. - Exclamation
