Join iterator of &str

5

58

How do I convert an Iterator<&str> to a String, interspersed with a constant string such as "\n"? For instance, given:

let xs = vec!["first", "second", "third"];
let it = xs.iter();

One may produce a string s by collecting into a Vec<&str> and joining the result:

let s = it
    .map(|&x| x)
    .collect::<Vec<&str>>()
    .join("\n");

However, this unnecessarily allocates memory for a Vec<&str>.

Is there a more direct method?

Ploce answered 8/5, 2019 at 3:53 Comment(10)
Apologies - my original answer removed the iterator, but your question asks how to join an iterator without allocating the extra vector. – Feminize
Looks like the itertools crate doesn't allocate the vector. – Feminize
Note that depending on the exact characteristics of your iterator, collecting into a vector of slices and then joining could actually be faster than using Websterix's method or itertools, since SliceConcatExt::join can calculate the needed size for the full string ahead of time and thus definitely doesn't need to reallocate during accumulation, whereas the other methods may have to reallocate the string. You should definitely benchmark. – Phenocryst
@SebastianRedl But collect::<Vec<&str>> would need to reallocate. That buffer is a lot smaller than the string buffer, though, so I guess that would be faster? – Sake
@Sake It has to allocate, but not reallocate if the iterator gives a good size hint. – Phenocryst
The once plus skip trick works nicely for this. See also this answer. – Polybasite
How is this a duplicate?! – Kulda
Sorry, but I don't understand the .map(|&x| x) part. Why is it there? – Selfmade
I also agree with @MattJoiner: this is not a duplicate, and there are better answers to this question in 2023. I voted to reopen. – Isoprene
Question reopened! – Kulda
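
One plausible reading of the "once plus skip" trick mentioned in the comments above, sketched with the standard library only. The helper name join_skip is hypothetical, and the commenter may have had a slightly different formulation in mind:

```rust
use std::iter::once;

/// Joins string slices with `sep` without building an intermediate Vec.
/// `join_skip` is a hypothetical helper name, not a standard-library function.
fn join_skip<'a, I>(items: I, sep: &'a str) -> String
where
    I: Iterator<Item = &'a str>,
{
    items
        // Emit the separator in front of every item...
        .flat_map(move |s| once(sep).chain(once(s)))
        // ...then drop the separator that precedes the first item.
        .skip(1)
        .collect()
}

fn main() {
    let xs = vec!["first", "second", "third"];
    let s = join_skip(xs.iter().copied(), "\n");
    assert_eq!(s, "first\nsecond\nthird");
}
```

The resulting String may still reallocate while it grows, so whether this beats collect plus join is something to benchmark rather than assume.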
34

You could use the itertools crate for that. The example uses the intersperse helper, which is pretty much the join equivalent for iterators.

cloned() is needed to convert the &&str items to &str items; it does not allocate. It can be replaced by copied(), which is available on stable Rust since 1.36.

use itertools::Itertools; // 0.8.0

fn main() {
    let words = ["alpha", "beta", "gamma"];
    let merged: String = words.iter().cloned().intersperse(", ").collect();
    assert_eq!(merged, "alpha, beta, gamma");
}
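
A small standard-library sketch of why that conversion step is needed: xs.iter() borrows each element, so its items are &&str, and .map(|&x| x), .cloned() and .copied() all just copy the inner &str out. The variable names are illustrative only:

```rust
fn main() {
    let xs = vec!["first", "second", "third"];

    // `xs.iter()` yields `&&str`; each of these three adapters turns the
    // items back into plain `&str` by copying the reference (no allocation).
    let by_pattern: Vec<&str> = xs.iter().map(|&x| x).collect();
    let by_cloned: Vec<&str> = xs.iter().cloned().collect();
    let by_copied: Vec<&str> = xs.iter().copied().collect();

    assert_eq!(by_pattern, by_cloned);
    assert_eq!(by_cloned, by_copied);
}
```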


Sake answered 8/5, 2019 at 9:0 Comment(0)
26

You can do this easily with the iterator's fold method:

let s = it.fold(String::new(), |a, b| a + b + "\n");

The full code looks like this:

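fn main() {
    let xs = vec!["first", "second", "third"];
    let it = xs.into_iter();

    // The collect-and-join version from the question, for comparison:
    // let s = it.collect::<Vec<&str>>().join("\n");

    // Append every item followed by "\n" to the accumulator.
    let s = it.fold(String::new(), |a, b| a + b + "\n");
    // Strip the trailing "\n". Note that trim_end returns a &str and removes
    // all trailing whitespace, including any that belongs to the last item.
    let s = s.trim_end();

    println!("{:?}", s);
}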


EDIT: Following Sebastian Redl's comment, I checked the performance cost of the fold approach and created a benchmark on the playground.

In that benchmark, the fold approach takes significantly more time when the iterator yields many items.

I did not check memory usage, though.

Salverform answered 8/5, 2019 at 5:15 Comment(9)
The reason this is slow is because you're using + to create two new Strings on every iteration. If you use a single string (playground) it can work better than collect and join (playground). – Gypsie
Added black_box and created the vec for each test individually (because of cache warming) (playground). Playground isn't that good for benchmarking due to massive variance in latency/duration, but the fold variant seems to be slightly slower (over multiple runs). – Sake
v2 with black_box(xs).iter().copied(): collect+join now takes twice as long as fold (the black_box(xs) doesn't matter, xs is the same). <3 for microbenchmarking. – Sake
I guess they optimize something like vec!["hey"; 100_000].into_iter().collect::<Vec<_>>() to just return the original vec?! – Sake
Yes, they do. – Sake
@mcdonoughe Could this be improved with String::with_capacity(xs.len())? – Prorogue
Another solution is let mut it = xs.into_iter(); let first = it.next().unwrap_or("").to_owned(); let r = it.fold(first, |a, b| a + "\n" + b); Then you end up with a String instead of a &str. – Affricative
@akiner-alkan I tested today and I see the opposite: the fold approach takes 7.405799ms versus 10.380536ms in my browser. So I'm still not sure what to use. – Mandalay
@EirNym You can run the test several times in release mode instead of debug mode. Benchmarks run in debug mode can give misleading results, since debug builds carry debugging instrumentation and use different CPU/memory utilization strategies. – Salverform
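
The single-buffer variant suggested in the comments above could look roughly like this. It is a sketch, not the code from the linked playgrounds: the helper name join_fold is hypothetical, and seeding the accumulator with the first item is one way to avoid both the trailing separator and the trim_end call:

```rust
/// Joins string slices with `sep` by pushing into a single String.
/// `join_fold` is a hypothetical helper name used only for this sketch.
fn join_fold<'a, I>(mut items: I, sep: &str) -> String
where
    I: Iterator<Item = &'a str>,
{
    // Seed the buffer with the first item so that the separator is only
    // ever written between two items.
    let first = match items.next() {
        Some(s) => s,
        None => return String::new(),
    };

    items.fold(String::from(first), |mut acc, s| {
        acc.push_str(sep);
        acc.push_str(s);
        acc
    })
}

fn main() {
    let xs = vec!["first", "second", "third"];
    let s = join_fold(xs.into_iter(), "\n");
    assert_eq!(s, "first\nsecond\nthird");
}
```

Unlike the trim_end version, this returns an owned String and leaves whitespace inside the last item untouched. Whether it is faster than collect plus join depends on the input, as the comments note, so it is worth measuring in release mode.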
15

With itertools, you have not only intersperse() but also join():

use itertools::Itertools;

let s = it.join("\n");

It is more general than intersperse() (it accepts items of any type that implements Display), and may therefore be slower (I did not benchmark it, though).
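
To illustrate what "accepts any Display-implementing type" means in practice, here is a rough standard-library approximation of such a join. It is not itertools' actual implementation, and join_display is a hypothetical name chosen for this sketch:

```rust
use std::fmt::{Display, Write};

/// Joins any Display items with `sep`, formatting each one into the buffer.
/// `join_display` is a hypothetical helper, not part of std or itertools.
fn join_display<I>(mut items: I, sep: &str) -> String
where
    I: Iterator,
    I::Item: Display,
{
    let mut out = String::new();
    if let Some(first) = items.next() {
        // Writing into a String does not fail by itself; unwrap would only
        // panic if an item's Display implementation reported an error.
        write!(out, "{}", first).unwrap();
        for item in items {
            out.push_str(sep);
            write!(out, "{}", item).unwrap();
        }
    }
    out
}

fn main() {
    assert_eq!(join_display([1, 2, 3].iter(), "\n"), "1\n2\n3");
    assert_eq!(join_display(["a", "b"].iter(), ", "), "a, b");
}
```

Going through the formatting machinery for every item is the plausible source of the extra cost mentioned above, compared with pushing &str slices directly.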

Separatist answered 10/5, 2023 at 23:1 Comment(0)
1

Use Iterator::reduce.

fn main() {
    let it = ["1", "2", "3"].into_iter();
    let res = it.map(String::from).reduce(|acc, s| format!("{acc}, {s}")).unwrap_or_default();
    assert_eq!(&res, "1, 2, 3");
}

You can use Cow to avoid unnecessary allocation.

use std::borrow::Cow;

fn main() {
    let it = ["1", "2", "3"].into_iter();
    let res = it.map(Cow::from).reduce(|mut acc, s| {
        acc.to_mut().push('\n');
        acc.to_mut().push_str(&s);
        acc
    }).unwrap_or_default();
    assert_eq!(&res, "1\n2\n3");
}
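
A short sketch of where the Cow version saves an allocation: with exactly one item, reduce never calls the closure, so the result stays borrowed. The wrapper function join_cow is hypothetical and only repackages the code above:

```rust
use std::borrow::Cow;

/// Joins string slices with `sep`, borrowing when no joining is needed.
/// `join_cow` is a hypothetical wrapper around the reduce approach above.
fn join_cow<'a, I>(items: I, sep: &str) -> Cow<'a, str>
where
    I: Iterator<Item = &'a str>,
{
    items
        .map(Cow::from)
        .reduce(|mut acc, s| {
            // to_mut() converts the accumulator to an owned String on the
            // first call and reuses that String afterwards.
            let buf = acc.to_mut();
            buf.push_str(sep);
            buf.push_str(&s);
            acc
        })
        .unwrap_or_default()
}

fn main() {
    // One item: the closure never runs, so nothing is allocated.
    let single = join_cow(std::iter::once("only"), "\n");
    assert!(matches!(single, Cow::Borrowed("only")));

    // Several items: the first to_mut() call allocates the String.
    let many = join_cow(["1", "2", "3"].iter().copied(), "\n");
    assert!(matches!(many, Cow::Owned(_)));
    assert_eq!(many, "1\n2\n3");
}
```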

Moldboard answered 3/10, 2023 at 13:34 Comment(0)
-4

There is a relevant example in the Rust documentation:

let words = ["alpha", "beta", "gamma"];

// chars() returns an iterator
let merged: String = words.iter()
                          .flat_map(|s| s.chars())
                          .collect();
assert_eq!(merged, "alphabetagamma");

You can also use the Extend trait:

fn f<'a, I: Iterator<Item=&'a str>>(data: I) -> String {
    let mut ret = String::new();
    ret.extend(data);
    ret
}
Saucier answered 8/5, 2019 at 5:15 Comment(2)
This answer does not address the OP's needs. The OP is asking how to intersperse the items with a constant string (e.g. "\n"). – Salverform
Also, this should work without flat_map, as String already implements Extend<&str>. – Sake
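
A minimal sketch of the point made in the last comment, using only the standard library: String implements both FromIterator<&str> and Extend<&str>, so the slices can be concatenated without going through chars(). As already noted, neither form inserts a separator:

```rust
fn main() {
    let words = ["alpha", "beta", "gamma"];

    // String implements FromIterator<&str>, so collect() concatenates
    // the slices directly; copied() turns the `&&str` items into `&str`.
    let collected: String = words.iter().copied().collect();

    // String also implements Extend<&str>.
    let mut extended = String::new();
    extended.extend(words.iter().copied());

    assert_eq!(collected, "alphabetagamma");
    assert_eq!(extended, collected);
}
```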
