How to zip two iterators of unequal length with a default?
Asked Answered
M

7

22

I'm trying to zip two iterators of unequal length, it only returns when when there is value in both and ignores the rest in the longest iterator.

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];

    for i in num1.iter().rev().zip(num2.iter().rev()) {
        println!("{:?}", i);
    }
}

This returns (2, 3). How do i make it return:

(2, 3)
(1, 0) // default is the 0 here.

Is there any other way to do it?

Marven answered 4/8, 2019 at 8:53 Comment(2)
Do you know beforehand which is longest?Elmira
Well, i can at least find out, and call iter on the longest. And the other thing is even if i make the iter call on num2 instead of num1, it stops when any one of the iterator gets exhausted.Marven
C
23

You could use the zip_longest provided by the itertools crate.

use itertools::{
    Itertools,
    EitherOrBoth::*,
};

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];

    for pair in num1.iter().rev().zip_longest(num2.iter().rev()) {
        match pair {
            Both(l, r) => println!("({:?}, {:?})", l, r),
            Left(l) => println!("({:?}, 0)", l),
            Right(r) => println!("(0, {:?})", r),
        }
    }
}

Which would produce the following output:

(2, 3)
(1, 0)
Clang answered 4/8, 2019 at 9:16 Comment(0)
E
20

Zip will stop as soon as one of iterators stops producing values. If you know which is the longest, you can pad the shorter one with your default value:

use std::iter;

fn main() {
    let longer = vec![1, 2];
    let shorter = vec![3];

    for i in longer
        .iter()
        .rev()
        .zip(shorter.iter().rev().chain(iter::repeat(&0)))
    {
        println!("{:?}", i);
    }
}

If you don't know which is longest, you should use itertools, as Peter Varo suggests.

Elmira answered 4/8, 2019 at 9:19 Comment(1)
In some cases one could do first.iter().chain(iter::repeat(DEFAULT)).zip(second.chain(iter::repeat(DEFAULT))).take(length) if length is known. In my case to work with two bitflags-like structures it perfectly fitsGeese
C
9

The key is to detect that one iterator is shorter then the other, you could do it before before in your case vector implement ExactSizeIterator but a general solution would be to have a custom .zip().

itertools already offer a general solution, .zip_longest():

use itertools::EitherOrBoth::{Both, Left, Right};
use itertools::Itertools;

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];

    for i in num1
        .iter()
        .rev()
        .zip_longest(num2.iter().rev())
        .map(|x| match x {
            Both(a, b) => (a, b),
            Left(a) => (a, &0),
            Right(b) => (&0, b),
        })
    {
        println!("{:?}", i);
    }
}

This require you write the closure everytime, if you need this feature a lot maybe implement a custom trait on iterator with .zip_default() where A and B implement Default:

use std::default::Default;
use std::iter::Fuse;

pub trait MyIterTools: Iterator {
    fn zip_default<J>(self, other: J) -> ZipDefault<Self, J::IntoIter>
    where
        J: IntoIterator,
        Self: Sized,
    {
        ZipDefault::new(self, other.into_iter())
    }
}

#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
    i: Fuse<I>,
    j: Fuse<J>,
}

impl<I, J> ZipDefault<I, J>
where
    I: Iterator,
    J: Iterator,
{
    fn new(i: I, j: J) -> Self {
        Self {
            i: i.fuse(),
            j: j.fuse(),
        }
    }
}

impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
    T: Iterator<Item = A>,
    U: Iterator<Item = B>,
    A: Default,
    B: Default,
{
    type Item = (A, B);

    fn next(&mut self) -> Option<Self::Item> {
        match (self.i.next(), self.j.next()) {
            (Some(a), Some(b)) => Some((a, b)),
            (Some(a), None) => Some((a, B::default())),
            (None, Some(b)) => Some((A::default(), b)),
            (None, None) => None,
        }
    }
}

impl<T: ?Sized> MyIterTools for T where T: Iterator {}

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];

    for i in num1
        .iter()
        .copied()
        .rev()
        .zip_default(num2.iter().copied().rev())
    {
        println!("{:?}", i);
    }
}

Using itertools we can delegate some logic:

use std::default::Default;
use itertools::Itertools;
use itertools::ZipLongest;
use itertools::EitherOrBoth::{Both, Left, Right};

pub trait MyIterTools: Iterator {
    fn zip_default<J>(self, j: J) -> ZipDefault<Self, J::IntoIter>
    where
        Self: Sized,
        J: IntoIterator,
    {
        ZipDefault::new(self, j.into_iter())
    }
}

#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
    inner: ZipLongest<I, J>,
}

impl<I, J> ZipDefault<I, J>
where
    I: Iterator,
    J: Iterator,
{
    fn new(i: I, j: J) -> Self {
        Self {
            inner: i.zip_longest(j),
        }
    }
}

impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
    T: Iterator<Item = A>,
    U: Iterator<Item = B>,
    A: Default,
    B: Default,
{
    type Item = (A, B);

    fn next(&mut self) -> Option<Self::Item> {
        match self.inner.next()? {
            Both(a, b) => Some((a, b)),
            Left(a) => Some((a, B::default())),
            Right(b) => Some((A::default(), b)),
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl<T: ?Sized> MyIterTools for T where T: Iterator {}

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];

    for i in num1
        .iter()
        .copied()
        .rev()
        .zip_default(num2.iter().copied().rev())
    {
        println!("{:?}", i);
    }
}
Canny answered 4/8, 2019 at 9:31 Comment(0)
N
4

I saw this neat trick in other guy code in leetcode solution. If you have access to length, you can swap iterators, making iter1 the longest.

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];
    let mut iter1 = num1.iter();
    let mut iter2 = num2.iter();
    if iter1.len() < iter2.len(){
        std::mem::swap(&mut iter1, &mut iter2);
    } // now iter1 is the largest
    for i in iter1.rev().zip(iter2.rev().chain(std::iter::repeat(&0))) {
        println!("{:?}", i);
    }
}
Nottinghamshire answered 24/11, 2021 at 22:29 Comment(0)
M
1

If you can get the length of the iterators, as is in this case, a quick and dirty way could be:

use std::iter::repeat;

fn main() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5, 6, 7];

    for i in a
        .iter()
        .rev()
        .chain(repeat(&0).take(b.len().saturating_sub(a.len())))
        .zip(
            b.iter()
                .rev()
                .chain(repeat(&0).take(a.len().saturating_sub(b.len()))),
        )
    {
        println!("{:?}", i);
    }
}

You can also implement a trait containing a zip_default() using this approach:


pub trait MyIterTools<X: Default + Clone>: ExactSizeIterator<Item = X> {
    fn zip_default<J, Y>(self, j: J) -> ZipDefault<Self, J::IntoIter, X, Y>
    where
        Self: Sized,
        J: IntoIterator<Item = Y>,
        J::IntoIter: ExactSizeIterator,
        Y: Default + Clone,
    {
        ZipDefault::new(self, j.into_iter())
    }
}

#[derive(Clone, Debug)]
pub struct ZipDefault<
    I: ExactSizeIterator<Item = X>,
    J: ExactSizeIterator<Item = Y>,
    X: Default + Clone,
    Y: Default + Clone,
> {
    inner: Zip<Chain<I, Take<Repeat<X>>>, Chain<J, Take<Repeat<Y>>>>,
}

impl<
        I: ExactSizeIterator<Item = X>,
        J: ExactSizeIterator<Item = Y>,
        X: Default + Clone,
        Y: Default + Clone,
    > ZipDefault<I, J, X, Y>
{
    fn new(a: I, b: J) -> Self {
        let a_len = a.len();
        let b_len = b.len();
        Self {
            inner: a
                .chain(repeat(X::default()).take(b_len.saturating_sub(a_len)))
                .zip(b.chain(repeat(Y::default()).take(a_len.saturating_sub(b_len)))),
        }
    }
}

impl<
        I: ExactSizeIterator<Item = X>,
        J: ExactSizeIterator<Item = Y>,
        X: Default + Clone,
        Y: Default + Clone,
    > Iterator for ZipDefault<I, J, X, Y>
{
    type Item = (X, Y);

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl<T: ExactSizeIterator<Item = X>, X: Default + Clone> MyIterTools<X> for T {}
fn main() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5, 6, 7];

    a.into_iter()
        .zip_default(b.into_iter())
        .for_each(|i| println!("{:?}", i));
}
Meanly answered 7/11, 2020 at 16:24 Comment(0)
C
1

For what it's worth, I'm posting my simple ZipLongest solution I threw together for a LeetCode problem. LC doesn't make the itertools crate available. This type can easily be dropped into the LC editor and used.

struct ZipLongest<I, J, D, E> {
    iter1: I,
    iter2: J,
    fill : (D, E),
}
impl<I, J, D, E> ZipLongest<I, J, D, E> {
    fn new(iter1: I, iter2: J, fill: (D, E)) -> Self {
        Self { iter1, iter2, fill }
    }
}

impl<I, J, D, E> Iterator for ZipLongest<I, J, D, E> 
where
    I: Iterator<Item=D>,
    J: Iterator<Item=E>,
    D: Clone,
    E: Clone,
{
    type Item = (D, E);

    fn next(&mut self) -> Option<Self::Item> {
        match (self.iter1.next(), self.iter2.next()) {
            (Some(a), Some(b)) => Some((a, b)),
            (Some(a), None   ) => Some((a, self.fill.1.clone())),
            (None,    Some(a)) => Some((self.fill.0.clone(), a)),
            (None,    None   ) => None,
        }
    }
}

This was used to create a version number comparison function.

use std::num::ParseIntError;
use std::cmp::Ordering::{self, Equal};

fn compare_version(version1: String, version2: String) 
                   
    -> Result<Ordering, ParseIntError> 
{
    let iter1 = version1.split('.');
    let iter2 = version2.split('.');

    for (v1, v2) in ZipLongest::new(iter1, iter2, ("0", "0")) { // <-- usage
    
        let n1 = v1.parse::<i32>()?;
        let n2 = v2.parse::<i32>()?;

        match n1.cmp(&n2) {
            Equal => (),
            ord   => return Ok(ord),
        }
    }
    Ok(Equal)
}
Carpentaria answered 3/5 at 19:56 Comment(0)
I
1

If you don't want to use itertools for some reason, here's a simple generic solution that doesn't require ExactSizeIterator and supports two iterators of different types:

pub fn zip_longest<I, J>(iter0: I, iter1: J) -> impl Iterator<Item = (I::Item, J::Item)>
where
    I: IntoIterator,
    I::Item: Default,
    J: IntoIterator,
    J::Item: Default,
{
    let mut iter0 = iter0.into_iter();
    let mut iter1 = iter1.into_iter();
    std::iter::from_fn(move || match (iter0.next(), iter1.next()) {
        (None, None) => None,
        (x, y) => Some((x.unwrap_or_default(), y.unwrap_or_default())),
    })
}

If you want to supply a custom fill value, you could use a newtype with a custom Default implementation (or just modify the function to your needs).

Impish answered 7/5 at 11:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.