How to remove duplicates strings or int from Slice in Go
Asked Answered
C

18

31

Let's say I have a list of student cities and the size of it could be 100 or 1000, and I want to filter out all duplicates cities.

I want a generic solution that I can use to remove all duplicate strings from any slice.

I am new to Go Language, So I tried to do it by looping and checking if the element exists using another loop function.

Students' Cities List (Data):

studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

Functions that I created, and it's doing the job:

func contains(s []string, e string) bool {
    for _, a := range s {
        if a == e {
            return true
        }
    }
    return false
}

func removeDuplicates(strList []string) []string {
    list := []string{}
    for _, item := range strList {
        fmt.Println(item)
        if contains(list, item) == false {
            list = append(list, item)
        }
    }
    return list
}

My solution test

func main() {
    studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

    uniqueStudentsCities := removeDuplicates(studentsCities)
    
    fmt.Println(uniqueStudentsCities) // Expected output [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
}

I believe that the above solution that I tried is not an optimum solution. Therefore, I need help from you guys to suggest the fastest way to remove duplicates from the slice?

I checked StackOverflow, this question is not being asked yet, so I didn't get any solution.

Cheadle answered 15/3, 2021 at 18:52 Comment(3)
See this if you want to remove adjacent duplicatesReverent
Similar question: Finding Unique Items in a Go Slice or ArrayConsonantal
Demo of sorting a struct with two propertiesStepfather
C
52

I found Burak's and Fazlan's solution helpful. Based on that, I implemented the simple functions that help to remove or filter duplicate data from slices of strings, integers, or any other types with generic approach.

Here are my three functions, first is generic, second one for strings and last one for integers of slices. You have to pass your data and return all the unique values as a result.

Generic solution: => Go v1.18

func removeDuplicate[T comparable](sliceList []T) []T {
    allKeys := make(map[T]bool)
    list := []T{}
    for _, item := range sliceList {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

To remove duplicate strings from slice:

func removeDuplicateStr(strSlice []string) []string {
    allKeys := make(map[string]bool)
    list := []string{}
    for _, item := range strSlice {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

To remove duplicate integers from slice:

func removeDuplicateInt(intSlice []int) []int {
    allKeys := make(map[int]bool)
    list := []int{}
    for _, item := range intSlice {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

You can update the slice type, and it will filter out all duplicates data for all types of slices.

Here is the GoPlayground link: https://go.dev/play/p/iyb97KcftMa

Cheadle answered 22/3, 2021 at 17:34 Comment(1)
Go 1.20 adds the comparable type. See my answer for an example. (based on Riyaz' code above - thank you!)Valerio
R
25

Sort and then compact, e.g. like this:

package main

import (
    "fmt"

    "slices"
)

func main() {
    list := []string{"c", "a", "b", "c", "b", "b"}
    slices.Sort(list)
    fmt.Println(slices.Compact(list)) // [a b c]
}
Rhaetian answered 14/6, 2023 at 7:57 Comment(2)
Great answer! The slices package was also added to the standard library in Go 1.21 pkg.go.dev/slices#CompactUte
This is now the better answer with the standard slices package. Updated playground link: go.dev/play/p/n6ozDU8RFgNPrefix
B
10

Adding this answer which worked for me, does require/include sorting, however.

func removeDuplicateStrings(s []string) []string {
    if len(s) < 1 {
        return s
    }

    sort.Strings(s)
    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

For fun, I tried using generics! (Go 1.18+ only)

type SliceType interface {
    ~string | ~int | ~float64 // add more *comparable* types as needed
}

func removeDuplicates[T SliceType](s []T) []T {
    if len(s) < 1 {
        return s
    }

    // sort
    sort.SliceStable(s, func(i, j int) bool {
        return s[i] < s[j]
    })

    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

Go Playground Link with tests: https://go.dev/play/p/bw1PP1osJJQ

Befoul answered 13/4, 2022 at 23:19 Comment(0)
F
7

You can do in-place replacement guided with a map:

processed := map[string]struct{}{}
w := 0
for _, s := range cities {
    if _, exists := processed[s]; !exists {
        // If this city has not been seen yet, add it to the list
        processed[s] = struct{}{}
        cities[w] = s
        w++
    }
}
cities = cities[:w]
Frizzell answered 15/3, 2021 at 18:59 Comment(3)
Thanks man, Just I want to know that, is it the best solution that I can rely on to loop more than 1000+ cities slice?Cheadle
It is a good way, until someone else comes up with a better way.Frizzell
If you have a sorted list of cities you can do this without the map. In a sorted array, duplicates will be right next to each other. If you really want to not have a map you might sort the cities, if you do not mind sorting them. Possible optimizations depend on your problem.Sorrows
N
7

reduce memory usage:

package main

import (
    "fmt"
    "reflect"
)

type void struct{}

func main() {

    digits := [6]string{"one", "two", "three", "four", "five", "five"}

    set := make(map[string]void, 6)

    for _, element := range digits {
        set[element] = void{}
    }

    fmt.Println(reflect.ValueOf(set).MapKeys())
}

p.s. playground

Newhall answered 9/6, 2022 at 14:58 Comment(0)
E
2

try: https://github.com/samber/lo#uniq

names := lo.Uniq[string]([]string{"Samuel", "John", "Samuel"})
// []string{"Samuel", "John"}
Eradis answered 13/2, 2023 at 9:29 Comment(0)
P
1

It can also be done with a set-like map:

ddpStrings := []string{}
m := map[string]struct{}{}

for _, s := range strings {
    if _, ok := m[scopeStr]; ok {
        continue
    }
    ddpStrings = append(ddpStrings, s)
    m[s] = struct{}{}
}
Pasho answered 14/9, 2021 at 10:31 Comment(0)
B
1

Simple to understand.

func RemoveDuplicate(array []string) []string {
    m := make(map[string]string)
    for _, x := range array {
        m[x] = x
    }
    var ClearedArr []string
    for x, _ := range m {
        ClearedArr = append(ClearedArr, x)
    }
    return ClearedArr
}
Bienne answered 31/1, 2022 at 17:3 Comment(1)
You don't need map[string]string. map[string]struct{} is sufficientVagrom
V
1

Go 1.20 adds the comparable generic type, which makes it possible to expand on Riyaz' answer like so:

func dedupeSlice[T comparable](sliceList []T) []T {
    dedupeMap := make(map[T]struct{})
    list := []T{}

    for _, slice := range sliceList {
        if _, exists := dedupeMap[slice]; !exists {
            dedupeMap[slice] = struct{}{}
            list = append(list, slice)
        }
    }

    return list
}

This allows you to feed in slices of any comparable type in the language, including strings, integers, floats, booleans, pointers, and more.

Valerio answered 21/8, 2023 at 21:9 Comment(0)
A
0

If you want to don't waste memory allocating another array for copy the values, you can remove in place the value, as following:

package main

import "fmt"

var studentsCities = []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

func contains(s []string, e string) bool {
    for _, a := range s {
        if a == e {
            return true
        }
    }
    return false
}

func main() {
    fmt.Printf("Cities before remove: %+v\n", studentsCities)
    for i := 0; i < len(studentsCities); i++ {
        if contains(studentsCities[i+1:], studentsCities[i]) {
            studentsCities = remove(studentsCities, i)
            i--
        }
    }
    fmt.Printf("Cities after remove: %+v\n", studentsCities)
}
func remove(slice []string, s int) []string {
    return append(slice[:s], slice[s+1:]...)
}

Result:

Cities before remove: [Mumbai Delhi Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
Cities after remove: [Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
Anemoscope answered 15/3, 2021 at 19:6 Comment(1)
depending on the data, allocating a new slice may be more efficient than continually copying the same data over and over again. If order doesn't matter (which the question doesn't specify), remove the items with individual swaps.Nilotic
G
0
func UniqueNonEmptyElementsOf(s []string) []string {
    unique := make(map[string]bool, len(s))
    var us []string
    for _, elem := range s {
        if len(elem) != 0 {
            if !unique[elem] {
                us = append(us, elem)
                unique[elem] = true
            }
        }
    }

    return us
}

send the duplicated splice to the above function, this will return the splice with unique elements.

func main() {
    studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

    uniqueStudentsCities := UniqueNonEmptyElementsOf(studentsCities)
    
    fmt.Println(uniqueStudentsCities)
}
Gynecology answered 16/3, 2021 at 11:5 Comment(0)
C
0

Based on Riyaz's solution, you can use generics since Go 1.18

func removeDuplicate[T string | int](tSlice []T) []T {
    allKeys := make(map[T]bool)
    list := []T{}
    for _, item := range tSlice {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

Generics minimizes code duplication.

Go Playground link : https://go.dev/play/p/Y3fEtHJpP7Q

Conley answered 5/7, 2022 at 8:29 Comment(0)
I
0

So far @snassr has given the best answer as it is the most optimized way in terms of memory (no extra memory) and runtime (nlogn). But one thing I want to emphasis here is if we want to delete any index/element of an array we should loop from end to start as it reduces complexity. If we loop from start to end then if we delete nth index then we will accidentally miss the nth element (which was n+1th before deleting nth element) as in the next iteration we will get the n+1th element.

Example Code

func Dedup(strs []string) {
    sort.Strings(strs)
    for i := len(strs) - 1; i > 0; i-- {
        if strs[i] == strs[i-1] {
            strs = append(strs[:i], strs[i+1:]...)
        }
    }
}
Indecision answered 11/10, 2022 at 11:42 Comment(0)
A
0

Here is a little bit simpler solution than the accepted answer

func RemoveDuplicates[T comparable](slice []T) []T {
    for i := 0; i < len(slice); i++ {
        for j := len(slice) - 1; j > i; j-- {
            if slice[i] == slice[j] {
                slice = append(slice[:j], slice[j+1:]...)
            }
        }
    }
    return slice
}
Amos answered 3/9, 2023 at 7:44 Comment(0)
P
0

Updated answer based on Jasper's original answer now that slices is part of the standard library:

package main

import (
    "fmt"
    "slices"
)

func main() {
    list := []string{"a", "b", "c", "a", "a", "c", "d", "d"} //[a b c a a c d d]
    slices.Sort(list) //[a a a b c c d d]
    list = slices.Compact(list) //[a b c d]

    fmt.Printf("Compacted List: %v\n", list)
}

Playground: https://go.dev/play/p/n6ozDU8RFgN

Prefix answered 26/1, 2024 at 16:1 Comment(0)
V
0

Modern, generic solution to remove duplicates without reordering (stable):

func RemoveDuplicates[T comparable](s []T) []T {
    alreadySeen := make(map[T]struct{}, len(s))
    return slices.DeleteFunc(s, func(val T) bool {
        _, duplicate := alreadySeen[val]
        alreadySeen[val] = struct{}{}
        return duplicate
    })
}

It works on all comparable types and uses slices.DeleteFunc for memory efficiency.

Vagrom answered 13/3, 2024 at 11:0 Comment(0)
B
-1

Here's a mapless index based slice's duplicate "remover"/trimmer. It use a sort method.

The n value is always 1 value lower than the total of non duplicate elements that's because this methods compare the current (consecutive/single) elements with the next (consecutive/single) elements and there is no matches after the lasts so you have to pad it to include the last.

Note that this snippet doesn't empty the duplicate elements into a nil value. However since the n+1 integer start at the duplicated item's indexes, you can loop from said integer and nil the rest of the elements.

sort.Strings(strs)
for n, i := 0, 0; ; {
    if strs[n] != strs[i] {
        if i-n > 1 {
            strs[n+1] = strs[i]
        }
        n++
    }
    i++
    if i == len(strs) {
        if n != i {
            strs = strs[:n+1]
        }
        break
    }
}
fmt.Println(strs)
Botti answered 28/1, 2022 at 17:41 Comment(0)
S
-4

To remove duplicate elements from array in go

        func main() {
        arr := []int{1,3,2,3,3,5}
        var i,j int
        var size = len(arr)
        for i=0;i<size;i++ {
                for j=i+1;j<size;j++ {
                        if arr[i]==arr[j] {
                                for k:=j;k<size-1;k++ {
                                        arr[k]=arr[k+1]
                                }
                                size--
                                j--
                        }
                }
        }
        for z:=0;z<size;z++ {
                fmt.Printf("%d\n",arr[z])
        }
}
Steepen answered 16/5, 2023 at 9:57 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.