I was given this problem in an interview. How would you have answered?
Design a data structure that offers the following operations in O(1) time:
- insert
- remove
- contains
- get random element
Consider a data structure composed of a hashtable H and an array A. The hashtable keys are the elements in the data structure, and the values are their positions in the array.
Since the array needs to grow automatically, adding an element is amortized O(1) rather than worst-case O(1), but I guess that's OK.
O(1) lookup hints at a hashed data structure.
By comparison:
hashtable.get((int)(Math.random()*hashtable.size()));
An array has O(1) indexed lookup, but inserting into an array (moving elements down) is O(n), as is appending to an array (which requires reallocation and copying). You can also have O(log n) lookup in an array using binary search if the array is sorted.
For this question I will use two data structures: a HashMap and an ArrayList.
Steps :-
Code :-
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;
import java.util.Scanner;
public class JavaApplication1 {
public static void main(String args[]){
Scanner sc = new Scanner(System.in);
ArrayList<Integer> al =new ArrayList<Integer>();
HashMap<Integer,Integer> mp = new HashMap<Integer,Integer>();
while(true){
System.out.println("**menu**");
System.out.println("1.insert");
System.out.println("2.remove");
System.out.println("3.search");
System.out.println("4.random");
int ch = sc.nextInt();
switch(ch){
case 1 : System.out.println("Enter the Element ");
int a = sc.nextInt();
if(mp.containsKey(a)){
System.out.println("Element is already present ");
}
else{
al.add(a);
mp.put(a, al.size()-1);
}
break;
case 2 : System.out.println("Enter the Element Which u want to remove");
a = sc.nextInt();
if(mp.containsKey(a)){
int size = al.size();
int index = mp.get(a);
int last = al.get(size-1);
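Collections.swap(al, index, size-1);
al.remove(size-1);
mp.put(last, index);
// Drop the removed element's own entry; doing this after the put
// also handles the case where the removed element was the last one
mp.remove(a);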
System.out.println("Data Deleted");
}
else{
System.out.println("Data Not found");
}
break;
case 3 : System.out.println("Enter the Element to Search");
a = sc.nextInt();
if(mp.containsKey(a)){
System.out.println(mp.get(a));
}
else{
System.out.println("Data Not Found");
}
break;
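case 4 : if(al.isEmpty()){
// nextInt(0) would throw, so handle the empty case explicitly
System.out.println("No elements to choose from");
break;
}
Random rm = new Random();
int index = rm.nextInt(al.size());
System.out.println(al.get(index));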
break;
}
}
}
}
-- Time complexity O(1). -- Space complexity O(N).
You might not like this, because they're probably looking for a clever solution, but sometimes it pays to stick to your guns... A hash table already satisfies the requirements - probably better overall than anything else will (albeit obviously in amortised constant time, and with different compromises to other solutions).
The requirement that's tricky is the "random element" selection: in a hash table, you would need to scan or probe for such an element.
For closed hashing / open addressing, the chance of any given bucket being occupied is size() / capacity(), but crucially this is typically kept in a constant multiplicative range by a hash-table implementation (e.g. the table may be kept larger than its current contents by say 1.2x to ~10x depending on performance/memory tuning). This means on average we can expect to search 1.2 to 10 buckets, totally independent of the total size of the container: expected O(1).
I can imagine two simple approaches (and a great many more fiddly ones):
search linearly from a random bucket
try random buckets repeatedly until you find a populated one
Not a great solution, but may still be a better overall compromise than the memory and performance overheads of maintaining a second index array at all times.
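The two approaches above can be sketched roughly as follows. This is only an illustration: the flat `buckets` list, the `EMPTY` sentinel and both function names are made up for the example and do not correspond to any real hash-table API, and the caller is assumed to track `size` itself.

```python
import random

EMPTY = object()  # hypothetical sentinel marking an unoccupied bucket


def random_by_retry(buckets, size, rng=random):
    """Try random buckets repeatedly until a populated one is found."""
    if size == 0:
        raise IndexError("table is empty")
    while True:
        slot = buckets[rng.randrange(len(buckets))]
        if slot is not EMPTY:
            return slot


def random_by_linear_scan(buckets, size, rng=random):
    """Start at a random bucket and walk forward to the next populated one."""
    if size == 0:
        raise IndexError("table is empty")
    i = rng.randrange(len(buckets))
    while buckets[i] is EMPTY:
        i = (i + 1) % len(buckets)  # wrap around at the end of the table
    return buckets[i]


# Example: a table with capacity 6 holding two elements
table = [EMPTY, "a", EMPTY, EMPTY, "b", EMPTY]
print(random_by_retry(table, 2), random_by_linear_scan(table, 2))
```

One difference worth noting: the retry version picks each stored element with equal probability, while the linear scan favours elements that sit right after a long run of empty buckets (here "b" is reached from three starting positions and "a" from three as well, but in general the counts differ).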
The best solution is probably the hash table + array, it's real fast and deterministic.
But the lowest rated answer (just use a hash table!) is actually great too!
People might not like this because of "possible infinite loops", and I've seen very smart people have this reaction too, but it's wrong! Infinitely unlikely events just don't happen.
Assuming the good behavior of your pseudo-random source -- which is not hard to establish for this particular behavior -- and that hash tables are always at least 20% full, it's easy to see that:
It will never happen that getRandom() has to try more than 1000 times. Just never. Indeed, the probability of such an event is 0.8^1000, which is 10^-97 -- so we'd have to repeat it 10^88 times to have one chance in a billion of it ever happening once. Even if this program was running full-time on all computers of humankind until the Sun dies, this will never happen.
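The arithmetic can be checked with a few lines. This is a small sketch under the same assumptions as the text (each probe independently hits an empty bucket with probability at most 0.8); the variable names are only for illustration.

```python
import math

p_empty = 0.8   # chance that one random probe lands on an empty bucket
tries = 1000

# Probability that 1000 probes in a row all miss
p_fail = p_empty ** tries
print(f"p_fail is about 10^{math.log10(p_fail):.1f}")

# Repeating the experiment 10^88 times still leaves roughly a one-in-a-billion chance
print(f"after 1e88 repetitions: about {p_fail * 1e88:.1e}")

# Number of probes follows a geometric distribution; its mean is 1 / (1 - p_empty)
expected_probes = 1 / (1 - p_empty)
print(f"expected probes per call: {expected_probes:.1f}")
```

So at 20% occupancy a call needs about five probes on average, and the tail probability shrinks geometrically with every extra probe.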
Here is a C# solution to that problem I came up with a little while back when asked the same question. It implements Add, Remove, Contains, and Random along with other standard .NET interfaces. Not that you would ever need to implement it in such detail during an interview but it's nice to have a concrete solution to look at...
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
/// <summary>
/// This class represents an unordered bag of items with
/// the capability to get a random item. All operations are O(1).
/// </summary>
/// <typeparam name="T">The type of the item.</typeparam>
public class Bag<T> : ICollection<T>, IEnumerable<T>, ICollection, IEnumerable
{
private Dictionary<T, int> index;
private List<T> items;
private Random rand;
private object syncRoot;
/// <summary>
/// Initializes a new instance of the <see cref="Bag{T}"/> class.
/// </summary>
public Bag()
: this(0)
{
}
/// <summary>
/// Initializes a new instance of the <see cref="Bag{T}"/> class.
/// </summary>
/// <param name="capacity">The capacity.</param>
public Bag(int capacity)
{
this.index = new Dictionary<T, int>(capacity);
this.items = new List<T>(capacity);
}
/// <summary>
/// Initializes a new instance of the <see cref="Bag{T}"/> class.
/// </summary>
/// <param name="collection">The collection.</param>
public Bag(IEnumerable<T> collection)
{
this.items = new List<T>(collection);
this.index = this.items
.Select((value, index) => new { value, index })
.ToDictionary(pair => pair.value, pair => pair.index);
}
/// <summary>
/// Get random item from bag.
/// </summary>
/// <returns>Random item from bag.</returns>
/// <exception cref="System.InvalidOperationException">
/// The bag is empty.
/// </exception>
public T Random()
{
if (this.items.Count == 0)
{
throw new InvalidOperationException();
}
if (this.rand == null)
{
this.rand = new Random();
}
int randomIndex = this.rand.Next(0, this.items.Count);
return this.items[randomIndex];
}
/// <summary>
/// Adds the specified item.
/// </summary>
/// <param name="item">The item.</param>
public void Add(T item)
{
this.index.Add(item, this.items.Count);
this.items.Add(item);
}
/// <summary>
/// Removes the specified item.
/// </summary>
/// <param name="item">The item.</param>
/// <returns></returns>
public bool Remove(T item)
{
// Nothing to remove if the item is not in the bag
int keyIndex;
if (!this.index.TryGetValue(item, out keyIndex))
{
return false;
}
// Replace index of value to remove with last item in values list
T lastItem = this.items[this.items.Count - 1];
this.items[keyIndex] = lastItem;
// Update index in dictionary for last item that was just moved
this.index[lastItem] = keyIndex;
// Remove old value
this.index.Remove(item);
this.items.RemoveAt(this.items.Count - 1);
return true;
}
/// <inheritdoc />
public bool Contains(T item)
{
return this.index.ContainsKey(item);
}
/// <inheritdoc />
public void Clear()
{
this.index.Clear();
this.items.Clear();
}
/// <inheritdoc />
public int Count
{
get { return this.items.Count; }
}
/// <inheritdoc />
public void CopyTo(T[] array, int arrayIndex)
{
this.items.CopyTo(array, arrayIndex);
}
/// <inheritdoc />
public bool IsReadOnly
{
get { return false; }
}
/// <inheritdoc />
public IEnumerator<T> GetEnumerator()
{
foreach (var value in this.items)
{
yield return value;
}
}
/// <inheritdoc />
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
/// <inheritdoc />
public void CopyTo(Array array, int index)
{
this.CopyTo(array as T[], index);
}
/// <inheritdoc />
public bool IsSynchronized
{
get { return false; }
}
/// <inheritdoc />
public object SyncRoot
{
get
{
if (this.syncRoot == null)
{
Interlocked.CompareExchange<object>(
ref this.syncRoot,
new object(),
null);
}
return this.syncRoot;
}
}
}
If a duplicate item is added, an ArgumentException with the message "An item with the same key has already been added." will be thrown (from the underlying index Dictionary).
We can use hashing to support the operations in Θ(1) time.
insert(x)
1) Check if x is already present by doing a hash map lookup.
2) If not present, then insert it at the end of the array.
3) Add it to the hash table also: x is the key and the last array index is the value.
remove(x)
1) Check if x is present by doing a hash map lookup.
2) If present, then find its index and remove it from the hash map.
3) Swap the last element with this element in the array and remove the last element. Swapping is done because the last element can be removed in O(1) time.
4) Update the index of the moved (formerly last) element in the hash map, unless the removed element was itself the last one.
getRandom()
1) Generate a random number from 0 to the last index.
2) Return the array element at the randomly generated index.
search(x)
Do a lookup for x in the hash map.
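The steps above can be sketched compactly in Python. This is a minimal illustration rather than a reference implementation; the class and method names are chosen for the example, and the O(1) bounds are average-case for the dict and amortized for the list append.

```python
import random


class RandomizedSet:
    """Set with O(1) average insert, remove, contains and get_random."""

    def __init__(self):
        self._items = []   # elements in arbitrary order
        self._index = {}   # element -> its position in self._items

    def insert(self, x):
        if x in self._index:
            return False
        self._index[x] = len(self._items)
        self._items.append(x)
        return True

    def remove(self, x):
        pos = self._index.pop(x, None)
        if pos is None:
            return False
        last = self._items.pop()       # removing from the end is O(1)
        if pos < len(self._items):     # x was not the last element:
            self._items[pos] = last    # move the old last element into the hole
            self._index[last] = pos    # and record its new position
        return True

    def contains(self, x):
        return x in self._index

    def get_random(self):
        # random.choice raises IndexError when the set is empty
        return random.choice(self._items)


s = RandomizedSet()
for value in (10, 20, 30):
    s.insert(value)
s.remove(10)
print(s.contains(10), s.get_random() in (20, 30))
```

Popping from the end first and only then filling the hole avoids the special case where the removed element is itself the last one, which is where several hand-written versions of this structure go wrong.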
Though this is way old, since there's no answer in C++, here's my two cents.
#include <cstdlib>
#include <iostream>
#include <unordered_map>
#include <vector>
template <typename T> class bucket{
    std::vector<T> v;              // elements in arbitrary order
    std::unordered_map<T, int> m;  // element -> its index in v
public:
    void insert(const T& item){
        //prevent insertion of duplicates
        if(m.find(item) != m.end()){
            std::exit(-1);
        }
        m.emplace(item, static_cast<int>(v.size()));
        v.push_back(item);
    }
    void remove(const T& item){
        //exits if the item is not present in the list
        auto it = m.find(item);
        if(it == m.end()){
            std::exit(-1);
        }
        int idx = it->second;
        int last = static_cast<int>(v.size()) - 1;
        m.erase(it);
        //move the last element into the vacated slot (O(1)),
        //unless the removed item was already the last one
        if(idx != last){
            v[idx] = v.back();
            m[v[idx]] = idx;
        }
        v.pop_back();
    }
    //precondition: the container is not empty
    T& getRandom(){
        int idx = rand() % v.size();
        return v[idx];
    }
    bool lookup(const T& item){
        return m.find(item) != m.end();
    }
    //method to check that remove has worked
    void print(){
        for(auto it = v.begin(); it != v.end(); it++){
            std::cout<<*it<<" ";
        }
    }
};
Here's a piece of client code to test the solution.
int main() {
bucket<char>* b = new bucket<char>();
b->insert('d');
b->insert('k');
b->insert('l');
b->insert('h');
b->insert('j');
b->insert('z');
b->insert('p');
std::cout<<b->getRandom()<<std::endl;
b->print();
std::cout<<std::endl;
b->remove('h');
b->print();
return 0;
}
In C# 3.0 + .NET Framework 4, a generic Dictionary<TKey,TValue> is more convenient than a Hashtable because you can use the System.Linq extension method ElementAt() to pick the KeyValuePair<TKey,TValue> at a given position. Note that Dictionary does not implement IList<T>, so ElementAt() enumerates the entries and takes O(N) rather than indexing in O(1):
using System.Linq;
Random _generator = new Random((int)DateTime.Now.Ticks);
Dictionary<string,object> _elements = new Dictionary<string,object>();
....
public object GetRandom()
{
return _elements.ElementAt(_generator.Next(_elements.Count)).Value;
}
However, as far as I know, a Hashtable (or its Dictionary progeny) is not a real solution to this problem because Put() can only be amortized O(1), not true O(1), because it is O(N) at the dynamic resize boundary.
Is there a real solution to this problem? All I can think of is if you specify a Dictionary/Hashtable initial capacity an order of magnitude beyond what you anticipate ever needing, then you get O(1) operations because you never need to resize.
I agree with Anon. Except for the last requirement, getting a random element with equal fairness, all the other requirements can be addressed using a single hash-based data structure. I would choose HashSet for this in Java. The hash code of an element modulo the table size gives the index into the underlying array in O(1) time, and I can use that for the add, remove and contains operations.
Can't we do this using Java's HashSet? It provides insert, delete and search in O(1) by default. For getRandom we can make use of the Set's iterator, which in any case iterates in an arbitrary order. We can just take the first element from the iterator without worrying about the rest of the elements.
public Integer getRandom(){
Iterator<Integer> sitr = s.iterator();
Integer x = sitr.next();
return x;
}
/* Java program to design a data structure that supports the following operations
in Theta(1) time
a) Insert
b) Delete
c) Search
d) getRandom */
import java.util.*;
// class to represent the required data structure
class MyDS
{
ArrayList<Integer> arr; // A resizable array
// A hash where keys are array elements and values are
// indexes in arr[]
HashMap<Integer, Integer> hash;
// Constructor (creates arr[] and hash)
public MyDS()
{
arr = new ArrayList<Integer>();
hash = new HashMap<Integer, Integer>();
}
// A Theta(1) function to add an element to MyDS
// data structure
void add(int x)
{
// If element is already present, then nothing to do
if (hash.get(x) != null)
return;
// Else put element at the end of arr[]
int s = arr.size();
arr.add(x);
// And put in hash also
hash.put(x, s);
}
// A Theta(1) function to remove an element from MyDS
// data structure
void remove(int x)
{
// Check if element is present
Integer index = hash.get(x);
if (index == null)
return;
// If present, then remove element from hash
hash.remove(x);
// Swap element with last element so that remove from
// arr[] can be done in O(1) time
int size = arr.size();
Integer last = arr.get(size-1);
Collections.swap(arr, index, size-1);
// Remove last element (This is O(1))
arr.remove(size-1);
// Update hash table for new index of last element, unless the
// removed element was itself the last one (it must stay removed)
if (index != size-1)
hash.put(last, index);
}
// Returns a random element from MyDS
int getRandom()
{
// Find a random index from 0 to size - 1
Random rand = new Random(); // Choose a different seed
int index = rand.nextInt(arr.size());
// Return element at randomly picked index
return arr.get(index);
}
// Returns index of element if element is present, otherwise null
Integer search(int x)
{
return hash.get(x);
}
}
// Driver class
class Main
{
public static void main (String[] args)
{
MyDS ds = new MyDS();
ds.add(10);
ds.add(20);
ds.add(30);
ds.add(40);
System.out.println(ds.search(30));
ds.remove(20);
ds.add(50);
System.out.println(ds.search(50));
System.out.println(ds.getRandom());
}
}
This solution properly handles duplicate values. You can:
To make this possible, we just need to keep a hash-set of indexes for each element.
import random


class RandomCollection:
    def __init__(self):
        self.map = {}
        self.list = []

    def get_random_element(self):
        return random.choice(self.list)

    def insert(self, element):
        index = len(self.list)
        self.list.append(element)
        if element not in self.map:
            self.map[element] = set()
        self.map[element].add(index)

    def remove(self, element):
        if element not in self.map:
            raise Exception("Element not found", element)
        # pop any index in constant time
        index = self.map[element].pop()
        # find last element
        last_index = len(self.list) - 1
        last_element = self.list[last_index]
        # keep map updated, this also works when removing
        # the last element because the index is re-added and then removed again
        self.map[last_element].add(index)
        self.map[last_element].remove(last_index)
        if len(self.map[element]) == 0:
            del self.map[element]
        # copy last element to index and delete last element
        self.list[index] = self.list[last_index]
        del self.list[last_index]


# Example usage:
c = RandomCollection()
times = 1_000_000
for i in range(times):
    c.insert("a")
    c.insert("b")
for i in range(times - 1):
    c.remove("a")
for i in range(times):
    c.remove("b")
print(c.list)  # prints ['a']
This question can be split into two cases: without duplicates and with duplicates.
//code without duplicates
class MyDataStructure
{
public:
// Initialize your data structure.
unordered_map<int,int>mp;
vector<int>v;
MyDataStructure()
{
}
// Insert element 'X'. Returns true if the element was not present, and false otherwise.
bool insert(int x)
{
if(mp.find(x)==mp.end())
{
mp[x]=v.size();
v.push_back(x);
return true;
}
return false;
}
// Removes element 'X', if present. Returns true if the element was present and false otherwise.
bool remove(int x)
{
if(mp.find(x)!=mp.end())
{
int lastVal = v.back();
int idx = mp[x]; //current index
v[idx] = lastVal;
mp[lastVal] = idx;
v.pop_back();
mp.erase(x);
return true;
}
return false;
}
// Search element 'X'. Returns true if the element was present, and false otherwise.
bool search(int x)
{
if(mp.find(x)==mp.end())return false;
return true;
}
int getRandom()
{
int randomIdx = rand() % v.size();
return v[randomIdx];
}
};
//with duplicates
class RandomizedCollection {
public:
unordered_map<int,int>mp;
vector<int>v;
RandomizedCollection(){
}
bool insert(int x) {
if(mp[x]==0)
{
mp[x]++;
v.push_back(x);
return true;
}
else if(mp[x]>0)
{
mp[x]++;
v.push_back(x);
return false;
}
return false;
}
bool remove(int x) {
if(mp[x]>0)
{
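// note: find() and erasing from the middle of the vector are both O(n),
// so this remove is not O(1)
auto it = find(v.begin(),v.end(),x);
v.erase(it);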
mp[x]--;
return true;
}
return false;
}
int getRandom() {
int randomIdx = rand() % v.size();
return v[randomIdx];
}
};
Why don't we use epoch%arraysize to find a random element? Finding the array size is O(n), but the amortized complexity will be O(1).
I think we can use a doubly linked list with a hash table. The key will be the element and its associated value will be the node in the doubly linked list.