In the Ashton String task, the goal is to:
Arrange all the distinct substrings of a given string in lexicographical order and concatenate them. Print the Kth character of the concatenated string. It is assured that given value of K will be valid i.e. there will be a Kth character.
The Input Format
:
First line will contain a number T i.e. number of test cases. First line of each test case will contain a string containing characters (a−z) and second line will contain a number K.
The Output Format
:
Print Kth character ( the string is 1 indexed )
And the Constraints
are
1 ≤ T ≤ 5
1 ≤ length ≤ 105
K will be an appropriate integer.
For example, given the input:
1
dbac
3
The output would be: c
I've attempted the task with this code and it works for relatively short strings:
from itertools import chain
def zipngram(text, n=2):
words = list(text)
return zip(*[words[i:] for i in range(n)])
for _ in input():
t = input()
position = int(input())-1 # 0th indexing
chargrams = chain(*[zipngram(t,i) for i in range(1,len(t)+1)])
concatstr = ''.join(sorted([''.join(s) for s in chargrams]))
print (concatstr[position])
But if the input file looks like this: http://pastebin.com/raw/WEi2p09H and the desired output is:
l
s
y
h
s
The interpreter will throw a MemoryError
:
Traceback (most recent call last):
File "solution.py", line 11, in <module>
chargrams = chain(*[zipngram(t,i) for i in range(1,len(t)+1)])
File "solution.py", line 11, in <listcomp>
chargrams = chain(*[zipngram(t,i) for i in range(1,len(t)+1)])
File "solution.py", line 6, in zipngram
return zip(*[words[i:] for i in range(n)])
File "solution.py", line 6, in <listcomp>
return zip(*[words[i:] for i in range(n)])
MemoryError
How can the MemoryError be resolved? Is it solvable in another way using native python2 or python3?
I tried to resolve the MemoryError
by pruning the stack using heapq
but now it goes into an uber slow runtime pushing and popping the heap instead of taking up too much memory.
from itertools import chain
import heapq
t = int(input())
s = list(input())
k = int(input())
stack = []
for n in range(1,len(s)+1):
for j in range(n):
ngram = (''.join(s[j:]))
ngram_len = len(ngram)
# Push the ngram into the heapq.
heapq.heappush(stack, ngram)
pruned_stack = []
temp_k = 0
# Prune stack.
while stack != [] and temp_k < k:
x = heapq.heappop(stack)
pruned_stack.append(x)
temp_k+=len(x)
# Replace stack with pruend_stack.
stack = pruned_stack
print (''.join(chain(*pruned_stack))[k])
Is there a way to balance between not using too much memory that it leads to MemoryError
and too slow runtime with heapq
pushing and popping?