You seem to be looking for a way to encrypt the strings in your DataFrame. There are several Python encryption libraries, such as cryptography.
Using it is pretty simple: just apply it to each element.
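import pandas as pd
from cryptography.fernet import Fernet
df = pd.DataFrame([{'a': 'a', 'b': 'b'}, {'a': 'a', 'b': 'c'}])
# Fernet needs a 32-byte url-safe base64 key, not an arbitrary password string
key = Fernet.generate_key()
f = Fernet(key)
# On pandas 2.1 and later, DataFrame.map is the preferred name for applymap
res = df.applymap(lambda x: f.encrypt(x.encode('utf-8')))
# Decrypt (decrypt returns bytes, so decode back to str)
res.applymap(lambda x: f.decrypt(x).decode('utf-8'))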
That is probably the best option in terms of security, but it produces a long byte string for every value, which is hard to read.
# 'a' -> b'gAAAAABaRQZYMjB7wh-_kD-VmFKn2zXajMRUWSAeridW3GJrwyebcDSpqyFGJsCEcRcf68ylQMC83G7dyqoHKUHtjskEtne8Fw=='
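Note that Fernet expects a 32-byte url-safe base64-encoded key rather than an arbitrary password. If you want to start from a password, one option is to stretch it with PBKDF2 first. The sketch below uses only the standard library; the function name `key_from_password`, the salt size, and the iteration count are my own assumptions, not anything prescribed by the library.

```python
import base64
import hashlib
import os

def key_from_password(password: str, salt: bytes, iterations: int = 480_000) -> bytes:
    """Derive a 32-byte key from a password and encode it as url-safe base64.

    Hypothetical helper: the name and default iteration count are illustrative.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return base64.urlsafe_b64encode(raw)

# The salt must be stored alongside the data so the same key can be rebuilt later
salt = os.urandom(16)
key = key_from_password("password", salt)
```

The resulting `key` can then be passed to `Fernet(key)`. Keep the salt around: without it you cannot derive the same key again to decrypt.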
Another simple way to solve your problem is to create a function that maps each distinct value to a code, creating a new code whenever it sees a value for the first time.
mapper = {}
def encode(x):
    if x not in mapper:
        # This part can be changed to anything, really,
        # such as mapper[x] = randint(-10**10, 10**10).
        # Just ensure the generated values never repeat
        mapper[x] = len(mapper) + 1
    return mapper[x]
res = df.applymap(encode)
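This mapping can be reversed as long as you keep the lookup table. Here is a minimal sketch in plain Python, assuming you store a second dict going the other way; the names `reverse` and `decode` are my own additions for illustration. Keep in mind this is a lookup, not encryption: anyone who has the table can recover the original values.

```python
mapper = {}    # original value -> code
reverse = {}   # code -> original value (hypothetical helper dict)

def encode(x):
    """Return the code for x, assigning the next integer on first sight."""
    if x not in mapper:
        code = len(mapper) + 1
        mapper[x] = code
        reverse[code] = x
    return mapper[x]

def decode(code):
    """Look up the original value for a previously issued code."""
    return reverse[code]

codes = [encode(v) for v in ["a", "b", "a", "c"]]
restored = [decode(c) for c in codes]
```

With a DataFrame you would apply `encode` and `decode` element-wise in the same way as above, for example `res.applymap(decode)`.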
ssn column? Can there be a decryption for this type of method? I assume not – Bield