How to generate short unique names for uploaded files in nodejs
W

8

21

I need to name uploaded files with short unique identifiers like nYrnfYEv, a4vhAoFG, or hwX6aOr7. How can I ensure that the file names are unique?

Waterbuck answered 13/4, 2015 at 12:44 Comment(9)
See: #1349904Rheinlander
From the first comment of this post: "Warning: None of the answers have a true-random result! They are only pseudo-random. When using random strings for protection or security, don't use any of them!!! Try one of these APIs." I need really unique namesWaterbuck
@Waterbuck you can check my answer nowDivergent
Check shortid. It sounds exactly like the one you're looking for.Sletten
Does it generate unique, non-repeating ids?Waterbuck
@Waterbuck no it's still something probabilistic. Which isn't necessarily bad, depending on the requirements.Sletten
@Qualcuno I need to generate completely unique idsWaterbuck
@Waterbuck then either you check the id against a database (or the filesystem), or you add a timestamp at the beginning of the file name. Modules like shortid are designed so the chance for collisions is extremely low, almost 0, and if you prepend something like a timestamp you can be 100% certain it's never going to result in a collision (and it's pretty hard to guess too)Sletten
Ok, but why don't modules like shortid add something like a timestamp to be sure it's 100% unique?Waterbuck
S
25

Update: shortid is deprecated. Use Nano ID instead. The answer below applies to Nano ID as well.


(Posting my comments as answer, with responses to your concerns)

You may want to check out the shortid NPM module, which generates short ids (shockingly, I know :) ) similar to the ones you were posting as example. The result is configurable, but by default it's a string between 7 and 14 characters (length is random too), all URL-friendly ([A-Za-z0-9_-] in a regex).

To answer your (and other posters') concerns:

  • Unless your server has a true random number generator (highly unlikely), every solution will use a PRNG (Pseudo-Random Number Generator). shortid uses Node.js crypto module to generate PRNG numbers, however, which is a much better generator than Math.random()
  • shortid's are not sequential, which makes it even harder to guess them
  • While shortid's are not guaranteed to be unique, the likelihood of a collision is extremely small. Unless you generate billions of entries per year, you could safely assume that a collision will never happen.
  • For most cases, relying on probability to trust that collisions won't happen is enough. If your data is too important to risk even that tiny amount, you could make the shortid basically 100% unique by just prepending a timestamp to it. As an additional benefit, the file names will be harder to guess too. (Note: I wrote "basically 100% unique" because you could still, in theory, have a collision if two items are generated with the same timestamp, i.e. in the same second. However, I would never be concerned about this. To have real 100% certainty, your only option is to run a check against a database or the filesystem, but that requires more resources.)
  • The reason why shortid doesn't do that by itself is because for most applications the likelihood of a collision is too small to be a concern, and it's more important to have the shortest possible ids.
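
The timestamp idea from the list above can be sketched without any third-party module. This is only an illustration under stated assumptions: `timestampedId` is a hypothetical helper (it is not part of shortid or Nano ID), and the random part comes from Node's built-in crypto module rather than from either library.

```javascript
const crypto = require('crypto');

// Hypothetical helper, not part of shortid or Nano ID.
function timestampedId(randomByteCount = 6) {
  // Milliseconds since the epoch, written in base 36 to keep the prefix short.
  const time = Date.now().toString(36);
  // Random bytes from the crypto module, converted to a URL-friendly alphabet.
  // 6 bytes encode to exactly 8 base64 characters with no padding.
  const random = crypto
    .randomBytes(randomByteCount)
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
  return `${time}-${random}`;
}

console.log(timestampedId());
```

Two ids created in the same millisecond share the prefix, so uniqueness within that millisecond still rests on the random part, as the note above explains.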
Sletten answered 13/4, 2015 at 14:39 Comment(5)
Appending a timestamp isn't necessary. shortid ensures multiple calls within the same second generate unique IDs.Domett
@RichardPoole That's exactly why I'm suggesting adding the timestamp as an option (absolutely not necessary). In different seconds, the same ID could theoretically be generated again. This of course depends on how many IDs you generate, and all the points I made above stand validSletten
An approach I've been using, regardless of the generator used (it could be shortid or xxhash, for instance) is incrementing an integer counter each time a file is uploaded that essentially keeps track of the number of files that were uploaded since the server is running and to insert this counter in the seed for the filename. With this method combined with inserting a unix timestamp, I don't see any collision possible, unless you upload 2^53 files without ever restarting your server.Interbedded
In fact, for a collision to happen using this method, 2^53 files would have to be uploaded in one second, I'm fairly certain that's impossible.Interbedded
@Gaboik1 the problem with that is that you need a way to store the unique index - for example a database. It will take two DB calls on every ID generation: one to read the index, and another one to update it. With shortId (et similia) plus the timestamp, chances of a collision are already incredibly small, even at large scale. It's also stateless and so suitable for using with distributed systems, without a centralized database/repository which would be a bottleneck.Sletten
M
19

One option could be to generate unique identifiers (UUID) and rename the file(s) accordingly.

Have a look at the kelektiv/node-uuid npm module.


EXAMPLE:

$ npm install uuid

...then in your JavaScript file:

const { v4: uuidv4 } = require('uuid'); // I chose v4; you can select others (recent uuid releases no longer support require('uuid/v4'))
const filename = uuidv4(); // e.g. '110ec58a-a0f2-4ac4-8393-c866d813b8d1'

Any time you execute uuidv4() you'll get a very-fresh-new-one.

NOTICE: There are other choices/types of UUIDs. Read the module's documentation to familiarize with those.
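
As a sketch of how a version 4 UUID could become an upload name: recent Node.js releases (14.17 and later, as far as I know) ship `crypto.randomUUID()`, so the example below needs no extra module. `uuidFileName` is a hypothetical helper, and keeping the original extension is an assumption about what you want, not something the uuid module does for you.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper: builds a storage name from the client's original name.
function uuidFileName(originalName) {
  // Keep only the extension of the uploaded name, lowercased.
  const ext = path.extname(originalName).toLowerCase();
  // crypto.randomUUID() returns a random version 4 UUID string.
  return crypto.randomUUID() + ext;
}

console.log(uuidFileName('photo.JPG'));
```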

Mage answered 13/4, 2015 at 13:24 Comment(5)
I need to name uploaded files.Waterbuck
@Erik: That's why. You receive some stream from somewhere, and when you save it, you name it. Even so, you can rename it.Mage
@ Machina yes but I still need to generate unique nameWaterbuck
why the sudo? IMHO, npm install --save node-uuidBrinn
It's a bad idea to shorten UUIDs, as it doesn't guarantee uniqueness when you shorten them.Thermomotor
G
5

Very simple code. It produces a filename that is almost certainly unique; if that's not enough, you can also check whether the file already exists.

function getRandomFileName() {
    // ISO timestamp with separators removed, e.g. 20210112T091000123Z
    var timestamp = new Date().toISOString().replace(/[-:.]/g, "");
    // up to six decimal digits taken from Math.random()
    var random = ("" + Math.random()).substring(2, 8);
    return timestamp + random;
}
Gamache answered 12/1, 2021 at 9:10 Comment(0)
S
2
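// Declare the function first: `export default name = ...` assigns to an undeclared variable, which throws in an ES module.
const generateRandom = () => Math.random().toString(36).substring(2, 15) + Math.random().toString(23).substring(2, 5);
export default generateRandom;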

As simple as that!

Sain answered 25/2, 2022 at 8:22 Comment(1)
This is nice and functional for some simple usecasesSchweiker
B
0
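const fs = require('fs');
const path = require('path');

// Returns the first path of the form <filePath>/<stub><n> that does not exist yet.
function uniqueFileName( filePath, stub)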
{
    let id = 0;
    let test = path.join(filePath, stub + id++);
    while (fs.existsSync(test))
    {
        test = path.join(filePath, stub + id++);
    }
    return test;
}
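
One caveat with an exists-then-use check is the gap between the check and the later write: two uploads arriving at the same moment could both be handed the same name. A possible way around that, sketched here under assumptions, is to create the file with the `'wx'` flag, which makes `fs.openSync` fail with `EEXIST` if the path is already taken. `reserveUniqueFile` and its parameters are hypothetical names chosen for this illustration.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: creates an empty file under a random name and returns its path.
function reserveUniqueFile(dir, ext = '', maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const name = crypto.randomBytes(6).toString('hex') + ext;
    const candidate = path.join(dir, name);
    try {
      // 'wx' opens for writing but fails with EEXIST if the path exists,
      // so checking and creating happen in a single step.
      fs.closeSync(fs.openSync(candidate, 'wx'));
      return candidate;
    } catch (err) {
      if (err.code !== 'EEXIST') throw err; // unrelated failure: rethrow
      // Name already taken: loop and try another random name.
    }
  }
  throw new Error('could not reserve a unique file name');
}

// Usage: reserve a name in a fresh temporary directory.
const uploadDir = fs.mkdtempSync(path.join(os.tmpdir(), 'uploads-'));
console.log(reserveUniqueFile(uploadDir, '.png'));
```

The upload can then be written into the reserved path. The retry limit is an arbitrary choice for the sketch.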
Blandish answered 1/9, 2021 at 13:15 Comment(0)
E
0
function GetTempName()
{
    function gen(count) 
    { 
        let str = '';
        for (let i = 1; i <= count; i++) 
        {
            // pick one hexadecimal digit (0-9, a-f) at random
            str += Math.floor(Math.random() * "0123456789abcdef".length).toString(16);
        }
        return  str;
    }
    return gen(8)+'-'+gen(4)+'-'+gen(4)+'-'+gen(4)+'-'+gen(12);
    //return gen(5);
    
}

console.log(GetTempName());
Egarton answered 25/1 at 6:41 Comment(0)
R
-5

I think you might be confused about true-random and pseudo-random.

Pseudo-random strings 'typically exhibit statistical randomness while being generated by an entirely deterministic causal process'. What this means is, if you are using these random values as entropy in a cryptographic application, you do not want to use a pseudo-random generator.

For your use, however, I believe it will be fine - just check for potential (highly unlikely) clashes.

All you are wanting to do is create a random string - not ensure it is 100% secure and completely random.
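
To put a rough number on "highly unlikely", the usual birthday-bound approximation can be used: for n ids drawn uniformly from N possible values, the chance of at least one clash is about 1 - exp(-n² / (2N)). The sketch below assumes ids are drawn uniformly at random, which is an idealization, and `collisionProbability` is a hypothetical helper written for this illustration.

```javascript
// Birthday-bound approximation: p ≈ 1 - exp(-n^2 / (2N)),
// where n is the number of ids and N the number of possible ids.
function collisionProbability(count, alphabetSize, length) {
  const space = Math.pow(alphabetSize, length); // N = alphabetSize^length
  // -Math.expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
  return -Math.expm1(-(count * count) / (2 * space));
}

// One million 8-character ids over [A-Za-z0-9] (62 symbols).
console.log(collisionProbability(1e6, 62, 8));
```

For one million 8-character alphanumeric ids this comes out at roughly 0.2%, which is why a clash check (or a longer id) is still worth having at that volume.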

Rheinlander answered 13/4, 2015 at 13:0 Comment(1)
Unless you have a TRNG hardware module, which is highly unlikely, every software solution uses a PRNG. Some, however, are "more random": using Node.js crypto module, for example, is a better PRNG than Math.random().Sletten
D
-6

Try the following snippet:

    function getRandomSalt() {
        var milliseconds = new Date().getTime();
        var timestamp = milliseconds.toString().substring(9, 13);
        var random = ("" + Math.random()).substring(2, 8);
        var random_number = timestamp + random;  // string will be unique because timestamp never repeats itself
        // base64_encode is not defined in Node.js; Buffer produces the same encoding
        var random_string = Buffer.from(random_number).toString('base64').substring(2, 8); // you can set the size of the returned string here
        var Exp = /((^[0-9]+[a-z]+)|(^[a-z]+[0-9]+))+[0-9a-z]+$/i;
        if (random_string.match(Exp)) {                 // check whether the string is alphanumeric
            return random_string;
        }
        return getRandomSalt();  // otherwise call recursively again
    }

The file name will be a unique alphanumeric string, as you require. Uniqueness is based on the timestamp of the current time, because the current time never repeats itself in the future; to make it stronger I have applied base64 encoding, which converts it into an alphanumeric string.

   var file = req.files.profile_image;
   var tmp_path = file.path;
   var fileName = file.name;
   var file_ext = fileName.substr((Math.max(0, fileName.lastIndexOf(".")) || Infinity) + 1);
   var newFileName = getRandomSalt() + '.' + file_ext;

Thanks

Divergent answered 13/4, 2015 at 13:30 Comment(5)
getRandomSalt always generates same stringWaterbuck
Thanks for the code. How high is the chance that we get a repeated id?Waterbuck
Never. It's a combination of timestamp and random number. After that base64 encoding has been done. You can check through billions of data and verify my answer after that.Divergent
it would be great if you choose large string as a return string of getRandomSalt functionDivergent
the comment about timestamp never repeating itself is perhaps misleading. When I type for(i=0; i < 60; ++i) console.log(new Date().getTime()); I can see the timestamp repeat itself dozens of times. I suppose the comment is correct if you're running on a machine where the rest of the code is guaranteed to consume more than a millisecond. Either way, most of the timestamp string is not changing, and therefore contributing very little to the uniqueness. I ran this on node and got a collision in the first 60 iterations. Ran it 600 times, got a stack overflow (try to prove the recursion halts!).Quixotic
