Set Django's FileField to an existing file
S

7

112

I have an existing file on disk (say /folder/file.txt) and a FileField model field in Django.

When I do

instance.field = File(open('/folder/file.txt', 'rb'))
instance.save()

it re-saves the file as file_1.txt (the next time it's _2, etc.).

I understand why, but I don't want this behavior - I know the file I want the field to be associated with is really there waiting for me, and I just want Django to point to it.

How?

Sublapsarianism answered 30/11, 2011 at 20:20 Comment(3)
Not sure you can get what you want without modifying Django or subclassing FileField. Whenever a FileField is saved, a new copy of the file is created. It would be fairly straightforward to add an option to avoid this. - Provitamin
Well yes, it looks like I have to subclass and add a parameter. I don't want to create extra tables for this simple task. - Sublapsarianism
Put the file in a different location, create your field with this path, and save it; you then have the file in the upload_to destination. - Cruiserweight
S
25

If you want to do this permanently, you need to create your own FileStorage class

import os
from django.core.files.storage import FileSystemStorage

class MyFileStorage(FileSystemStorage):

    # This method is actually defined in Storage.
    # max_length is accepted because newer Django versions pass it.
    def get_available_name(self, name, max_length=None):
        # Note: this deletes the existing file so the new upload can reuse its name
        if self.exists(name):
            os.remove(self.path(name))
        return name  # simply returns the name passed

Now in your model, you use your modified MyFileStorage

from django.db import models
from mystuff.customs import MyFileStorage

mfs = MyFileStorage()

class SomeModel(models.Model):
    my_file = models.FileField(storage=mfs)
Serrate answered 1/12, 2011 at 6:14 Comment(5)
Oh, looks promising, because the FileField code is kind of non-intuitive. - Sublapsarianism
But is it possible to change the storage on a per-request basis, like: instance.field.storage = mfs; instance.field.save(name, file), but not doing it in a different branch of my code? - Sublapsarianism
No, since the storage engine is tied to the model. You can avoid all this by simply storing your file path in either a FilePathField or simply as plain text. - Serrate
You can't just return a name. You need to remove the existing file first. - Seline
This solution only seems correct: it actually removes the file that is already present and creates a new one with the same name. In the end, it does NOT "point to it" as the author wrote. Imagine a situation where the user wants to point to a large file but actually ends up removing it and uploading it from scratch. - Chemoreceptor
D
158

Just set instance.field.name to the path of your file,

e.g.

from django.db import models

class Document(models.Model):
    file = models.FileField(upload_to=get_document_path)
    description = models.CharField(max_length=100)


doc = Document()
doc.file.name = 'path/to/file'  # must be relative to MEDIA_ROOT
doc.file
<FieldFile: path/to/file>
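Since the name has to be relative to MEDIA_ROOT, here is a minimal sketch of turning an absolute path on disk into such a name. The helper name_for_file_field is hypothetical (it is not part of Django), and it assumes the default filesystem storage rooted at MEDIA_ROOT:

```python
from pathlib import Path


def name_for_file_field(absolute_path, media_root):
    """Return absolute_path relative to media_root, using forward slashes.

    Hypothetical helper, not a Django API. Raises ValueError when the
    file lives outside media_root, because such a file cannot be
    referenced by a name relative to it.
    """
    root = Path(media_root).resolve()
    target = Path(absolute_path).resolve()
    return target.relative_to(root).as_posix()
```

You would then do something like doc.file.name = name_for_file_field('/folder/file.txt', settings.MEDIA_ROOT) followed by doc.save(); setting the name this way only stores the string in the database and should not copy or rename the file.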
Donegan answered 5/6, 2012 at 22:43 Comment(2)
The relative path from your MEDIA_ROOT, that is. - Hasten
In this example, I think you can also just do doc.file = 'path/to/file'. - Somnolent
W
24

Try this (doc):

instance.field.name = <PATH RELATIVE TO MEDIA_ROOT> 
instance.save()
Widower answered 23/8, 2013 at 15:8 Comment(0)
P
5

Writing your own storage class is the right approach. However, get_available_name is not the right method to override.

get_available_name is called when Django sees a file with the same name and tries to get a new available file name. It is not the method that causes the rename; the method that causes it is _save. The comments in _save are pretty good, and you can easily see that it opens the file for writing with the flag os.O_EXCL, which throws an OSError if a file with the same name already exists. Django catches this error and then calls get_available_name to get a new name.

So I think the correct way is to override _save and call os.open() without the os.O_EXCL flag. The modification is quite simple, but the method is a little long, so I won't paste it here. Tell me if you need more help :)
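To illustrate the mechanism, here is a simplified sketch using only the standard library. It is not Django's actual _save; save_exclusive, save_overwrite and next_name are hypothetical names, and the _1, _2 suffix scheme just mirrors the behaviour described in the question:

```python
import os

# O_BINARY only exists on Windows; fall back to 0 elsewhere.
_BINARY = getattr(os, "O_BINARY", 0)


def next_name(path, attempt):
    """Build 'file_1.txt', 'file_2.txt', ... from 'file.txt'."""
    root, ext = os.path.splitext(path)
    return f"{root}_{attempt}{ext}"


def save_exclusive(path, data):
    """Never clobber: O_EXCL makes os.open fail if the file exists,
    so on a collision we retry under a new name."""
    candidate, attempt = path, 0
    while True:
        try:
            fd = os.open(candidate, os.O_WRONLY | os.O_CREAT | os.O_EXCL | _BINARY)
        except FileExistsError:
            attempt += 1
            candidate = next_name(path, attempt)
            continue
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        return candidate


def save_overwrite(path, data):
    """Without O_EXCL (and with O_TRUNC) the existing file is replaced."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | _BINARY)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return path
```

Note that even the overwrite variant still rewrites the file contents, so it avoids the rename but not the copy.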

Persuader answered 23/3, 2012 at 12:48 Comment(3)
It's 50 lines of code that you have to copy, which is pretty bad. Overriding get_available_name seems more isolated, shorter, and much safer for, say, upgrading to newer versions of Django in the future. - Hyden
The problem with only overriding get_available_name is that when you upload a file with the same name, the server gets into an endless loop: _save checks the file name and decides to get a new one, but get_available_name still returns the duplicate. So you need to override both. - Persuader
Oops, we're having this discussion in two questions, and only now did I notice that they are slightly different. So I'm right in that question, and you are right in this one. - Hyden
A
3

The other answers work fine if you are using the app's filesystem to store your files. But if you are using boto3 and uploading to something like AWS S3, and you want to point your model's FileField at a file that already exists in an S3 bucket, then this is what you need.

We have a simple model class with a FileField:

class Image(models.Model):
    
    img = models.FileField()
    owner = models.ForeignKey(get_user_model(), on_delete=models.CASCADE, related_name='images')

    date_added = models.DateTimeField(editable=False)
    date_modified = models.DateTimeField(editable=True)
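
The upload code then looks like this:

import os

import boto3
from botocore.exceptions import ClientError

# S3_DIR, filename, local_file_path and owner_id are defined elsewhere
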
s3 = boto3.client(
    's3',
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY")
)

s3_key = S3_DIR + '/' + filename
bucket_name = os.getenv("AWS_STORAGE_BUCKET_NAME")

try:
    s3.upload_file(local_file_path, bucket_name, s3_key)
    # we want to store it to our db model called **Image** after s3 upload is complete so,
    image_data = Image()
    image_data.img.name = s3_key # this does it !!
    image_data.owner = get_user_model().objects.get(id=owner_id)
    image_data.save()
except ClientError as e:
    print(f"failed uploading to s3 {e}")

Setting the S3 key as the name of the FileField does the trick. As far as I have tested, everything related works as expected, e.g. previewing the image file in the Django admin. Fetching the images from the database also prepends the root S3 bucket prefix (or the CloudFront CDN prefix) to the S3 keys of the files. Of course, this assumes that I already had a working Django settings.py setup for boto and S3.

Auschwitz answered 4/5, 2021 at 0:5 Comment(0)
W
1

I had exactly the same problem! Then I realized that my models were causing it. For example, I had my models like this:

class Tile(models.Model):
  image = models.ImageField()

Then I wanted to have more than one tile referencing the same file on disk. The way I found to solve that was to change my model structure to this:

class TileImage(models.Model):
  image = models.ImageField()

class Tile(models.Model):
  image = models.ForeignKey(TileImage, on_delete=models.CASCADE)

Afterwards I realized that this makes more sense: if I want the same file to be saved more than once in my DB, I have to create another table for it.

I guess you can solve your problem like that too, assuming you can change the models!

EDIT

Also I guess you can use a different storage, like this for instance: SymlinkOrCopyStorage

http://code.welldev.org/django-storages/src/11bef0c2a410/storages/backends/symlinkorcopy.py

Woodsman answered 30/11, 2011 at 20:51 Comment(8)
Makes sense in your case, not in mine. I don't want it to be referenced multiple times. I create an object referencing a file, then I realize there are errors in other attributes, and I reopen the creation form. On its resubmission I don't want to lose the file which is already saved on disk. - Sublapsarianism
So I guess you can use my approach! You will have a table FormFile which holds only the file, and then in your Form table you'll have an FK to that file, so you can change or create new forms for the same file. (By the way, I am changing the order of the FK in my main example.) - Woodsman
If you want to post your domain (models) in your question, I may have a better idea too! - Woodsman
The domain actually doesn't matter: I have a model with a photo associated with it, and I have a custom editing screen. Once uploaded, I want the photo to remain on the server, but I don't like spawning a separate model, table and FK lookup just because there appears to be a framework limitation. - Sublapsarianism
The limitation here, I guess, is that when you save a FileField in Django, it always passes through Django's storages, so it won't make sense to just force a file path. Also, how should Django know that the file already exists at that path? Another approach is to use FilePathField instead, so you can just set the path in your DB and do the lookup the way you think is best. - Woodsman
I guess I found a django-storages backend that could help you achieve what you want; check the EDIT in my post! - Woodsman
Thanks for your effort, but it looks like it really complicates things. I should probably just subclass the ImageField (the one I actually use) and give it an option to force it not to re-save the file. - Sublapsarianism
The URL pointing to SymlinkOrCopy is broken. Use this instead. Also, note that there are two django-storages repositories on GitHub: 1. jschneier/django-storages, 2. e-loue/django-storages. - Arrowroot
I
1

You should define your own storage, inherit it from FileSystemStorage, and override the OS_OPEN_FLAGS class attribute and the get_available_name() method:

Django Version: 3.1

Project/core/files/storages/backends/local.py

import os

from django.core.files.storage import FileSystemStorage


class OverwriteStorage(FileSystemStorage):
    """
    FileSystemStorage subclass that allows overwriting files that
    already exist.

    Be careful using this class, as user-uploaded files will overwrite
    already existing files.
    """

    # This flag combination keeps os.open() from raising OSError when
    # the file already exists before it is opened.
    OS_OPEN_FLAGS = os.O_WRONLY | os.O_TRUNC | os.O_CREAT | getattr(os, 'O_BINARY', 0)

    def get_available_name(self, name, max_length=None):
        """
        This method will be called before starting the save process.
        """
        return name

In your model, use your custom OverwriteStorage

myapp/models.py

from django.db import models

from core.files.storages.backends.local import OverwriteStorage


class MyModel(models.Model):
    my_file = models.FileField(storage=OverwriteStorage())
Incus answered 29/8, 2020 at 7:13 Comment(0)
