MongoDB GridFS File Sizes huge for relatively small file
I'm doing some tests to see whether we can use GridFS on MongoDB to store files for a future application. I'm using 10gen's C# driver to upload an 80 MB file to the database.

The first addition was fine and took approx. 3 seconds, which isn't too bad on my test machine. However, subsequent additions of the same file took much longer, up to 30 seconds, and eventually MongoDB told me it had run out of memory and crashed.

Adding 10 files of 80 MB each results in 8 data files being created for my database before the system crashes. They are named dbaseName.0 to dbaseName.7; their sizes double from 16 MB to 512 MB across files 0 to 5, and files 6 and 7 are both 512 MB.

Those files come to just under 2 GB, so adding the file for the 10th time obviously takes the database over 2 GB, which is beyond my 32-bit test version's limit.
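To make the "just under 2 GB" figure concrete, here is a minimal Python sketch that rebuilds the file sizes described above. The function name and its parameters are made up for illustration, and the 16 MB starting size and 512 MB cap are simply the values observed on this test machine, not something read from MongoDB itself.

```python
def observed_file_sizes_mb(count, first_mb=16, cap_mb=512):
    """Return `count` data file sizes in MB, doubling from `first_mb`
    until `cap_mb` is reached (hypothetical helper; values taken from
    the sizes observed in the question)."""
    sizes = []
    size = first_mb
    for _ in range(count):
        sizes.append(size)
        size = min(size * 2, cap_mb)  # double, but never exceed the cap
    return sizes


sizes = observed_file_sizes_mb(8)  # dbaseName.0 to dbaseName.7
total_mb = sum(sizes)
print(sizes)
print(total_mb, "MB on disk, against a 2048 MB (2 GB) limit")
```

The eight files add up to 2032 MB, so a ninth 512 MB file cannot fit under the 2 GB limit.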

Why does storing 800 MB worth of files take over 2 GB? Is there a setting I've missed somewhere?

Does MongoDB hold the entire GridFS in RAM constantly? If so, what's the point of the disk? If I've only got 32 GB of RAM on my production server, can I only store 32 GB in GridFS?

I called EnsureIndexes on my MongoGridFS object and checked the database, which shows that the GridFS indexes were created, so surely Mongo shouldn't try to fit the whole datastore into RAM?

MongoDB fits all of our needs, but we need it to be able to hold a large file collection; am I missing something obvious?

Stack Trace:

Mon Oct 15 11:57:15 [conn15] insert busyNow.fs.chunks keyUpdates:0 locks(micros) w:112892 113ms
Mon Oct 15 11:57:15 [conn15] MapViewOfFileEx for /data/db/busyNow.7 failed with errno:8 Not enough storage is available to process this command. (file size is 536608768) in MemoryMappedFile::map

Mon Oct 15 11:57:15 [conn15]  busyNow.fs.chunks Fatal Assertion 16166
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\util\assert_util.cpp(124)                               mongo::fassertFailed+0x75
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\util\mmap_win.cpp(211)                                  mongo::MemoryMappedFile::map+0x4ce
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\mongommf.cpp(182)                                    mongo::MongoMMF::create+0xa3
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\pdfile.cpp(469)                                      mongo::MongoDataFile::open+0x141
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\database.cpp(280)                                    mongo::Database::getFile+0x34f
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\database.cpp(332)                                    mongo::Database::suitableFile+0x129
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\database.cpp(359)                                    mongo::Database::allocExtent+0x41
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\pdfile.cpp(1271)                                     mongo::outOfSpace+0x107
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\pdfile.cpp(1293)                                     mongo::allocateSpaceForANewRecord+0x5d
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\pdfile.cpp(1463)                                     mongo::DataFileMgr::insert+0x493
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\pdfile.cpp(1217)                                     mongo::DataFileMgr::insertWithObjMod+0x33
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\instance.cpp(761)                                    mongo::checkAndInsert+0x72
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\instance.cpp(821)                                    mongo::receivedInsert+0x4cd
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\instance.cpp(434)                                    mongo::assembleResponse+0x62a
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\db\db.cpp(192)                                          mongo::MyMessageHandler::process+0xe8
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\mongo\util\net\message_server_port.cpp(86)                    mongo::pms::threadRun+0x424
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\third_party\boost\boost\thread\detail\thread.hpp(62)          boost::detail::thread_data<boost::_bi::bind_t<void,void (__cdecl*)(mongo::MessagingPort *),boost::_bi::list1<boost::_bi::value<mongo::MessagingPort *> > > >::run+0x9
Mon Oct 15 11:57:17 [conn15] mongod.exe  ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(16707566)  boost::`anonymous namespace'::thread_start_function+0x47
Mon Oct 15 11:57:17 [conn15] mongod.exe  f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c(314)                _callthreadstartex+0x1b
Mon Oct 15 11:57:17 [conn15] mongod.exe  f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c(292)                _threadstartex+0x64
Mon Oct 15 11:57:17 [conn15]

***aborting after fassert() failure


Mon Oct 15 11:58:33 [initandlisten] connection accepted from 127.0.0.1:56308 #16 (3 connections now open)
Pantheas answered 15/10, 2012 at 11:20

OK, after much searching it seems that MongoDB preallocates its data files, each one double the size of the last, up to 2 GB; after that every new file is 2 GB.

http://www.mongodb.org/display/DOCS/Excessive+Disk+Space

My test program adds the 80 MB files, which are stored inside the background data files (.0 to .7 etc.). As soon as data chunks start to be written into the last file, Mongo preallocates another file twice the size of the previous one.

So the first 80 MB file fills up the 16 MB, 32 MB and 64 MB background files and, because the metadata takes up a bit more space, must encroach slightly onto the 128 MB file. This triggers Mongo to preallocate a 256 MB file, for a total of 496 MB. As more files are added, more data files are preallocated, and when 2 GB is hit on my test machine Mongo can't access the space and collapses.
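The 496 figure can be checked with a short Python sketch. It only sums the file sizes named in the paragraph above; it does not model how GridFS chunks actually fill each file, and the variable names are made up for illustration.

```python
from itertools import accumulate

# Data file sizes in MB as named above: the 16, 32, 64 and 128 MB files
# that hold data, plus the 256 MB file preallocated ahead of them.
file_sizes_mb = [16, 32, 64, 128, 256]

# Running total of disk space after each file is created.
running_total_mb = list(accumulate(file_sizes_mb))

for size, total in zip(file_sizes_mb, running_total_mb):
    print(f"{size:>4} MB file -> {total:>4} MB on disk")
```

The last running total is the disk space reported after the first upload.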

So although one 80 MB file seems to take up a lot more space than it should, it makes sense in a roundabout way.

Preallocation can be turned off by running mongod with --noprealloc, though this is recommended for test machines only.

Thanks for your replies!

Pantheas answered 16/10, 2012 at 10:23

GridFS does not keep all of the files in RAM.

Do you have the stacktrace or can you reproduce the crash again?

Nicks answered 15/10, 2012 at 11:40 Comment(3)
I've edited my original question to include a stack trace. Thanks. (Pantheas)
The problem is that on a 32-bit system you can't allocate more than 2 GB of data files. --smallfiles limits each data file to 512 MB, so that should help, though running MongoDB on a 32-bit system isn't recommended. More info here: mongodb.org/display/DOCS/32+bit (Nicks)
The 32-bit version is only on my test machine; I'm running the 64-bit version on our server. However, the individual files I'm using are only 80 MB, nowhere near 2 GB, so I don't know why Mongo can't seem to handle them. (Pantheas)
