|
| 1 | +.. _ruby-gridfs: |
| 2 | + |
| 3 | +================================= |
| 4 | +Store Large Files by Using GridFS |
| 5 | +================================= |
| 6 | + |
| 7 | +.. contents:: On this page |
| 8 | + :local: |
| 9 | + :backlinks: none |
| 10 | + :depth: 1 |
| 11 | + :class: singlecol |
| 12 | + |
| 13 | +.. facet:: |
| 14 | + :name: genre |
| 15 | + :values: reference |
| 16 | + |
| 17 | +.. meta:: |
| 18 | + :keywords: binary large object, blob, storage |
| 19 | + |
| 20 | +Overview |
| 21 | +-------- |
| 22 | + |
| 23 | +In this guide, you can learn how to store and retrieve large files in |
| 24 | +MongoDB by using **GridFS**. GridFS is a specification that describes how to split files |
| 25 | +into chunks when storing them and reassemble those files when retrieving them. The {+driver-short+}'s |
| 26 | +implementation of GridFS is an abstraction that manages the operations and organization of |
| 27 | +the file storage. |
| 28 | + |
| 29 | +Use GridFS if the size of your files exceeds the BSON document |
| 30 | +size limit of 16 MB. For more detailed information on whether GridFS is |
| 31 | +suitable for your use case, see :manual:`GridFS </core/gridfs>` in the |
| 32 | +{+mdb-server+} manual. |
| 33 | + |
| 34 | +The following sections describe GridFS operations and how to |
| 35 | +perform them. |
| 36 | + |
| 37 | +How GridFS Works |
| 38 | +---------------- |
| 39 | + |
| 40 | +GridFS organizes files in a **bucket**, a group of MongoDB collections |
| 41 | +that contain the chunks of files and information describing them. The |
| 42 | +bucket contains the following collections, named using the convention |
| 43 | +defined in the GridFS specification: |
| 44 | + |
| 45 | +- The ``chunks`` collection stores the binary file chunks. |
| 46 | +- The ``files`` collection stores the file metadata. |
| 47 | + |
| 48 | +When you create a new GridFS bucket, the driver creates the ``fs.chunks`` and ``fs.files`` |
| 49 | +collections, unless you specify a different name in the ``Grid::FSBucket.new`` method options. The |
| 50 | +driver also creates an index on each collection to ensure efficient retrieval of the files and related |
| 51 | +metadata. The driver creates the GridFS bucket, if it doesn't exist, only when the first write |
| 52 | +operation is performed. The driver creates the indexes only if they don't already exist and the |
| 53 | +bucket is empty. For more information about |
| 54 | +GridFS indexes, see :manual:`GridFS Indexes </core/gridfs/#gridfs-indexes>` |
| 55 | +in the {+mdb-server+} manual. |
| 56 | + |
| 57 | +When storing files with GridFS, the driver splits the files into smaller |
| 58 | +chunks, each represented by a separate document in the ``chunks`` collection. |
| 59 | +It also creates a document in the ``files`` collection that contains |
| 60 | +a file ID, file name, and other file metadata. You can upload the file from |
| 61 | +memory or from a stream. The following diagram shows how GridFS splits |
| 62 | +the files when they're uploaded to a bucket. |
| 63 | + |
| 64 | +.. figure:: /includes/figures/GridFS-upload.png |
| 65 | + :alt: A diagram that shows how GridFS uploads a file to a bucket |
| 66 | + |
| 67 | +When retrieving files, GridFS fetches the metadata from the ``files`` |
| 68 | +collection in the specified bucket and uses the information to reconstruct |
| 69 | +the file from documents in the ``chunks`` collection. You can read the file |
| 70 | +into memory or output it to a stream. |
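
The splitting and reassembly described above can be illustrated with a minimal, driver-independent sketch. This is a conceptual model only, not the {+driver-short+}'s implementation: the ``split_into_chunks`` and ``reassemble`` helpers are hypothetical names, and the hashes only loosely mirror the ``files_id``, ``n``, and ``data`` fields of documents in the ``chunks`` collection. The chunk size assumes the GridFS default of 255 KiB.

```ruby
require "stringio"

# Conceptual sketch only; the real driver stores chunks in MongoDB.
# 255 KiB is the default GridFS chunk size (an assumption for this sketch).
CHUNK_SIZE = 255 * 1024

# Hypothetical helper: read the stream in fixed-size pieces and number them.
def split_into_chunks(io, files_id, chunk_size = CHUNK_SIZE)
  chunks = []
  n = 0
  # IO#read(length) returns nil at end of stream, which ends the loop.
  while (data = io.read(chunk_size))
    chunks << { files_id: files_id, n: n, data: data }
    n += 1
  end
  chunks
end

# Hypothetical helper: pick one file's chunks, order them by n, and join the data.
def reassemble(chunks, files_id)
  chunks
    .select { |chunk| chunk[:files_id] == files_id }
    .sort_by { |chunk| chunk[:n] }
    .map { |chunk| chunk[:data] }
    .join
end

original = "x" * 600_000
chunks = split_into_chunks(StringIO.new(original), 1)

puts chunks.length                              # number of chunks created
puts reassemble(chunks.shuffle, 1) == original  # order is restored from n
```

Because each chunk carries its sequence number ``n``, the chunks can be stored and fetched in any order and still be reassembled into the original file.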
| 71 | + |
| 72 | +Create a GridFS Bucket |
| 73 | +---------------------- |
| 74 | + |
| 75 | +To store or retrieve files from GridFS, create a GridFS bucket by calling the |
| 76 | +``FSBucket.new`` method and passing in a ``Mongo::Database`` instance. |
| 77 | +You can use the ``FSBucket`` instance to |
| 78 | +perform read and write operations on the files in your bucket. |
| 79 | + |
| 80 | +.. literalinclude:: /includes/write/gridfs.rb |
| 81 | + :language: ruby |
| 82 | + :dedent: |
| 83 | + :start-after: start-create-bucket |
| 84 | + :end-before: end-create-bucket |
| 85 | + |
| 86 | +To create or reference a bucket with a name other than the default name |
| 87 | +``fs``, pass the bucket name as an optional parameter to the ``FSBucket.new`` |
| 88 | +constructor, as shown in the following example: |
| 89 | + |
| 90 | +.. literalinclude:: /includes/write/gridfs.rb |
| 91 | + :language: ruby |
| 92 | + :dedent: |
| 93 | + :start-after: start-create-custom-bucket |
| 94 | + :end-before: end-create-custom-bucket |
| 95 | + |
| 96 | +Upload Files |
| 97 | +------------ |
| 98 | + |
| 99 | +The ``upload_from_stream`` method reads the contents of an |
| 100 | +upload stream and saves them to the ``FSBucket`` instance. |
| 101 | + |
| 102 | +You can pass a ``Hash`` as an optional parameter to configure the chunk size or include |
| 103 | +additional metadata. |
| 104 | + |
| 105 | +The following example uploads a file to the bucket and specifies metadata for the |
| 106 | +uploaded file: |
| 107 | + |
| 108 | +.. literalinclude:: /includes/write/gridfs.rb |
| 109 | + :language: ruby |
| 110 | + :dedent: |
| 111 | + :start-after: start-upload-files |
| 112 | + :end-before: end-upload-files |
| 113 | + |
| 114 | +Retrieve File Information |
| 115 | +------------------------- |
| 116 | + |
| 117 | +In this section, you can learn how to retrieve file metadata stored in the |
| 118 | +``files`` collection of the GridFS bucket. The metadata contains information |
| 119 | +about the file it refers to, including: |
| 120 | + |
| 121 | +- The ``_id`` of the file |
| 122 | +- The name of the file |
| 123 | +- The size of the file |
| 124 | +- The upload date and time |
| 125 | +- A ``metadata`` document in which you can store any other information |
| 126 | + |
| 127 | +To learn more about fields you can retrieve from the ``files`` collection, see the |
| 128 | +:manual:`GridFS Files Collection </core/gridfs/#the-files-collection>` documentation in the |
| 129 | +{+mdb-server+} manual. |
| 130 | + |
| 131 | +To retrieve file information from a GridFS bucket, call the ``find`` method on the ``FSBucket`` |
| 132 | +instance. The following code example retrieves and prints the metadata for all files in |
| 133 | +a GridFS bucket: |
| 134 | + |
| 135 | +.. literalinclude:: /includes/write/gridfs.rb |
| 136 | + :language: ruby |
| 137 | + :dedent: |
| 138 | + :start-after: start-retrieve-file-info |
| 139 | + :end-before: end-retrieve-file-info |
| 140 | + |
| 141 | +To learn more about querying MongoDB, see :ref:`<ruby-retrieve>`. |
| 142 | + |
| 143 | +Download Files |
| 144 | +-------------- |
| 145 | + |
| 146 | +The ``download_to_stream`` method downloads the contents of a file. |
| 147 | + |
| 148 | +To download a file by its file ``_id``, pass the ``_id`` to the method. The ``download_to_stream`` |
| 149 | +method writes the contents of the file to the provided object. |
| 150 | +The following example downloads a file by its file ``_id``: |
| 151 | + |
| 152 | +.. literalinclude:: /includes/write/gridfs.rb |
| 153 | + :language: ruby |
| 154 | + :dedent: |
| 155 | + :start-after: start-download-files-id |
| 156 | + :end-before: end-download-files-id |
| 157 | + |
| 158 | +If you know a file's name but not its ``_id``, you can use the ``download_to_stream_by_name`` |
| 159 | +method. The following example downloads a file named ``mongodb-tutorial``: |
| 160 | + |
| 161 | +.. literalinclude:: /includes/write/gridfs.rb |
| 162 | + :language: ruby |
| 163 | + :dedent: |
| 164 | + :start-after: start-download-files-name |
| 165 | + :end-before: end-download-files-name |
| 166 | + |
| 167 | +.. note:: |
| 168 | + |
| 169 | + If there are multiple documents with the same ``filename`` value, |
| 170 | + GridFS fetches the most recent file with the given name (as |
| 171 | + determined by the ``uploadDate`` field). |
| 172 | + |
| 173 | +Delete Files |
| 174 | +------------ |
| 175 | + |
| 176 | +Use the ``delete`` method to remove a file's ``files`` collection document and associated |
| 177 | +chunks from your bucket. You must specify the file by its ``_id`` field rather than its |
| 178 | +file name. |
| 179 | + |
| 180 | +The following example deletes a file by its ``_id``: |
| 181 | + |
| 182 | +.. literalinclude:: /includes/write/gridfs.rb |
| 183 | + :language: ruby |
| 184 | + :dedent: |
| 185 | + :start-after: start-delete-files |
| 186 | + :end-before: end-delete-files |
| 187 | + |
| 188 | +.. note:: |
| 189 | + |
| 190 | + The ``delete`` method supports deleting only one file at a time. To |
| 191 | + delete multiple files, retrieve the files from the bucket, extract |
| 192 | + the ``_id`` field from the files you want to delete, and pass each value |
| 193 | + in separate calls to the ``delete`` method. |
| 194 | + |
| 195 | +API Documentation |
| 196 | +----------------- |
| 197 | + |
| 198 | +To learn more about using GridFS to store and retrieve large files, |
| 199 | +see the following API documentation: |
| 200 | + |
| 201 | +- `Mongo::Grid::FSBucket <{+api-root+}/Mongo/Grid/FSBucket.html>`__ |