- class xarray_mongodb.XarrayMongoDB(database, collection='xarray', *, chunk_size_bytes=261120, embed_threshold_bytes=261120, ureg=None)
Synchronous driver for MongoDB to read/write xarray objects
collection (str) – prefix of the collections to store the xarray data. Two collections will actually be created, <collection>.meta and <collection>.chunks.
chunk_size_bytes (int) – Size of the payload in a document in the chunks collection. Not to be confused with dask chunks. dask chunks that are larger than chunk_size_bytes will be transparently split across multiple MongoDB documents.
embed_threshold_bytes (int) –
Cumulative size of variable buffers that will be embedded into the metadata documents in <collection>.meta. Buffers that exceed the threshold (starting from the largest) will be stored into the chunks documents in <collection>.chunks.
Embedded variables ignore the load parameter of get()
dask variables are never embedded, regardless of size
Set embed_threshold_bytes=0 to force all buffers to be saved to <collection>.chunks, with the only exception of size zero non-dask variables
size zero non-dask variables are always embedded
ureg (pint.registry.UnitRegistry) – pint registry to allow putting and getting arrays with units. If omitted, it defaults to the global registry defined with pint.set_application_registry(). If the global registry was never set, it defaults to a standard registry.
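The interplay between chunk_size_bytes and embed_threshold_bytes can be sketched in plain Python. This is only an illustration of one reading of the parameter descriptions above, not the library's actual implementation: the helper names split_payload and select_embedded are hypothetical, and the "evict the largest buffers until the rest fits" rule is an assumption about what "starting from the largest" means.

```python
DEFAULT_BYTES = 261120  # documented default of both size parameters


def split_payload(buf: bytes, chunk_size_bytes: int = DEFAULT_BYTES) -> list[bytes]:
    """Cut one buffer into payloads of at most chunk_size_bytes each.

    Hypothetical helper: mimics how a large (dask) chunk would be spread
    across several documents in <collection>.chunks.
    """
    return [buf[i:i + chunk_size_bytes] for i in range(0, len(buf), chunk_size_bytes)]


def select_embedded(
    sizes: dict[str, int],
    dask_backed: set[str],
    embed_threshold_bytes: int = DEFAULT_BYTES,
) -> tuple[set[str], set[str]]:
    """Return (embedded, chunked) variable names.

    Assumed reading of the docs: dask variables always go to chunks; the
    remaining buffers are evicted largest-first until their cumulative
    size fits within the threshold.
    """
    chunked = set(dask_backed)
    candidates = sorted(
        (name for name in sizes if name not in dask_backed),
        key=lambda name: sizes[name],
        reverse=True,
    )
    embedded = set(candidates)
    total = sum(sizes[name] for name in candidates)
    for name in candidates:
        if total <= embed_threshold_bytes:
            break  # everything still in `embedded` fits in the meta document
        embedded.discard(name)
        chunked.add(name)
        total -= sizes[name]
    return embedded, chunked
```

Under this reading, a threshold of 0 evicts every non-empty buffer, while size zero non-dask variables stay embedded because they never push the cumulative size above the threshold.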
- get(_id, load=None)
Read an xarray object back from MongoDB
load – Determines which variables to load immediately and which instead delay loading with dask. Must be one of:
- None (default)
Match whatever was stored with put(), including chunk sizes
- True
Immediately load all variables into memory. dask chunk information, if any, will be discarded.
- False
Only load indices in memory; delay the loading of everything else with dask.
- collection of str
variable names that must be immediately loaded into memory. Regardless of this, indices are always loaded. Non-existing variables are ignored. When retrieving a DataArray, you can target the data with the special hardcoded variable name
Embedded variables (see embed_threshold_bytes) are always loaded regardless of this flag.
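The rules for the load parameter can be summarised as a small decision function. This is a sketch of the documented behaviour as read from the description above, not code taken from the library; the function name eagerly_loaded and its arguments are hypothetical.

```python
def eagerly_loaded(load, variables, indices, dask_backed, embedded):
    """Return the names that would be loaded into memory right away.

    load        -- None, True, False, or a collection of variable names
    variables   -- all variable names stored for the object
    indices     -- names of index variables (always loaded)
    dask_backed -- names that were backed by dask when put() was called
    embedded    -- names embedded in the meta document (always loaded)
    """
    variables = set(variables)
    always = (set(indices) | set(embedded)) & variables
    if load is None:
        # Match what was stored: only dask-backed variables stay lazy
        wanted = variables - set(dask_backed)
    elif load is True:
        wanted = variables
    elif load is False:
        wanted = set()
    else:
        # Collection of names; non-existing variables are ignored
        wanted = set(load) & variables
    return wanted | always
```

For example, with an index x, an embedded variable flag, a non-dask variable big and a dask variable temp, load=False yields only x and flag, whereas load=None additionally yields big.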
Raises DocumentNotFoundError – _id not found in the ‘meta’ collection, or one or more chunks are missing in the ‘chunks’ collection. This error typically happens when:
- documents were deleted from the database
- the Delayed returned by put() was never computed
- one or more chunks of the dask variables failed to compute at any point during the graph resolution
If chunks loading is delayed with dask (see ‘load’ parameter), this exception may be raised at compute() time.
The load parameter is valued None, False, or does not list any variables that were backed by dask during put()
The dask graph (if any) underlying the returned xarray object contains full access credentials to the MongoDB server. Take care when pickling it and storing it on disk, or when sending it over the network, e.g. through dask distributed.
Write an xarray object to MongoDB. Variables that are backed by dask are not computed; instead their insertion in the database is delayed. All other variables are immediately inserted.
This method automatically creates an index on the ‘chunks’ collection if there isn’t one yet.
Returns:
- MongoDB _id of the inserted object
- dask delayed object, or None if there are no variables using dask. It must be explicitly computed in order to fully store the Dataset/DataArray on the database.
The dask future contains full access credentials to the MongoDB server. Take care when pickling it and storing it on disk, or when sending it over the network, e.g. through dask distributed.
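A typical write-then-read round trip might look like the sketch below. It assumes that put() returns the two values listed above as a pair (_id, delayed) and that xdb is an already constructed driver instance; the helper name store_and_reload is hypothetical.

```python
def store_and_reload(xdb, obj, load=None):
    """Store obj through the driver, finish any delayed writes, read it back.

    xdb is assumed to expose put(obj) -> (_id, delayed_or_None) and
    get(_id, load=...), as described in this reference.
    """
    _id, delayed = xdb.put(obj)
    if delayed is not None:
        # dask-backed variables are only written once the delayed
        # object is computed; skipping this leaves chunks missing.
        delayed.compute()
    return _id, xdb.get(_id, load=load)
```

Computing the delayed object before calling get() avoids the DocumentNotFoundError case in which the Delayed returned by put() was never computed.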
- class xarray_mongodb.XarrayMongoDBAsyncIO(database, collection='xarray', *, chunk_size_bytes=261120, embed_threshold_bytes=261120, ureg=None)
asyncio driver for MongoDB to read/write xarray objects
- exception xarray_mongodb.DocumentNotFoundError
One or more documents not found in MongoDB
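One way to handle this exception is sketched below. Because a delayed load may only raise at compute() time, the sketch passes load=True so that missing documents surface inside the get() call itself. The helper name get_or_default is hypothetical, and the fallback exception class exists only so that the snippet runs where xarray_mongodb is not installed.

```python
try:
    from xarray_mongodb import DocumentNotFoundError
except ImportError:
    # Stand-in so the sketch runs without xarray_mongodb installed
    class DocumentNotFoundError(Exception):
        pass


def get_or_default(xdb, _id, default=None):
    """Return the stored object, or default if any document is missing.

    load=True forces everything into memory immediately, so missing
    chunks are reported here rather than later at compute() time.
    """
    try:
        return xdb.get(_id, load=True)
    except DocumentNotFoundError:
        return default
```

The trade-off is that load=True discards dask chunk information and loads all variables into memory, which may not suit very large objects.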