[Bug] MemoryError when trying create anndata object from query #3049

Open
loguille opened this issue Sep 24, 2024 · 2 comments
@loguille

Hello,
Thanks for the great work. I would like to create an AnnData object from a query, keeping all the information stored in my TileDB object, but this leads to an error.

Here is the code that I wrote:

import tiledbsoma

tiledb_object = read_tile_db_object("/home/loic/UCL_work/Tile-DB/tiledb_soma/lung_atlas_reprocessed")
# Select fibroblasts from the Parenchyma and Ctrl patient groups
with tiledb_object.axis_query(
    measurement_name="RNA",
    obs_query=tiledbsoma.AxisQuery(
        value_filter="cell_type in ['Fibroblast'] and patientGroup in ['Parenchyma','Ctrl']"
    ),
) as query:
    # Export the selection, including extra X layers and the obsm/varm arrays
    adata = query.to_anndata(
        X_name="counts",
        X_layers=['counts', 'normalized', 'lognormalized'],
        obsm_layers=['X_pca', 'X_umap'],
        varm_layers=['PCs'],
    )

And the error:

MemoryError                               Traceback (most recent call last)
Cell In[6], [line 12]
      [5] query = tiledb_object.axis_query(measurement_name = "RNA",obs_query = AxisQuery(value_filter= "cell_type in ['Fibroblast'] and patientGroup in ['Parenchyma','Ctrl']"))
      [6] with tiledb_object.axis_query(
      [7]             measurement_name = "RNA",
      [8]             obs_query=tiledbsoma.AxisQuery(
      [9]                 value_filter=("cell_type in ['Fibroblast'] and patientGroup in ['Parenchyma','Ctrl']")
     [10]             )
     [11]         ) as query:
---> [12]             adata = query.to_anndata(
     [13]                 X_name="counts",
     [14]                 X_layers=['counts','normalized','lognormalized'],
     [15]                 obsm_layers=['X_pca','X_umap'],
     [16]                 varm_layers=['PCs']
     [17]             )

File ~/miniconda3/envs/tiledb_env/lib/python3.12/site-packages/somacore/query/query.py:299, in ExperimentAxisQuery.to_anndata(self, X_name, column_names, X_layers, obsm_layers, obsp_layers, varm_layers, varp_layers)
    268 def to_anndata(
    269     self,
    270     X_name: str,
   (...)
...
--> 554 z = np.zeros(n_row * n_col, dtype=np.float32)
    555 np.put(z, idx * n_col + table["soma_dim_1"], table["soma_data"])
    556 return z.reshape(n_row, n_col)

MemoryError: Unable to allocate 118. TiB for an array with shape (32530082269608,) and data type float32
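
For scale, the figure in the error message can be reproduced with simple arithmetic: line 554 of the traceback allocates one float32 per cell of a flat `n_row * n_col` array. The sketch below only recomputes that size. The helper names and the `hypothetical_nnz` value are illustrative assumptions, not part of TileDB-SOMA, and the sparse estimate is a rough comparison rather than a description of how the library stores data.

```python
# Number of float32 elements reported in the MemoryError above.
N_ELEMENTS = 32_530_082_269_608
BYTES_PER_FLOAT32 = 4


def dense_size_tib(n_elements: int, itemsize: int = BYTES_PER_FLOAT32) -> float:
    """Size in TiB of a dense array holding n_elements items of itemsize bytes."""
    return n_elements * itemsize / 2**40


def coo_size_gib(nnz: int, value_itemsize: int = 4, index_itemsize: int = 8) -> float:
    """Rough size in GiB of a coordinate-style sparse layout:
    one value plus one row index and one column index per stored entry."""
    return nnz * (value_itemsize + 2 * index_itemsize) / 2**30


# Dense allocation attempted at line 554 of the traceback.
print(f"dense: {dense_size_tib(N_ELEMENTS):.1f} TiB")

# Hypothetical count of non-zero entries, chosen only for illustration.
hypothetical_nnz = 1_000_000_000
print(f"sparse (assumed nnz): {coo_size_gib(hypothetical_nnz):.1f} GiB")
```

The dense size depends on the product of the two dimensions, while a sparse layout grows with the number of stored entries, which is why the same data can be far smaller when it is not densified.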

Versions:
TileDB-SOMA: 1.12.0
Python: 3.12.2
OS: Ubuntu

Thank you for your time.

Loïc Guille

@johnkerl johnkerl self-assigned this Sep 24, 2024
@johnkerl
Member

@loguille this is fixed in TileDB-SOMA 1.14 -- can you give this a try please?
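
Before retrying, it may help to confirm which version is installed in the environment. The following is a minimal sketch using only the standard library. It assumes the distribution is installed under the name `tiledbsoma`, and `parse_version`, `is_at_least`, and `installed_tiledbsoma_version` are hypothetical helpers written for this example.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.14.0' into (1, 14, 0), dropping any non-numeric suffix such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """True if the installed version is the minimum version or newer."""
    return parse_version(installed) >= parse_version(minimum)


def installed_tiledbsoma_version():
    """Return the installed version string, or None if the package is absent."""
    try:
        # Assumption: the distribution name is 'tiledbsoma'.
        return metadata.version("tiledbsoma")
    except metadata.PackageNotFoundError:
        return None


version = installed_tiledbsoma_version()
if version is None:
    print("tiledbsoma is not installed in this environment")
elif is_at_least(version, "1.14"):
    print(f"tiledbsoma {version} is 1.14 or newer")
else:
    print(f"tiledbsoma {version} is older than 1.14; an upgrade is needed")
```

With the versions reported in this issue, `is_at_least("1.12.0", "1.14")` is false, so an upgrade would be needed before trying the query again.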

@loguille
Author

@johnkerl sorry for the late reply. Yes, it works.
Many thanks for your time.

Do you know whether it is possible to keep the obsp and uns arrays in the AnnData object?
