ipfs dag stat improvements #3955

Open
matthewrobertbell opened this issue Jun 4, 2017 · 17 comments
Labels
help wanted Seeking public contribution on this issue need/analysis Needs further analysis before proceeding

Comments

@matthewrobertbell

Version information:

go-ipfs version: 0.4.9-
Repo version: 5
System version: amd64/darwin
Golang version: go1.8.1

Type:

Enhancement

Severity:

Low

Description:

ipfs object stat doesn't seem to work on dag objects:

➜  ~ echo '{"hello": "world"}' | ipfs dag put
zdpuAtX7ZibcWdSKQwiDCkPjWwRvtcKCPku9H7LhgA4qJW4Wk
➜  ~ ipfs object stat zdpuAtX7ZibcWdSKQwiDCkPjWwRvtcKCPku9H7LhgA4qJW4Wk
NumLinks: 0
BlockSize: 0
LinksSize: 0
DataSize: 0
CumulativeSize: 0

I was interested in getting the individual block size and the total dag size. Is a dag stat command planned?

Thanks :)

@whyrusleeping
Member

Ooooh, yeah. ipfs dag stat should probably exist. We could at a minimum include the size, number of links, maybe number of paths in the object, and print out some type information too (this is a cbor node, or this is a bitcoin block, etc)

@matthewrobertbell
Author

Would it be possible to also include what type the object is (hash/list/integer/string etc)?

@whyrusleeping
Member

@mattseh hrm... you mean the cbor type? That might be possible, but it would be awkward to apply that to other 'dag' types

@matthewrobertbell
Author

I mean knowing the difference between these:

 ~ echo '["a", "b"]' | ipfs dag put
zdpuAnPcrJDq4zxgHbrT1QZxjastWCs3U8bewnfPboBVwxEE8

~ echo '{"a": 1, "b": 2}' | ipfs dag put
zdpuAnBAFDeLpjMFUN8C5DbHkAcLtNw9FvrPbBHgwbX1hfj1C

So list vs hash. I don't have a solid use case for it, but in some cases it might be more elegant than having to pull the object and then call type() (in Python) on the returned JSON.
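For comparison, here is a minimal sketch of the client-side check described above, written in Go rather than Python. It assumes the JSON has already been fetched (for example with `ipfs dag get`) and only inspects the outermost value. The function name `topLevelKind` and the kind labels are made up for illustration and are not part of any ipfs API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// topLevelKind reports the JSON kind of the outermost value in raw.
// Hypothetical helper: it only looks at JSON, not at the CBOR encoding.
func topLevelKind(raw []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return "", err
	}
	switch v.(type) {
	case []interface{}:
		return "list", nil
	case map[string]interface{}:
		return "map", nil
	case string:
		return "string", nil
	case float64:
		return "number", nil
	case bool:
		return "bool", nil
	default:
		// encoding/json decodes JSON null to a nil interface value.
		return "null", nil
	}
}

func main() {
	// The two documents stored with `ipfs dag put` in the comment above.
	for _, doc := range []string{`["a", "b"]`, `{"a": 1, "b": 2}`} {
		kind, err := topLevelKind([]byte(doc))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", doc, kind)
	}
}
```

This still requires fetching the whole object first, which is exactly the cost the request above hopes a stat command could avoid.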

@kevina
Contributor

kevina commented Jun 11, 2017

@whyrusleeping this sounds like a fairly easy thing for me to do. Let me know how deep you want to go (for example, should I implement @mattseh's suggestion?). Should we also have a mode that prints as much as possible based on the multihash alone, so the node doesn't have to be downloaded?

@whyrusleeping
Member

@mattseh hrm... I'm not so sure on that. Let's think through it a bit more.

@kevina Yeah, I would add most of the things above, but I'm not sure on the JSON type information bit.

@kevina kevina self-assigned this Jun 12, 2017
@kevina
Contributor

kevina commented Jun 12, 2017

@whyrusleeping, so what exactly will be the difference between object stat and dag stat? The description for object stat says "Get stats for the DAG node named by <key>" and in fact the description of ipfs dag says "This subcommand is currently an experimental feature, but it is intended to deprecate and replace the existing 'ipfs object' command moving forward".

It sounds like object stat should be enhanced to give more information. Or am I missing something?

@whyrusleeping
Member

@kevina we don't want to add new functionality to the object subcommands. We are working on deprecating them in favor of the dag subcommands. The primary reason is that making the object commands support all the dag stuff would require changing the APIs. So for now, making ipfs dag stat do basically the same stuff as ipfs object stat is fine.

@kevina
Contributor

kevina commented Jun 12, 2017

@whyrusleeping my bad, I misinterpreted the description of ipfs dag.

@whyrusleeping whyrusleeping added the help wanted Seeking public contribution on this issue label Sep 2, 2017
@whyrusleeping
Member

@kevina still interested in working on this one?

@kevina
Contributor

kevina commented Sep 2, 2017

@whyrusleeping Sure, do you want me to make it a priority?

@kevina
Contributor

kevina commented Sep 2, 2017

So the reason "object stat" is not returning anything meaningful for a "cbor" object is that it calls the Stat method of the Node interface which is unimplemented in the cbornode package. However, there is this note:

type Node interface {
	...
	// TODO: not sure if stat deserves to stay
	Stat() (*NodeStat, error)
	...
}

Do we want the new command dag stat to continue to use the Stat method, or do we want to do something else?
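To make the symptom concrete, here is a rough sketch of how an effectively unimplemented Stat method would produce the all-zero output shown at the top of this issue. Everything in it is a stand-in: `NodeStat` only copies the field names printed by ipfs object stat, the cut-down `Node` interface, `stubNode`, and `blockSizeOf` are hypothetical, and it assumes "unimplemented" means "returns an empty struct without an error".

```go
package main

import "fmt"

// NodeStat copies the field names printed by `ipfs object stat` above.
// It is a local stand-in, not the type from the real library.
type NodeStat struct {
	NumLinks       int
	BlockSize      int
	LinksSize      int
	DataSize       int
	CumulativeSize uint64
}

// Node is a cut-down, hypothetical version of the interface quoted above.
type Node interface {
	RawData() []byte
	Stat() (*NodeStat, error)
}

// stubNode models a node type whose Stat method is effectively unimplemented
// (assumption: it returns an empty NodeStat and a nil error).
type stubNode struct{ raw []byte }

func (n stubNode) RawData() []byte          { return n.raw }
func (n stubNode) Stat() (*NodeStat, error) { return &NodeStat{}, nil }

// blockSizeOf derives the block size from the raw bytes instead of Stat,
// so it works for any node type that can return its encoded form.
func blockSizeOf(n Node) int { return len(n.RawData()) }

func main() {
	// CBOR encoding of {"hello": "world"}: a 1-entry map with two 5-byte strings.
	raw := []byte{0xa1, 0x65, 'h', 'e', 'l', 'l', 'o', 0x65, 'w', 'o', 'r', 'l', 'd'}
	n := stubNode{raw: raw}

	st, err := n.Stat()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("Stat: %+v\n", *st)
	fmt.Println("size from raw data:", blockSizeOf(n))
}
```

The point of `blockSizeOf` is only that size-like figures can be derived from the raw block, which is one possible direction if Stat is dropped from the interface.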

Also (on a possibly unrelated note), there is a new utility, cid-fmt, in the go-cid package, which can be used to get basic information about a CID, such as the fact that the CID is a cbor object. For example:

$ cid-fmt prefix zdpuAnPcrJDq4zxgHbrT1QZxjastWCs3U8bewnfPboBVwxEE8
cidv1-cbor-sha2-256-32
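The prefix information is encoded in the CID string itself, so it can be read without fetching the node. The following is a hand-rolled sketch using only the standard library, not the go-cid API: it assumes a CIDv1 in base58btc multibase form (leading `z`) whose binary layout starts with four unsigned varints (version, codec, multihash code, digest length). `decodeBase58`, `Prefix`, and `parsePrefix` are illustrative names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math/big"
	"strings"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// decodeBase58 decodes a base58btc string into bytes.
func decodeBase58(s string) ([]byte, error) {
	n := new(big.Int)
	radix := big.NewInt(58)
	for _, c := range s {
		i := strings.IndexRune(b58Alphabet, c)
		if i < 0 {
			return nil, fmt.Errorf("invalid base58 character %q", c)
		}
		n.Mul(n, radix)
		n.Add(n, big.NewInt(int64(i)))
	}
	// Each leading '1' encodes one leading zero byte.
	zeros := 0
	for zeros < len(s) && s[zeros] == '1' {
		zeros++
	}
	return append(make([]byte, zeros), n.Bytes()...), nil
}

// Prefix holds the leading fields of a binary CIDv1.
type Prefix struct {
	Version, Codec, HashCode, HashLen uint64
}

// parsePrefix reads the four leading unsigned varints of a binary CIDv1.
func parsePrefix(b []byte) (Prefix, error) {
	var f [4]uint64
	for i := range f {
		v, n := binary.Uvarint(b)
		if n <= 0 {
			return Prefix{}, errors.New("truncated or malformed varint")
		}
		f[i], b = v, b[n:]
	}
	return Prefix{f[0], f[1], f[2], f[3]}, nil
}

func main() {
	cid := "zdpuAnPcrJDq4zxgHbrT1QZxjastWCs3U8bewnfPboBVwxEE8"
	// The leading 'z' is the multibase marker for base58btc.
	raw, err := decodeBase58(strings.TrimPrefix(cid, "z"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	p, err := parsePrefix(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("version=%d codec=0x%x hash=0x%x len=%d\n",
		p.Version, p.Codec, p.HashCode, p.HashLen)
}
```

In the multicodec table, 0x71 is dag-cbor and 0x12 is sha2-256, so a CID matching the cid-fmt output above would be expected to yield version 1, codec 0x71, hash 0x12, and length 32.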

@whyrusleeping
Member

Hrm... we're planning on removing the Stat method from that interface entirely. I think @Stebalien will be best suited to figure out the best way forward here.

Also, on the cid-fmt tool, I wrote a similar thing a while back here: https://github.com/whyrusleeping/elcid
We should probably combine our efforts :)

@Stebalien
Member

Yet another case where I wrote a response and then closed the window...

I agree with @whyrusleeping that we should remove Stat from the interface. For UnixFS, we'll probably want to add a replacement FSNode interface (which can provide this method along with other FSNode specific methods). For the general DAG, I'm not sure what info stat should return that's not already returned by ipfs block stat (although ipfs block stat should probably return the decoded CID).

Things we could consider including.

  1. Known inbound link count (number of nodes we have that link to this node).
  2. Number of outbound links.
  3. Number of children we have.
  4. Pinned status (direct, recursive, parents that keep it pinned etc).
  5. Whether or not we have the full recursive DAG rooted at this node.
  6. Total size of the part of the DAG rooted at this node that we have.

However, these are mostly gc/repo/pin related things, not properties of the node itself.
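As a rough illustration of items (2) and (3), the sketch below counts a node's outbound links and how many of the linked children are present locally. The local blockstore is modelled as a plain map. The `block` type and `linkCounts` are hypothetical and do not correspond to any go-ipfs API.

```go
package main

import "fmt"

// block is a hypothetical in-memory stand-in for a stored DAG node.
type block struct {
	size  int
	links []string // keys of the blocks this node links to
}

// linkCounts returns the number of outbound links of the block stored under
// key and how many of those links point at blocks present in store.
// found is false when the block itself is not in the store.
func linkCounts(store map[string]block, key string) (outbound, have int, found bool) {
	b, present := store[key]
	if !present {
		return 0, 0, false
	}
	for _, l := range b.links {
		if _, ok := store[l]; ok {
			have++
		}
	}
	return len(b.links), have, true
}

func main() {
	// "root" links to three children, but only two are stored locally.
	store := map[string]block{
		"root": {size: 10, links: []string{"a", "b", "c"}},
		"a":    {size: 5},
		"b":    {size: 7},
	}
	outbound, have, found := linkCounts(store, "root")
	fmt.Println(outbound, have, found)
}
```

The outbound count is a property of the node, while the number of children we have depends on the local repo, which is the distinction drawn in the remark above.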

@kevina
Contributor

kevina commented Sep 11, 2017

@Stebalien most of that information is currently not stored anywhere so will need to be computed, which is currently a very expensive operation. It may be useful for a stat command to output this information, but it should not be the default output.

Also, what is the difference between (2) and (3)? By (3), do you mean the number of children we currently have on the node?

@Stebalien
Member

most of that information is currently not stored anywhere so will need to be computed, which is currently a very expensive operation.

I know, I'm just throwing out suggestions for information that might be useful (assuming we cached it somewhere).

Also, what is the difference between (2) and (3)? By (3), do you mean the number of children we currently have on the node?

Yes.


Basically, I couldn't think of any direct IPLD-specific stats that might be useful and aren't already supplied by ipfs block stat (other than extended CID information, but that can be added to the block stat command), so I threw out some ideas for useful data.

Really, we could just make ipfs object stat an alias for ipfs block stat.

@kevina kevina added the need/analysis Needs further analysis before proceeding label Nov 28, 2017
@aschmahmann
Contributor

aschmahmann commented Mar 9, 2021

Note that ipfs dag stat was added in #7553 and supports calculating the full (deduplicated) size of the DAG and (deduplicated) number of blocks in it. Sure it could be expensive to calculate, but we can always add optimizations + caching of the calculations in the future.
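For readers unfamiliar with what "deduplicated" means here, the following is a minimal sketch of the idea, not the implementation from #7553: walk the DAG from the root and count each block once, even when it is reachable through several paths. The in-memory `block` type and `dagStat` are hypothetical.

```go
package main

import "fmt"

// block is a hypothetical in-memory stand-in for a stored DAG node.
type block struct {
	size  int
	links []string // keys of the blocks this node links to
}

// dagStat walks the DAG rooted at root and returns the total size and the
// number of distinct blocks. A block reachable via several paths is counted once.
func dagStat(store map[string]block, root string) (size, blocks int, err error) {
	seen := map[string]bool{}
	stack := []string{root}
	for len(stack) > 0 {
		key := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[key] {
			continue // already counted through another path
		}
		seen[key] = true
		b, ok := store[key]
		if !ok {
			return 0, 0, fmt.Errorf("block %q not found", key)
		}
		size += b.size
		blocks++
		stack = append(stack, b.links...)
	}
	return size, blocks, nil
}

func main() {
	// "c" is shared by "a" and "b" and must be counted only once.
	store := map[string]block{
		"root": {size: 10, links: []string{"a", "b"}},
		"a":    {size: 5, links: []string{"c"}},
		"b":    {size: 7, links: []string{"c"}},
		"c":    {size: 3},
	}
	size, blocks, err := dagStat(store, "root")
	fmt.Println(size, blocks, err)
}
```

The visited set is also where caching could attach: results for an already-walked subgraph could be reused instead of recomputed.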

Not closing this at the moment as many of the suggestions from #3955 (comment) are still valid. They could probably be split into a new issue and then this one closed.

@lidel lidel changed the title from "DAG Stat?" to "ipfs dag stat improvements" on Mar 21, 2024