@mxnet-label-bot[pr-awaiting-review]
Looks good, just one clarification.
972457 633 n07723039_1627.JPEG
7534 11 n01630670_4486.JPEG
1191261 249 n12407079_5106.JPEG
95099 464.000000 n04467665_17283.JPEG
Is the decimal point really required? This is a bit weird.
If the data.lst file is generated by im2rec.py rather than written manually, the labels will have that decimal point. I think showing it would be less confusing for users.
The reason it uses floating point is that the label value could come from a regression task, e.g. 68.6 kg for a human body weight.
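To make the decimal-point behavior concrete, here is a minimal sketch of how a list-file line with a floating-point label could be written and read back. The helper names `format_lst_line` and `parse_lst_line` are hypothetical and are not part of im2rec.py; the tab-separated `index, label(s), path` layout and the six-decimal label are assumptions based on the sample lines quoted above.

```python
def format_lst_line(index, label, path):
    """Format one image-list line: index, label as a float, relative path."""
    # '%f' renders 464 as '464.000000', which matches the generated sample line.
    return "%d\t%f\t%s\n" % (index, float(label), path)


def parse_lst_line(line):
    """Split a list line into (index, labels, path); labels are parsed as floats."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: every column between the index and the path is a label.
    labels = [float(x) for x in fields[1:-1]]
    return int(fields[0]), labels, fields[-1]
```

Under this sketch, a hand-written `11` and a generated `11.000000` parse to the same value, so the two styles are interchangeable; the float form simply leaves room for regression labels such as `68.6`.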
LGTM.
However, we need to revisit why we need two versions and, if both are required, why there is a discrepancy in functionality.
I realize this was already merged, but I have suggestions nonetheless.
@@ -6,35 +6,39 @@ RecordIO implements a file format for a sequence of records. We recommend storin
* Packing data together allows continuous reading on the disk.
* RecordIO has a simple way to partition, simplifying distributed setting. We provide an example later.

We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how.
We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how. Note that there is python version of [im2rec tool](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) and [example](https://mxnet.incubator.apache.org/tutorials/basic/data.html) using real-world data.
We provide two tools for creating a RecordIO dataset.
- im2rec.cc - implements the tool using the C++ API.
- im2rec.py - implements the tool using the Python API.
Both provide the same output: a RecordIO dataset.
(Then take this mention and add it later for "Next Steps". I don't think you want them leaving this FAQ/tutorial quite yet.)
You may want to also review the example using real-world data with im2rec.py.
Download the data. You don't need to resize the images manually. You can use ```im2rec``` to resize them automatically. For details, see the "Extension: Using Multiple Labels for a Single Image," later in this topic.

### Step 1. Make an Image List File

* Note that the im2rec.py provide a param `--list` to generate the list for you but im2rec.cc doesn't support it.
provides
you, but
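As a rough illustration of what generating the image list involves, the sketch below walks an image root and emits one `(index, label, relative path)` entry per image. This is not the im2rec.py implementation; the function name `make_image_list`, the extension set, and the one-label-per-subdirectory convention are assumptions made for illustration.

```python
import os


def make_image_list(root, exts=(".jpg", ".jpeg", ".png")):
    """Return (index, label, relative_path) tuples for images under root.

    Assumption: each immediate subdirectory of root is one class, and
    classes are numbered in sorted order starting from 0.
    """
    entries = []
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    for label, cls in enumerate(classes):
        for fname in sorted(os.listdir(os.path.join(root, cls))):
            # Skip anything that does not look like an image file.
            if fname.lower().endswith(exts):
                rel_path = os.path.join(cls, fname)
                entries.append((len(entries), float(label), rel_path))
    return entries
```

Each tuple could then be written out as one line of the list file, which is the step that has to be done by hand when only im2rec.cc is available.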
@@ -315,6 +315,8 @@ print(mx.recordio.unpack_img(s))
You can also convert raw images into *RecordIO* format using the ``im2rec.py`` utility script that is provided in the MXNet [src/tools](https://github.com/dmlc/mxnet/tree/master/tools) folder.
Update the repo link
@@ -315,6 +315,8 @@ print(mx.recordio.unpack_img(s))
You can also convert raw images into *RecordIO* format using the ``im2rec.py`` utility script that is provided in the MXNet [src/tools](https://github.com/dmlc/mxnet/tree/master/tools) folder.
An example of how to use the script for converting to *RecordIO* format is shown in the `Image IO` section below.

* Note that there is a C++ version of [im2rec](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc), please refer to [here](https://mxnet.incubator.apache.org/faq/recordio.html) for more information.
Please don't link only "here". Provide the full description of what the link is going to.
Note that there is a C++ API implementation of im2rec. Refer to the RecordIO FAQ for more information.
Description
Currently, we have two im2rec tools: one in Python, the other in C++. They differ slightly in functionality. This PR also helps resolve #11884.
Comments
@aaronmarkham
Please let me know how I can make it less confusing.