Retina encoder: biological encoder for vision WIP #691
base: master
eye.py is a biological implementation of retina (encoder). WIP
FYI @ctrl-z-9000-times, if you have any insights, please share. I'll slowly work on this.
Steps:
- make this run locally
- add a unit test
- validate the encoder on an MNIST classification task
- fix dependencies (incl ChannelEncoder)
- refactoring
- should inherit from an Encoder; blocked by "Provide bindings for BaseEncoder for python encoders" #704
- split Sensor (eye), Motor-Control? -> motor-control removed, eye accepts position, rotation, scale.
- visualizations only optional
- Parameters need work
- split ChannelEncoder to a standalone file?
- fix new stuff: better split of parvo/magno cell counts, ...
- fix back the sparsity for 3D (3rd root)
- fix sparsity ratio between P-M cells (3:1) - TODO: I cannot find what the ratio is -> the ratio is approx. 8:1
- replace mode: parvo/magno/both with a p_m_ratio?, or provide sparsityP, sparsityM?
- resolve level of motion control? (x,y,rot,scale) on micro (=saccadic step)/macro (where to look at the scene) -> using micro/saccades for this PR
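One way the p_m_ratio idea from the list above could work is sketched here. It assumes a single budget of active bits that is divided between parvo and magno cells; `split_active_bits` and its arguments are hypothetical names, not part of eye.py.

```python
def split_active_bits(total_active, p_m_ratio=8.0):
    """Split a budget of active bits between parvo and magno cells.

    p_m_ratio is the assumed parvo:magno ratio (8.0 means roughly 8:1).
    Hypothetical helper for illustration only.
    """
    if total_active < 0 or p_m_ratio <= 0:
        raise ValueError("total_active must be >= 0 and p_m_ratio > 0")
    # One share goes to magno, p_m_ratio shares go to parvo.
    magno = round(total_active / (p_m_ratio + 1.0))
    parvo = total_active - magno  # remainder, so the two always sum to the budget
    return parvo, magno


parvo_bits, magno_bits = split_active_bits(900, p_m_ratio=8.0)
print(parvo_bits, magno_bits)  # 800 100
```

A single ratio keeps the two pathways coupled. Separate sparsityP and sparsityM parameters would allow tuning them independently, at the cost of one more parameter.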
Thank you Breznak for getting the ball rolling on this PR. I did some basic cleanup. You should now be able to run this locally (without installing my old research repo). The encoder is now runnable from the command line. It requires a single argument: the image file path or directory.
some imports for Eye/Retina encoder needed fixing for newer versions
make it a const variable
keep it only as member variable
which retinal pathway to simulate
should be 1/3rd, not 3rd-root
lets you disable/enable plotting, useful for headless mode
self.image is used for both string/data functionality
This reverts commit f259ec6.
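The commit notes above go back and forth between "1/3rd" and "3rd-root" for the 3D sparsity. Here is a minimal sketch of the arithmetic behind each option, assuming three channels. The function names are hypothetical, and which combination rule eye.py actually uses is not stated in this thread.

```python
def per_channel_sparsity_and(target):
    """Per-channel sparsity if 3 independent channels are AND-ed together.

    AND-ing independent channels multiplies their sparsities, so each
    channel needs the cube root of the target.
    """
    return target ** (1.0 / 3.0)


def per_channel_sparsity_split(target):
    """Per-channel share if the target is divided evenly across 3 channels."""
    return target / 3.0


target = 0.2
cube_root = per_channel_sparsity_and(target)
one_third = per_channel_sparsity_split(target)
print(round(cube_root, 3), round(one_third, 3))  # 0.585 0.067
```

The two values differ by almost an order of magnitude at this target, so picking the wrong one would noticeably change the output density.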
Please have a look at the progress, main changes are
- removed Sampler code
Open issues:
- how to integrate n saccadic steps into the final image SDR? (logical AND?)
- and related, output SDR seems too dense
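To make the AND question above concrete, the sketch below combines the SDRs from several saccadic steps by intersection and by union and compares the resulting sparsity. SDRs are modelled as plain sets of active bit indices; this is an assumption for illustration, not the htm.core SDR class.

```python
def combine_and(sdrs):
    """Bits active in every saccadic step (set intersection)."""
    return set.intersection(*sdrs)


def combine_or(sdrs):
    """Bits active in at least one saccadic step (set union)."""
    return set.union(*sdrs)


def sparsity(sdr, size):
    """Fraction of the `size` available bits that are active."""
    return len(sdr) / size


SIZE = 20  # toy SDR size
steps = [
    {1, 2, 3, 7, 9},
    {2, 3, 7, 11, 15},
    {0, 2, 3, 7, 18},
]

anded = combine_and(steps)
ored = combine_or(steps)
print(sorted(anded), sparsity(anded, SIZE))  # [2, 3, 7] 0.15
print(len(ored), sparsity(ored, SIZE))       # 9 0.45
```

Intersection can only make the result sparser and union can only make it denser, so an output that is already too dense points towards AND, or towards lowering the per-step sparsity.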
I'm running into an import problem,
Thank you for reviewing! I addressed some of your concerns. I have more cleanups in another PR, but I broke something there, so it's probably a good idea to get this into a shippable shape, resolve it, and then do follow-ups.
TODOs:
logpolar transform vs. retinal log sampling:
TL;DR: can we use Retina's useRetinaLogSampling instead of the manual log-polar transform?
Yes. Assuming the motion is small, the eye's output should still have semantic similarity.
No. The area outside of the ROI is lost, not reduced. The things outside of the ROI are outside of the eye's field of view. The peripheral vision needs to be included inside of the ROI.
OK, there might be some miscommunication of terms on my side, but imagine this case:
This should illustrate that even for ROI (as implemented, the area of image that gets processed, other gets lost), if:
I think to summarize,
If this is true, we need to be able to specify the ratio of fovea/peripheral better.
The ROI is the entire field of view.
This is one of the many tuning parameters, IIRC.
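For reference, the log-polar idea discussed in this thread places sample points on rings whose radii grow geometrically, so sampling is dense near the centre (fovea) and coarse towards the edge of the ROI (periphery). The sketch below is a generic illustration. It is neither the code in eye.py nor OpenCV's retinal log sampling, and all names and parameters in it are hypothetical.

```python
import math


def log_polar_radii(r_min, r_max, n_rings):
    """Ring radii spaced geometrically between r_min and r_max (inclusive)."""
    if not (0 < r_min < r_max) or n_rings < 2:
        raise ValueError("need 0 < r_min < r_max and n_rings >= 2")
    growth = (r_max / r_min) ** (1.0 / (n_rings - 1))
    return [r_min * growth ** k for k in range(n_rings)]


def log_polar_samples(center, r_min, r_max, n_rings, n_angles):
    """(x, y) sample positions: n_angles points on each log-spaced ring."""
    cx, cy = center
    points = []
    for r in log_polar_radii(r_min, r_max, n_rings):
        for a in range(n_angles):
            theta = 2.0 * math.pi * a / n_angles
            points.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return points


radii = log_polar_radii(1.0, 64.0, 7)
print([round(r, 1) for r in radii])  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```

In this toy setup, changing r_min relative to r_max shifts how many rings fall near the centre, which is one way to express the fovea/peripheral ratio mentioned above.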
Please test this out, esp. look at:
- scaling, how to make it work
- if custom log-polar can be replaced by Retina's
@@ -115,7 +120,10 @@ def main(parameters=default_parameters, argv=None, verbose=True):
    # Training Loop
    for i in range(len(train_images)):
        img, lbl = random.choice(training_data)
        encode(img, enc)
        encoder.new_image(img)
        (enc, _) = encoder.compute()
WIP on MNIST, not yet tuned. I should revert these changes for now.
self.retina_diameter = int(self.resolution_factor * output_diameter)
# Argument fovea_scale ... proportion of the image (ROI) which will be covered (seen) by
# high-res fovea (parvo pathway)
self.fovea_scale = 0.177
I previously misinterpreted this and self.scale. Rename this to fovea_diameter to be clearer?
inputSize = (self.retina_diameter, self.retina_diameter),
colorMode = color,
colorSamplingMethod = cv2.bioinspired.RETINA_COLOR_BAYER,
useRetinaLogSampling = True,)
@ctrl-z-9000-times please compare with this on/off. Can it replace our manual log-polar transformation?
roi.resize( (self.retina_diameter, self.retina_diameter, 3))

# Mask out areas the eye can't see by drawing a circle border.
# this represents the "shape" of the sensor/eye (comment out to leave rectangular)
OK to crop to a circular region here (and not only in the visualization)? This makes the encoder see only the inner circle of the ROI.
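As a rough illustration of what cropping to the inner circle does to the input, the sketch below builds a boolean mask for the circle inscribed in a square ROI and blanks everything outside it. It uses plain nested lists rather than the numpy/cv2 arrays used in eye.py, and `circular_mask` and `apply_mask` are hypothetical helpers.

```python
def circular_mask(diameter):
    """Boolean grid: True where a pixel centre lies inside the inscribed circle."""
    radius = diameter / 2.0
    centre = (diameter - 1) / 2.0  # centre of the pixel grid
    return [
        [(x - centre) ** 2 + (y - centre) ** 2 <= radius ** 2
         for x in range(diameter)]
        for y in range(diameter)
    ]


def apply_mask(image, mask, fill=0):
    """Replace pixels outside the mask with `fill`; the input is not modified."""
    return [
        [px if keep else fill for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


roi = [[255] * 8 for _ in range(8)]      # toy 8x8 all-white ROI
masked = apply_mask(roi, circular_mask(8))
print(masked[0][0], masked[3][3])        # 0 255 (corner hidden, centre kept)
```

With this mask the corners of the square ROI never reach the encoder, so anything the eye should see has to fit inside the circle.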
py/htm/encoders/eye.py
Outdated
# apply field of view (FOV), rotation
self.roi = self._crop_roi()
self.roi = self.rotate_(self.roi, self.orientation)
apply rotation to the image itself, instead of separately to output, visualizations, etc
where plot was broken with fractional scaling. Using cv2.resize() rather than numpy's roi.resize() fixes the issue (numerical problems)
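One well-established difference between the two calls in the commit note above is that numpy's ndarray.resize() only changes the array's shape, keeping (or zero-padding) elements in flat memory order, while cv2.resize() resamples the image. Whether this fully explains the plotting bug is an assumption. The sketch below mimics both behaviours in pure Python; `flat_resize` and `nearest_resize` are hypothetical stand-ins, and nearest-neighbour sampling is used for brevity even though cv2.resize() defaults to bilinear interpolation.

```python
def flat_resize(grid, new_h, new_w):
    """Mimic ndarray.resize(): keep elements in flat order, truncate or zero-pad."""
    flat = [v for row in grid for v in row]
    need = new_h * new_w
    flat = flat[:need] + [0] * max(0, need - len(flat))
    return [flat[r * new_w:(r + 1) * new_w] for r in range(new_h)]


def nearest_resize(grid, new_h, new_w):
    """Resample the whole image with nearest-neighbour lookup."""
    src_h, src_w = len(grid), len(grid[0])
    return [
        [grid[y * src_h // new_h][x * src_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


image = [[4 * r + c for c in range(4)] for r in range(4)]  # 4x4, values 0..15
print(flat_resize(image, 2, 2))     # [[0, 1], [2, 3]]  (first four values only)
print(nearest_resize(image, 2, 2))  # [[0, 2], [8, 10]] (covers the whole image)
```

The flat-order result contains only the start of the first row, so it is not a scaled-down view of the picture at all.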
For #682