
Move optimizer from the network level to the layer level #184

Draft: wants to merge 4 commits into base: main
Conversation

@jvdp1 (Collaborator) commented Jun 14, 2024

As discussed, here is a draft in which I suggest moving the optimizer from the network level to the layer level.

This is just a draft with an implementation for the dense layer only.

Here are the wall clock times using my dataset (with 2 hidden dense layers):

v0.17.0

  • Forward + backward: 4.79s
  • Update: 4.59s

Current PR

  • Forward + backward: 4.81s
  • Update: 1.40s

@OneAdder (Collaborator) commented Mar 5, 2025

@jvdp1 That's actually a great idea. Apart from the obvious performance gains, it can simplify the code for combined layers. I will arrange everything in a similar fashion in my project here: https://github.com/OneAdder/llm.f
Then we can backport it here along with implementations for all the other layers.
