Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Function arguments not indented correctly #221

Open
DavHau opened this issue Feb 28, 2022 · 14 comments
Open

Function arguments not indented correctly #221

DavHau opened this issue Feb 28, 2022 · 14 comments
Labels
component | style Modifications to the formatting rules status | it is a good thing We agree it is good to implement this type | feature request New feature or request

Comments

@DavHau
Copy link

DavHau commented Feb 28, 2022

I have re-run alejandra against my code-base and have to say: This formatter is simply amazing. Thanks for putting in the effort and not getting tired by all the bikeshedding.

Another thing I noticed is that whenever function arguments are listed horizontally, the way alejandra changes the indentation doesn't feel right.

this

{
  foo =
    someFunction
      arg1
      arg2
      arg3;
}

results in this:

{
  foo =
    someFunction
    arg1
    arg2
    arg3;
}

This doesn't look right to me. For this simple example it might be irrelevant, but I think for reading complex code it is important that arguments passed to a function are indented relative to the function name.

A more complex example where it gets clear why I find the current behavior undesirable:

input:

''
  otherModules=${
    pkgs.writeText "other-modules.json"
    (l.toJSON
      (l.mapAttrs
        (pname: subOutputs:
          let
            pkg = subOutputs.packages."${pname}".overrideAttrs (old: {
              buildScript = "true";
              installMethod = "copy";
            });
          in
            "${pkg}/lib/node_modules/${pname}/node_modules")
        outputs.subPackages))
  }
''

From the alignment it is obvious, that there is one argument passed to toJSON and two arguments passed to mapAttrs.

alejandra output:

''
  otherModules=${
    pkgs.writeText "other-modules.json"
    (l.toJSON
    (l.mapAttrs
    (pname: subOutputs: let
      pkg = subOutputs.packages."${pname}".overrideAttrs (old: {
        buildScript = "true";
        installMethod = "copy";
      });
    in "${pkg}/lib/node_modules/${pname}/node_modules")
    outputs.subPackages))
  }
''

Indentation wise, it is not obvious anymore which lines/blocks are an arguments for toJSON and which are arguments for mapAttrs.

@kamadorueda kamadorueda added component | style Modifications to the formatting rules status | it is a good thing We agree it is good to implement this type | feature request New feature or request labels Feb 28, 2022
@kamadorueda
Copy link
Owner

Thank you, there are many things going on here, let's improve step by step

I believe I finally found the quasi-perfect heuristic for tight vs loose parentheses:

  • loose when it has comments:

    (
      # comment
      expression
    )
  • loose when something between the second line (inclusive) and penultimate line (inclusive) does not begin with whitespace:

    (
      function
      argument
      argument
    )
    (
      a: b: let
        a = 123;
      in
        x
    )
  • tight otherwise:

    (function
    argument)
    (function {
      a = 123;
    })

@kamadorueda
Copy link
Owner

Using the new parenthesis heuristics the code is looking better:

''
  otherModules=${
    pkgs.writeText "other-modules.json"
    (
      l.toJSON
      (
        l.mapAttrs
        (pname: subOutputs: let
          pkg = subOutputs.packages."${pname}".overrideAttrs (old: {
            buildScript = "true";
            installMethod = "copy";
          });
        in "${pkg}/lib/node_modules/${pname}/node_modules")
        outputs.subPackages
      )
    )
  }
''

There are some things to improve yet, but now we can at least tell who is argument of who

@DavHau
Copy link
Author

DavHau commented Mar 2, 2022

Oh then I think i misunderstood your description of the heuristic in the earlier post.
I'm not sure if this is an optimal result. It wastes 4 lines with opening and closing parentheses.
This change goes against the last improvement done by you in #63.
Maybe I'm wrong, but I think it's unnecessary to waste new lines for opening/closing brackets, as I tried explaining in #63 (comment)

@kamadorueda
Copy link
Owner

One step at a time my friend

@B4dM4n
Copy link

B4dM4n commented Mar 2, 2022

I agree, the code with the new parentheses rules is much more readable, but argument indentation should also be present without parentheses.

All function arguments that do not start on the same line as the function should be indented one level more than the "parent" element.

So this

function
argument

should look more like

function
  argument

But it gets tricky once multiline arguments are involved.

Once a simple argument follows a multiline argument, all arguments including the multiline argument should either:

  • be indented one level
  • be moved to the next line and therefore indented.

Variant 1 - Indent

--- current.nix
+++ expected.nix
@@ -1,17 +1,17 @@
 {
   name1 = function arg {
     asdf = 1;
   };

   name2 = function arg {
-    asdf = 1;
-  }
-  argument;
+      asdf = 1;
+    }
+    argument;

   name3 = function arg {
-    asdf = 1;
-  } {
-    qwer = 12345;
-  }
-  argument;
+      asdf = 1;
+    } {
+      qwer = 12345;
+    }
+    argument;
 }

Variant 2 - Move to new line

--- current.nix
+++ expected.nix
@@ -1,17 +1,19 @@
 {
   name1 = function arg {
     asdf = 1;
   };

-  name2 = function arg {
-    asdf = 1;
-  }
-  argument;
+  name2 = function arg
+    {
+      asdf = 1;
+    }
+    argument;

-  name3 = function arg {
-    asdf = 1;
-  } {
-    qwer = 12345;
-  }
-  argument;
+  name3 = function arg
+    {
+      asdf = 1;
+    } {
+      qwer = 12345;
+    }
+    argument;
 }

I currently would prefer variant 1, because it gives the user the option on which line the multiline argument starts.

@DavHau
Copy link
Author

DavHau commented Mar 2, 2022

But it gets tricky once multiline arguments are involved.

The reason it is tricky is because the formatting is not optimal. Allow me to demonstrate why.

compare this (your variant 1):

 {
   name1 = function arg {
     asdf = 1;
   };

   name2 = function arg {
      asdf = 1;
    }
    argument;

   name3 = function arg {
      asdf = 1;
    } {
      qwer = 12345;
    }
    argument;
 }

... to this:

{
  name1 =
    function
      arg
      {asdf = 1;};

  name2 =
    function
      arg
      {asdf = 1;}
      argument;

  name3 =
    function
      arg
      {asdf = 1;}
      {qwer = 12345;}
      argument;
}

With the second formatting, it is clear on first glance, that there are 3 layers of nesting:
variable assignment <-- function call <-- function arguments.

You can also figure (within 1 second or less):

  • which elements are functions and which are arguments
  • how many functions are called
  • how many arguments are passed to each function

For the first code snippet, all of these points are harder to tell, and require reading and understanding the code more. The main problem with snippet 1 is that elements which have a relation to each other are spread both horizontally and vertically and are not aligned consistently. The human eye is not good at processing several dimension at once quickly.

The rules that leads to the second example are:

  • always start multi-line statements at a new line. (never in the same line as the =)
  • elements which are similar to each other (like func args) , are listed either vertically or horizontally, not both
  • consistently indent elements according to their relation. (func args are a child of the function, therefore indent +1)

I think, applying these 3 rules makes nix code significantly more readable and the whole discussion about where to put brackets becomes much simpler. Each element has a very clear place where it belongs.
Oh, and the snippets are missing an example with a multi line attrset. So here it is:

{
  name4 =
    function
      arg
      {asdf = 1;}
      {
        qwer = 12345;
        qwer2 = 54321;
      }
      argument;
}

@B4dM4n
Copy link

B4dM4n commented Mar 2, 2022

With the second formatting, it is clear on first glance, that there are 3 layers of nesting: variable assignment <-- function call <-- function arguments.

Agreed, compared to my suggestion, which was an incremental improvement to the current layout, your suggestion is very clear and easy to comprehend.

But I also see a risk of blowing up code so much, that many lines will only contain a single word and many spaces. This probably makes it easier to read the code on mobile devices, but regular PC monitors will look very empty with a long scrollbar, which can make it harder to reason about bigger files.

  • always start multi-line statements at a new line. (never in the same line as the =)
  • elements which are similar to each other (like func args) , are listed either vertically or horizontally, not both

Maybe there should be a few exceptions to these rules, to avoid the issue above. I'm thinking about NixOS modules which make heavy usage of functions with multiline arguments, e.g. mkOption or mkIf. Or the nixpkgs trivial builders, like writeTextFile.
Forcing these functions onto multiple lines with one or two additional indentation levels compared to the current style would introduce a considerable amount of whitespaces.

Reading a mkIf like this is very comprehensible to me, but maybe I just got used to function calls like this.

{
  config = mkIf config.service.name.enable {
    ...
  };
}

So an exception rule could be, keep single line function calls with an optional multiline argument at the end as is.

When the user deems such a function invocation as too complex, any newline inserted before or after the function name would cause alejandra to fallback to your rules and indent the whole invocation.

@DavHau
Copy link
Author

DavHau commented Mar 2, 2022

So an exception rule could be, keep single line function calls with an optional multiline argument at the end as is.

When the user deems such a function invocation as too complex, any newline inserted before or after the function name would cause alejandra to fallback to your rules and indent the whole invocation.

That sounds like a good idea. But what about the case when there are more arguments coming after the multi-line argument?

I think, only option3 here is nice:

{
  # this is not good (`lastArg` not indented relative to `name4`)
  option1 = function arg {asdf = 1;} {
    qwer = 12345;
    qwer2 = 54321;
  }
  lastArg;

  # This is also weird
  option2 = function arg {asdf = 1;} {
      qwer = 12345;
      qwer2 = 54321;
    }
    lastArg;

  # But this is OK:
  option3 = function arg {asdf = 1;}
    {
      qwer = 12345;
      qwer2 = 54321;
    }
    lastArg;
}

@B4dM4n
Copy link

B4dM4n commented Mar 2, 2022

But what about the case when there are more arguments coming after the multi-line argument?

I don't think additional arguments after the multi-line argument should be covered by the exception rule and therefore fallback to the three rules (like the name4 variant).

Only when there is one multi-line argument and it is the last one should the config = mkIf style from my last post be used.

--- current.nix
+++ expected.nix
@@ -1,42 +1,57 @@
 {
   example1 = function arg {asdf = 1;} {
     qwer = 12345;
     qwer2 = 54321;
   };

   # not all arguments start on the same line
-  example2 = function arg {asdf = 1;}
-    {
-      qwer = 12345;
-      qwer2 = 54321;
-    };
+  example2 =
+    function
+      arg
+      {asdf = 1;}
+      {
+        qwer = 12345;
+        qwer2 = 54321;
+      };

   # same as example2
-  example3 = function
-    arg {asdf = 1;}
-    {
-      qwer = 12345;
-      qwer2 = 54321;
-    };
+  example3 =
+    function
+      arg
+      {asdf = 1;}
+      {
+        qwer = 12345;
+        qwer2 = 54321;
+      };

   # multiple multi-line arguments
-  example4 = function arg {
-      qwer = 12345;
-      qwer2 = 54321;
-    } ''
-      text
-    '';
+  example4 =
+    function
+      arg
+      {
+        qwer = 12345;
+        qwer2 = 54321;
+      }
+      ''
+        text
+      '';

   # additional new-line before function name
   example5 =
-    function arg {asdf = 1;} {
-      qwer = 12345;
-      qwer2 = 54321;
-    };
+    function
+      arg
+      {asdf = 1;}
+      {
+        qwer = 12345;
+        qwer2 = 54321;
+      };

   # argument after the multi-line argument
-  example6 = function {
-    qwer = 12345;
-    qwer2 = 54321;
-  } arg;
+  example6 =
+    function
+      {
+        qwer = 12345;
+        qwer2 = 54321;
+      }
+      arg;
 }

@kamadorueda
Copy link
Owner

kamadorueda commented Mar 2, 2022

@B4dM4n @DavHau Thank you, I believe we finally converged into a good design

Below you'll find how Alejandra @ 2842bdc formats the code snippets you posted here

I believe we all can be happy with this

[
  (a
    b)

  ''
    otherModules=${
      pkgs.writeText "other-modules.json"
      (l.toJSON
        (l.mapAttrs
          (pname: subOutputs: let
            pkg = subOutputs.packages."${pname}".overrideAttrs (old: {
              buildScript = "true";
              installMethod = "copy";
            });
          in "${pkg}/lib/node_modules/${pname}/node_modules")
          outputs.subPackages))
    }
  ''

  {
    name1 =
      function
      arg
      {asdf = 1;};

    name2 =
      function
      arg
      {asdf = 1;}
      argument;

    name3 =
      function
      arg
      {asdf = 1;}
      {qwer = 12345;}
      argument;

    name1 = function arg {
      asdf = 1;
    };

    name2 =
      function arg {
        asdf = 1;
      }
      argument;

    name3 =
      function arg {
        asdf = 1;
      } {
        qwer = 12345;
      }
      argument;

    name4 =
      function
      arg
      {asdf = 1;}
      {
        qwer = 12345;
        qwer2 = 54321;
      }
      argument;

    option1 =
      function arg {asdf = 1;} {
        qwer = 12345;
        qwer2 = 54321;
      }
      lastArg;

    option2 =
      function arg {asdf = 1;} {
        qwer = 12345;
        qwer2 = 54321;
      }
      lastArg;

    option3 =
      function arg {asdf = 1;}
      {
        qwer = 12345;
        qwer2 = 54321;
      }
      lastArg;
  }
]

The only piece of feedback I don't think is right to implement as the moment is this:

{
  foo =
    someFunction # Indent arguments of this function
      arg1
      arg2
      arg3;
}

I'll have it loaded in RAM, low priority for now

@tomberek
Copy link
Contributor

tomberek commented Mar 4, 2022

Would

{
  foo = someFunction
    arg1
    arg2
    arg3;
}

be consistent? It would separate the function from the args, reduce newline and indents that don't add meaning. If the "someFunction" is too long of an expression then one would switch to the Tall representation.

Also perhaps related is the common idiom of merging extra attributes into an attrset like this

{
  foo = someFunction {
    a = 2;
    b = 3;
  } // {};   #<--- add/remove the "// {} ;" or place some content into it.
}

When modifying, adding or removing that update operator the "someFunction" jumps up and down in surprising ways.

@adrian-gierakowski
Copy link

amen to this: #221 (comment) @DavHau

it's amazing to see someone else articulate something you've arrived yourself at over the years :D

Do you have any opinion regarding @tomberek's suggestion:

Would

{
  foo = someFunction
    arg1
    arg2
    arg3;
}

be consistent?

@adrian-gierakowski
Copy link

@tomberek

you can do the following the prevent the "jumping" when adding/removing things:

{
  foo =
    someFunction
    {
      a = 2;
      b = 3;
    }
    // {}; #<--- add/remove the "// {} ;" or place some content into it.
}

It also makes removing easier since you can simply place your cursor on the line and press a "delete line" keyboard shortcut (well, you'd need this to be implemented first, since currently that would remove the semicolon and break the code)

@adrian-gierakowski
Copy link

The only piece of feedback I don't think is right to implement as the moment is this:

{
  foo =
    someFunction # Indent arguments of this function
      arg1
      arg2
      arg3;
}

I'll have it loaded in RAM, low priority for now

@kamadorueda do you not think this to b valid suggestion or you simply have other things to do which you consider higher priority?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
component | style Modifications to the formatting rules status | it is a good thing We agree it is good to implement this type | feature request New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants