Skip to content

Support for async visitors #8

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
tomasbonco opened this issue Apr 24, 2020 · 18 comments
Closed

Support for async visitors #8

tomasbonco opened this issue Apr 24, 2020 · 18 comments
Labels
🙋 no/question This does not need any changes

Comments

@tomasbonco
Copy link

In my opinion, this should be working:

await visit( tree, 'code', visitor )

function visitor( node, index, parent ) {
    return new Promise( resolve => {
        const removeNode = () => {
            parent.children.splice( index, 1 );
            resolve()
        }
		
        setTimeout( removeNode, 10 );
    })
}

Another (more abstract) example:

await visit( tree, 'code', visitor )

async function visitor( node, index, parent ) {
   node.value = await someAsyncStuff();
}

Background

Visitor might do:

  1. read a node
  2. perform an action based on the node value/props (reading/writing files, do a HTTP request, ...)
  3. modify the tree based on the result of async operation

However, currently, only sync operations are allowed. That's why it cannot be used for async transformers, which should be supported by unified.

@tomasbonco tomasbonco added 🐛 type/bug This is a problem 🙉 open/needs-info This needs some more info labels Apr 24, 2020
@wooorm
Copy link
Member

wooorm commented Apr 24, 2020

Can you go a level higher? What do you practically want to do?

@tomasbonco
Copy link
Author

I'm doing a live preview. I find code nodes, do the compilation, save the content to the file, then replace the code node for a custom node (transforms into a <script> tag) pointing to the file. Then I parse it with rehype.
Yes, currently it does "work" because I return a future filename immediately, but I'm unable to catch errors and propagate them "higher".

@tomasbonco
Copy link
Author

I was thinking about it just a little bit more and a workaround could look like this:

async function transform( tree ) {
    const matches = [];

    visit( tree, 'code', visitor )

    function visitor( node, index, parent ) {
        matches.push({ node, index, parent })
    }

    const promises = matches.map( async match => {
        const filename = await someAsyncStuff( match.node.value )
        // ... modify tree ...
    })

    await Promise.all( promises );
    return tree;
}

But I'm not sure, how exactly does visit work, because this could cause issues with removing notes (indexes might get out of sync). But it might be a case anyway.

@wooorm
Copy link
Member

wooorm commented Apr 24, 2020

I think your new example is indeed (almost) the way to go: otherwise, you would do a waterfall. You’d only be able to process serially, but with that new approach, you can send out all requests immediately!

The point about the indexes is true! But you can work around that: you need the parent, and the node. And then you can do parent.children[parent.children.indexOf(node)] = replacement!

@tomasbonco
Copy link
Author

tomasbonco commented Apr 25, 2020

Of course one can do indexOf oneself. I was thinking about the drawbacks of the solution ... whether the request is still valid. Now I think a wrapper function is more suitable. So I'll just leave here the code, in case someone else would be searching for this:

import visit from 'unist-util-visit'

export async function visitAsync( tree, matcher, asyncVisitor )
{
    const matches = [];
    visit( tree, matcher, (...args) => { matches.push( args ); return tree } )

    const promises = matches.map( match => asyncVisitor( ...match ) )
    await Promise.all( promises );

    return tree;
}

Example usage:

await visitAsync( tree, 'code', visitor )

async function visitor( node, index, parent )
{
    // ! Note: Index is not guaranteed, in case of adding/removing nodes
    // ... async stuff here ...
}

Thanks for your assistance!

@wooorm
Copy link
Member

wooorm commented Apr 28, 2020

If you’re making this into a utility, then it would make sense to correct index in there, no?

I think this is the solution to go with, no need to change this project. Thanks!

@wooorm wooorm closed this as completed Apr 28, 2020
@tomasbonco
Copy link
Author

You cannot really correct the index, can you? You would have to do it after the async stuff, but that is out of your control. Or am I missing something?

I'm not going to make it into utility, but please feel free to do it yourself.

@wooorm
Copy link
Member

wooorm commented Apr 29, 2020

Oh, right, I was thinking this would somehow run the promises serially instead of all at once, in which case parent.indexOf(node) would do the trick.

If you can’t make index correct, I suggest not passing it, e.g.,:

visit( tree, matcher, (node, _, parent) => { matches.push([node, parent]); } )

Btw return tree in the visitor shouldn’t be there.

@tomasbonco
Copy link
Author

You are right about index. I kept it there because in my case, I was replacing nodes, so it was convenient. However, without return tree, visit() was reporting several notes multiple times.

@wooorm
Copy link
Member

wooorm commented Apr 29, 2020

The last part is weird! The return value of a node isn’t handled: https://github.com/syntax-tree/unist-util-visit-parents#returns 🤔 Maybe your arrow function was without braces, with push, and thus return an index (push returns a number)?

@tomasbonco
Copy link
Author

That is exactly what happened. I wasn't aware of that behavior. Thank you!

@wooorm wooorm added 🙋 no/question This does not need any changes and removed 🐛 type/bug This is a problem 🙉 open/needs-info This needs some more info labels Jul 23, 2020
@shtse8
Copy link

shtse8 commented Jun 22, 2022

any updates about async visitor? it is a big need!

@wooorm
Copy link
Member

wooorm commented Jun 22, 2022

No, it is not needed. It is an anti pattern and it is very slow.

@janaka
Copy link

janaka commented Feb 1, 2023

UPDATE: Came across the[next()](https://github.com/unifiedjs/unified#function-nexterr-tree-file) param when I carefully scanned the docs again. It was the missing piece for me from @tomasbonco example ^ which helped as a hint.

The key is to call next() once when all is complete. I'm doing that by tracking the promises in an array and calling next() once when all the promises have resolved, using Promise.all().

I'll try and write something up and make a contribution to improve the docs.


It's not clear to me how one transforms a doc using async code. I need to process markdown docs in order to translate code blocks. Given there isn't a plugin to do this I'm trying to write one. In the transformer plugin I need to do something like node.value = await thirdpartyTranslateFunc(node.value). The third party convert function is async. My code looks like below. This works when I change node.value with sync code. With async it seems the compiler completes before the async code finishes.

How do I solve for this?

I'm probably missing something about how unified works.

const myvisitor = (node: any) => {
  if (node.lang && node.lang == 'X') {
    // run async code to convert from X to Y
    // set node.value to Y
  } else if (node.lang == null) {
    node.lang = 'shell-session'
  }
  //console.log(node);
  //console.log("====");
}

export const mycodeconverterPlugin = (_options: any, _set: any) => {
  return (tree: any, _file: any) => {
    visit(tree, 'code', myvisitor);
  };
}

const buffer = fs.readFileSync('something-file.markdown');
const s = await remark()
  .use([mycodeconverterPlugin])
  .process(buffer);

@wooorm
Copy link
Member

wooorm commented Feb 2, 2023

@janaka You don’t need next. That’s callback style. Which sure you can use. But you can also return promises.

As for how to do async work, split the work up in two stages: a) find things, b) transform things. Such as like this:

/**
 * @typedef {import('mdast').Root} Root
 * @typedef {import('mdast').Code} Code
 */

import fs from 'node:fs/promises'
import {remark} from 'remark'
import {visit} from 'unist-util-visit'

const file = await remark()
  .use(remarkConvertMyCode)
  .process(await fs.readFile('something-file.markdown'))

console.log(String(file))

/**
 * @type {import('unified').Plugin<[], Root>}
 */
function remarkConvertMyCode() {
  return async function (tree) {
    /** @type {Array<Code>} */
    const codes = []

    visit(tree, (node) => {
      if (node.type === 'code') {
        codes.push(node)
      }
    })

    /** @type {Array<Promise<void>>} */
    const promises = []

    for (const code of codes) {
      if (code.lang && code.lang === 'X') {
        // Run async code to convert from X to Y
        // set node.value to Y
        promises.push(somePromise)
      } else if (code.lang === null) {
        code.lang = 'shell-session'
      }
    }

    await Promise.all(promises)
  }
}

@janaka
Copy link

janaka commented Feb 2, 2023

@wooorm ah, I see, cool. I was close but can clean up my code. Thanks for the example.

@a-hariti
Copy link

a-hariti commented Mar 6, 2024

Thank you @wooorm.

This is a much needed example, it should probably figure on the main README somewhere

@wooorm
Copy link
Member

wooorm commented Mar 7, 2024

I’m not sure a longer readme will improve this, what I believe to be, a more general “how to do async programming”, question (see my comment “split the work up in two stages: a) find things, b) transform things”)
The cases in this issue are each more practical, specific to a particular use case (code highlighting for example). I’m not sure how useful adding several examples of practical cases will be to readers with different use cases.

You can also recommend recipes to https://unifiedjs.com/learn/.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
🙋 no/question This does not need any changes
Development

No branches or pull requests

5 participants