This task is not fully formed yet, but the idea is to take one step towards supporting audio data.
In the case of text data, tokenization is used to split it into tokens and pass it into the model.
In the case of image data, we resize, convert to ndarray, etc., then pass it into the model.
In the case of audio data, what is the equivalent step?
That is the question this issue looks to answer.
The user should be able to pass in bytes of an audio file and we read and do basic processing (it is fine if this is not model-specific, just generic stuff) that leads to the input type supported by the model.
This would let us add an extra input type here:
ahnlich/ahnlich/ai/src/engine/ai/models.rs
Line 247 in 4ed8654
Such as:
Other places where this new data type would need to be reflected include:
ahnlich/ahnlich/ai/src/engine/ai/models.rs
Lines 25 to 33 in 4ed8654
Other useful pointers include:
How we currently read bytes for images:
ahnlich/ahnlich/ai/src/engine/ai/models.rs
Line 261 in 4ed8654
The pointers above don't cover every piece of code that needs to change to introduce this input type, so feel free to do what is needed.
Out of scope: Model Support, Specific Preprocessing
Focus should be on being able to get audio into the format that can be sent into models using the most minimal processing possible. For example, in the case of image processing, this would mean simply resizing the image and converting to NdArray.
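To make the "minimal processing" idea concrete for audio, here is a rough sketch of what bytes-in, samples-out could look like. Everything in it is an assumption for illustration: `DecodedAudio`, `AudioDecodeError`, and `decode_wav_pcm16` are hypothetical names that do not exist in ahnlich, it only handles uncompressed 16-bit PCM WAV, and it uses only the standard library. A real implementation would likely lean on an audio decoding crate and cover more formats. The only processing it does is normalising samples to `f32` in `[-1.0, 1.0)` and averaging the channels down to mono.

```rust
/// Hypothetical decoded-audio container (not an existing ahnlich type).
#[derive(Debug, Clone, PartialEq)]
pub struct DecodedAudio {
    pub sample_rate: u32,
    /// Mono samples normalised to [-1.0, 1.0).
    pub samples: Vec<f32>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum AudioDecodeError {
    NotWav,
    Truncated,
    Unsupported(&'static str),
    MissingChunk(&'static str),
}

fn read_u16(b: &[u8], at: usize) -> Result<u16, AudioDecodeError> {
    let s = b.get(at..at + 2).ok_or(AudioDecodeError::Truncated)?;
    Ok(u16::from_le_bytes([s[0], s[1]]))
}

fn read_u32(b: &[u8], at: usize) -> Result<u32, AudioDecodeError> {
    let s = b.get(at..at + 4).ok_or(AudioDecodeError::Truncated)?;
    Ok(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

/// Decode the bytes of a 16-bit PCM WAV file into mono f32 samples.
pub fn decode_wav_pcm16(bytes: &[u8]) -> Result<DecodedAudio, AudioDecodeError> {
    // A RIFF/WAVE file starts with "RIFF", a 4-byte size, then "WAVE".
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return Err(AudioDecodeError::NotWav);
    }
    let mut pos = 12;
    let mut fmt: Option<(u16, u32)> = None; // (channels, sample_rate)

    // Walk the chunk list: 4-byte id, 4-byte little-endian size, then the body.
    while pos + 8 <= bytes.len() {
        let id = &bytes[pos..pos + 4];
        let size = read_u32(bytes, pos + 4)? as usize;
        let body_start = pos + 8;
        let body_end = body_start.checked_add(size).ok_or(AudioDecodeError::Truncated)?;
        let body = bytes.get(body_start..body_end).ok_or(AudioDecodeError::Truncated)?;

        match id {
            b"fmt " => {
                // Format tag 1 = integer PCM; bits per sample sits at offset 14.
                if read_u16(body, 0)? != 1 || read_u16(body, 14)? != 16 {
                    return Err(AudioDecodeError::Unsupported("only 16-bit PCM"));
                }
                let channels = read_u16(body, 2)?;
                if channels == 0 {
                    return Err(AudioDecodeError::Unsupported("zero channels"));
                }
                fmt = Some((channels, read_u32(body, 4)?));
            }
            b"data" => {
                let (channels, sample_rate) =
                    fmt.ok_or(AudioDecodeError::MissingChunk("fmt "))?;
                let frame_bytes = 2 * channels as usize;
                // One frame holds one sample per channel; average them to get mono.
                let samples: Vec<f32> = body
                    .chunks_exact(frame_bytes)
                    .map(|frame| {
                        let sum: f32 = frame
                            .chunks_exact(2)
                            .map(|s| i16::from_le_bytes([s[0], s[1]]) as f32 / 32768.0)
                            .sum();
                        sum / channels as f32
                    })
                    .collect();
                return Ok(DecodedAudio { sample_rate, samples });
            }
            _ => {} // Skip chunks this sketch does not care about (e.g. "LIST").
        }
        // Chunk bodies are padded to an even number of bytes.
        pos = body_end + (size & 1);
    }
    Err(AudioDecodeError::MissingChunk("data"))
}
```

The resulting `Vec<f32>` could then be wrapped in an ndarray in the same spirit as the image path. Anything model-specific (resampling to a target rate, fixed-length padding, spectrograms) is deliberately left out, in line with the scope above.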