Bug in latest version? 2.16.1
Closed this issue · 4 comments
System Info
Using the latest @xenova/transformers version (2.16.1), installed from npm.
Running a sample file with ts-node v10.9.2.
Both module and target in tsconfig.json are set to esnext.
Environment/Platform
- Website/web-app
- Browser extension
- Server-side (e.g., Node.js, Deno, Bun)
- Desktop app (e.g., Electron)
- Other (e.g., VSCode extension)
Description
import { Pipeline, PreTrainedModel } from "@xenova/transformers";
Code where the error occurs:
const newModel = await PreTrainedModel.from_pretrained(pretrainedModelName)
const pipeline = new Pipeline({
    task: "embeddings",
    model: newModel,
});
It looks like the Pipeline class in pipelines.d.ts does not declare a _call method.
This is the declaration that appears in my node_modules:
export class Pipeline extends Pipeline_base {
    /**
     * Create a new Pipeline.
     * @param {Object} options An object containing the following properties:
     * @param {string} [options.task] The task of the pipeline. Useful for specifying subtasks.
     * @param {PreTrainedModel} [options.model] The model used by the pipeline.
     * @param {PreTrainedTokenizer} [options.tokenizer=null] The tokenizer used by the pipeline (if any).
     * @param {Processor} [options.processor=null] The processor used by the pipeline (if any).
     */
    constructor({ task, model, tokenizer, processor }: {
        task?: string;
        model?: PreTrainedModel;
        tokenizer?: PreTrainedTokenizer;
        processor?: Processor;
    });
    task: string;
    model: PreTrainedModel;
    tokenizer: PreTrainedTokenizer;
    processor: Processor;
    dispose(): Promise<void>;
}
Error:
➜ dumper git:(new-flow) ✗ node --loader ts-node/esm src/github/github.sample.ts
(node:78794) ExperimentalWarning: --experimental-loader may be removed in the future; instead use register():
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("ts-node/esm", pathToFileURL("./"));'
(Use node --trace-warnings ... to show where the warning was created)
Model type for 'undefined' not found, assuming encoder-only architecture. Please report this at https://github.com/xenova/transformers.js/issues/new/choose.
1
pipeline: [Function: closure] Pipeline {
  task: 'embeddings',
  model: undefined,
  tokenizer: null,
  processor: null
}
2
Error: Must implement _call method in subclass
at Function._call (file:///Users/kateyeh/DeepUnit/mono/mono/dumper/node_modules/@xenova/transformers/src/utils/core.js:75:15)
at closure (file:///Users/kateyeh/DeepUnit/mono/mono/dumper/node_modules/@xenova/transformers/src/utils/core.js:62:28)
at UsePinecone.embedStuff (file:///Users/kateyeh/DeepUnit/mono/mono/dumper/src/github/github.sample.ts:36:43)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async main (file:///Users/kateyeh/DeepUnit/mono/mono/dumper/src/github/github.sample.ts:200:26)
Reproduction
import { Pipeline, PreTrainedModel } from "@xenova/transformers";
public async embedStuff() {
    const newModel = await PreTrainedModel.from_pretrained(pretrainedModelName)
    const query = 'generateTest(testDescription: TestDescription, request?: Request, skipGeneratingCode: boolean = false)'
    const pipeline = new Pipeline({
        task: "embeddings",
        model: newModel,
    });
    const result = pipeline && (await pipeline(query));
    return {
        id:'test-id-1234',
        metadata: {
            query: query,
        },
        values: Array.from(result.data)
    }
}
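A note on the error itself: the stack trace (closure calling _call in utils/core.js) and the logged value "[Function: closure] Pipeline" suggest that pipeline objects are callable functions that forward every call to a _call method, which the base class leaves unimplemented. The sketch below is a hypothetical re-creation of that pattern, not the library's actual source. CallableBase, BasePipeline, EchoPipeline, and invoke are illustrative names that do not exist in @xenova/transformers.

```typescript
// Hypothetical re-creation of a "callable class" pattern; not library code.
class CallableBase {
  constructor() {
    // The instance is really a function that forwards its arguments to `_call`.
    const closure: any = (...args: unknown[]) => closure._call(...args);
    // Give the function the prototype of whichever class was instantiated.
    return Object.setPrototypeOf(closure, new.target.prototype);
  }

  _call(..._args: unknown[]): unknown {
    throw new Error('Must implement _call method in subclass');
  }
}

// Stands in for a generic base pipeline: stores options, defines no `_call`.
class BasePipeline extends CallableBase {
  task: string;
  constructor(task: string) {
    super();
    this.task = task;
  }
}

// Stands in for a task-specific pipeline that does implement `_call`.
class EchoPipeline extends BasePipeline {
  _call(...args: unknown[]): unknown {
    return `echo: ${String(args[0])}`;
  }
}

// TypeScript does not know the instances are callable, so cast before calling.
function invoke(p: CallableBase, ...args: unknown[]): unknown {
  return (p as unknown as (...a: unknown[]) => unknown)(...args);
}

try {
  invoke(new BasePipeline('embeddings'), 'hi');
} catch (e) {
  console.log((e as Error).message); // Must implement _call method in subclass
}
console.log(invoke(new EchoPipeline('echo'), 'hi')); // echo: hi
```

Under this reading, constructing the base class directly yields an object that can be called but has no task-specific behaviour behind it, which matches the error reported above.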
Your usage appears to be incorrect. Can you instead use the pipeline API to create your pipeline? Example code:
import { pipeline } from '@xenova/transformers';
// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/gte-small');
// Compute sentence embeddings
const sentences = ['That is a happy person', 'That is a very happy person'];
const output = await extractor(sentences, { pooling: 'mean', normalize: true });
console.log(output);
// Tensor {
//   dims: [ 2, 384 ],
//   type: 'float32',
//   data: Float32Array(768) [ -0.053555335849523544, 0.00843878649175167, ... ],
//   size: 768
// }
// Compute cosine similarity
import { cos_sim } from '@xenova/transformers';
console.log(cos_sim(output[0].data, output[1].data))
// 0.9798319649182318
Also, note that every time you call pipeline(...), new memory is allocated, so it is recommended to move this outside your embedding function.
Additionally, I'm not sure where your code is adapted from (as it is quite non-standard) 😅 Please feel free to share!
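One way to follow the advice about moving pipeline(...) out of the embedding function is to cache the promise it returns, so the model is loaded at most once. The sketch below is only an illustration under assumptions: once and getExtractor are hypothetical helpers, and a small stand-in factory replaces the real pipeline('feature-extraction', 'Xenova/gte-small') call so the example runs without downloading a model.

```typescript
// Hypothetical helper: runs an async factory at most once and reuses its promise.
function once<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      cached = factory();
    }
    return cached;
  };
}

// Counts how many times the (expensive) factory actually runs.
let loads = 0;

// Stand-in for `() => pipeline('feature-extraction', 'Xenova/gte-small')`.
// The fake extractor returns the text length as a one-element "embedding".
const getExtractor = once(async () => {
  loads += 1;
  return async (text: string) => ({ data: Float32Array.from([text.length]) });
});

async function embedStuff(query: string) {
  const extractor = await getExtractor(); // created on first call, reused afterwards
  const result = await extractor(query);
  return {
    id: 'test-id-1234',
    metadata: { query },
    values: Array.from(result.data),
  };
}
```

Note that this simple version also caches a rejected promise, so a failed load would need the cache to be reset before retrying.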
Thank you! Honestly, I'm a little embarrassed, but I was working on this late alongside Pinecone and think I just swapped the lowercase p of pipeline for an uppercase one, then followed the error trail to where I ended up above.
No worries! 🤗 Happy to help :)
Thanks Josh❤️😅